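I'm trying to brute-force MD5 hashes with a multiprocessing solution, but when I run it, it only ever uses one processor. What am I doing wrong? Any feedback would be appreciated, thanks.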
import hashlib
import itertools
import string
import time
import multiprocessing
def crack(ii, numb_core, char_set, cracked_hashes, solution_found, processor_lower_bound, processor_upper_bound):
    # reads hashes from text file of hashes given from passwords in assignment and assembles list with \n removed
    fout = open("cracked_matches.txt", 'w')
    fin = open("hashes.txt", 'r')
    hashes_list = fin.readlines()
    hashes_list = [hash.strip() for hash in hashes_list]
    # not needed anymore, close for security
    fin.close()
    # event allows all processes to stop once a solution is found to avoid unnecessary computation
    # starts counting time, process_time_ns used to measure strictly computation time and avoid float inaccuracies
    time_start = time.process_time_ns() * 1e-9
    # loops organized by the string length allotted to a given processor, every character in the character set
    # is iterated for every possible length and combination of characters
    while len(cracked_hashes) < 8:
        for i in range(processor_lower_bound, processor_upper_bound + 1):
            print(f"{ii} {numb_core} doing {i} {char_set}")
            for char in itertools.product(char_set, repeat=i):
                # every char is combined with every char i times to get string of i length and every variation of chars
                if solution_found.is_set():
                    return 0
                crack_attempt = ''.join(char)
                # initializes md5 hash with the combination of chars encoded as bytes (utf-8 is friendly with all OS)
                m = hashlib.md5()
                m.update(bytes(crack_attempt, encoding='utf-8'))
                # checks the char combination against the hash list, if correct and unique ends the process
                # and outputs answer with time
                if m.hexdigest() in hashes_list and (crack_attempt, m.hexdigest()) not in cracked_hashes:
                    time_end = time.process_time_ns() * 1e-9
                    print("Password Cracked: " + crack_attempt + "\tIn " + str(time_end - time_start) + " seconds"
                          "\nHash: " + m.hexdigest())
                    # hash and password pair are saved and written to file
                    cracked_hashes.add((crack_attempt, m.hexdigest()))
                    #return cracked_hashes
    # in case of failure, processes are still stopped
    print("Unable to crack")


if __name__ == "__main__":
    # hexdigits chosen because MD5 doesn't include punctuation to minimize iterations
    # replace function gets rid of spaces
    char_set = string.hexdigits.replace(string.whitespace, '')
    # initializes cracked hash list to avoid repeating findings
    cracked_hashes = set()
    solution_found = multiprocessing.Event()
    # file is created to log cracked passwords and matching hashes
    # list of processes
    processes = []
    # loop starts multiple processes per execution of crack function, every execution will crack one password,
    # log it, and then end the processes
    for num_core in range(multiprocessing.cpu_count()):
        # distributes hash lengths across processes to divide labor, when a hash is found, that length
        # is skipped to prevent redundancy in computation and concentrate cores on uncracked lengths
        processor_lower_bound = (len(cracked_hashes) + num_core + 1)
        processor_upper_bound = processor_lower_bound + 1
        # initializes process
        process = multiprocessing.Process(target=crack,
                                          args=(len(cracked_hashes), num_core, char_set, cracked_hashes, solution_found,
                                                processor_lower_bound, processor_upper_bound,))
        processes.append(process)
        process.start()
    for process in processes:
        process.join()
I have tried both Pool and Process; Process worked better. Please tell me how to fix this.
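Using multiprocessing for MD5 cracking in Python is the right approach, but a few key points need attention before the work is actually spread across multiple processors:
Shared-memory problem: in the current implementation each process works independently and shares no state with the others. The cracked_hashes set is not shared between processes: every child works on its own copy, so an entry added in one process is invisible to the rest and to the parent. This matters because each process needs to know which hashes the others have already cracked in order to avoid redundant work.
Dynamic work distribution: the way the work is split across processors (processor_lower_bound and processor_upper_bound) is static and may not use all cores efficiently. The number of candidates grows exponentially with string length, so a static split by length can leave some processors finishing early and sitting idle.
Event handling: using multiprocessing.Event for solution_found is the right tool, but nothing in the posted code ever calls solution_found.set(), so the event never signals anything. Make sure it is set once the hashes are cracked so that all processes receive the stop signal.
File handling: opening files in a multiprocessing setting needs care to avoid conflicts. Here every worker opens cracked_matches.txt in 'w' mode, which truncates the file each time. It is better to return results to the main process and let it handle the file writes.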
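The shared-state point can be checked with a small experiment. The sketch below is not part of your program; add_entry and run_demo are hypothetical names, and the stored pair is just the string "abc" with its MD5 digest. It passes a plain set to a child process, lets the child add an entry, and compares what each side sees.

```python
import multiprocessing


def add_entry(local_set, result_queue):
    # This mutates only the copy of the set that lives in this process
    local_set.add(("abc", "900150983cd24fb0d6963f7d28e17f72"))
    # Report the size seen here so the caller can compare
    result_queue.put(len(local_set))


def run_demo():
    cracked_hashes = set()
    result_queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=add_entry, args=(cracked_hashes, result_queue))
    p.start()
    child_size = result_queue.get()
    p.join()
    # The parent's set was never touched by the child
    return child_size, len(cracked_hashes)


if __name__ == "__main__":
    child_size, parent_size = run_demo()
    print(f"child sees {child_size} entry, parent sees {parent_size}")
```

The child reports one entry while the parent's set stays empty, which is why a check like len(cracked_hashes) < 8 in one process never reflects progress made in another.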
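To address these issues, consider the following changes:
Use a Manager for shared state: create the shared state for cracked_hashes with multiprocessing.Manager(). This lets all processes see and update one common collection of cracked hashes.
Dynamic work distribution: instead of dividing the work statically, consider a queue from which tasks (different string lengths or combinations) are handed to processors as they become free. This keeps the processors busy for as long as tasks remain.
Centralized file writing: return results to the main process and do all file writing there, which avoids conflicts between processes.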
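As a rough illustration of the Manager suggestion, here is a minimal sketch. It assumes a manager.dict() keyed by digest rather than a set, since dict and list proxies are available in every Python 3 version. The names record_if_match and worker, the two target digests (those of "abc" and "a"), and the candidate chunks are all made up for the example.

```python
import hashlib
import multiprocessing


def record_if_match(candidate, target_hashes, cracked):
    # Hash one candidate and store it in the shared mapping on a match
    digest = hashlib.md5(candidate.encode("utf-8")).hexdigest()
    # The local set lookup runs first, so the proxy is touched only on a hit
    if digest in target_hashes and digest not in cracked:
        cracked[digest] = candidate
        return True
    return False


def worker(candidates, target_hashes, cracked):
    for candidate in candidates:
        record_if_match(candidate, target_hashes, cracked)


if __name__ == "__main__":
    # Hypothetical targets: the MD5 digests of "abc" and "a"
    target_hashes = {
        "900150983cd24fb0d6963f7d28e17f72",
        "0cc175b9c0f1b6a831c399e269772661",
    }
    with multiprocessing.Manager() as manager:
        cracked = manager.dict()  # proxy object visible to every process
        chunks = [["a", "b"], ["abc", "abd"]]
        processes = [
            multiprocessing.Process(target=worker, args=(chunk, target_hashes, cracked))
            for chunk in chunks
        ]
        for p in processes:
            p.start()
        for p in processes:
            p.join()
        print(dict(cracked))
```

Every access to a Manager proxy is a round trip to the manager process, so it is worth keeping proxy reads out of the innermost loop, as the short-circuit check above tries to do.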
Here is a revised version of your code that takes these points into account:
import hashlib
import itertools
import multiprocessing
import queue
import string


def crack(task_queue, result_queue, char_set, target_hashes, solution_found):
    # Pull candidate lengths from the shared queue until a sentinel arrives
    while not solution_found.is_set():
        length = task_queue.get()
        if length is None:  # Sentinel: no more tasks
            break
        for chars in itertools.product(char_set, repeat=length):
            if solution_found.is_set():
                break
            crack_attempt = ''.join(chars)
            digest = hashlib.md5(crack_attempt.encode('utf-8')).hexdigest()
            # Report only real matches so the result queue stays small
            if digest in target_hashes:
                result_queue.put((crack_attempt, digest))
    # Tell the main process that this worker has finished
    result_queue.put(None)


def main():
    char_set = string.hexdigits
    # Load the target hashes once, in the main process
    with open("hashes.txt") as fin:
        target_hashes = frozenset(line.strip() for line in fin if line.strip())
    task_queue = multiprocessing.Queue()
    result_queue = multiprocessing.Queue()
    solution_found = multiprocessing.Event()
    num_workers = multiprocessing.cpu_count()
    # Create tasks: one per candidate length (example range, adjust as needed)
    for length in range(1, 10):
        task_queue.put(length)
    # One sentinel per worker so each one exits when the tasks run out
    for _ in range(num_workers):
        task_queue.put(None)
    # Start processes
    processes = []
    for _ in range(num_workers):
        p = multiprocessing.Process(
            target=crack,
            args=(task_queue, result_queue, char_set, target_hashes, solution_found))
        processes.append(p)
        p.start()
    # Collect results until every worker has reported that it is done
    cracked_hashes = set()
    finished_workers = 0
    while finished_workers < num_workers:
        try:
            result = result_queue.get(timeout=10)  # Adjust timeout as needed
        except queue.Empty:
            # Nothing arrived; stop if every worker died unexpectedly
            if not any(p.is_alive() for p in processes):
                break
            continue
        if result is None:
            finished_workers += 1
            continue
        if result not in cracked_hashes:
            cracked_hashes.add(result)
            attempt, hash_val = result
            print("Password Cracked: ", attempt, "Hash: ", hash_val)
            # Only the main process writes to the output file
            with open("cracked_matches.txt", 'a') as fout:
                fout.write(f"{attempt}: {hash_val}\n")
            if len(cracked_hashes) >= len(target_hashes):
                solution_found.set()
    # Wait for the workers to finish
    for p in processes:
        p.join()


if __name__ == "__main__":
    main()
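This revised version uses a task queue for dynamic work distribution and a result queue for sharing results, which should help put all available processors to use. Be sure to test it and adjust the task range and timeout to your requirements.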
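One thing to watch when adjusting the task range: if each task is a whole string length, the longest length is still a single task and ends up on one core. A possible refinement, sketched here under the assumption that splitting by first character is acceptable for your assignment, is to make each task a (prefix, length) pair. The names make_tasks and candidates are hypothetical helpers, not part of the code above.

```python
import itertools
import string


def make_tasks(char_set, max_length):
    # One task per (first character, total length) pair
    return [(prefix, length)
            for length in range(1, max_length + 1)
            for prefix in char_set]


def candidates(prefix, length, char_set):
    # Yield every string of the given length that starts with prefix
    for tail in itertools.product(char_set, repeat=length - len(prefix)):
        yield prefix + "".join(tail)


if __name__ == "__main__":
    tasks = make_tasks(string.hexdigits, 3)
    print(len(tasks), "tasks")
    # A worker would loop over candidates(prefix, length, char_set) for each task
    print(list(candidates("a", 2, "01")))
```

With string.hexdigits this yields 22 tasks per length instead of one, so the queue can keep every core fed even while the longest strings are being tried.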
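Edit based on the added stack trace:
An InvalidOperationException reporting that cross-thread access to a control is not allowed means you are trying to access or modify the control from a thread other than the one that created it (usually the main UI thread). This is a common problem in Windows Forms applications when UI elements are updated from a background thread.
In your case, the exception is raised by the AForge.Controls.VideoSourcePlayer component, specifically in the Disconnect method of the Dermascope class. It occurs when VideoSourcePlayer's SignalToStop method is called from a thread that is not the UI thread.
To fix this, make sure that any code interacting with UI controls (here, VideoSourcePlayer) executes on the UI thread. You can use the control's Invoke method to marshal the call to the UI thread. Here is how you could modify the Disconnect method to do that: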
private void Disconnect()
{
    if (this.InvokeRequired)
    {
        this.Invoke(new MethodInvoker(() => {
            DisconnectInternal();
        }));
    }
    else
    {
        DisconnectInternal();
    }
}

private void DisconnectInternal()
{
    if (videoSourcePlayer.VideoSource != null)
    {
        // stop video device
        videoSourcePlayer.SignalToStop();
        videoSourcePlayer.WaitForStop();
        videoSourcePlayer.VideoSource = null;
        if (videoDevice.ProvideSnapshots)
        {
            videoDevice.SnapshotFrame -= new NewFrameEventHandler(videoDevice_SnapshotFrame);
        }
    }
}
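In this version, Disconnect checks whether it was called from a thread other than the UI thread. If so, it uses Invoke to call DisconnectInternal on the UI thread. DisconnectInternal contains the original logic of the Disconnect method.
Apply this pattern anywhere else in your code where a background thread interacts with UI elements. That should resolve the cross-thread operation exception.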