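I am trying to extract a password-protected .zip that contains a single .txt file (call it Congrats.txt for this case). Congrats.txt has text in it, so its size is not 0 KB. It is placed in a .zip (call it zipv1.zip for the sake of this thread) protected with the password dominique. That password is stored among other words and names in another .txt (which we will call file.txt for this question). If I run the code below with python Program.py -z zipv1.zip -f file.txt (assuming all these files are in the same folder as Program.py), my program reports dominique as the correct password for zipv1.zip out of the other words/passwords in file.txt and extracts zipv1.zip, but Congrats.txt is empty and has a size of 0 KB.
My code is as follows: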
import argparse
import multiprocessing
import zipfile

parser = argparse.ArgumentParser(description="Unzips a password protected .zip", usage="Program.py -z zip.zip -f file.txt")
# Creates -z arg
parser.add_argument("-z", "--zip", metavar="", required=True, help="Location and the name of the .zip file.")
# Creates -f arg
parser.add_argument("-f", "--file", metavar="", required=True, help="Location and the name of file.txt.")
args = parser.parse_args()


def extract_zip(zip_filename, password):
    try:
        zip_file = zipfile.ZipFile(zip_filename)
        zip_file.extractall(pwd=password)
        print(f"[+] Password for the .zip: {password.decode('utf-8')} \n")
    except:
        # If a password fails, it moves to the next password without notifying the user. If all passwords fail, it will print nothing in the command prompt.
        pass


def main(zip, file):
    if (zip == None) | (file == None):
        # If the args are not used, it displays how to use them to the user.
        print(parser.usage)
        exit(0)
    # Opens the word list/password list/dictionary in "read binary" mode.
    txt_file = open(file, "rb")
    # Allows 8 instances of Python to be ran simultaneously.
    with multiprocessing.Pool(8) as pool:
        # "starmap" expands the tuples as 2 separate arguments to fit "extract_zip"
        pool.starmap(extract_zip, [(zip, line.strip()) for line in txt_file])


if __name__ == '__main__':
    main(args.zip, args.file)
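However, if I create another zip (zipv2.zip) with the same method as zipv1.zip, the only difference being that Congrats.txt sits in a folder that is zipped together with it, I get the same console output as for zipv1.zip, but this time Congrats.txt is extracted along with its folder and is intact: its text and its size are preserved.
To solve this, I read zipfile's documentation, where I found that a RuntimeError is thrown if the password does not match the .zip. So I changed except: to except RuntimeError: in the code and got this error when trying to unzip zipv1.zip: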
(venv) C:\Users\USER\Documents\Jetbrains\PyCharm\Program>Program.py -z zipv1.zip -f file.txt
[+] Password for the .zip: dominique
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 47, in starmapstar
return list(itertools.starmap(args[0], args[1]))
File "C:\Users\USER\Documents\Jetbrains\PyCharm\Program\Program.py", line 16, in extract_zip
zip_file.extractall(pwd=password)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 1594, in extractall
self._extract_member(zipinfo, path, pwd)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 1649, in _extract_member
shutil.copyfileobj(source, target)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\shutil.py", line 79, in copyfileobj
buf = fsrc.read(length)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 876, in read
data = self._read1(n)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 966, in _read1
self._update_crc(data)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 894, in _update_crc
raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file 'Congrats.txt'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\USER\Documents\Jetbrains\PyCharm\Program\Program.py", line 38, in <module>
main(args.zip, args.file)
File "C:\Users\USER\Documents\Jetbrains\PyCharm\Program\Program.py", line 33, in main
pool.starmap(extract_zip, [(zip, line.strip()) for line in txt_file])
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 276, in starmap
return self._map_async(func, iterable, starmapstar, chunksize).get()
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 657, in get
raise self._value
zipfile.BadZipFile: Bad CRC-32 for file 'Congrats.txt'
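The outcome is the same, though: the password was found in file.txt and zipv1.zip was extracted, but Congrats.txt is empty and 0 KB in size. So I ran the program again, this time for zipv2.zip, and got this result: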
(venv) C:\Users\USER\Documents\Jetbrains\PyCharm\Program>Program.py -z zipv2.zip -f file.txt
[+] Password for the .zip: dominique
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 47, in starmapstar
return list(itertools.starmap(args[0], args[1]))
File "C:\Users\USER\Documents\Jetbrains\PyCharm\Program\Program.py", line 16, in extract_zip
zip_file.extractall(pwd=password)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 1594, in extractall
self._extract_member(zipinfo, path, pwd)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 1649, in _extract_member
shutil.copyfileobj(source, target)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\shutil.py", line 79, in copyfileobj
buf = fsrc.read(length)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 876, in read
data = self._read1(n)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 966, in _read1
self._update_crc(data)
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 894, in _update_crc
raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file 'Congrats.txt'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\USER\Documents\Jetbrains\PyCharm\Program\Program.py", line 38, in <module>
main(args.zip, args.file)
File "C:\Users\USER\Documents\Jetbrains\PyCharm\Program\Program.py", line 33, in main
pool.starmap(extract_zip, [(zip, line.strip()) for line in txt_file])
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 276, in starmap
return self._map_async(func, iterable, starmapstar, chunksize).get()
File "C:\Users\USER\AppData\Local\Programs\Python\Python37\lib\multiprocessing\pool.py", line 657, in get
raise self._value
zipfile.BadZipFile: Bad CRC-32 for file 'Congrats.txt'
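Again, the result is the same as before: the folder was extracted successfully, and Congrats.txt was extracted with its text inside and its size intact.
I did look at this similar thread, as well as this thread, but they did not help. I also checked zipfile's documentation, but it was not helpful regarding this issue.
I am not sure what is causing my problem or how to fix it, and would appreciate some help.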
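EDIT
Now, after implementing with zipfile.ZipFile(zip_filename, 'r') as zip_file:, for some unknown and weird reason the program can read/process a small word list/password list/dictionary, but cannot if it is large(?).
What I mean is that a .txt file is present in zipv1.zip, named Congrats.txt, with the text You have cracked the .zip!. The same .txt is present in zipv2.zip as well, but this time placed in a folder named ZIP Contents, which was then zipped/password protected. The password is dominique for both zips.
Note that each .zip was generated using the Deflate compression method and ZipCrypto encryption in 7zip.
The password is on Line 35 of John The Ripper Jr.txt (35/52 lines) and on Line 1968 of John The Ripper.txt (1968/3106 lines).
Now, if you run python Program.py -z zipv1 -f "John The Ripper Jr.txt" in your CMD (or IDE of your choice), it will create a folder named Extracted and place Congrats.txt in it, with the sentence we set earlier. The same goes for zipv2, but Congrats.txt will be in the ZIP Contents folder inside the Extracted folder. There is no trouble extracting the .zips in this case.
But if you try the same thing with John The Ripper.txt, i.e. python Program.py -z zipv1 -f "John The Ripper.txt" in your CMD (or IDE of your choice), it will create the Extracted folder for both zips, just like with John The Ripper Jr.txt, but this time Congrats.txt will be empty for both of them, for some unknown reason.
My code and all the necessary files are as follows: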
import argparse
import multiprocessing
import zipfile

parser = argparse.ArgumentParser(description="Unzips a password protected .zip by performing a brute-force attack.", usage="Program.py -z zip.zip -f file.txt")
# Creates -z arg
parser.add_argument("-z", "--zip", metavar="", required=True, help="Location and the name of the .zip file.")
# Creates -f arg
parser.add_argument("-f", "--file", metavar="", required=True, help="Location and the name of the word list/password list/dictionary.")
args = parser.parse_args()


def extract_zip(zip_filename, password):
    try:
        with zipfile.ZipFile(zip_filename, 'r') as zip_file:
            zip_file.extractall('Extracted', pwd=password)
            print(f"[+] Password for the .zip: {password.decode('utf-8')} \n")
    except:
        # If a password fails, it moves to the next password without notifying the user. If all passwords fail, it will print nothing in the command prompt.
        pass


def main(zip, file):
    if (zip == None) | (file == None):
        # If the args are not used, it displays how to use them to the user.
        print(parser.usage)
        exit(0)
    # Opens the word list/password list/dictionary in "read binary" mode.
    txt_file = open(file, "rb")
    # Allows 8 instances of Python to be ran simultaneously.
    with multiprocessing.Pool(8) as pool:
        # "starmap" expands the tuples as 2 separate arguments to fit "extract_zip"
        pool.starmap(extract_zip, [(zip, line.strip()) for line in txt_file])


if __name__ == '__main__':
    # Program.py - z zipname.zip -f filename.txt
    main(args.zip, args.file)
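I am not sure why this is happening and cannot find an answer to this issue anywhere. As far as I can tell it is completely unknown, and I cannot find a way to debug or solve it.
This keeps happening regardless of which word/password list is used. I tried generating more .zips with the same Congrats.txt but with different passwords from different word lists/password lists/dictionaries. Same method: larger and smaller versions of the .txt were used, and the same results as above were obtained.
But I did find out that if I cut the first two words out of John The Ripper.txt and make a new .txt, say John The Ripper v2.txt, the .zip is extracted successfully: the Extracted folder appears and Congrats.txt is present with the text inside it. So I believe it has to do with the lines after the one the password is on, in this case Line 1968. Maybe the script does not stop after Line 1968? I am not sure why this works, though. It is not a solution, but a step towards one, I guess...
Any help would be much appreciated.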
EDIT 2
So I tried using "pool terminating" code:
import argparse
import multiprocessing
import zipfile

parser = argparse.ArgumentParser(description="Unzips a password protected .zip by performing a brute-force attack using", usage="Program.py -z zip.zip -f file.txt")
# Creates -z arg
parser.add_argument("-z", "--zip", metavar="", required=True, help="Location and the name of the .zip file.")
# Creates -f arg
parser.add_argument("-f", "--file", metavar="", required=True, help="Location and the name of the word list/password list/dictionary.")
args = parser.parse_args()


def extract_zip(zip_filename, password, queue):
    try:
        with zipfile.ZipFile(zip_filename, "r") as zip_file:
            zip_file.extractall('Extracted', pwd=password)
            print(f"[+] Password for the .zip: {password.decode('utf-8')} \n")
            queue.put("Done")  # Signal success
    except:
        # If a password fails, it moves to the next password without notifying the user. If all passwords fail, it will print nothing in the command prompt.
        pass


def main(zip, file):
    if (zip == None) | (file == None):
        print(parser.usage)  # If the args are not used, it displays how to use them to the user.
        exit(0)
    # Opens the word list/password list/dictionary in "read binary" mode.
    txt_file = open(file, "rb")
    # Create a Queue
    manager = multiprocessing.Manager()
    queue = manager.Queue()
    with multiprocessing.Pool(8) as pool:  # Allows 8 instances of Python to be ran simultaneously.
        pool.starmap_async(extract_zip, [(zip, line.strip(), queue) for line in txt_file])  # "starmap" expands the tuples as 2 separate arguments to fit "extract_zip"
        pool.close()
        queue.get(True)  # Wait for a process to signal success
        pool.terminate()  # Terminate the pool
        pool.join()


if __name__ == '__main__':
    main(args.zip, args.file)  # Program.py -z zip.zip -f file.txt.
Now if I use this, both zips are extracted successfully, just like in the previous instances. But this time the Congrats.txt from zipv1.zip is intact and has the message inside it. The same cannot be said for zipv2.zip, whose Congrats.txt is still empty.
Sorry for the long pause... It looks like you have got yourself into a bit of a pickle.
The scenario is complex (quite far from an MCVE, I would say), and there are many things that could be blamed for this behavior.
Starting with the zipv1.zip / zipv2.zip mismatch: on closer inspection, zipv2 turns out to be messed up as well. While for zipv1 it is easy to spot (Congrats.txt is the only file), for zipv2 it is "ZIP Contents/Black-Large.png" that has size 0.
It is reproducible with any file, and even more: it applies to the first entry (that is not a dir) returned by zf.namelist.
So, things start to get a bit clearer.
Looking at the exceptions thrown when attempting to extract the file with a wrong password, there are 3 types (of which the last 2 can be grouped together): RuntimeError (the bad-password error mentioned in the question), zlib.error (see the output below) and zipfile.BadZipFile (see the tracebacks in the question).
I created an archive file of my own. For consistency's sake, I will use it from now on, but everything applies to any other file as well.
code.py:
#!/usr/bin/env python3

import sys
import os
import zipfile


def main():
    arc_name = sys.argv[1] if len(sys.argv) > 1 else "./arc0.zip"
    pwds = [
        #b"dominique",
        #b"dickhead",
        b"coco",
    ]
    pwds = [item.strip() for item in open("orig/John The Ripper.txt.orig", "rb").readlines()]
    print("Unpacking (password protected: dominique) {:s},"
          " using a list of predefined passwords ...".format(arc_name))
    if not os.path.isfile(arc_name):
        raise SystemExit("Archive file must exist!\nExiting.")
    faulty_pwds = list()
    good_pwds = list()
    with zipfile.ZipFile(arc_name, "r") as zip_file:
        print("Zip names: {:}\n".format(zip_file.namelist()))
        for idx, pwd in enumerate(pwds):
            try:
                zip_file.extractall("Extracted", pwd=pwd)
            except:
                exc_cls, exc_inst, exc_tb = sys.exc_info()
                # RuntimeError means the password was rejected up front; anything else is worth reporting.
                if exc_cls != RuntimeError:
                    print("Exception caught when using password ({:d}): [{:}] ".format(idx, pwd))
                    print("    {:}: {:}".format(exc_cls, exc_inst))
                    faulty_pwds.append(pwd)
            else:
                print("Success using password ({:d}): [{:}] ".format(idx, pwd))
                good_pwds.append(pwd)
    print("\nFaulty passwords: {:}\nGood passwords: {:}".format(faulty_pwds, good_pwds))


if __name__ == "__main__":
    print("Python {:s} on {:s}\n".format(sys.version, sys.platform))
    main()
Output:
[cfati@CFATI-5510-0:e:\Work\Dev\StackOverflow\q054532010]> "e:\Work\Dev\VEnvs\py_064_03.06.08_test0\Scripts\python.exe" code.py arc0.zip
Python 3.6.8 (tags/v3.6.8:3c6b436a57, Dec 24 2018, 00:16:47) [MSC v.1916 64 bit (AMD64)] on win32
Unpacking (password protected: dominique) arc0.zip, using a list of predefined passwords ...
Zip names: ['DummyFile0.txt', 'DummyFile1.txt', 'DummyFile2.txt']
Exception caught when using password (1189): [b'mariah']
<class 'zlib.error'>: Error -3 while decompressing data: invalid code lengths set
Exception caught when using password (1446): [b'zebra']
<class 'zlib.error'>: Error -3 while decompressing data: invalid block type
Exception caught when using password (1477): [b'1977']
<class 'zlib.error'>: Error -3 while decompressing data: invalid block type
Success using password (1967): [b'dominique']
Exception caught when using password (2122): [b'hank']
<class 'zlib.error'>: Error -3 while decompressing data: invalid code lengths set
Exception caught when using password (2694): [b'solomon']
<class 'zlib.error'>: Error -3 while decompressing data: invalid distance code
Exception caught when using password (2768): [b'target']
<class 'zlib.error'>: Error -3 while decompressing data: invalid block type
Exception caught when using password (2816): [b'trish']
<class 'zlib.error'>: Error -3 while decompressing data: invalid code lengths set
Exception caught when using password (2989): [b'coco']
<class 'zlib.error'>: Error -3 while decompressing data: invalid stored block lengths
Faulty passwords: [b'mariah', b'zebra', b'1977', b'hank', b'solomon', b'target', b'trish', b'coco']
Good passwords: [b'dominique']
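The script above separates failures by exception class. As a minimal sketch of that grouping (the helper name classify_failure and its labels are hypothetical and not part of zipfile), the early rejection and the later failures could be told apart like this:

```python
import zipfile
import zlib


def classify_failure(exc):
    """Label an exception raised while extracting with a candidate password.

    Hypothetical helper for illustration; the labels are made up here.
    """
    # NotImplementedError subclasses RuntimeError, so rule it out first:
    # zipfile uses it for unsupported compression or encryption methods.
    if isinstance(exc, NotImplementedError):
        return "unsupported"
    # zipfile reports a password rejected up front as a RuntimeError.
    if isinstance(exc, RuntimeError):
        return "rejected early"
    # The password was accepted at first; the garbage it produced then
    # failed in the decompressor or at the final CRC-32 comparison.
    if isinstance(exc, (zlib.error, zipfile.BadZipFile)):
        return "failed late"
    return "other"


print(classify_failure(RuntimeError("Bad password for file 'Congrats.txt'")))
print(classify_failure(zipfile.BadZipFile("Bad CRC-32 for file 'Congrats.txt'")))
```

In these terms, the "Faulty passwords" in the output are the ones that would be labelled "failed late".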
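Looking at the ZipFile.extractall code, it tries to extract all the members. The first one raises an exception, so it starts to become clearer why it behaves the way it does. But why the difference in behavior when trying to extract the items with 2 different wrong passwords?
As seen in the tracebacks of the two different exception types thrown, the answer lies somewhere at the end of ZipFile.open.
After more investigation, it turns out that this is caused by a weakness in the zip encryption.
According to the following excerpt ((last) emphasis is mine):
3.1 Traditional PKWARE encryption
The original encryption scheme, commonly referred to as the PKZIP cipher, was designed by Roger Schaffely [1]. In [5], Biham and Kocher showed that the cipher is weak and demonstrated an attack requiring 13 bytes of plaintext. Further attacks have been developed, some of which require no user-provided plaintext at all [6]. The PKZIP cipher is essentially a stream cipher, i.e. input is encrypted by generating a pseudo-random key stream and XOR-ing it with the plaintext. The internal state of the cipher consists of three 32-bit words: key0, key1 and key2. These are initialized to 0x12345678, 0x23456789 and 0x34567890, respectively. A core step of the algorithm is updating the three keys using a single byte of input...
...
Before encrypting a file in the archive, 12 random bytes are first prepended to its compressed contents and the resulting bytestream is then encrypted. Upon decryption, the first 12 bytes need to be discarded. According to the specification, this is done in order to render a plaintext attack on the data ineffective. The specification also states that out of the 12 prepended bytes, only the first 11 are actually random; the final byte is equal to the high-order byte of the CRC-32 of the uncompressed contents of the file. This makes it possible to quickly verify whether a given password is correct by comparing the last byte of the decrypted 12-byte header to the high-order byte of the actual CRC-32 value that is included in the local file header. This can be done before decrypting the rest of the file.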
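The header check described in the excerpt can be sketched in pure Python. This is an illustration written from the description above and the public .ZIP specification, not zipfile's own code; the function names are made up, and the 12 cipher bytes and the CRC value are the ones shown in the interactive session further down in this answer.

```python
def _make_crc_table():
    """Build the standard CRC-32 table (reflected polynomial 0xEDB88320)."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ 0xEDB88320 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_CRC_TABLE = _make_crc_table()


def _crc32_step(byte, crc):
    """Advance a CRC-32 value by one byte, as the key update requires."""
    return (crc >> 8) ^ _CRC_TABLE[(crc ^ byte) & 0xFF]


def decrypt_header(password, header):
    """Decrypt the 12-byte encryption header with the traditional PKZIP cipher."""
    key0, key1, key2 = 0x12345678, 0x23456789, 0x34567890

    def update_keys(byte):
        nonlocal key0, key1, key2
        key0 = _crc32_step(byte, key0)
        key1 = (key1 + (key0 & 0xFF)) & 0xFFFFFFFF
        key1 = (key1 * 134775813 + 1) & 0xFFFFFFFF
        key2 = _crc32_step(key1 >> 24, key2)

    for byte in password:  # the password initializes the key state
        update_keys(byte)

    plain = bytearray()
    for byte in header:
        k = key2 | 2
        byte ^= ((k * (k ^ 1)) >> 8) & 0xFF  # XOR with one key-stream byte
        update_keys(byte)  # the keys advance on the decrypted byte
        plain.append(byte)
    return bytes(plain)


def passes_header_check(password, header, crc):
    """True if the last decrypted header byte equals the CRC's high-order byte."""
    return decrypt_header(password, header)[-1] == (crc >> 24) & 0xFF


header = b"\xd1\x86y ^\xd77gRzZ\xee"  # 12 cipher bytes quoted in this answer
crc = 0xA684C7C6                      # CRC-32 of the first member, also quoted
for candidate in (b"coco", b"dominique", b"other"):
    print(candidate, passes_header_check(candidate, header, crc))
```

Since only one byte is compared, roughly one wrong password in 256 can be expected to pass this check by chance.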
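Other references:
[CacheFly.PKWare]: APPNOTE.TXT - .ZIP File Format Specification
The algorithm weakness: because the differentiation is done on one byte only, out of 256 different (and carefully chosen) wrong passwords there will be (at least) one that generates the same number as the correct password.
The algorithm discards most of the wrong passwords, but there are some that it does not.
Back to the problem: when attempting to extract a file using a wrong password that passes this one-byte check, the failure only surfaces later, during decompression (zlib.error) or at the final CRC-32 verification (BadZipFile), and by then the output file has already been truncated.
As seen from the output above, for my (.zip) file there are 8 passwords that mess it up.
Here is a test based on the data from my .zip file:
>>> import zipfile
>>>
>>> zd_coco = zipfile._ZipDecrypter(b"coco")
>>> zd_dominique = zipfile._ZipDecrypter(b"dominique")
>>> zd_other = zipfile._ZipDecrypter(b"other")
>>> cipher = b'\xd1\x86y ^\xd77gRzZ\xee' # Member (1st) file cipher: 12 bytes starting from archive offset 44
>>>
>>> crc = 2793719750 # Member (1st) file CRC - archive bytes: 14 - 17
>>> hex(crc)
'0xa684c7c6'
>>> for zd in (zd_coco, zd_dominique, zd_other):
... print(zd, [hex(zd(c)) for c in cipher])
...
<zipfile._ZipDecrypter object at 0x0000021E8DA2E0F0> ['0x1f', '0x58', '0x89', '0x29', '0x89', '0xe', '0x32', '0xe7', '0x2', '0x31', '0x70', '0xa6']
<zipfile._ZipDecrypter object at 0x0000021E8DA2E160> ['0xa8', '0x3f', '0xa2', '0x56', '0x4c', '0x37', '0xbb', '0x60', '0xd3', '0x5e', '0x84', '0xa6']
<zipfile._ZipDecrypter object at 0x0000021E8DA2E128> ['0xeb', '0x64', '0x36', '0xa3', '0xca', '0x46', '0x17', '0x1a', '0xfb', '0x6d', '0x6c', '0x4e']
>>> # As seen, the last element of the first 2 arrays (coco and dominique) is 0xA6 (166), which is the same as the first byte of the CRC
I also ran some tests using other unpacking engines (with default parameters).
@EDIT0:
I have submitted [GitHub]: python/cpython - [3.6] bpo-36247: zipfile - extract truncates (existing) file when bad password provided (zip encryption weakness), which was closed for branch 3.6 (which is in security-fixes-only mode). I am not sure what its outcome will be (in other branches), but in any case it will not be available anytime soon (in the next months, let's say).
As an alternative, you could download the patch and apply the changes locally. Check the linked "Patching utrunner" section for how to apply patches on Win (basically, every line that starts with one "+" sign goes in, and every line that starts with one "-" sign goes out). I am using Cygwin, by the way. You could copy zipfile.py from Python's dir to your project (or some "personal") dir and patch that file, if you want to keep your Python installation pristine.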
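Without patching anything, one possible workaround (a sketch under stated assumptions, not what zipfile does internally) is to validate each candidate in memory before anything is written to disk. ZipFile.read decompresses a member fully and verifies its CRC-32, so a colliding wrong password fails there without truncating an already extracted file. The names find_password and extract_with_wordlist are illustrative, and reading whole members into memory is assumed to be acceptable for the archive at hand.

```python
import zipfile
import zlib


def find_password(zip_path, candidates):
    """Return the first candidate that decrypts every member, else None.

    Members are only read into memory, so nothing on disk is touched
    while wrong passwords are being tried.
    """
    with zipfile.ZipFile(zip_path, "r") as zip_file:
        names = zip_file.namelist()
        for pwd in candidates:
            try:
                for name in names:
                    # read() decompresses the whole member and checks its
                    # CRC-32, so colliding wrong passwords fail here.
                    zip_file.read(name, pwd=pwd)
            except (RuntimeError, zlib.error, zipfile.BadZipFile):
                continue
            return pwd
    return None


def extract_with_wordlist(zip_path, wordlist_path, target_dir="Extracted"):
    """Extract zip_path once, using the first working password in the word list."""
    with open(wordlist_path, "rb") as word_file:
        candidates = [line.strip() for line in word_file if line.strip()]
    pwd = find_password(zip_path, candidates)
    if pwd is not None:
        with zipfile.ZipFile(zip_path, "r") as zip_file:
            zip_file.extractall(target_dir, pwd=pwd)
    return pwd
```

With this split, extractall runs exactly once and only with a password that has already survived full decompression. If the candidates are checked in parallel, the same idea applies: let the workers only validate, and extract once in the parent process.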