zlib decompress() only decodes the first line


I have a large dictionary named idf (more than 1000 entries), and I want to store all of its values() in a compressed .txt file. Here is my code:

for key in idf:
    data = str(idf[key])
    compressed_index = zlib.compress(data.encode('ISO-8859-1'))
    with open(current_inverted_index, "ab") as my_file:
        my_file.write(compressed_index)

After compression, the new .txt file is 443 MB, and its first few lines look like this:

xúãÆVw∞∞4422–5000T∑R®6¨≠’Q@76âõa»òÉdå
1ıò[Ädåå±»XÇm16Ø≠ç»5%xúµö;í$7DØ2ûúUI+O7êØ–I∫ª*¿ÍŸ e ÿâÆÆÊH$‡˛˘˜/øˇ1W.%˝öR ø¸ˆıwÓ˘«W˘Ò’r˚Ò’≥>µÙ|Sè΢3ü9?flŸzõ=_}üûØj~Fw[ˇ¸Û„ÎÁ…ã&/Îô/€3$Ø©   ûm<Oœ–RüwVüOYsñ¢üÆtŒdl”‘ûÍ≥â¢aââü+ÈÎTı]}˛ˆŒoµŸÁ≈*œS”j5˜Á'zπ,Ϋú}uΩy˛ïgãıUM;KM¥kç2ôb…ûS6z墰C¶—≤CfºÊ‘Ψ ßû◊ÛzvΩ÷÷üÔÊ–Óµ@≈J˘±ã
Œê5ô∂i3üˆ®≤áu-•a1¥idÎÁœé(œÎ5t»G∂≈pY†>/ ÁÿƱ-„π≠◊pgùXBF¿8≤ZÔ2∏æörÔ‘ÃM ÂC3wY.ëÕ≤∆Á%ØÈI√≥ÜcJπ0∑À'‡ê7ãòM€œ$EP.Cèì¡v^\î"h◊.§Oç∆ûmîcTNÙA>¬äX¸¸⁄øÏŸáfî€eú<RÑ#-)º6ç%ë≤î∆∏_‰flv∆U&hMªl⁄•5·Iß4˘F«`7µ»öz©ïõ&û†l{ àô™–ê˜5C9—ì „<Ò˚“óflxÄ&_ÍÙQv¿jÄÔ∂Ê©og¡£ˆ4N¿d&SZùwêf^5§**MññÛ≤≥¿;V"º-Èg]üÜöZ®]ú∏RÚër ºÍ¬‰ ˜®ÀÎ3>Ÿ’ÆX´:öv£äCKăFÇÏäâ4:µòß≠,‡<Ü9'rîàπ1ê»i|∑πç™∞¥;

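I'm trying to test the encoding by decoding the file, but I only get back the first value of the first key in the dictionary: b"[{'AP891220-0001': {1}}, {'AP891220-0034': {512}}, {'AP891220-0073': {311}}, {'AP891220-0078': {231}}, {'AP891220-0079': {137}}]". Here is my decoding code: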

import zlib

# Read the whole file and decompress it in a single call.
with open('inverted_indexes/id_1.txt', 'rb') as f:
    decompressed_data = zlib.decompress(f.read())
print(decompressed_data)

I'm not sure what the problem is, or why only a small part of the .txt file is decoded instead of its entire contents.

python utf-8 compression zlib
1 answer

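Each call to zlib.compress() produces an independent compressed stream, and your loop appends one stream per value to the file. zlib.decompress() stops at the end of the first stream, so only the first value comes back. Instead, use a serialization library such as pickle (not safe to load from untrusted sources) or json and compress the whole dictionary at once.

With pickle: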

import zlib
import pickle

index = 'index.txt'

idf = dict(zip('abcdefghijklmnop',range(16)))

compressed_index = zlib.compress(pickle.dumps(idf))
with open(index, 'wb') as my_file:
    my_file.write(compressed_index)

with open(index, 'rb') as f:
    decompressed_data = zlib.decompress(f.read())
print(pickle.loads(decompressed_data))

The same approach with json:

import zlib
import json

index = 'index.txt'

idf = dict(zip('abcdefghijklmnop',range(16)))
compressed_index = zlib.compress(json.dumps(idf).encode())
with open(index, 'wb') as my_file:
    my_file.write(compressed_index)

with open(index, 'rb') as f:
    decompressed_data = zlib.decompress(f.read()).decode()
print(json.loads(decompressed_data))
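
If the file has already been written the way the question does it (one zlib stream appended per value), it may still be recoverable without recompressing. The following is a minimal sketch, assuming the file contains nothing but back-to-back zlib streams: it uses zlib.decompressobj() and its unused_data attribute to step from one stream to the next. The helper name iter_streams and the small idf dictionary are hypothetical, chosen only for illustration.

```python
import zlib


def iter_streams(blob: bytes):
    """Yield the decompressed payload of each zlib stream concatenated in blob."""
    while blob:
        decompressor = zlib.decompressobj()
        payload = decompressor.decompress(blob)
        payload += decompressor.flush()
        yield payload
        # unused_data holds the bytes that follow the end of this stream.
        blob = decompressor.unused_data


# Hypothetical stand-in for the question's idf dictionary.
idf = {"a": [1, 2], "b": [3], "c": [4, 5, 6]}

# Reproduce the question's file layout: one independent stream per value.
blob = b"".join(zlib.compress(str(v).encode("ISO-8859-1")) for v in idf.values())

first_only = zlib.decompress(blob)  # returns after the first stream ends
values = [p.decode("ISO-8859-1") for p in iter_streams(blob)]
print(first_only)
print(values)
```

Note that unused_data is a copy of the remaining bytes, so for a file of several hundred megabytes with many streams this loop does a lot of copying. Writing the dictionary as a single stream, as in the examples above, avoids the problem entirely.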