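Basically, I have base64-encoded PDF data stored in a MySQL database. I want to manipulate that data (update the PDF's form fields) and then store the updated data back in the database, without creating or writing a PDF file on disk. My Python code is shown below.
I am using PyPDF2 here, and the code runs.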
    import base64, io, PyPDF2

    try:
        # Field names and new values; updatePageFormFieldValues expects a dict, not a JSON string.
        data_dict = {"firstName": "John", "lastName": "Joe"}
        # `data` is the base64-encoded PDF from the database; decoding yields the raw PDF bytes.
        encodedDataOfPDF = base64.b64decode(data)
        file = io.BytesIO(encodedDataOfPDF)
        pdfReader = PyPDF2.PdfFileReader(file)
        pdfWriter = PyPDF2.PdfFileWriter()
        pdfWriter.appendPagesFromReader(pdfReader)
        # Here the form fields of the PDF get updated.
        pdfWriter.updatePageFormFieldValues(pdfWriter.getPage(0), data_dict)
        # If I uncomment the code below, it creates a PDF file with the updated data.
        # But I don't want a PDF file;
        # I just need the base64-encoded data of the updated file, which I will store in the database.
        # with open(data[1], 'wb') as f:
        #     pdfWriter.write(f)
    except Exception as e:
        app.logger.info(str(e))
Note: please also read the comments in the code.
Thanks.
After a lot of research, I found a proper way to get the updated encoded data: write the PDF to an in-memory stream instead of a file.
    import base64, io, PyPDF2

    try:
        tempMemory = io.BytesIO()  # Added: in-memory buffer that replaces the output file
        # Field names and new values; updatePageFormFieldValues expects a dict, not a JSON string.
        data_dict = {"firstName": "John", "lastName": "Joe"}
        # `data` is the base64-encoded PDF from the database; decoding yields the raw PDF bytes.
        encodedDataOfPDF = base64.b64decode(data)
        file = io.BytesIO(encodedDataOfPDF)
        pdfReader = PyPDF2.PdfFileReader(file)
        pdfWriter = PyPDF2.PdfFileWriter()
        pdfWriter.appendPagesFromReader(pdfReader)
        # Here the form fields of the PDF get updated.
        pdfWriter.updatePageFormFieldValues(pdfWriter.getPage(0), data_dict)
        # Write the updated PDF into the buffer rather than to disk.
        pdfWriter.write(tempMemory)
        newFileData = tempMemory.getvalue()
        newEncodedPDF = base64.b64encode(newFileData)  # Here I get what I want.
    except Exception as e:
        app.logger.info(str(e))
This gives me the base64-encoded data without generating a PDF file.
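The same pattern can be pulled out into two small helpers. This is only a sketch: `writer_to_base64`, `base64_to_stream`, and `FakeWriter` are hypothetical names that are not part of PyPDF2, and `FakeWriter` exists only so the example runs without PyPDF2 installed. The one assumption about the real writer is that it exposes `write(stream)`, which is how it is used in the code above.

```python
import base64
import io


def writer_to_base64(writer) -> str:
    """Serialize `writer` into memory and return the bytes as base64 text.

    `writer` is assumed to be any object with a write(stream) method.
    """
    buffer = io.BytesIO()  # in-memory stand-in for a file on disk
    writer.write(buffer)
    return base64.b64encode(buffer.getvalue()).decode("ascii")


def base64_to_stream(encoded: str) -> io.BytesIO:
    """Decode base64 text (e.g. from the database) into a seekable byte stream."""
    return io.BytesIO(base64.b64decode(encoded))


class FakeWriter:
    """Hypothetical stand-in for a PDF writer; it just emits fixed bytes."""

    def __init__(self, payload: bytes):
        self.payload = payload

    def write(self, stream):
        stream.write(self.payload)


# Round trip: writer -> base64 text -> byte stream -> original bytes.
encoded = writer_to_base64(FakeWriter(b"%PDF-1.4 example bytes"))
round_tripped = base64_to_stream(encoded).read()
```

The sketch decodes the base64 result to `str` on the assumption that it goes into a text column; if the column is binary, the `bytes` returned by `base64.b64encode` can be stored directly, as in the code above.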
Thanks.