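I am developing a small Java program to bulk-compare blobs stored in an Oracle database against files on a remote disk. I use MD5 hashes for the comparison. The strange behavior is that every time I download the same blob, I get a different MD5 hash. I am using JDK 1.8 and ojdbc6.jar; my Oracle version is Oracle Database 10g Release 10.2.0.4.0. This is my MD5 checksum code: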
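import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.DigestInputStream;
import java.security.MessageDigest;

public static String getMD5Sum(String filePath) throws Exception {
    MessageDigest md = MessageDigest.getInstance("MD5");
    // Reading through the DigestInputStream feeds every byte into md.
    try (InputStream is = Files.newInputStream(Paths.get(filePath));
         DigestInputStream dis = new DigestInputStream(is, md)) {
        byte[] buffer = new byte[8192];
        while (dis.read(buffer) != -1) {
            // Nothing to do here: the stream updates the digest as a side effect.
        }
    }
    byte[] digest = md.digest();
    // Render each byte as two lowercase hex digits.
    StringBuilder result = new StringBuilder();
    for (byte b : digest) {
        result.append(String.format("%02x", b));
    }
    return result.toString();
}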
This is my method for fetching the blob from the database:
public static FileBean getBlobAndData() throws Exception
{
    Connection con = getConnection();
    // The SQL is built from concatenated literals; a Java string literal cannot span lines.
    PreparedStatement pstmt = con.prepareStatement(
        "select s.doc_testo as path, r.img_referto as BINARY "
        + "from storicoccsfse s, soss.rd_refoasis4 r "
        + "where s.id_doc_esterno = r.cod_centro || r.cod_scheda "
        + "and s.id_doc_esterno = ?");
    pstmt.setString(1, "RXC20100010024");
    ResultSet rs = pstmt.executeQuery();
    FileBean fileBean = new FileBean();
    String fileChecksum = "";
    while (rs.next()) {
        String remotefilename = rs.getString("path");
        Blob blob = rs.getBlob("BINARY");
        InputStream ins = blob.getBinaryStream();
        File targetFile = new File("C:\\Temp\\whatever.pdf");
        OutputStream outStream = new FileOutputStream(targetFile);
        // Write the blob's PDF content to targetFile through iText.
        PdfReader pdfreader = new PdfReader(ins);
        PdfStamper pdfStamper = new PdfStamper(pdfreader, outStream);
        pdfStamper.close();
        pdfreader.close();
        fileBean.setRemotefilename(remotefilename);
        fileChecksum = getMD5Sum(targetFile.getPath());
    }
    fileBean.setDigest(fileChecksum);
    return fileBean;
}
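I have tried several ways of downloading the blob and converting it to a PDF, but every time I compute the checksum I get a different value. Using PL/SQL Developer, I opened the blob in Acrobat Reader, and its checksum matches the file stored on disk. Any ideas? Thanks in advance, Andrea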
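When PdfStamper writes the PDF to outStream, it adds extra information that changes on every run (for example, the current timestamp). That extra information is included in the MD5 hash, which is why the hash is different every time.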
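One way to avoid this, sketched here under assumptions, is to hash the blob's bytes directly instead of hashing a PDF that has been re-written by PdfStamper. The class name `BlobChecksum` and the method `md5Hex` are hypothetical names introduced for illustration, not part of the original code, and the `ByteArrayInputStream` in `main` is only a stand-in for `blob.getBinaryStream()`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper (not from the original post): hashes a stream's bytes as-is.
final class BlobChecksum {

    // Returns the MD5 of everything readable from `in`, as lowercase hex.
    static String md5Hex(InputStream in) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (DigestInputStream dis = new DigestInputStream(in, md)) {
            byte[] buffer = new byte[8192];
            while (dis.read(buffer) != -1) {
                // Reading is enough: the stream updates the digest as a side effect.
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for blob.getBinaryStream(): identical bytes give identical hashes.
        byte[] data = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println(md5Hex(new ByteArrayInputStream(data)));
        System.out.println(md5Hex(new ByteArrayInputStream(data)));
    }
}
```

In getBlobAndData, this would probably amount to replacing the PdfReader/PdfStamper round trip with something like `fileChecksum = BlobChecksum.md5Hex(blob.getBinaryStream());`. If the file is still needed on disk, copying the stream byte for byte (for example with `Files.copy`) should preserve the checksum, since nothing is added to the content.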