How can I extract the binary image code from an RTF string using a regular expression in C#? [closed]

Question (votes: 0, answers: 1)

I have this RTF image string:

{\pict{\*\picprop\shplid1025{\sp{\sn shapeType}{\sv 75}}{\sp{\sn fFlipH}{\sv 0}}{\sp{\sn fFlipV}{\sv 0}}{\sp{\sn fLockRotation}{\sv 0}}{\sp{\sn fLockAspectRatio}{\sv 1}}{\sp{\sn fLockPosition}{\sv 0}}{\sp{\sn fLockAgainstSelect}{\sv 0}}
{\sp{\sn fLockCropping}{\sv 0}}{\sp{\sn fLockVerticies}{\sv 0}}{\sp{\sn fLockAgainstGrouping}{\sv 0}}{\sp{\sn pictureGray}{\sv 0}}{\sp{\sn pictureBiLevel}{\sv 0}}{\sp{\sn fFilled}{\sv 0}}
{\sp{\sn fNoFillHitTest}{\sv 0}}{\sp{\sn fLine}{\sv 0}}{\sp{\sn wzName}{\sv \u1056\'3f\u1080\'3f\u1089\'3f\u1091\'3f\u1085\'3f\u1086\'3f\u1082\'3f 1}}{\sp{\sn dhgt}{\sv 251658240}}{\sp{\sn fHidden}{\sv 0}}{\sp{\sn fLayoutInCell}{\sv 1}}}
\picscalex36\picscaley36\piccropl0\piccropr0\piccropt0\piccropb0\picw6879\pich6964\picwgoal3900\pichgoal3948\pngblip\bliptag-1175992069{\*\blipuid b9e7c8fbb3e14fcb3dc35ca2b0b6a03f}
89504e470d0a1a0a0000000d49484452000001450000014908060000000cb63f26000000017352474200aece1ce90000000467414d410000b18f0bfc61050000
000970485973000012740000127401de661f78000022bf49444154785eeddd0b8c55e5d5fff10741ae8232805ca6828c2297ca188494815186b6d6a216b56049
5b06696a04ad56191a943a686d046da50e9a682ab65adbd1482f83b660064d5404ac9a205651c0a232202072919b5c05deffac93d57f9e9c3efb396bbfef394e
a1df4ff2c4bd76cedefb5cf65938c9f96535fb9f460e00907192fe1700d088a608001e9a220078688a00e0a129028087a608001e9a220078688a00e0a1290280
87a608001e9a220078a2d9e7c99327eb566e478e1c712d5ab4d02ad9ba75eb5cefdebdb58ab33ed67a6db17af56ad7bf7f7fade23efef863d7bd7b77ade2e6ce
9dab5bf993e6fd2fc4f5ad162d5ae49e7efa69ade2d6ae5debfaf4e9a355fee4fbf56fdfbedd5557576b15d7ae5d3b575353a3555c213ed3193366b86ddbb669
15b777ef5ed7be7d7bad926dd9b2c575ebd64dabb834dfa9a6ec13c2f49e4a534c3269d2246998a6d5b66ddbe0feec55565616dc1f5a43870e0dee00101010101000000040000002701ffff030000000000}}}{\rtlch\fcs1 \af1\afs16 \ltrch\fcs0 
\f1\fs16\lang1058\langfe1049\langnp1058\langfenp1049\insrsid13721686 \cell 

I need to get this out of it:

89504e470d0a1a0a0000000d49484452000001450000014908060000000cb63f26000000017352474200aece1ce90000000467414d410000b18f0bfc61050000
000970485973000012740000127401de661f78000022bf49444154785eeddd0b8c55e5d5fff10741ae8232805ca6828c2297ca188494815186b6d6a216b56049
5b06696a04ad56191a943a686d046da50e9a682ab65adbd1482f83b660064d5404ac9a205651c0a232202072919b5c05deffac93d57f9e9c3efb396bbfef394e
a1df4ff2c4bd76cedefb5cf65938c9f96535fb9f460e00907192fe1700d088a608001e9a220078688a00e0a129028087a608001e9a220078688a00e0a1290280
87a608001e9a220078a2d9e7c99327eb566e478e1c712d5ab4d02ad9ba75eb5cefdebdb58ab33ed67a6db17af56ad7bf7f7fade23efef863d7bd7b77ade2e6ce
9dab5bf993e6fd2fc4f5ad162d5ae49e7efa69ade2d6ae5debfaf4e9a355fee4fbf56fdfbedd5557576b15d7ae5d3b575353a3555c213ed3193366b86ddbb669
15b777ef5ed7be7d7bad926dd9b2c575ebd64dabb834dfa9a6ec13c2f49e4a534c3269d2246998a6d5b66ddbe0feec55565616dc1f5a43870e0dee00101010101000000040000002701ffff030000000000}

However, images can have different RTF representations, so I need a way to get the binary code out of any image, not just this one.

P.S. The image's binary code above is truncated, because the full version has too many characters.

So it seems I need a regular expression that can extract the binary representation of the image from the RTF \pict tag.

c# regex rtf
1 Answer

-1 votes

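One possible solution is to detect the image by its file magic number.

In your example, we can see that your image is a .png, because it starts with 89 50 4e 47, so you can write this regex:

\b(?:89504e47|ffd8ffe0)[a-zA-Z0-9\s]+\b

It will work for png and jpeg. You can test it here: https://regex101.com/r/r8sS4E/1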
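As a minimal sketch of how this could be applied in C#, the snippet below runs a variant of the regex with System.Text.RegularExpressions and strips the line breaks from the match. The class name RtfImageHex and the method Extract are hypothetical names chosen for illustration. The character class is narrowed to hex digits and whitespace, on the assumption that the \pict payload is written as plain hex text, as in the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RtfImageHex
{
    // Magic-number prefix (PNG or JFIF JPEG) followed by hex digits and whitespace.
    // Assumption: the image data is stored as hex text, not as a \bin block.
    private static readonly Regex ImageHex = new Regex(
        @"\b(?:89504e47|ffd8ffe0)[0-9a-f\s]+",
        RegexOptions.IgnoreCase);

    // Returns the hex payload with all whitespace removed,
    // or an empty string when no known signature is found.
    public static string Extract(string rtf)
    {
        Match m = ImageHex.Match(rtf);
        if (!m.Success)
        {
            return string.Empty;
        }
        return Regex.Replace(m.Value, @"\s+", "");
    }
}
```

Calling RtfImageHex.Extract(rtfText) on the string from the question should return the hex run that starts with 89504e47 and stops at the closing brace, since a brace is neither a hex digit nor whitespace.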

Of course, you can adjust the first part of the regex to cover whichever image formats you expect.
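One way to make that adjustment easier is to build the alternation from a table of signatures, then decode the matched hex into bytes. The following is only a sketch: the class RtfImageSignatures, the method TryExtract, and the choice of signatures (PNG, any JPEG starting with ff d8 ff, GIF) are assumptions for illustration, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RtfImageSignatures
{
    // Hex prefixes of file signatures mapped to a format name (illustrative list).
    private static readonly Dictionary<string, string> Signatures =
        new Dictionary<string, string>
        {
            { "89504e47", "png" },
            { "ffd8ff", "jpeg" },
            { "47494638", "gif" },
        };

    // The first part of the regex is generated from the table above.
    private static readonly Regex ImageHex = new Regex(
        @"\b(?:" + string.Join("|", Signatures.Keys) + @")[0-9a-f\s]+",
        RegexOptions.IgnoreCase);

    public static bool TryExtract(string rtf, out string format, out byte[] data)
    {
        format = "";
        data = Array.Empty<byte>();

        Match m = ImageHex.Match(rtf);
        if (!m.Success)
        {
            return false;
        }

        string hex = Regex.Replace(m.Value, @"\s+", "").ToLowerInvariant();
        // An odd trailing digit (for example from truncated input) is ignored.
        if (hex.Length % 2 != 0)
        {
            hex = hex.Substring(0, hex.Length - 1);
        }

        foreach (KeyValuePair<string, string> pair in Signatures)
        {
            if (hex.StartsWith(pair.Key, StringComparison.Ordinal))
            {
                format = pair.Value;
            }
        }

        // Decode each pair of hex digits into one byte.
        data = new byte[hex.Length / 2];
        for (int i = 0; i < data.Length; i++)
        {
            data[i] = Convert.ToByte(hex.Substring(2 * i, 2), 16);
        }
        return true;
    }
}
```

The returned byte array could then be written out with File.WriteAllBytes. Supporting another format would only require one more entry in the Signatures table.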
