在docx文件中,我有很多文本可能包含数学对象的特定符号(mml)。这些符号附有一定的分隔符(MATHSTART和MATHEND),以便可以查找它们。
例:
MATHSTART<math xmlns="http://www.w3.org/1998/Math/MathML" display="block" alttext="x equals StartFraction negative b plus-or-minus StartRoot b squared minus 4 a c EndRoot Over 2 a EndFraction">
<semantics>
<mrow>
<mi>x</mi>
<mo>=</mo>
<mrow class="MJX-TeXAtom-ORD">
<mfrac>
<mrow>
<mo>−<!-- − --></mo>
<mi>b</mi>
<mo>±<!-- ± --></mo>
<mrow class="MJX-TeXAtom-ORD">
<msqrt>
<msup>
<mi>b</mi>
<mrow class="MJX-TeXAtom-ORD">
<mn>2</mn>
</mrow>
</msup>
<mo>−<!-- − --></mo>
<mn>4</mn>
<mi>a</mi>
<mi>c</mi>
</msqrt>
</mrow>
</mrow>
<mrow>
<mn>2</mn>
<mi>a</mi>
</mrow>
</mfrac>
</mrow>
</mrow>
<annotation encoding="application/x-tex">x={-b\pm {\sqrt {b^{2}-4ac}} \over 2a}</annotation>
</semantics>
</math>MATHEND
我现在想要使用marcos来获取所有这些部分并从文档中删除它们,然后在没有分隔符(MATHSTART和MATHEND)且没有格式(类似于wdFormatPlainText
)的情况下将它们重新放入。期望的结果是docx中的数学方程式。
到目前为止我有什么:
Dim regex As Object, wholeDocText As String
Set regex = CreateObject("VBScript.RegExp")
Selection.WholeStory
Selection.Copy
wholeDocText = Selection.Text
With regex
.Pattern = "MATHSTART[.\s\S]*?MATHEND"
.Global = True
End With
Set matches = regex.Execute(wholeDocText)
For Each match In matches
s1 = Replace(match.Value, "MATHSTART", "")
s1 = Replace(s1, "MATHEND", "")
'select match.Value in the document
'overwrite the selected string with the new one
'sth like: Selection.Text s1(wdFormatPlainText)
Next match
问题是三行评论,我不知道如何实施。
注意:我使用了answer中的搜索和选择功能
Sub convertMmlToWordField()
Dim StartWord As String, EndWord As String
Dim FindStartRange As Range, FindEndRange As Range
Dim CopyRange As Range, CopyStartRange As Range, CopyEndRange As Range
Set FindStartRange = ActiveDocument.Range
Set FindEndRange = ActiveDocument.Range
Set CopyRange = ActiveDocument.Range
StartWord = "MATHSTART"
EndWord = "MATHEND"
'Starting the Lookup for the starting word
With FindStartRange.find
.Text = StartWord
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindAsk
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
'Execute the Lookup
Do While .Execute
If .Found = True Then
Set CopyStartRange = FindStartRange
CopyStartRange.Select
'Setting the FindEndRange up for the remainder of the document beginning from the end of the StartWord
FindEndRange.Start = CopyStartRange.End
FindEndRange.End = ActiveDocument.Content.End
FindEndRange.Select
'Setting the Find to look for the End Word
With FindEndRange.find
.Text = EndWord
.Execute
If .Found = True Then
Set CopyEndRange = FindEndRange
CopyEndRange.Select
End If
End With
'Selecting the copy range
CopyRange.Start = CopyStartRange.Start
CopyRange.End = CopyEndRange.End
CopyRange.Select
s1 = Replace(CopyRange.Text, "MATHSTART", "")
s1 = Replace(s1, "MATHEND", "")
Selection.Text = s1
Selection.Copy
Selection.PasteAndFormat (wdFormatPlainText)
End If 'Ending the If Find1stRange .Found = True
Loop 'Ending the Do While .Execute Loop
End With 'Ending the Find1stRange.Find With Statement
End Sub
实际工作发生在这部分:
s1 = Replace(CopyRange.Text, "MATHSTART", "")
s1 = Replace(s1, "MATHEND", "")
Selection.Text = s1
Selection.Copy
Selection.PasteAndFormat (wdFormatPlainText)
CopyRange是从MATHSTART到MATHEND(包括)的整个字符串。所以你可以削减这些部分。剩下的是实际的MML,您现在可以将其输入Selection对象(替换文档中的字符串),然后复制并粘贴它而不进行格式化。
如果您只想为每个RegEx.Match删除两个单词,则以下内容应该有效:
Sub TestMe()
Dim regex As Object, wholeDocText As String
Set regex = CreateObject("VBScript.RegExp")
Selection.WholeStory
Selection.Copy
wholeDocText = Selection.Text
With regex
.Pattern = "MATHSTART[.\s\S]*?MATHEND"
.Global = True
End With
Set matches = regex.Execute(wholeDocText)
For Each match In matches
s1 = Replace(match, "MATHSTART", "")
s1 = Replace(s1, "MATHEND", "")
match = s1
Next match
End Sub
我从第一个.Value
删除了Replace()
,我添加了match = s1
。