VBA替换文本而不格式化

问题描述 投票:0回答:2

在docx文件中,我有很多文本可能包含数学对象的特定符号(mml)。这些符号附有一定的分隔符(MATHSTART和MATHEND),以便可以查找它们。

例:

MATHSTART<math xmlns="http://www.w3.org/1998/Math/MathML" display="block" alttext="x equals StartFraction negative b plus-or-minus StartRoot b squared minus 4 a c EndRoot Over 2 a EndFraction">
  <semantics>
    <mrow>
      <mi>x</mi>
      <mo>=</mo>
      <mrow class="MJX-TeXAtom-ORD">
        <mfrac>
          <mrow>
            <mo>−<!-- − --></mo>
            <mi>b</mi>
            <mo>±<!-- ± --></mo>
            <mrow class="MJX-TeXAtom-ORD">
              <msqrt>
                <msup>
                  <mi>b</mi>
                  <mrow class="MJX-TeXAtom-ORD">
                    <mn>2</mn>
                  </mrow>
                </msup>
                <mo>−<!-- − --></mo>
                <mn>4</mn>
                <mi>a</mi>
                <mi>c</mi>
              </msqrt>
            </mrow>
          </mrow>
          <mrow>
            <mn>2</mn>
            <mi>a</mi>
          </mrow>
        </mfrac>
      </mrow>
    </mrow>
    <annotation encoding="application/x-tex">x={-b\pm {\sqrt {b^{2}-4ac}} \over 2a}</annotation>
  </semantics>
</math>MATHEND

我现在想要使用marcos来获取所有这些部分并从文档中删除它们,然后在没有分隔符(MATHSTART和MATHEND)且没有格式(类似于wdFormatPlainText)的情况下将它们重新放入。期望的结果是docx中的数学方程式。

到目前为止我有什么:

Dim regex As Object, wholeDocText As String
Set regex = CreateObject("VBScript.RegExp")
Selection.WholeStory
Selection.Copy
wholeDocText = Selection.Text

With regex
    .Pattern = "MATHSTART[.\s\S]*?MATHEND"
    .Global = True
End With

Set matches = regex.Execute(wholeDocText)

For Each match In matches
    s1 = Replace(match.Value, "MATHSTART", "")
    s1 = Replace(s1, "MATHEND", "")

    'select match.Value in the document
    'overwrite the selected string with the new one
    'sth like: Selection.Text s1(wdFormatPlainText)
Next match

问题是三行评论,我不知道如何实施。

xml vba replace docx
2个回答
0
投票

注意:我使用了answer中的搜索和选择功能

Sub convertMmlToWordField()

Dim StartWord As String, EndWord As String
Dim FindStartRange As Range, FindEndRange As Range
Dim CopyRange As Range, CopyStartRange As Range, CopyEndRange As Range

Set FindStartRange = ActiveDocument.Range
Set FindEndRange = ActiveDocument.Range
Set CopyRange = ActiveDocument.Range

StartWord = "MATHSTART"
EndWord = "MATHEND"

'Starting the Lookup for the starting word
With FindStartRange.find
    .Text = StartWord
    .Replacement.Text = ""
    .Forward = True
    .Wrap = wdFindAsk
    .Format = False
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = False
    .MatchSoundsLike = False
    .MatchAllWordForms = False

    'Execute the Lookup
    Do While .Execute
        If .Found = True Then
            Set CopyStartRange = FindStartRange
            CopyStartRange.Select

            'Setting the FindEndRange up for the remainder of the document beginning from the end of the StartWord 
            FindEndRange.Start = CopyStartRange.End
            FindEndRange.End = ActiveDocument.Content.End
            FindEndRange.Select

            'Setting the Find to look for the End Word
            With FindEndRange.find
                .Text = EndWord
                .Execute

                If .Found = True Then
                    Set CopyEndRange = FindEndRange
                    CopyEndRange.Select
                End If
            End With

            'Selecting the copy range
            CopyRange.Start = CopyStartRange.Start
            CopyRange.End = CopyEndRange.End
            CopyRange.Select

            s1 = Replace(CopyRange.Text, "MATHSTART", "")
            s1 = Replace(s1, "MATHEND", "")

            Selection.Text = s1
            Selection.Copy
            Selection.PasteAndFormat (wdFormatPlainText)
        End If      'Ending the If Find1stRange .Found = True
    Loop        'Ending the Do While .Execute Loop
End With    'Ending the Find1stRange.Find With Statement

End Sub

实际工作发生在这部分:

s1 = Replace(CopyRange.Text, "MATHSTART", "")
s1 = Replace(s1, "MATHEND", "")

Selection.Text = s1
Selection.Copy
Selection.PasteAndFormat (wdFormatPlainText)

CopyRange是从MATHSTART到MATHEND(包括)的整个字符串。所以你可以削减这些部分。剩下的是实际的MML,您现在可以将其输入Selection对象(替换文档中的字符串),然后复制并粘贴它而不进行格式化。


0
投票

如果您只想为每个RegEx.Match删除两个单词,则以下内容应该有效:

Sub TestMe()

    Dim regex As Object, wholeDocText As String
    Set regex = CreateObject("VBScript.RegExp")
    Selection.WholeStory
    Selection.Copy
    wholeDocText = Selection.Text

    With regex
        .Pattern = "MATHSTART[.\s\S]*?MATHEND"
        .Global = True
    End With

    Set matches = regex.Execute(wholeDocText)

    For Each match In matches
        s1 = Replace(match, "MATHSTART", "")
        s1 = Replace(s1, "MATHEND", "")
        match = s1
    Next match

End Sub

我从第一个.Value删除了Replace(),我添加了match = s1

© www.soinside.com 2019 - 2024. All rights reserved.