React - Highlighting text inside dangerouslySetInnerHTML with a regular expression works unreliably

Problem description

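The goal is to highlight parts of the text (strings) rendered through dangerouslySetInnerHTML. To do that, I try to match the required text in the HTML and wrap it in a "span" with the appropriate style. The code below works flawlessly for some texts (HTML) but not at all for others. A working example and a failing example follow.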

My question: why does the regex fail in some cases and work in others, even though the text (the "quote") is present in every case?

The highlight component (JSX):

import React from "react";

class HighlightQuote extends React.Component {
  render = () => {
    // zitat: the quotes with any leading or trailing quotation marks and parentheses stripped.
    const zitat = this.props.quotes.map(x => x.replace(/^[“”"’()]+|[“”"’()]+$/g, ""));

    let highlightedHtml;
    if (this.props.quotes.length === 0) {
      // Nothing to highlight: render the content unchanged.
      highlightedHtml = this.props.newcontent;
    } else {
      // Build one alternation from all quotes and wrap every match in a span.
      const regex = new RegExp(`(${zitat.join('|')})`, 'g');
      highlightedHtml = this.props.content.replace(
        regex,
        '<span class="hl">$1</span>'
      );
      console.log('highlightedHtml:');
      console.log(highlightedHtml);
    }

    return (
      <div className="reader" ref="test" dangerouslySetInnerHTML={{ __html: highlightedHtml }} />
    );
  };
}

export default HighlightQuote;

Working example (output of console.log(highlightedHtml)):

<div class="post" id="post-17660">
  <p class="postcontents">
    <article>
      <div class="post-inside">
        <p>One of the things I have disliked the most about the crypto sector is the idea that people should &#x201C;hodl&#x201D; or &#x201C;hold on for dear life.&#x201D;</p>
        <p>I have written many times here at AVC that one should take profits when they are available and diversify an investment portfolio.</p>
        <p><span class="hl">The idea that an investor should hold on no matter what has always seemed ridiculous to me.</span></p>
        <p>Now, the crypto markets are in the eighth month of a long and painful bear market and we are starting to see some signs of capitulation, particularly in the assets that went up the most last year.</p>
        <p>Whether this is the long-awaited&#xA0;capitulation of the HODL crowd or not, I can&#x2019;t say.</p>
        <p>But capitulation would be a good thing for the crypto markets, releasing assets into the market that until now have been locked up by long-term&#xA0;holders.</p>
        <p><span class="hl">Until then it is hard to get excited about buying anything in crypto.</span></p>
      </div>
    </article>
  </p>
</div>

Quotes highlighted as expected:

"The idea that an investor should hold on no matter what has always seemed ridiculous to me."

"Until then it is hard to get excited about buying anything in crypto."

Failing example (output of console.log(highlightedHtml)):

<div><article id="story" class="Story-story--2QyGh css-1j0ipd9"><header class="css-1qcpy3f e345g291"><p class="css-1789nl8 etcg8100"><a class="css-1g7m0tk" href="https://www.nytimes.com/column/new-sentences">New Sentences</a></p><div class="css-30n6iy e345g290"><div class="css-acwcvw"></div></div><figure class="ResponsiveMedia-media--32g1o ResponsiveMedia-sizeSmall--3092U ResponsiveMedia-layoutVertical--1pg1o ResponsiveMedia-sizeSmallNoCaption--n--T0 css-1hzd7ei"><figcaption class="css-pplcdj ResponsiveMedia-caption--1dUVu"></figcaption></figure></header><div class="css-18sbwfn StoryBodyCompanionColumn"><div class="css-1h6whtw"><p class="css-1i0edl6 e2kc3sl0"><em class="css-2fg4z9 ehxkw330">&#x2014; From Keith Gessen&#x2019;s second novel, &#x201C;A Terrible Country&#x201D; (Viking, 2018, Page 4). Gessen is also the author of &#x201C;All the Sad Young Literary Men&#x201D; and a founding editor of the journal n+1.</em></p><p class="css-1i0edl6 e2kc3sl0">All authors have signature sentence structures &#x2014; deep expressive grooves that their minds instinctively find and follow. (That previous sentence is one of mine: a simple declaration that leaps, after the break of a long dash, into an elaborate restatement.)</p><p class="css-1i0edl6 e2kc3sl0">Here is one of Keith Gessen&#x2019;s:</p><p class="css-1i0edl6 e2kc3sl0">&#x201C;As for me, I wasn&#x2019;t really an idiot. But neither was I not an idiot.&#x201D;</p><p class="css-1i0edl6 e2kc3sl0">&#x201C;I hadn&#x2019;t been yelling, I didn&#x2019;t think. 
But I hadn&#x2019;t not been yelling either.&#x201D;</p><p class="css-1i0edl6 e2kc3sl0">&#x201C;Cute cafes were not the problem, but they were also not, as I&#x2019;d once apparently thought, the opposite of the problem.&#x201D;</p></div><aside class="css-14jsv4e"><span></span></aside></div><div class="css-18sbwfn StoryBodyCompanionColumn"><div class="css-1h6whtw"><p class="css-1i0edl6 e2kc3sl0">Sentence structures are not simply sentence structures, of course &#x2014; they are miniature philosophies. Hemingway, with his blunt verbal bullets, is making a huge claim about the nature of the world. So is James Joyce, with his collages and frippery. So are Nikki Giovanni and Samuel Delany and Ursula K. Le Guin and John McPhee and Missy Elliott and Dr. Seuss and anyone else who converts thoughts into prose.</p><p class="css-1i0edl6 e2kc3sl0">Likewise, Keith Gessen&#x2019;s signature sentence structure &#x2014; &#x201C;not X, but also not not X&#x201D; &#x2014; suggests an entire worldview. It is a universe of in-betweenness, in which the most basic facts of life, the things we absolutely expect to understand, spill and scatter like toast crumbs into the gaps between the floorboards. It is a world of embarrassingly trivial category errors. The sentences above come from Gessen&#x2019;s new novel, &#x201C;A Terrible Country,&#x201D; the story of a 30-something American man who goes to Russia to care for his elderly grandmother. He falls into the gaps between huge concepts: youth and age, purpose and purposelessness, progress and stasis. He is not Russian but also not not Russian, not smart but also not not smart, not heroic but also not not heroic. Such is the way of the world. No matter how much we try, none of us is ever only one thing. 
None of us is ever pure.</p></div><aside class="css-14jsv4e"><span></span></aside></div><div class="bottom-of-article"><div class="css-k8fkhk"><p>Sam Anderson is a staff writer for the magazine.</p> <p><i>Sign up for </i><a href="http://www.nytimes.com/newsletters/magazine"><i>our newsletter</i></a><i> to get the best of The New York Times Magazine delivered to your inbox every week.</i></p></div><div class="css-3glrhn">A version of this article appears in print on , on Page 11 of the Sunday Magazine with the headline: From Keith Gessen&#x2019;s &#x2018;A Terrible Country&#x2019;<span>. <a href="http://www.nytreprints.com/">Order Reprints</a> | <a href="http://www.nytimes.com/pages/todayspaper/index.html">Today&#x2019;s Paper</a> | <a href="https://www.nytimes.com/subscriptions/Multiproduct/lp8HYKU.html?campaignId=48JQY">Subscribe</a></span></div></div><span></span></article></div>

Quote that should have been highlighted:

"Sentence structures are not simply sentence structures, of course — they are miniature philosophies"
regex reactjs replace highlight
1 Answer

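The regex fails to match because of HTML entities. Some of the text passed to dangerouslySetInnerHTML uses character references instead of the literal characters. In the failing example above, the quote contains an em dash (U+2014), which is encoded in the HTML as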

&#x2014;
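
The mismatch can be reproduced in isolation. The snippet below is a minimal sketch that uses a shortened fragment of the failing HTML; the variable names are illustrative and not part of the original component.

```javascript
// A fragment of the failing HTML: the em dash is stored as a numeric character reference.
const html =
  "<p>Sentence structures are not simply sentence structures, of course &#x2014; they are miniature philosophies.</p>";

// The quote as it is passed to the component: it contains the literal U+2014 character.
const quote =
  "Sentence structures are not simply sentence structures, of course \u2014 they are miniature philosophies";

// The raw markup never contains the literal character, so the search fails.
console.log(html.includes(quote)); // false

// Replacing just this one reference by hand makes the same search succeed.
const decoded = html.replace(/&#x2014;/g, "\u2014");
console.log(decoded.includes(quote)); // true
```

The working example only succeeds because its two highlighted sentences happen to contain no encoded characters.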

To get rid of the HTML entities I used the "he" library (https://github.com/mathiasbynens/he), a robust HTML entity encoder/decoder written in JavaScript.

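 // At the top of the component file:
 import he from "he";

 // Decode the entities first so the HTML contains the same literal characters as the quotes.
 var contentDecoded = he.decode(this.props.content);

 var highlightedHtml = contentDecoded.replace(
    regex,
    '<span class="annotator-hl">$1</span>'
 );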
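
A related pitfall, not covered in the answer above: the quotes are interpolated into the RegExp unescaped, so a quote containing characters such as "." or "?" changes the pattern. The following is a minimal sketch of a combined approach, under stated assumptions. `decodeNumericEntities`, `escapeRegExp` and `highlightQuotes` are hypothetical helpers written for this example. `decodeNumericEntities` is only a dependency-free stand-in for `he.decode`: it handles numeric references, not named ones, and it deliberately leaves markup-significant characters encoded. The sample HTML and quote are made up.

```javascript
// Hypothetical stand-in for he.decode: handles only numeric references
// such as &#x2014; or &#8212;, not named ones such as &nbsp;.
function decodeNumericEntities(html) {
  return html.replace(/&#(x[0-9a-f]+|[0-9]+);/gi, (match, code) => {
    const isHex = code[0].toLowerCase() === "x";
    const codePoint = parseInt(isHex ? code.slice(1) : code, isHex ? 16 : 10);
    if (codePoint > 0x10ffff) return match; // not a valid code point
    const char = String.fromCodePoint(codePoint);
    // Keep markup-significant characters encoded so text cannot turn into tags.
    return '<>&"\''.includes(char) ? match : char;
  });
}

// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Strip surrounding quotation marks (as the original component does),
// escape each quote, then wrap every match in a span.
function highlightQuotes(html, quotes) {
  const patterns = quotes
    .map(q => q.replace(/^[“”"’()]+|[“”"’()]+$/g, ""))
    .filter(q => q.length > 0)
    .map(escapeRegExp);
  if (patterns.length === 0) return html;
  const regex = new RegExp(`(${patterns.join("|")})`, "g");
  return decodeNumericEntities(html).replace(regex, '<span class="hl">$1</span>');
}

// Usage with a made-up fragment: the quote contains "." and "?".
const sampleHtml = "<p>Is it 3.5 &#x2014; or more?</p>";
const result = highlightQuotes(sampleHtml, ['"3.5 \u2014 or more?"']);
console.log(result);
```

With escaping in place, the whole phrase, including the final question mark, ends up inside the span. Without it, the "?" would be read as a quantifier and the match would stop one character short. Note that this still matches against the raw markup, so a quote that spans an inline tag (for example an `<em>` in the middle of a sentence) would not be found.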