How to find multiple substring matches in a string and alter the matched substrings

Question

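I'm trying to parse a string of HTML with Ruby. The string contains multiple <pre></pre> tags, and I need to find and encode every < and > bracket inside each of those elements.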

Example: 

string_1_pre = "<pre><h1>Welcome</h1></pre>"

string_2_pre = "<pre><h1>Welcome</h1></pre><pre><h1>Goodbye</h1></pre>"

def clean_pre_code(html_string)
 matched = html_string.match(/(?<=<pre>).*(?=<\/pre>)/)
 cleaned = matched.to_s.gsub(/[<]/, "&lt;").gsub(/[>]/, "&gt;")
 html_string.gsub(/(?<=<pre>).*(?=<\/pre>)/, cleaned)
end

clean_pre_code(string_1_pre) #=> "<pre>&lt;h1&gt;Welcome&lt;/h1&gt;</pre>"
clean_pre_code(string_2_pre) #=> "<pre>&lt;h1&gt;Welcome&lt;/h1&gt;&lt;/pre&gt;&lt;pre&gt;&lt;h1&gt;Goodbye&lt;/h1&gt;</pre>"

This works as long as html_string contains only one <pre></pre> element, but not when there are several.
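
The likely cause is that `.*` is greedy: with two <pre> blocks it runs from the first <pre> to the last </pre>, so the single match swallows the inner </pre><pre>. A minimal sketch that illustrates this with String#scan (the variable names `greedy` and `lazy` are only for illustration):

```ruby
html = "<pre><h1>Welcome</h1></pre><pre><h1>Goodbye</h1></pre>"

# Greedy: one match spanning from the first <pre> to the last </pre>.
greedy = html.scan(/(?<=<pre>).*(?=<\/pre>)/)

# Lazy: .*? stops at the nearest </pre>, giving one match per element.
lazy = html.scan(/(?<=<pre>).*?(?=<\/pre>)/)

puts greedy.length  # 1
puts lazy.length    # 2
```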

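I'd accept a solution that uses Nokogiri or something similar, but I couldn't figure out how to make it do what I want.

Let me know if you need any additional context.

Update: this is possible with Nokogiri alone; see the accepted answer.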

Answer 1

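@zstrad44 Yes, you can do this with Nokogiri. Here is a version of the code that I developed from yours; it should give you the desired result for a string containing multiple pre tags.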

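require 'nokogiri'

def clean_pre_code(html_string)
  doc = Nokogiri::HTML(html_string)
  all_pre = doc.xpath('//pre')   # every <pre> element in the document
  res = ""
  all_pre.each do |pre|
    pre = pre.to_html            # serialize this one element back to a string
    # Each string holds a single <pre> element, so the greedy match stays
    # within it; the m flag lets . match newlines inside the block.
    matched = pre.match(/(?<=<pre>).*(?=<\/pre>)/m)
    cleaned = matched.to_s.gsub(/[<]/, "&lt;").gsub(/[>]/, "&gt;")
    res += pre.gsub(/(?<=<pre>).*(?=<\/pre>)/m, cleaned)
  end
  res   # only the cleaned <pre> elements, concatenated
end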

I suggest reading more about Nokogiri to better understand the methods I used in the code. Happy coding! I hope this helps.
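
For comparison, here is a regex-only sketch that does not use Nokogiri. It is not the method from the answer above. It assumes the <pre> tags carry no attributes and are never nested, and the name `escape_pre_contents` is hypothetical. It uses a lazy match with the `m` flag and the block form of `gsub`, so each <pre> body is escaped separately:

```ruby
# Escape < and > inside every <pre>...</pre> pair, one pair at a time.
# Assumes plain <pre> tags (no attributes) that are not nested.
def escape_pre_contents(html_string)
  # .*? is lazy, so each match ends at the nearest </pre>;
  # the m flag lets . match newlines inside a <pre> block.
  html_string.gsub(/(?<=<pre>).*?(?=<\/pre>)/m) do |inner|
    inner.gsub("<", "&lt;").gsub(">", "&gt;")
  end
end

puts escape_pre_contents("<pre><h1>Welcome</h1></pre><pre><h1>Goodbye</h1></pre>")
# <pre>&lt;h1&gt;Welcome&lt;/h1&gt;</pre><pre>&lt;h1&gt;Goodbye&lt;/h1&gt;</pre>
```

The block form also avoids a subtle issue with passing the replacement as a string: backslash sequences in the escaped text are not reinterpreted as backreferences.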
