I have some HTML that I want to scrape.
<div class="content">
<strong> This is first content </strong> This is second content
<br />
<small>
<p>Something</p>
</small>
</div>
How can I get "This is second content" with cheerio?
Wrap the second piece of text in a <span>, like this:
<div class="content">
<strong>This is first content</strong><span>This is second content</span>
<br>
<small>
<p>Something</p>
</small>
</div>
Then get it like this:
console.log($('.content :nth-child(2)').text())
Working demo: https://jsfiddle.net/usmanmunir/vg1dqm3L/17/
<div class="content">
<strong> This is first content </strong> <span class="toBeSelected">This is second content</span>
<br />
<small>
<p>Something</p>
</small>
</div>
After that, you can select the text this way:
$('div .toBeSelected').html()
Using the nodeType property solves your problem, even if you have text before the <strong> tag:
<div class="content">
Before first content
<strong> This is first content </strong> This is second content
<br />
<small>
<p>Something</p>
</small>
</div>
Then the code could be:
var cheerio = require("cheerio");
const $ = cheerio.load('<div class="content">Before first content<strong> This is first content </strong> This is second content<br /><small><p>Something</p></small></div>');

// Keep only the direct child nodes that are text nodes (nodeType 3)
var $outer = $("div.content").contents().filter(function() {
  return this.nodeType === 3;
});

console.log($outer.text());
//"Before first content This is second content"

$outer.each(function() {
  console.log($(this).text());
});
//"Before first content"
//" This is second content"
Check it here.
const cheerio = require('cheerio');
const $ = cheerio.load(`<div class="content">
<strong> This is first content </strong> This is second content
<br />
<small>
<p> Something </p>
</small>
</div>
`);
console.log($('div').html().replace(/\n/g, '').match(/<\/strong>(.*)<br>/)[1])