If I have the following HTML structure:
<section class="main-gallery homeowner-rating content-block">
<!--content-->
</section>
<section class="homeowner-rating content-block">
<!--content-->
</section>
<section class="homeowner-rating content-block">
<!--content-->
</section>
<section class="homeowner-rating content-block">
<!--content-->
</section>
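How do I select all of the homeowner-rating.content-block sections except the first one?
For some background, I've set up a simple screen scraper with Nokogiri, but it tries to pull information from the first section, which returns blank results.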
require 'nokogiri'
require 'open-uri'

def get_testimonials
  url = 'http://www.ratedpeople.com/profile/lcc-building-and-construction'
  doc = Nokogiri::HTML.parse(URI.open(url))
  # Matches every section, including the first (main-gallery) one
  testimonial_section = doc.css('.homeowner-rating.content-block').each do |t|
    title = t.css('h4').text.strip
    comments = t.css('q').text.strip
    author = t.css('cite').text.strip
  end
end
With your current setup, there are several ways to do it:
.homeowner-rating + .homeowner-rating
{
    color: red;
}
.homeowner-rating:not(.main-gallery)
{
    color: red;
}
Demo: http://jsfiddle.net/PKEv5/1/
This one only works if main-gallery is the first child of its parent node:
.homeowner-rating:not(:first-child)
{
    color: red;
}
It's easy using Nokogiri:
require 'nokogiri'
doc = Nokogiri::HTML::DocumentFragment.parse(<<EOT)
<section class="main-gallery homeowner-rating content-block">
<p>1</p>
</section>
<section class="homeowner-rating content-block">
<p>2</p>
</section>
<section class="homeowner-rating content-block">
<p>3</p>
</section>
<section class="homeowner-rating content-block">
<p>4</p>
</section>
EOT
doc.css('.homeowner-rating')[1..-1].map(&:to_html)
# => ["<section class=\"homeowner-rating content-block\">\n <p>2</p>\n</section>",
# "<section class=\"homeowner-rating content-block\">\n <p>3</p>\n</section>",
# "<section class=\"homeowner-rating content-block\">\n <p>4</p>\n</section>"]
Nokogiri's search, css, and xpath methods return a NodeSet, which behaves like an Array, so you can slice the result to get the blocks you want.