Getting an element's attribute value with a CSS selector

Question

The HTML structure looks like this:

<td class='hey'> 
<a href="https://example.com">First one</a>
</td>

This is my selector:

m_URL = sel.css("td.hey a:nth-child(1)[href] ").extract()

My selector currently outputs <a href="https://example.com">First one</a>, but I only want the link itself: https://example.com.

How can I do that?

Answer 1

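Use the ::attr(name) pseudo-element on the a tag to get the attribute's value; here that is ::attr(href). Note that ::attr() is a Scrapy selector extension, not standard CSS.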

Demo (using the Scrapy shell):

$ scrapy shell index.html
>>> response.css('td.hey a:nth-child(1)::attr(href)').extract()
[u'https://example.com']

where index.html contains:

<table>
    <tr>
        <td class='hey'>
            <a href="https://example.com">First one</a>
        </td>
    </tr>
</table>

Answer 2

You can try this:

m_URL = sel.css("td.hey a:nth-child(1)").xpath('@href').extract()