Element.text loses data

Question (votes: 0, answers: 2)

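I am writing a web scraper that collects data from a sports website. There are some tables, and I want to write the text of each tr into an array. For some rows, I cannot get the full text.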

When debugging at a breakpoint after t = ...:

element_table = WebDriverWait(driver, 20).until(
                EC.presence_of_all_elements_located((By.XPATH, '//table//tbody//tr')))

for count, e in enumerate(element_table):
    if count > 3:
        line = e.text.splitlines()
        t = e.text

In the debugger, e shows:

text= {str} 'Salzburg\n4-3-1-2\n57%\n2 1.42\n14/4\n28.57%\n594/489\n82.32%\n66.7\n130\n12/43/75\n108\n38/48/22\n210/85\n40.48%' 

But when I look at t:

t = {str} 'Salzburg\n4-3-1-2\n2 1.42\n14/4\n594/489\n66.7\n130\n108\n210/85',

So does element.text not give me all the text in the tr? It also only happens on a few rows.

A row that does not work, followed by a row that works:

<tr>
<td>Salzburg</td>
            <td class="Index__video-cell-widget___3PDlg"><span class="Index__stat-wrapper___n5jnZ"><em>4-3-1-2</em><small>57%</small></span></td>
            <td class="Index__video-cell___s1IHu"><span class="Index__stat-wrapper___n5jnZ">2</span><div class="Index__video-cell-icon___3Pnub"></div></td>
            <td class="Index__simple-cell-widget___1BYWx"><span class="Index__stat-wrapper___n5jnZ">1.42</span></td>
            <td class="Index__video-cell-widget___3PDlg"><span class="Index__stat-wrapper___n5jnZ"><em>14/4</em><small> 28.57%</small></span><div class="Index__video-cell-icon___3Pnub"></div></td>
            <td class="Index__video-cell-widget___3PDlg"><span class="Index__stat-wrapper___n5jnZ"><em>594/489</em><small> 82.32%</small></span><div class="Index__video-cell-icon___3Pnub"></div></td>
            <td class="Index__simple-cell-widget___1BYWx"><span class="Index__stat-wrapper___n5jnZ">66.7</span></td>
            <td class="Index__video-cell-widget___3PDlg"><span class="Index__stat-wrapper___n5jnZ"><em>130</em><small>12/43/75</small></span><div class="Index__video-cell-icon___3Pnub"></div></td>
            <td class="Index__video-cell-widget___3PDlg"><span class="Index__stat-wrapper___n5jnZ"><em>108</em><small>38/48/22</small></span><div class="Index__video-cell-icon___3Pnub"></div></td>
            <td class="Index__video-cell-widget___3PDlg"><span class="Index__stat-wrapper___n5jnZ"><em>210/85</em><small> 40.48%</small></span><div class="Index__video-cell-icon___3Pnub"></div></td>
            </tr>

        <tr>
        <td>Sturm Graz</td>
        <td class="Index__video-cell-widget___3PDlg"><span class="Index__stat-wrapper___n5jnZ"><em>3-4-3</em><small>80%</small></span></td>
        <td class="Index__video-cell___s1IHu"><span class="Index__stat-wrapper___n5jnZ">3</span><div class="Index__video-cell-icon___3Pnub"></div></td><td class="Index__simple-cell-widget___1BYWx"><span class="Index__stat-wrapper___n5jnZ">1.73</span></td>
        <td class="Index__video-cell-widget___3PDlg"><span class="Index__stat-wrapper___n5jnZ"><em>14/7</em><small> 50%</small></span><div class="Index__video-cell-icon___3Pnub"></div></td>
        <td class="Index__video-cell-widget___3PDlg"><span class="Index__stat-wrapper___n5jnZ"><em>484/400</em><small> 82.64%</small></span><div class="Index__video-cell-icon___3Pnub"></div></td>
        <td class="Index__simple-cell-widget___1BYWx"><span class="Index__stat-wrapper___n5jnZ">49.41</span></td><td class="Index__video-cell-widget___3PDlg"><span class="Index__stat-wrapper___n5jnZ"><em>128</em><small>9/50/69</small></span><div class="Index__video-cell-icon___3Pnub"></div></td>
        <td class="Index__video-cell-widget___3PDlg"><span class="Index__stat-wrapper___n5jnZ"><em>101</em><small>33/50/18</small></span><div class="Index__video-cell-icon___3Pnub"></div></td>
        <td class="Index__video-cell-widget___3PDlg"><span class="Index__stat-wrapper___n5jnZ"><em>228/87</em><small> 38.16%</small></span><div class="Index__video-cell-icon___3Pnub"></div></td>
</tr>

python selenium
2 answers
0
votes

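Well, I could not reproduce the problem you report using Python 2.7.10. If I had to speculate: you mention that you look at "t" in the debugger later on... does any other code manipulate "t"?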

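I would also suggest that if you want to split out all the different components of each row, you should address those "em" and "small" elements as separate entities. Here is some demo code: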
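As a minimal sketch of that suggestion (not the answerer's original code), the helper below walks each td of a row and reads its em and small children separately. The names `row_to_cells` and `_text` are hypothetical. It reads the `textContent` property instead of `.text`, on the assumption that the missing values sit in elements that are not visible when the row is read: `.text` returns only rendered text, while `textContent` is returned regardless of visibility. The locator strategies are written as plain strings so the sketch runs without Selenium installed; in real code you would normally pass `By.TAG_NAME` and `By.CSS_SELECTOR`, which hold the same values.

```python
# Locator strategy strings; these are the values of selenium's
# By.TAG_NAME and By.CSS_SELECTOR constants.
TAG_NAME = "tag name"
CSS_SELECTOR = "css selector"


def _text(element):
    """Return an element's full text, including text hidden by CSS."""
    # get_attribute("textContent") reads the DOM property, which does not
    # depend on whether the element is currently displayed.
    return (element.get_attribute("textContent") or "").strip()


def row_to_cells(row):
    """Split one <tr> element into a list of {"value", "detail"} dicts.

    "value" comes from the cell's <em> (or the whole cell if there is no
    <em>/<small> pair); "detail" comes from the cell's <small>, if any.
    """
    cells = []
    for td in row.find_elements(TAG_NAME, "td"):
        ems = td.find_elements(CSS_SELECTOR, "em")
        smalls = td.find_elements(CSS_SELECTOR, "small")
        if ems or smalls:
            cells.append({
                "value": _text(ems[0]) if ems else None,
                "detail": _text(smalls[0]) if smalls else None,
            })
        else:
            # Simple cells such as the team name or a single number.
            cells.append({"value": _text(td), "detail": None})
    return cells
```

In the loop from the question this would be called as `row_to_cells(e)` for each row `e`, which keeps a value such as 4-3-1-2 paired with its 57% instead of relying on line order in `e.text`.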


0
votes

I initially thought the WebDriver was the problem, so I tried Firefox. The solution was to use innerHTML.
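A minimal sketch of the innerHTML approach, assuming each row's markup is fetched with `e.get_attribute("innerHTML")` and then parsed with the standard library. `RowTextParser` and `parse_row_html` are hypothetical names, and the sketch does not handle nested tables.

```python
from html.parser import HTMLParser


class RowTextParser(HTMLParser):
    """Collect the text fragments of each <td> in a row's innerHTML."""

    def __init__(self):
        super().__init__()
        self.cells = []      # one list of text fragments per <td>
        self._in_td = False

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_td = True
            self.cells.append([])

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False

    def handle_data(self, data):
        text = data.strip()
        # Skip whitespace-only runs between tags.
        if self._in_td and text:
            self.cells[-1].append(text)


def parse_row_html(inner_html):
    """Return a list of cells, each a list of the text pieces it contains."""
    parser = RowTextParser()
    parser.feed(inner_html)
    parser.close()
    return parser.cells
```

With the loop from the question, usage would look like `cells = parse_row_html(e.get_attribute("innerHTML"))`. Because the text is taken from the markup itself, it does not depend on which elements are visible at the time.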
