Check whether an attribute is correct, then get the text using XPath

Question

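I am new to XPath. We use a piece of software that reads Excel documents and generates HTML-based documents from them. The generated markup looks like this:

Code: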

<tr height=17 style='height:12.75pt'>
  <td height=17 class=xl153961 style='height:12.75pt'></td>
  <td colspan=2 class=xl773961 dir=LTR width=124 style='width:93pt'>Stat.No.</td>
  <td colspan=2 class=xl773961 dir=LTR width=184 style='width:138pt'>Origin</td>
  <td colspan=3 class=xl773961 dir=LTR width=205 style='width:154pt'>Description</td>
  <td class=xl773961 dir=LTR width=67 style='width:50pt'>Qty</td>
  <td class=xl773961 dir=LTR width=56 style='width:42pt'>kg tot</td>
  <td colspan=2 class=xl773961 dir=LTR width=88 style='width:66pt'>Price</td>
 </tr>
 <tr height=17 style='height:12.75pt'>
  <td height=17 class=xl153961 style='height:12.75pt'></td>
  <td class=xl153961></td>
  <td class=xl153961></td>
  <td class=xl153961></td>
  <td class=xl153961></td>
  <td colspan=3 class=xl773961 width=205 style='width:154pt'>Outdoor clothes</td>
  <td class=xl783961 width=67 style='width:50pt'>3</td>
  <td class=xl793961 width=56 style='width:42pt'>0,09</td>
  <td colspan=2 class=xl793961 width=88 style='width:66pt'>55,50</td>
 </tr>
 <tr height=17 style='height:12.75pt'>
  <td height=17 class=xl153961 style='height:12.75pt'></td>
  <td colspan=2 class=xl773961 width=124 style='width:93pt'>42032990</td>
  <td colspan=2 class=xl773961 width=184 style='width:138pt'>China</td>
  <td colspan=3 class=xl773961 width=205 style='width:154pt'>Outdoor clothes</td>
  <td class=xl783961 width=67 style='width:50pt'>1</td>
  <td class=xl793961 width=56 style='width:42pt'>0,17</td>
  <td colspan=2 class=xl793961 width=88 style='width:66pt'>134,95</td>
 </tr>
 <tr height=17 style='height:12.75pt'>
  <td height=17 class=xl153961 style='height:12.75pt'></td>
  <td colspan=2 class=xl773961 width=124 style='width:93pt'>61033300</td>
  <td colspan=2 class=xl773961 width=184 style='width:138pt'>China</td>
  <td colspan=3 class=xl773961 width=205 style='width:154pt'>Outdoor clothes</td>
  <td class=xl783961 width=67 style='width:50pt'>1</td>
  <td class=xl793961 width=56 style='width:42pt'>0,60</td>
  <td colspan=2 class=xl793961 width=88 style='width:66pt'>110,31</td>
 </tr>

I managed to write an XPath that looks below a given text string and extracts the data values. Note that the software uses XPath 1.0.

Code:

/html/body/div/table/tr[position() > count(/html/body/div/table/tr[contains(.,'Description')]/preceding-sibling::tr)+1]/td[position() = count(/html/body/div/table/tr/td[contains(.,'Description')]/preceding-sibling::td)+1]

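My problem is that the document sometimes has split columns, as shown in the image below.

When the software generates the HTML document, it adds one extra empty column, as in the example below.

Code: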

<tr height=17 style='height:12.75pt'>
<td height=17 class=xl153961 style='height:12.75pt'></td>
<td class=xl153961></td>
<td class=xl153961></td>
<td class=xl153961></td>
<td class=xl153961></td>
<td colspan=3 class=xl773961 width=205 style='width:154pt'>Outdoor clothes</td>
<td class=xl783961 width=67 style='width:50pt'>3</td>
<td class=xl793961 width=56 style='width:42pt'>0,09</td>
<td colspan=2 class=xl793961 width=88 style='width:66pt'>55,50</td>
</tr>

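So when I use the XPath above, it sees an empty cell under the Description column, which is not what the document actually contains. As you can see in the screenshot, there is a description under the Description header.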
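One way to picture the mismatch: counting cells by position breaks when a row replaces the two colspan=2 cells with four single cells, whereas the grid offset (the sum of the colspan values of the preceding cells) stays the same. The following is a minimal sketch in Python, using only the standard library's html.parser. The names RowCollector, header_offset, cell_at_offset, and the trimmed SAMPLE markup are hypothetical and introduced here for illustration; they are not part of the software described in the question.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collect each <tr> as a list of (colspan, text) tuples."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cell = None  # [colspan, text parts] while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td" and self.rows:
            # A missing colspan attribute counts as a span of 1.
            span = dict(attrs).get("colspan") or "1"
            self._cell = [int(span), []]

    def handle_data(self, data):
        if self._cell is not None:
            self._cell[1].append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            span, parts = self._cell
            self.rows[-1].append((span, "".join(parts).strip()))
            self._cell = None


def header_offset(row, label):
    """Grid offset (sum of preceding colspans) of the cell containing label."""
    start = 0
    for span, text in row:
        if label in text:
            return start
        start += span
    return None


def cell_at_offset(row, offset):
    """Text of the cell that starts exactly at the given grid offset, else None."""
    start = 0
    for span, text in row:
        if start == offset:
            return text
        start += span
    return None


# Trimmed copy of the markup from the question (assumption: attributes that
# do not matter for the column logic were dropped).
SAMPLE = """
<table>
 <tr>
  <td class=xl153961></td>
  <td colspan=2 class=xl773961>Stat.No.</td>
  <td colspan=2 class=xl773961>Origin</td>
  <td colspan=3 class=xl773961>Description</td>
  <td class=xl773961>Qty</td>
 </tr>
 <tr>
  <td class=xl153961></td>
  <td class=xl153961></td>
  <td class=xl153961></td>
  <td class=xl153961></td>
  <td class=xl153961></td>
  <td colspan=3 class=xl773961>Outdoor clothes</td>
  <td class=xl783961>3</td>
 </tr>
 <tr>
  <td class=xl153961></td>
  <td colspan=2 class=xl773961>42032990</td>
  <td colspan=2 class=xl773961>China</td>
  <td colspan=3 class=xl773961>Outdoor clothes</td>
  <td class=xl783961>1</td>
 </tr>
</table>
"""

parser = RowCollector()
parser.feed(SAMPLE)
header, *data = parser.rows

# Positional lookup, as in the original XPath: index of the header cell.
position = next(i for i, (_, text) in enumerate(header) if "Description" in text)
by_position = [row[position][1] for row in data]

# Colspan-aware lookup: match on the grid offset instead of the cell index.
offset = header_offset(header, "Description")
by_offset = [cell_at_offset(row, offset) for row in data]

print(by_position)  # ['', 'Outdoor clothes']
print(by_offset)    # ['Outdoor clothes', 'Outdoor clothes']
```

In XPath 1.0 the same offset could plausibly be written as count(preceding-sibling::td[not(@colspan)]) + sum(preceding-sibling::td/@colspan) and compared against the same expression evaluated for the Description header cell. That expression has not been tested against the software in the question.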

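So what I initially wanted to do was check whether the td is empty and, if it is not, extract its value. I did put something together, but it did not work (I believe it is incorrect).

Code: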

/html/body/div/table/tr[position() > count(/html/body/div/table/tr[contains(.,'Description')]/preceding-sibling::tr)+1]/td [concat(substring(position() = count(/html/body/div/table/tr/td[contains(.,'Description')/preceding-sibling::td)+1),1,number(substring-after(/*/td, 'colspan') * string-length($1))]

Then I tried checking for a colspan attribute equal to 3. I tried the following expressions, which I found on SO, but none of them worked.

Code:

string(//*[@colspan="3"])
/table/tr/td[@colspan=3]/following-sibling::td[1]
//tr/td[@colspan=3]/following-sibling::text()[1]

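Besides the above, I tried many of the other XPath expressions suggested here, but could not get any of them to work.

Then I tried something like the following to pick the cell that is not empty, which did not give me a happy ending either ;)

Code: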

/table/tr/td[text()='One']/following-sibling::td[1]

I need to get this figured out, but I am currently stuck. Could someone get me unstuck or give me some advice on how to approach this?

Answer 1

If I understand correctly, this XPath will help you find the non-empty elements, right?

//td[@style and (text() != '')]
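To see what that filter selects on the split row from the question, here is a rough Python equivalent built on the standard library's html.parser. It is only a sketch: the class name StyledNonEmptyCells and the SPLIT_ROW constant are hypothetical, and the check mirrors the two conditions in the expression (a style attribute is present, and the cell text is not the empty string) rather than evaluating XPath itself.

```python
from html.parser import HTMLParser


class StyledNonEmptyCells(HTMLParser):
    """Collect the text of <td> elements that have a style attribute and non-empty text."""

    def __init__(self):
        super().__init__()
        self.matches = []
        self._parts = None  # text parts of the current styled <td>, else None

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            # Only track cells that carry a style attribute (the @style test).
            self._parts = [] if "style" in dict(attrs) else None

    def handle_data(self, data):
        if self._parts is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "td":
            text = "".join(self._parts or [])
            if text != "":  # the text() != '' test
                self.matches.append(text)
            self._parts = None


# The split row from the question, copied as-is.
SPLIT_ROW = """
<tr height=17 style='height:12.75pt'>
<td height=17 class=xl153961 style='height:12.75pt'></td>
<td class=xl153961></td>
<td class=xl153961></td>
<td class=xl153961></td>
<td class=xl153961></td>
<td colspan=3 class=xl773961 width=205 style='width:154pt'>Outdoor clothes</td>
<td class=xl783961 width=67 style='width:50pt'>3</td>
<td class=xl793961 width=56 style='width:42pt'>0,09</td>
<td colspan=2 class=xl793961 width=88 style='width:66pt'>55,50</td>
</tr>
"""

finder = StyledNonEmptyCells()
finder.feed(SPLIT_ROW)
print(finder.matches)  # ['Outdoor clothes', '3', '0,09', '55,50']
```

Note that the filter returns every non-empty styled cell in the row, not only the one under the Description header, so it would likely need to be combined with a column test to answer the original question.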
