Python - How to extract a string containing a specific character

Question (votes: 0, answers: 2)

I'm trying to extract only the string that contains the $ character. The input is text that I extracted using BeautifulSoup.

Code

price = [m.split() for m in re.findall(r"\w+/$(?:\s+\w+/$)*", soup_content.find('blockquote', { "class": "postcontent restore" }).text)]

Input

For Sale is my Tag Heuer Carrera Calibre 6 with box and papers and extras.
39mm
47 ish lug to lug
19mm in between lugs
Pretty thin but not sure exact height. Likely around 12mm (maybe less)
I've owned it for about 2 years. I absolutely love the case on this watch. It fits my wrist and sits better than any other watch I've ever owned. I'm selling because I need cash and other pieces have more sentimental value
I am the second owner, but the first barely wore it.
It comes with barely worn blue leather strap, extra suede strap that matches just about perfectly and I'll include a blue Barton Band Elite Silicone.
I also purchased an OEM bracelet that I personally think takes the watch to a new level. This model never came with a bracelet and it was several hundred $ to purchase after the fact.
The watch was worn in rotation and never dropped or knocked around.
The watch does have hairlines, but they nearly all superficial. A bit of time with a cape cod cloth would take care of a lot it them. The pics show the imperfections in at "worst" possible angle to show the nature of scratches.
The bracelet has a few desk diving marks, but all in all, the watch and bracelet are in very good shape.
Asking $2000 obo. PayPal shipped. CONUS.
It's a big hard to compare with others for sale as this one includes the bracelet.

The output should look like this:

$2000

Thanks.

python
2 Answers
0
votes

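I would do something like this (where text is the input string you posted above):

price_start = text.find('$')  # index of the first '$' in the text
price = text[price_start:].split(' ')[1]  # the word after a standalone '$'

Use [0] instead of [1] when the amount is attached to the dollar sign ('$1') rather than separated from it by a space ('$ 1'). Note that find returns the position of the first '$', which in your input is the standalone one in "several hundred $ to purchase", so this approach only works when the price is the first '$' in the text.

Alternatively, you can use a regular expression:

import re
re.findall(r'\S*\$\S*\d', text)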
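As a rough sketch of the regex route, the pattern below is a stricter variant (my own assumption, not taken from the answer above) that requires a digit immediately after the dollar sign, so a standalone `$` is skipped. The `sample` string is a shortened stand-in for the post text, and `PRICE_RE` is a hypothetical name.

```python
import re

# Shortened stand-in for the forum post text (assumption for illustration).
sample = (
    "it was several hundred $ to purchase after the fact.\n"
    "Asking $2000 obo. PayPal shipped. CONUS."
)

# r"\$" matches a literal dollar sign; unescaped, "$" is an end-of-string anchor.
# r"\d+(?:,\d{3})*" matches the digits, allowing thousands separators.
# r"(?:\.\d+)?" optionally matches a decimal part such as ".50".
PRICE_RE = re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?")

prices = PRICE_RE.findall(sample)
print(prices)  # ['$2000']
```

Escaping the dollar sign is the important part: in the pattern from the question, `/$` asks for a slash followed by the end of the string, which never occurs in this text.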

0
votes

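You don't need a regular expression. Instead, you can iterate over the lines and over each word, check whether the word starts with '$', and extract it: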

[y for x in s.split('\n') for y in x.split() if y.startswith('$') and len(y) > 1]

where s is your paragraph.
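One case the one-liner does not cover is a price followed directly by punctuation, such as `$2000.` at the end of a sentence. A minimal sketch of the same loop with trailing punctuation stripped; the helper name `dollar_words` and the set of stripped characters are assumptions for illustration.

```python
def dollar_words(s):
    """Return words that start with '$' and have at least one character after it."""
    found = []
    for line in s.split("\n"):
        for word in line.split():
            # Drop trailing sentence punctuation, e.g. '$2000.' -> '$2000'.
            cleaned = word.rstrip(".,;:!?")
            if cleaned.startswith("$") and len(cleaned) > 1:
                found.append(cleaned)
    return found


# Shortened stand-in for the post text (assumption for illustration).
sample = "several hundred $ to purchase.\nAsking $2000. PayPal shipped."
print(dollar_words(sample))  # ['$2000']
```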


0
votes

Since this is simple, you don't need a regex solution; this should suffice:

words = text.split()
words_with_dollar = [word for word in words if '$' in word]
print(words_with_dollar)

>>> ['$', '$2000']

If you don't want the lone dollar sign, you can add a filter like this:

words_with_dollar = [word for word in words if '$' in word and '$' != word]
print(words_with_dollar)

>>> ['$2000']
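The question asks for the single string `$2000` rather than a list. A minimal sketch, assuming only the first priced word is wanted: `first_price` is a hypothetical helper name, and requiring a digit in the word is my own substitution for the `'$' != word` check.

```python
def first_price(text):
    """Return the first whitespace-separated word with both '$' and a digit, or None."""
    candidates = (
        word for word in text.split()
        if "$" in word and any(ch.isdigit() for ch in word)
    )
    # next() with a default avoids StopIteration when nothing matches.
    return next(candidates, None)


# Shortened stand-in for the post text (assumption for illustration).
text = "several hundred $ to purchase. Asking $2000 obo. PayPal shipped."
print(first_price(text))  # $2000
```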