Python: I'm trying to replace all occurrences of a regex pattern in a column, but I'm stuck

Question    Votes: 0    Answers: 2

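The head of the frame is shown below.

I want to remove the <characters> <characters> parts, i.e. everything between angle brackets.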

     Name              Damage  Velocity  Mana  Use Time        Knockback        Sell
NaN  Icicle Staff      12      11        6     29 (Average)    2 (Very Weak)    <span data-info="0"> <span data-info="0"> <spa...
NaN  Plasma Rod        8       6         10    35 (Slow)       2.5 (Very Weak)  <span data-info="0"> <span data-info="0"> <spa...
NaN  Sky Glaze         15      15        8     24 (Fast)       3.50 (Weak)      <span data-info="0"> <span data-info="0"> <spa...
NaN  Wulfrum Staff     10      9         4     19 (Very Fast)  3 (Very Weak)    <span data-info="0"> <span data-info="0"> <spa...
NaN  Aquamarine Staff  10      9         3     14 (Very Fast)  2.5 (Very Weak)  <span data-info="0"> <span data-info="0"> <spa...

I tried using

wand_frame = wand_frame.replace('(<.+>)','')

wand_frame=wand_frame.replace('(\<.+\>)','')

But neither did anything. Thanks for the help.

python pandas
2 Answers
1
vote

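By default, replace compares each cell against the given string as a literal value and does not interpret regex syntax (unless regex=True is passed). I suggest using re.sub: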

re.sub(pattern, repl, string, count=0, flags=0)

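In your case, since re.sub takes a single string rather than a DataFrame, apply it to each value of the column (this assumes every value in Sell is a string):

wand_frame['Sell'] = wand_frame['Sell'].apply(lambda s: re.sub(r'(<.+>)', '', s))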
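An alternative that stays inside pandas is to pass regex=True to replace. The sketch below is only an illustration: the two-row frame and the numbers after the tags are hypothetical stand-ins, because the real Sell values are truncated in the question. It also shows why the original call appeared to do nothing.

```python
import pandas as pd

# Hypothetical stand-in for the asker's frame; the real "Sell" values
# are truncated in the question, so the trailing numbers are made up.
wand_frame = pd.DataFrame({
    "Name": ["Icicle Staff", "Plasma Rod"],
    "Sell": ['<span data-info="0"> <span data-info="0"> 40',
             '<span data-info="0"> 20'],
})

# Without regex=True, replace() only changes cells whose entire value
# equals the literal text "(<.+>)", so nothing matches here.
unchanged = wand_frame.replace("(<.+>)", "")
print(unchanged.equals(wand_frame))  # True

# With regex=True the pattern is a regular expression and every
# matching substring is replaced. [^>]+ stops at the first '>'.
cleaned = wand_frame.replace(r"<[^>]+>", "", regex=True)

# Trim the whitespace left behind where the tags used to be.
cleaned["Sell"] = cleaned["Sell"].str.strip()
print(cleaned["Sell"].tolist())  # ['40', '20']
```

Calling replace on the whole frame touches every string column; to limit the change to one column, the same call can be made on wand_frame["Sell"] instead.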

0
votes

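If you want to remove the characters between < and >, you can use a regex like the code below. Also, as noted in the comments, the numbers need to be extracted.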

import re

# Sample rows standing in for the text of the frame
wand_frame = (
    'NaN Icicle Staff    12  11  6   29 (Average)    2 (Very Weak)   <span data-info="0"> <span data-info="0">\n'
    'NaN Plasma Rod  8   6   10  35 (Slow)   2.5 (Very Weak) <span data-info="0"> <span data-info="0">\n'
    'NaN Plasma Rod  8   6   10  35 (Slow)   2.5 (Very Weak) <span data-info="0"> <span data-info="0">'
)

# Remove every <...> tag made up only of the characters listed in the class
wand_frame = re.sub(r'<[a-zA-Z0-9"=\-\s\'@_?*&%$]*>', '', wand_frame)

print(wand_frame)

Output

 NaN Icicle Staff    12  11  6   29 (Average)    2 (Very Weak)    
 NaN Plasma Rod  8   6   10  35 (Slow)   2.5 (Very Weak)  
 NaN Plasma Rod  8   6   10  35 (Slow)   2.5 (Very Weak)
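The choice of pattern matters once there is text between two tags. The following sketch uses a made-up line (the 20 between the tags is an assumption, not data from the question) to compare the greedy pattern from the question with a negated character class, which is a shorter way to write "anything except the closing bracket".

```python
import re

# Hypothetical line with a value sitting between two tags.
line = 'Plasma Rod 2.5 (Very Weak) <span data-info="0"> 20 <span data-info="0">'

# Greedy: .+ runs from the first '<' to the last '>' on the line,
# so the value between the tags is removed as well.
greedy = re.sub(r"<.+>", "", line)

# Negated class: [^>]* cannot cross a '>', so each tag is matched
# on its own and the text between the tags survives.
negated = re.sub(r"<[^>]*>", "", line)

print(repr(greedy))
print(repr(negated))
```

The non-greedy form <.+?> gives the same result as the negated class on this input.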