Split a column by a delimiter and drop the expanded column

Question  Votes: 1  Answers: 2

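I would like to know whether there is a way to split a column on a delimiter and then drop the extra column that the split produces. This is what I am currently trying, but it does not work the way I want.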

import pandas as pd

df = {'ID': [3009, 129,119,120,121 ],
  'temp': ['75.0~54.0','75.0~54.0','75.0~54.0','75.0~54.0','75.0~54.0'],
  'Prob': [1,1,0.8,0.8056,0.9]}

df = pd.DataFrame(df)


       ID    Prob           temp
0    3009  1.0000       75.0~54.0
1     129  1.0000       75.0~54.0  
2     119  0.8000       75.0~54.0  
3     120  0.8056       75.0~54.0  
4     121  0.9000       75.0~54.0  
5     122  0.8050       75.0~54.0  

df['temp','temp2'] = df['temp'].str.split('~', expand=True)

My goal is to split on the delimiter and add the resulting columns to the existing DataFrame (df):

       ID    Prob        temp   temp2
0    3009  1.0000       75.0    54.0
1     129  1.0000       75.0    54.0  
2     119  0.8000       75.0    54.0  
3     120  0.8056       75.0    54.0  
4     121  0.9000       75.0    54.0  
5     122  0.8050       75.0    54.0  

so that I can then drop the temp2 column.

python pandas dataframe split delimiter
2 Answers
2
votes

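You can index into the result of the split and keep only the first piece (that way you never have to deal with a temp2 column):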

df['temp'] = df['temp'].str.split('~', expand=True)[0]
print(df)

Prints:

     ID  temp    Prob
0  3009  75.0  1.0000
1   129  75.0  1.0000
2   119  75.0  0.8000
3   120  75.0  0.8056
4   121  75.0  0.9000
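
If only the part before the first ~ is needed, the intermediate DataFrame from expand=True can probably be skipped. The following is a minimal sketch rather than part of the original answer. It assumes the sample data from the question, and it assumes a numeric result is wanted, which the question does not state:

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [3009, 129, 119, 120, 121],
    'temp': ['75.0~54.0'] * 5,
    'Prob': [1, 1, 0.8, 0.8056, 0.9],
})

# Split at most once on '~' and keep the first piece of each resulting list.
df['temp'] = df['temp'].str.split('~', n=1).str[0]

# Assumption: numeric values are wanted. The pieces are strings until converted.
df['temp'] = df['temp'].astype(float)
print(df)
```

Without the astype(float) step the column still holds strings such as '75.0', which matters if it is later used in arithmetic or comparisons.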

1
votes

If you want to remove a column from the DataFrame, you can use str.split() followed by .drop():

import pandas as pd

data = {'ID': [3009, 129, 119, 120, 121],
        'temp': ['75.0~54.0', '75.0~54.0', '75.0~54.0', '75.0~54.0', '75.0~54.0'],
        'Prob': [1, 1, 0.8, 0.8056, 0.9]}
df = pd.DataFrame(data)

# Helper column holding the list of pieces for each row
df['temp~'] = df['temp'].str.split('~')
# Keep only the first piece in a new column
df['temp_1'] = df['temp~'].str.get(0)
# Drop the helper column now that it is no longer needed
df = df.drop(columns=['temp~'])
print(df)

Output:

     ID       temp    Prob temp_1
0  3009  75.0~54.0  1.0000   75.0
1   129  75.0~54.0  1.0000   75.0
2   119  75.0~54.0  0.8000   75.0
3   120  75.0~54.0  0.8056   75.0
4   121  75.0~54.0  0.9000   75.0
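
For completeness, the workflow described in the question (split into temp and temp2, then drop temp2) can be sketched as below. This is an illustration under the assumption that every value contains exactly one ~, as in the sample data. The key difference from the attempt in the question is that the target is a list of column names (double brackets), not a single tuple key:

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [3009, 129, 119, 120, 121],
    'temp': ['75.0~54.0'] * 5,
    'Prob': [1, 1, 0.8, 0.8056, 0.9],
})

# A list of column names assigns both pieces at once.
# n=1 limits the split to two pieces, matching the two target columns.
df[['temp', 'temp2']] = df['temp'].str.split('~', n=1, expand=True)

# Drop the second piece once it is no longer needed.
df = df.drop(columns=['temp2'])
print(df)
```

This leaves temp holding the string before the delimiter. If temp2 is never used, indexing the split result directly, as in the first answer, avoids creating it at all.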