Numbers in a string column stored as object, and converting them to strings in Python pandas

Question

I have a variable stored as object in a data frame. It is a string field, but some rows also contain numbers:

ID  Var1
1   abcd
2   eftg
3   -1234-
4   zxct

How can I remove the number for ID 3 and either replace it with some other letters or leave it blank? Desired output:

ID  Var1
1   abcd
2   eftg
3   
4   zxct

or

ID  Var1
1   abcd
2   eftg
3   aaaa
4   zxct

I tried to store Var1 as a string:

df['Var1'] = df['Var1'].astype(str)

But it does not work. What am I missing?

Thank you very much for your help.

python-3.x string type-conversion sklearn-pandas
1 Answer

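You can use a list comprehension to check the type of each entry in the column and replace the non-strings. To replace strings that consist only of numeric characters, I suggest using Series.str.isnumeric().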

import pandas as pd

# replace everything but strings with empty strings
df = pd.DataFrame({'Var1': ['aa', 'bb', 12, 'cc']}, index=[1, 2, 3, 4])  # create dataframe
is_no_string = [not isinstance(val, str) for val in df.Var1]  # check whether each value is not a string
df.loc[is_no_string, 'Var1'] = ''  # replace non-string values with empty strings (Var1 column only)

# replace every string consisting only of numeric characters with an empty string
df = pd.DataFrame({'Var1': ['aa', 'bb', '12', 'cc']}, index=[1, 2, 3, 4])  # create dataframe
is_numeric = df.Var1.str.isnumeric()  # check whether all characters in each string are numeric
df.loc[is_numeric, 'Var1'] = ''  # replace numeric strings with empty strings (Var1 column only)
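
Note that the Var1 value in the question, -1234-, also contains hyphens, so Series.str.isnumeric() returns False for it. Below is a minimal sketch of one way to handle that case, assuming the goal is to blank every entry that contains at least one digit. The name has_digit and the pattern \d are my own choices, not something from the question.

```python
import pandas as pd

# Data from the question; the value for ID 3 mixes digits and hyphens
df = pd.DataFrame({'ID': [1, 2, 3, 4],
                   'Var1': ['abcd', 'eftg', '-1234-', 'zxct']})

# isnumeric() does not flag '-1234-' because '-' is not a numeric character
print(df['Var1'].str.isnumeric().tolist())  # [False, False, False, False]

# Assumption: any entry containing at least one digit should be replaced
has_digit = df['Var1'].astype(str).str.contains(r'\d', regex=True)

# Restrict the assignment to Var1 so the ID column is left untouched
df.loc[has_digit, 'Var1'] = ''  # use 'aaaa' here for the second desired output
print(df)
```

If only entries made up entirely of digits and hyphens should be replaced, a stricter pattern would be needed instead of \d.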

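I think df['Var1'] = df['Var1'].astype(str) does convert the numbers to strings just fine (see the output of the code below). The column's dtype remains object either way, so to find out the actual type of the individual elements held in a DataFrame column, access those elements and call type() on them.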

df = pd.DataFrame({'Var1': ['aa', 'bb', 12, 'cc']}, index=[1, 2, 3, 4])  # create dataframe

print(df.Var1.dtype)  # columns containing strings are stored as objects
print(type(df.Var1[1]))  # those objects can contain strings and numbers
print(type(df.Var1[3]))
print(df.Var1.to_numpy())

df.Var1 = df.Var1.astype(str)
print(df.Var1.dtype)  # column is still object
print(type(df.Var1[1]))
print(type(df.Var1[3]))  # but the integer was changed to a string
print(df.Var1.to_numpy())

# Output
object
<class 'str'>
<class 'int'>
['aa' 'bb' 12 'cc']

object
<class 'str'>
<class 'str'>
['aa' 'bb' '12' 'cc']
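
To inspect the element types of a whole column at once rather than one element at a time, something like the following may help. It is only a sketch: the helper name type_counts is hypothetical and not part of pandas.

```python
import pandas as pd

def type_counts(series):
    """Count the elements of each Python type in a Series (hypothetical helper)."""
    return series.map(lambda v: type(v).__name__).value_counts().to_dict()

df = pd.DataFrame({'Var1': ['aa', 'bb', 12, 'cc']}, index=[1, 2, 3, 4])

before = type_counts(df['Var1'])  # mixed str and int elements
df['Var1'] = df['Var1'].astype(str)
after = type_counts(df['Var1'])  # every element is now a str

print(before)
print(after)
```

Comparing the two dictionaries shows whether astype(str) changed the elements, even when the reported column dtype looks the same.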