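If I use the code below, it keeps the column that contains NaN (see the attached image). I have other columns like this. Is it possible to keep the second duplicate instead of the first?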
data_final2 = data_final.loc[:, ~data_final.columns.duplicated()]
If you only need to fix this specific case, and you know the columns you want contain no NaNs:
data_final2 = data_final.dropna(axis=1)
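Note that dropna(axis=1) removes every column containing at least one NaN. If the columns you want might have a few missing values of their own, how="all" drops only the columns that are entirely NaN. A minimal sketch, using a made-up two-column frame (the data here is hypothetical, not taken from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical data: the wanted "Site" column has one missing value of its own
df = pd.DataFrame(
    [[np.nan, "A"], [np.nan, None], [np.nan, "C"]],
    columns=["Site", "Site"],
)

any_nan = df.dropna(axis=1)             # default how="any": drops both columns
all_nan = df.dropna(axis=1, how="all")  # drops only the fully empty column
```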
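Alternatively, rename the columns so that every name is unique, then select the ones you want (this assumes data_final has exactly these four columns, in this order):
data_final.columns = ['Site_nan', 'Site', 'Dimensions_nan', 'Dimensions']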
data_final2 = data_final[['Site', 'Dimensions']].copy()
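A more general option is to groupby the column names and take the first value in each group. first() skips nulls, so the non-NaN value is kept wherever one exists.
df.groupby(df.columns, axis=1).first()
Passing axis=1 to groupby is deprecated as of pandas 2.1. The transposed form df.T.groupby(level=0).first().T gives the same values, though the result columns may come back with object dtype.
A complete example: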
import pandas as pd
import numpy as np
df = pd.DataFrame({'0': [1, 2, 3], '1': [np.nan] * 3, '2': [np.nan] * 3, '3': ['1x1', '2x2', '3x3']})
df.columns = ['Size', 'Size', 'Dims', 'Dims']
# Size Size Dims Dims
#0 1 NaN NaN 1x1
#1 2 NaN NaN 2x2
#2 3 NaN NaN 3x3
# Group the columns by name and take the first non-null value in each row
df.groupby(df.columns, axis=1).first()
# Dims Size
#0 1x1 1
#1 2x2 2
#2 3x3 3
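To answer the original question literally (keep the second duplicate rather than the first), Index.duplicated accepts keep="last". A minimal sketch, assuming the populated copy of each column always comes after the NaN copy. The frame below is a hypothetical stand-in for data_final:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for data_final: each name appears twice,
# with the all-NaN copy first and the populated copy second.
data_final = pd.DataFrame(
    [[np.nan, "A", np.nan, "1x1"],
     [np.nan, "B", np.nan, "2x2"]],
    columns=["Site", "Site", "Dimensions", "Dimensions"],
)

# keep="last" flags every occurrence except the last one as a duplicate
mask = data_final.columns.duplicated(keep="last")
data_final2 = data_final.loc[:, ~mask]
```

This selects purely by position, so it picks the wrong copy if the NaN column happens to come second. The groupby approach above does not depend on column order.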