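How to combine two integer columns in Python

Question    Votes: 1    Answers: 3

I want to combine the values of two integer columns with a '_' between them and set the result as the index column of my output dataset. "ID" will be my index.

Sample data:

[image: inp]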

import pandas as pd
import numpy as np
import io

data = '''
ID,Ang,1
23,0,0.88905321
23,10,0.962773412
23,20,1.004187813
23,30,1.008301223
105,0,0.334209544
105,10,0.39043363
105,20,0.434241204
105,30,0.460348427
47,0,0.020669404
47,10,0.032299446
47,20,0.050602654
47,30,0.073371391
'''
df = pd.read_csv(io.StringIO(data),index_col=0)
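
As a quick check on the sample above (a sketch, not part of the original question, using only a subset of the rows): with index_col=0 the ID column becomes the index, and both ID and Ang are parsed as integers, which is why a string conversion is needed before they can be joined with '_'.

```python
import io
import pandas as pd

# Subset of the sample data from the question.
data = '''
ID,Ang,1
23,0,0.88905321
23,10,0.962773412
105,0,0.334209544
47,30,0.073371391
'''
df = pd.read_csv(io.StringIO(data), index_col=0)

# ID is the index; Ang and 1 are the remaining columns.
print(df.index.name)
print(list(df.columns))

# Both the index and Ang are integer-typed, so they must be cast to str
# before concatenating them with '_'.
print(df.index.dtype)
print(df['Ang'].dtype)
```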

Expected output:

[image: out]

python pandas
3 Answers
3
votes

Convert the index and the column to strings and join them with _. DataFrame.pop extracts the column and removes it from the DataFrame in one step, so a separate drop is not necessary:

df.index = df.index.astype(str) + '_' + df.pop('Ang').astype(str)

Or use DataFrame.set_index:

df = df.set_index(df.index.astype(str) + '_' + df.pop('Ang').astype(str))
print (df)
               1
23_0    0.889053
23_10   0.962773
23_20   1.004188
23_30   1.008301
105_0   0.334210
105_10  0.390434
105_20  0.434241
105_30  0.460348
47_0    0.020669
47_10   0.032299
47_20   0.050603
47_30   0.073371

If you also want the index name set to ID, use DataFrame.rename_axis:

df = (df.set_index(df.index.astype(str) + '_' + df.pop('Ang').astype(str))
        .rename_axis('ID'))
print (df)
               1
ID              
23_0    0.889053
23_10   0.962773
23_20   1.004188
23_30   1.008301
105_0   0.334210
105_10  0.390434
105_20  0.434241
105_30  0.460348
47_0    0.020669
47_10   0.032299
47_20   0.050603
47_30   0.073371

For the solution that assigns df.index directly, set df.index.name:

df.index = df.index.astype(str) + '_' + df.pop('Ang').astype(str)
df.index.name = 'ID'

Edit:

If the values are floats ending in .0, first try converting them to integers:

df.index = (df.index.astype('int').astype(str) + '_' + df.pop('Ang').astype('int').astype(str))

If they cannot be converted to integers, one possible reason is missing values:

print (df)
        Ang         1
ID                   
23.0    0.0  0.889053
23.0   10.0  0.962773
23.0   20.0  1.004188
23.0   30.0  1.008301
105.0   0.0  0.334210
105.0  10.0  0.390434
105.0  20.0  0.434241
105.0  30.0  0.460348
47.0    NaN  0.020669
NaN    10.0  0.032299
47.0   20.0  0.050603
NaN     NaN  0.073371

One possible solution for pandas 0.24+ is to convert to the nullable integer dtype Int64 (integer NA):

df.index = (df.index.astype('Int64').astype(str) + '_' +
            df.pop('Ang').astype('Int64').astype(str))

print (df)
                1
23_0     0.889053
23_10    0.962773
23_20    1.004188
23_30    1.008301
105_0    0.334210
105_10   0.390434
105_20   0.434241
105_30   0.460348
47_nan   0.020669
nan_10   0.032299
47_20    0.050603
nan_nan  0.073371

Or replace the missing values with some integer, e.g. -1, and then convert all values to integers:

df.index = (df.index.fillna(-1).astype('int').astype(str) + '_' + df.pop('Ang').fillna(-1).astype('int').astype(str))
print (df)
               1
23_0    0.889053
23_10   0.962773
23_20   1.004188
23_30   1.008301
105_0   0.334210
105_10  0.390434
105_20  0.434241
105_30  0.460348
47_-1   0.020669
-1_10   0.032299
47_20   0.050603
-1_-1   0.073371

2
votes

You can do:

# this is only needed as you set index_col = 0
df = df.reset_index()

# you could keep the columns by removing the call to drop
df = df.set_index(df[['ID', 'Ang']].astype(str).apply('_'.join, axis=1)).drop(['ID', 'Ang'], axis=1)

print(df)

Output:

               1
23_0    0.889053
23_10   0.962773
23_20   1.004188
23_30   1.008301
105_0   0.334210
105_10  0.390434
105_20  0.434241
105_30  0.460348
47_0    0.020669
47_10   0.032299
47_20   0.050603
47_30   0.073371

2
votes

Let's try a list comprehension with f-strings (requires Python 3.6+).
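
A minimal sketch of the f-string list comprehension approach, assuming zip is used to pair each ID with its Ang value. The pairing via zip and the variable names (labels, id_, ang) are assumptions for illustration and are not taken from the answer; the data is a subset of the question's sample.

```python
import io
import pandas as pd

# Subset of the sample data from the question.
data = '''
ID,Ang,1
23,0,0.88905321
23,10,0.962773412
105,0,0.334209544
47,30,0.073371391
'''
df = pd.read_csv(io.StringIO(data), index_col=0)

# Pair each ID (the current index) with its Ang value and format as "ID_Ang".
# Using zip here is an assumption, not necessarily what the answer used.
labels = [f'{id_}_{ang}' for id_, ang in zip(df.index, df['Ang'])]

# Drop the now-redundant column and install the combined labels as the index.
df = df.drop(columns='Ang')
df.index = labels

print(df)
```

Because f-strings format each value individually, no astype(str) call is needed. Missing values would still appear as nan in the labels unless they are filled or converted first.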