Naturally sorting a DataFrame column in pandas [duplicate]

Question

This question already has answers here:

Naturally sorting Pandas DataFrame (2 answers)

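I want to apply natural sort order to a column in a pandas DataFrame. The column I want to sort may contain duplicates. I have seen the related question Naturally sorting Pandas DataFrame, but it is about sorting the index, not an arbitrary column.

Example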

import pandas as pd

df = pd.DataFrame({'a': ['a22', 'a20', 'a1', 'a10', 'a3', 'a1', 'a11'], 'b': ['b5', 'b2', 'b11', 'b22', 'b4', 'b1', 'b12']})

     a    b
0  a22   b5
1  a20   b2
2   a1  b11
3  a10  b22
4   a3   b4
5   a1   b1
6  a11  b12

Naturally sorted by column a:

     a    b
0   a1  b11
1   a1   b1
2   a3   b4
3  a10  b22
4  a11  b12
5  a20   b2
6  a22   b5

Naturally sorted by column b:

     a    b
0   a1   b1
1  a20   b2
2   a3   b4 
3  a22   b5
4   a1  b11
5  a11  b12
6  a10  b22

Answer 1

You can convert the values to an ordered categorical whose categories are naturally sorted with natsorted, and then use sort_values:

Answer 2

import natsort as ns

# Make 'a' an ordered categorical whose category order is the natural sort order
df['a'] = pd.Categorical(df['a'], ordered=True, categories=ns.natsorted(df['a'].unique()))
df = df.sort_values('a')
print(df)
     a    b
5   a1   b1
2   a1  b11
4   a3   b4
3  a10  b22
6  a11  b12
1  a20   b2
0  a22   b5

and

# Same idea for column 'b'
df['b'] = pd.Categorical(df['b'], ordered=True, categories=ns.natsorted(df['b'].unique()))

df = df.sort_values('b')
print(df)
     a    b
5   a1   b1
1  a20   b2
4   a3   b4
0  a22   b5
2   a1  b11
6  a11  b12
3  a10  b22
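
The categorical approach above changes the column's dtype to category and does not guarantee that tied values (the two a1 rows) keep their original order. As a minimal sketch of an alternative, not part of the original answer, the rows can be reordered with a hand-written natural key and Python's stable `sorted`. The names `natural_key` and `sort_naturally` are hypothetical helpers introduced only for illustration, and the sketch assumes every value in the column is a string.

```python
import re

import pandas as pd


def natural_key(value):
    # Split 'a10' into ['a', '10', ''] and turn the digit runs into ints,
    # so that 'a3' sorts before 'a10'. Even positions are always text and
    # odd positions are always numbers, so tuples compare without type errors.
    return tuple(int(tok) if tok.isdigit() else tok
                 for tok in re.split(r'(\d+)', value))


def sort_naturally(df, col):
    # Compute row positions in natural order; sorted() is stable, so rows
    # with equal keys keep their original relative order.
    values = df[col].tolist()
    order = sorted(range(len(values)), key=lambda i: natural_key(values[i]))
    # Reorder by position, leaving the column dtype untouched.
    return df.iloc[order]


df = pd.DataFrame({'a': ['a22', 'a20', 'a1', 'a10', 'a3', 'a1', 'a11'],
                   'b': ['b5', 'b2', 'b11', 'b22', 'b4', 'b1', 'b12']})
sorted_by_a = sort_naturally(df, 'a')
print(sorted_by_a)
```

Calling `sorted_by_a.reset_index(drop=True)` afterwards would give the 0..6 index shown in the question.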

Answer 3

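We can use a regular expression to extract the text and integer parts of the column, and then sort by them. Wrapping this in a function lets you easily do it for each column separately: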

df.sort_values(by=['a'])

Prints:

df.sort_values(by=['b'])
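
The function this answer describes did not survive on this page, so the following is only a sketch of the idea, not the answerer's original code. It assumes every value consists of a non-digit prefix followed by digits (as in the example data); values that do not match would produce missing numbers and make the integer conversion fail. The name `natural_sort` and the key column names are hypothetical.

```python
import pandas as pd


def natural_sort(df, col):
    # Extract the text prefix (group 0) and the trailing digits (group 1).
    parts = df[col].str.extract(r'^(\D*)(\d+)$')
    # Build a frame of sort keys with a default 0..n-1 index, so that the
    # index of the sorted keys gives row positions in the original frame.
    keys = pd.DataFrame({'text': parts[0].to_numpy(),
                         'number': parts[1].astype(int).to_numpy()})
    order = keys.sort_values(['text', 'number']).index.to_numpy()
    # Reorder the original rows by position; df itself is not modified.
    return df.iloc[order]


df = pd.DataFrame({'a': ['a22', 'a20', 'a1', 'a10', 'a3', 'a1', 'a11'],
                   'b': ['b5', 'b2', 'b11', 'b22', 'b4', 'b1', 'b12']})
sorted_by_a = natural_sort(df, 'a')
sorted_by_b = natural_sort(df, 'b')
print(sorted_by_b)
```

Sorting on the extracted integer rather than on the string is what places a3 before a10.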
