Convert DataFrame values to range values in Python 3

Question  Votes: 1  Answers: 3

I have a DataFrame containing the following values:

3.05
35.97
49.11
48.80
48.02
10.61
25.69
6.02 
55.36
0.42
47.87
2.26
54.43
8.85 
8.75
14.29
41.29
35.69
44.27
1.08

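I want to convert the values into ranges and assign each value a new value based on its range. From df we know the minimum is 0.42 and the maximum is 55.36. I want to split the span from the minimum to the maximum into 4 groups: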

0.42  - 14.15 transform to 1 
14.16 - 27.88 transform to 2
27.89 - 41.61 transform to 3
41.62 - 55.36 transform to 4

So my expected result is:

1
3
4
4
4
1
2
1
4
1
4
1
4
1
1
2
3
3
4
1
python python-3.x dataframe transform transformation
3 Answers
1
vote

This is usually called binning; in pandas it is done with the cut function. Example code:

import pandas as pd

# The values from the question; they go into a column called "nums"
nums = [3.05, 35.97, 49.11, 48.80, 48.02, 10.61, 25.69, 6.02, 55.36, 0.42, 47.87, 2.26, 54.43, 8.85, 8.75, 14.29, 41.29, 35.69, 44.27, 1.08]

# Create the labels for the bins
bin_labels = [1, 2, 3, 4]

# Create the DataFrame from a dict mapping the column name to the values
df = pd.DataFrame({'nums': nums})

# Define the bin edges. pd.cut uses right-closed intervals by default, so these
# edges give (0.41, 14.15], (14.15, 27.88], (27.88, 41.61], (41.61, 55.36]
bins = [0.41, 14.15, 27.88, 41.61, 55.36]

# Create the "bins" column with pd.cut, using the edges and labels above
df['bins'] = pd.cut(df['nums'], bins=bins, labels=bin_labels)

This creates a DataFrame with the following structure:

print(df)

     nums bins
0    3.05    1
1   35.97    3
2   49.11    4
3   48.80    4
4   48.02    4
5   10.61    1
6   25.69    2
7    6.02    1
8   55.36    4
9    0.42    1
10  47.87    4
11   2.26    1
12  54.43    4
13   8.85    1
14   8.75    1
15  14.29    2
16  41.29    3
17  35.69    3
18  44.27    4
19   1.08    1
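
If the four groups are simply meant to be equal-width slices between the minimum and the maximum, pd.cut can also work out the edges itself when bins is given as an integer. The following is a minimal sketch under that assumption, not necessarily what the asker needs: the edges pandas computes (roughly 14.155, 27.89 and 41.625) differ slightly from the hand-written ranges in the question, so a value that falls exactly on a boundary could land in a different group. pandas slightly extends the range at the low end so that the minimum value is still included.

```python
import pandas as pd

# The values from the question
nums = [3.05, 35.97, 49.11, 48.80, 48.02, 10.61, 25.69, 6.02, 55.36, 0.42,
        47.87, 2.26, 54.43, 8.85, 8.75, 14.29, 41.29, 35.69, 44.27, 1.08]
df = pd.DataFrame({'nums': nums})

# bins=4 asks pandas for 4 equal-width bins spanning min(nums)..max(nums)
df['bins'] = pd.cut(df['nums'], bins=4, labels=[1, 2, 3, 4])

# The result is a categorical column; convert to plain ints for display
result = [int(b) for b in df['bins']]
print(result)
```

For this particular data the groups come out the same as with the explicit edges. Passing retbins=True to pd.cut additionally returns the computed edges, which is handy for checking them.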

0
votes

You can build a function like the one below to take full control of the process:

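def transform(values):
    # Map each value to a group number from 1 to 4 (None if outside 0.42-55.36)
    groups = []
    for value in values:
        if value < 0.42 or value > 55.36:
            groups.append(None)
        elif value <= 14.15:
            groups.append(1)
        elif value <= 27.88:
            groups.append(2)
        elif value <= 41.61:
            groups.append(3)
        else:
            groups.append(4)
    return groups

df['nums'] = transform(df['nums'])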
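
The same lookup can be written without a chain of comparisons by using the standard-library bisect module. This is only a sketch: the names UPPER_BOUNDS, LOWER_LIMIT and to_group are made up for illustration, and the bounds are copied from the ranges in the question.

```python
from bisect import bisect_left

# Upper bound of each group, copied from the ranges in the question
UPPER_BOUNDS = [14.15, 27.88, 41.61, 55.36]
LOWER_LIMIT = 0.42

def to_group(value):
    """Return the 1-based group number for value, or None if it is out of range."""
    if value < LOWER_LIMIT or value > UPPER_BOUNDS[-1]:
        return None
    # bisect_left finds the first upper bound that is >= value
    return bisect_left(UPPER_BOUNDS, value) + 1

values = [3.05, 35.97, 49.11, 48.80, 48.02, 10.61, 25.69, 6.02, 55.36, 0.42,
          47.87, 2.26, 54.43, 8.85, 8.75, 14.29, 41.29, 35.69, 44.27, 1.08]
groups = [to_group(v) for v in values]
print(groups)
```

Adding or moving a group then only means editing the list of bounds. With a DataFrame, something like df['nums'].map(to_group) would apply it to a column.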
