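I would like to bin the numbers of a Pandas DataFrame column into the degrees of a logarithmic scale, as efficiently as possible. By logarithmic scale, I mean a way of expressing numbers in ranges of `power(unit, degree)`: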
degree | power(10, degree) | power(2, degree) |
---|---|---|
-2 | 0.01 | 0.25 |
-1 | 0.1 | 0.5 |
0 | 1 | 1 |
1 | 10 | 2 |
2 | 100 | 4 |
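To make the scale concrete, the boundaries can be recomputed with Python's built-in `pow`. This is only an illustrative sketch; the names `degrees` and `rows` are my own and not part of any library.

```python
# Recompute power(unit, degree) for units 10 and 2 over a few degrees
degrees = range(-2, 3)
rows = [(degree, pow(10, degree), pow(2, degree)) for degree in degrees]

for degree, power_10, power_2 in rows:
    print(degree, power_10, power_2)
```

Each value is the upper bound of the class of the same degree and the lower bound of the next one.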
Given the following toy dataset:
import pandas

df = pandas.DataFrame(data={'number': [0.478, 897.12, 12.56, 8.89, 1578.45, 0.089, 0.007]})
I would like to compute the column `log_10_class` easily:
number | log_10_class |
---|---|
0.478 | 0 |
897.12 | 3 |
12.56 | 2 |
8.89 | 1 |
1578.45 | 4 |
0.089 | -1 |
0.007 | -2 |
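Read as a formula, the expected column appears to be the ceiling of the base-10 logarithm of each number. A minimal check of that reading on the toy values, using only the standard library (the names `numbers`, `expected` and `computed` are illustrative, and numbers below the smallest class would still need clamping):

```python
import math

numbers = [0.478, 897.12, 12.56, 8.89, 1578.45, 0.089, 0.007]
expected = [0, 3, 2, 1, 4, -1, -2]

# Smallest degree d such that number <= 10**d
computed = [math.ceil(math.log10(number)) for number in numbers]

print(computed == expected)
```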
def bin_log_scale(
    numb_sr: pandas.Series,
    log_unit: float,
    min_pow_deg: int,
    max_pow_deg: int
) -> pandas.Series:
    """Bin numbers into the degrees of a logarithmic scale.

    Parameters:
    - numb_sr : Pandas Series of numbers
    - log_unit : logarithmic scale unit
    - min_pow_deg : minimal power degree
    - max_pow_deg : maximal power degree
    Returns:
    - a Pandas Series giving the degree of the logarithmic scale for each number
    """
    # Initialise every element with the minimum class
    class_sr = pandas.Series(
        data=min_pow_deg,
        index=numb_sr.index
    )
    # Increment the class each time a number exceeds the next lower bound
    for pow_deg in range(min_pow_deg, max_pow_deg):
        class_sr = class_sr + 1 * (numb_sr > pow(log_unit, pow_deg))
    return class_sr

# Create the new column with the log-10 scale class
df['log_10_class'] = bin_log_scale(
    numb_sr=df['number'],
    log_unit=10,
    min_pow_deg=-2,
    max_pow_deg=4
)
df
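One possible way to avoid the Python-level loop is to let `numpy.digitize` count, for every number at once, how many lower bounds it exceeds. The sketch below is only one option under stated assumptions: `bin_log_scale_vectorized` is a hypothetical name, `log_unit` is assumed to be greater than 1 (so the bounds are increasing), and the column is assumed to hold no missing values.

```python
import numpy
import pandas


def bin_log_scale_vectorized(numb_sr, log_unit, min_pow_deg, max_pow_deg):
    """Hypothetical vectorized variant taking the same arguments as bin_log_scale."""
    # Lower bounds log_unit**min_pow_deg ... log_unit**(max_pow_deg - 1), increasing
    bounds = [pow(log_unit, pow_deg) for pow_deg in range(min_pow_deg, max_pow_deg)]
    # With right=True, digitize returns the count of bounds strictly below each number
    counts = numpy.digitize(numb_sr.to_numpy(), bounds, right=True)
    return pandas.Series(min_pow_deg + counts, index=numb_sr.index)


df = pandas.DataFrame(data={'number': [0.478, 897.12, 12.56, 8.89, 1578.45, 0.089, 0.007]})
df['log_10_class'] = bin_log_scale_vectorized(df['number'], 10, -2, 4)
print(df)
```

Because it compares against the same bounds with the same strict inequality, this should give the same classes as the loop, including for numbers that fall exactly on a power of the unit.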