Normalizing with stats.boxcox - ValueError: Data must be 1-dimensional

Question

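The column whose values I am trying to normalize is "Precip" (precipitation, in mm).

It seems that I am passing an ndarray to stats.boxcox() as the argument, but I am confused about how the DataFrame I am using turned into an ndarray.

The dtype of Precip is object, and it contains the string value "T". So I did some preprocessing to replace T with the minimum value before normalizing. Here is the code: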

weatherww2_data = pd.read_csv("../input/weatherww2/Summary of Weather.csv")
precip_data = pd.DataFrame(weatherww2_data.Precip)

# select rows whose column value is not T or 0
except_list = ['T', '0', 0]
excluded_precip_data = precip_data[~precip_data['Precip'].isin(except_list)]

# get min
min_precip_except_zero = excluded_precip_data.min(axis=0)

# replace T with minimum value
replaced_precip_data = precip_data.replace(to_replace='T', value=min_precip_except_zero)

Here is the normalization code:

normalized_precip = pd.Series(stats.boxcox(replaced_precip_data)[0], name='Precip', index=replaced_precip_data.index)

This line raises the following error:

/opt/conda/lib/python3.7/site-packages/scipy/stats/morestats.py in boxcox(x, lmbda, alpha, optimizer)
   1053     x = np.asarray(x)
   1054     if x.ndim != 1:
-> 1055         raise ValueError("Data must be 1-dimensional.")
   1056 
   1057     if x.size == 0:

ValueError: Data must be 1-dimensional.

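Any idea what the problem might be?

I checked the shape of replaced_precip_data['Precip']. Isn't it "1-dimensional"? I think I am misunderstanding the concept of array dimensions.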

print(replaced_precip_data['Precip'].shape)

output> (119040,)

Answer 1

OK, this is how I did it:

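import pandas as pd
from scipy import stats

df = pd.read_csv("Pakistan_Largest_Ecommerce_Dataset.csv")
price = df['price']  # a single column is a 1-D Series; replace 'price' with your column name
# boxcox returns (transformed values, fitted lambda); the values must be positive
price_normalized, fitted_lambda = stats.boxcox(price)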

That's it.
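To connect this back to the question: the shape check there was done on replaced_precip_data['Precip'] (a Series), but the object actually passed to stats.boxcox() was replaced_precip_data, a one-column DataFrame. np.asarray() turns that into a 2-D array of shape (n, 1), which is what triggers the error. Below is a minimal sketch of the whole flow. The sample values are made up to stand in for the real CSV, and replacing zeros with the smallest positive value is my assumption (the question only replaces 'T'), added because Box-Cox requires strictly positive input.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Made-up sample standing in for the Precip column ('T' marks trace precipitation).
precip_data = pd.DataFrame(
    {"Precip": ["1.016", "0", "T", "2.54", "T", "0.508", "3.556"]}
)

# A one-column DataFrame is still 2-D; selecting the column gives a 1-D Series.
print(np.asarray(precip_data).ndim)            # 2
print(np.asarray(precip_data["Precip"]).ndim)  # 1

# Remember where the 'T' entries are, then convert the strings to numbers.
is_trace = precip_data["Precip"] == "T"
numeric = pd.to_numeric(precip_data["Precip"], errors="coerce")

# Smallest strictly positive value (0.508 for this sample).
min_positive = numeric[numeric > 0].min()

# Replace 'T' with that minimum. Zeros are replaced too (assumption, see above),
# since Box-Cox is only defined for strictly positive data.
positive = numeric.mask(is_trace | (numeric == 0), min_positive)

# Pass 1-D data; boxcox returns the transformed values and the fitted lambda.
transformed, fitted_lambda = stats.boxcox(positive.to_numpy())
normalized_precip = pd.Series(transformed, name="Precip", index=positive.index)
```

In the original code, the same effect should be achievable by passing replaced_precip_data['Precip'] instead of replaced_precip_data, once that column has been converted to a numeric dtype.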
