pandas DataFrame sample

Question

Does anyone know how pandas.DataFrame.sample normalizes the weights? https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sample.html

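For example, if I just pass a count as the weight for each row, does it simply compute [COUNT1 / sum_counts, COUNT2 / sum_counts, ...]? Or does it do something else, such as applying softmax? https://en.wikipedia.org/wiki/Softmax_function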

python pandas
1 Answer

According to the pandas source code for DataFrame.sample, it looks like your first guess about how the weights are normalized ([COUNT1 / sum_counts, COUNT2 / sum_counts, ...]) is correct:

# Renormalize if don't sum to 1
if weights.sum() != 1:
    if weights.sum() != 0:
        weights = weights / weights.sum()
    else:
        raise ValueError("Invalid weights: weights sum to zero")
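To see the difference between the two candidate normalizations from the question, here is a minimal sketch. The DataFrame, its column names, and the count values are made up for illustration, and the softmax is computed by hand only for comparison; as far as the source above shows, pandas does not apply it. The counts are chosen so that the proportions are exact in floating point, which lets raw counts and pre-normalized proportions be compared draw for draw under the same seed.

```python
import numpy as np
import pandas as pd

# Hypothetical data: four rows with raw counts used as sampling weights.
df = pd.DataFrame({"item": ["a", "b", "c", "d"], "count": [1, 1, 2, 4]})
counts = df["count"]

# Candidate 1: divide each weight by the sum (what the quoted source does).
proportional = counts / counts.sum()

# Candidate 2: softmax, computed manually for comparison only.
shifted = np.exp(counts - counts.max())
softmax = shifted / shifted.sum()

# Sample with raw counts and with pre-normalized proportions, same seed.
by_counts = df.sample(n=1000, replace=True, weights="count", random_state=0)
by_props = df.sample(n=1000, replace=True, weights=proportional, random_state=0)

# If pandas only divides by the sum, both calls should pick the same rows.
same_rows = by_counts.index.equals(by_props.index)

# Empirical frequencies should track the proportional weights, not softmax.
observed = by_counts["item"].value_counts(normalize=True).sort_index()

print(proportional.tolist())  # [0.125, 0.125, 0.25, 0.5]
print(softmax.round(3).tolist())
print(observed.round(3).tolist())
print(same_rows)
```

If the observed frequencies sit close to the proportional values and `same_rows` is true, that is consistent with plain sum normalization. Softmax would put far more mass on the largest count.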