How do I extract column names for the melt function? Python

Question  votes: 0  answers: 2

I have a dataset with the following columns.

data.columns[1:]
Index(['Fraud (i.e. fabricated or falsified results)',
       'Pressure to publish for career advancement',
       'Insufficient oversight/mentoring by lab principal investigator (e.g. reviewing raw data)',
       'Insufficient peer review of research',
       'Selective reporting of results',
       'Original findings not robust enough because not replicated enough in the lab publishing the work',
       'Original findings obtained with low statistical power/poor statistical analysis',
       'Mistakes or inadequate expertise in reproduction efforts',
       'Raw data not available from original lab',
       'Protocols, computer code or reagent information insufficient or not available from original lab',
       "Methods need 'green fingers' – particular technical expertise that is difficult for others to reproduce",
       'Variability of standard reagents', 'Poor experimental design',
       'Bad luck'],
      dtype='object')

I want to pass these columns to the melt function, so I run the following code.

data_melt = pd.melt(data, id_vars =['respid'], value_vars =['Fraud (i.e. fabricated or falsified results)',
 'Pressure to publish for career advancement',
 'Insufficient oversight/mentoring by lab principal investigator (e.g. reviewing raw data)',
 'Insufficient peer review of research',
 'Selective reporting of results',
 'Original findings not robust enough because not replicated enough in the lab publishing the work',
 'Original findings obtained with low statistical power/poor statistical analysis',
 'Mistakes or inadequate expertise in reproduction efforts',
 'Raw data not available from original lab',
 'Protocols, computer code or reagent information insufficient or not available from original lab',
 "Methods need 'green fingers' – particular technical expertise that is difficult for others to reproduce",
 'Variability of standard reagents',
 'Poor experimental design','Bad luck'],var_name = 'factor', value_name = 'rate')

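Basically, I just pasted the column names into value_vars.

My question is: can I write code that achieves the same thing without pasting the names?

For example, by writing something like the following. (I know this is wrong.)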

data_melt = pd.melt(data, id_vars =['respid'], value_vars = data.columns(), ,var_name = 'factor', value_name = 'rate')

Thanks!

python melt
2 Answers
0
votes

If data.columns[1:] holds exactly the columns you need as value_vars, just pass it as the argument:

data_melt = pd.melt(data, id_vars=['respid'], value_vars=data.columns[1:], var_name='factor', value_name='rate')
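
As a minimal sketch of this answer, the snippet below runs the same call on a small made-up frame. The two factor columns and their values are hypothetical stand-ins for the asker's data. It also shows a name-based alternative, data.columns.drop("respid"), which may be safer if 'respid' is not guaranteed to be the first column.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data: 'respid' first, then factor columns.
data = pd.DataFrame({
    "respid": [1, 2],
    "Selective reporting of results": [3, 4],
    "Bad luck": [5, 1],
})

# Positional slice: assumes 'respid' is the first column.
by_position = pd.melt(data, id_vars=["respid"], value_vars=data.columns[1:],
                      var_name="factor", value_name="rate")

# Name-based alternative: drop the id column wherever it sits.
by_name = pd.melt(data, id_vars=["respid"],
                  value_vars=data.columns.drop("respid"),
                  var_name="factor", value_name="rate")

print(by_position)
print(by_position.equals(by_name))
```

Both calls should give the same long-format frame here, with one row per (respid, factor) pair.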

0
votes

Here is one way to do it:

import pandas as pd

# Create a dummy dataframe with columns similar to yours.
df = pd.DataFrame({"respid": range(5),
                   "Fraud (i.e. fabricated or falsified results)": range(5,10), 
                   'Pressure to publish for career advancement': range(10, 15), 
                   'Insufficient oversight/mentoring by lab principal investigator (e.g. reviewing raw data)': range(15,20), 
                   'Insufficient peer review of research': range(20,25)
                  })

# Melt every column except "respid". A set has no guaranteed order,
# so the order of the variable blocks in the output can differ between runs.
pd.melt(df, id_vars=['respid'], value_vars=set(df.columns).difference(["respid"]))

The result is:

    respid                                           variable  value
0        0       Fraud (i.e. fabricated or falsified results)      5
1        1       Fraud (i.e. fabricated or falsified results)      6
2        2       Fraud (i.e. fabricated or falsified results)      7
3        3       Fraud (i.e. fabricated or falsified results)      8
4        4       Fraud (i.e. fabricated or falsified results)      9
5        0               Insufficient peer review of research     20
6        1               Insufficient peer review of research     21
7        2               Insufficient peer review of research     22
8        3               Insufficient peer review of research     23
...
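
If every column other than the id column should be melted, value_vars can probably be left out entirely: pandas documents that melt then uses all columns not listed in id_vars. The sketch below assumes a reduced dummy frame (three rows, two factor columns, values chosen only for illustration) and reuses the var_name and value_name from the question.

```python
import pandas as pd

# Reduced dummy frame; the values are illustrative only.
df = pd.DataFrame({
    "respid": range(3),
    "Fraud (i.e. fabricated or falsified results)": range(3, 6),
    "Insufficient peer review of research": range(6, 9),
})

# With value_vars omitted, melt unpivots every column not listed in id_vars,
# keeping the original column order.
melted = pd.melt(df, id_vars=["respid"], var_name="factor", value_name="rate")
print(melted)
```

Unlike a set, this keeps the factor blocks in the frame's column order, so the output is the same on every run.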