I have a very wide df that looks like this:
import pandas as pd
x = pd.DataFrame([
{"letter": "a", "keepme": "alpha", "First": 1, "Second": 2, "Third": 3, "Fourth": 4, "One-Thousandth": 1000},
{"letter": "b", "keepme": "beta", "First": 10, "Second": 20, "Third": 30, "Fourth": 40, "One-Thousandth": 10000},
{"letter": "c", "keepme": "gamma", "First": 100, "Second": 200, "Third": 300, "Fourth": 400, "One-Thousandth": 100000}
])
print(x)
  letter keepme  First  Second  Third  Fourth  One-Thousandth
0      a  alpha      1       2      3       4            1000
1      b   beta     10      20     30      40           10000
2      c  gamma    100     200    300     400          100000
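I would like to take the columns from First to One-Thousandth and reshape them so that each number gets its own row in the df, with the old column name stored as the value of a new column called "Name", like this: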
letter keepme Name           Value
a      alpha  First          1
a      alpha  Second         2
a      alpha  Third          3
a      alpha  Fourth         4
a      alpha  One-Thousandth 1000
b      beta   First          10
b      beta   Second         20
b      beta   Third          30
b      beta   Fourth         40
b      beta   One-Thousandth 10000
c      gamma  First          100
c      gamma  Second         200
c      gamma  Third          300
c      gamma  Fourth         400
c      gamma  One-Thousandth 100000
How can I do this? Speed matters a lot here, because the actual df has roughly 4k columns and about 200k rows.
Use DataFrame.melt here:
df = x.melt(id_vars=['letter', 'keepme'],
            var_name='Name',
            value_name='Value')
print(df)
   letter keepme            Name   Value
0       a  alpha           First       1
1       b   beta           First      10
2       c  gamma           First     100
3       a  alpha          Second       2
4       b   beta          Second      20
5       c  gamma          Second     200
6       a  alpha           Third       3
7       b   beta           Third      30
8       c  gamma           Third     300
9       a  alpha          Fourth       4
10      b   beta          Fourth      40
11      c  gamma          Fourth     400
12      a  alpha  One-Thousandth    1000
13      b   beta  One-Thousandth   10000
14      c  gamma  One-Thousandth  100000
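Note that melt emits the rows column by column (all of First, then all of Second, and so on), whereas the question lists them grouped by original row. If that order matters, one possible approach is to keep the original index while melting and then sort on it with a stable sort. This is only a sketch: it assumes pandas 1.1 or later for the `ignore_index` parameter, and the name `ordered` is just an illustrative choice.

```python
import pandas as pd

x = pd.DataFrame([
    {"letter": "a", "keepme": "alpha", "First": 1, "Second": 2, "Third": 3, "Fourth": 4, "One-Thousandth": 1000},
    {"letter": "b", "keepme": "beta", "First": 10, "Second": 20, "Third": 30, "Fourth": 40, "One-Thousandth": 10000},
    {"letter": "c", "keepme": "gamma", "First": 100, "Second": 200, "Third": 300, "Fourth": 400, "One-Thousandth": 100000},
])

ordered = (
    # ignore_index=False keeps each melted row labelled with its source row (0, 1, 2, 0, 1, 2, ...)
    x.melt(id_vars=["letter", "keepme"],
           var_name="Name",
           value_name="Value",
           ignore_index=False)
    # mergesort is stable, so within one source row the original column order is preserved
     .sort_index(kind="mergesort")
    # replace the repeated labels with a fresh 0..n-1 index
     .reset_index(drop=True)
)
print(ordered)
```

With 4k columns and 200k rows the melted frame has on the order of 800 million rows, so the extra sort is not free. If the row order does not matter for what comes next, plain melt is the cheaper option.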