I have a dataframe that looks like this:
8964_real 8964_imag 8965_real 8965_imag 8966_real 8966_imag 8967_real ... 8984_imag 8985_real 8985_imag 8986_real 8986_imag 8987_real 8987_imag
0 112.653120 0.000000 117.104887 0.000000 127.593406 0.000000 129.522106 ... 0.000000 125.423552 0.000000 127.888477 0.000000 136.160979 0.000000
1 -0.315831 16.363974 -2.083329 22.443628 -2.166950 15.026253 0.110502 ... -26.613220 8.454297 -35.000742 11.871405 -24.914035 7.448329 -16.370041
2 -1.863497 10.672129 -6.152232 15.980813 -5.679352 18.976117 -5.775777 ... -11.131600 -18.990022 -9.520732 -11.947319 -4.641286 -17.104710 -5.691642
3 -6.749938 14.870590 -12.222749 15.012352 -10.501423 9.345518 -9.103459 ... -2.860546 -29.862724 -5.237663 -28.791194 -5.685985 -24.565608 -10.385683
4 -2.991405 -10.332938 -4.097638 -10.204587 -12.056221 -5.684882 -12.861357 ... 0.821902 -8.787235 -1.521650 -3.798446 -2.390519 -6.527762 -1.145998
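I need to transform the dataframe above so that the values in the "_real" columns end up under one column and the values in the "_imag" columns end up under another. In the end there should always be exactly two columns, one for real and the other for imag. What is the most efficient way to do this?
I referred to this link, but it only handles a single column and I need two. Another idea I had was to use a regex to select the columns containing "real" and proceed as described in the link above (and likewise for imag), but that feels a bit off.
Any help is appreciated.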
EDIT: For example, real should look like this:
real
112.653120
-0.315831
-1.863497
-6.749938
-2.991405
---------
117.104887
-2.083329
-6.152232
-12.222749
-4.097638
---------
127.593406
-2.166950
-5.679352
-10.501423
-12.056221
I added the dashed lines only for clarity.
Create a MultiIndex in the columns with str.split, so the frame can then be reshaped with DataFrame.stack:
df.columns = df.columns.str.split('_', expand=True)
print (df.head(10))
8964 8965 8966 \
real imag real imag real imag
0 112.653120 0.000000 117.104887 0.000000 127.593406 0.000000
1 -0.315831 16.363974 -2.083329 22.443628 -2.166950 15.026253
2 -1.863497 10.672129 -6.152232 15.980813 -5.679352 18.976117
3 -6.749938 14.870590 -12.222749 15.012352 -10.501423 9.345518
4 -2.991405 -10.332938 -4.097638 -10.204587 -12.056221 -5.684882
8967 8984 8985 8986 \
real imag real imag real imag
0 129.522106 0.000000 125.423552 0.000000 127.888477 0.000000
1 0.110502 -26.613220 8.454297 -35.000742 11.871405 -24.914035
2 -5.775777 -11.131600 -18.990022 -9.520732 -11.947319 -4.641286
3 -9.103459 -2.860546 -29.862724 -5.237663 -28.791194 -5.685985
4 -12.861357 0.821902 -8.787235 -1.521650 -3.798446 -2.390519
8987
real imag
0 136.160979 0.000000
1 7.448329 -16.370041
2 -17.104710 -5.691642
3 -24.565608 -10.385683
4 -6.527762 -1.145998
EDIT:
df = df.stack(0).reset_index(level=0, drop=True).rename_axis('a').reset_index()
print (df.head(10))
a imag real
0 8964 0.000000 112.653120
1 8965 0.000000 117.104887
2 8966 0.000000 127.593406
3 8967 NaN 129.522106
4 8984 0.000000 NaN
5 8985 0.000000 125.423552
6 8986 0.000000 127.888477
7 8987 0.000000 136.160979
8 8964 16.363974 -0.315831
9 8965 22.443628 -2.083329
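Note that stacking lists every id for row 0, then every id for row 1, and so on, whereas the example in the question lists all rows of 8964 first, then all rows of 8965. If that block-per-id order matters, a minimal self-contained sketch could look like the following. The three-row, two-id frame is a hypothetical miniature of the data in the question, and the names `long`, `result`, and `row` are only illustrative:

```python
import pandas as pd

# Hypothetical miniature of the frame in the question (two ids, three rows).
df = pd.DataFrame({
    "8964_real": [112.653120, -0.315831, -1.863497],
    "8964_imag": [0.0, 16.363974, 10.672129],
    "8965_real": [117.104887, -2.083329, -6.152232],
    "8965_imag": [0.0, 22.443628, 15.980813],
})

# Split "8964_real" into ("8964", "real"): the columns become a two-level MultiIndex.
df.columns = df.columns.str.split("_", expand=True)

# Move the first column level (the id) into the row index.
long = df.stack(0)
long.index.names = ["row", "a"]

# Sort by id first, then by original row number, to get one block per id.
# The ids are strings here, so the sort is lexicographic; that matches numeric
# order only because all ids have the same number of digits.
result = (
    long.reset_index()
        .sort_values(["a", "row"])
        .reset_index(drop=True)[["a", "real", "imag"]]
)
print(result)
```

Selecting the columns by name at the end keeps the output independent of the order in which stack happens to return real and imag. Drop the a column if only the two value columns are needed.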