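Here is the problem I ran into:
I want to plot some energy bands obtained with Quantum Espresso. The data are in a file with two columns. The rows are split into blocks separated by blank lines, and each block corresponds to one band.
Here is an example of the first two blocks: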
0.0000 -44.2709
0.0250 -44.2709
0.0500 -44.2709
0.0750 -44.2708
0.1000 -44.2708
0.1250 -44.2707
0.1500 -44.2706
0.1750 -44.2705
0.2000 -44.2703
0.2250 -44.2702
0.2500 -44.2701
0.2750 -44.2700
0.3000 -44.2698
0.3250 -44.2697
0.3500 -44.2696
0.3750 -44.2695
0.4000 -44.2694
0.4250 -44.2694
0.4500 -44.2693
0.4750 -44.2693
0.5000 -44.2693
0.5250 -44.2693
0.5500 -44.2692
0.5750 -44.2692
0.6000 -44.2691
0.6250 -44.2690
0.6500 -44.2689
0.6750 -44.2688
0.7000 -44.2687
0.7250 -44.2686
0.7500 -44.2685
0.7750 -44.2683
0.8000 -44.2682
0.8250 -44.2681
0.8500 -44.2680
0.8750 -44.2679
0.9000 -44.2678
0.9250 -44.2678
0.9500 -44.2677
0.9750 -44.2677
1.0000 -44.2677
1.0354 -44.2677
1.0707 -44.2677
1.1061 -44.2678
1.1414 -44.2680
1.1768 -44.2681
1.2121 -44.2683
1.2475 -44.2686
1.2828 -44.2688
1.3182 -44.2690
1.3536 -44.2693
1.3889 -44.2695
1.4243 -44.2698
1.4596 -44.2700
1.4950 -44.2702
1.5303 -44.2704
1.5657 -44.2706
1.6010 -44.2707
1.6364 -44.2708
1.6718 -44.2709
1.7071 -44.2709
1.7504 -44.2709
1.7937 -44.2708
1.8370 -44.2706
1.8803 -44.2704
1.9236 -44.2702
1.9669 -44.2699
2.0102 -44.2696
2.0535 -44.2692
2.0968 -44.2689
2.1401 -44.2685
2.1834 -44.2681
2.2267 -44.2677
2.2700 -44.2674
2.3133 -44.2671
2.3566 -44.2668
2.3999 -44.2665
2.4432 -44.2663
2.4865 -44.2662
2.5298 -44.2661
2.5731 -44.2661
2.6085 -44.2661
2.6438 -44.2661
2.6792 -44.2662
2.7146 -44.2664
2.7499 -44.2665
2.7853 -44.2667
2.8206 -44.2669
2.8560 -44.2672
2.8913 -44.2674
2.9267 -44.2677
2.9620 -44.2679
2.9974 -44.2682
3.0328 -44.2684
3.0681 -44.2686
3.1035 -44.2688
3.1388 -44.2690
3.1742 -44.2691
3.2095 -44.2692
3.2449 -44.2693
3.2802 -44.2693
3.2802 -44.2677
3.3052 -44.2677
3.3302 -44.2676
3.3552 -44.2676
3.3802 -44.2675
3.4052 -44.2674
3.4302 -44.2673
3.4552 -44.2672
3.4802 -44.2671
3.5052 -44.2670
3.5302 -44.2669
3.5552 -44.2667
3.5802 -44.2666
3.6052 -44.2665
3.6302 -44.2664
3.6552 -44.2663
3.6802 -44.2662
3.7052 -44.2662
3.7302 -44.2661
3.7552 -44.2661
3.7802 -44.2661
0.0000 -20.8317
0.0250 -20.8322
0.0500 -20.8338
0.0750 -20.8364
0.1000 -20.8400
0.1250 -20.8445
0.1500 -20.8497
0.1750 -20.8555
0.2000 -20.8618
0.2250 -20.8684
0.2500 -20.8751
0.2750 -20.8819
0.3000 -20.8884
0.3250 -20.8947
0.3500 -20.9004
0.3750 -20.9055
0.4000 -20.9098
0.4250 -20.9133
0.4500 -20.9159
0.4750 -20.9174
0.5000 -20.9179
0.5250 -20.9179
0.5500 -20.9178
0.5750 -20.9175
0.6000 -20.9172
0.6250 -20.9169
0.6500 -20.9164
0.6750 -20.9159
0.7000 -20.9154
0.7250 -20.9149
0.7500 -20.9143
0.7750 -20.9137
0.8000 -20.9132
0.8250 -20.9126
0.8500 -20.9122
0.8750 -20.9117
0.9000 -20.9113
0.9250 -20.9110
0.9500 -20.9108
0.9750 -20.9107
1.0000 -20.9106
1.0354 -20.9102
1.0707 -20.9089
1.1061 -20.9068
1.1414 -20.9039
1.1768 -20.9003
1.2121 -20.8959
1.2475 -20.8910
1.2828 -20.8855
1.3182 -20.8797
1.3536 -20.8736
1.3889 -20.8673
1.4243 -20.8611
1.4596 -20.8551
1.4950 -20.8495
1.5303 -20.8444
1.5657 -20.8400
1.6010 -20.8365
1.6364 -20.8338
1.6718 -20.8322
1.7071 -20.8317
1.7504 -20.8322
1.7937 -20.8338
1.8370 -20.8365
1.8803 -20.8400
1.9236 -20.8443
1.9669 -20.8492
2.0102 -20.8545
2.0535 -20.8601
2.0968 -20.8659
2.1401 -20.8716
2.1834 -20.8772
2.2267 -20.8826
2.2700 -20.8876
2.3133 -20.8922
2.3566 -20.8962
2.3999 -20.8997
2.4432 -20.9025
2.4865 -20.9045
2.5298 -20.9058
2.5731 -20.9062
2.6085 -20.9063
2.6438 -20.9064
2.6792 -20.9067
2.7146 -20.9071
2.7499 -20.9076
2.7853 -20.9082
2.8206 -20.9089
2.8560 -20.9096
2.8913 -20.9105
2.9267 -20.9114
2.9620 -20.9123
2.9974 -20.9132
3.0328 -20.9142
3.0681 -20.9151
3.1035 -20.9159
3.1388 -20.9166
3.1742 -20.9171
3.2095 -20.9176
3.2449 -20.9178
3.2802 -20.9179
3.2802 -20.9106
3.3052 -20.9106
3.3302 -20.9105
3.3552 -20.9104
3.3802 -20.9102
3.4052 -20.9100
3.4302 -20.9097
3.4552 -20.9094
3.4802 -20.9091
3.5052 -20.9088
3.5302 -20.9084
3.5552 -20.9081
3.5802 -20.9078
3.6052 -20.9074
3.6302 -20.9071
3.6552 -20.9069
3.6802 -20.9066
3.7052 -20.9065
3.7302 -20.9063
3.7552 -20.9063
3.7802 -20.9062
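As you may notice, the first column repeats the same data in every block, and only the second column differs. All I want is to keep the first column from the first block and turn the second column of each block into a separate column. Like this: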
0.0000 -44.2709 -20.8317
0.0250 -44.2709 -20.8322
0.0500 -44.2709 -20.8338
0.0750 -44.2708 -20.8364
0.1000 -44.2708 -20.8400
0.1250 -44.2707 -20.8445
0.1500 -44.2706 -20.8497
0.1750 -44.2705 -20.8555
0.2000 -44.2703 -20.8618
0.2250 -44.2702 -20.8684
0.2500 -44.2701 -20.8751
0.2750 -44.2700 -20.8819
0.3000 -44.2698 -20.8884
0.3250 -44.2697 -20.8947
0.3500 -44.2696 -20.9004
0.3750 -44.2695 -20.9055
0.4000 -44.2694 -20.9098
0.4250 -44.2694 -20.9133
0.4500 -44.2693 -20.9159
0.4750 -44.2693 -20.9174
0.5000 -44.2693 -20.9179
0.5250 -44.2693 -20.9179
0.5500 -44.2692 -20.9178
0.5750 -44.2692 -20.9175
0.6000 -44.2691 -20.9172
0.6250 -44.2690 -20.9169
0.6500 -44.2689 -20.9164
0.6750 -44.2688 -20.9159
0.7000 -44.2687 -20.9154
0.7250 -44.2686 -20.9149
0.7500 -44.2685 -20.9143
0.7750 -44.2683 -20.9137
0.8000 -44.2682 -20.9132
0.8250 -44.2681 -20.9126
0.8500 -44.2680 -20.9122
0.8750 -44.2679 -20.9117
0.9000 -44.2678 -20.9113
0.9250 -44.2678 -20.9110
0.9500 -44.2677 -20.9108
0.9750 -44.2677 -20.9107
1.0000 -44.2677 -20.9106
1.0354 -44.2677 -20.9102
1.0707 -44.2677 -20.9089
1.1061 -44.2678 -20.9068
1.1414 -44.2680 -20.9039
1.1768 -44.2681 -20.9003
1.2121 -44.2683 -20.8959
1.2475 -44.2686 -20.8910
1.2828 -44.2688 -20.8855
1.3182 -44.2690 -20.8797
1.3536 -44.2693 -20.8736
1.3889 -44.2695 -20.8673
1.4243 -44.2698 -20.8611
1.4596 -44.2700 -20.8551
1.4950 -44.2702 -20.8495
1.5303 -44.2704 -20.8444
1.5657 -44.2706 -20.8400
1.6010 -44.2707 -20.8365
1.6364 -44.2708 -20.8338
1.6718 -44.2709 -20.8322
1.7071 -44.2709 -20.8317
1.7504 -44.2709 -20.8322
1.7937 -44.2708 -20.8338
1.8370 -44.2706 -20.8365
1.8803 -44.2704 -20.8400
1.9236 -44.2702 -20.8443
1.9669 -44.2699 -20.8492
2.0102 -44.2696 -20.8545
2.0535 -44.2692 -20.8601
2.0968 -44.2689 -20.8659
2.1401 -44.2685 -20.8716
2.1834 -44.2681 -20.8772
2.2267 -44.2677 -20.8826
2.2700 -44.2674 -20.8876
2.3133 -44.2671 -20.8922
2.3566 -44.2668 -20.8962
2.3999 -44.2665 -20.8997
2.4432 -44.2663 -20.9025
2.4865 -44.2662 -20.9045
2.5298 -44.2661 -20.9058
2.5731 -44.2661 -20.9062
2.6085 -44.2661 -20.9063
2.6438 -44.2661 -20.9064
2.6792 -44.2662 -20.9067
2.7146 -44.2664 -20.9071
2.7499 -44.2665 -20.9076
2.7853 -44.2667 -20.9082
2.8206 -44.2669 -20.9089
2.8560 -44.2672 -20.9096
2.8913 -44.2674 -20.9105
2.9267 -44.2677 -20.9114
2.9620 -44.2679 -20.9123
2.9974 -44.2682 -20.9132
3.0328 -44.2684 -20.9142
3.0681 -44.2686 -20.9151
3.1035 -44.2688 -20.9159
3.1388 -44.2690 -20.9166
3.1742 -44.2691 -20.9171
3.2095 -44.2692 -20.9176
3.2449 -44.2693 -20.9178
3.2802 -44.2693 -20.9179
3.2802 -44.2677 -20.9106
3.3052 -44.2677 -20.9106
3.3302 -44.2676 -20.9105
3.3552 -44.2676 -20.9104
3.3802 -44.2675 -20.9102
3.4052 -44.2674 -20.9100
3.4302 -44.2673 -20.9097
3.4552 -44.2672 -20.9094
3.4802 -44.2671 -20.9091
3.5052 -44.2670 -20.9088
3.5302 -44.2669 -20.9084
3.5552 -44.2667 -20.9081
3.5802 -44.2666 -20.9078
3.6052 -44.2665 -20.9074
3.6302 -44.2664 -20.9071
3.6552 -44.2663 -20.9069
3.6802 -44.2662 -20.9066
3.7052 -44.2662 -20.9065
3.7302 -44.2661 -20.9063
3.7552 -44.2661 -20.9063
3.7802 -44.2661 -20.9062
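But there is a catch! I managed to get something close with numpy.unique, but I noticed that, for some reason, Quantum Espresso sometimes writes two or more equal values in the first column of a block while the corresponding values in the second column differ, so with numpy.unique I lose data.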
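To illustrate the data loss, here is a minimal sketch using three rows taken from the first block above, where the value 3.2802 appears twice in the first column. The variable names (k, energy, first_idx, kept_energy) are only for illustration.

```python
import numpy as np

# Three rows from the first block: 3.2802 appears twice in the first
# column, with different energies in the second column.
k = np.array([3.2449, 3.2802, 3.2802])
energy = np.array([-44.2693, -44.2693, -44.2677])

# numpy.unique keeps one entry per distinct value of k;
# return_index gives the index of the first occurrence of each.
unique_k, first_idx = np.unique(k, return_index=True)
kept_energy = energy[first_idx]

print(len(k), "rows in,", len(unique_k), "rows out")
print(kept_energy)  # the energy -44.2677 is no longer present
```

Any approach that treats the first column as a unique key has the same problem, which is why the rows need to be matched by position instead.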
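I have also tried it this way:
kp_bands=np.take(bands[:,0],range(0,122),axis=0)
Here bands is the array I load the data into with numpy.loadtxt, and 122 is the number of values in each block. The problem is that this number is not always the same; it can vary depending on the system being studied.
My question is:
How can I do this without losing data and without knowing how many rows each block has?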
As mentioned in the comments, pandas really is your best bet.
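import numpy as np
import pandas as pd

def read_arrays(filename):
    """Read blank-line-separated blocks into a list of (n, 2) arrays."""
    arrays = []
    with open(filename, "r") as f:
        data = f.read().split("\n\n")
    for item in data:
        # sep=" " matches any run of whitespace, including newlines
        arr = np.fromstring(item, dtype=float, sep=" ")
        if arr.size == 0:
            continue  # skip empty chunks, e.g. from a trailing blank line
        arrays.append(arr.reshape(-1, 2))
    return arrays

def all_to_df(numpy_arrays):
    return [pd.DataFrame(data=item) for item in numpy_arrays]

data = read_arrays("temp.txt")
dataframes = all_to_df(data)

# quick check
print(dataframes[0])
print(dataframes[1])

# Join by row position instead of merging on column 0: the first column
# contains repeated values (e.g. 3.2802), and a merge on it would
# duplicate those rows. Assumes at least one block and equal block lengths.
df = pd.concat([dataframes[0][0]] + [d[1] for d in dataframes], axis=1)
df.columns = ["k"] + [f"band_{i}" for i in range(1, len(dataframes) + 1)]
print(df)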
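
For comparison, the same reshaping can be done with NumPy alone. This is a minimal sketch, assuming the blocks are separated by blank lines and all have the same number of rows; the function name bands_to_columns and the eight-row sample (a few rows copied from the question's data) are only for illustration. Rows are matched by position, so repeated values in the first column are kept.

```python
import io
import re

import numpy as np


def bands_to_columns(text):
    """Turn blank-line-separated two-column blocks into one array.

    The result has shape (n_points, 1 + n_bands): the first column is
    taken from the first block, followed by one energy column per block.
    """
    # Split on blank lines (also tolerates lines holding only whitespace).
    chunks = [c for c in re.split(r"\n\s*\n", text.strip()) if c.strip()]
    blocks = [np.loadtxt(io.StringIO(c), ndmin=2) for c in chunks]

    # The block length is discovered from the data, not hardcoded.
    n_rows = {b.shape[0] for b in blocks}
    if len(n_rows) != 1:
        raise ValueError(f"blocks have different lengths: {sorted(n_rows)}")

    k = blocks[0][:, 0]
    energies = [b[:, 1] for b in blocks]
    return np.column_stack([k] + energies)


# A few rows from the question's two blocks, including the repeated 3.2802.
sample = """\
0.0000 -44.2709
0.0250 -44.2709
3.2802 -44.2693
3.2802 -44.2677

0.0000 -20.8317
0.0250 -20.8322
3.2802 -20.9179
3.2802 -20.9106
"""

table = bands_to_columns(sample)
print(table)
```

For a real file, the text could be read with open(filename).read() and passed to the function; the resulting array can then be written out with numpy.savetxt or plotted column by column.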