I have a text file containing items and their features, as follows:
Item name: Item1
Feature 1: 64.264
Feature 2: 18.071
Feature 3: 188.516
Feature 4: 0.208
Feature 5: 4.711
Feature 6: 0.412
Feature 7: -14.902
Feature 9: -10.435
Feature 10: 18.089
Item name: Item2
Feature 1: 69.990
Feature 2: 19.312
Feature 3: 117.832
Feature 4: 0.419
Feature 5: 5.224
Feature 6: 0.458
Feature 7: -20.500
Feature 8: -12.933
Feature 9: 15.646
Feature 10: 1.751
Item name: Item3
Feature 1: 66.125
Feature 2: 23.067
Feature 3: 133.110
Feature 4: 0.328
Feature 5: 2.854
Feature 6: 0.249
Feature 7: -37.271
Feature 8: -10.310
Feature 9: 13.784
Feature 10: 3.067
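I want to use Python to turn this text file into a data structure with the item name in column 0 and Feature 1 through Feature 10 in columns 1 to 10. Any help would be appreciated.
I would use Python to build a dictionary of dictionaries, then hand it to pandas: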
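In [10]: import pandas as pd
In [11]: a = {}
In [12]: with open('file.txt') as f:
    ...:     for line in f:
    ...:         if not line.strip():
    ...:             continue  # skip blank lines
    ...:         k, v = line.split(':', 1)
    ...:         if k.strip() == 'Item name':
    ...:             # an "Item name" line starts a new record
    ...:             current = v.strip()
    ...:             a[current] = {}
    ...:         else:
    ...:             # any other line is a feature of the current item
    ...:             a[current][k.strip()] = float(v)
    ...: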
In [13]: pd.DataFrame.from_dict(a)
Out[13]:
              Item1    Item2    Item3
Feature 1    64.264   69.990   66.125
Feature 10   18.089    1.751    3.067
Feature 2    18.071   19.312   23.067
Feature 3   188.516  117.832  133.110
Feature 4     0.208    0.419    0.328
Feature 5     4.711    5.224    2.854
Feature 6     0.412    0.458    0.249
Feature 7   -14.902  -20.500  -37.271
Feature 8       NaN  -12.933  -10.310
Feature 9   -10.435   15.646   13.784
In [14]: pd.DataFrame.from_dict(a, orient='index')
Out[14]:
       Feature 1  Feature 2  Feature 3  Feature 4  Feature 5  Feature 6  Feature 7  Feature 9  Feature 10  Feature 8
Item1     64.264     18.071    188.516      0.208      4.711      0.412    -14.902    -10.435      18.089        NaN
Item2     69.990     19.312    117.832      0.419      5.224      0.458    -20.500     15.646       1.751    -12.933
Item3     66.125     23.067    133.110      0.328      2.854      0.249    -37.271     13.784       3.067    -10.310
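In the output above the feature columns are not in numeric order, and the item name is the index rather than column 0. A minimal sketch of one way to get the layout the question asks for is below. It assumes every feature label ends in an integer (as in "Feature 10"); the names `parse_items`, `to_frame`, and `SAMPLE` are made up for illustration, and the sample is a shortened copy of the data in the question.

```python
import io

import pandas as pd

# Shortened sample in the same "key: value" layout as the question's file.
SAMPLE = """\
Item name: Item1
Feature 1: 64.264
Feature 2: 18.071
Feature 9: -10.435
Feature 10: 18.089
Item name: Item2
Feature 1: 69.990
Feature 2: 19.312
Feature 8: -12.933
Feature 9: 15.646
Feature 10: 1.751
"""


def parse_items(lines):
    """Build {item name: {feature label: float}} from an iterable of lines."""
    records = {}
    current = None
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        key, value = line.split(":", 1)
        key = key.strip()
        if key == "Item name":
            current = value.strip()
            records[current] = {}
        else:
            records[current][key] = float(value)
    return records


def to_frame(records):
    """One row per item, item name in column 0, features in numeric order."""
    df = pd.DataFrame.from_dict(records, orient="index")
    # Sort on the trailing integer so "Feature 10" comes after "Feature 9".
    ordered = sorted(df.columns, key=lambda name: int(name.split()[-1]))
    df = df[ordered]
    df.index.name = "Item name"
    return df.reset_index()  # move the item name from the index to column 0


df = to_frame(parse_items(io.StringIO(SAMPLE)))
print(df)
```

To read the real file instead of the sample, pass an open file object, for example `with open('file.txt') as f: df = to_frame(parse_items(f))`. Features missing for an item (Feature 8 for Item1 here) come out as NaN.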