Building a DataFrame from data with repeated values


I have a file containing data. The data is laid out as follows.

00:00:08    3.78E-7
00:02:10    3.78E-7
00:05:00    3.78E-7
00:06:00    3.78E-7
...............
...............

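And so on. The values on the right are all the same, while the left column keeps changing. The left column is the time. I need to plot these.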

The program is as follows.

import pandas as pd
from collections import defaultdict
import matplotlib.pyplot as plt
from matplotlib import interactive


def main():
with open(r'C:\Users\sarad\leybold\230623.log','r') as line: 
 table = defaultdict(dict)
 for line in line:
   if line:
     entry = line.strip()
     if ':' in entry:
         t = entry
     for item in t:
          t1,t2=t.split(" ")
          table[t].update({t1:float(t2)})
 df=pd.DataFrame(table).T
 print(df)
 df.plot()
 plt.show()  
main()

The output looks like this:

15:19:37      15:20:07      15:20:37  ...      23:58:58      23:59:28      23:59:58
15:19:37 7.8E-7  7.800000e-07           NaN           NaN  ...           NaN           NaN           NaN
15:20:07 7.8E-7           NaN  7.800000e-07           NaN  ...           NaN           NaN           NaN
15:20:37 7.8E-7           NaN           NaN  7.800000e-07  ...           NaN           NaN           NaN
15:21:07 7.8E-7           NaN           NaN           NaN  ...           NaN           NaN           NaN
15:21:37 7.8E-7           NaN           NaN           NaN  ...           NaN           NaN           NaN
...                       ...           ...           ...  ...           ...           ...           ...
23:57:58 9.4E-7           NaN           NaN           NaN  ...           NaN           NaN           NaN
23:58:28 9.4E-7           NaN           NaN           NaN  ...           NaN           NaN           NaN
23:58:58 9.4E-7           NaN           NaN           NaN  ...  9.400000e-07           NaN           NaN
23:59:28 9.4E-7           NaN           NaN           NaN  ...           NaN  9.400000e-07           NaN

This is impossible to plot. The plot should be a simple straight line.

python dataframe matplotlib plot sequence
1 answer

Your loops and indentation are a bit hard to follow. Here are some remarks:

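  • You open a file and call the handle "line", when it should be named something like "file"
  • You never call read() on the file; the loop relies on iterating over the file handle
  • You then write a for loop that uses the same variable for the element and the iterable
    for line in line:
  • Even where this works, looping over a string, as in
    for item in t:
    runs once for every single character of that string
  • From the condition
    if line:
    onward, it is hard to tell what you are trying to do
  • Your indentation is also incorrect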

These are just a few of the design flaws in the code.
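To see why the printed frame is almost entirely NaN, here is a minimal sketch that rebuilds the question's data structure from three made-up sample lines and contrasts it with a flat mapping. The sample values are illustrative only, and the helper names `nested_frame` and `flat_series` are hypothetical, not part of the original code.

```python
from collections import defaultdict

import pandas as pd

# Illustrative sample lines in the "time value" layout from the question
SAMPLE = ["00:00:08 3.78E-7", "00:02:10 3.78E-7", "00:05:00 3.78E-7"]


def nested_frame(lines):
    """Mimic the question: outer key = whole line, inner key = time."""
    table = defaultdict(dict)
    for entry in lines:
        t1, t2 = entry.split()
        table[entry].update({t1: float(t2)})
    # Each outer key holds exactly one inner key,
    # so every row ends up with a single filled cell
    return pd.DataFrame(table).T


def flat_series(lines):
    """Store one value per timestamp: a plain Series with a single column of data."""
    values = {}
    for entry in lines:
        t1, t2 = entry.split()
        values[t1] = float(t2)
    return pd.Series(values, name="value")


if __name__ == "__main__":
    print(nested_frame(SAMPLE))  # 3x3 frame, values only on the diagonal
    print(flat_series(SAMPLE))   # 3 values indexed by time
```

With N input lines the nested version grows to an N x N frame holding only N values, which is why `df.plot()` has nothing sensible to draw. The flat version keeps one value per timestamp and plots as a single line.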

Here is an example of how to do this if your .log file is not a tab-separated csv file:

import pandas as pd
import matplotlib.pyplot as plt


def main():
    with open(r'C:\Users\sarad\leybold\230623.log', 'r') as file:
        content = file.read()
    data = {}
    # split() without arguments splits on any whitespace (spaces, tabs,
    # newlines) and drops empty strings
    parts = content.split()
    # tokens alternate: timestamp, value, timestamp, value, ...
    for i in range(len(parts) // 2):
        if parts[i*2] not in data:  # timestamp not seen yet, so start a new list
            data[parts[i*2]] = [float(parts[i*2+1])]
        else:  # timestamp already seen, so append the value
            data[parts[i*2]].append(float(parts[i*2+1]))
    # assumes each timestamp occurs the same number of times
    df = pd.DataFrame(data).T
    df.plot()
    plt.show()  # you have to show your plot!


if __name__ == '__main__':
    main()

Otherwise, you can simply use pandas'

pd.read_csv(myfile, sep='\t')

function.

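import pandas as pd
import matplotlib.pyplot as plt


def plot_log():
    # assumes the log has no header row, so the two columns are named here
    df = pd.read_csv(r'C:\Users\sarad\leybold\230623.log', sep='\t',
                     header=None, names=['time', 'value'], index_col='time')
    df.plot()
    plt.show()


if __name__ == '__main__':
    plot_log()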
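If the separator turns out to be runs of spaces rather than a tab, a variant along the following lines may work. It is only a sketch under assumptions the question does not confirm: the file has no header row and exactly two whitespace-separated columns. The column names `time` and `value` and the helper `load_log` are made up for illustration. Converting the time column to elapsed seconds puts unevenly spaced samples at their actual positions on the x axis.

```python
import io

import pandas as pd


def load_log(source):
    """Read a two-column, whitespace-separated log into a Series indexed by elapsed seconds."""
    df = pd.read_csv(
        source,
        sep=r"\s+",               # any run of spaces or tabs
        header=None,              # assumption: no header row
        names=["time", "value"],  # hypothetical column names
    )
    # "HH:MM:SS" strings parse as timedeltas; convert them to seconds
    df["time"] = pd.to_timedelta(df["time"]).dt.total_seconds()
    return df.set_index("time")["value"]


if __name__ == "__main__":
    # Illustrative sample in the layout shown in the question
    sample = io.StringIO(
        "00:00:08    3.78E-7\n"
        "00:02:10    3.78E-7\n"
        "00:05:00    3.78E-7\n"
    )
    series = load_log(sample)
    print(series)
    # To plot: import matplotlib.pyplot as plt; series.plot(); plt.show()
```

`load_log` accepts a file path as well as a file-like object, so the same call should work on the real log once the separator assumption has been checked.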