Creating a loop to process multiple files

Question    Votes: 1    Answers: 1

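I have written the code below, but at the moment I have to retype the same steps for every file, and with more than 100 files that is not practical.

I cannot work out how to do this with a loop that reads all of these files and extracts the unique values in MP. Also, the only way I know so far to add the two new columns is to write them out for each file, as in the code below. I want to end up with a new combined data frame that contains the unique MP values from every file together with that file's Dates and Inspection values.

Please suggest a way to do this with a loop: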

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from scipy import signal

df1 = pd.read_csv(r'E:\Unmanned Cars\Unmanned Cars\2017040810_052.csv')
df2 = pd.read_csv(r'E:\Unmanned Cars\Unmanned Cars\2017040901_052.csv')
df3 = pd.read_csv(r'E:\Unmanned Cars\Unmanned Cars\2017040902_052.csv')

df1 =df1["MP"].unique()
df1=pd.DataFrame(df1, columns=['MP'])
df1["Dates"] = "2017-04-08"
df1["Inspection"] = "10"
##
df2 =df2["MP"].unique()
df2=pd.DataFrame(df2, columns=['MP'])
df2["Dates"] = "2017-04-09"
df2["Inspection"] = "01"
##
df3 =df3["MP"].unique()
df3=pd.DataFrame(df3, columns=['MP'])
df3["Dates"] = "2017-04-09"
df3["Inspection"] = "02"
Final = pd.concat([df1, df2, df3], axis=0, sort=False)
python
1 Answer
0 votes

Maybe this sample code will help you.

#!/usr/bin/env python3

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from scipy import signal
from os import path
import glob
import re

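def process_file(file_path: str):
    """Return a DataFrame of the unique MP values in one file, or None if the file name does not match."""
    result = None
    # Normalise Windows separators so basename works on any platform.
    file_path = file_path.replace("\\", "/")
    filename = path.basename(file_path)
    # File names start with YYYYMMDD followed by a two-digit inspection number.
    regex = re.compile(r"^(\d{4})(\d{2})(\d{2})(\d{2})")
    match = regex.match(filename)
    if match:
        date = "%s-%s-%s" % (match[1], match[2], match[3])
        inspection = match[4]

        df1 = pd.read_csv(file_path)
        df1 = df1["MP"].unique()
        df1 = pd.DataFrame(df1, columns=['MP'])
        df1["Dates"] = date
        df1["Inspection"] = inspection
        result = df1
    return result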


def main():
#    files_list = [
#        r'E:\Unmanned Cars\Unmanned Cars\2017040810_052.csv',
#        r'E:\Unmanned Cars\Unmanned Cars\2017040901_052.csv',
#        r'E:\Unmanned Cars\Unmanned Cars\2017040902_052.csv'
#    ]
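    directory = 'E:\\Unmanned Cars\\Unmanned Cars\\'
    files_list = glob.glob(directory + "*_052.csv")

    result_list = [process_file(filename) for filename in files_list]
    # Drop files whose names did not match the pattern (process_file returned None).
    result_list = [df for df in result_list if df is not None]

    Final = pd.concat(result_list, axis=0, sort=False)
    return Final

if __name__ == "__main__":
    print(main())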

I created the process_file function to handle each file. It uses a regular expression to extract the date and the inspection number from the file name.
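
The file-name parsing step can be tried on its own, without reading any CSV. The following is only a minimal sketch, assuming every file name begins with ten digits in the form YYYYMMDD plus a two-digit inspection number, as in the three paths from the question. The names parse_name and NAME_PATTERN are hypothetical helpers introduced for illustration and are not part of the answer's code.

```python
import re
from os import path

# Same pattern as in process_file: year, month, day, inspection number.
NAME_PATTERN = re.compile(r"^(\d{4})(\d{2})(\d{2})(\d{2})")


def parse_name(file_path):
    """Return (date, inspection) taken from the file name, or None if it does not match.

    Hypothetical helper for illustration only.
    """
    # Normalise Windows separators so basename works on any platform.
    filename = path.basename(file_path.replace("\\", "/"))
    match = NAME_PATTERN.match(filename)
    if match is None:
        return None
    year, month, day, inspection = match.groups()
    return f"{year}-{month}-{day}", inspection


print(parse_name(r"E:\Unmanned Cars\Unmanned Cars\2017040810_052.csv"))
# → ('2017-04-08', '10')
```

Both values stay strings, so an inspection number such as "01" keeps its leading zero, which matches the columns the question builds by hand.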
