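As usual, I have bitten off more than I can chew. I have a file "list.xlsx". The file has 3 sheets: "current student", "finished" and "cancelled". These sheets all contain data under the following headers: [StudentId, First Name, Last Name, DoB, Nationality, CourseID, CoursName, Start Date, End Date, UnitID, UnitName, UnitCompetency]
I have created the following abomination, which gets me started on what I need.
What I want it to do is:
1) For each (unique) StudentId, create a file named FirstName + Lastname.xlsx in a folder named after the sheet
2) In that file, take all the information from the remaining columns and append it to that student's file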
#python 3.8
import pandas as pd
import os
import shutil
file = "list.xlsx"
CS = "current student"
Fin = "finished"
Can = "cancelled"
TheList = {CS, Fin, Can}
CanXlsx = pd.read_excel(file, sheet_name = Can)
FinXlsx = pd.read_excel(file, sheet_name = Fin)
CSXlsx = pd.read_excel(file, sheet_name = CS)
if os.path.exists(CS):
    shutil.rmtree(CS)
os.mkdir(CS)
CSDir = '//current student//'
if os.path.exists(Fin):
    shutil.rmtree(Fin)
os.mkdir(Fin)
FinDir = '//finished//'
if os.path.exists(Can):
    shutil.rmtree(Can)
os.mkdir(Can)
CanDir = '//cancelled//'
CancelID = CanXlsx.StudentId.unique()
FinID = FinXlsx.StudentId.unique()
CSID = CSXlsx.StudentId.unique()
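I thought I was getting better at for loops and the like, but I can't seem to wrap my head around them. I can think through the logic, but the code doesn't follow.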
https://drive.google.com/file/d/134fqWx6veF7zp_12GqFYlbmPZnK8ihaV/view?usp=sharing
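I think the way to achieve this is to create 3 dataframes (it can probably be done with just one, but I don't remember how). Then: 1) on each dataframe, extract the list of "first name + last name" values, and 2) create a mask on the dataframe to pull out each student's rows and save them.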
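In case the "mask" step is unfamiliar, here is a minimal sketch of it on its own. The rows are made up purely for illustration and are not taken from list.xlsx; only the StudentId and UnitName column names come from the question.

```python
import pandas as pd

# Made-up rows, only to illustrate the masking step
df = pd.DataFrame({
    "StudentId": [1, 1, 2],
    "UnitName": ["Maths", "Physics", "Maths"],
})

# Comparing a column to a value gives a boolean Series, one entry per row
mask = df["StudentId"] == 1

# Indexing with the mask keeps only the rows where it is True
rows_for_student = df[mask]
print(rows_for_student)
```

Repeating this once per student id is all the loop needs to do.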
import os
import shutil
import pandas as pd

file = "list.xlsx"
CS = "current student"
Fin = "finished"
Can = "cancelled"
TheList = {CS, Fin, Can}
CanXlsx = pd.read_excel(file, sheet_name=Can)
FinXlsx = pd.read_excel(file, sheet_name=Fin)
CSXlsx = pd.read_excel(file, sheet_name=CS)

## File Creation
# Use the folder names themselves as relative paths; '//cancelled//' would
# point at the filesystem root instead of the folder created here
if os.path.exists(CS):
    shutil.rmtree(CS)
os.mkdir(CS)
CSDir = CS
if os.path.exists(Fin):
    shutil.rmtree(Fin)
os.mkdir(Fin)
FinDir = Fin
if os.path.exists(Can):
    shutil.rmtree(Can)
os.mkdir(Can)
CanDir = Can

# Create full names (StudentId is cast to str so it can be concatenated)
CanXlsx["Fullname"] = CanXlsx["StudentId"].astype(str) + "_" + CanXlsx["First Name"] + "_" + CanXlsx["Last Name"]
## Same for the other dfs

# Get a list of ids
# canFullNames = list(CanXlsx["Fullname"])
# Edit: Preferred approach through student Ids
canIds = CanXlsx["StudentId"].unique()  # unique, so each student is handled once
## Same for the other dfs

# Loop over the list of ids to create your df
for student_id in canIds:
    # This will filter the rows by the id you want
    df1 = CanXlsx[CanXlsx["StudentId"] == student_id]
    # Retrieve the full name
    name = df1.iloc[0]["Fullname"]
    # Create the filename
    filename = os.path.join(CanDir, name + ".xlsx")
    # I understand that these columns are not required on each file
    df1 = df1.drop(columns=["First Name", "Last Name"])
    df1.to_excel(filename, header=True, index=False)
## Same for the other dfs
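If repeating the "## Same for the other dfs" steps three times feels clumsy, one possible way to fold them into a single loop is sketched below. This is only a sketch under a few assumptions: the function names `plan_student_files` and `write_student_files` are made up for illustration, the name columns are assumed to be called "First Name" and "Last Name" as above, and it uses `groupby` in place of the explicit mask, which gives the same per-student rows.

```python
import os
from typing import Dict

import pandas as pd


def plan_student_files(sheets: Dict[str, pd.DataFrame]) -> Dict[str, pd.DataFrame]:
    """Map each output path (sheet folder / id_first_last.xlsx) to that student's rows."""
    plan = {}
    for sheet_name, df in sheets.items():
        # groupby yields one sub-frame per unique StudentId
        for student_id, rows in df.groupby("StudentId"):
            first = rows.iloc[0]
            name = f"{student_id}_{first['First Name']}_{first['Last Name']}.xlsx"
            path = os.path.join(sheet_name, name)
            # Assumed, as above, that the name columns are not needed in each file
            plan[path] = rows.drop(columns=["First Name", "Last Name"])
    return plan


def write_student_files(plan: Dict[str, pd.DataFrame]) -> None:
    """Create the folders as needed and write one workbook per student."""
    for path, rows in plan.items():
        os.makedirs(os.path.dirname(path), exist_ok=True)
        rows.to_excel(path, index=False)


# Usage, assuming list.xlsx is in the working directory:
# sheets = pd.read_excel("list.xlsx", sheet_name=None)  # dict: sheet name -> DataFrame
# write_student_files(plan_student_files(sheets))
```

Passing `sheet_name=None` to `pd.read_excel` returns every sheet as a dict, so the folder names come straight from the sheet names and nothing has to be written out three times.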
Let me know if this helps; at least this is what I understood you want the code to achieve. :D