Error opening document - list index out of range [closed]

Question

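I need to extract candidates' scores from assessment .doc files, which contain the scores each candidate received on the tests they took. These files are found in each candidate's folder. I need to extract the "Wonderlic" score and create a dataframe containing the candidate's name and score. I have the following code, which opens and reads the folders, finds the right document, then reads it and extracts the required information from the .doc file. I tested this code on one specific Word .doc file: it opened the file, read through it, and extracted the information, so the code worked. Now, when I try to run it over the folders, it gives me the following error: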

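Candidate folder 1 of 4 A LastName07 2022r
Candidate folder 2 of 4 B LastName
Candidate folder 3 of 4 C LastName
Candidate folder 4 of 4 D LastName
D LastName Assessment 1.2.doc
Error opening document: C:\Users\Mine\OneDrive\Escritorio\CompanyData\Test Folder\D LastName\D LastName 1.2.doc
list index out of range
D LastName Assessment.doc
Error opening document: C:\Users\Mine\OneDrive\Escritorio\CompanyData\Test Folder\D LastName\D LastName Assessment.doc
list index out of range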

Could you help me understand what the "list index out of range" error means and what I can do to fix it?
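For context, "list index out of range" is the message of Python's built-in `IndexError`, which is raised when a list is indexed at a position it does not have. The sketch below illustrates this with a made-up list; `section_lines` and `safe_get` are hypothetical names used only for illustration, not part of the code in this question.

```python
# Hypothetical stand-in for the lines collected from a document.
# It has only one element, so valid indexes are 0 (and -1).
section_lines = ["THE WONDERLIC"]


def safe_get(lines, index):
    """Return lines[index], or None when the list is too short."""
    try:
        return lines[index]
    except IndexError:
        # Raised with the message "list index out of range"
        return None


print(safe_get(section_lines, 0))  # the single element that exists
print(safe_get(section_lines, 4))  # index 4 does not exist
```

Checking `len(lines)` before indexing is an equally valid way to avoid the exception.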

Here is the code I'm using in Jupyter Notebook:

import win32com.client
from docx import Document
import os
import re
import zipfile
import xml.dom.minidom
import pandas as pd

word = win32com.client.Dispatch("Word.Application")

i = 0
for folder in directory_list:
    i += 1
    print("Candidate folder " + str(i) + " of " + str(tot), end=' ')
    print(folder)

    folder_path = os.path.join(directory, folder)  # Get the full path to the folder

    for filename in os.listdir(folder_path):  # Iterate through files in the folder
        file_path = os.path.join(folder_path, filename)  # Get the full path to the file
        if (filename.endswith('.doc') or filename.endswith('.docx')) and "assessment" in filename.lower():
            try:
                print(filename)

                doc = word.Documents.Open(file_path)

                # Extract Wonderlic Scores
                for paragraph in doc.paragraphs:
                    text = paragraph.Range.Text

                    wonderlic_keyword = "THE WONDERLIC"
                    wiesen_keyword = "WIESEN TEST"

                    # Set a flag to indicate if we are inside the Wonderlic section
                    inside_wonderlic_section = False
                    wonderlic_section_lines = []  # Store the lines of the Wonderlic section

                    # Check if we are entering the Wonderlic section
                    if wonderlic_keyword.lower() in text.lower():
                        inside_wonderlic_section = True
                    # Check if we are exiting the Wonderlic section
                    elif inside_wonderlic_section and wiesen_keyword.lower() in text.lower():
                        break

                    # Add the lines of the Wonderlic section to the list
                    if inside_wonderlic_section:
                        wonderlic_section_lines.append(text)

                WonderlicScore = wonderlic_section_lines[4]
                WonderlicScore = re.sub("[^0-9]", "", WonderlicScore)  # substituting everything that is NOT a digit to nothing

                # creating the dataframe for WONDERLIC
                WonderlicData = {
                    "Wonderlic Score": [int(WonderlicScore)]
                }
                Wonderlicdf = pd.DataFrame(WonderlicData)

            except Exception as e:
                print(f"Error opening document: {file_path}")
                print(e)
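One likely cause, judging only from the code as posted: `inside_wonderlic_section` and `wonderlic_section_lines` are reset on every pass of the paragraph loop, so the list never holds more than one entry and `wonderlic_section_lines[4]` raises `IndexError`. Because the whole block sits inside one `try`, that failure is reported as "Error opening document" even though the file opened. The following is a minimal sketch of the section logic with the state initialised once, before the loop. The function name `extract_wonderlic_score` and its parameters are hypothetical, and it works on plain strings so it can be tried without Word; the line index 4 is taken from the question and assumes the score always sits on the fifth line of the section.

```python
import re


def extract_wonderlic_score(paragraph_texts,
                            start_keyword="THE WONDERLIC",
                            end_keyword="WIESEN TEST",
                            score_line_index=4):
    """Return the score as an int, or None if it cannot be found.

    paragraph_texts is a list of paragraph strings in document order.
    """
    # Initialise the state ONCE, before looping over the paragraphs.
    inside_section = False
    section_lines = []

    for text in paragraph_texts:
        lowered = text.lower()
        if start_keyword.lower() in lowered:
            inside_section = True   # entering the Wonderlic section
        elif inside_section and end_keyword.lower() in lowered:
            break                   # reached the next section
        if inside_section:
            section_lines.append(text)

    # Guard the index instead of letting IndexError escape.
    if len(section_lines) <= score_line_index:
        return None

    # Keep only the digits on the assumed score line.
    digits = re.sub(r"[^0-9]", "", section_lines[score_line_index])
    return int(digits) if digits else None


# Made-up paragraphs standing in for a real document.
sample = [
    "Candidate summary\r",
    "THE WONDERLIC\r",
    "Description\r",
    "More description\r",
    "Even more description\r",
    "Score: 27\r",
    "WIESEN TEST\r",
]
print(extract_wonderlic_score(sample))
```

In the notebook, the paragraph strings could be collected with something like `[p.Range.Text for p in doc.paragraphs]` and passed to the function; a `None` result then signals a document whose layout does not match the assumption, rather than a file that failed to open.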
