How to save Python loop output to an Excel file

Question

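I have an Excel file (input.xlsx) with two columns (id and url).

I scraped all of the URLs and ran text analysis on the scraped text.

I have functions that calculate the positive score, negative score, polarity, and so on.

I want to create an output file (output.xlsx) containing all of these results, but my script writes the same value to every row, even though it prints the correct values inside the function.

Example:

Columns: Id, url, positive score, negative score, polarity, etc.

Rows: each row contains the output of each function.

Expected output: positive score (column): 23, 70, 43, 35 (rows)

Actual output: positive score (column): 35, 35, 35, 35 (rows)

My functions: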

#CALCULATING POSITIVE SCORES
# Cleaned texts
os.getcwd()
new_texts_folder = os.path.join(os.getcwd(), 'new_texts')

for root, folders, files in os.walk(new_texts_folder):
    for file in files:
        path = os.path.join(root, file)
        with codecs.open(path, encoding='utf-8', errors='ignore') as info:
            new_content = eval(info.read())  # Convert string to list
            
            def positive_score(content):
                #tokens = tokenz(text)
                pos_score = 0
                for token in content:
                    if token in filtered_positive_dictionary:
                        pos_score += 1
                return pos_score

            #positive_result = positive_score(new_content)

The code above prints the correct output only when I print inside the function. Outside the function it prints just one output.

My Excel code:

data_collection = {
    'URL_ID': url_ids, #(this is working as expected)
    'URL': urls, #(this is working as expected)
     'POSITIVE SCORE': positive_score(new_content) #(this is not working as expected)
}
excel_data_df = pd.DataFrame(data_collection)
excel_data_df.to_excel("Outputput.xlsx", index = False)

Answer 1

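The problem occurs because positive_score(new_content) is evaluated only once, when the dictionary is built after the loop has finished. At that point new_content holds the contents of the last file, so a single score is produced and pandas repeats it for every row of the DataFrame. To fix this, store the result for each file in a list and use that list when you create the DataFrame.

Try this: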
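To illustrate why every row ends up with the same value, here is a minimal sketch. The IDs and scores are made up for the example (only the 23, 70, 43, 35 values echo the question), and it assumes pandas is installed. When a dict passed to pd.DataFrame mixes lists with a single scalar, pandas repeats the scalar down the whole column.

```python
import pandas as pd

# Hypothetical IDs, used only for this illustration.
url_ids = [1, 2, 3, 4]

# A single scalar (like the return value of one function call)
# is repeated for every row.
broadcast_df = pd.DataFrame({
    'URL_ID': url_ids,
    'POSITIVE SCORE': 35,
})

# A list with one entry per row keeps each row's own value.
per_row_df = pd.DataFrame({
    'URL_ID': url_ids,
    'POSITIVE SCORE': [23, 70, 43, 35],
})

print(broadcast_df)
print(per_row_df)
```

The first frame reproduces the symptom from the question; the second shows the shape the data needs to have before it is written to Excel.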

import ast
import os
import pandas as pd

# Placeholder dictionary for illustration; use your own filtered_positive_dictionary.
filtered_positive_dictionary = {'good': 1, 'excellent': 1, 'happy': 1}

def positive_score(content):
    """Count the tokens that appear in the positive dictionary."""
    pos_score = 0
    for token in content:
        if token in filtered_positive_dictionary:
            pos_score += 1
    return pos_score

# One list per output column; each file appends one entry to every list.
url_ids = []
urls = []
positive_scores = []

new_texts_folder = os.path.join(os.getcwd(), 'new_texts')

for root, folders, files in os.walk(new_texts_folder):
    for file in files:
        path = os.path.join(root, file)
        with open(path, encoding='utf-8', errors='ignore') as info:
            # Parse the stored list literal; ast.literal_eval is safer than eval.
            new_content = ast.literal_eval(info.read())
        pos_score = positive_score(new_content)
        url_ids.append(file)  # the file name stands in for the URL id here
        urls.append(f"file://{path}")
        positive_scores.append(pos_score)

data_collection = {
    'URL_ID': url_ids,
    'URL': urls,
    'POSITIVE SCORE': positive_scores
}

excel_data_df = pd.DataFrame(data_collection)
excel_data_df.to_excel("Output.xlsx", index=False)
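If you also need the negative score and polarity columns mentioned in the question, one option is to build one dict per document and collect the dicts in a list. The sketch below is only an illustration: the word sets, the sample documents, the helper names count_matches and score_document, and the polarity formula are all assumptions, since the question does not show them.

```python
# Hypothetical word lists; replace them with your real dictionaries.
positive_words = {"good", "excellent", "happy"}
negative_words = {"bad", "poor", "sad"}

def count_matches(tokens, vocabulary):
    """Count how many tokens appear in the given vocabulary."""
    return sum(1 for token in tokens if token in vocabulary)

def score_document(doc_id, tokens):
    """Return one output row (a dict) for a single document."""
    pos = count_matches(tokens, positive_words)
    neg = count_matches(tokens, negative_words)
    # Assumed polarity formula; the small constant avoids division by zero.
    polarity = (pos - neg) / (pos + neg + 1e-6)
    return {
        "URL_ID": doc_id,
        "POSITIVE SCORE": pos,
        "NEGATIVE SCORE": neg,
        "POLARITY SCORE": polarity,
    }

# Made-up token lists standing in for the cleaned text files.
documents = {
    "doc1": ["good", "bad", "happy"],
    "doc2": ["sad", "poor"],
}

# One dict per document, so every row carries its own scores.
rows = [score_document(doc_id, tokens) for doc_id, tokens in documents.items()]
print(rows)
```

The resulting list can be passed straight to pd.DataFrame(rows), which uses the dict keys as column names, and then written out with to_excel as in the code above.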
