Python: iterate over N records at a time, then start again


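I am trying to write a script that calls the Google Translation API to translate every row of an Excel file with 1000 rows.

I use pandas to load the file and read the values from one specific column, append those values from the data frame to a list, and then translate the list with the Google API: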

import os
from google.cloud import translate_v2 as translate
import pandas as pd
from datetime import datetime

# Variable for GCP service account credentials

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = r'path to credentials json'

# Path to the file

filepath = r'../file.xlsx'

# Instantiate the Google Translation API Client

translate_client = translate.Client()

# Read all the information from the Excel file within 'test' sheet name

df = pd.read_excel(filepath, sheet_name='test')

# Define an empty list

elements = []

# Loop the data frame and append the list


for i in df.index:
    elements.append(df['EN'][i])

# Loop the list and translate each line
for item in elements:
    output = translate_client.translate(
        elements,
        target_language='fr'
    )


result = [
    element['translatedText'] for element in output
]

print("The values corresponding to key : " + str(result))

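After appending to the list, it holds 1000 elements in total. The problem with the Google Translation API is that if you send too many segments, as they call them, it returns the following error:

400 POST https://translation.googleapis.com/language/translate/v2: Too many text segments

I looked into it and found that sending 100 rows at a time (in my case) would be a solution. Now I am a bit stuck.

How would I write a loop that iterates over 100 rows at a time, translates those 100 rows, does something with the results, then moves on to the next 100, and so on until the end?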

python python-3.x google-cloud-platform
1 answer

Assuming you are able to pass a list into a single translate call, perhaps you could do something like this:

# Define a helper that steps through the list in chunks of `size` items
# (the last chunk may be shorter)
def chunker(seq, size):
    return (seq[pos : pos + size] for pos in range(0, len(seq), size))

# Then iterate over the chunks and translate each one in a single call
output = []
for chunk in chunker(elements, 100):
    temp = translate_client.translate(
        chunk,
        target_language='fr'
    )
    # Collect the per-chunk results into one flat list, in input order
    output.extend(temp)
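
To try the chunking logic without calling the API, here is a minimal self-contained sketch. `fake_translate` and `translate_in_batches` are hypothetical names introduced only for illustration; `fake_translate` stands in for `translate_client.translate(chunk, target_language='fr')` and assumes the response is a list of dicts with a `translatedText` key, as the question's code suggests.

```python
from typing import Callable, Dict, Iterator, List, Sequence


def chunker(seq: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `seq`, each at most `size` items long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for pos in range(0, len(seq), size):
        yield seq[pos:pos + size]


def translate_in_batches(
    texts: Sequence[str],
    translate_batch: Callable[[Sequence[str]], List[Dict[str, str]]],
    size: int = 100,
) -> List[str]:
    """Call `translate_batch` once per chunk and flatten the translated text."""
    results: List[str] = []
    for chunk in chunker(texts, size):
        response = translate_batch(chunk)
        # Assumption: each entry is a dict holding a 'translatedText' key.
        results.extend(entry["translatedText"] for entry in response)
    return results


def fake_translate(batch: Sequence[str]) -> List[Dict[str, str]]:
    # Hypothetical stand-in for the real client call; it only prefixes the text.
    return [{"input": text, "translatedText": "fr:" + text} for text in batch]


rows = [f"row {i}" for i in range(250)]
translated = translate_in_batches(rows, fake_translate, size=100)
print(len(translated))
```

With the real client, you would presumably pass something like `lambda batch: translate_client.translate(batch, target_language='fr')` as `translate_batch`. Because the chunks are processed in order, the result list lines up with the input rows, so it could be written back to the data frame as a new column.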