Download images from URLs and name them by their ID from a CSV file

Question (Votes: 1, Answers: 2)

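I have a CSV file with two columns: image_id, image_url

I need to download every image from its URL and save it using the corresponding image_id as the file name. Is there a way to do this?

I know it can be done in Python, based on code I have seen online: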

import cStringIO # *much* faster than StringIO
import urllib
import Image

try:
    file = urllib.urlopen('http://freegee.sourceforge.net/FG_EN/src/teasers_en/t_gee-power_en.gif')
    im = cStringIO.StringIO(file.read()) # constructs a StringIO holding the image
    img = Image.open(im)
    img.save('/home/wenbert/uploaderx_files/test.gif')
except IOError, e:
    raise e

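But if I want to automate this and upload the images to a GCP storage bucket, what is a good way to reference the URLs and file names from the CSV?

Thanks for any help I can get. Cheers!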

python image download google-cloud-platform google-cloud-storage
2 Answers
0 votes

This should help. Use the csv module to parse your CSV file.

For example:

# -*- coding: utf-8 -*-
# Python 3; requires Pillow (pip install Pillow).

import csv
import io
import urllib.request

from PIL import Image


def downloadFile(imageID, url):
    # Fetch the raw bytes and wrap them in a file-like object for Pillow.
    with urllib.request.urlopen(url) as response:
        im = io.BytesIO(response.read())
    img = Image.open(im)
    # Note: the .gif extension makes Pillow save every image in GIF format.
    img.save('/home/wenbert/uploaderx_files/{0}.gif'.format(imageID))


with open('PATH_TO_.csv', newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter=',')
    next(reader, None)  # skip the headers
    for row in reader:
        print(row)
        downloadFile(row[0], row[1])
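
On the question of how to reference the URL and file name from the CSV: one option is to read the columns by name with csv.DictReader instead of by position. Below is a minimal sketch, assuming Python 3 and a header row of exactly image_id,image_url as described in the question. The helper name iter_image_rows and the sample data are hypothetical, and taking the file extension from the URL path is an assumption, not something the question requires.

```python
import csv
import io
import os
from urllib.parse import urlparse


def iter_image_rows(csvfile):
    """Yield (image_id, image_url, filename) for each data row.

    The filename is the image_id plus whatever extension appears in the
    URL path (an empty string if the URL has none).
    """
    for row in csv.DictReader(csvfile):
        image_id = row["image_id"].strip()
        image_url = row["image_url"].strip()
        # urlparse drops the query string, so "?size=large" does not leak
        # into the extension.
        extension = os.path.splitext(urlparse(image_url).path)[1]
        yield image_id, image_url, image_id + extension


# Hypothetical sample data standing in for the real CSV file.
sample = io.StringIO(
    "image_id,image_url\n"
    "42,http://example.com/pics/cat.gif\n"
    "43,http://example.com/pics/dog.png?size=large\n"
)
rows = list(iter_image_rows(sample))
for image_id, image_url, filename in rows:
    print(image_id, image_url, filename)
```

With a real file, passing open('PATH_TO_.csv', newline='') in place of the StringIO object should work the same way.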

-1 votes

I wrote the Python script below. I have only tested it on Python 3.4.3, but it should do the job.

Hope this helps.

import csv
import os

import requests


spreadsheetAddress = 'C:\\SOURCE\\CSV\\FILE.csv'
targetDirectory = 'C:\\TARGET\\IMAGE\\SAVE\\LOCATION\\'

def getSpreadsheetContents(spreadsheetAddress):
    # Map each image_id to its image_url, skipping the header row and blank rows.
    with open(spreadsheetAddress, newline='') as csvfile:
        readCSV = csv.reader(csvfile, delimiter=',')
        imageSet = {}
        for row in readCSV:
            if row and 'image_id' not in row:
                imageSet[row[0]] = row[1]
    return imageSet


if __name__ == "__main__":
    if os.path.exists(spreadsheetAddress) and os.path.exists(targetDirectory):
        imageDict = getSpreadsheetContents(spreadsheetAddress)
        for key, value in imageDict.items():
            response = requests.get(value)
            if response.status_code == 200:
                # Keep the extension from the URL and name the file after the image_id.
                filename, file_extension = os.path.splitext(value)
                address = os.path.join(targetDirectory, key + file_extension)
                # Write the response body that was just fetched.
                with open(address, 'wb') as imagefile:
                    imagefile.write(response.content)
    else:
        raise Exception("File not found")
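
Neither script covers the GCP bucket part of the question. A rough sketch of one way to do it follows, assuming the google-cloud-storage package is installed and credentials are already configured. The function name upload_images, the bucket name my-image-bucket, and the path images.csv are hypothetical. The bucket is passed in as a parameter so the CSV logic can be tried without touching GCP, and the object name is simply the image_id, which is an assumption about the desired naming.

```python
import csv
import mimetypes
import urllib.request


def upload_images(csv_path, bucket, fetch=None):
    """Download each image_url and store it in `bucket` under its image_id.

    `bucket` is expected to behave like google.cloud.storage.Bucket, i.e.
    to provide blob(name).upload_from_string(data, content_type=...).
    `fetch` maps a URL to bytes and defaults to a plain urllib download.
    """
    if fetch is None:
        def fetch(url):
            with urllib.request.urlopen(url) as response:
                return response.read()

    uploaded = []
    with open(csv_path, newline="") as csvfile:
        for row in csv.DictReader(csvfile):
            data = fetch(row["image_url"])
            # Guess the MIME type from the URL; fall back to a generic type.
            content_type = (mimetypes.guess_type(row["image_url"])[0]
                            or "application/octet-stream")
            blob = bucket.blob(row["image_id"])
            blob.upload_from_string(data, content_type=content_type)
            uploaded.append(row["image_id"])
    return uploaded


def main():
    # Assumption: google-cloud-storage is installed and credentials are set up.
    from google.cloud import storage

    client = storage.Client()
    bucket = client.bucket("my-image-bucket")   # hypothetical bucket name
    print(upload_images("images.csv", bucket))  # hypothetical CSV path
```

Calling main() runs the real upload. Because the images go straight from memory into the bucket, nothing is written to local disk; if local copies are also needed, the download script above can be run separately.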