Quickly retrieving EXIF data from thousands of images - optimizing a function

Question

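I have written a script that retrieves specific EXIF fields from thousands of images in a directory (including its subdirectories) and saves the information to a CSV file: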

import os
from PIL import Image
from PIL.ExifTags import TAGS
import csv
from os.path import join

####SET THESE!###
imgpath = 'C:/x/y' #Path to folder of images
csvname = 'EXIF_data.csv' #Name of saved csv
###

def get_exif(fn):
    ret = {}
    i = Image.open(fn)
    info = i._getexif()
    for tag, value in info.items():
        decoded = TAGS.get(tag, tag)
        ret[decoded] = value
    return ret

exif_list = []
path_list = []
filename_list = []
DTO_list = []
MN_list = []

for root, dirs, files in os.walk(imgpath, topdown=True):
   for name in files:
       if name.endswith('.JPG'):
           pat = join(root, name)
           pat = pat.replace(os.sep,"/") #str.replace returns a new string, so the result must be assigned
           exif = get_exif(pat)
           path_list.append(pat)
           filename_list.append(name)
           DTO_list.append(exif['DateTimeOriginal'])
           MN_list.append(exif['MakerNote'])   

zipped = zip(path_list, filename_list, DTO_list, MN_list)

with open(csvname, "w", newline='') as f:
    writer = csv.writer(f)
    writer.writerow(('Paths','Filenames','DateAndTime','MakerNotes'))
    for row in zipped:
        writer.writerow(row)

However, it is slow. I tried to improve the script's performance and readability by using list and dictionary comprehensions:

import os
from os import walk #Necessary for recursive mode
from PIL import Image #Opens images and retrieves exif
from PIL.ExifTags import TAGS #Convert exif tags from digits to names
import csv #Write to csv
from os.path import join #Join directory and filename for path

####SET THESE!###
imgpath = 'C:/Users/au309263/Documents/imagesorting_testphotos/Finse/FINSE01' #Path to folder of images. The script searches subdirectories as well
csvname = 'PLC_Speedtest2.csv' #Name of saved csv
###

def get_exif(fn): #Defining a function that opens an image, retrieves the exif data, corrects the exif tags from digits to names and puts the data into a dictionary
    i = Image.open(fn)   
    info = i._getexif()
    ret = {TAGS.get(tag, tag): value for tag, value in info.items()} 
    return ret

Paths = [join(root, f).replace(os.sep,"/") for root, dirs, files in walk(imgpath, topdown=True) for f in files if f.endswith(('.JPG', '.jpg'))] #Creates list of paths for images. endswith takes a tuple of suffixes; ('.JPG' or '.jpg') evaluates to just '.JPG'
Filenames = [f for root, dirs, files in walk(imgpath, topdown=True) for f in files if f.endswith(('.JPG', '.jpg'))] #Creates list of filenames for images
ExifData = list(map(get_exif, Paths)) #Runs the get_exif function on each of the images specified in the Paths list. List converts the map-object to a list.
MakerNotes = [i['MakerNote'] for i in ExifData] #Creates list of MakerNotes from exif data for images
DateAndTime = [i['DateTimeOriginal'] for i in ExifData] #Creates list of Date and Time from exif data for images

zipped = zip(Paths, Filenames, DateAndTime, MakerNotes) #Combines the four lists to be written into a csv.

with open(csvname, "w", newline='') as f: #Writes a csv-file with the exif data
    writer = csv.writer(f)
    writer.writerow(('Paths','Filenames','DateAndTime','MakerNotes'))
    for row in zipped:
        writer.writerow(row)

However, this did not change the performance.

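I have timed specific parts of the code and found that opening each image and getting its EXIF data inside the get_exif function is what takes the time. To make the script faster, I would like to know: 1) Can the function's performance be optimized? 2) Can the EXIF data be retrieved without opening the image? 3) Is list(map(fn, x)) the fastest way of applying the function?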
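
For reference, timing one stage of the script in isolation can be sketched roughly as below. The helper name `timed` and the stand-in function `slow_square` are hypothetical and not part of the original script; in practice the stand-in would be replaced by `get_exif` and the input list by `Paths`.

```python
import time

def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

def slow_square(x):
    # Hypothetical stand-in for get_exif: sleeps briefly to mimic file I/O.
    time.sleep(0.01)
    return x * x

# Time the list(map(...)) stage on its own.
results, elapsed = timed(lambda xs: list(map(slow_square, xs)), [1, 2, 3])
print(f"{len(results)} items in {elapsed:.3f} s")
```

If nearly all of the elapsed time is spent inside the per-file function, the way the function is applied (map, comprehension or loop) is unlikely to matter much.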

Answer 1

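If I read the documentation correctly, PIL.Image.open() not only extracts the EXIF data from the file but also reads and decodes the entire image, which is probably the bottleneck here. The first thing I would do is switch to a library or routine that works only on the EXIF data and does not care about the image content. ExifRead or piexif might be worth a try.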
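
A minimal sketch of that suggestion with ExifRead, assuming the package is installed (`pip install exifread`). The helper names `find_jpegs` and `read_exif_fields` are hypothetical, and the tag keys `'EXIF DateTimeOriginal'` and `'EXIF MakerNote'` are the names ExifRead is assumed to use for these fields; it is worth printing `tags.keys()` for one of your own files to confirm. Whether this is actually faster than PIL has not been measured here.

```python
import os

def find_jpegs(root_dir):
    """Walk root_dir once and return (path, filename) pairs for JPEG files."""
    found = []
    for root, _dirs, files in os.walk(root_dir):
        for name in files:
            if name.lower().endswith(('.jpg', '.jpeg')):
                path = os.path.join(root, name).replace(os.sep, '/')
                found.append((path, name))
    return found

def read_exif_fields(path):
    """Return (DateTimeOriginal, MakerNote) as strings, '' when a tag is missing."""
    # Third-party dependency, imported here so the file walk works without it.
    import exifread
    with open(path, 'rb') as fh:
        # details=False asks ExifRead to skip decoding MakerNote sub-tags
        # and thumbnails, which should reduce the work done per file.
        tags = exifread.process_file(fh, details=False)
    # Assumed key names; missing tags fall back to an empty string.
    return (str(tags.get('EXIF DateTimeOriginal', '')),
            str(tags.get('EXIF MakerNote', '')))
```

Each CSV row could then be built as `(path, name, *read_exif_fields(path))` for every pair returned by `find_jpegs(imgpath)`, which also avoids walking the directory tree twice.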
