Python: points in a polygon (point cloud data)

Question (votes: 0, answers: 1)

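Number of points: 100,000,000 (4 GB)

I am reading a CSV file, testing whether each point lies inside a polygon, and saving the results as CSV files.

I am using csv.reader and it works correctly, but this code takes far too long to run.

How can I improve the performance of this task?

Please suggest alternative approaches.

Performance is the main concern here.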

from shapely.geometry import Point, Polygon
import csv
import os

req1 = input("path of the CSV file: ")

file_name = os.path.splitext(req1)
file_name = os.path.split(file_name[0])
path = file_name[0]
file_name = file_name[1]

with open(req1, "r") as f:  
    reader = csv.reader(f)
    next(reader) # skip header

    os.makedirs(path + "/" + file_name + "_output", exist_ok=True)
    outpath = path + "/" + file_name + "_output" + "/"

    coords = [[19.803499,15.2265],[-35.293499,33.7495],
            [-49.6675,33.726501],[-48.022499,20.4715],
            [-36.336498,-4.925],[-32.6105,-45.494499],
            [-10.5275,-38.3815],[-11.93835,-20.8235],
            [26.939501,-18.095501],[19.803499,15.2265]]

    poly = Polygon(coords)
    for row in reader:
        geom = Point(float(row[0]),float(row[1])) # Considering the order of elements that you gave

        x = float(row[0])
        y = float(row[1])
        z = float(row[2])
        r = int(row[3])
        g = int(row[4])
        b = int(row[5])
        i = int(row[6])

        result = geom.within(poly)

        if str(result) == 'True':
          with open(outpath + file_name + "_TRUE.csv", "a", newline = "") as file:
            writeData = ([str(x),',',str(y),',',str(z),',',str(r),',',str(g),',',str(b),',',str(i),('\n')])
            file.writelines(writeData)
            print('True', str(x),str(y),str(z))
        else:
          with open(outpath + file_name + "_FALSE.csv", "a", newline = "") as file:
            writeData = ([str(x),',',str(y),',',str(z),',',str(r),',',str(g),',',str(b),',',str(i),('\n')])
            file.writelines(writeData)
            #print('False', str(x),str(y),str(z))
python csv point point-clouds
1 answer
0 votes

I replaced csv.reader with pd.read_csv,

and performance improved.

I also tried Python multiprocessing, but I do not understand it well.

Processing time went from 1234 seconds to 31 seconds.

import pandas as pd
from shapely.geometry import Point, Polygon

data = pd.read_csv("/sample.csv")
poly = Polygon([(-0.7655,-22.758499), (17.0525,-21.657499),   (16.5735,-26.269501), (0.4755,-28.6635)])
cord = data.values.tolist()

for i in cord:
    print(poly.intersects(Point(i[0], i[1])), i)
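
The loop above only prints each result, while the question writes the points to separate TRUE and FALSE files and reopens the output file for every row. A minimal sketch of one way to combine the two, assuming the first two CSV columns are x and y: read the file in chunks with pd.read_csv, test each point against a prepared shapely polygon, and write each chunk to the two files in one call. The function name split_csv_by_polygon, its parameters, and the chunk size are hypothetical, and unlike the question's code this version keeps the header row in both outputs.

```python
import numpy as np
import pandas as pd
from shapely.geometry import Point, Polygon
from shapely.prepared import prep


def split_csv_by_polygon(in_path, true_path, false_path, coords, chunksize=1_000_000):
    """Stream in_path chunk by chunk and split its rows into two CSV files.

    Hypothetical helper: assumes columns 0 and 1 of the CSV hold x and y.
    """
    # A prepared geometry is intended for many repeated tests against one shape.
    prepared = prep(Polygon(coords))
    first = True
    for chunk in pd.read_csv(in_path, chunksize=chunksize):
        xy = chunk.iloc[:, :2].to_numpy(dtype=float)
        # One boolean per row: True when the point lies inside the polygon.
        mask = np.fromiter(
            (prepared.contains(Point(x, y)) for x, y in xy),
            dtype=bool,
            count=len(chunk),
        )
        # Overwrite on the first chunk, append afterwards; header only once.
        mode = "w" if first else "a"
        chunk[mask].to_csv(true_path, mode=mode, header=first, index=False)
        chunk[~mask].to_csv(false_path, mode=mode, header=first, index=False)
        first = False
```

Reading in chunks keeps memory use bounded for a file of this size, and each output file is opened once per chunk rather than once per point. Here contains matches the question's within test (boundary points are excluded), whereas the intersects call above also accepts points on the boundary.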

For example, here is sample code for a Python multiprocessing pool:

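import time
from multiprocessing import Pool

def f(x):
    time.sleep(2)  # Wait 2 seconds
    print(x * x)

if __name__ == "__main__":  # guard so worker processes do not re-run the pool setup on import
    p = Pool(8)             # start 8 worker processes
    p.map(f, [1, 2, 3, 4])  # run f on each item in parallel; blocks until all are done
    p.close()
    p.join()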

How should I apply this to my code?
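
One possible way to apply the pool pattern, offered as a sketch rather than a definitive answer: split the rows into chunks, let each worker classify one chunk, and concatenate the results in order. To keep the example free of third-party dependencies it uses a plain ray-casting test instead of shapely; that is a swapped-in technique, and points lying exactly on the boundary may be classified differently from poly.intersects. The names point_in_polygon, classify_chunk, and classify_parallel and the default chunk size are hypothetical.

```python
from functools import partial
from multiprocessing import Pool


def point_in_polygon(x, y, polygon):
    """Ray-casting test: count crossings of a ray going right from (x, y)."""
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (yi > y) != (yj > y):
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside


def classify_chunk(rows, polygon):
    """Return one boolean per row; row[0] and row[1] are assumed to be x and y."""
    return [point_in_polygon(float(r[0]), float(r[1]), polygon) for r in rows]


def classify_parallel(rows, polygon, workers=8, chunk_size=100_000):
    """Classify all rows, spreading chunks over a pool of worker processes."""
    chunks = [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]
    if workers <= 1:
        # Serial fallback, handy for debugging without spawning processes.
        results = [classify_chunk(c, polygon) for c in chunks]
    else:
        with Pool(workers) as pool:
            # pool.map returns results in the same order as the input chunks.
            results = pool.map(partial(classify_chunk, polygon=polygon), chunks)
    return [flag for chunk_flags in results for flag in chunk_flags]
```

A call such as classify_parallel(cord, coords, workers=8) would go under an if __name__ == "__main__": guard, with cord being the row list from the pandas snippet. Sending large chunks rather than single points keeps the cost of passing data between processes small relative to the work done. For 100,000,000 points the full row list may not fit in memory, so feeding the pool from chunked reads is probably necessary.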
