Why does the non-threaded program run faster than the threaded one when downloading a dataset in Python?


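I want to download datasets for companies listed on the Indian stock market, so I wrote the code below. It takes far too long, though, because I need data for about 1,700 companies.

First I wrote it the usual way, without threads, as shown below: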

import pandas_datareader as web
import pandas as pd
import csv
import requests
import time
import concurrent.futures
import datetime
from threading import Thread

start = datetime.date.today() - datetime.timedelta(days=10)
end = yesterday = datetime.date.today() - datetime.timedelta(days=1)

t1 = time.perf_counter()


df = web.DataReader("RELIANCE.NS", 'yahoo', start,end)
df = web.DataReader("TCS.NS", 'yahoo', start,end)
df = web.DataReader("HINDUNILVR.NS", 'yahoo', start,end)
df = web.DataReader("HDFCBANK.NS", 'yahoo', start,end)
df = web.DataReader("HDFC.NS", 'yahoo', start,end)
df = web.DataReader("INFY.NS", 'yahoo', start,end)
df = web.DataReader("KOTAKBANK.NS", 'yahoo', start,end)
df = web.DataReader("BHARTIARTL.NS", 'yahoo', start,end)
df = web.DataReader("ITC.NS", 'yahoo', start,end)
df = web.DataReader("ICICIBANK.NS", 'yahoo', start,end)
df = web.DataReader("SBIN.NS", 'yahoo', start,end)
df = web.DataReader("ASIANPAINT.NS", 'yahoo', start,end)
df = web.DataReader("DMART.NS", 'yahoo', start,end)
df = web.DataReader("BAJFINANCE.NS", 'yahoo', start,end)
df = web.DataReader("MARUTI.NS", 'yahoo', start,end)
df = web.DataReader("HCLTECH.NS", 'yahoo', start,end)
df = web.DataReader("LT.NS", 'yahoo', start,end)
df = web.DataReader("WIPRO.NS", 'yahoo', start,end)
df = web.DataReader("AXISBANK.NS", 'yahoo', start,end)
df = web.DataReader( "ULTRACEMCO.NS", 'yahoo', start,end)
df = web.DataReader("HDFCLIFE.NS", 'yahoo', start,end)
df = web.DataReader("COALINDIA.NS", 'yahoo', start,end)
df = web.DataReader("ONGC.NS", 'yahoo', start,end)
df = web.DataReader("SUNPHARMA.NS", 'yahoo', start,end)
df = web.DataReader("NTPC.NS", 'yahoo', start,end)


t2 = time.perf_counter()

print(f'Finished in {t2-t1} seconds')

Output

Finished in 27.4473087 seconds

Then I watched some YouTube videos about threading and converted the same program as follows:

import pandas_datareader as web
import pandas as pd
import csv
import requests
import time
import concurrent.futures
import datetime
from threading import Thread

start = datetime.date.today() - datetime.timedelta(days=10)
end = yesterday = datetime.date.today() - datetime.timedelta(days=1)

t1 = time.perf_counter()


shareSymbols = [
   "RELIANCE.NS", "TCS.NS", "HINDUNILVR.NS", "HDFCBANK.NS", "HDFC.NS", "INFY.NS","KOTAKBANK.NS","BHARTIARTL.NS", "ITC.NS", "ICICIBANK.NS", "SBIN.NS", "ASIANPAINT.NS","DMART.NS", "BAJFINANCE.NS", "MARUTI.NS", "HCLTECH.NS","LT.NS", "WIPRO.NS", "AXISBANK.NS", "ULTRACEMCO.NS", "HDFCLIFE.NS" ,"COALINDIA.NS", "ONGC.NS", "SUNPHARMA.NS", "NTPC.NS"
]
def download_data(shareSymbol):
    df = web.DataReader(shareSymbols, 'yahoo', start,end)


with concurrent.futures.ThreadPoolExecutor() as executor:
    executor.map(download_data, shareSymbols)    

t2 = time.perf_counter()

print(f'Finished in {t2-t1} seconds')

Output

Finished in 83.4883162 seconds

Why does the first program take less time than the second? Do I need to change anything?
python multithreading python-multithreading
1 Answer
The concurrent.futures package has

class concurrent.futures.ProcessPoolExecutor(max_workers=None, mp_context=None, initializer=None, initargs=())

for this.
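
One detail in the question's threaded version is worth noting: download_data receives shareSymbol but passes the whole shareSymbols list to web.DataReader, so each of the 25 tasks requests all 25 symbols. That would plausibly account for the longer run time. Below is a minimal sketch of the intended pattern, with one symbol per task. fetch_one is a hypothetical stand-in that sleeps instead of calling web.DataReader, and max_workers=8 is an arbitrary assumption, so the timing it prints is illustrative only.

```python
import concurrent.futures
import time


def fetch_one(symbol):
    """Hypothetical stand-in for web.DataReader(symbol, 'yahoo', start, end).

    It sleeps briefly to imitate network latency and returns a dummy value
    (the length of the symbol string) instead of a DataFrame.
    """
    time.sleep(0.05)
    return symbol, len(symbol)


def fetch_all(symbols, max_workers=8):
    """Run fetch_one once per symbol on a thread pool and collect the results."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # map passes ONE symbol to each call, not the whole list.
        return dict(executor.map(fetch_one, symbols))


if __name__ == "__main__":
    share_symbols = ["RELIANCE.NS", "TCS.NS", "INFY.NS", "ITC.NS"]

    t1 = time.perf_counter()
    results = fetch_all(share_symbols)
    # The timer stops after the with block has exited, i.e. after all tasks finished.
    t2 = time.perf_counter()

    print(f"Fetched {len(results)} symbols in {t2 - t1:.2f} seconds")
```

In a real script, the body of fetch_one would be the web.DataReader(symbol, 'yahoo', start, end) call from the question. ProcessPoolExecutor exposes the same map method, so it could be swapped in, though for network-bound downloads a thread pool is usually sufficient.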
