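This web page ('*https://www.nseindia.com/market-data/top-gainers-losers*') has 2 tables ('gainers' and 'losers').
I want code that reads the page content and downloads these two tables into two separate data frames. How can I achieve this?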
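This page uses JavaScript to generate its content, so normally you would need Selenium to control a real web browser that can run JavaScript.
However, there is a Download .csv button that downloads the table as CSV.
The button has no URL, only onclick="downloadCSVFile('loosers')", so you could probably download the file by running downloadCSVFile('loosers') in Selenium.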
Instead, I downloaded the file in Firefox, opened the Download Manager, selected the downloaded file, and used Copy Download Link to get the link to the file:
Gainers:
https://www.nseindia.com/api/live-analysis-variations?index=gainers&type=NIFTY&csv=true
Losers:
https://www.nseindia.com/api/live-analysis-variations?index=loosers&type=NIFTY&csv=true
Next I tested whether I could download it with requests:
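import requests
# one session, so cookies from the first request are reused in the second
session = requests.Session()
# the server does not answer requests without a User-Agent header
headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:125.0) Gecko/20100101 Firefox/125.0'}
# first request: load the main page (probably needed to get cookies)
url = 'https://www.nseindia.com/market-data/top-gainers-losers'
response = session.get(url, headers=headers)
#print(response.text)
# second request: download the CSV for the 'gainers' table
url = 'https://www.nseindia.com/api/live-analysis-variations?index=gainers&type=NIFTY&csv=true'
response = session.get(url, headers=headers)
# decode as utf-8-sig to skip the BOM at the start of the text
response.encoding = "utf-8-sig"
print(response.text)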
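First: it needs the User-Agent header to download anything from the server. Without this header the request hangs.
Second: you have to request the main page before downloading, probably to get some cookies.
Third: the text comes back with a few stray characters at the beginning. This is the BOM (Byte Order Mark). You need to use the encoding utf-8-sig to skip it.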
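The BOM point can be checked offline. The sketch below uses a made-up byte string (not real server data) that starts with the UTF-8 BOM and decodes it three ways. The ISO-8859-1 case is only an assumption about why the stray characters can show up as ï»¿: requests falls back to that encoding for text responses that declare no charset, and I have not verified which headers this server sends.

```python
import codecs

# Hypothetical sample standing in for the first bytes of the CSV response.
raw = codecs.BOM_UTF8 + b'"Symbol","Open"\n"ITC",418\n'

# Plain utf-8 keeps the BOM as an invisible U+FEFF character.
as_utf8 = raw.decode("utf-8")

# ISO-8859-1 turns the three BOM bytes into three visible characters.
as_latin1 = raw.decode("iso-8859-1")

# utf-8-sig removes the BOM while decoding.
as_utf8_sig = raw.decode("utf-8-sig")

print(repr(as_utf8[:8]))
print(repr(as_latin1[:8]))
print(repr(as_utf8_sig[:8]))
```

In the requests code above, setting response.encoding = "utf-8-sig" before reading response.text applies the same decoding to the real response.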
This gives me:
"Symbol","Open","High","Low","Prev. Close","LTP","%chng","Volume","Value","CA "
"BAJFINANCE",6840.05,7150,6810.05,6893.2,7110,3.15,1218375,8520583725,"30-Jun-2023"
"M&M",2034,2087,1998.2,2024.95,2084,2.92,3253248,6691117824,"14-Jul-2023"
"HDFCBANK",1486.55,1534.95,1480.25,1494.7,1534.2,2.64,17288217,26183869057.35,"16-May-2023"
"MARUTI",12399.9,12759.4,12225,12405,12690,2.3,635535,7949392531.65,"03-Aug-2023"
"JSWSTEEL",842.85,867.3,833.2,844.8,864,2.27,3157898,2697823840.38,"11-Jul-2023"
"BHARTIARTL",1280,1296.5,1253.35,1265.75,1289.95,1.91,13103862,16751846142.18,"11-Aug-2023"
"GRASIM",2220,2290.75,2201.35,2226.05,2266.9,1.84,1061064,2394651677.76,"10-Jan-2024"
"WIPRO",440,453.9,437,444.35,452.1,1.74,10235053,4572612278.28,"24-Jan-2024"
"BAJAJFINSV",1587,1628.75,1568.7,1593.9,1618.6,1.55,1242066,1990398344.34,"30-Jun-2023"
"APOLLOHOSP",6140,6199,6050,6074.15,6155,1.33,560178,3437179384.86,"20-Feb-2024"
"ITC",418,426.25,416,418.85,424,1.23,16582634,7025067067.76,"08-Feb-2024"
"ICICIBANK",1052.95,1072,1048.1,1055.45,1068,1.19,11284433,11984406378.99,"09-Aug-2023"
"ADANIPORTS",1280,1316,1270,1295.55,1310.8,1.18,3899281,5057640406.67,"28-Jul-2023"
"TATASTEEL",160,162.5,157.3,160.05,161.9,1.16,60078229,9640753407.63,"22-Jun-2023"
"TECHM",1163.05,1204.85,1162.95,1179.65,1192.5,1.09,2572144,3057173194.08,"02-Nov-2023"
"TITAN",3525.1,3571.2,3478.25,3525.1,3562,1.05,1507940,5329859168.2,"13-Jul-2023"
"AXISBANK",1015,1036.95,995.7,1024,1031.7,0.75,21598007,21821762352.52,"07-Jul-2023"
"SBIN",734.5,752,732.05,744.8,750,0.7,10886554,8092302184.82,"31-May-2023"
"HINDUNILVR",2220,2243.75,2196,2214.8,2230,0.69,2337694,5205834145.54,"02-Nov-2023"
"INDUSINDBK",1466,1490.25,1444.4,1474.4,1483,0.58,4311650,6341790402.5,"02-Jun-2023"
Using io, I can load it into pandas:
import pandas as pd
import io
df = pd.read_csv(io.StringIO(response.text))
print(df)
Result:
0 BAJFINANCE 6840.05 7150.00 6810.05 6893.20 7110.00 3.15 1218375 8.520584e+09 30-Jun-2023
1 M&M 2034.00 2087.00 1998.20 2024.95 2084.00 2.92 3253248 6.691118e+09 14-Jul-2023
2 HDFCBANK 1486.55 1534.95 1480.25 1494.70 1534.20 2.64 17288217 2.618387e+10 16-May-2023
3 MARUTI 12399.90 12759.40 12225.00 12405.00 12690.00 2.30 635535 7.949393e+09 03-Aug-2023
4 JSWSTEEL 842.85 867.30 833.20 844.80 864.00 2.27 3157898 2.697824e+09 11-Jul-2023
5 BHARTIARTL 1280.00 1296.50 1253.35 1265.75 1289.95 1.91 13103862 1.675185e+10 11-Aug-2023
6 GRASIM 2220.00 2290.75 2201.35 2226.05 2266.90 1.84 1061064 2.394652e+09 10-Jan-2024
7 WIPRO 440.00 453.90 437.00 444.35 452.10 1.74 10235053 4.572612e+09 24-Jan-2024
8 BAJAJFINSV 1587.00 1628.75 1568.70 1593.90 1618.60 1.55 1242066 1.990398e+09 30-Jun-2023
9 APOLLOHOSP 6140.00 6199.00 6050.00 6074.15 6155.00 1.33 560178 3.437179e+09 20-Feb-2024
10 ITC 418.00 426.25 416.00 418.85 424.00 1.23 16582634 7.025067e+09 08-Feb-2024
11 ICICIBANK 1052.95 1072.00 1048.10 1055.45 1068.00 1.19 11284433 1.198441e+10 09-Aug-2023
12 ADANIPORTS 1280.00 1316.00 1270.00 1295.55 1310.80 1.18 3899281 5.057640e+09 28-Jul-2023
13 TATASTEEL 160.00 162.50 157.30 160.05 161.90 1.16 60078229 9.640753e+09 22-Jun-2023
14 TECHM 1163.05 1204.85 1162.95 1179.65 1192.50 1.09 2572144 3.057173e+09 02-Nov-2023
15 TITAN 3525.10 3571.20 3478.25 3525.10 3562.00 1.05 1507940 5.329859e+09 13-Jul-2023
16 AXISBANK 1015.00 1036.95 995.70 1024.00 1031.70 0.75 21598007 2.182176e+10 07-Jul-2023
17 SBIN 734.50 752.00 732.05 744.80 750.00 0.70 10886554 8.092302e+09 31-May-2023
18 HINDUNILVR 2220.00 2243.75 2196.00 2214.80 2230.00 0.69 2337694 5.205834e+09 02-Nov-2023
19 INDUSINDBK 1466.00 1490.25 1444.40 1474.40 1483.00 0.58 4311650 6.341790e+09 02-Jun-2023
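To get both tables into two separate data frames, as the question asks, the same steps can be wrapped in small functions. This is only a sketch built on the two links found above: the function names, the constants, and the 10 second timeout are my own choices, and the API is undocumented, so it may change or start rejecting requests. Nothing is downloaded until fetch_gainers_and_losers() is called.

```python
import io

import pandas as pd
import requests

# URLs taken from the links above; the constant names are illustrative.
PAGE_URL = "https://www.nseindia.com/market-data/top-gainers-losers"
API_URL = "https://www.nseindia.com/api/live-analysis-variations"
HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:125.0) Gecko/20100101 Firefox/125.0"
}


def csv_text_to_frame(text):
    """Parse CSV text into a DataFrame, dropping a leading BOM if present."""
    return pd.read_csv(io.StringIO(text.lstrip("\ufeff")))


def fetch_table(session, index, timeout=10):
    """Download one table as a DataFrame; index is 'gainers' or 'loosers'."""
    params = {"index": index, "type": "NIFTY", "csv": "true"}
    response = session.get(API_URL, params=params, headers=HEADERS, timeout=timeout)
    response.raise_for_status()       # fail loudly on 4xx/5xx responses
    response.encoding = "utf-8-sig"   # skip the BOM when decoding
    return csv_text_to_frame(response.text)


def fetch_gainers_and_losers():
    """Return (gainers, losers) as two separate DataFrames."""
    with requests.Session() as session:
        # Load the main page first so the session holds its cookies.
        session.get(PAGE_URL, headers=HEADERS, timeout=10)
        gainers = fetch_table(session, "gainers")
        # The site itself spells this value 'loosers'.
        losers = fetch_table(session, "loosers")
    return gainers, losers
```

Usage would be gainers_df, losers_df = fetch_gainers_and_losers(). The timeout is there so that a request which would otherwise hang raises an exception instead.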