Problem converting JSON to CSV

Question  Votes: 0  Answers: 4

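I'm trying to pull some statistics from the NBA stats pages. I'm following the idea in this tutorial: https://towardsdatascience.com/using-python-pandas-and-plotly-to-generate-nba-shot-charts-e28f873a99cb

The basic idea is to put the data into a CSV file.

So I tried the code below to get the data from the NBA website: it is supposed to fetch the JSON and convert it to CSV: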

import requests
import json
import pandas as pd
from pandas import DataFrame as df
import urllib.request



shot_data_url_start="https://stats.nba.com/events/?flag=3&CFID=33&CFPARAMS=2017-18&PlayerID="
player_id="202695"
shot_data_url_end="&ContextMeasure=FGA&Season=2017-18&section=player&sct=plot"

def shoy_chart(player_id):
   full_url = shot_data_url_start + str(player_id) + shot_data_url_end
   json = requests.get(full_url, headers=headers).json()
return(json)



data = json['resultSets'][0]['rowSets']
columns = json['resultSets'][0]['headers']


df = pd.DataFrame.from_records(data, columns=columns)

This is the error the notebook shows me:

TypeError                                 Traceback (most recent call last)
<ipython-input-42-a3452c3a4fc8> in <module>
 18 
 19 
 ---> 20 data = json['resultSets'][0]['rowSets']
 21 columns = json['resultSets'][0]['headers']
 22 

 TypeError: 'module' object is not subscriptable

Can anyone help me, or does anyone know another way to save the data to a .csv or Excel file?

python json csv export-to-csv export-to-excel
4 Answers
1
vote
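After import json, the name json refers to the JSON module from the Python standard library, so you should not reuse it as an ordinary variable name. In your code the json assigned inside the function is local to that function (and the function is never called), so at module level json is still the module, which is why subscripting it raises the TypeError. If you rename the variable to something else, for example response_json, and use the value the function returns, this part of the code will work.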
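
The scoping issue can be reproduced without any network access. The snippet below is only an illustrative sketch: the function name fetch and the tiny dictionary it returns are made up, and merely mimic the shape of the data discussed in the question.

```python
import json


def fetch():
    # This assignment creates a *local* name "json"; it shadows the module
    # only inside this function and disappears when the function returns.
    json = {"resultSets": [{"headers": ["A"], "rowSet": [[1]]}]}
    return json


# At module level, "json" is still the standard-library module,
# so subscripting it fails exactly as in the question.
try:
    json["resultSets"]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Using a different name and the function's return value avoids the problem.
response_json = fetch()
headers = response_json["resultSets"][0]["headers"]
```

Calling the function and storing its return value under a distinct name is what makes the later lookups work.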

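As for the rest of the code, the page https://stats.nba.com/events/ does not return any JSON; it is a regular web page with images, menus, a video player and so on. If you want the API that returns the shots in JSON format, you have to use https://stats.nba.com/stats/shotchartdetail (with the correct query string). The tutorial mentions this API endpoint in the picture captioned "Chrome XHR tab and resulting json linked by URL".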
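
One way to keep that long query string manageable is to treat a known URL as a template and override only the parameters that change. The following is a sketch under assumptions: the helper name build_shot_url is hypothetical, the template is the shotchartdetail URL posted elsewhere in this thread, and whether the server accepts any given parameter combination has not been checked here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Template taken from the shotchartdetail URL posted in this thread.
TEMPLATE_URL = (
    "https://stats.nba.com/stats/shotchartdetail?AheadBehind=&CFID=33"
    "&CFPARAMS=2017-18&ClutchTime=&Conference=&ContextFilter="
    "&ContextMeasure=FGA&DateFrom=&DateTo=&Division=&EndPeriod=10"
    "&EndRange=28800&GROUP_ID=&GameEventID=&GameID=&GameSegment=&GroupID="
    "&GroupMode=&GroupQuantity=5&LastNGames=0&LeagueID=00&Location=&Month=0"
    "&OnOff=&OpponentTeamID=0&Outcome=&PORound=0&Period=0&PlayerID=202695"
    "&PlayerID1=&PlayerID2=&PlayerID3=&PlayerID4=&PlayerID5=&PlayerPosition="
    "&PointDiff=&Position=&RangeType=0&RookieYear=&Season=2017-18"
    "&SeasonSegment=&SeasonType=Regular+Season&ShotClockRange=&StartPeriod=1"
    "&StartRange=0&StarterBench=&TeamID=0&VsConference=&VsDivision="
    "&VsPlayerID1=&VsPlayerID2=&VsPlayerID3=&VsPlayerID4=&VsPlayerID5="
    "&VsTeamID="
)


def build_shot_url(player_id, season):
    """Return the template URL with PlayerID and the season fields replaced."""
    parts = urlsplit(TEMPLATE_URL)
    # keep_blank_values=True preserves the many empty parameters.
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    params["PlayerID"] = str(player_id)
    params["Season"] = season
    params["CFPARAMS"] = season
    return urlunsplit(parts._replace(query=urlencode(params)))


url = build_shot_url("204001", "2019-20")
```

With requests, the same dictionary could instead be passed through the params argument of requests.get, which encodes the query string itself.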


0
votes
OK, I have changed the code like this:

import requests
import json
import pandas as pd
from pandas import DataFrame as df
import urllib.request

def shot_chart(player_id):
    full_url = "https://stats.nba.com/stats/shotchartdetail?AheadBehind=&CFID=33&CFPARAMS=2017-18&ClutchTime=&Conference=&ContextFilter=&ContextMeasure=FGA&DateFrom=&DateTo=&Division=&EndPeriod=10&EndRange=28800&GROUP_ID=&GameEventID=&GameID=&GameSegment=&GroupID=&GroupMode=&GroupQuantity=5&LastNGames=0&LeagueID=00&Location=&Month=0&OnOff=&OpponentTeamID=0&Outcome=&PORound=0&Period=0&PlayerID=202695&PlayerID1=&PlayerID2=&PlayerID3=&PlayerID4=&PlayerID5=&PlayerPosition=&PointDiff=&Position=&RangeType=0&RookieYear=&Season=2017-18&SeasonSegment=&SeasonType=Regular+Season&ShotClockRange=&StartPeriod=1&StartRange=0&StarterBench=&TeamID=0&VsConference=&VsDivision=&VsPlayerID1=&VsPlayerID2=&VsPlayerID3=&VsPlayerID4=&VsPlayerID5=&VsTeamID="
    response_json = requests.get(full_url, headers=headers)
    return(response_json)

data = response_json['resultSets'][0]['rowSets']
columns = response_json['resultSets'][0]['headers']

df = pd.DataFrame.from_records(data, columns=columns)


0
votes
import requests
import json
import pandas as pd
from pandas import DataFrame as df
import urllib.request

shot_data_url_start="https://stats.nba.com/stats/shotchartdetail?AheadBehind=&CFID=33&CFPARAMS=2019-20&ClutchTime=&Conference=&ContextFilter=&ContextMeasure=FGA&DateFrom=&DateTo=&Division=&EndPeriod=10&EndRange=28800&GROUP_ID=&GameEventID=&GameID=&GameSegment=&GroupID=&GroupMode=&GroupQuantity=5&LastNGames=0&LeagueID=00&Location=&Month=0&OnOff=&OpponentTeamID=0&Outcome=&PORound=0&Period=0&PlayerID="
player_id="202330"
shot_data_url_end="&PlayerID1=&PlayerID2=&PlayerID3=&PlayerID4=&PlayerID5=&PlayerPosition=&PointDiff=&Position=&RangeType=0&RookieYear=&Season=2019-20&SeasonSegment=&SeasonType=Regular+Season&ShotClockRange=&StartPeriod=1&StartRange=0&StarterBench=&TeamID=0&VsConference=&VsDivision=&VsPlayerID1=&VsPlayerID2=&VsPlayerID3=&VsPlayerID4=&VsPlayerID5=&VsTeamID="

def shot_chart(player_id):
    full_url = shot_data_url_start + str(player_id) + shot_data_url_end
    response_json = requests.get(full_url).json()
    return(response_json)

data = response_json['resultSets'][0]['rowSets']
columns = response_json['resultSets'][0]['headers']

df = pd.DataFrame.from_records(data, columns=columns)

shot_chart("202330")

What is happening now? The notebook just seems to hang.

0
votes
Try this:

import requests
import pandas as pd

shot_data_url_start = "https://stats.nba.com/stats/shotchartdetail?AheadBehind=&CFID=33&CFPARAMS=2017-18&ClutchTime=&Conference=&ContextFilter=&ContextMeasure=FGA&DateFrom=&DateTo=&Division=&EndPeriod=10&EndRange=28800&GROUP_ID=&GameEventID=&GameID=&GameSegment=&GroupID=&GroupMode=&GroupQuantity=5&LastNGames=0&LeagueID=00&Location=&Month=0&OnOff=&OpponentTeamID=0&Outcome=&PORound=0&Period=0&PlayerID="
player_id = "204001"
shot_data_url_end = "&PlayerID1=&PlayerID2=&PlayerID3=&PlayerID4=&PlayerID5=&PlayerPosition=&PointDiff=&Position=&RangeType=0&RookieYear=&Season=2017-18&SeasonSegment=&SeasonType=Regular+Season&ShotClockRange=&StartPeriod=1&StartRange=0&StarterBench=&TeamID=0&VsConference=&VsDivision=&VsPlayerID1=&VsPlayerID2=&VsPlayerID3=&VsPlayerID4=&VsPlayerID5=&VsTeamID="

def get_shot_data(player_id):
    # Assemble the full API URL around the player ID
    full_url = shot_data_url_start + player_id + shot_data_url_end
    # A User-Agent header is passed explicitly; see the edit note below
    data = requests.get(
        full_url,
        headers = { "User-Agent": "PostmanRuntime/7.4.0" }
    )
    return data.json()

shot_results = get_shot_data(player_id)
result_sets = shot_results['resultSets']
first_result_set = result_sets[0]
row_set = first_result_set['rowSet']  # the key used here is 'rowSet' (singular)
set_headers = first_result_set['headers']

df = pd.DataFrame.from_records(row_set, columns=set_headers)
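
The original question also asked how to get the data into a .csv file. Below is a minimal sketch that uses only the standard-library csv module, assuming the response has the resultSets / headers / rowSet shape used in the code above. The helper name result_set_to_csv is hypothetical, and the sample payload is made up purely to show the shape; it is not real NBA data.

```python
import csv
import io


def result_set_to_csv(payload, fileobj, index=0):
    """Write one result set (header row plus data rows) to an open text file."""
    result_set = payload["resultSets"][index]
    writer = csv.writer(fileobj)
    writer.writerow(result_set["headers"])
    writer.writerows(result_set["rowSet"])
    return len(result_set["rowSet"])


# Made-up payload that only mimics the structure of the API response.
sample_payload = {
    "resultSets": [
        {"headers": ["COL_A", "COL_B"], "rowSet": [[1, "x"], [2, "y"]]}
    ]
}

# An in-memory buffer stands in for a real file here.
buffer = io.StringIO()
rows_written = result_set_to_csv(sample_payload, buffer)
csv_text = buffer.getvalue()
```

To write a real file, open it with open("shots.csv", "w", newline="") and pass the file object together with the parsed response. If the pandas DataFrame has already been built, df.to_csv("shots.csv", index=False) achieves the same result.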

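I can see that the Medium post confused you. You were missing headers, and the URL you used for the NBA API was wrong. That is what @pierre was saying in his answer. If you re-read the article you were following, you will see that the author says he had to dig into the browser's developer tools to find the actual URL that returns the JSON.

Edit: I forgot to mention that when I did not pass a User-Agent in headers, the request timed out. If you do not pass one, you will not get a successful response.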
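
Because a missing User-Agent is reported to make the request hang, it helps to set both the header and an explicit timeout so a failure surfaces quickly. The sketch below deliberately uses urllib.request from the standard library instead of requests, so it has no third-party dependency. The helper names build_request and fetch_json are hypothetical, the User-Agent value is the one from the answer above, and the 10-second timeout is an arbitrary choice. Whether the server currently accepts this header value has not been verified here.

```python
import json
import urllib.request

USER_AGENT = "PostmanRuntime/7.4.0"  # value taken from the answer above


def build_request(url):
    """Create a request object that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch_json(url, timeout=10):
    """Fetch a URL and parse the body as JSON.

    Raises an exception instead of hanging indefinitely if the server
    does not answer within `timeout` seconds.
    """
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not touch the network; only fetch_json does.
request = build_request("https://stats.nba.com/stats/shotchartdetail")
```

The equivalent call with requests is requests.get(url, headers={"User-Agent": ...}, timeout=10).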
