I have six large TSV files that I am reading into DataFrames in Google Colab. However, the files are too large for Google Colab to handle.
#Crew data
downloaded = drive.CreateFile({'id':'16'})
downloaded.GetContentFile('title.crew.tsv')
df_crew = pd.read_csv('title.crew.tsv',header=None,sep='\t',dtype='unicode')
#Ratings data
downloaded = drive.CreateFile({'id':'15'})
downloaded.GetContentFile('title.ratings.tsv')
df_ratings = pd.read_csv('title.ratings.tsv',header=None,sep='\t',dtype='unicode')
#Episode data
downloaded = drive.CreateFile({'id':'14'})
downloaded.GetContentFile('title.episode.tsv')
df_episode = pd.read_csv('title.episode.tsv',header=None,sep='\t',dtype='unicode')
#Name Basics data
downloaded = drive.CreateFile({'id':'13'})
downloaded.GetContentFile('name.basics.tsv')
df_name = pd.read_csv('name.basics.tsv',header=None,sep='\t',dtype='unicode')
#Principals data
downloaded = drive.CreateFile({'id':'12'})
downloaded.GetContentFile('title.principals.tsv')
df_principals = pd.read_csv('title.principals.tsv',header=None,sep='\t',dtype='unicode')
#Title Basics data
downloaded = drive.CreateFile({'id':'11'})
downloaded.GetContentFile('title.basics.tsv')
df_title = pd.read_csv('title.basics.tsv',header=None,sep='\t',dtype='unicode')
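Error: Your session crashed after using all available RAM. The runtime log says:
How can I get Google Colab to handle RAM better? The total size of all my TSV files is 2,800 MB. Please advise!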
Here is an interesting trick for upgrading the memory:
https://towardsdatascience.com/upgrade-your-memory-on-google-colab-for-free-1b8b18e8791d
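Separately from upgrading the runtime, it may help to make each DataFrame smaller at read time, since `dtype='unicode'` stores every value as a Python string. Below is a minimal sketch, not a definitive fix. It assumes the file has a header row and that the ratings columns are named `tconst`, `averageRating` and `numVotes`, which the question does not confirm. `read_tsv_compact` and `row_filter` are hypothetical names introduced here for illustration, and a tiny in-memory sample stands in for the real file.

```python
import io

import pandas as pd


def read_tsv_compact(source, usecols, dtype, row_filter=None, chunksize=100_000):
    """Read a TSV chunk by chunk, keeping only selected columns and rows.

    Hypothetical helper: `row_filter` is an optional function that takes a
    chunk (DataFrame) and returns a boolean mask of the rows to keep.
    """
    reader = pd.read_csv(
        source,
        sep="\t",
        usecols=usecols,      # skip columns that are never used
        dtype=dtype,          # compact numeric types instead of strings
        na_values="\\N",      # assumption: missing values are written as \N
        chunksize=chunksize,  # parse the file a piece at a time
    )
    parts = []
    for chunk in reader:
        if row_filter is not None:
            chunk = chunk[row_filter(chunk)]  # drop unwanted rows early
        parts.append(chunk)
    return pd.concat(parts, ignore_index=True)


# Tiny stand-in for title.ratings.tsv (column names are an assumption).
sample = io.StringIO(
    "tconst\taverageRating\tnumVotes\n"
    "tt0000001\t5.7\t1900\n"
    "tt0000002\t5.8\t260\n"
    "tt0000003\t6.5\t1800\n"
)

df_ratings = read_tsv_compact(
    sample,
    usecols=["tconst", "averageRating", "numVotes"],
    dtype={"averageRating": "float32", "numVotes": "int32"},
    row_filter=lambda chunk: chunk["numVotes"] >= 1000,
)

print(df_ratings.dtypes)
print(df_ratings.memory_usage(deep=True).sum(), "bytes")
```

With the real files, `sample` would be replaced by a path such as `'title.ratings.tsv'`. Integer dtypes like `int32` only work for columns without missing values; a column that contains `\N` would need a float or nullable type instead. Deleting DataFrames that are no longer needed (`del df_crew`) before loading the next file can also lower peak usage.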
Good luck!