Memory error in Python when running under PyCharm

Question  Votes: 0  Answers: 1

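The code below reads a CSV file and writes its output to another CSV file. It works fine, but once the CSV file grows (more rows), it fails with an error. I tried setting -Xms to 512m, -Xmx to 2024m, and -XX:ReservedCodeCacheSize to 480m, but I still get a memory error.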

Traceback (most recent call last):
File "/root/PycharmProjects/AppAct/statfile.py", line 5, in <module>
   df = df.astype(float)
File "pandas/core/generic.py", line 5691, in astype
   **kwargs)
File "pandas/core/internals/managers.py", line 531, in astype
   return self.apply('astype', dtype=dtype, **kwargs)
File "pandas/core/internals/managers.py", line 402, in apply
   bm._consolidate_inplace()
File "pandas/core/internals/managers.py", line 929, in _consolidate_inplace
   self.blocks = tuple(_consolidate(self.blocks))
File "pandas/core/internals/managers.py", line 1899, in _consolidate
   _can_consolidate=_can_consolidate)
File "pandas/core/internals/blocks.py", line 3149, in _merge_blocks
   new_values = new_values[argsort]
MemoryError

import pandas as pd

all_df = pd.read_csv("/root/Desktop/Time-20ms/AllDataNew20ms.csv")
df = all_df.loc[:, all_df.columns != "activity"]
df = df.astype(float)
mask = (df != 0).any(axis=1)
df = df[mask]
recover_lines_of_activity_column = all_df["activity"][mask]
final_df = pd.concat([recover_lines_of_activity_column, df], axis=1)
final_df.to_csv("/root/Desktop/Dataset.csv", index=False)
python pandas pycharm
1 answer
0 votes

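Changing PyCharm's memory limits (-Xms and the other JVM settings) has absolutely no effect on the Python interpreter that actually runs your Python code. Those options only size the JVM that the IDE itself runs in.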

Put simply, your system runs out of memory while converting the entire dataframe to floats (df = df.astype(float)).
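To get a feel for why the conversion fails, here is a rough back-of-the-envelope sketch. It assumes every non-"activity" column ends up as float64 (8 bytes per value); the row and column counts are hypothetical, since the question does not state the real file's dimensions. Because astype returns a new object, the original and converted data can both be in memory at the same time, so the peak is likely higher than this single figure.

```python
def estimate_float64_bytes(n_rows, n_cols):
    """Approximate size of an n_rows x n_cols block of float64 values."""
    bytes_per_value = 8  # float64 occupies 8 bytes
    return n_rows * n_cols * bytes_per_value


# Hypothetical dimensions, for illustration only.
n_rows, n_cols = 50_000_000, 20
size_gib = estimate_float64_bytes(n_rows, n_cols) / 2**30
print(f"about {size_gib:.1f} GiB for one float64 copy")
```

On a dataframe that is already loaded, df.memory_usage(deep=True).sum() reports the actual footprint.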

Besides changing the code to do things more efficiently, you could add physical memory or enable swap.

(Also, are you sure you are using a 64-bit Python?)
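One quick way to answer the 64-bit question is to ask the interpreter itself. This is a minimal sketch using only the standard library; run it with the same interpreter that PyCharm is configured to use for the project.

```python
import struct
import sys

# Size of a C pointer in bits: 64 on a 64-bit build, 32 on a 32-bit build.
pointer_bits = struct.calcsize("P") * 8

# Equivalent check: sys.maxsize exceeds 2**32 only on 64-bit builds.
is_64bit = sys.maxsize > 2**32

print(f"{pointer_bits}-bit Python, interpreter at {sys.executable}")
```

A 32-bit interpreter can only address a few gigabytes per process, regardless of how much RAM the machine has.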

One simple optimization is to reduce the amount of copying and converting: pass dtype=... directly to pd.read_csv(). For an example, see this answer.
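A minimal sketch of the dtype suggestion follows, not a drop-in replacement for the original script. It assumes every column except "activity" is numeric; the helper name filter_nonzero_rows and the tiny in-memory CSV are hypothetical stand-ins for the real file. The header is read first (nrows=0) so the dtype mapping can be built without knowing the column names in advance, and the separate astype(float) step is no longer needed.

```python
import io

import pandas as pd

# Hypothetical sample data; only the "activity" column name comes from the question.
csv_text = "activity,x,y\nwalk,0,0\nrun,1,2\nsit,0,3\n"


def filter_nonzero_rows(open_source):
    """Load a CSV with float columns parsed directly, dropping all-zero rows.

    open_source is a zero-argument callable returning a path or file-like
    object, so the source can be opened twice (header, then data).
    """
    # Read only the header row to discover the column names.
    columns = pd.read_csv(open_source(), nrows=0).columns
    # Parse every column except "activity" as float64 while reading.
    dtypes = {name: "float64" for name in columns if name != "activity"}
    all_df = pd.read_csv(open_source(), dtype=dtypes)
    # Keep rows where at least one numeric column is non-zero.
    mask = (all_df[list(dtypes)] != 0).any(axis=1)
    # Put "activity" first, as in the original output.
    return all_df.loc[mask, ["activity", *dtypes]]


result = filter_nonzero_rows(lambda: io.StringIO(csv_text))
print(result)
```

With a real file, the callable would simply return the path, for example lambda: "/root/Desktop/Time-20ms/AllDataNew20ms.csv", and the result could be written out with result.to_csv(..., index=False). If the values tolerate lower precision, "float32" in place of "float64" would roughly halve the memory needed for the numeric columns.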
