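Hi, I want to subclass a pandas DataFrame, but the subclass should also inherit from a custom class of my own. I want to do this because I plan to make several subclassed DataFrames, as well as other subclasses (not DataFrames) that share the attributes and methods of this base class.
To start, my base class is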
    import os

    class thing(object):
        def __init__(self, item_location, name):
            self.name = name
            self.file = item_location
            self.directory = os.path.join(*item_location.split(os.path.sep)[0:-1])

        @property
        def name(self):
            return self._name

        @name.setter
        def name(self, val):
            self._name = val

        @property
        def file(self):
            return self._file

        @file.setter
        def file(self, val):
            # store under _file so the getter above can read it back
            self._file = val

        @property
        def directory(self):
            return self._directory

        @directory.setter
        def directory(self, val):
            self._directory = val
Now here is one of the subclasses, which inherits from both pandas and thing
    import pandas as pd

    class custom_dataframe(thing, pd.DataFrame):
        def __init__(self, *args, **kwargs):
            super(custom_dataframe, self).__init__(*args, **kwargs)

        @property
        def _constructor(self):
            return custom_dataframe
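I am just trying to make an empty DataFrame, giving it only a file location and a name
    custom_dataframe('/foobar/foobar/foobar.html', 'name')
and I get an error
(I can't post the whole stack trace because it is on a computer that is not connected to the internet)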
      File "<stdin>", line 1, in <module>
      File "<path to file with classes>", line x, in __init__
        self.name = name
      <a bunch of stuff going through pandas library>
      File "<path to pandas generic.py>", line 4372, in __getattr__
        return object.__getattribute__(self,name)
    RecursionError: maximum recursion depth exceeded while calling a Python object
I am using pandas 0.23.4
Edit:
Changed item_location.split(os.pathsep)[0:-1] to *item_location.split(os.path.sep)[0:-1]
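In the comments you said "I've read that". However, you have not, and this is the root of the problem, since that describes the steps for subclassing a pandas DataFrame, including how to define original properties.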
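For context, the same kind of RecursionError can be reproduced without pandas. The sketch below is a toy model, not pandas source: Frame, Thing, Broken and Fixed are hypothetical names, and the assumption is that the frame's attribute hooks consult column data that only exists after the frame's own __init__ has run. It shows how assigning an attribute too early can loop between a property and __getattr__, and how listing the name in _metadata sidesteps the lookup.

```python
class Frame:
    """Toy stand-in for a DataFrame (an assumption, not pandas source)."""

    _metadata = []  # names stored as plain instance attributes

    def __init__(self):
        object.__setattr__(self, "_data", {})

    @property
    def columns(self):
        # Raises AttributeError when __init__ has not run yet.
        return list(object.__getattribute__(self, "_data"))

    def __getattr__(self, name):
        # Called only after normal lookup fails, including when a
        # property getter raises AttributeError.
        if name in self._metadata:
            return object.__getattribute__(self, name)
        if name in self.columns:  # re-enters the failing property
            return object.__getattribute__(self, "_data")[name]
        raise AttributeError(name)

    def __setattr__(self, name, value):
        if name in self._metadata:
            object.__setattr__(self, name, value)
        elif name in self.columns:  # the column check needs _data
            object.__getattribute__(self, "_data")[name] = value
        else:
            object.__setattr__(self, name, value)


class Thing:
    def __init__(self, name):
        self.name = name


class Broken(Thing, Frame):
    # Thing.__init__ comes first in the MRO and never calls
    # Frame.__init__, so _data is missing when self.name is assigned.
    pass


class Fixed(Thing, Frame):
    _metadata = ["name"]

    def __init__(self, name):
        Frame.__init__(self)
        Thing.__init__(self, name)


try:
    Broken("x")
    outcome = "no error"
except RecursionError:
    outcome = "RecursionError"

print(outcome)
print(Fixed("x").name)
```

Whether pandas 0.23.4 follows exactly this path is not something the truncated traceback in the question can confirm, but the last frame being __getattr__ is consistent with it.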
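Consider the following modification of your code. The key part is _metadata. I removed all the properties from the thing class because they inflate the number of original attribute names, all of which would have to be added to _metadata. I also added a __repr__ method to fix another RecursionError. Finally, I removed the directory attribute because it gave me a TypeError.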
    import pandas as pd

    class thing(object):
        def __init__(self, item_location, name):
            self.name = name
            self.file = item_location

        def __repr__(self):
            return 'dummy_repr'

    class custom_dataframe(thing, pd.DataFrame):
        _metadata = ['name', 'file', 'directory']

        def __init__(self, *args, **kwargs):
            super(custom_dataframe, self).__init__(*args, **kwargs)

        @property
        def _constructor(self):
            return custom_dataframe

    if __name__ == '__main__':
        cd = custom_dataframe('/foobar/foobar/foobar.html', 'name')
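As for the directory value removed above, here is a minimal sketch of how it could be computed with the standard library. It uses posixpath (which is what os.path is on Linux and macOS) so the result does not depend on the platform, and the variable names are only illustrative. It also shows one way a TypeError can arise from the os.pathsep version of the expression; I have not verified that this is the same TypeError mentioned above.

```python
import posixpath

location = "/foobar/foobar/foobar.html"

# os.pathsep (':' on POSIX) separates entries in PATH-like lists,
# while os.path.sep ('/' on POSIX) separates path components.
parts_by_pathsep = location.split(":")[0:-1]  # empty list: no ':' in the path

error = None
try:
    # join() requires at least one argument, so unpacking an empty list fails.
    posixpath.join(*parts_by_pathsep)
except TypeError as exc:
    error = exc

# Splitting on '/' works, but the leading empty component is dropped
# by join(), so the result is no longer an absolute path.
joined = posixpath.join(*location.split("/")[0:-1])

# dirname() keeps the leading slash and needs no splitting at all.
directory = posixpath.dirname(location)

print(type(error).__name__)
print(joined)
print(directory)
```

If directory is restored this way, it would also need to be set as an attribute named in _metadata, like name and file.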
Edit. A slightly enhanced version, since the implementation above was rather poor.
    import pandas as pd

    class thing:
        _metadata = ['name', 'file']

        def __init__(self, item_location, name):
            self.name = name
            self.file = item_location

    class custom_dataframe(thing, pd.DataFrame):
        def __init__(self, *args, **kwargs):
            item_location = kwargs.pop('item_location', None)
            name = kwargs.pop('name', None)
            thing.__init__(self, item_location, name)
            pd.DataFrame.__init__(self, *args, **kwargs)

        @property
        def _constructor(self):
            return custom_dataframe

    if __name__ == '__main__':
        cd = custom_dataframe(
            {1: [1, 2, 3], 2: [1, 2, 3]},
            item_location='/foobar/foobar/foobar.html',
            name='name')
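A short usage sketch of this version follows, repeating the two classes so it runs on its own. It assumes a pandas release in which copy() builds its result through _constructor and then calls __finalize__, which is how names listed in _metadata are meant to carry over; the variable copied is only illustrative.

```python
import pandas as pd


class thing:
    _metadata = ['name', 'file']

    def __init__(self, item_location, name):
        self.name = name
        self.file = item_location


class custom_dataframe(thing, pd.DataFrame):
    def __init__(self, *args, **kwargs):
        # Pull out the custom arguments before pandas sees them.
        item_location = kwargs.pop('item_location', None)
        name = kwargs.pop('name', None)
        thing.__init__(self, item_location, name)
        pd.DataFrame.__init__(self, *args, **kwargs)

    @property
    def _constructor(self):
        return custom_dataframe


cd = custom_dataframe(
    {1: [1, 2, 3], 2: [1, 2, 3]},
    item_location='/foobar/foobar/foobar.html',
    name='name')

# copy() goes through _constructor, so the subclass is preserved and
# the attributes named in _metadata are copied onto the result.
copied = cd.copy()
print(type(copied).__name__, copied.name, copied.file)
```

Because _constructor is called without item_location or name, a result starts with both set to None and only receives the real values if the operation calls __finalize__, so it may be worth checking the attributes after any operation you rely on.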