Given a list of DataFrames, I want to merge them iteratively and return a single DataFrame. The inputs are frames (a list of pandas DataFrames) and on_columns (a string or list of strings containing the column names to merge on). How can I use df.merge to achieve this?

    def merge(frames, on_columns):
        """Given a list of dataframes, iteratively merge them and return a single dataframe.

        HINT: Use slice on frames when iterating and merging.

        Arguments:
            frames {list} -- a list of pandas DataFrames
            on_columns {string or list} -- a string or list of strings
                containing the column names on which to join

        Returns:
            df -- a pandas.DataFrame containing a merged version of the
                provided dataframes. If frames is None or an empty list, return None
        """
        # implementation here
        df = ...  # merged df
        return df

Edit: I think maybe I could use df.concat, but I'm not sure how?

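Something like this should work:

    def merge(frames, on_columns):
        # Return None when frames is None or an empty list
        if not frames:
            return None
        if len(frames) == 1:
            return frames[0]
        # Start from the first frame and merge each remaining frame into it
        out = frames[0]
        for df in frames[1:]:
            out = out.merge(df, on=on_columns)
        return out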
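
The loop above is a left fold over the list, so it can also be sketched with functools.reduce. This is only an illustration, not a replacement for the answer's code: the name merge_all and the sample frames are hypothetical, and the merge uses pandas' default inner join, so only keys present in every frame survive.

```python
from functools import reduce

import pandas as pd


def merge_all(frames, on_columns):
    """Fold a list of DataFrames into one with successive inner merges."""
    if not frames:  # covers both None and an empty list
        return None
    # reduce merges pairwise from the left: (frames[0] + frames[1]) + frames[2] ...
    # With a single frame, reduce returns that frame unchanged.
    return reduce(lambda left, right: left.merge(right, on=on_columns), frames)


# Hypothetical sample data sharing the key column "id"
frames = [
    pd.DataFrame({"id": [1, 2, 3], "a": [10, 20, 30]}),
    pd.DataFrame({"id": [1, 2, 3], "b": [0.1, 0.2, 0.3]}),
    pd.DataFrame({"id": [2, 3, 4], "c": ["x", "y", "z"]}),
]

merged = merge_all(frames, "id")
print(merged)
```

Because the third frame has no row with id 1, that key drops out of the result. Passing how="outer" to merge would keep it, at the cost of missing values in column c.
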
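
    import pandas as pd

    # dfs is an iterator of pandas.DataFrame objects with the same columns
    df = next(dfs)
    for records in dfs:
        # note: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0
        df = df.append(records)

    # the loop above is equivalent to this single call; use one or the other,
    # because an iterator can only be consumed once
    df = pd.concat(dfs)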
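
For a version that runs on current pandas, here is a minimal sketch of the pd.concat call fed from a generator. The generator make_frames and its data are made up for illustration. Note that pd.concat stacks rows of frames with matching columns; it does not join on key columns the way merge does, so it solves a different problem from the one in the question.

```python
import pandas as pd


def make_frames():
    """Hypothetical generator yielding DataFrames with identical columns."""
    for start in (0, 2, 4):
        yield pd.DataFrame(
            {"id": [start, start + 1], "value": [start * 10, (start + 1) * 10]}
        )


# pd.concat accepts any iterable of DataFrames, including a generator.
# ignore_index=True renumbers the rows 0..n-1 instead of repeating 0, 1.
stacked = pd.concat(make_frames(), ignore_index=True)
print(stacked)
```
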
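
Notes:
- dfs is an iterator of pandas.DataFrame objects
- the frames in dfs all have the same columns
- it is not obvious that pd.concat expects an iterable; in any case, the pd.concat utility does the reducing for you

See pd.concat and other utilities: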
http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.merge.html#pandas.merge
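http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html#pandas.concat

P.S. Don't re-create functionality that the library already provides. Be willing to read the docs and re-read them, especially because the pandas documentation is voluminous.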