How do I merge two DataFrames on a specific column in pandas (Python)?

Question (votes: 0, answers: 4)

I have to merge two DataFrames:

df1

company,standard
tata,A1
cts,A2
dell,A3

df2

company,return
tata,71
dell,78
cts,27
hcl,23

I need to combine the two DataFrames into a single one, with output like this:

company,standard,return
tata,A1,71
cts,A2,27
dell,A3,78
python pandas python-2.7 merge
4 Answers
173
votes

Use merge:

print (pd.merge(df1, df2, on='company'))

Sample:

print (df1)
  company standard
0    tata       A1
1     cts       A2
2    dell       A3

print (df2)
  company  return
0    tata      71
1    dell      78
2     cts      27
3     hcl      23

print (pd.merge(df1, df2, on='company'))
  company standard  return
0    tata       A1      71
1     cts       A2      27
2    dell       A3      78
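
Note that hcl, which appears only in df2, is missing from the result: merge performs an inner join by default. The following is a minimal sketch, rebuilding the question's data by hand, of how the how parameter changes which rows are kept:

```python
import pandas as pd

# The two frames from the question, built in memory.
df1 = pd.DataFrame({"company": ["tata", "cts", "dell"],
                    "standard": ["A1", "A2", "A3"]})
df2 = pd.DataFrame({"company": ["tata", "dell", "cts", "hcl"],
                    "return": [71, 78, 27, 23]})

# Default how="inner": only companies present in both frames are kept.
inner = pd.merge(df1, df2, on="company")

# how="outer": every company from either frame is kept;
# columns with no matching row are filled with NaN.
outer = pd.merge(df1, df2, on="company", how="outer")

print(inner)
print(outer)
```

With how="left" the result would instead keep exactly the rows of df1, which matches the output requested in the question as well.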

5
votes

I think we can also use:

df1.merge(df2, on='company')
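
Column labels are case-sensitive, so the key passed to on must match the header exactly. A short sketch, again rebuilding the question's data, showing that the method form gives the same result as pd.merge and that a wrongly cased key raises KeyError:

```python
import pandas as pd

df1 = pd.DataFrame({"company": ["tata", "cts", "dell"],
                    "standard": ["A1", "A2", "A3"]})
df2 = pd.DataFrame({"company": ["tata", "dell", "cts", "hcl"],
                    "return": [71, 78, 27, 23]})

# DataFrame.merge is the method form of pd.merge.
merged = df1.merge(df2, on="company")
same_as_function = merged.equals(pd.merge(df1, df2, on="company"))

# The frames have a 'company' column, not 'Company', so this lookup fails.
error = None
try:
    df1.merge(df2, on="Company")
except KeyError as exc:
    error = exc
    print("KeyError:", exc)
```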

0
votes

The following solution works very well for me on an MSA (microservices) architecture.

Required packages

pip install mysql-connector-python SQLAlchemy pandas

Solution

import pandas as pd
from main.db.sql_alchemy import (
    engine_siaw,
    engine_sip,
)

altas_periodo_id = 10
df_altas = pd.read_sql(
    SQL_ALTAS_EST.format(altas_periodo_id=altas_periodo_id),
    con=engine_sip,
)
df_tiendas = pd.read_sql(SQL_ALTAS_TIENDAS, con=engine_siaw)
df_empresas = pd.read_sql(SQL_ALTAS_EMPRESAS, con=engine_siaw)

df_altas_tiendas = df_altas.merge(
    df_tiendas, left_on="tienda", right_on="nombre"
)
df_file = df_altas_tiendas.merge(
    df_empresas, left_on="empresa", right_on="razon_social"
)
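
The left_on / right_on pattern used above can be tried without a database. This is only a sketch: the rows below are invented stand-ins for the query results, and only the column names are taken from the answer.

```python
import pandas as pd

# Hypothetical stand-ins for the three query results (values are made up).
df_altas = pd.DataFrame({"numero_doc": ["001", "002"],
                         "tienda": ["Centro", "Norte"],
                         "empresa": ["ACME SA", "ACME SA"]})
df_tiendas = pd.DataFrame({"codigo_establecimiento": ["T1", "T2"],
                           "nombre": ["Centro", "Norte"]})
df_empresas = pd.DataFrame({"razon_social": ["ACME SA"], "ruc": ["123"]})

# Chain two merges; the key columns have different names on each side,
# so left_on / right_on are used instead of on.
df_file = (
    df_altas
    .merge(df_tiendas, left_on="tienda", right_on="nombre")
    .merge(df_empresas, left_on="empresa", right_on="razon_social")
)
print(df_file)
```

Because the key names differ, both key columns (for example tienda and nombre) are kept in the result; one of each pair can be dropped afterwards if it is redundant.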

The SQL variables containing the raw SQL:

SQL_ALTAS_EST = """
select
    td.codigo as tipo_doc,
    ad.numero_documento as numero_doc,
    cna.codigo as nacionalidad,
    ad.tienda,
    ad.empresa
from sip_registroaltadetalle as ad
    inner join sip_personal sp on sp.codigo = ad.codigo
    left join sip_tipodocumentoidentidad as td on td.pkid = ad.tipo_documento
    left join sip_multitabladetalle as cna on cna.pkid = sp.nacionalidad_id
where ad.codigo is not null and ad.parent_id = '{altas_periodo_id}'
""".strip()

SQL_ALTAS_TIENDAS = """
select codigo as codigo_establecimiento, nombre from catalogo_tienda ct
""".strip()

SQL_ALTAS_EMPRESAS = """
select razon_social, ruc from catalogo_empresa ce
""".strip()

Database connection:

from django.conf import settings
from sqlalchemy import create_engine

cn_sip = f"mysql+mysqlconnector://{settings.DB_USER}:{settings.DB_PASSWORD}@{settings.DB_HOST}:{settings.DB_PORT}/{settings.DB_NAME}"  # noqa
cn_siaw = f"mysql+mysqlconnector://{settings.SIAW_DB_USER}:{settings.SIAW_DB_PASSWORD}@{settings.SIAW_DB_HOST}:{settings.SIAW_DB_PORT}/{settings.SIAW_DB_NAME}"  # noqa

engine_siaw = create_engine(cn_siaw)
engine_sip = create_engine(cn_sip)

I really hope you find it helpful.


-4
votes

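To merge two DataFrames on a common column successfully, the common column must have the same data type in both DataFrames! A column's dtype can be changed like this: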

df['commonCol'] = df['commonCol'].astype(int)
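
As an illustration of that point, here is a small sketch with hypothetical frames (the names left, right, id, name and score are invented). In recent pandas versions, merging a string key against an integer key typically raises ValueError, and casting the key first lets the merge go through:

```python
import pandas as pd

# Hypothetical data: the same ids stored as strings on one side, integers on the other.
left = pd.DataFrame({"id": ["1", "2", "3"], "name": ["a", "b", "c"]})
right = pd.DataFrame({"id": [1, 2, 3], "score": [10, 20, 30]})

try:
    pd.merge(left, right, on="id")
except ValueError as exc:
    # pandas refuses to merge string keys against integer keys.
    print("merge failed:", exc)

# Cast the string key to int so both key columns are integers.
left["id"] = left["id"].astype(int)
merged = pd.merge(left, right, on="id")
print(merged)
```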