I have to merge two DataFrames:
df1
company,standard
tata,A1
cts,A2
dell,A3
df2
company,return
tata,71
dell,78
cts,27
hcl,23
I need to combine both DataFrames into a single DataFrame. I need output like this:
company,standard,return
tata,A1,71
cts,A2,27
dell,A3,78
Use merge:
print(pd.merge(df1, df2, on='company'))
Sample:
print(df1)
  company standard
0    tata       A1
1     cts       A2
2    dell       A3
print(df2)
  company  return
0    tata      71
1    dell      78
2     cts      27
3     hcl      23
print(pd.merge(df1, df2, on='company'))
  company standard  return
0    tata       A1      71
1     cts       A2      27
2    dell       A3      78
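For anyone who wants to reproduce this, here is a minimal self-contained sketch that builds the two frames from the question and merges them. It relies on the default `how="inner"`, which is why `hcl` (present only in `df2`) does not appear in the result.

```python
import pandas as pd

# Rebuild the two frames from the question.
df1 = pd.DataFrame(
    {"company": ["tata", "cts", "dell"], "standard": ["A1", "A2", "A3"]}
)
df2 = pd.DataFrame(
    {"company": ["tata", "dell", "cts", "hcl"], "return": [71, 78, 27, 23]}
)

# how="inner" is the default: only companies present in both frames are kept.
merged = pd.merge(df1, df2, on="company")
print(merged)
```

If the rows that exist only in `df2` should be kept as well, passing `how="outer"` (or `how="right"`) would do that, with missing values in `standard`.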
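I think we can also use the DataFrame method. Note that column labels are case-sensitive, so the key is 'company', not 'Company':
df1.merge(df2, on='company')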
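A quick sketch to check both points: the method form gives the same result as `pd.merge`, and the key has to match the column label exactly, including case. The frames are the ones from the question.

```python
import pandas as pd

df1 = pd.DataFrame(
    {"company": ["tata", "cts", "dell"], "standard": ["A1", "A2", "A3"]}
)
df2 = pd.DataFrame(
    {"company": ["tata", "dell", "cts", "hcl"], "return": [71, 78, 27, 23]}
)

# The method form and the function form produce the same frame.
via_method = df1.merge(df2, on="company")
via_function = pd.merge(df1, df2, on="company")
same_result = via_method.equals(via_function)
print(same_result)

# Column labels are case-sensitive: there is no "Company" column here.
wrong_case_failed = False
try:
    df1.merge(df2, on="Company")
except KeyError as exc:
    wrong_case_failed = True
    print("KeyError:", exc)
```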
The following solution works well for me in an MSA (microservices architecture) setup.
pip install mysql-connector-python SQLAlchemy pandas
import pandas as pd
from main.db.sql_alchemy import (
engine_siaw,
engine_sip,
)
altas_periodo_id = 10
df_altas = pd.read_sql(
SQL_ALTAS_EST.format(altas_periodo_id=altas_periodo_id),
con=engine_sip,
)
df_tiendas = pd.read_sql(SQL_ALTAS_TIENDAS, con=engine_siaw)
df_empresas = pd.read_sql(SQL_ALTAS_EMPRESAS, con=engine_siaw)
df_altas_tiendas = df_altas.merge(
df_tiendas, left_on="tienda", right_on="nombre"
)
df_file = df_altas_tiendas.merge(
df_empresas, left_on="empresa", right_on="razon_social"
)
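To try the chained `left_on` / `right_on` merges without a database, here is a small sketch with in-memory frames. The rows are made up for illustration (they are not real data), and only the column names follow the queries in this answer. When the key columns have different names, both are kept in the result, so the sketch drops the redundant ones at the end.

```python
import pandas as pd

# Hypothetical stand-ins for the three query results.
df_altas = pd.DataFrame(
    {
        "numero_doc": ["001", "002"],
        "tienda": ["Centro", "Norte"],
        "empresa": ["ACME SA", "ACME SA"],
    }
)
df_tiendas = pd.DataFrame(
    {"codigo_establecimiento": ["T01", "T02"], "nombre": ["Centro", "Norte"]}
)
df_empresas = pd.DataFrame({"razon_social": ["ACME SA"], "ruc": ["20123456789"]})

df_file = (
    df_altas.merge(df_tiendas, left_on="tienda", right_on="nombre")
    .merge(df_empresas, left_on="empresa", right_on="razon_social")
    # Both key columns survive a left_on/right_on merge; drop the duplicates.
    .drop(columns=["nombre", "razon_social"])
)
print(df_file)
```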
The SQL variables containing the raw SQL:
SQL_ALTAS_EST = """
select
td.codigo as tipo_doc,
ad.numero_documento as numero_doc,
cna.codigo as nacionalidad,
ad.tienda,
ad.empresa
from sip_registroaltadetalle as ad
inner join sip_personal sp on sp.codigo = ad.codigo
left join sip_tipodocumentoidentidad as td on td.pkid = ad.tipo_documento
left join sip_multitabladetalle as cna on cna.pkid = sp.nacionalidad_id
where ad.codigo is not null and ad.parent_id = '{altas_periodo_id}'
""".strip()
SQL_ALTAS_TIENDAS = """
select codigo as codigo_establecimiento, nombre from catalogo_tienda ct
""".strip()
SQL_ALTAS_EMPRESAS = """
select razon_social, ruc from catalogo_empresa ce
""".strip()
Database connection:
from django.conf import settings
from sqlalchemy import create_engine
cn_sip = f"mysql+mysqlconnector://{settings.DB_USER}:{settings.DB_PASSWORD}@{settings.DB_HOST}:{settings.DB_PORT}/{settings.DB_NAME}" # noqa
cn_siaw = f"mysql+mysqlconnector://{settings.SIAW_DB_USER}:{settings.SIAW_DB_PASSWORD}@{settings.SIAW_DB_HOST}:{settings.SIAW_DB_PORT}/{settings.SIAW_DB_NAME}" # noqa
engine_siaw = create_engine(cn_siaw)
engine_sip = create_engine(cn_sip)
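One caution about the `SQL_ALTAS_EST.format(...)` call: formatting values straight into SQL text is only safe when the value is fully trusted, as with the hard-coded integer here. A common alternative is to let the driver bind the value through the `params` argument of `pd.read_sql`. The sketch below uses an in-memory SQLite database and a made-up `altas` table purely so that it runs anywhere. The `?` placeholder is SQLite's style, so the placeholder syntax for MySQL or for a SQLAlchemy engine would differ; treat this as an illustration rather than a drop-in change.

```python
import sqlite3

import pandas as pd

# Hypothetical table, only to make the example runnable without MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("create table altas (codigo text, parent_id integer)")
conn.executemany(
    "insert into altas values (?, ?)",
    [("a", 10), ("b", 10), ("c", 11)],
)

altas_periodo_id = 10
# The value is passed separately from the SQL text and bound by the driver.
sql = "select codigo from altas where parent_id = ?"
df_altas = pd.read_sql(sql, con=conn, params=(altas_periodo_id,))
print(df_altas)
conn.close()
```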
I hope you find it helpful.
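To merge two DataFrames on a common column, that column should have the same dtype in both DataFrames. If it does not (for example, strings on one side and integers on the other), the merge may raise an error or fail to match any rows. A column's dtype can be changed like this: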
df['commonCol'] = df['commonCol'].astype(int)
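Here is a small sketch of that situation with made-up data: the key is stored as strings in one frame and as integers in the other. In recent pandas versions the first merge should raise a `ValueError`; the sketch catches it, converts the key with `astype(int)`, and merges again.

```python
import pandas as pd

# Hypothetical frames: the key is text on the left, integer on the right.
left = pd.DataFrame({"commonCol": ["1", "2", "3"], "name": ["a", "b", "c"]})
right = pd.DataFrame({"commonCol": [1, 2, 4], "score": [10, 20, 40]})

try:
    left.merge(right, on="commonCol")
except ValueError as exc:
    print("merge failed:", exc)

# Align the dtypes, then merge again.
left["commonCol"] = left["commonCol"].astype(int)
merged = left.merge(right, on="commonCol")
print(merged)
```

Checking `left.dtypes` and `right.dtypes` before merging is a quick way to spot this kind of mismatch.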