我有一个列表和一个数据框。
import pandas as pd
work_station = ['A','B','C']
name = ['Mike','Tom','Scott','Tracy']
salary = ['60000','50000','100000','90000']
df = pd.DataFrame({'name':name,'salary':salary})
我想将work_station和df交叉连接在一起,因此输出如下所示:
station Name salary
A Mike 60000
A Tom 50000
A Scott 100000
A Tracy 90000
B Mike 60000
B Tom 50000
B Scott 100000
B Tracy 90000
C Mike 60000
C Tom 50000
C Scott 100000
C Tracy 90000
我尝试使用*功能
df1 = work_station * salary
但是它不能正常工作
TypeError: can't multiply sequence by non-int of type 'list'
有任何建议吗?谢谢!
轻松自如,将concat
与keys
参数一起使用:
(pd.concat([df] * len(work_station), keys=work_station)
.reset_index(level=1, drop=True)
.rename_axis('station')
.reset_index()
)
station name salary
0 A Mike 60000
1 A Tom 50000
2 A Scott 100000
3 A Tracy 90000
4 B Mike 60000
5 B Tom 50000
6 B Scott 100000
7 B Tracy 90000
8 C Mike 60000
9 C Tom 50000
10 C Scott 100000
11 C Tracy 90000
您还可以使用笛卡尔积乘merge
路线:
(pd.DataFrame(work_station, columns=['station'])
.assign(foo=1)
.merge(df.assign(foo=1))
.drop('foo', 1)
)
station name salary
0 A Mike 60000
1 A Tom 50000
2 A Scott 100000
3 A Tracy 90000
4 B Mike 60000
5 B Tom 50000
6 B Scott 100000
7 B Tracy 90000
8 C Mike 60000
9 C Tom 50000
10 C Scott 100000
11 C Tracy 90000