No regression line appears when I use seaborn's regplot and lmplot with imported CSV data

Question

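I want to build a linear regression model of rent in Tokyo over several years. So far I have managed to draw a scatter plot with seaborn. However, when I try to fit the linear regression, I get the error UFuncTypeError: ufunc 'multiply' did not contain a loop with signature matching types (dtype(' dtype('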

My code

import pandas as pd
import seaborn as sns

df = pd.read_csv('Tokyo rent (One Bedroom apartment in the city centre).csv', thousands=',')
g = sns.lmplot(x='Date 2.0', y='Rent(Local Currency)', data=df)
g.figure.autofmt_xdate()

My CSV file

Year    Rent(USD)   Date    Rent(Local Currency)    Date 2.0
2017    1223.4  17/03/17    129,594.59              2017-03-17
2017    1070.24 28/10/17    121,656.25              2017-10-28
2018    1104.51 23/01/18    121,689.66              2018-01-23
2018    1030.61 22/10/18    116,270.83              2018-10-22
2019    1124.33 14/06/19    122,062.50              2019-06-14
2019    1129.6  20/06/19    121,255.32              2019-06-20
2019    1129.9  21/06/19    121,255.32              2019-06-21
2020    1198.53 23/03/20    128,701.75              2020-03-23
2020    1183.66 01/07/20    127,195.65              2020-07-01
2020    1213.38 17/09/20    127,466.67              2020-09-17
2020    1168.37 05/10/20    123,578.95              2020-10-05
2020    1192.5  11/11/20    125,525.00              2020-11-11
2020    1228.34 02/12/20    128,312.50              2020-12-02
2021    1220    06/03/21    132,200.00              2021-03-06
2021    1342.84 29/08/21    147,524.40              2021-08-29
2021    1284.65 14/10/21    145,696.54              2021-10-14

My scatter plot result (sorry, the dates are garbled)

python jupyter-notebook seaborn linear-regression scatter-plot
1 Answer

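In this case the x-axis is a time series. read_csv loads the 'Date 2.0' column as strings, and the regression needs numeric x values, so the regression line appears once the column is converted to matplotlib's date-number format. You can then switch the x-axis display back to dates.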
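A quick check along the following lines can confirm the cause before applying the fix. This is a minimal sketch that assumes a two-row sample copied from the question's CSV stands in for the real file; the variable names are illustrative only.

```python
import io
import pandas as pd

# Two rows copied from the question's CSV, used as a stand-in for the real file.
sample = '''"Rent(Local Currency)" "Date 2.0"
129,594.59 2017-03-17
121,656.25 2017-10-28
'''
df = pd.read_csv(io.StringIO(sample), sep=r'\s+', thousands=',')

# Without parse_dates, the date column is loaded as plain strings,
# while thousands=',' lets the rent column parse as floats.
date_is_numeric_before = pd.api.types.is_numeric_dtype(df['Date 2.0'])
rent_is_numeric = pd.api.types.is_numeric_dtype(df['Rent(Local Currency)'])

# After pd.to_datetime the column holds real timestamps.
df['Date 2.0'] = pd.to_datetime(df['Date 2.0'])
date_is_datetime_after = pd.api.types.is_datetime64_any_dtype(df['Date 2.0'])

print(date_is_numeric_before, rent_is_numeric, date_is_datetime_after)
```

If the first flag is false on your own DataFrame, the x column is not numeric and needs converting before seaborn can fit a line to it.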

import pandas as pd
import numpy as np
import io
import matplotlib.dates as mdates

data = '''
Year    Rent(USD)   Date    "Rent(Local Currency)"    "Date 2.0"
2017    1223.4  17/03/17    129,594.59              2017-03-17
2017    1070.24 28/10/17    121,656.25              2017-10-28
2018    1104.51 23/01/18    121,689.66              2018-01-23
2018    1030.61 22/10/18    116,270.83              2018-10-22
2019    1124.33 14/06/19    122,062.50              2019-06-14
2019    1129.6  20/06/19    121,255.32              2019-06-20
2019    1129.9  21/06/19    121,255.32              2019-06-21
2020    1198.53 23/03/20    128,701.75              2020-03-23
2020    1183.66 01/07/20    127,195.65              2020-07-01
2020    1213.38 17/09/20    127,466.67              2020-09-17
2020    1168.37 05/10/20    123,578.95              2020-10-05
2020    1192.5  11/11/20    125,525.00              2020-11-11
2020    1228.34 02/12/20    128,312.50              2020-12-02
2021    1220    06/03/21    132,200.00              2021-03-06
2021    1342.84 29/08/21    147,524.40              2021-08-29
2021    1284.65 14/10/21    145,696.54              2021-10-14
'''

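# sep=r'\s+' splits on runs of whitespace; thousands=',' parses values like 129,594.59
df = pd.read_csv(io.StringIO(data), sep=r'\s+', thousands=',')

# Parse the date strings, then convert them to matplotlib date numbers (floats)
# so that the regression has a numeric x variable
df['Date 2.0'] = pd.to_datetime(df['Date 2.0'])
df['Date 2.0'] = mdates.date2num(df['Date 2.0'])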

import seaborn as sns

g = sns.lmplot(x='Date 2.0', y='Rent(Local Currency)', data=df)

locator = mdates.AutoDateLocator()
formatter = mdates.ConciseDateFormatter(locator)

g.ax.xaxis.set_major_locator(locator)
g.ax.xaxis.set_major_formatter(formatter)

# plt.show() is needed when running as a plain Python script rather than in Jupyter
import matplotlib.pyplot as plt
plt.show()
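To see what the date-number conversion does, the sketch below converts three dates from the question's data, converts them back, and fits a straight line to the numeric values. Here numpy.polyfit is only a stand-in to show that arithmetic on the converted column works; it is not necessarily how seaborn computes its line, and the three-row subset is chosen purely for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.dates as mdates

# Three (date, rent) pairs taken from the question's CSV.
dates = pd.to_datetime(pd.Series(['2017-03-17', '2017-10-28', '2018-01-23']))
rent = np.array([129594.59, 121656.25, 121689.66])

# date2num returns floats counting days from matplotlib's epoch.
nums = mdates.date2num(dates)
days_between_first_two = nums[1] - nums[0]

# num2date reverses the conversion, which is why a date formatter
# can label an axis that holds these floats.
round_trip = [d.date().isoformat() for d in mdates.num2date(nums)]

# With a numeric x, an ordinary least-squares line can be fitted.
slope, intercept = np.polyfit(nums, rent, 1)

print(days_between_first_two)
print(round_trip)
print(slope, intercept)
```

Because the x values are measured in days, the slope is the change in rent per day; multiplying it by 365 gives an approximate yearly change.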
