Importing a CSV and reshaping a variable's array for logistic regression


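I hope everyone is staying safe during the COVID-19 pandemic. I am new to Python and have a quick question about importing data from a CSV into Python for a simple logistic regression, where the dependent variable is binary and the independent variable is continuous.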

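I imported a CSV file and want to use one variable (Active) as the independent variable and another (Smoke) as the response variable. I can load the CSV file into Python, but every time I try to fit a logistic regression model that predicts Smoke from Active, I get an error saying that Active must be reshaped into a single column (two-dimensional) because it is currently one-dimensional.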

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
data = pd.read_csv('Pulse.csv') # Read the data from the CSV file
x = data['Active'] # Load the values from Active into the independent variable
x = np.array.reshape(-1,1)
y = data['Smoke'] # The dependent variable is set as Smoke

I keep getting the following error message:

ValueError: Expected 2D array, got 1D array instead:
array=[ 97.  82.  88. 106.  78. 109.  66.  68. 100.  70.  98. 140. 105.  84.
 134. 117. 100. 108.  76.  86. 110.  65.  85.  80.  87. 133. 125.  61.
 117.  90. 110.  68. 102.  67. 112.  86.  85.  66.  73.  85. 110.  97.
  93.  86.  80.  96.  74. 124.  78.  93.  80.  80.  92.  69.  82.  88.
  74.  74.  75. 120. 105. 104.  99. 113.  67. 125. 133.  98.  80.  91.
  76.  78.  94. 150.  92.  96.  68.  82. 102.  69.  65.  84.  86.  84.
 116.  88.  65. 101.  89. 128.  68.  90.  80.  80.  98.  90.  82.  97.
  90.  98.  88.  94.  92.  96.  80.  66. 110.  87.  88.  94.  96.  89.
  74. 111.  81.  98.  99.  65.  95. 127.  76. 102.  88. 125.  72.  76.
 112.  69. 101.  72. 112.  81.  90.  96.  66. 114.  71.  75. 102. 138.
  85.  80. 107. 119.  98.  95.  95.  76.  96. 102.  82.  99.  80.  83.
 102. 102. 106.  79.  80.  79. 110. 144.  80.  97.  60.  80. 108. 107.
  51.  68.  80.  80.  60.  64.  87. 110. 110.  82. 154. 139.  86.  95.
 112. 120.  79.  64.  84.  65.  60.  79.  79.  70.  75. 107.  78.  74.
  80. 121. 120.  96.  75. 106.  88.  91.  98.  63.  95.  85.  83.  92.
  81.  89. 103. 110.  78. 122. 122.  71.  65.  92.  93.  88.  90.  56.
  95.  83.  97. 105.  82. 102.  87.  81.].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
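For reference, the two reshape forms named in the message differ only in which axis gets the values. The following is a minimal sketch using the first four numbers from the array above; the variable names are made up for illustration.

```python
import numpy as np

# First four values from the array printed in the error message
values = np.array([97., 82., 88., 106.])
print(values.shape)       # (4,)   one-dimensional

# Single feature: one column, one row per sample
as_feature = values.reshape(-1, 1)
print(as_feature.shape)   # (4, 1)

# Single sample: one row, one column per feature
as_sample = values.reshape(1, -1)
print(as_sample.shape)    # (1, 4)
```

With one predictor measured on many people, the first form, reshape(-1, 1), is the one that applies.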

python numpy statistics regression reshape
1 Answer

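Try this. data['Active'] is a one-dimensional pandas Series, while scikit-learn expects the features as a 2D array of shape (n_samples, n_features), so take the underlying NumPy array with .values and reshape it into a single column before fitting: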

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix

data = pd.read_csv('Pulse.csv') # Read the data from the CSV file
x = data['Active'] # Load the values from Active into the independent variable
y = data['Smoke'] # The dependent variable is set as Smoke

lr = LogisticRegression().fit(x.values.reshape(-1,1), y)
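
As a side note, np.array.reshape(-1,1) in the question calls reshape on the np.array function itself rather than on the data, so it raises an AttributeError before any model is fitted; np.array(x).reshape(-1, 1) is presumably what was intended. The sketch below shows three equivalent ways to build the 2D feature matrix. Since Pulse.csv is not available here, it uses a small made-up DataFrame as a stand-in; only the column names Active and Smoke come from the question, and the values are random.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical stand-in for Pulse.csv: random values, not the real data
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "Active": rng.normal(90, 15, size=40).round(),
    "Smoke": np.tile([0, 1], 20),
})

y = data["Smoke"]  # the target may stay one-dimensional

# Three equivalent ways to get features of shape (n_samples, 1)
X_a = data["Active"].values.reshape(-1, 1)
X_b = data["Active"].to_numpy().reshape(-1, 1)
X_c = data[["Active"]]  # double brackets select a one-column DataFrame, already 2D

model = LogisticRegression().fit(X_c, y)
probs = model.predict_proba(X_c)  # one row per sample, one column per class
print(X_a.shape, X_c.shape, probs.shape)
```

The double-bracket form avoids the reshape entirely, which can be easier to read when more predictors are added later.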