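I have a DataFrame df. I computed its correlation matrix and then picked out the four most highly correlated values, which I named relevant_features. I want to access the values of relevant_features (relevant_features is a Series object).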
correlation_matrix=df.corr() #taking correlation of the df
cor_target=abs(correlation_matrix['median_house_value']) #finding the correlation of all variables
#against median housing value
#Selecting 4 of the most correlated features
relevant_features = cor_target.sort_values(ascending=False).head(4)
relevant_features
Output:
median_house_value 1.000000
median_income 0.688075
income_cat 0.553377
latitude 0.144160
Name: median_house_value, dtype: float64
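This gives me the 4 features above, which have the highest correlation with median house value. Now I want to access the values 1.000, 0.688075, 0.553377 and so on, basically the first column.
I tried the following code: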
[IN] relevant_features[:,4]
[OUT] ValueError: Can only tuple-index with a MultiIndex
[IN] relevant_features.iloc[:,1]
[OUT] IndexingError: Too many indexers
[IN] relevant_features.loc[[0,1,2,3]]
[OUT] KeyError: "None of [Int64Index([0, 1, 2, 3], dtype='int64')] are in the [index]"
[IN] relevant_features[:,3]
[OUT] ValueError: Can only tuple-index with a MultiIndex
I have read many questions, answers and articles, but none of them helped.
[IN] type(relevant_features)
[OUT] pandas.core.series.Series
You are really close. You need:
relevant_features.iloc[:4]
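The attempts in the question fail because a Series is one-dimensional: it accepts a single indexer, so [:,4] and .iloc[:,1] pass one indexer too many, and .loc[[0,1,2,3]] looks for the labels 0..3 while the index here holds feature names. The following is only a minimal sketch of the usual ways in. It rebuilds the Series by hand from the output printed in the question instead of loading the original df, and the variable names first_four, values, labels, income_corr and second are made up for illustration.

```python
import pandas as pd

# Assumption: the Series is rebuilt from the numbers printed in the question,
# not computed from the original df.
relevant_features = pd.Series(
    [1.000000, 0.688075, 0.553377, 0.144160],
    index=["median_house_value", "median_income", "income_cat", "latitude"],
    name="median_house_value",
)

# Positional slice with a single indexer; the result is still a Series
# and keeps the feature names as its index.
first_four = relevant_features.iloc[:4]

# The bare numbers, without labels, as a NumPy array.
values = relevant_features.to_numpy()

# The feature names live in the index, not in a second column.
labels = relevant_features.index.tolist()

# Label-based access uses the index labels (feature names), not 0..3.
income_corr = relevant_features.loc["median_income"]

# Position-based access to one element.
second = relevant_features.iloc[1]

print(values)
print(labels)
```

If only the numbers are needed, values is probably the most direct form. If the names should travel with the numbers, keeping the Series and using .loc with a feature name is likely more convenient.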