I'm a complete pandas newbie, and I'm following a tutorial that is clearly outdated.
I have this simple script, and when I run it I get this error:
ValueError: Array conditional must be same shape as self
# Load the data module from the pandas_datareader package
import pandas as pd
from pandas_datareader import data
import matplotlib.pyplot as plt
# Adj Close:
# The closing price of the stock that adjusts the price of the stock for corporate actions.
# This price takes into account the stock splits and dividends.
# The adjusted close is the price we will use for this example.
# Indeed, since it takes into account splits and dividends, we will not need to adjust the price manually.
# First day
start_date = '2014-01-01'
# Last day
end_date = '2018-01-01'
# Call the DataReader function from the data module
goog_data = data.DataReader('GOOG', 'yahoo', start_date, end_date)
goog_data_signal = pd.DataFrame(index=goog_data.index)
goog_data_signal['price'] = goog_data['Adj Close']
goog_data_signal['daily_difference'] = goog_data_signal['price'].diff()
goog_data_signal['signal'] = 0.0
# this line produces the error
goog_data_signal['signal'] = pd.DataFrame.where(goog_data_signal['daily_difference'] > 0, 1.0, 0.0)
goog_data_signal['positions'] = goog_data_signal['signal'].diff()
print(goog_data_signal.head())
I'm trying to understand the theory, the libraries, and the methodology by practicing, so please forgive me if this is too obvious... :]
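The where method is meant to be called on a DataFrame or Series object. In pd.DataFrame.where(cond, 1.0, 0.0), the boolean Series is passed as self and 1.0 ends up as the condition, which appears to be why pandas complains about the shape. Here you only need to check a condition on a single Series, so I found two ways to solve this.
First, the where method does not support setting a value for the rows where the condition is true (1.0 in your case). It only supports setting a value for the rows where the condition is false, through the parameter called other in the documentation. So you can fill the false rows with 0.0 and set the 1.0 values manually afterwards:
goog_data_signal['signal'] = goog_data_signal['daily_difference'].where(goog_data_signal['daily_difference'] > 0, other=0.0)
# the true rows retain their values (the positive differences); set them to 1.0 as needed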
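The first approach can be sketched end to end as below. This is only a minimal illustration: the prices are made-up numbers standing in for goog_data['Adj Close'], and the names prices, diff and signal are hypothetical, not part of the original script.

```python
import pandas as pd

# Hypothetical prices standing in for goog_data['Adj Close']
prices = pd.Series([100.0, 98.0, 101.0, 105.0, 104.0])
diff = prices.diff()

# Rows where the condition is False (including the NaN first row) become 0.0
signal = diff.where(diff > 0, other=0.0)

# Rows where the condition is True still hold the raw difference,
# so overwrite them with 1.0 in a second step
signal[diff > 0] = 1.0

print(signal)
```

The two-step shape is the cost of using where here: it can only replace the rows that fail the condition, so the rows that pass need their own assignment.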
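Alternatively, you can convert the boolean condition directly to integers, which gives 1 for true rows and 0 for false rows in one step:
goog_data_signal['signal'] = (goog_data_signal['daily_difference'] > 0).astype(int)
The second approach gives me this output: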
price daily_difference signal positions
Date
2014-01-02 554.481689 NaN 0 NaN
2014-01-03 550.436829 -4.044861 0 0.0
2014-01-06 556.573853 6.137024 1 1.0
2014-01-07 567.303589 10.729736 1 0.0
2014-01-08 568.484192 1.180603 1 0.0
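For comparison, the three-argument form in the tutorial line (condition, value if true, value if false) matches the signature of numpy.where. That the tutorial intended numpy.where is my assumption, since the tutorial itself is not shown here. A minimal sketch with made-up prices instead of the Yahoo download (the name signals is hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical prices standing in for goog_data['Adj Close']
signals = pd.DataFrame({"price": [100.0, 98.0, 101.0, 105.0, 104.0]})
signals["daily_difference"] = signals["price"].diff()

# numpy.where takes (condition, value_if_true, value_if_false);
# the NaN in the first row compares as False and therefore maps to 0.0
signals["signal"] = np.where(signals["daily_difference"] > 0, 1.0, 0.0)

# 1.0 marks a switch from 0 to 1, -1.0 a switch from 1 to 0
signals["positions"] = signals["signal"].diff()

print(signals)
```

This keeps the signal column as floats, like the original script, whereas the astype(int) version produces integers.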