a
,尺寸为1762行×9列。在列中 ema
除了第13个元素,其他都是 NaN
. 该 ind
列包含相应行的索引。a.head(20)
>>>
date symbol open close low high volume ema ind
2010-01-04 YHOO 16.940001 17.100000 16.879999 17.200001 16587400.0 NaN 0
2010-01-05 YHOO 17.219999 17.230000 17.000000 17.230000 11718100.0 NaN 1
2010-01-06 YHOO 17.170000 17.170000 17.070000 17.299999 16422000.0 NaN 2
2010-01-07 YHOO 16.809999 16.700001 16.570000 16.900000 31816300.0 NaN 3
2010-01-08 YHOO 16.680000 16.700001 16.620001 16.760000 15470000.0 NaN 4
2010-01-11 YHOO 16.770000 16.740000 16.480000 16.830000 16181900.0 NaN 5
2010-01-12 YHOO 16.650000 16.680000 16.600000 16.860001 15672400.0 NaN 6
2010-01-13 YHOO 16.879999 16.900000 16.650000 16.980000 16955600.0 NaN 7
2010-01-14 YHOO 16.809999 17.120001 16.799999 17.230000 16715600.0 NaN 8
2010-01-15 YHOO 17.250000 16.820000 16.750000 17.250000 18415000.0 NaN 9
2010-01-19 YHOO 16.780001 16.750000 16.639999 16.959999 15182600.0 NaN 10
2010-01-20 YHOO 16.650000 16.379999 16.250000 16.680000 14419500.0 NaN 11
2010-01-21 YHOO 16.389999 16.200001 16.100000 16.580000 21858400.0 16.884166 12
2010-01-22 YHOO 16.080000 15.880000 15.810000 16.209999 25132800.0 NaN 13
2010-01-25 YHOO 16.070000 15.860000 15.740000 16.110001 19683700.0 NaN 14
2010-01-26 YHOO 15.820000 15.990000 15.700000 16.170000 43979400.0 NaN 15
2010-01-27 YHOO 16.459999 15.980000 15.770000 16.490000 41701000.0 NaN 16
2010-01-28 YHOO 15.930000 15.440000 15.440000 15.960000 30159500.0 NaN 17
2010-01-29 YHOO 15.510000 15.010000 14.900000 15.670000 39664600.0 NaN 18
2010-02-01 YHOO 15.140000 15.050000 14.870000 15.300000 29865700.0 NaN 19
ema
列,从第14行开始(即在 ind
13以后的列),我想把它们改为 0.84*(ema value in previous row) + 0.16*(value of 'open' in previous row)
通过使用以下方式 apply
函数。a['ema']=a.apply(lambda x: (a.loc[x['ind']-1,'open']*0.16 + a.loc[x['ind']-1, 'ema']*0.84) if x['ind']>12 else x['ema'] ,axis=1)
NaN
. a.head(20)
>>>
date symbol open close low high volume ema ind
2010-01-04 YHOO 16.940001 17.100000 16.879999 17.200001 16587400.0 NaN 0
2010-01-05 YHOO 17.219999 17.230000 17.000000 17.230000 11718100.0 NaN 1
2010-01-06 YHOO 17.170000 17.170000 17.070000 17.299999 16422000.0 NaN 2
2010-01-07 YHOO 16.809999 16.700001 16.570000 16.900000 31816300.0 NaN 3
2010-01-08 YHOO 16.680000 16.700001 16.620001 16.760000 15470000.0 NaN 4
2010-01-11 YHOO 16.770000 16.740000 16.480000 16.830000 16181900.0 NaN 5
2010-01-12 YHOO 16.650000 16.680000 16.600000 16.860001 15672400.0 NaN 6
2010-01-13 YHOO 16.879999 16.900000 16.650000 16.980000 16955600.0 NaN 7
2010-01-14 YHOO 16.809999 17.120001 16.799999 17.230000 16715600.0 NaN 8
2010-01-15 YHOO 17.250000 16.820000 16.750000 17.250000 18415000.0 NaN 9
2010-01-19 YHOO 16.780001 16.750000 16.639999 16.959999 15182600.0 NaN 10
2010-01-20 YHOO 16.650000 16.379999 16.250000 16.680000 14419500.0 NaN 11
2010-01-21 YHOO 16.389999 16.200001 16.100000 16.580000 21858400.0 16.884166 12
2010-01-22 YHOO 16.080000 15.880000 15.810000 16.209999 25132800.0 16.805099 13
2010-01-25 YHOO 16.070000 15.860000 15.740000 16.110001 19683700.0 NaN 14
2010-01-26 YHOO 15.820000 15.990000 15.700000 16.170000 43979400.0 NaN 15
2010-01-27 YHOO 16.459999 15.980000 15.770000 16.490000 41701000.0 NaN 16
2010-01-28 YHOO 15.930000 15.440000 15.440000 15.960000 30159500.0 NaN 17
2010-01-29 YHOO 15.510000 15.010000 14.900000 15.670000 39664600.0 NaN 18
2010-02-01 YHOO 15.140000 15.050000 14.870000 15.300000 29865700.0 NaN 19
ema
,一次一个,为后面的行。x['ind']>12 else x['ema']
无出其右 ind
12是要改变的。a.loc[x['ind']-1,'ema']
你在计算 ema
根据之前的数值 open
和 ema
.ema
所以只有下一行被填充。apply
import numpy as np
import pandas as pd
updated_ema = np.nan
def test(x):
global updated_ema
if x['ind'] > 12:
prev_ema = df.loc[x['ind']-1, 'ema']
prev_open = df.loc[x['ind']-1, 'open'] * 0.16
if not np.isnan(prev_ema):
updated_ema = prev_open + prev_ema * 0.84
else:
updated_ema = prev_open + updated_ema * 0.84
return updated_ema
else:
return x['ema']
df.ema = df.apply(lambda x: test(x), axis=1)
问题是 a.apply
完全是在计算新的列,只有在最后你才会分配结果。
这意味着所有的计算都将基于原来未修改的数据,这也解释了为什么只有一行被更新。
一个解决方法就是在行上循环,每次更新一行单元格(顺便说一句,这种方法没有理由会慢)。