Using groupby to subtract a specific row's value


I'm a Python beginner. I'm trying to get one year of S&P 500 price data with the code below.

import pandas as pd
import yfinance as yf
from pathlib import Path


# Read the S&P 500 constituents table from Wikipedia and save it
table = pd.read_html('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
df = table[0]
df.to_csv('S&P500-Info.csv')
df.to_csv("S&P500-Symbols.csv", columns=["Symbol"])

# Collect every CSV in SP500_update/
source_files = sorted(Path('SP500_update/').glob('*.csv'))

dataframes = []
for file in source_files:
    df = pd.read_csv(file)  # pass additional arguments as needed
    df['source'] = file.name  # remember which file each row came from
    dataframes.append(df)

# Note: the name `all` shadows Python's built-in all()
all = pd.concat(dataframes)
all = all.set_index("Date")
              Open          High      Low         Close     Adj Close    Volume     source
Date                            
2019-05-28  68.430000   68.860001   66.959999   67.080002   66.479050   2984700.0   A.csv
2019-05-29  66.589996   67.989998   66.589996   67.300003   66.697075   3722100.0   A.csv
2019-05-30  67.589996   67.900002   66.730003   66.889999   66.290756   2947900.0   A.csv
2019-05-31  66.239998   67.559998   66.070000   67.050003   66.449326   2829300.0   A.csv
2019-06-03  67.040001   68.099998   66.820000   66.989998   66.389854   2560600.0   A.csv
... ... ... ... ... ... ... ...
2020-05-19  131.050003  135.759995  130.080002  134.339996  134.339996  3335300.0   ZTS.csv
2020-05-20  136.199997  137.070007  133.039993  133.339996  133.339996  2303400.0   ZTS.csv
2020-05-21  133.789993  133.889999  129.899994  130.330002  130.330002  1413100.0   ZTS.csv
2020-05-22  129.600006  130.779999  128.880005  130.110001  130.110001  1602400.0   ZTS.csv
2020-05-26  131.419998  132.880005  130.160004  130.619995  130.619995  1760775.0   ZTS.csv

For example, let x be the ['Close'] value of ZTS.csv on 2019-05-31. The result I want looks like this:

             Open           High       Low        Close     Adj Close    Volume     source    diff
Date                            
2019-05-28  68.430000   68.860001   66.959999   67.080002   66.479050   2984700.0   A.csv    67.080002-67.050003
2019-05-29  66.589996   67.989998   66.589996   67.300003   66.697075   3722100.0   A.csv    67.300003-67.050003
2019-05-30  67.589996   67.900002   66.730003   66.889999   66.290756   2947900.0   A.csv    66.889999-67.050003
2019-05-31  66.239998   67.559998   66.070000   67.050003   66.449326   2829300.0   A.csv    67.050003-67.050003
2019-06-03  67.040001   68.099998   66.820000   66.989998   66.389854   2560600.0   A.csv    66.989998-67.050003
... ... ... ... ... ... ... ...
2020-05-19  131.050003  135.759995  130.080002  134.339996  134.339996  3335300.0   ZTS.csv  134.339996-x
2020-05-20  136.199997  137.070007  133.039993  133.339996  133.339996  2303400.0   ZTS.csv  133.339996-x
2020-05-21  133.789993  133.889999  129.899994  130.330002  130.330002  1413100.0   ZTS.csv  130.330002-x
2020-05-22  129.600006  130.779999  128.880005  130.110001  130.110001  1602400.0   ZTS.csv  130.110001-x
2020-05-26  131.419998  132.880005  130.160004  130.619995  130.619995  1760775.0   ZTS.csv  130.619995-x

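I want to find the value of all['Close'] on 2019-05-31 for each all['source'] group, then subtract that value from every all['Close'] in the same group to get a new column, all['diff']. I then want to find the rows where all['diff'] is less than 0, and print their dates together with the source for each date. Can anyone tell me how to get this result?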

python pandas csv group-by yahoo-finance
1 Answer

You could try something like this to filter your DataFrame:

all_filtered = all.query('diff < 0')

Then print the information you need.
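Note that the query only works once a diff column exists, and the question has not built one yet. Here is a minimal sketch of one way to do that, using a small made-up frame in place of the real data. The name `frame` (instead of `all`), the `ref_date` variable, and the toy prices are assumptions for illustration. It uses a per-source lookup with `Series.map` rather than an explicit groupby, and it assumes every source has exactly one row on the reference date.

```python
import pandas as pd

# Toy stand-in for the concatenated frame from the question (prices are made up)
frame = pd.DataFrame(
    {
        "Date": ["2019-05-30", "2019-05-31", "2019-06-03"] * 2,
        "Close": [66.0, 67.0, 66.5, 100.0, 99.0, 101.0],
        "source": ["A.csv"] * 3 + ["ZTS.csv"] * 3,
    }
).set_index("Date")

ref_date = "2019-05-31"  # dates are plain strings here, as read_csv returns them by default

# Close on the reference date, one value per source file
ref_close = frame.loc[[ref_date]].set_index("source")["Close"]

# Look up each row's reference Close by its source, then subtract
frame["diff"] = frame["Close"] - frame["source"].map(ref_close)

# Rows where Close is below the reference value (same rows as frame.query("diff < 0"))
below = frame[frame["diff"] < 0]
print(below[["source", "diff"]])
```

Since Date is the index, printing `below` shows each date next to its source. A source with no row on the reference date would get NaN in diff and drop out of the filter.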
