Each row represents one Treasury auction:
>>> df[(df['issue_date'] >= '2024-03-01') & (df['issue_date'] <= '2024-03-20')][['issue_date', 'maturity_date', 'security_type', 'total_accepted']]
issue_date maturity_date security_type total_accepted
9995 2024-03-05 2024-04-02 Bill 95096576100
9996 2024-03-05 2024-07-02 Bill 60060634300
9997 2024-03-05 2024-04-30 Bill 90090860300
9998 2024-03-07 2024-09-05 Bill 70275894100
9999 2024-03-07 2024-06-06 Bill 79311242700
10000 2024-03-07 2024-04-18 CMB 80000045200
10001 2024-03-12 2024-07-09 Bill 60060669400
10002 2024-03-12 2024-04-09 Bill 95094517100
10003 2024-03-12 2024-05-07 Bill 90091153900
10004 2024-03-14 2024-09-12 Bill 70293933600
10005 2024-03-14 2024-06-13 Bill 79331123700
10006 2024-03-14 2024-04-25 CMB 80001086200
10007 2024-03-15 2027-03-15 Note 56000018600
10008 2024-03-15 2054-02-15 Bond 22000030500
10009 2024-03-15 2034-02-15 Note 39000004400
10010 2024-03-19 2024-07-16 Bill 60061959000
10011 2024-03-19 2024-05-14 Bill 90091935800
10012 2024-03-19 2024-04-16 Bill 95097858000
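issue_date: the date the security is issued to the buyer
maturity_date: the date the security matures (the buyer gets their money back)
security_type: whether it is a Bill, Note, Bond, etc.
total_accepted: the total amount raised by the auction

So, on any given day, for a particular security type (e.g. Bills), some amount may be issued. However, some amount may also be maturing. Net issuance is: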
net issuance = issued - maturing
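As a minimal sketch of that formula, here is the arithmetic on three made-up auctions (the amounts and dates are hypothetical, not taken from the Treasury data). Each auction adds its amount on its issue date and subtracts it on its maturity date:

```python
from collections import defaultdict

# Hypothetical toy auctions:
# (issue_date, maturity_date, security_type, total_accepted)
auctions = [
    ("2024-03-05", "2024-04-02", "Bill", 100),
    ("2024-03-05", "2024-03-12", "Bill", 40),
    ("2024-03-12", "2024-04-09", "Bill", 70),
]

# net issuance per (date, security_type) = issued - maturing
net = defaultdict(int)
for issue_date, maturity_date, security_type, amount in auctions:
    net[(issue_date, security_type)] += amount     # issued on this date
    net[(maturity_date, security_type)] -= amount  # maturing on this date

for key in sorted(net):
    print(key, net[key])
```

On 2024-03-12, for example, 70 is issued while 40 matures, so the net issuance of Bills that day is 30.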
Here is a program that computes a table of net issuance:
import pandas as pd
import treasury_gov_pandas
df = treasury_gov_pandas.update_records('https://api.fiscaldata.treasury.gov/services/api/fiscal_service/v1/accounting/od/auctions_query', lookback=10)
df['issue_date'] = pd.to_datetime(df['issue_date'])
df['maturity_date'] = pd.to_datetime(df['maturity_date'])
df['total_accepted'] = pd.to_numeric(df['total_accepted'], errors='coerce')
# group by 'issue_date' and 'security_type' and sum 'total_accepted'
issued = df.groupby(['issue_date', 'security_type'])['total_accepted'].sum().reset_index()
# group by 'maturity_date' and 'security_type' and sum 'total_accepted'
maturing = df.groupby(['maturity_date', 'security_type'])['total_accepted'].sum().reset_index()
# join issued and maturing on 'issue_date' = 'maturity_date' and 'security_type' = 'security_type'
merged = pd.merge(issued, maturing, how='outer', left_on=['issue_date', 'security_type'], right_on=['maturity_date', 'security_type'])
merged.rename(columns={'total_accepted_x': 'issued', 'total_accepted_y': 'maturing'}, inplace=True)
merged['change'] = merged['issued'].fillna(0) - merged['maturing'].fillna(0)
merged['date'] = merged['issue_date'].combine_first(merged['maturity_date'])
tmp = merged
agg = tmp.groupby(['date', 'security_type'])['change'].sum().reset_index()
pivot_df = agg.pivot(index='date', columns='security_type', values='change').fillna(0)
It uses the following library to retrieve the data:
https://github.com/dharmatech/treasury-gov-pandas.py
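On any given day, there may be issuance with nothing maturing, or maturities with nothing issued.
So, in the code, I used an outer join.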
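To illustrate why the join type matters, here is a small sketch on made-up data (the two frames below are hypothetical stand-ins for `issued` and `maturing`). An inner join keeps only dates that appear on both sides, while an outer join also keeps dates with issuance only or maturities only:

```python
import pandas as pd

# Hypothetical stand-ins for the grouped `issued` and `maturing` frames
issued = pd.DataFrame({
    "issue_date": pd.to_datetime(["2024-03-05", "2024-03-12"]),
    "security_type": ["Bill", "Bill"],
    "total_accepted": [100.0, 70.0],
})
maturing = pd.DataFrame({
    "maturity_date": pd.to_datetime(["2024-03-12", "2024-04-02"]),
    "security_type": ["Bill", "Bill"],
    "total_accepted": [40.0, 100.0],
})

keys = dict(left_on=["issue_date", "security_type"],
            right_on=["maturity_date", "security_type"])

inner = pd.merge(issued, maturing, how="inner", **keys)
outer = pd.merge(issued, maturing, how="outer", **keys)

print(len(inner))  # only 2024-03-12 appears on both sides
print(len(outer))  # 2024-03-05, 2024-03-12 and 2024-04-02 are all kept
print(outer)
```

In the outer result, rows that exist on only one side have a missing `issue_date` or `maturity_date`, which is why the program above fills missing amounts with 0 and builds `date` with `combine_first`.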
>>> pivot_df
security_type Bill Bond CMB FRN Note Note TIPS Bond TIPS Note
date
1979-11-15 0.000000e+00 2.315000e+09 0.0 0.0 2.401000e+09 0.000000e+00 0.0
1980-01-03 6.606165e+09 0.000000e+00 0.0 0.0 0.000000e+00 0.000000e+00 0.0
1980-01-08 4.007825e+09 0.000000e+00 0.0 0.0 0.000000e+00 0.000000e+00 0.0
1980-01-10 6.402625e+09 1.501000e+09 0.0 0.0 0.000000e+00 0.000000e+00 0.0
1980-01-17 6.403760e+09 0.000000e+00 0.0 0.0 0.000000e+00 0.000000e+00 0.0
... ... ... ... ... ... ... ...
2053-02-15 0.000000e+00 -6.635744e+10 0.0 0.0 0.000000e+00 -1.987600e+10 0.0
2053-05-15 0.000000e+00 -6.272132e+10 0.0 0.0 0.000000e+00 0.000000e+00 0.0
2053-08-15 0.000000e+00 -7.160462e+10 0.0 0.0 0.000000e+00 0.000000e+00 0.0
2053-11-15 0.000000e+00 -6.645674e+10 0.0 0.0 0.000000e+00 0.000000e+00 0.0
2054-02-15 0.000000e+00 -7.121181e+10 0.0 0.0 0.000000e+00 -9.389377e+09 0.0
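We can see that there is one column per security type. The value on a given date is the net issuance for that security type.
This approach seems to work. But I'm wondering: is this considered idiomatic pandas code? Is there a better way to do this?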
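Your code is fine, but if you want to condense it further, here is one way. Since I don't have the module you use installed, I've included the functions it needs: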
import pandas as pd
import requests
import os
import time

def download_records_after(url, date, page_size=10000):
    # Download every record with record_date after `date`, page by page.
    data = []
    page = 1
    url_params = f'?filter=record_date:gt:{date}&page[size]={page_size}'
    response = requests.get(url + url_params)
    if response.status_code == 200:
        result_json = response.json()
        data.extend(result_json['data'])
        while True:
            # links.next is None once the last page has been fetched
            if result_json['links']['next'] is None:
                break
            else:
                response = requests.get(url + url_params + result_json['links']['next'])
                if response.status_code == 200:
                    result_json = response.json()
                    data.extend(result_json['data'])
                    page = page + 1
                    print(f'page {page} of {result_json["meta"]["total-pages"]}')
                    time.sleep(2)
                else:
                    print(f'status_code: {response.status_code}')
                    break
    else:
        print(f'status_code: {response.status_code}')
    return pd.DataFrame(data)

def update_records(url, start_date='1900-01-01', page_size=10000, path=None, lookback=2):
    # Derive a default pickle file name from the URL path.
    if path is None:
        ls = url.split('/')[3:]
        path = '-'.join(ls).replace('_', '-') + '.pkl'
    if os.path.isfile(path):
        # Cached data found: re-download only records after a recent record_date.
        print(f'Found {path}. Importing.')
        df = pd.read_pickle(path)
        recent_record_date = df['record_date'].unique()[-lookback]
        print(f'recent_record_date: {recent_record_date} lookback: {lookback}')
        new_records = download_records_after(url, recent_record_date, page_size)
        df = df[df['record_date'] <= recent_record_date]
        df = pd.concat([df, new_records], ignore_index=True)
        df.to_pickle(path)
        return df
    else:
        # No cache yet: download everything after start_date.
        recent_record_date = start_date
        print(f'Using recent_record_date: {recent_record_date}')
        df = download_records_after(url, recent_record_date, page_size)
        df.to_pickle(path)
        return df
api_url = 'https://api.fiscaldata.treasury.gov/services/api/fiscal_service/v1/accounting/od/auctions_query'
df = update_records(api_url,start_date='1900-01-01', page_size=10000, path=None, lookback=2)
df['issue_date'] = pd.to_datetime(df['issue_date'])
df['maturity_date'] = pd.to_datetime(df['maturity_date'])
df['total_accepted'] = pd.to_numeric(df['total_accepted'], errors='coerce')
net_issuance = (
    pd.concat([
        df.assign(date=pd.to_datetime(df['issue_date'])).groupby(['date', 'security_type'])['total_accepted'].sum().rename('issued'),
        df.assign(date=pd.to_datetime(df['maturity_date'])).groupby(['date', 'security_type'])['total_accepted'].sum().rename('maturing')
    ], axis=1)
    .fillna(0)
    .eval('net_issuance = issued - maturing')
    .reset_index()
)
pivot_df = net_issuance.pivot_table(index='date', columns='security_type', values='net_issuance', fill_value=0)
print(pivot_df)
This gives your expected output:
Found services-api-fiscal-service-v1-accounting-od-auctions-query.pkl. Importing.
recent_record_date: 2024-04-18 lookback: 2
security_type Bill Bond CMB FRN Note Note \
date
1979-11-15 0.000000e+00 2.315000e+09 0.0 0.0 2.401000e+09
1980-01-03 6.606165e+09 0.000000e+00 0.0 0.0 0.000000e+00
1980-01-08 4.007825e+09 0.000000e+00 0.0 0.0 0.000000e+00
1980-01-10 6.402625e+09 1.501000e+09 0.0 0.0 0.000000e+00
1980-01-17 6.403760e+09 0.000000e+00 0.0 0.0 0.000000e+00
... ... ... ... ... ...
2053-02-15 0.000000e+00 -6.635744e+10 0.0 0.0 0.000000e+00
2053-05-15 0.000000e+00 -6.272132e+10 0.0 0.0 0.000000e+00
2053-08-15 0.000000e+00 -7.160462e+10 0.0 0.0 0.000000e+00
2053-11-15 0.000000e+00 -6.645674e+10 0.0 0.0 0.000000e+00
2054-02-15 0.000000e+00 -7.121181e+10 0.0 0.0 0.000000e+00
...
2053-11-15 0.000000e+00 0.0
2054-02-15 -9.389377e+09 0.0
[4365 rows x 7 columns]
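The first line of that output shows the pickle file name that `update_records` derives from the URL when `path` is None. As a quick check, the same two lines from the function can be run on their own (the variable name `parts` is mine; the function calls it `ls`):

```python
url = 'https://api.fiscaldata.treasury.gov/services/api/fiscal_service/v1/accounting/od/auctions_query'

# Drop the scheme and host, then join the remaining path segments with '-'
parts = url.split('/')[3:]
path = '-'.join(parts).replace('_', '-') + '.pkl'

print(path)  # services-api-fiscal-service-v1-accounting-od-auctions-query.pkl
```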
So, my contribution is simply:
df['issue_date'] = pd.to_datetime(df['issue_date'])
df['maturity_date'] = pd.to_datetime(df['maturity_date'])
df['total_accepted'] = pd.to_numeric(df['total_accepted'], errors='coerce')
net_issuance = (
    pd.concat([
        df.assign(date=pd.to_datetime(df['issue_date'])).groupby(['date', 'security_type'])['total_accepted'].sum().rename('issued'),
        df.assign(date=pd.to_datetime(df['maturity_date'])).groupby(['date', 'security_type'])['total_accepted'].sum().rename('maturing')
    ], axis=1)
    .fillna(0)
    .eval('net_issuance = issued - maturing')
    .reset_index()
)
pivot_df = net_issuance.pivot_table(index='date', columns='security_type', values='net_issuance', fill_value=0)
print(pivot_df)
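The reason `pd.concat(..., axis=1)` can stand in for the outer join is that it aligns the two grouped Series on their shared `(date, security_type)` index and keeps the union of the index values by default. A small sketch on made-up data (the toy frame below is hypothetical, and I use `rename` and a plain column assignment instead of `assign` and `eval`, purely for readability):

```python
import pandas as pd

# Hypothetical toy auctions, not real Treasury data
df = pd.DataFrame({
    "issue_date": pd.to_datetime(["2024-03-05", "2024-03-05", "2024-03-12"]),
    "maturity_date": pd.to_datetime(["2024-04-02", "2024-03-12", "2024-04-09"]),
    "security_type": ["Bill", "Bill", "Bill"],
    "total_accepted": [100.0, 40.0, 70.0],
})

# Both Series end up indexed by (date, security_type)
issued = (df.rename(columns={"issue_date": "date"})
            .groupby(["date", "security_type"])["total_accepted"]
            .sum().rename("issued"))
maturing = (df.rename(columns={"maturity_date": "date"})
              .groupby(["date", "security_type"])["total_accepted"]
              .sum().rename("maturing"))

# axis=1 concat aligns on the index; dates missing on one side become NaN
combined = pd.concat([issued, maturing], axis=1).fillna(0)
combined["net_issuance"] = combined["issued"] - combined["maturing"]
print(combined)
```

Dates that have only issuance or only maturities still appear in `combined`, with 0 filled in on the missing side, which is the same effect the outer join plus `fillna(0)` has in the original program.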