Treasury tax data: stacked bar chart from a dataframe


Program

Here is a small Python program that fetches tax data through the treasury.gov API:

import pandas as pd
import treasury_gov_pandas
# ----------------------------------------------------------------------
df = treasury_gov_pandas.update_records(
    url = 'https://api.fiscaldata.treasury.gov/services/api/fiscal_service/v1/accounting/dts/deposits_withdrawals_operating_cash')

df['record_date'] = pd.to_datetime(df['record_date'])

df['transaction_today_amt'] = pd.to_numeric(df['transaction_today_amt'])

tmp = df[(df['transaction_type'] == 'Deposits') &   ((df['transaction_catg'].str.contains('Tax'))   |   (df['transaction_catg'].str.contains('FTD')))   ]

The program uses the following library to download the data:

https://github.com/dharmatech/treasury-gov-pandas.py
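To see what the filter keeps without calling the API, here is a minimal sketch on a hand-made dataframe. The rows are invented for illustration (only the column names come from the program above), and the single pattern 'Tax|FTD' is an assumed shorthand for the two `str.contains` calls, not something the original program uses.

```python
import pandas as pd

# Hypothetical sample rows; only the column names match the real dataset.
sample = pd.DataFrame({
    'transaction_type': ['Deposits', 'Deposits', 'Withdrawals', 'Deposits'],
    'transaction_catg': [
        'Taxes - Corporate Income',
        "FTD's Received (Table IV)",
        'Taxes - Corporate Income',
        'Public Debt Cash Issues (Table III-B)',
    ],
    'transaction_today_amt': [237, 2515, 50, 9000],
})

# Keep deposits only.
is_deposit = sample['transaction_type'] == 'Deposits'

# Keep categories mentioning either 'Tax' or 'FTD' (one regex, two alternatives).
is_tax = sample['transaction_catg'].str.contains('Tax|FTD', regex=True)

filtered = sample[is_deposit & is_tax]
print(filtered)
```

The withdrawal row is dropped even though its category mentions taxes, and the public debt row is dropped because its category matches neither alternative.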

Dataframe

The resulting data looks like this:

>>> tmp.tail(20).drop(columns=['table_nbr', 'table_nm', 'src_line_nbr', 'record_fiscal_year', 'record_fiscal_quarter', 'record_calendar_year', 'record_calendar_quarter', 'record_calendar_month', 'record_calendar_day', 'transaction_mtd_amt', 'transaction_fytd_amt', 'transaction_catg_desc', 'account_type', 'transaction_type'])

       record_date                          transaction_catg  transaction_today_amt
371266  2024-04-03    DHS - Customs and Certain Excise Taxes                     84
371288  2024-04-03                  Taxes - Corporate Income                    237
371289  2024-04-03                   Taxes - Estate and Gift                     66
371290  2024-04-03       Taxes - Federal Unemployment (FUTA)                     10
371291  2024-04-03  Taxes - IRS Collected Estate, Gift, misc                     23
371292  2024-04-03              Taxes - Miscellaneous Excise                     41
371293  2024-04-03  Taxes - Non Withheld Ind/SECA Electronic                   1786
371294  2024-04-03       Taxes - Non Withheld Ind/SECA Other                   2315
371295  2024-04-03               Taxes - Railroad Retirement                      3
371296  2024-04-03          Taxes - Withheld Individual/FICA                  12499
371447  2024-04-04    DHS - Customs and Certain Excise Taxes                     82
371469  2024-04-04                  Taxes - Corporate Income                    288
371470  2024-04-04                   Taxes - Estate and Gift                     59
371471  2024-04-04       Taxes - Federal Unemployment (FUTA)                      8
371472  2024-04-04  Taxes - IRS Collected Estate, Gift, misc                    127
371473  2024-04-04              Taxes - Miscellaneous Excise                     17
371474  2024-04-04  Taxes - Non Withheld Ind/SECA Electronic                   1905
371475  2024-04-04       Taxes - Non Withheld Ind/SECA Other                   1092
371476  2024-04-04               Taxes - Railroad Retirement                      1
371477  2024-04-04          Taxes - Withheld Individual/FICA                   2871

The dataframe contains data going back to 2005:

>>> tmp.drop(columns=['table_nbr', 'table_nm', 'src_line_nbr', 'record_fiscal_year', 'record_fiscal_quarter', 'record_calendar_year', 'record_calendar_quarter', 'record_calendar_month', 'record_calendar_day', 'transaction_mtd_amt', 'transaction_fytd_amt', 'transaction_catg_desc', 'account_type', 'transaction_type'])

       record_date                                   transaction_catg  transaction_today_amt
2       2005-10-03                   Customs and Certain Excise Taxes                    127
7       2005-10-03                              Estate and Gift Taxes                     74
10      2005-10-03                          FTD's Received (Table IV)                   2515
12      2005-10-03  Individual Income and Employment Taxes, Not Wi...                    353
21      2005-10-03                          FTD's Received (Table IV)                  15708
...            ...                                                ...                    ...
371473  2024-04-04                       Taxes - Miscellaneous Excise                     17
371474  2024-04-04           Taxes - Non Withheld Ind/SECA Electronic                   1905
371475  2024-04-04                Taxes - Non Withheld Ind/SECA Other                   1092
371476  2024-04-04                        Taxes - Railroad Retirement                      1
371477  2024-04-04                   Taxes - Withheld Individual/FICA                   2871

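Question

I'd like to plot this data as a stacked bar chart.

  • The x-axis should be "record_date".
  • The y-axis should be "transaction_today_amt".
  • The "transaction_catg" values should be used for the stacked items.

What's a good way to implement this using Bokeh?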

bokeh
1 Answer

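Here's one approach: sum the amounts per date and category, pivot so that each category becomes its own column, and pass the resulting dataframe to vbar_stack: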

import pandas as pd
import treasury_gov_pandas

from bokeh.plotting import figure, show
from bokeh.models   import NumeralTickFormatter, HoverTool
import bokeh.models

import bokeh.palettes
import bokeh.transform
# import matplotlib.pyplot as plt
# import matplotlib
# ----------------------------------------------------------------------
df = treasury_gov_pandas.update_records(
    url = 'https://api.fiscaldata.treasury.gov/services/api/fiscal_service/v1/accounting/dts/deposits_withdrawals_operating_cash')

df['record_date'] = pd.to_datetime(df['record_date'])

df['transaction_today_amt'] = pd.to_numeric(df['transaction_today_amt'])

# ----------------------------------------------------------------------

tmp = df[(df['transaction_type'] == 'Deposits') &   ((df['transaction_catg'].str.contains('Tax'))   |   (df['transaction_catg'].str.contains('FTD')))   ]

# tmp.drop(columns=['table_nbr', 'table_nm', 'src_line_nbr', 'record_fiscal_year', 'record_fiscal_quarter', 'record_calendar_year', 'record_calendar_quarter', 'record_calendar_month', 'record_calendar_day', 'transaction_mtd_amt', 'transaction_fytd_amt', 'transaction_catg_desc', 'account_type', 'transaction_type'])

# tmp.tail(20).drop(columns=['table_nbr', 'table_nm', 'src_line_nbr', 'record_fiscal_year', 'record_fiscal_quarter', 'record_calendar_year', 'record_calendar_quarter', 'record_calendar_month', 'record_calendar_day', 'transaction_mtd_amt', 'transaction_fytd_amt', 'transaction_catg_desc', 'account_type', 'transaction_type'])

# ----------------------------------------------------------------------

tmp_agg = tmp.groupby(['record_date', 'transaction_catg'])['transaction_today_amt'].sum().reset_index()

tmp_agg['record_date'] = tmp_agg['record_date'].dt.date

pivot_df = tmp_agg.pivot(index='record_date', columns='transaction_catg', values='transaction_today_amt').fillna(0)

p = figure(title='TGA Taxes', sizing_mode='stretch_both', x_axis_type='datetime', x_axis_label='record_date', y_axis_label='amt')

# p.vbar_stack(stackers=pivot_df.columns, x='record_date', width=0.5, source=pivot_df, legend_label=pivot_df.columns, color=bokeh.palettes.Category20[20])

width = pd.Timedelta(days=0.5)

# p.vbar_stack(stackers=pivot_df.columns, x='record_date', width=0.5, source=pivot_df, color=bokeh.palettes.Category20[15], legend_label=pivot_df.columns.tolist())

p.vbar_stack(stackers=pivot_df.columns, x='record_date', width=width, source=pivot_df, color=bokeh.palettes.Category20[15], legend_label=pivot_df.columns.tolist())

p.xaxis.ticker = bokeh.models.DatetimeTicker(desired_num_ticks=30)

p.legend.click_policy = 'hide'

p.legend.location = 'top_left'

show(p)
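
The reshaping step (`tmp_agg` and `pivot_df` above) is what makes the data usable by `vbar_stack`, which expects one column per stacked item. As a rough illustration, here is the same groupby and pivot on a tiny hand-made dataframe. The category names and amounts are invented for the example, and `long_df`, `agg`, and `wide` are hypothetical names.

```python
import pandas as pd

# Hypothetical long-format rows shaped like `tmp`; the values are made up.
long_df = pd.DataFrame({
    'record_date': pd.to_datetime(
        ['2024-04-03', '2024-04-03', '2024-04-03', '2024-04-04']),
    'transaction_catg': ['Corporate', 'Estate', 'Corporate', 'Corporate'],
    'transaction_today_amt': [200, 66, 37, 288],
})

# Sum duplicate (date, category) pairs so that pivot() sees unique keys.
agg = (long_df
       .groupby(['record_date', 'transaction_catg'])['transaction_today_amt']
       .sum()
       .reset_index())

# One row per date, one column per category; missing combinations become 0.
wide = agg.pivot(index='record_date',
                 columns='transaction_catg',
                 values='transaction_today_amt').fillna(0)

print(wide)
```

The groupby matters because `pivot` raises an error when the same date and category pair appears more than once, which does happen in the real data (see the two "FTD's Received (Table IV)" rows on 2005-10-03). The `fillna(0)` gives every date a value for every category, so each bar has a complete stack.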
