Scrape the federal note yield table from the Treasury website

Question · Votes: 0 · Answers: 2

I want to download the 10-year federal note yield from the Treasury website: https://www.treasury.gov/resource-center/data-chart-center/interest-rates/Pages/TextView.aspx?data=yield

To parse the above webpage and retrieve the latest 10-year Treasury note yield, I used to follow the instructions given here: Parse 10-year federal note yield from website

library(httr)
library(XML)  # readHTMLTable() comes from the XML package

URL <- "https://www.treasury.gov/resource-center/data-chart-center/interest-rates/Pages/TextView.aspx?data=yield"

# Fetch the page and parse every HTML table on it
urldata <- GET(URL)
data <- readHTMLTable(rawToChar(urldata$content),
                      stringsAsFactors = FALSE)

# The yield table used to be the 69th table on the page
data <- as.data.frame(data[69])
names(data) <- gsub("NULL.", "", names(data), fixed = TRUE)  # strip the "NULL." prefix

But it no longer works.

Any ideas what might be wrong, or alternative suggestions?

html r quantmod rvest httr
2 Answers

4 votes

This doesn't answer the specific question of why your code no longer works. Here is an alternative using the rvest package, which simplifies many scraping operations. In particular, below, the table is selected via its CSS class selector, .t-chart, which makes the code more tolerant of page-format changes. Chaining with the %>% operator keeps the code very compact.

library(rvest)
t_url = "https://www.treasury.gov/resource-center/data-chart-center/interest-rates/Pages/TextView.aspx?data=yield"
rates <- read_html(t_url) %>%
  html_node(".t-chart") %>% 
  html_table()

rates

#       Date 1 mo 3 mo 6 mo 1 yr 2 yr 3 yr 5 yr 7 yr 10 yr 20 yr 30 yr
# 1 04/03/17 0.73 0.79 0.92 1.02 1.24 1.47 1.88 2.16  2.35  2.71  2.98
# 2 04/04/17 0.77 0.79 0.92 1.03 1.25 1.47 1.88 2.16  2.36  2.72  2.99
# 3 04/05/17 0.77 0.80 0.93 1.03 1.24 1.44 1.85 2.14  2.34  2.71  2.98
# 4 04/06/17 0.78 0.79 0.94 1.05 1.24 1.45 1.87 2.15  2.34  2.72  2.99
# 5 04/07/17 0.77 0.82 0.95 1.08 1.29 1.52 1.92 2.20  2.38  2.74  3.00
# 6 04/10/17 0.77 0.82 0.97 1.07 1.29 1.52 1.91 2.18  2.37  2.72  2.99
# 7 04/11/17 0.74 0.82 0.94 1.05 1.24 1.45 1.84 2.11  2.32  2.67  2.93
# 8 04/12/17 0.77 0.81 0.95 1.04 1.24 1.44 1.81 2.09  2.28  2.65  2.92
# 9 04/13/17 0.76 0.81 0.94 1.03 1.21 1.40 1.77 2.05  2.24  2.62  2.89
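For reference, what html_table() is doing here (turning the selected table node into rows of cell text) can be sketched in plain Python with only the standard library. The sample HTML and yield values below are hypothetical stand-ins for the Treasury table:

```python
from html.parser import HTMLParser

class TableExtractor(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by row."""
    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = []
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row:
            self.rows.append(self._row)

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

# Hypothetical stand-in for the Treasury yield table
html = """<table>
<tr><th>Date</th><th>10 Yr</th></tr>
<tr><td>04/01/24</td><td>4.33</td></tr>
<tr><td>04/02/24</td><td>4.36</td></tr>
</table>"""

parser = TableExtractor()
parser.feed(html)
header, *body = parser.rows
last_value = body[-1][header.index("10 Yr")]
print("Extracted value:", last_value)  # Extracted value: 4.36
```

This is only a sketch of the mechanism; html_table() (and pandas' read_html below) additionally handle colspans, nested tables, and type conversion.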

0 votes

You can fetch the latest data for the 10-year Treasury yield, updated daily, from the Treasury website:

import requests
from bs4 import BeautifulSoup
import pandas as pd
from io import StringIO

# URL of the Treasury rates page (daily yield curve, filtered to April 2024)
url = (
    "https://home.treasury.gov/resource-center/data-chart-center/"
    "interest-rates/TextView"
    "?type=daily_treasury_yield_curve&field_tdr_date_value_month=202404"
)

# Fetch the content from the URL
response = requests.get(url)
soup = BeautifulSoup(response.content, 'html.parser')

# Find the table in the webpage
table = soup.find("table")
if table:
    # Convert the HTML table to a string, then read it into a DataFrame
    rates = pd.read_html(StringIO(table.prettify()))[0]

    # Check if the '10 Yr' column exists and extract the last non-NaN value
    if '10 Yr' in rates.columns:
        # Drop NaN values and get the last value in the '10 Yr' column
        last_value = rates['10 Yr'].dropna().iloc[-1]
        print("Extracted value:", last_value)
    else:
        print("'10 Yr' column not found in the table.")
else:
    print("No tables found on the page.")
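The URL above hard-codes field_tdr_date_value_month=202404. To always fetch the current month's table, that parameter can be built programmatically; this sketch just reuses the query-string names from the URL above:

```python
from datetime import date

# The Treasury page filters by month via field_tdr_date_value_month=YYYYMM
month_param = date.today().strftime("%Y%m")
url = (
    "https://home.treasury.gov/resource-center/data-chart-center/"
    "interest-rates/TextView"
    f"?type=daily_treasury_yield_curve&field_tdr_date_value_month={month_param}"
)
print(url)
```

Pass this url to requests.get() in place of the hard-coded one.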