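I am experimenting with the IBM documentation. Below is the URL I am looking at. I would like to know how to programmatically expand all the toggles in the left-hand pane so that I can collect every URL and fetch the data.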
https://www.ibm.com/docs/en/b2b-integrator/6.1.0
RPA looks like one feasible approach: expand each toggle and scrape the data with a library such as Selenium.
But can anyone suggest other ideas?
Thanks and regards
Try:
import json

import requests

# Table-of-contents endpoint for this documentation set
doc_url = "https://www.ibm.com/docs/api/v1/toc/b2b-integrator/6.1.0?lang=en"


def print_topics(o, lvl=0):
    # Recursively print each topic's label and href, indented by nesting level
    if isinstance(o, dict):
        print("\t" * lvl, o.get("label"), "->", o.get("href", ""))
        for t in o.get("topics", []):
            print_topics(t, lvl + 1)
    elif isinstance(o, list):
        for v in o:
            print_topics(v, lvl)


data = requests.get(doc_url).json()
# print(json.dumps(data, indent=4))
print_topics(data["toc"])
Prints:
IBM Sterling B2B Integrator -> SS3JSW_6.1.0
IBM Sterling B2B Integrator v6.1.0 documentation -> SS3JSW_6.1.0/kc_welcome_b2bi.html
What's new in the release? ->
IBM Sterling B2B Integrator ->
What's new in 6.1.0.0 -> SS3JSW_6.1.0/whatsnew/whats_new/integrator/integrator_whats_new.html
What's new in 6.1.0.1 -> SS3JSW_6.1.0/whatsnew/whats_new/integrator/integrator_whats_new_6101.html
What's new in 6.1.0.3 -> SS3JSW_6.1.0/whatsnew/whats_new/integrator/integrator_whats_new_6103.html
What's new in 6.1.0.4 -> SS3JSW_6.1.0/whatsnew/whats_new/integrator/integrator_whats_new_6104.html
What's new in 6.1.0.5 -> SS3JSW_6.1.0/whatsnew/whats_new/integrator/integrator_whats_new_6105.html
What's new in 6.1.0.5_1 -> SS3JSW_6.1.0/whatsnew/whats_new/integrator/integrator_whats_new_6105_1.html
What's new in 6.1.0.5_2 -> SS3JSW_6.1.0/whatsnew/whats_new/integrator/integrator_whats_new_6105_2.html
What's new in 6.1.0.6 -> SS3JSW_6.1.0/whatsnew/whats_new/integrator/integrator_whats_new_6106.html
What's new in 6.1.0.7 -> SS3JSW_6.1.0/whatsnew/whats_new/integrator/integrator_whats_new_6107.html
What's new in 6.1.0.8 -> SS3JSW_6.1.0/whatsnew/whats_new/integrator/integrator_whats_new_6108.html
What's deprecated -> SS3JSW_6.1.0/whatsnew/whats_new/integrator/integrator_whats_deprecated.html
What's deprecated in 6.1.0.1 -> SS3JSW_6.1.0/whatsnew/whats_new/integrator/integrator_whats_deprecated_6101.html
Support policy for container delivery models -> SS3JSW_6.1.0/whatsnew/whats_new/integrator/integrator_support_policy.html
Resolved issues -> SS3JSW_6.1.0/whatsnew/whats_new/integrator/integrator_resolved_issues.html
Known issues -> SS3JSW_6.1.0/whatsnew/whats_new/integrator/integrator_known_issues.html
IBM Global Mailbox ->
Known issues -> SS3JSW_6.1.0/whatsnew/whats_new/globalmailbox/gm_known_issues.html
Release Notes -> SS3JSW_6.1.0/ReleaseNotes.html
APAR Fixes -> SS3JSW_6.1.0/APAR_Fixes.html
Quick Start Guide -> SS3JSW_6.1.0/QuickStartGuide.html
Downloading installation media and components -> SS3JSW_6.1.0/IBMB2BIntegratorDownloadDoc.html
Overview ->
IBM Sterling B2B Integrator Overview ->
System requirements -> SS3JSW_6.1.0/overview/overview/integrator/SI_system_requirements.html
Sterling B2B Integrator overview -> SS3JSW_6.1.0/overview/overview/integrator/si_overview.html
Introduction to Sterling B2B Integrator -> SS3JSW_6.1.0/overview/overview/integrator/SI_Introduction.html
Evolving Business and Integration Objectives -> SS3JSW_6.1.0/overview/overview/integrator/SI_EvolvBusIntObj.html
...
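Since the goal is to collect every URL and then fetch the pages, the same recursive walk can return a flat list instead of printing. The following is only a minimal sketch: `collect_links`, `sample_toc`, and `BASE_URL` are hypothetical names, the sample data is hand-written in the same shape as the output above (its nesting is illustrative), and the base URL is a placeholder, because how the `href` values map to real page URLs is not shown here and would need to be checked on the site.

```python
from urllib.parse import urljoin


def collect_links(node, base_url):
    """Return (label, absolute_url) pairs for every TOC entry that has an href."""
    links = []
    if isinstance(node, dict):
        href = node.get("href")
        if href:  # grouping entries without an href are skipped
            links.append((node.get("label"), urljoin(base_url, href)))
        for child in node.get("topics", []):
            links.extend(collect_links(child, base_url))
    elif isinstance(node, list):
        for item in node:
            links.extend(collect_links(item, base_url))
    return links


# Hand-written sample with the same keys as the TOC data (label, href, topics).
sample_toc = {
    "label": "IBM Sterling B2B Integrator",
    "href": "SS3JSW_6.1.0",
    "topics": [
        {
            "label": "What's new in the release?",  # no href: a grouping node
            "topics": [
                {"label": "Release Notes", "href": "SS3JSW_6.1.0/ReleaseNotes.html"},
            ],
        },
    ],
}

# ASSUMPTION: placeholder base URL, not the real IBM docs URL scheme.
BASE_URL = "https://example.com/docs/"

for label, url in collect_links(sample_toc, BASE_URL):
    print(label, "->", url)
```

With the real data you would pass `data["toc"]` instead of `sample_toc`, then loop over the resulting list and request each URL, ideally with a short delay between requests.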