How do I export one or more Confluence Data Center/Server spaces to PDF?


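How can I export one or more Confluence spaces to PDF, selecting them by searching across all available spaces? Information on this is scarce, so I have written it up as a Q&A to help others.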

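After reading through a pile of API deprecations, replacements, and issue reports, I learned that Confluence still does not allow PDF export through its modern RESTful API, only through its long-unsupported SOAP API. That is still the case in 2023.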

Some of the more useful material I read:

https://jira.atlassian.com/browse/CONFSERVER-9901
https://community.atlassian.com/t5/Confluence-questions/RPC-Confluence-export-fails-with-TYPE-PDF/qaq-p/269310
https://developer.atlassian.com/server/confluence/remote-api-specification-for-pdf-export/

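The following SO example is similar to what is needed, but it does not search for spaces; that requires a different endpoint, and has since some time before June 2015. It also uses Ruby and PHP, which would mean introducing a new language to my team. We prefer to stick with C#, Python, and, in an emergency, Java. The SO question is: How to export a Confluence "space" to PDF using the remote API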

python python-3.x soap confluence export-to-pdf
1 Answer

The following Python script was tested with Python 3.11 and Confluence Server 7.19. It was written to be short and is not perfect, so feel free to modify it as needed.

Python 3 code

# Saves one or more Confluence spaces to PDF files. On-prem installs only. SOAP API must be enabled/unblocked
# Be sure to: pip install zeep and change the URL and YOUR_KEY_FILTER_HERE below
# Charles Burns (https://stackoverflow.com/users/161816/charles-burns), February 2023

import shutil
import logging
from getpass import getpass
from datetime import datetime, timezone
from requests import Session
from requests.auth import HTTPBasicAuth
from zeep import Client
from zeep.transports import Transport

confluence = "on-prem-confluence.net" # Your company's Confluence URI
user = input("Confluence login name: ")
password = getpass()

print("Authorizing on " + confluence + "...")
session = Session()
session.auth = HTTPBasicAuth(user, password)
getSpacesClient = Client('https://' + confluence + '/rpc/soap-axis/confluenceservice-v2?WSDL', transport=Transport(session=session))
token = getSpacesClient.service.login(user, password)

print("Getting list of spaces to export...")
allSpaces = getSpacesClient.service.getSpaces(token)
spaces = list(filter(lambda s: s.key.startswith("YOUR_KEY_FILTER_HERE"), allSpaces))
print("Found {} spaces (filtered from {} total): {}".format(len(spaces), len(allSpaces), ", ".join([s.name for s in spaces])))
pdfExportClient = Client('https://' + confluence + '/rpc/soap-axis/pdfexport?WSDL', transport=Transport(session=session))

for space in spaces:
    print("Beginning export of '{}' from {}".format(space.name, space.url))
    try:
        url = siteExportUrl = pdfExportClient.service.exportSpace(token, space.key)
    except Exception as e:
        logging.exception("ERROR EXPORTING " + space.name)
        break
    print("    Downloading exported PDF from {}".format(url))
    fileName = "{}UTC_{}.pdf".format(datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S"), space.key)
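    file = session.get(siteExportUrl, stream=True)
    file.raise_for_status()  # Stop here rather than save an HTTP error page as a .pdf
    file.raw.decode_content = True  # Decode gzip/deflate content encoding, if the server used any
    with open(fileName, 'wb') as f:
        shutil.copyfileobj(file.raw, f)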
    print("    Export complete: {}\n".format(fileName))
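
The space filter and the file name pattern in the script can be pulled out into small functions, which makes it easier to select spaces by several key prefixes at once. This is a minimal sketch and not part of the tested script: `select_spaces` and `export_file_name` are hypothetical helper names, and the `SimpleNamespace` objects only stand in for the space summaries returned by `getSpaces` (assuming nothing beyond the `key` and `name` attributes the script already uses).

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def select_spaces(all_spaces, prefixes):
    """Return the spaces whose key starts with any of the given prefixes."""
    wanted = tuple(prefixes)  # str.startswith accepts a tuple of prefixes
    return [s for s in all_spaces if s.key.startswith(wanted)]


def export_file_name(space_key, when=None):
    """Build a name such as 20230225-000215UTC_MYKEY.pdf (the script's pattern)."""
    when = when or datetime.now(timezone.utc)
    return "{}UTC_{}.pdf".format(when.strftime("%Y%m%d-%H%M%S"), space_key)


# Stand-in data; a real run would pass the list returned by getSpaces(token)
demo_spaces = [
    SimpleNamespace(key="DOCS1", name="Docs One"),
    SimpleNamespace(key="DOCS2", name="Docs Two"),
    SimpleNamespace(key="HR", name="Human Resources"),
]
selected = select_spaces(demo_spaces, ["DOCS"])
print([s.key for s in selected])
print(export_file_name("MYKEY", datetime(2023, 2, 25, 0, 2, 15, tzinfo=timezone.utc)))
```

Passing a list of prefixes keeps the "one or more spaces" case in a single call, and taking the timestamp as a parameter makes the naming easy to check without waiting on the clock.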

Sample output

Confluence login name: charlesburns
Password: 
Authorizing on on-prem-confluence.net...
Getting list of spaces to export...
Found 31 spaces (filtered from 4601 total): Some Space, Some Other Space, Yet Another Space
Beginning export of 'Some Space' from https://on-prem-confluence.net/display/MYKEY
    Downloading exported PDF from https://on-prem-confluence.net/download/temp/pdfexport-20230224/MYKEY.pdf
    Export complete: 20230225-000215UTC_MYKEY.pdf

Beginning export of 'Some Space' from https://on-prem-confluence.net/display/MYKEY
    Downloading exported PDF from https://on-prem-confluence.net/download/temp/pdfexport-20230224/MYKEY.pdf
    Export complete: 20230225-000215UTC_MYKEY.pdf

After a successful export, the PDF files will be in the same folder as the script.
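
A download can go wrong in ways that still leave a file with a .pdf extension on disk, for example if the server answers with an HTML login or error page. The sketch below is one possible sanity check, not something from the tested script: it relies only on the fact that PDF files begin with the bytes `%PDF-`, and `looks_like_pdf` and `check_exports` are hypothetical names. The demo uses throwaway files in a temporary folder.

```python
import tempfile
from pathlib import Path


def looks_like_pdf(path):
    """Return True if the file starts with the PDF magic bytes '%PDF-'."""
    with open(path, "rb") as f:
        return f.read(5) == b"%PDF-"


def check_exports(folder):
    """Map each *.pdf file name in the folder to True (PDF header) or False."""
    return {p.name: looks_like_pdf(p) for p in sorted(Path(folder).glob("*.pdf"))}


# Demo with throwaway files; point check_exports at the script's folder instead
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "good_MYKEY.pdf").write_bytes(b"%PDF-1.4\n...")
    Path(tmp, "bad_MYKEY.pdf").write_bytes(b"<html>Login required</html>")
    results = check_exports(tmp)
print(results)
```

Any file reported as `False` is worth opening in a text editor, since its contents usually say what the server actually returned.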

Errors encountered and possible causes

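Error                                               Note
ValueError: Invalid tag name 'Object []'            The SOAP API may be disabled; ask your admin
requests.exceptions.HTTPError: 401 Client Error     Wrong password, or the account cannot export the space
requests.exceptions.ConnectTimeout                  The Confluence instance is down or the URL is incorrect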
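
The table above can be turned into a small lookup that prints a hint when one of these errors is caught. This is only a sketch: `explain_error` is a hypothetical helper, the matched message fragments are taken from the table and may differ between library versions, and matching on the exception class name (rather than importing `requests`) is a simplification that keeps the example self-contained.

```python
def explain_error(exc):
    """Return a hint for an error listed in the table, or None if it is unknown."""
    name = type(exc).__name__
    text = str(exc)
    if isinstance(exc, ValueError) and "Invalid tag name" in text:
        return "The SOAP API may be disabled; ask your admin"
    if name == "HTTPError" and "401" in text:
        return "Wrong password, or the account cannot export the space"
    if name == "ConnectTimeout":
        return "The Confluence instance is down or the URL is incorrect"
    return None


class ConnectTimeout(Exception):
    """Stand-in for requests.exceptions.ConnectTimeout, used only in this demo."""


print(explain_error(ValueError("Invalid tag name 'Object []'")))
print(explain_error(ConnectTimeout("connection timed out")))
print(explain_error(KeyError("unrelated")))
```

In the script, a call such as `explain_error(e)` could go inside an `except Exception as e:` block wrapped around the login and export calls, with the original traceback still logged when the function returns `None`.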