ReportLab: working with Chinese/Unicode characters

Question

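TL;DR: Is there a way to tell ReportLab to use a specific font and fall back to another one when glyphs for some characters are missing? Alternatively, do you know of a condensed TrueType font that contains glyphs for all European languages, Hebrew, Russian, Chinese, Japanese and Arabic?

I've been using ReportLab to create reports and have run into a problem rendering strings that contain Chinese characters. The font I've been using is DejaVu Sans Condensed, which does not contain Chinese glyphs. It does, however, contain Cyrillic, Hebrew, Arabic and all sorts of diacritics for European languages, which makes it quite versatile, and I need all of them from time to time.

Because the font does not support Chinese, and I've been unable to find a TrueType font that supports all of these languages and also meets our graphic design requirements, I use a temporary workaround: reports for Chinese customers use an entirely different font that contains only English and Chinese glyphs, and I hope that characters from other languages won't appear in the strings. For obvious reasons this is clunky, and it breaks the graphic design, since the font is not DejaVu Sans, around which the whole look and feel was designed.

So the question is: how would you handle the need to support multiple languages in one document while keeping to a designated font for each language? This is made more complicated by the fact that a string sometimes mixes several languages, so choosing one font per string is not an option.

Is there a way to tell ReportLab to use a specific font and fall back to another one when glyphs for some characters are missing? I found some vague hints in the documentation that this should be possible, although I may have misunderstood them.

Alternatively, do you know of a condensed TrueType font that contains glyphs for all European languages, Hebrew, Russian, Chinese, Japanese and Arabic?

Thanks.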

Answer 1

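This question fascinated me all week, so once the weekend came I dove right in and came up with a solution I call MultiFontParagraph. It is a normal Paragraph with one big difference: you can set the exact font fallback order.

For example, for a random Japanese text I pulled off the internet I used the font fallback order "Bauhaus", "Arial", "HanaMinA". For each character it checks whether the first font has a glyph; if so it uses that font, and if not it falls back to the next font. At the moment the code is not very efficient, because it wraps every character in its own tag. That is easy to fix, but for clarity I didn't do it here.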

I created that example with the code below:

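from reportlab.lib.styles import getSampleStyleSheet

styles = getSampleStyleSheet()
foreign_string = u'6905\u897f\u963f\u79d1\u8857\uff0c\u5927\u53a6\uff03\u5927'
# Raw strings keep the backslashes in the Windows paths literal
P = MultiFontParagraph(foreign_string, styles["Normal"],
                       [("Bauhaus", r"C:\Windows\Fonts\BAUHS93.TTF"),
                        ("Arial", r"C:\Windows\Fonts\arial.ttf"),
                        ("HanaMinA", r"C:\Windows\Fonts\HanaMinA.ttf")])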

The source of MultiFontParagraph (git) is as follows:

from reportlab.pdfbase import pdfmetrics
from reportlab.pdfbase.ttfonts import TTFont
from reportlab.platypus import Paragraph


class MultiFontParagraph(Paragraph):
    # Created by B8Vrede for http://stackoverflow.com/questions/35172207/
    def __init__(self, text, style, fonts_locations):

        font_list = []
        for font_name, font_location in fonts_locations:
            # Load the font
            font = TTFont(font_name, font_location)

            # Get the char width of all known symbols
            font_widths = font.face.charWidths

            # Register the font so that it can be used
            pdfmetrics.registerFont(font)

            # Store the font and info in a list for lookup
            font_list.append((font_name, font_widths))

        # Set up the string to hold the new text
        new_text = u''

        # Loop through the string
        for char in text:

            # Loop through the fonts in fallback order
            for font_name, font_widths in font_list:

                # A font that knows the width of the character has a glyph
                # for it, so use the first such font
                if ord(char) in font_widths:

                    # Set the working font for the current character
                    new_text += u'<font name="{}">{}</font>'.format(font_name, char)
                    break
            else:
                # No font has a glyph: keep the character in the style's font
                new_text += char

        Paragraph.__init__(self, new_text, style)
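
The answer mentions that tagging every single character is inefficient and easy to fix. One possible way to do that, shown here only as a sketch and not as the author's code, is to group consecutive characters that resolve to the same font and emit one tag per run. The helper names pick_font and build_markup are made up for this illustration; font_list is assumed to have the same shape as in the class above (pairs of font name and a mapping keyed by code point). The sketch also escapes characters such as & and <, which would otherwise be read as markup by Paragraph.

```python
from itertools import groupby
from xml.sax.saxutils import escape


def pick_font(char, font_list):
    """Return the name of the first font that has a glyph for char, else None."""
    for font_name, font_widths in font_list:
        if ord(char) in font_widths:
            return font_name
    return None


def build_markup(text, font_list):
    """Wrap each run of characters that share a font in a single <font> tag."""
    parts = []
    for font_name, run in groupby(text, key=lambda ch: pick_font(ch, font_list)):
        # Escape &, < and > so they are not parsed as paragraph markup
        chunk = escape(''.join(run))
        if font_name is None:
            # No font covers this run: leave it in the paragraph's own font
            parts.append(chunk)
        else:
            parts.append('<font name="{}">{}</font>'.format(font_name, chunk))
    return ''.join(parts)
```

Inside MultiFontParagraph.__init__ the per-character loop could then be replaced by new_text = build_markup(text, font_list), with the font loading and registration left unchanged.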

Answer 2

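From Google Noto Fonts:

Google has been developing a font family called Noto, which aims to support all languages with a harmonious look and feel.

The unified Noto Sans font consists of a single font that supports 581 languages from the following regions:

Others, such as Hebrew, Arabic and Japanese, are listed as separate projects on the Noto website.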
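
Since Hebrew, Arabic and Japanese come as separate Noto fonts, a document that mixes scripts still has to choose a font per character or per run. A rough sketch of one way to do that uses the Unicode character name from the standard unicodedata module. The font names below (NotoSans, NotoSansHebrew, NotoSansArabic, NotoSansCJK) are assumptions: they stand for whatever names you register the downloaded Noto files under, and are not defined by ReportLab.

```python
import unicodedata

# Assumed names: the names under which the Noto font files were registered
NOTO_BY_SCRIPT = [
    (("CJK", "HIRAGANA", "KATAKANA"), "NotoSansCJK"),
    (("HEBREW",), "NotoSansHebrew"),
    (("ARABIC",), "NotoSansArabic"),
]
DEFAULT_FONT = "NotoSans"  # assumed to cover Latin and Cyrillic


def noto_font_for(char):
    """Guess which registered Noto font to use from the Unicode character name."""
    name = unicodedata.name(char, "")
    for prefixes, font_name in NOTO_BY_SCRIPT:
        # str.startswith accepts a tuple of alternative prefixes
        if name.startswith(prefixes):
            return font_name
    return DEFAULT_FONT
```

The name-prefix check is a crude heuristic: fullwidth punctuation, for example, has names that start with FULLWIDTH and would land in the default font, so checking the glyph tables of the loaded fonts, as Answer 1 does, is more reliable.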

Answer 3

We can also use the ReportLab Chinese fonts package.

from reportlab.pdfgen import canvas
from reportlab.pdfbase import pdfmetrics
from reportlab.pdfbase.cidfonts import UnicodeCIDFont

# Register the Chinese font with Reportlab
pdfmetrics.registerFont(UnicodeCIDFont('STSong-Light'))

# Create a new canvas
c = canvas.Canvas("sample.pdf")

# Set the font to the Chinese font
c.setFont('STSong-Light', 32)

# Draw some Chinese characters
c.drawString(50, 750, '世界，你好！')

# Save the PDF
c.save()
