Change docstring max line length, while preserving newlines, indentation, and words


Given a function docstring, I want to reduce its maximum line length. My attempt preserves newlines, indentation, and words (it doesn't break words mid-word), but fails to actually enforce the maximum line length. My other attempts enforce the length but fail at the preserving, and the code tends to get complicated quickly.

Autopep8 appears to have had an unfixed bug for this since 2019.

I need an automated way to change a docstring's maximum line length - overwriting the target file is allowed. Am I missing something simple in `textwrap`, or is there another module/utility for this?
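For a single paragraph, `textwrap.fill` with `initial_indent`/`subsequent_indent` does wrap at a fixed width while keeping the indent; what it lacks is any notion of paragraph breaks or per-block indentation, which is why it doesn't solve this out of the box. A minimal sketch of what it can do:

```python
import textwrap

# Wrap one logical paragraph at width 40, keeping a 4-space indent.
para = ("Compute spatial support of `pf` as the interval, "
        "in number of samples, where sum of envelope is large.")
wrapped = textwrap.fill(
    para,
    width=40,
    initial_indent="    ",
    subsequent_indent="    ",  # continuation lines get the same indent
)
print(wrapped)
```

Feeding a whole multi-paragraph docstring through `fill` in one call would, however, collapse its blank lines into spaces.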

Attempt

import textwrap

def change_maxlen(txt):
    # hard-coded stuff (e.g. `76`) for demo simplicity
    tnew = ["    "]
    for l in txt.splitlines():
        wrap = textwrap.fill(l[4:], width=76)
        if '\n' in wrap:
            pnew = "\n    " + wrap.replace('\n', ' ')
        else:
            pnew = "\n    " + wrap
        tnew += [pnew]
    tnew = "".join(tnew).replace('"""', '')
    return tnew
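Why the attempt above never enforces the width: whenever `textwrap.fill` does insert line breaks, the `'\n' in wrap` branch immediately replaces them with spaces, rejoining everything into one long line. A small reproduction (width shrunk to 30 for brevity):

```python
import textwrap

line = "    alpha beta gamma delta epsilon zeta eta theta"
wrap = textwrap.fill(line[4:], width=30)
assert "\n" in wrap                            # fill did break the line...
rejoined = "\n    " + wrap.replace("\n", " ")  # ...and this undoes it
# the rejoined content is just as long as the input, so nothing was wrapped
longest = max(len(l) for l in rejoined.splitlines())
print(longest)
```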

Many other failed attempts...

Example input

    """Compute spatial support of `pf` as the interval, in number of samples,
    where sum of envelope (absolute value) inside it is `1 / criterion_amplitude`
    times greater than outside.

    Used for avoiding boundary effects and incomplete filter decay. Accounts for
    tail decay, but with lessser weight than the Heisgenberg resolution measure
    for monotonic decays (i.e. no bumps/spikes). For prioritizing the main lobe,
    see `dumdum.utils.measures.compute_temporal_width()`.

    Parameters
    ----------
    pf : np.ndarray, 1D
        Filter, in frequency domain.

        Assumes that the time-domain waveform, `ifft(pf)`, is centered at index

    guarantee_decay : bool (default False)
        In practice, this is identical to `True`, though `True` will still return

    Returns
    -------
    support : int
        Total temporal support of `pf`, measured in samples ("total" as opposed to
        only right or left part).
    """

Desired output

    """Compute spatial support of `pf` as the interval, in number of samples,
    where sum of envelope (absolute value) inside it is `1 / 
    criterion_amplitude` times greater than outside.
    
    Used for avoiding boundary effects and incomplete filter decay. Accounts 
    for tail decay, but with lessser weight than the Heisgenberg resolution 
    measure for monotonic decays (i.e. no bumps/spikes). For prioritizing the 
    main lobe, see `dumdum.utils.measures.compute_temporal_width()`.
     
    Parameters
    ----------
    pf : np.ndarray, 1D
        Filter, in frequency domain.
    
        Assumes that the time-domain waveform, `ifft(pf)`, is centered at index
    
    guarantee_decay : bool (default False)
        In practice, this is identical to `True`, though `True` will still 
        return
    
    Returns
    -------
    support : int
        Total temporal support of `pf`, measured in samples ("total" as opposed
        to only right or left part).
    """
python string multiline

I implemented it from scratch - far from performance-optimized, and with important unhandled cases, but it's something. Tested on docstrings longer and more challenging than the one in the question.

I also tried getting ChatGPT-3.5 to rewrite my solution, or produce one from scratch, and also tried every non-GPT4 option on platform.openai - everything failed miserably. I wrote my prompts carefully. GPT4 demands having already spent $1 on usage (... instead of letting me pay $1 to unlock it?).

Not a satisfying solution! But sharing something for the cause beats nothing.

Usage

(1) `max_len`: target maximum number of characters per line.

(2) `base_indent=4`: the minimum indentation shared by all lines. Smaller indents get clipped. If the true `base_indent` is larger, the code still works, but the final indentation may be off.

(3) The input string must be wrapped in another multiline string. So, if we want to reformat

    """Blah blah
    rasengan
    """

we would do

txt = '''    """Blah blah
    rasengan
    """'''

That is, start from

txt = ''''''

then copy the complete original string and paste it between the ''' and '''.
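Rather than pasting by hand, the wrapper string can also be built from the function object itself; a hedged sketch using `inspect.getdoc` (which returns the docstring already dedented), re-adding the 4-space indent and quote markers the reformatter expects:

```python
import inspect

def demo():
    """Blah blah
    rasengan
    """

# `getdoc` returns the cleaned, dedented docstring: "Blah blah\nrasengan"
doc = inspect.getdoc(demo)
lines = doc.splitlines()
# rebuild the wrapped `txt` shape described above
txt = ('    """' + lines[0]
       + "".join("\n    " + l for l in lines[1:])
       + '\n    """')
print(txt)
```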

Unhandled cases

The main unhandled case is bullet overflow, i.e.

    """
    Normal line

        - Indented bullet ...
          continuation ...

If the `Indented bullet` line exceeds `max_len`, then `continuation` ends up vertically aligned with `-` instead of with `I`. Other cases:

  • nested bullets with newlines between them - the newlines get dropped
  • a single word longer than `max_len`
  • probably others
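For the bullet-overflow case specifically, `textwrap` can be made to align continuations under the bullet text rather than under the `-`, by wrapping each bullet on its own with a deeper `subsequent_indent`; a sketch, assuming bullets always start with `- `:

```python
import textwrap

bullet = ("        - Indented bullet with a long enough tail "
          "to overflow the target width")
indent = len(bullet) - len(bullet.lstrip())  # leading spaces before "-"
hang = " " * (indent + 2)                    # continuation sits past "- "
wrapped = textwrap.fill(
    bullet.strip(),
    width=40,
    initial_indent=" " * indent,
    subsequent_indent=hang,
)
print(wrapped)
```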

Code

def change_max_len(txt, max_len, base_indent):
    def body(new_lines, carry, current_indent, l):
        l_orig = l
        l_indent = len(l) - len(l.lstrip())
        if l not in ("", '"""', "'''"):
            current_indent = l_indent
        extra_newline = False
    
        if carry != "":
            if l_orig in ('"""', "'''"):
                l = " "*current_indent + carry + "\n" + " "*base_indent + l_orig
            elif l_indent > 0 or l_orig == "":
                l = " "*current_indent + carry + " " + l[l_indent:]
                if l_orig == "":
                    extra_newline = True
            else:
                l = carry + " " + l
            carry = ""
        
        def handle_overflow(l, new_lines):  # note: `new_lines` is unused here
            words = l.split(" ")
    
            l0 = ""
            l1 = ""
            for w in words:
                w_add = " " + w
                # `-1` as later we drop a `" "`
                if len(l0) + len(w_add) - 1 <= max_len_adj:
                    l0 += w_add
                else:
                    l1 += w_add
            # drop the first `" "`
            l0, l1 = l0[1:], l1[1:]
    
            carry = l1
    
            assert len(l0) <= max_len_adj
            new_line = " " * base_indent + l0 + "\n"
            return new_line, carry
    
    
        if len(l) <= max_len_adj:
            new_line = " "*base_indent + l + "\n"
        else:
            new_line, carry = handle_overflow(l, new_lines)
            # handle over-overflow
            if len(carry) / max_len_adj > 2:
                new_lines += [new_line]
                l = carry
                i = 0
                while True:
                    new_line, carry = handle_overflow(l, new_lines)
                    if len(carry) / max_len_adj <= 2:
                        break
                    new_lines += [new_line]
                    l = carry
                    i += 1
                    if i > 10:
                        # probably a single word in excess of max_len_adj
                        raise Exception("`max_len` probably too small")
                        
        if extra_newline:
            new_line += "\n"
     
        new_lines += [new_line]
    
        return new_lines, carry, current_indent

    max_len_adj = max_len - base_indent
    # first dedent then reindent
    lines = [l[base_indent:] for l in txt.splitlines()]
    
    # initialize loop values
    new_lines = []
    carry = ""
    current_indent = 0
    
    # main loop
    for i, l in enumerate(lines):
        new_lines, carry, current_indent = body(
            new_lines, carry, current_indent, l)
    
    new_txt = "".join(new_lines)
    return new_txt

txt = '''    """Compute spatial support of `pf` as the interval, in number of samples,
    where sum of envelope (absolute value) inside it is `1 / criterion_amplitude`
    times greater than outside.

    Used for avoiding boundary effects and incomplete filter decay. Accounts for
    tail decay, but with lessser weight than the Heisgenberg resolution measure
    for monotonic decays (i.e. no bumps/spikes). For prioritizing the main lobe,
    see `dumdum.utils.measures.compute_temporal_width()`.

    Parameters
    ----------
    pf : np.ndarray, 1D
        Filter, in frequency domain.

        Assumes that the time-domain waveform, `ifft(pf)`, is centered at index

    guarantee_decay : bool (default False)
        In practice, this is identical to `True`, though `True` will still return

    Returns
    -------
    support : int
        Total temporal support of `pf`, measured in samples ("total" as opposed to
        only right or left part).
    """'''

new_txt = change_max_len(txt, max_len=70, base_indent=4)
print(new_txt)
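A standalone sanity check for any such reformatter's output (the helper name is mine, not from the answer):

```python
def max_line_len(text):
    """Length of the longest line in `text`, in characters."""
    return max((len(line) for line in text.splitlines()), default=0)

sample = "    short line\n    a somewhat longer line here"
print(max_line_len(sample))
```

Applied to `new_txt` above, every line should come out at or below `max_len=70`, modulo the unhandled cases listed earlier.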