Python project file structure with relative imports, or how to properly structure the described project

Question

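I have been working through the bioinformatics problems on Rosalind.info, and I have run into trouble now that I want to run some simple tests.

My project is structured as follows: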

Rosalind-problems/
├─ bioinformatics_stronghold/
│  ├─ data/
│  ├─ modules/
│  │  ├─ __init__.py
│  │  ├─ read_fasta.py
│  ├─ CONS.py
│  ├─ IEV.py
├─ tests/
│  ├─ __init__.py
│  ├─ test_CONS.py
│  ├─ test_IEV.py

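The goal is to be able to test each individual file in the bioinformatics_stronghold folder (CONS.py, IEV.py, etc.). However, the problem I am running into is:

Running test_IEV.py works as expected.
Running test_CONS.py does not work.

All the affected files are shown below: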

test_IEV.py

import pytest
from bioinformatics_stronghold.IEV import calculate_offspring

def test_calculate_offspring():
    assert calculate_offspring([1, 0, 0, 1, 0, 1]) == 3.5
    assert calculate_offspring([1, 1, 1, 1, 1, 1]) == 8.5

IEV.py

def calculate_offspring(input_list:list[int]) -> float:
    """This function will take an input list of non-negative integers no larger than 20,000. The function will then calculate the expected offspring showing the dominant phenotype.

    Args:
        input_list (list): Input a list of integers representing the number of couples

    Returns:
        float: The expected number of offspring
    """
    expected_dominant_offspring = 0
    
    # For all cases, it is assumed that every couple has exactly 2 offspring
    for index, count in enumerate(input_list):
        print("Index:", index, "   ", "Num couples:", count)
            
        # Case AA-AA, all offspring will be dominant phenotype
        if index == 0:
            expected_dominant_offspring += count * 2 * 1 
            
        # Case AA-Aa, all offspring will be dominant phenotype
        elif index == 1:
            expected_dominant_offspring += count * 2 * 1
            
        # Case AA-aa, all offspring will be dominant phenotype
        elif index == 2:
            expected_dominant_offspring += count * 2 * 1

        # Case Aa-Aa, 3 out of 4 offspring will be dominant phenotype
        elif index == 3:
            expected_dominant_offspring += count * (2 * (3/4))

        # Case Aa-aa, 1 out of 2 offspring will be dominant phenotype
        elif index == 4:
            expected_dominant_offspring += count * (2 * (2/4))

        # Case aa-aa, no offspring will be dominant phenotype
        elif index == 5:
            expected_dominant_offspring += count * 2 * 0

    print(expected_dominant_offspring)
    
    return expected_dominant_offspring

These two work fine.

Now for the problematic files...

test_CONS.py

import pytest
from bioinformatics_stronghold.CONS import find_consensus_sequence


def test_find_consensus_sequence():
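    # The expected tuple is parenthesized; without the parentheses the second list is parsed as the assert message.
    assert find_consensus_sequence("tests\\data\\CONS_sample_data.fasta") == (
        [[5, 1, 0, 0, 5, 5, 0, 0], [0, 0, 1, 4, 2, 0, 6, 1], [1, 1, 6, 3, 0, 1, 0, 0], [1, 5, 0, 0, 0, 1, 1, 6]],
        ['A', 'T', 'G', 'C', 'A', 'A', 'C', 'T'],
    )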

Adding the line

from bioinformatics_stronghold.modules.read_fasta import read_fasta_file

only gives me a ModuleNotFoundError. Prefixing the import with . or .. instead results in ImportError: attempted relative import with no known parent package.

CONS.py

from modules.read_fasta import read_fasta_file

def find_consensus_sequence(fasta_location):
    """ 
    This function will read a given fasta file and extract all sequences using the read_fasta.py module.
    The function will then create a profile matrix as well as a consensus sequence, both as lists.

    Args:
        fasta_location (str): The location of the fasta file as a string.

    Returns:
        profile_matrix (list[lists]): The profile matrix of all given sequences. 
        consensus_sequence (list): The consensus sequences of all given sequences.
    """
    
    fasta_content = read_fasta_file(fasta_location, debug=False)
    
    # Create a matrix with all sequences
    sequence_matrix = []
    for item in fasta_content:
        sequence_matrix.append(list(item.sequence))
    # print(sequence_matrix)
    
    # Create the empty profile matrix
    # [A, C, G, T]
    profile_matrix = [[0]*len(sequence_matrix[0]), [0]*len(sequence_matrix[0]), [0]*len(sequence_matrix[0]), [0]*len(sequence_matrix[0])]
    
    # print(profile_matrix)
    
    # Add to the nucleotide count depending on the sequence
    # (the inner loop variable is named "position" so it no longer shadows an outer "index")
    for sublist in sequence_matrix:
        for position, nucleotide in enumerate(sublist):
            if nucleotide == "A":
                profile_matrix[0][position] += 1
            elif nucleotide == "C":
                profile_matrix[1][position] += 1
            elif nucleotide == "G":
                profile_matrix[2][position] += 1
            elif nucleotide == "T":
                profile_matrix[3][position] += 1
                
    # print(profile_matrix)

    consensus_sequence = []
    # NOTE: Ugly solution, but it seems to work. Quite ineffective, but not sure how to improve at this time.
    # For each position in the sequence, check which "letter" is larger than all other
    for index in range(len(profile_matrix[0])):
        
        if profile_matrix[0][index] > profile_matrix[1][index] and profile_matrix[0][index] > profile_matrix[2][index] and profile_matrix[0][index] > profile_matrix[3][index]:
            consensus_sequence.append("A")
            
        elif profile_matrix[1][index] > profile_matrix[0][index] and profile_matrix[1][index] > profile_matrix[2][index] and profile_matrix[1][index] > profile_matrix[3][index]:
            consensus_sequence.append("C")
            
        elif profile_matrix[2][index] > profile_matrix[0][index] and profile_matrix[2][index] > profile_matrix[1][index] and profile_matrix[2][index] > profile_matrix[3][index]:
            consensus_sequence.append("G")
            
        elif profile_matrix[3][index] > profile_matrix[0][index] and profile_matrix[3][index] > profile_matrix[1][index] and profile_matrix[3][index] > profile_matrix[2][index]:
            consensus_sequence.append("T")
    
    # print(consensus_sequence)
    
    return profile_matrix, consensus_sequence

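test_CONS.py does not work. The problem seems to be that the modules folder cannot be found.

Adding an __init__.py to the bioinformatics_stronghold folder does not solve this.

If I move the tests folder into the bioinformatics_stronghold folder, pytest breaks without any obvious error message, and I am unable to set up the tests in VSCodium.

My questions are:

Why can't pytest import the read_fasta function from modules?
How should I arrange a project like this so that I can keep several small scripts like these while still being able to test them?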

Answer 1

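I think changing this should do it:

CONS.py

from .modules.read_fasta import read_fasta_file

If this doesn't work, there may be an import problem in read_fasta.py itself. I encourage you to post the full error traceback you are seeing in a comment here, not just the error message.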
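
To illustrate why the leading dot matters, here is a minimal sketch that builds a throwaway package with the same shape as the one in the question and tries both import styles. The package name demo_stronghold, the stub read_fasta_file, and the helper names are assumptions made up for this demo, not part of the original project. The idea is that `from modules...` is an absolute import, so Python looks for a top-level `modules` on sys.path. That lookup tends to succeed when CONS.py is run directly (its own folder is then on sys.path) but not when pytest imports it from the project root.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_project(root: Path, import_line: str) -> None:
    """Create a small package mirroring the layout in the question (names are hypothetical)."""
    pkg = root / "demo_stronghold"  # stand-in for bioinformatics_stronghold
    (pkg / "modules").mkdir(parents=True)
    (pkg / "__init__.py").write_text("")
    (pkg / "modules" / "__init__.py").write_text("")
    # Stub module: only the importable name matters for this demo.
    (pkg / "modules" / "read_fasta.py").write_text(
        "def read_fasta_file(path, debug=False):\n    return []\n"
    )
    # CONS.py contains nothing but the import line under test.
    (pkg / "CONS.py").write_text(import_line + "\n")


def try_import(import_line: str) -> str:
    """Import demo_stronghold.CONS the way a test run from the project root would."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        build_demo_project(root, import_line)
        sys.path.insert(0, str(root))  # only the project root is on sys.path
        importlib.invalidate_caches()
        try:
            importlib.import_module("demo_stronghold.CONS")
            return "ok"
        except ModuleNotFoundError as exc:
            return f"ModuleNotFoundError: {exc.name}"
        finally:
            # Clean up so repeated calls start from a fresh state.
            sys.path.remove(str(root))
            for name in [n for n in sys.modules if n.split(".")[0] == "demo_stronghold"]:
                del sys.modules[name]


print(try_import("from modules.read_fasta import read_fasta_file"))
print(try_import("from .modules.read_fasta import read_fasta_file"))
```

The first call should report a ModuleNotFoundError because there is no top-level `modules` package on sys.path, while the second should succeed because the relative import is resolved inside the package that contains CONS.py.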

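Note: your naming convention does not follow the PEP 8 guidelines.

Modules should have short, all-lowercase names. Underscores can be used in the module name if it improves readability.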

Edit: Here is an example of how to structure the project and make it callable.

Rosalind-problems/
├─ bioinformatics_stronghold/
│  ├─ data/
│  ├─ modules/
│  │  ├─ __init__.py
│  │  ├─ read_fasta.py
│  ├─ __main__.py
│  ├─ CONS.py
│  ├─ IEV.py
├─ tests/
│  ├─ __init__.py
│  ├─ test_CONS.py
│  ├─ test_IEV.py

__main__.py

from .modules import read_fasta


read_fasta.call_a_function()

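To run this, just type the following in a terminal:

python -m bioinformatics_stronghold

With a single main entry point you can do all sorts of things, such as accepting user input, adding an argparse interface, and so on.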
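
As a rough idea of what such an argparse interface could look like, here is a small sketch. The argument names (problem, input), the list of choices, and the returned message are assumptions for illustration only. In a real __main__.py the marked spot would dispatch to the project's own functions instead of returning a string.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a command-line parser; the argument names here are hypothetical."""
    parser = argparse.ArgumentParser(
        prog="bioinformatics_stronghold",
        description="Run one of the Rosalind problem scripts.",
    )
    parser.add_argument("problem", choices=["CONS", "IEV"], help="which problem to run")
    parser.add_argument("input", help="path to the input file for that problem")
    return parser


def main(argv=None) -> str:
    """Parse the arguments and describe what would be run."""
    args = build_parser().parse_args(argv)
    # Placeholder: a real entry point would call the matching problem function here.
    return f"would run {args.problem} on {args.input}"


# Passing an explicit argument list keeps the demo independent of sys.argv.
print(main(["CONS", "data/sample.fasta"]))  # prints: would run CONS on data/sample.fasta
```

Accepting an optional argv list in main makes the entry point easy to call from a test without touching the real command line.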
