How can I have multiple src directories at the root of a Python project using setup.py and pip install -e?

Question description  Votes: 0  Answers: 2

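I want to have two src directories at the root of my project. The reason is that the first holds code I want to keep using without modifying any of its imports. The second holds new code that is independent of the "old code". I want both src directories to work when I run pip install -e . from the project root. My setup.py is: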

"""
python -c "print()"

refs:
    - setup tools: https://setuptools.pypa.io/en/latest/userguide/package_discovery.html#using-find-or-find-packages
    - https://stackoverflow.com/questions/70295885/how-does-one-install-pytorch-and-related-tools-from-within-the-setup-py-install
"""
from setuptools import setup
from setuptools import find_packages
import os

here = os.path.abspath(os.path.dirname(__file__))
with open(os.path.join(here, 'README.md'), encoding='utf-8') as f:
    long_description = f.read()

setup(
    name='massive-evaporate-4-math',  # project name
    version='0.0.1',
    long_description=long_description,
    long_description_content_type="text/markdown",
    author='Me',
    author_email='[email protected]',
    python_requires='>=3.9',
    license='Apache 2.0',

    # ref: https://chat.openai.com/c/d0edae00-0eb2-4837-b492-df1d595b6cab
    # The `package_dir` parameter is a dictionary that maps package names to directories.
    # A key of an empty string represents the root package, and its corresponding value
    # is the directory containing the root package. Here, the root package is set to the
    # 'src' directory.
    #
    # The use of an empty string `''` as a key is significant. In the context of setuptools,
    # an empty string `''` denotes the root package of the project. It means that the
    # packages and modules located in the specified directory ('src' in this case) are
    # considered to be in the root of the package hierarchy. This is crucial for correctly
    # resolving package and module imports when the project is installed.
    #
    # By specifying `{'': 'src'}`, we are informing setuptools that the 'src' directory is
    # the location of the root package, and it should look in this directory to find the
    # Python packages and modules to be included in the distribution.
    package_dir={
            '': 'src_math_evaporate',
            'bm_evaporate': 'src_bm_evaporate', 
        },

    # The `packages` parameter lists all Python packages that should be included in the
    # distribution. A Python package is a way of organizing related Python modules into a
    # directory hierarchy. Any directory containing an __init__.py file is considered a
    # Python package.
    #
    # `find_packages('src')` is a convenience function provided by setuptools, which
    # automatically discovers and lists all packages in the specified 'src' directory.
    # This means it will include all directories in 'src' that contain an __init__.py file,
    # treating them as Python packages to be included in the distribution.
    #
    # By using `find_packages('src')`, we ensure that all valid Python packages inside the
    # 'src' directory, regardless of their depth in the directory hierarchy, are included
    # in the distribution, eliminating the need to manually list them. This is particularly
    # useful for projects with a large number of packages and subpackages, as it reduces
    # the risk of omitting packages from the distribution.
    packages=find_packages('src_math_evaporate') + find_packages('src_bm_evaporate'),
    # When using `pip install -e .`, the package is installed in 'editable' or 'develop' mode.
    # This means that changes to the source files immediately affect the installed package
    # without requiring a reinstall. This is extremely useful during development as it allows
    # for testing and iteration without the constant need for reinstallation.
    #
    # In 'editable' mode, the correct resolution of package and module locations is crucial.
    # The `package_dir` and `packages` configurations play a vital role in this. If the
    # `package_dir` is incorrectly set, or if a package is omitted from the `packages` list,
    # it can lead to ImportError due to Python not being able to locate the packages and
    # modules correctly.
    #
    # Therefore, when using `pip install -e .`, it is essential to ensure that `package_dir`
    # correctly maps to the root of the package hierarchy and that `packages` includes all
    # the necessary packages by using `find_packages`, especially when the project has a
    # complex structure with nested packages. This ensures that the Python interpreter can
    # correctly resolve imports and locate the source files, allowing for a smooth and
    # efficient development workflow.

    # for pytorch see doc string at the top of file
    install_requires=[
        'fire',
        'dill',
        'networkx>=2.5',
        'scipy',
        'scikit-learn',
        'lark-parser',
        'tensorboard',
        'pandas',
        'progressbar2',
        'requests',
        'aiohttp',
        'numpy',
        'plotly',
        'wandb',
        'matplotlib',
        # 'statsmodels'
        # 'statsmodels==0.12.2'
        # 'statsmodels==0.13.5'
        # - later check why we are not installing it...
        # 'seaborn'
        # 'nltk'
        'twine',

        # # mercury: https://github.com/vllm-project/vllm/issues/2747
        # 'dspy-ai',
        # # 'torch==2.1.2+cu118',  # 2.2 not supported due to vllm see: https://github.com/vllm-project/vllm/issues/2747
        # 'torch==2.2.2',  # 2.2 not supported due to vllm see: https://github.com/vllm-project/vllm/issues/2747
        # # 'torchvision',
        # # 'torchaudio',
        # # 'trl',
        # 'transformers',
        # 'accelerate',
        # # 'peft',
        # # 'datasets==2.18.0', 
        # 'datasets',  
        # 'evaluate', 
        # 'bitsandbytes',
        # # 'einops',
        # # 'vllm==0.4.0.post1', # my gold-ai-olympiad project uses 0.4.0.post1 ref: https://github.com/vllm-project/vllm/issues/2747

        # ampere
        'dspy-ai',
        # 'torch==2.1.2+cu118',  # 2.2 not supported due to vllm see: https://github.com/vllm-project/vllm/issues/2747
        'torch==2.1.2',  # 2.2 not supported due to vllm see: https://github.com/vllm-project/vllm/issues/2747
        # 'torchvision',
        # 'torchaudio',
        # 'trl',
        'transformers==4.39.2',
        'accelerate==0.29.2',
        # 'peft',
        # 'datasets==2.18.0', 
        'datasets==2.14.7',  
        'evaluate==0.4.1', 
        'bitsandbytes== 0.43.0',
        # 'einops',
        'vllm==0.4.0.post1', # my gold-ai-olympiad project uses 0.4.0.post1 ref: https://github.com/vllm-project/vllm/issues/2747
        # pip install -q -U google-generativeai

        "tqdm",
        "openai",
        "manifest-ml",
        'beautifulsoup4',
        # 'pandas',
        'cvxpy',
        # 'sklearn',The 'sklearn' PyPI package is deprecated, use 'scikit-learn' rather than 'sklearn' for pip commands.
        # 'scikit-learn',
        'snorkel',
        'snorkel-metal', 
        'tensorboardX',
        'pyyaml',
        'TexSoup',
    ]
)

And this is the error I get in bash:

(math_evaporate) brando9@skampere1~/massive-evaporation-4-math $ tree src_math_evaporate/
src_math_evaporate/
└── math_evaporate_llm_direct.py

0 directories, 1 file
(math_evaporate) brando9@skampere1~/massive-evaporation-4-math $ tree src_bm_evaporate/
src_bm_evaporate/
├── configs.py
├── evaluate_profiler.py
├── evaluate_synthetic.py
├── evaluate_synthetic_utils.py
├── massive_evaporate_4_math.egg-info
│   ├── dependency_links.txt
│   ├── PKG-INFO
│   ├── requires.txt
│   ├── SOURCES.txt
│   └── top_level.txt
├── profiler.py
├── profiler_utils.py
├── prompts_math.py
├── prompts.py
├── __pycache__
│   ├── configs.cpython-39.pyc
│   ├── prompts.cpython-39.pyc
│   └── utils.cpython-39.pyc
├── run_profiler_maf.py
├── run_profiler_math_evaporate.py
├── run_profiler.py
├── run.sh
├── schema_identification.py
├── snap_cluster_setup.egg-info
│   ├── dependency_links.txt
│   ├── PKG-INFO
│   ├── requires.txt
│   ├── SOURCES.txt
│   └── top_level.txt
├── utils.py
└── weak_supervision
    ├── binary_deps.py
    ├── __init__.py
    ├── make_pgm.py
    ├── methods.py
    ├── pgm.py
    ├── run_ws.py
    └── ws_utils.py

4 directories, 34 files
(math_evaporate) brando9@skampere1~/massive-evaporation-4-math $ pip install -e .
Obtaining file:///afs/cs.stanford.edu/u/brando9/massive-evaporation-4-math
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [8 lines of output]
      running egg_info
      creating /tmp/user/22003/pip-pip-egg-info-bqrbfkt8/massive_evaporate_4_math.egg-info
      writing /tmp/user/22003/pip-pip-egg-info-bqrbfkt8/massive_evaporate_4_math.egg-info/PKG-INFO
      writing dependency_links to /tmp/user/22003/pip-pip-egg-info-bqrbfkt8/massive_evaporate_4_math.egg-info/dependency_links.txt
      writing requirements to /tmp/user/22003/pip-pip-egg-info-bqrbfkt8/massive_evaporate_4_math.egg-info/requires.txt
      writing top-level names to /tmp/user/22003/pip-pip-egg-info-bqrbfkt8/massive_evaporate_4_math.egg-info/top_level.txt
      writing manifest file '/tmp/user/22003/pip-pip-egg-info-bqrbfkt8/massive_evaporate_4_math.egg-info/SOURCES.txt'
      error: package directory 'src_math_evaporate/weak_supervision' does not exist
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

Everything looks right to me. Why does the error happen?

I have tried:

    package_dir={
            '': 'src_math_evaporate',
            'bm_evaporate': 'src_bm_evaporate', 
        },

    package_dir={
            'math_evaporate': 'src_math_evaporate',
            'bm_evaporate': 'src_bm_evaporate', 
        },

Neither works. I also tried making both the root:

    package_dir={
            '': 'src_math_evaporate',
            '': 'src_bm_evaporate', 
        },

I don't know what else to try. What should I do?
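
A side note on the last attempt, as a minimal sketch: a Python dict literal cannot hold two values for the same key, so the second '' entry silently replaces the first before setuptools ever sees the mapping. The snippet only evaluates the literal from the question; nothing in it is specific to setuptools.

```python
# The "both are root" attempt from the question, written as a plain dict literal.
# Duplicate keys are allowed syntactically, but only the last value survives.
package_dir = {
    '': 'src_math_evaporate',
    '': 'src_bm_evaporate',
}

print(package_dir)       # {'': 'src_bm_evaporate'}
print(len(package_dir))  # 1
```

So this variant is equivalent to declaring src_bm_evaporate as the only root directory.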

python pip setuptools setup.py python-packaging
2 Answers
2 votes

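The error message is correct: there is no 'src_math_evaporate/weak_supervision' subdirectory. find_packages('src_bm_evaporate') returns the bare name weak_supervision, and the '' entry in package_dir then maps that name to src_math_evaporate. Fix it with: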

setup(
    name='massive-evaporate-4-math',
    version='0.0.1',
    package_dir={
        'math_evaporate': 'src_math_evaporate',
        'bm_evaporate': 'src_bm_evaporate',
    },
...
)

Do not use packages=... in setup.py. setuptools is perfectly able to find all packages without an explicit value for packages.
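
To make the failure concrete, here is a minimal sketch that reproduces the lookup using only the standard library. discover_packages and resolve_package_dir are hypothetical helpers written for this illustration; they imitate, in simplified form, what find_packages and the build step do, and are not setuptools functions.

```python
import os
import tempfile


def discover_packages(where):
    """Simplified stand-in for find_packages(where): dotted names, relative
    to `where`, of directories that contain an __init__.py file."""
    found = []
    for dirpath, dirnames, _ in os.walk(where):
        # Only keep (and descend into) directories that are regular packages.
        dirnames[:] = [
            d for d in dirnames
            if os.path.isfile(os.path.join(dirpath, d, "__init__.py"))
        ]
        for d in dirnames:
            rel = os.path.relpath(os.path.join(dirpath, d), where)
            found.append(rel.replace(os.sep, "."))
    return sorted(found)


def resolve_package_dir(package, package_dir):
    """Simplified stand-in for the directory lookup: use the longest matching
    prefix in package_dir, otherwise fall back to the '' (root) entry."""
    parts = package.split(".")
    tail = []
    while parts:
        key = ".".join(parts)
        if key in package_dir:
            return os.path.join(package_dir[key], *tail)
        tail.insert(0, parts.pop())
    root = package_dir.get("")
    return os.path.join(root, *tail) if root else os.path.join(*tail)


with tempfile.TemporaryDirectory() as root:
    # Recreate only the relevant part of the layout from the question.
    os.makedirs(os.path.join(root, "src_math_evaporate"))
    ws = os.path.join(root, "src_bm_evaporate", "weak_supervision")
    os.makedirs(ws)
    open(os.path.join(ws, "__init__.py"), "w").close()

    packages = (discover_packages(os.path.join(root, "src_math_evaporate"))
                + discover_packages(os.path.join(root, "src_bm_evaporate")))

package_dir = {"": "src_math_evaporate", "bm_evaporate": "src_bm_evaporate"}
resolved = {name: resolve_package_dir(name, package_dir) for name in packages}
print(packages)
print(resolved)
```

The discovered name carries no trace of the directory it was found in, so the root entry of package_dir sends it to src_math_evaporate, which is the path named in the error.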

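By the way: although it is not required for building or installing, it is best to add __init__.py files to all package and module directories. With the current directory structure, submodules are hard to discover.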
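
As a rough aid for that cleanup, the following sketch lists directories that contain .py files but no __init__.py. The function name dirs_missing_init and the small demo tree are assumptions made for illustration; point the function at your own project root instead.

```python
import tempfile
from pathlib import Path


def dirs_missing_init(root):
    """Return directories under `root` (including `root` itself) that hold
    .py files but have no __init__.py, as POSIX-style relative paths."""
    root = Path(root)
    candidates = [root] + sorted(p for p in root.rglob("*") if p.is_dir())
    missing = []
    for path in candidates:
        # Skip build and cache residue such as the directories in the tree output.
        if path.name == "__pycache__" or path.name.endswith(".egg-info"):
            continue
        has_py = any(child.suffix == ".py" for child in path.iterdir())
        if has_py and not (path / "__init__.py").exists():
            missing.append(path.relative_to(root).as_posix())
    return missing


with tempfile.TemporaryDirectory() as tmp:
    # Tiny demo tree modelled on the question's layout.
    pkg = Path(tmp) / "src_bm_evaporate"
    (pkg / "weak_supervision").mkdir(parents=True)
    (pkg / "utils.py").touch()
    (pkg / "weak_supervision" / "__init__.py").touch()
    (pkg / "weak_supervision" / "pgm.py").touch()

    missing = dirs_missing_init(tmp)

print(missing)
```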


0 votes

[WORKS] Let me demonstrate the packaging process using two src directories without hampering imports.

Project structure

.
├── MANIFEST.in
├── README.md
├── pyproject.toml
├── requirements.txt
├── setup.cfg
├── setup.py
├── src_bm_evaporate
│   ├── __init__.py
│   ├── module1
│   │   ├── __init__.py
│   │   └── main.py
│   └── module2
│       └── main.py
└── src_math_evaporate
    ├── __init__.py
    └── random_string.py

Evaporate package files

src_math_evaporate

__init__.py

from .random_string import generate_random_string


__all__ = ["generate_random_string"]

random_string.py

import random
from string import ascii_letters, digits


def generate_random_string(size: int = 32):
    choices = ascii_letters + digits
    return "".join([random.choice(choices) for _ in range(size)])

src_bm_evaporate

__init__.py

from .module1.main import generate_16_digit_random_str
from .module2.main import generate_8_digit_random_str


__all__ = [
    "generate_16_digit_random_str",
    "generate_8_digit_random_str"
]

module1/main.py

""" Both imports work. """
# from src_math_evaporate import generate_random_string
from src_math_evaporate.random_string import generate_random_string


def generate_16_digit_random_str():
    return generate_random_string(size=16)

module2/main.py

""" Both imports work. """
# from src_math_evaporate.random_string import generate_random_string
from src_math_evaporate import generate_random_string


def generate_8_digit_random_str():
    return generate_random_string(size=8)
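
Both import styles resolve to the same function because __init__.py re-exports it. The sketch below checks this on a throwaway package; demo_pkg and the stub function body are hypothetical stand-ins for src_math_evaporate, chosen so the snippet runs on its own.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "demo_pkg"  # hypothetical package name
    pkg.mkdir()
    (pkg / "random_string.py").write_text(
        "def generate_random_string(size=32):\n"
        "    return 'x' * size\n"
    )
    # Re-export the function at package level, as in the answer's __init__.py.
    (pkg / "__init__.py").write_text(
        "from .random_string import generate_random_string\n"
        "__all__ = ['generate_random_string']\n"
    )

    sys.path.insert(0, tmp)
    try:
        via_package = importlib.import_module("demo_pkg").generate_random_string
        via_module = importlib.import_module(
            "demo_pkg.random_string"
        ).generate_random_string
    finally:
        sys.path.remove(tmp)

# Both names point at one and the same function object.
same_object = via_package is via_module
print(same_object)
```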

打包文件

MANIFEST.in

include setup.py
include MANIFEST.in
include README.md

graft src_math_evaporate
graft src_bm_evaporate

pyproject.toml

[build-system]
# These are the assumed default build requirements from pip:
# https://pip.pypa.io/en/stable/reference/pip/#pep-517-and-518-support
requires = ["setuptools>=61.0", "wheel"]
build-backend = "setuptools.build_meta"

requirements.txt

-e .

setup.cfg

[metadata]
name = Evaporate
version = 1.0.0
url = 
license = 
description = ""
long_description = file: README.md
long_description_content_type = text/markdown
classifiers=
   Programming Language :: Python :: 3 :: Only
   Programming Language :: Python :: Implementation


[options]
package_dir =
    bm_evaporate = src_bm_evaporate
    math_evaporate = src_math_evaporate
packages = find:
include_package_data = True

setup.py

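from setuptools import setup, find_packages

setup(
    name="Evaporate",
    version="1.0.0",
    # `where` takes a single directory, not a list, so search the project
    # root and select both top-level packages with `include` patterns.
    packages=find_packages(
        include=["src_bm_evaporate*", "src_math_evaporate*"],
    ),
)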

Installing the project

$ Project via v3.9.6 (venv) on
❯ pip install -r requirements.txt 
Obtaining file:///Users/rk4bir/Workspace/stackoverflow/Project (from -r requirements.txt (line 1))
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Preparing editable metadata (pyproject.toml) ... done
Building wheels for collected packages: Evaporate
  Building editable for Evaporate (pyproject.toml) ... done
  Created wheel for Evaporate: filename=Evaporate-1.0.0-0.editable-py3-none-any.whl size=2663 sha256=776b86201258a51485a3a813ad3b77ed0f3084eff7f446c5aca5fc120475ea9c
  Stored in directory: /private/var/folders/tp/rpcspmx561bcqm4b82j43yym0000gn/T/pip-ephem-wheel-cache-eulzmur2/wheels/94/12/8f/da50fb6f83490ee21fe4e2791ad3320258060f4b4995742f2c
Successfully built Evaporate
Installing collected packages: Evaporate
Successfully installed Evaporate-1.0.0
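
After installing, one hedged way to check the result is to ask Python which top-level names it can locate. The helper name importable is made up for this sketch. The candidate names come from the directory names and the package_dir keys above, and which of them resolve depends on your setuptools version and on the directory you run the check from.

```python
import importlib.util


def importable(names):
    """Map each top-level module name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


candidates = [
    "src_math_evaporate",
    "src_bm_evaporate",
    "math_evaporate",
    "bm_evaporate",
]
for name, found in importable(candidates).items():
    print(f"{name}: {'found' if found else 'not found'}")
```

Running it from a directory other than the project root avoids false positives from the current working directory being on sys.path.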

Hope this helps.
