Locking test fixtures that download and write data when using pytest-xdist


I have written Python tests with pytest. These tests download test data and cache it in local files.

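I am now running the tests in parallel with pytest-xdist. How can I prevent parallel writes in test fixtures, which cause data corruption and test failures?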

Ideally, only one test process should download the data and cache it as a file.

parallel-processing pytest locking pytest-xdist
1 Answer

You can use the filelock library to create a lock file for each download that happens in a test fixture or a test.

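  • Create a lock file that blocks reads and writes for all but a single process
  • The first process to acquire the lock downloads the data and writes the file
  • Subsequent processes return the cached data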
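The three steps above can be sketched directly with filelock's FileLock, before wrapping them in a reusable helper. This is only a minimal illustration: get_or_download, simulate_workers and fake_download are hypothetical names, and threads stand in for pytest-xdist worker processes so that the snippet runs on its own.

```python
import tempfile
import threading
from pathlib import Path

from filelock import FileLock  # third-party: pip install filelock


def get_or_download(path: Path, download, timeout: float = 120) -> bytes:
    """Return the cached bytes at `path`, calling `download()` only if missing."""
    lock = FileLock(str(path) + ".lock", timeout=timeout)
    with lock:  # blocks until no other worker holds the lock
        if not path.exists():
            # First worker to get here: fetch the data and cache it
            path.write_bytes(download())
        # Later workers find the file and skip the download
        return path.read_bytes()


def simulate_workers(n: int = 4) -> tuple[int, list[bytes]]:
    """Run `n` threads standing in for xdist workers; return (downloads, results)."""
    calls = []
    results = []

    def fake_download() -> bytes:
        # Hypothetical stand-in for a real HTTP fetch
        calls.append(1)
        return b"test-data"

    def worker() -> None:
        results.append(get_or_download(cache, fake_download))

    with tempfile.TemporaryDirectory() as tmp:
        cache = Path(tmp) / "shared.bin"
        threads = [threading.Thread(target=worker) for _ in range(n)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    return len(calls), results
```

Calling simulate_workers(4) should report a single download, with every worker receiving the same bytes. The existence check sits inside the lock on purpose: checking before acquiring it would let two workers both decide the file is missing.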

Here is an example function, wait_other_writers(), that achieves the above:

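import os
from contextlib import contextmanager
from pathlib import Path

from filelock import FileLock


@contextmanager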
def wait_other_writers(path: Path | str, timeout=120):
    """Wait for other potential writers writing the same file.

    - Work around issues when parallel unit tests and such
      try to write the same file

    Example:

    .. code-block:: python

        import tempfile
        import urllib.request
        from pathlib import Path

        import pytest
        import pandas as pd

        @pytest.fixture()
        def my_cached_test_data_frame() -> pd.DataFrame:

            # All tests use a cached dataset stored in the /tmp directory
            path = Path(tempfile.gettempdir()) / "my_shared_data.parquet"

            with wait_other_writers(path):

                # Download only if no previous writer has cached the file yet
                if not path.exists():
                    # Download and write to cache
                    urllib.request.urlretrieve("https://example.com", path)

                return pd.read_parquet(path)

    :param path:
        File that is being written

    :param timeout:
        How many seconds to wait to acquire the lock file.

        Default 2 minutes.
    """

    if isinstance(path, str):
        path = Path(path)

    assert isinstance(path, Path), f"Not Path object: {path}"

    assert path.is_absolute(), f"Did not get an absolute path: {path}\n" \
                               "Please use absolute paths for lock files to prevent polluting the local working directory."

    # If we are writing to a new temp folder, create any parent paths
    os.makedirs(path.parent, exist_ok=True)

    # https://stackoverflow.com/a/60281933/315168
    lock_file = path.parent / (path.name + '.lock')

    lock = FileLock(lock_file, timeout=timeout)
    with lock:
        yield

For an example usage of this function, see here
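If the data only needs to be produced once per test run, a session-scoped fixture is another option. The sketch below follows the recipe in the pytest-xdist documentation for running a session-scoped fixture only once. It assumes pytest-xdist is installed, since the worker_id fixture comes from it; produce_expensive_data and load_or_create are hypothetical placeholder names, and the JSON payload stands in for the real download.

```python
import json
from pathlib import Path

import pytest
from filelock import FileLock  # third-party: pip install filelock


def produce_expensive_data() -> dict:
    # Hypothetical placeholder for the real download
    return {"rows": [1, 2, 3]}


def load_or_create(cache_file: Path, produce=produce_expensive_data) -> dict:
    """Create `cache_file` under a lock if it is missing, then return its JSON content."""
    with FileLock(str(cache_file) + ".lock"):
        if cache_file.is_file():
            # Another worker already produced the data
            return json.loads(cache_file.read_text())
        data = produce()
        cache_file.write_text(json.dumps(data))
        return data


@pytest.fixture(scope="session")
def session_data(tmp_path_factory, worker_id) -> dict:
    if worker_id == "master":
        # Not running under multiple workers: pytest's fixture caching is enough
        return produce_expensive_data()
    # Each worker's base temp directory sits inside a directory shared by all workers
    root_tmp_dir = tmp_path_factory.getbasetemp().parent
    return load_or_create(root_tmp_dir / "data.json")
```

Because the cache lives under pytest's per-run temporary directory, this variant re-downloads the data on every test run, whereas a fixed path such as the one used with wait_other_writers() keeps the cache across runs.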
