Python cannot find any file after the first one in a loop

Question

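I have a script that contains a for loop over the files in a directory. The script and the directory are located in the same directory on my machine. The directory name is passed to the script as an argument.

I open each file (yml), read its contents into a dictionary, and then call methods from my code using the contents of that dictionary. The first iteration (on the first file) runs without error, but in the second iteration Python throws "FileNotFoundError: [Errno 2] No such file or directory" on the line that opens the file.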

config_dir = args.param  # this is the directory which contains the yml files
config_file_names = os.listdir(config_dir)  # this is a list of said files

for config_file_name in config_file_names:
    # here the current file is concatenated with the directory to form a path
    # from the working directory of the script to the file
    config_file_path = os.path.join(config_dir, config_file_name)
    with open(config_file_path) as f:  # LINE OF ERROR IN SECOND LOOP
        print(config_file_path)
        config = yaml.load(f, Loader=SafeLoader)
    # --------------- MORE CODE ---------------

I thought this was an error in how the path is built. But building it differently did not change anything:

config_file_path = config_dir + "/" + config_file_name

If I add a continue after the code shown above, but before the rest of the code that I omitted, the code runs without error:

parser = argparse.ArgumentParser()

parser.add_argument("-p", "--param", help="Path to parameter file directory", required=True)

args = parser.parse_args()

config_dir = args.param  # this is the directory which contains the yml files
config_file_names = os.listdir(config_dir)  # this is a list of said files

for config_file_name in config_file_names:
    # here the current file is concatenated with the directory to form a path
    # from the working directory of the script to the file
    config_file_path = os.path.join(config_dir, config_file_name)
    with open(config_file_path) as f:
        print(config_file_path)
        config = yaml.load(f, Loader=SafeLoader)
        continue  # this fixes it
    # --------------- MORE CODE ---------------

This produces the following console output:

C:\Users\user\mambaforge\envs\traintool\python.exe C:\Users\user\PycharmProjects\traintool\src\multiple_runs.py -p opt_configs 
2023-10-11 14:16:05.419007: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
1 Physical GPUs, 1 Logical GPUs
opt_configs\config_alpha_proteo_class.yml
2023-10-11 14:16:06.069136: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1616] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 5987 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 2070, pci bus id: 0000:01:00.0, compute capability: 7.5
opt_configs\config_amp_class.yml
opt_configs\config_biofilm_class.yml
opt_configs\config_cytotoxic_class.yml
opt_configs\config_proteo_class.yml
opt_configs\config_pseudo_biofilm_class.yml
opt_configs\config_p_aeruginosa_class.yml

Process finished with exit code 0

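Never mind the TensorFlow details; these are always printed by this particular third-party package.

This suggests that the code that follows is the cause. But according to PyCharm's Find Usages feature, these are the only lines that read or write the relevant variables: config_dir, config_file_names, config_file_name, config_file_path

I do not modify these variables anywhere in my project, and I have no idea what could affect the iteration over these files. I tried deleting the first file. The error then occurs on the third file (which, of course, is the second file in that case). I cannot reproduce the error outside the program with example code.

The project's code is spread across multiple scripts, so sharing all of it is not feasible. I only use the contents of the dictionary created by the yaml loader. I hope someone knows how anything could possibly affect reading one file after another. I tried it on two machines (home PC and server cluster) with no difference.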

Python 3.9.13

Edit: here is the rest of the script file:

import argparse
import yaml
from yaml.loader import SafeLoader
import os

def main():
    parser = argparse.ArgumentParser()

    parser.add_argument("-p", "--param", help="Path to parameter file directory", required=True)

    args = parser.parse_args()

    config_dir = args.param
    config_file_names = os.listdir(config_dir)

    for config_file_name in config_file_names:
        config_file_path = os.path.join(config_dir, config_file_name)
        with open(config_file_path) as f:
            print(config_file_path)
            config = yaml.load(f, Loader=SafeLoader)
        continue

        assert config["modus"] in ("train", "selection"), (
            "Illegal Argument. modus must be either train or selection."
        )

        if config["modus"] == "train":
            for model in config["model"]:
                for loss in config["loss"]:
                    for optimizer in config["optimizer"]:
                        train.train_seq(
                            trainfile=config["train_file"],
                            lr=config["learning_rate"],
                            epochs=config["epochs"],
                            batch_size=config["batch_size"],
                            loss=loss,
                            optimizer=optimizer,
                            model_name=model,
                            encoding=config["encoding"],
                            patience=config["patience"],
                            cv_split=config["cv_split"],
                            val_files=config["val_file"],
                            test_size=config["test_size"],
                            early_stopping=config["early_stopping"],
                            monitor=config["monitor"],
                            shuffle=config["shuffle"],
                            validation_split=config["validation_split"],
                            x_values=config["x_values"],
                            y_values=config["y_values"],
                            regression=config["regression"],
                            momentum=config["momentum"],
                            sep=config["sep"]
                        )
        elif config["modus"] == "selection":
            utils.find_best_model(config["path_to_models"], config["selection_criterion"])


if __name__ == "__main__":
    import utils
    utils.enable_gpu_mem_growth()
    import train
    main()

As far as I can tell, everything comes from the file contents in the dictionary config.

Answer 1

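Somewhere deep in the project, the working directory gets changed. Because the directory is passed as a relative path (opt_configs), open() resolves it against whatever the working directory is at that moment, so the path no longer points at the files once the working directory has changed. Of course, this is not visible from the part of the script posted above.

I solved the problem by making this change: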
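The effect can be reproduced in isolation. The following is only a minimal sketch, not the project's code: the helper names list_and_open and demo, the file names, and the "elsewhere" directory are all made up for illustration, and the lambda stands in for whatever hidden call changes the working directory.

```python
import os
import tempfile


def list_and_open(config_dir, side_effect):
    """Open every file in config_dir, calling side_effect() after each one.

    Returns (opened, failed): the file names that opened and those that
    raised FileNotFoundError.
    """
    opened, failed = [], []
    for name in sorted(os.listdir(config_dir)):
        path = os.path.join(config_dir, name)  # still a relative path
        try:
            with open(path) as f:
                f.read()
            opened.append(name)
        except FileNotFoundError:
            failed.append(name)
        side_effect()  # stand-in for the "more code" part of the loop
    return opened, failed


def demo():
    """Build a throwaway directory tree and run the loop against it."""
    start = os.getcwd()
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "opt_configs"))
        os.makedirs(os.path.join(root, "elsewhere"))
        for name in ("a.yml", "b.yml", "c.yml"):
            with open(os.path.join(root, "opt_configs", name), "w") as f:
                f.write("modus: train\n")
        try:
            os.chdir(root)
            # The side effect silently moves the process to another directory.
            return list_and_open(
                "opt_configs",
                lambda: os.chdir(os.path.join(root, "elsewhere")),
            )
        finally:
            os.chdir(start)  # leave the temp tree before it is deleted


if __name__ == "__main__":
    print(demo())  # (['a.yml'], ['b.yml', 'c.yml'])
```

Only the first file opens. Every later open() fails because "opt_configs" is now looked up relative to the new working directory, which matches the behaviour described in the question.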

base_dir = os.getcwd()

for config_file_name in config_file_names:
    os.chdir(base_dir)

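This resets the working directory to the original directory from before the loop. I found the cause by reading the working directory inside the loop and inspecting it in the debugger. If you run into the same situation, search your project for import os, os.chdir, or os.getcwd. These may lead you to the cause. IMO changing the working directory is bad practice, and my problem shows why I think so.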
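An alternative that does not depend on resetting the working directory is to resolve the directory to an absolute path once, before the loop. This is a sketch under assumptions rather than the fix used above: absolute_config_paths and demo are hypothetical names, and the os.chdir call inside the loop merely simulates the hidden directory change.

```python
import os
import tempfile


def absolute_config_paths(config_dir):
    """Return absolute paths of all entries in config_dir (hypothetical helper)."""
    base = os.path.abspath(config_dir)  # resolved once, against the cwd at call time
    return [os.path.join(base, name) for name in sorted(os.listdir(base))]


def demo():
    """Open every config file even though the cwd changes after each one."""
    start = os.getcwd()
    opened = []
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "opt_configs"))
        os.makedirs(os.path.join(root, "elsewhere"))
        for name in ("a.yml", "b.yml", "c.yml"):
            with open(os.path.join(root, "opt_configs", name), "w") as f:
                f.write("modus: train\n")
        try:
            os.chdir(root)
            for path in absolute_config_paths("opt_configs"):
                with open(path) as f:
                    f.read()
                opened.append(os.path.basename(path))
                # Stand-in for the hidden chdir somewhere in the project.
                os.chdir(os.path.join(root, "elsewhere"))
        finally:
            os.chdir(start)  # leave the temp tree before it is deleted
    return opened


if __name__ == "__main__":
    print(demo())  # ['a.yml', 'b.yml', 'c.yml']
```

With absolute paths, all three files open regardless of where the process has been moved. In the script from the question, the equivalent change would be to wrap args.param in os.path.abspath before calling os.listdir.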
