Removing content from a YAML file with Python while preserving the original structure

Question (votes: 0, answers: 2)

I have a YAML file. I want my script to remove every "repository" entry that is not in a list of strings I define. My script:

import yaml

core_repos = ["REPO1",
              "REPO2"]

if __name__ == "__main__":
    yml_file_name = "azure-pipelines.yml"
    with open(yml_file_name, 'r') as yml_file:
        yml_content = yaml.safe_load(yml_file)
    repositories = yml_content.get("resources", {}).get("repositories", [])
    filtered_repositories = [repo for repo in repositories if repo.get("repository") in core_repos]
    yml_content["resources"]["repositories"] = filtered_repositories

    with open(yml_file_name, 'w') as f:
        yaml.safe_dump(yml_content, f, default_flow_style=False)

Original file:

trigger:
  - release/test

pool:
  name: <REDACTED>-Linux
  demands:
    - agent.name -equals  <REDACTED>

# Overrides the value for Build.BuildNumber, which is used to name the artifact (ZIP file) that is produced
name: '$(Date:yyyyMMdd)T$(Hours)$(Minutes)$(Seconds)'

resources:
  repositories:
    - repository: REPO1
      type: git
      ref: release/test
      name: <REDACTED>/REPO1
      trigger:
        branches:
          include:
            - release/test

    - repository: REPO2
      type: git
      ref: release/test
      name: <REDACTED>/REPO2
      trigger:
        branches:
          include:
            - release/test

    - repository: REPO3
      type: git
      ref: release/test
      name: <REDACTED>/REPO3
      trigger:
        branches:
          include:
            - release/test

    - repository: REPO4
      type: git
      ref: release/test
      name: <REDACTED>/REPO4
      trigger:
        branches:
          include:
            - release/test

stages:
  - stage: 'BuildAndUploadArtifact'
    jobs:
      - job:
        workspace:
          clean: all
        steps:
          - checkout: self
          # Core repos
          - checkout: REPO1
          - checkout: REPO2
          - checkout: REPO3
          - checkout: REPO4

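After running the script, my main goal seems to be accomplished, but the output looks very wrong in several respects. To name a few: the trigger ends up at the bottom, and my comments are lost entirely. What causes this?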

name: $(Date:yyyyMMdd)T$(Hours)$(Minutes)$(Seconds)
pool:
  demands:
  - agent.name -equals  <REDACTED>
  name: <REDACTED>-Linux
resources:
  repositories:
  - name: <REDACTED>/REPO1
    ref: release/test
    repository: REPO1
    trigger:
      branches:
        include:
        - release/test
    type: git
  - name: <REDACTED>/REPO2
    ref: release/test
    repository: REPO2
    trigger:
      branches:
        include:
        - release/test
    type: git
stages:
- jobs:
  - job: null
    steps:
    - checkout: self
    - checkout: REPO1
    - checkout: REPO2
    - checkout: REPO3
    - checkout: REPO4
    workspace:
      clean: all
  stage: BuildAndUploadArtifact
trigger:
- release/test
python yaml pyyaml
2 Answers
0 votes

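When you load the original file, manipulate the resulting YAML object, and then dump it to a file of the same name, you should not expect a lightly edited copy of the original. You get a new YAML file that represents the structure you want but does not necessarily have the same formatting as the original.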

In your case, all keys end up sorted alphabetically, and the load method ignores comments, so they exist neither in the YAML object nor in the resulting file.
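The two effects described above can be reproduced on a tiny input. The following is a minimal sketch, assuming PyYAML 5.1 or newer (where the dump functions accept sort_keys); the sample text and variable names are made up for illustration.

```python
import yaml

# A small stand-in for the pipeline file; the comment and key order matter here.
source = """\
trigger:
  - release/test
# a comment that the loader will drop
name: build
"""

data = yaml.safe_load(source)

# Default behaviour: mapping keys are sorted alphabetically on dump.
sorted_output = yaml.safe_dump(data, default_flow_style=False)

# sort_keys=False keeps the order in which the keys were loaded.
ordered_output = yaml.safe_dump(data, default_flow_style=False, sort_keys=False)

print(sorted_output)
print(ordered_output)
```

Passing sort_keys=False only addresses the ordering. The comment is gone in both outputs because it never reaches the loaded object, and the original quoting, indentation, and blank lines are not restored either.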

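If you really want the result to look as if you had edited the file by hand, you need to treat the file as plain text and filter out the lines you want to delete. That takes more regular expressions and logic, though.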
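A rough sketch of that text-based approach, using only the standard library, might look like the following. The function name filter_repositories and the sample text are hypothetical, and the code assumes each entry starts with a "- repository: NAME" line and that everything indented deeper than the dash belongs to that entry.

```python
import re

# Matches the first line of a list item such as "    - repository: REPO3".
ITEM_RE = re.compile(r"^( *)- +repository: *(\S+)\s*$")


def filter_repositories(text, keep):
    """Drop every '- repository: NAME' block whose NAME is not in keep."""
    kept_lines = []
    skip_indent = None  # indentation of the entry being dropped, if any
    for line in text.splitlines(keepends=True):
        match = ITEM_RE.match(line)
        if match:
            indent, name = len(match.group(1)), match.group(2)
            skip_indent = None if name in keep else indent
        elif skip_indent is not None and line.strip():
            # A non-blank line indented no deeper than the dash ends the block.
            current_indent = len(line) - len(line.lstrip())
            if current_indent <= skip_indent:
                skip_indent = None
        if skip_indent is None:
            kept_lines.append(line)
    return "".join(kept_lines)


sample = """\
resources:
  repositories:
    - repository: REPO1
      type: git

    # keep this comment
    - repository: REPO3
      type: git

stages:
  - stage: Build
"""

result = filter_repositories(sample, {"REPO1"})
print(result)
```

Lines outside the dropped entries are copied through untouched, so comments, quoting, and key order survive. The trade-off is that the filter knows nothing about YAML itself: flow-style lists, anchors, or a differently shaped entry would need extra handling.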


0 votes

I fixed it myself. This can be done with ruamel.yaml, which keeps comments and key order when round-tripping:

from ruamel.yaml import YAML

core_repos = ["REPO1", "REPO2"]

if __name__ == "__main__":
    yml_file_name = "azure-pipelines.yml"

    yaml = YAML()
    yaml.preserve_quotes = True
    with open(yml_file_name, 'rb') as yml_file:
        yml_content = yaml.load(yml_file)

        repositories = yml_content.get("resources", {}).get("repositories", [])
        filtered_repositories = [repo for repo in repositories if repo.get("repository") in core_repos]
        yml_content["resources"]["repositories"] = filtered_repositories

    with open(yml_file_name, 'wb') as f:
        yaml.dump(yml_content, f)