Splitting a large text file containing multiple headers and footers in PowerShell

Problem description

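I am using a PowerShell script to split a daily text file that contains multiple headers and footers. Header records always start with "0" and footer records start with "9". My script splits the original file into N output files, starting a new one each time it finds a header record. The problem appears when the original text file is large: my script uses the Get-Content cmdlet, which I know is not very efficient for large files, but I am new to PowerShell and don't know how to use the alternatives. Can you help?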

Here is my script:

# Define input and output paths
$inputFilePath = "D:\TEMP\Input\MyFile.TXT"
$outputFolderPath = "D:\TEMP\Output"

# Read the input file
$content = Get-Content $inputFilePath

# Initialize the output file content, filename and counter
$outputFileContent = ""
$outputFileName = ""
$fileCounter = 0

# Loop through each line in the input file
foreach ($line in $content) {
    # If the line starts with "0C2B", create a new output file with a timestamp
    if ($line.StartsWith("0C2B")) {
        # If there is an existing output file content, write it to a file
        if ($outputFileContent) {
            $outputFileName = "MyFile$fileCounter$((Get-Date).ToString("HHmmssfff")).txt"
            $outputFilePath = Join-Path $outputFolderPath $outputFileName
            Set-Content $outputFilePath $outputFileContent.TrimEnd("`r`n")
            $fileCounter++
            $outputFileContent = ""
        }
    }

    # Append the line to the current output file content
    $outputFileContent += "$line`r`n"
}

# If there is remaining output file content, write it to a file
if ($outputFileContent) {
    $outputFileName = "MyFile$fileCounter$((Get-Date).ToString("HHmmssfff")).txt"
    $outputFilePath = Join-Path $outputFolderPath $outputFileName
    Set-Content $outputFilePath $outputFileContent.TrimEnd("`r`n")
}
powershell split header footer large-files
1 Answer

How about using System.IO.StreamReader to read the file line by line, to reduce memory consumption?

# Define input and output paths
$inputFilePath = "D:\TEMP\Input\MyFile.TXT"
$outputFolderPath = "D:\TEMP\Output"

# Initialize the output file content, filename and counter
$outputFileContent = ""
$outputFileName = ""
$fileCounter = 0

# Create a StreamReader to read the input file line by line
$streamReader = New-Object System.IO.StreamReader($inputFilePath)

# Loop through each line in the input file
while (-not $streamReader.EndOfStream) {
    $line = $streamReader.ReadLine()

    # If the line starts with "0C2B", create a new output file with a timestamp
    if ($line.StartsWith("0C2B")) {
        # If there is an existing output file content, write it to a file
        if ($outputFileContent) {
            $outputFileName = "MyFile$fileCounter$((Get-Date).ToString("HHmmssfff")).txt"
            $outputFilePath = Join-Path $outputFolderPath $outputFileName
            Set-Content $outputFilePath $outputFileContent.TrimEnd("`r`n")
            $fileCounter++
            $outputFileContent = ""
        }
    }

    # Append the line to the current output file content
    $outputFileContent += "$line`r`n"
}

# If there is remaining output file content, write it to a file
if ($outputFileContent) {
    $outputFileName = "MyFile$fileCounter$((Get-Date).ToString("HHmmssfff")).txt"
    $outputFilePath = Join-Path $outputFolderPath $outputFileName
    Set-Content $outputFilePath $outputFileContent.TrimEnd("`r`n")
}

# Close the StreamReader
$streamReader.Close()
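
One possible refinement, offered as a sketch rather than a tested drop-in: the script above reads line by line, but it still builds each section up in a string with += before writing it, so a very large section is held in memory and the repeated concatenation gets slow. The variant below writes every line straight to a System.IO.StreamWriter for the current section instead. The function name Split-ByHeader and its parameter names are hypothetical, chosen only for this illustration, and the [Type]::new() syntax assumes PowerShell 5 or later. Note also that StreamWriter writes UTF-8 without a BOM by default, which may differ from what Set-Content produces on your system, and that .NET resolves relative paths against the process working directory, so absolute paths are the safer choice.

```powershell
# Hypothetical helper (not part of the original answer): streams each line
# directly to the current output file instead of buffering a whole section.
function Split-ByHeader {
    param(
        [Parameter(Mandatory)] [string] $InputFilePath,
        [Parameter(Mandatory)] [string] $OutputFolderPath,
        [string] $HeaderPrefix = '0C2B',   # same prefix the scripts above test for
        [string] $BaseName = 'MyFile'      # assumed output name stem
    )

    $fileCounter = 0
    $writer = $null
    $outputFiles = [System.Collections.Generic.List[string]]::new()
    $reader = [System.IO.StreamReader]::new($InputFilePath)
    try {
        # ReadLine() returns $null once the end of the file is reached
        while ($null -ne ($line = $reader.ReadLine())) {
            $isHeader = $line.StartsWith($HeaderPrefix, [System.StringComparison]::Ordinal)

            # Start a new output file on every header, and also on the very
            # first line so that nothing before the first header is lost.
            if ($null -eq $writer -or $isHeader) {
                if ($null -ne $writer) { $writer.Dispose() }
                $stamp = (Get-Date).ToString('HHmmssfff')
                $path = Join-Path $OutputFolderPath "$BaseName$fileCounter$stamp.txt"
                $writer = [System.IO.StreamWriter]::new($path)
                $outputFiles.Add($path)
                $fileCounter++
            }

            $writer.WriteLine($line)
        }
    }
    finally {
        # Release both file handles even if an error occurs mid-file
        if ($null -ne $writer) { $writer.Dispose() }
        $reader.Dispose()
    }

    # Emit the paths of the files that were written
    $outputFiles
}
```

With the paths from the question, it could be called like this:

```powershell
Split-ByHeader -InputFilePath 'D:\TEMP\Input\MyFile.TXT' -OutputFolderPath 'D:\TEMP\Output'
```

The try/finally block is there so that the reader and the last writer are closed even when something fails partway through, which the original script does not guarantee.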
