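Perl: Find and replace a specific string in multiple text files

Question (votes: 11, answers: 6)

I need to get all the .config files in a given directory, and in each file I need to search for a specific string and replace it with another one that depends on the file.

For example, if I have 3 files in the given directory: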

 for  my_foo.config - string to search "fooCommon >" replace with "~ /fooCommon[\/ >"
 for  my_bar.config - string to search "barCommon >" replace with "~ /barCommon[\/ >"
 for  my_file.config - string to search "someCommon >" replace with "~ /someCommon[\/ >"

Please tell me how this can be done in Perl.

Below is the code I tried as a shell script:

OLD="\/fooCommon >"
NEW="~ \"\/fooCommon[^\/]*\" >"
DPATH="/myhome/aru/conf/host*.conf"
BPATH="/myhome/aru/conf/bakup"
TFILE="/myhome/aru/out.tmp.$$"
[ ! -d $BPATH ] && mkdir -p $BPATH || :
for f in $DPATH
do
  if [ -f $f -a -r $f ]; then
   /bin/cp -f $f $BPATH
   echo sed \"s\/$OLD\/$NEW\/g\"
   sed "s/$OLD/$NEW/g" "$f" > $TFILE && mv $TFILE "$f"
  else
   echo "Error: Cannot read $f"
  fi
done
/bin/rm $TFILE
regex perl shell replace
6 Answers

24 votes

If you are on a Unix-like platform, you can do this with Perl on the command line; there is no need to write a script.

perl -i -p -e 's/old/new/g;' *.config

To be safer, you may want to use the command with the backup option, which keeps each original file under a .bak suffix.

perl -i.bak  -p -e 's/old/new/g;' *.config

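10 votes

Perl here is only being used to modify the files... I don't see why you would write the whole thing in Perl when you can do it as simply as this: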

find . -maxdepth 1 -type f -name '*.conf' | \
    xargs perl -i.bak -pe 's/localhost/example.com/;'

2 votes

If you really need to do this in Perl alone, which I don't recommend since excellent and simpler answers have already been posted, here goes:

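#!/usr/bin/perl
use strict;
use warnings;

# take the directory to be processed from the first command line argument
my $dir = shift @ARGV;
die "Usage: $0 directory\n" unless defined $dir;
opendir(my $dh, $dir) or die "Cannot open directory $dir: $!\n";
# take only relevant files, i.e. "*.config"
my @cfgs = grep { /\.config$/ } readdir($dh);
closedir($dh);
# loop through files
foreach my $name (@cfgs) {
  # generate the source string from the filename (my_foo.config -> fooCommon);
  # skip files whose names do not follow that pattern
  my ($s) = $name =~ /.*_(\w+)\.config$/ or next;
  $s = "${s}Common";
  # generate the replacement string from the filename
  my $r = "~ /${s}[/ >";
  my $file = "$dir/$name";
  # move the original file to a backup
  rename($file, "$file.bak") or die "Cannot rename $file: $!\n";
  # open the backup file for reading
  open(my $in, '<', "$file.bak") or die "Cannot read $file.bak: $!\n";
  # open a new file, with the original name, for writing
  open(my $out, '>', $file) or die "Cannot write $file: $!\n";
  # go through the file, replacing strings; the pattern includes the
  # trailing " >" from the question so it is not duplicated in the output,
  # and \Q...\E keeps the search text literal
  while (my $line = <$in>) {
    $line =~ s/\Q$s\E >/$r/g;
    print $out $line;
  }
  # close files
  close($in);
  close($out) or die "Cannot close $file: $!\n";
}

# end of file.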

Note that doing this with plain find and/or shell globbing is much simpler. Still, take this as a small tutorial on how to process files in Perl.


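1 vote

While this can be done from the command line, sometimes you just want an easy-to-use script that gives more useful output. With that in mind, here is a Perl solution with friendly output for anyone who comes across this question.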

#!/usr/bin/env perl

# subst [-v] [-f] "re/string to find" "string to replace" -- list of files
#  optional -v flag shows each line with replacement, must be 1st arg to script
#  optional -f flag says to disable regexp functionality and make the strings match exactly
#  replacement string may include back references ($1, $2, etc) to items in "string to find" if they are surrounded by grouping parentheses

use strict;
use warnings;
use List::Util;
use IO::File;
use Fcntl;
use Getopt::Long qw(GetOptions);

my $verbose = 0;
my $fixed   = 0;

GetOptions("v" => \$verbose,
           "f" => \$fixed);

my $find    = shift @ARGV;
my $replace = shift @ARGV;

die "Error: missing 1st arg, string to find\n"         if not defined $find;
die "Error: missing 2nd arg, string to replace with\n" if not defined $replace;
die "No files were specified\n"                        if @ARGV == 0;

# open a temp file for writing changes to
my $TEMP = IO::File->new_tmpfile;
if (not defined $TEMP)
{
    print STDERR "ERROR: failed to create temp file: $!\n";
    exit 1;
}

# Fix max file name width for printing
my $fwidth = List::Util::max map { length $_ } @ARGV;

# Process each file
my $unchanged = 0;
my $changed   = 0;
foreach my $file (@ARGV)
{
    if (open(my $FILE, '<', $file))
    {
        # Reset temp file
        seek $TEMP, 0, SEEK_SET or die "ERROR: seek in temp file failed: $!";
        truncate $TEMP, 0       or die "ERROR: truncate of temp file failed: $!";

        # go through the file, replacing strings
        my $changes = 0;
        while(defined(my $line = <$FILE>))
        {
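            # in fixed mode compare literally; otherwise treat $find as a regexp
            my $matched = $fixed ? index($line, $find) >= 0 : $line =~ m/$find/;
            if ($matched)
            {
                print "-" . $line if $verbose;
                print "\n" if $verbose and $line !~ m/\n$/;

                if ($fixed)
                {
                    # \Q...\E quotes regexp metacharacters, so every literal
                    # occurrence of $find on the line is replaced
                    $line =~ s/\Q$find\E/$replace/g;
                }
                else
                {
                    $line =~ s/$find/replacebackrefs($replace)/eg;
                }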

                $changes++;
                print "+" . $line if $verbose;
                print "\n" if $verbose and $line !~ m/\n$/;
            }

            print $TEMP $line;
        }
        close $FILE;

        if ($changes == 0)
        {
            # nothing to write back; the temp file is reset on the next pass
            $unchanged++;
            next;
        }

        # Move new contents into old file
        $changed++;
        printf "%*s - %3d changes\n", -$fwidth, $file, $changes;

        seek $TEMP, 0, SEEK_SET or die "ERROR: rewind of temp file failed: $!";
        open $FILE, '>', $file or die "ERROR: failed to re-write $file: $!\n";
        while (<$TEMP>) { print $FILE $_ }
        close $FILE;

        print "\n" if $verbose;
    }
    else
    {
        print STDERR "Error opening $file: $!\n";
    }
}

close $TEMP;

print "\n";
print "$changed files changed, $unchanged files unchanged\n";

exit 0;

sub replacebackrefs
{
    # 1st/only argument is the replacement text, which may contain $1, $2, ...
    my $matchedtext = shift @_;

    my @backref;
    # @- is a dynamic variable that holds the offsets of submatches in
    # the currently active dynamic scope (i.e. within each regexp
    # match), corresponding to grouping parentheses. We use the count
    # of entries in @- to determine how many matches there were and
    # store them into an array. Note that @- index [0] is not
    # interesting to us because it has a special meaning (see man
    # perlvar for @-), and that backrefs start with $1 not $0.
    # We cannot do the actual replacement within this loop.
    do
    {
        no strict 'refs'; # allow $1, $2, ... to be looked up by number
        foreach my $matchnum (1 .. $#-)
        {
            $backref[$matchnum] = ${$matchnum}; # i.e. $1 or $2 ...
        }
    } while(0);

    # now actually replace each back reference in the matched text
    # with the saved submatches.
    $matchedtext =~ s/\$(\d+)/$backref[$1]/g;

    # return a scalar string to actually use as the replacement text,
    # with all the backreferences in the matched text replaced with
    # their submatch text.
    return $matchedtext;
}

0 votes

Perhaps the following will be helpful:

use strict;
use warnings;

my %replacements =
  map { chomp; my @x = split /\|/; $x[0] => [ $x[1], $x[2] ] } <DATA>;

local $^I = '.bak';

for my $file (<*.config>) {
    push @ARGV, $file;

    while (<>) {
        s/\b\Q$replacements{$file}[0]/$replacements{$file}[1]/g;
        print;
    }
}

__DATA__
my_foo.config|fooCommon >|~ /fooCommon[/ >
my_bar.config|barCommon >|~ /barCommon[/ >
my_file.config|someCommon >|~ /someCommon[/ >

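A hash of arrays (HoA) is built by splitting the |-delimited DATA lines, where the key is the file name and the value is a reference to an anonymous array whose two elements are used for the substitution on that file. The local $^I = '.bak' notation creates backups of the original files.

You may need to adjust the substitution. For example, word boundaries are observed in the substitution by using \b in s/\b\Q$replacements{$file}[0]/$replacements{$file}[1]/g;. You may or may not need (or want) this.

I'd suggest trying it on only one 'scratch' file first, to make sure you get the results you want before fully implementing it, even though the original files are backed up.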


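0 votes

Your script is a good attempt.

It contains a few redundancies:

  • The cp $f is a useless step
  • $TFILE is useless as well (just write the sed output directly to the target file)

You can construct $NEW and the target file name from the value of $f without its directory path, which you can obtain as follows: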

bf=`basename "$f"`
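
Put together, one possible reading of this advice looks like the sketch below. It assumes the my_NAME.config naming from the question and reuses the OLD/NEW strings from the original script; the directory is a temporary stand-in, and the key variable is introduced only for the illustration. The file is moved into the backup directory and sed reads from that copy, so neither the extra cp nor the temp file is needed.

```shell
#!/bin/sh
# Sketch: derive the strings from the file name, back up by moving the
# file, and let sed write straight to the original name.
set -eu

conf=$(mktemp -d)                    # stand-in for the real conf directory
mkdir -p "$conf/bakup"
printf 'Location /fooCommon >\n' > "$conf/my_foo.config"

for f in "$conf"/*.config; do
    bf=$(basename "$f")              # my_foo.config
    key=${bf#my_}                    # foo.config
    key=${key%.config}               # foo
    mv "$f" "$conf/bakup/$bf"        # the backup doubles as sed's input
    sed "s|/${key}Common >|~ \"/${key}Common[^/]*\" >|g" "$conf/bakup/$bf" > "$f"
done

cat "$conf/my_foo.config"            # prints: Location ~ "/fooCommon[^/]*" >
```

Redirecting sed onto the very file it is reading would truncate it first, which is why the input here is the moved copy.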