Match values in two files and replace values in selected columns

Question (votes: 1, answers: 2)

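The goal is to check whether the values in columns 3 and 4 of file1, joined together, match column 1 of file2. If they match, the values in columns 2 and 3 of file2 should be replaced with the values from columns 5 and 6 of file1.

In addition, for matching rows, the values of columns 7 and 8 of file1 should become columns 1 and 2 of the output. Every row should end with a flag: R for rows that were replaced and O for rows that were not.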

file1

2,100,31431,37131,999991.70,0000000.30,11111,22222,3
3,100,31431,37471,111113.20,1111111.30,22222,33333,4

file2

3143137113 318512.50 2334387.50 100
3143137131 318737.50 2334387.50 100
3143137201 319612.50 2334387.50 100
3143137471 322987.50 2334387.50 100
3143137491 323237.50 2334387.50 100
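
To make the match condition concrete: the key in column 1 of file2 is columns 3 and 4 of file1 joined with no separator. A minimal sketch, assuming both values are always exactly 5 characters wide as in the sample data (the function name `file1_keys` is hypothetical):

```shell
# Hypothetical helper: read file1-style comma-separated lines from stdin
# and print the lookup key built from columns 3 and 4.
file1_keys() {
  awk -F, '{ print $3 $4 }'
}

# The two sample rows of file1.
printf '%s\n' \
  '2,100,31431,37131,999991.70,0000000.30,11111,22222,3' \
  '3,100,31431,37471,111113.20,1111111.30,22222,33333,4' | file1_keys
```

This prints 3143137131 and 3143137471, which are the keys of rows 2 and 4 of file2.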

Desired output:

31431,37113,318512.50,2334387.50,100,O
11111,22222,999991.70,0000000.30,100,R
31431,37201,319612.50,2334387.50,100,O
22222,33333,111113.20,1111111.30,100,R
31431,37491,323237.50,2334387.50,100,O

I tried these two:

1)

awk '
BEGIN{
  OFS=","
}
FNR==NR{
  a[$3 $4]=$3 OFS $4
  b[$3 $4]=$5
  c[$3 $4]=$6
  d[$3 $4]=$7 OFS $8
  next
}
($1 in a){
  $4=d[$1]
  $3=c[$1]
  $2=b[$1]
  $1=a[$1]
  print
  next
}
{
  $1=$1
  sub(/^...../,"&,",$1)
  print
}
' FS=","  file1 FS=" "  file2

Output:

31431,37113,318512.50,2334387.50,100
31431,37131,999991.70,0000000.30,11111,22222
31431,37201,319612.50,2334387.50,100
31431,37471,111113.20,1111111.30,22222,33333
31431,37491,323237.50,2334387.50,100

2)

awk -F, 'NR==FNR{a[$3 $4]=substr($0,length($3 FS)+1);next} $1 in a{print a[$1],$NF;next} {$1=substr($1,1,5) OFS substr($1,6,5);} 1' OFS=, file1 FS=' ' file2

Output:

31431,37113,318512.50,2334387.50,100
31431,37131,999991.70,0000000.30,11111,22222,3,100
31431,37201,319612.50,2334387.50,100
31431,37471,111113.20,1111111.30,22222,33333,4,100
31431,37491,323237.50,2334387.50,100

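Both work, but only partly: the matched rows still begin with the original key instead of columns 7 and 8 of file1, and neither attempt adds the R/O flag.

Thanks in advance.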

awk
2 Answers
2 votes

Could you please try the following?

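awk '
# file1 (comma-separated): index by columns 3 and 4 joined together.
FNR==NR{
  a[$3 $4]=$7 $8   # replacement for column 1 of file2
  b[$3 $4]=$5      # replacement for column 2
  c[$3 $4]=$6      # replacement for column 3
  next
}
# file2: overwrite the fields of a matching row and remember the match.
($1 in a){
  $2=b[$1]
  $3=c[$1]
  $1=a[$1]
  found=1
}
{
  $0=found==1?$0",R":$0",O"   # append the flag
  sub(/^...../,"&,")           # split the 10-character key after 5 characters
  $1=$1                        # rebuild the record with OFS
  found=""
}
1
' FS="," file1 FS=" " OFS="," file2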

The output is as follows.

31431,37113,318512.50,2334387.50,100,O
11111,22222,999991.70,0000000.30,100,R
31431,37201,319612.50,2334387.50,100,O
22222,33333,111113.20,1111111.30,100,R
31431,37491,323237.50,2334387.50,100,O
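
The same result can also be produced by printing each output field explicitly instead of rewriting `$0`. The following is only a sketch of that variant, not a replacement for the answer above; the wrapper name `merge_files`, the array names and the temporary file paths are hypothetical, and it assumes the key in file2 is always 10 characters wide:

```shell
# Hypothetical wrapper; usage: merge_files FILE1 FILE2
merge_files() {
  awk -v OFS=, '
    # file1 (comma-separated): remember columns 5-8 under the key $3$4.
    FNR == NR {
      k = $3 $4
      x[k] = $5; y[k] = $6; id[k] = $7 OFS $8
      next
    }
    # file2, matching key: take columns 1-3 from file1, flag the row R.
    $1 in id { print id[$1], x[$1], y[$1], $4, "R"; next }
    # file2, no match: split the 10-character key 5+5, flag the row O.
    { print substr($1, 1, 5), substr($1, 6), $2, $3, $4, "O" }
  ' FS=, "$1" FS=' ' "$2"
}

# Sample data from the question, written to a temporary directory.
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
2,100,31431,37131,999991.70,0000000.30,11111,22222,3
3,100,31431,37471,111113.20,1111111.30,22222,33333,4
EOF
cat > "$tmp/file2" <<'EOF'
3143137113 318512.50 2334387.50 100
3143137131 318737.50 2334387.50 100
3143137201 319612.50 2334387.50 100
3143137471 322987.50 2334387.50 100
3143137491 323237.50 2334387.50 100
EOF

merge_files "$tmp/file1" "$tmp/file2"
```

On the sample data this prints the five lines shown above. Like the original, it relies on `FNR == NR` to detect the first file, so it would misbehave if file1 were empty.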

2 votes

Perl version:

#!/usr/bin/perl
use warnings;
use strict;
use autodie;
use feature qw/say/;

my ($file1, $file2) = @ARGV;
my %rows;

open my $f1, '<', $file1;
while (<$f1>) {
  chomp;
  my @F = split /,/;
  $rows{"$F[2]$F[3]"} = \@F;
}

open my $f2, '<', $file2;
$, = ','; # Like awk OFS
while (<$f2>) {
  chomp;
  my @F = split;
  if (exists $rows{$F[0]}) {
    my $left = $rows{$F[0]};
    say @{$left}[2..5], $F[3], @{$left}[6,7]; 
  } else {
    my ($col1, $col2) = $F[0] =~ m/^(.{5})(.{5})$/;
    say $col1, $col2, @F[1..3];
  }
}

Example:

$ ./example.pl file1.csv file2.txt
31431,37113,318512.50,2334387.50,100
31431,37131,999991.70,0000000.30,100,11111,22222
31431,37201,319612.50,2334387.50,100
31431,37471,111113.20,1111111.30,100,22222,33333
31431,37491,323237.50,2334387.50,100
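
Note that this output differs from the desired one: matched rows keep the original key in front, carry columns 7 and 8 at the end, and no row has an R/O flag. One possible way to close that gap is to pipe the script's output through a small awk filter. This is a sketch only; the function name `add_flags` is hypothetical, and it assumes that a line with exactly 7 fields is always a matched row:

```shell
# Hypothetical filter: reshape the Perl output read from stdin.
add_flags() {
  awk -F, -v OFS=, '
    # Matched row: move fields 6 and 7 to the front, drop the old key.
    NF == 7 { print $6, $7, $3, $4, $5, "R"; next }
    # Unmatched row: keep it as is and append the O flag.
    { print $0, "O" }
  '
}

# First two lines of the example output above.
printf '%s\n' \
  '31431,37113,318512.50,2334387.50,100' \
  '31431,37131,999991.70,0000000.30,100,11111,22222' | add_flags
```

With the script above this would be used as `./example.pl file1.csv file2.txt | add_flags`.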