My input file has the following format:
ATOM 1 Cal Cal 1 61.270 93.780 100.040 1.00 0.00
ATOM 2 Cal Cal 2 12.080 65.560 105.560 1.00 0.00
ATOM 13 Cal Cal 13 40.800 13.530 75.100 1.00 0.00
ATOM 200 Cal Cal 200 102.620 22.520 97.600 1.00 0.00
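I want to replace every value in column 8 with "32.450" while keeping the original formatting (spacing) intact. That is, the expected output should look like this: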
ATOM 1 Cal Cal 1 61.270 93.780 32.450 1.00 0.00
ATOM 2 Cal Cal 2 12.080 65.560 32.450 1.00 0.00
ATOM 13 Cal Cal 13 40.800 13.530 32.450 1.00 0.00
ATOM 200 Cal Cal 200 102.620 22.520 32.450 1.00 0.00
I tried a simple awk command:
awk -F " " '{
print $1" "$2" "$3" "$4" "$5" "$6" "$7" 32.450 "$9" "$10
}' input.pdb > output.pdb
However, it fails to preserve the original formatting.
Can anyone help me find a better way to do this, preferably with awk or gawk?
GNU awk:
gawk '
BEGIN {FIELDWIDTHS="5 7 4 5 6 12 8 8 6 6"; OFS=""}
{$8=sprintf("%8s", "32.450"); print}   # right-align within the 8-character field
' file
Input
ATOM 1 Cal Cal 1 61.270 93.780 100.040 1.00 0.00
ATOM 2 Cal Cal 2 12.080 65.560 105.560 1.00 0.00
ATOM 13 Cal Cal 13 40.800 13.530 75.100 1.00 0.00
ATOM 200 Cal Cal 200 102.620 22.520 97.600 1.00 0.00
Output
ATOM 1 Cal Cal 1 61.270 93.780 32.450 1.00 0.00
ATOM 2 Cal Cal 2 12.080 65.560 32.450 1.00 0.00
ATOM 13 Cal Cal 13 40.800 13.530 32.450 1.00 0.00
ATOM 200 Cal Cal 200 102.620 22.520 32.450 1.00 0.00
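If gawk is not available, the same fixed-width idea can be sketched in POSIX awk by slicing each line with substr. The helper name fixed_width_set and its argument order are made up for illustration, and the widths in the usage line are the ones from FIELDWIDTHS, which is an assumption about the file's layout.

```shell
# fixed_width_set WIDTHS FIELD VALUE [FILE...]
# Hypothetical helper: overwrite one fixed-width column, right-aligned.
fixed_width_set() {
    widths=$1 field=$2 value=$3
    shift 3
    awk -v widths="$widths" -v k="$field" -v val="$value" '
    BEGIN { nw = split(widths, w, " ") }            # w[1..nw] = column widths
    {
        out = ""; pos = 1
        for (i = 1; i <= nw; i++) {
            cell = substr($0, pos, w[i])            # slice column i unchanged
            if (i == k)
                cell = sprintf("%" w[i] "s", val)   # pad VALUE to the column width
            out = out cell
            pos += w[i]
        }
        print out substr($0, pos)                   # keep text past the declared widths
    }' "$@"
}

# Usage (assumed layout):
#   fixed_width_set "5 7 4 5 6 12 8 8 6 6" 8 32.450 input.pdb > output.pdb
```

Because the value is padded to the declared width, the line length stays the same as long as the value fits in the column.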
If your columns always have fixed widths, as in the sample input:
$ awk '{ print substr($0,1,47) " 32.450" substr($0,55) }' f.txt
ATOM 1 Cal Cal 1 61.270 93.780 32.450 1.00 0.00
ATOM 2 Cal Cal 2 12.080 65.560 32.450 1.00 0.00
ATOM 13 Cal Cal 13 40.800 13.530 32.450 1.00 0.00
ATOM 200 Cal Cal 200 102.620 22.520 32.450 1.00 0.00
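When the exact character offsets are not known in advance, one possible sketch in POSIX awk is to locate the Nth space-separated field with match and overwrite it in place, right-aligned within the span made of its leading spaces plus the old value. The name replace_field is hypothetical, and the sketch assumes space-separated, right-aligned columns.

```shell
# replace_field N VALUE [FILE...]
# Hypothetical helper: overwrite the Nth space-separated field in place.
replace_field() {
    n=$1 value=$2
    shift 2
    awk -v n="$n" -v val="$value" '
    {
        pos = 1; found = 0
        for (i = 1; i <= n; i++) {
            # span = optional leading spaces followed by the field text
            if (!match(substr($0, pos), /^ *[^ ]+/)) break
            if (i == n) { found = 1; break }
            pos += RLENGTH                          # skip past field i
        }
        if (found) {
            width = RLENGTH                         # spaces + old value
            $0 = substr($0, 1, pos - 1) sprintf("%" width "s", val) substr($0, pos + width)
        }
        print                                       # short lines pass through unchanged
    }' "$@"
}

# Usage: replace_field 8 32.450 input.pdb > output.pdb
```

If the new value is as long as the span it replaces, it ends up touching the previous field, and if it is longer the line grows, so this only preserves the layout when the value fits.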
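Just tell sed to capture the first 7 blocks, match the 8th without capturing it, and print the captured 7 followed by 32.450. The pattern is anchored with ^ so that counting starts at the first field. For the alignment to survive, the replacement needs enough leading spaces to fill the width of the text it replaces.
$ sed -r 's/^( *[^ ]+( +[^ ]+){6}) +[^ ]+/\1 32.450/' file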
ATOM 1 Cal Cal 1 61.270 93.780 32.450 1.00 0.00
ATOM 2 Cal Cal 2 12.080 65.560 32.450 1.00 0.00
ATOM 13 Cal Cal 13 40.800 13.530 32.450 1.00 0.00
ATOM 200 Cal Cal 200 102.620 22.520 32.450 1.00 0.00
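Whichever command is used, a quick way to confirm that the spacing survived is to compare line lengths before and after the edit. The following is a minimal sketch; same_widths is a hypothetical helper name, and it assumes the original file is not empty.

```shell
# same_widths ORIGINAL EDITED
# Hypothetical helper: succeeds only if both files have the same number of
# lines and every pair of lines has the same length.
same_widths() {
    awk '
    NR == FNR { len[FNR] = length($0); n = FNR; next }   # first file: record lengths
    length($0) != len[FNR] {
        printf "line %d: %d -> %d characters\n", FNR, len[FNR], length($0)
        bad = 1
    }
    END { exit (bad || FNR != n) }
    ' "$1" "$2"
}

# Usage: same_widths input.pdb output.pdb && echo "layout preserved"
```

Equal line lengths do not prove that every column is still aligned, but a mismatch is a reliable sign that a field was shifted.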