awk - Replace all instances of "-" with "." from both ends of a string, up to the first A, G, T, or C
So, for example, if the input is: -------a--gg---ccaat---a------
the output should be: .......a--gg---ccaat---a......
I would prefer to do this in awk.
This gets a bit complicated because awk does not let you call a function on the matched string. You need to manually take out the matched strings (l, r), process them further, and then replace $0 with the original string plus the processed matched strings:
awk '{ if (match($0, /^-*/)) { l = substr($0, 1, RLENGTH); gsub("-", ".", l); $0 = l substr($0, RLENGTH + 1); } if (match($0, /-*$/)) { r = substr($0, RSTART); gsub("-", ".", r); $0 = substr($0, 1, RSTART - 1) r; } print $0; }'
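To see what the substr() calls are working with, here is a small sketch that prints the two built-in variables match() sets: RSTART (where the match begins) and RLENGTH (how long it is). Both are upper case, and awk variable names are case-sensitive. The function name show_offsets is made up for this illustration.

```shell
# show_offsets is a hypothetical helper used only for this illustration.
show_offsets() {
    awk '{
        # Leading run of "-": always starts at position 1.
        match($0, /^-*/)
        print "leading RSTART=" RSTART, "RLENGTH=" RLENGTH
        # Trailing run of "-": starts wherever the last dashes begin.
        match($0, /-*$/)
        print "trailing RSTART=" RSTART, "RLENGTH=" RLENGTH
    }'
}

printf '%s\n' '-------a--gg---ccaat---a------' | show_offsets
# leading RSTART=1 RLENGTH=7
# trailing RSTART=25 RLENGTH=6
```

The sample string is 30 characters long, so the trailing run covers positions 25 to 30.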
Or, using gsub again to mutate the matched strings in $0 in place, instead of concatenating:
awk '{ if (match($0, /^-*/)) { l = substr($0, 1, RLENGTH); gsub("-", ".", l); gsub(/^-*/, l, $0); } if (match($0, /-*$/)) { r = substr($0, RSTART); gsub("-", ".", r); gsub(/-*$/, r, $0); } print $0; }'
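For repeated use, the concatenating variant could be wrapped in a shell function and spread over several lines. This is only a sketch of one way to do it: mask_ends is a hypothetical name, and the logic is the same match/substr/gsub sequence as above, checked here against the sample input from the question.

```shell
# mask_ends is a hypothetical helper name, not part of awk.
mask_ends() {
    awk '{
        # Leading run of "-": copy it, convert it to dots, splice it back.
        if (match($0, /^-*/)) {
            l = substr($0, 1, RLENGTH)
            gsub(/-/, ".", l)
            $0 = l substr($0, RLENGTH + 1)
        }
        # Trailing run of "-": same treatment, starting at RSTART.
        if (match($0, /-*$/)) {
            r = substr($0, RSTART)
            gsub(/-/, ".", r)
            $0 = substr($0, 1, RSTART - 1) r
        }
        print
    }' "$@"
}

printf '%s\n' '-------a--gg---ccaat---a------' | mask_ends
# prints: .......a--gg---ccaat---a......
```

Dashes between the first and last letters are left alone because both patterns are anchored (^ and $), so only the outer runs are ever matched.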