Hi all.
I have a bit of a problem here. I'm fairly new to awk, but I think it is the right tool for the following task. I have files in the following format:
Code:
C
C 1 1.3756312
C 2 1.3879733 1 120.0957745
C 3 1.3850912 2 119.6454502 1 7.8601452 0
C 4 1.3892896 3 119.7264098 2 -6.4720895 0
C 5 1.3844873 4 120.0924064 3 -1.2289029 0
N 3 1.4348439 2 119.2949612 1 -166.5566349 0
C 7 1.3250991 3 123.0481206 2 18.5755546 0
...
Code:
c
c 1 cc2
c 2 cc3 1 ccc3
c 3 cc4 2 ccc4 1 dih4
c 4 cc5 3 ccc5 2 dih5
c 5 cc6 4 ccc6 3 dih6
n 2 nc7 3 ncc7 4 dih7
c 7 cn8 2 cnc8 3 dih8
c 8 cc9 7 ccn9 2 dih9
n 9 nc10 8 ncc10 7 dih10
...

cc2 1.385627
cc3 1.387278
ccc3 120.462
cc4 1.387224
ccc4 119.224
dih4 11.852
cc5 1.384238
ccc5 119.249
dih5 0.237
cc6 1.387992
ccc6 120.570
dih6 -10.931
nc7 1.442256
ncc7 118.545
dih7 -156.703
cn8 1.381867
cnc8 130.485
...
But there are easily about a thousand of them. That's really not a simple task. Is there any way of doing this in a script?
Btw, I have some freedom in the "variable names". So, just as the 7th column of the first section of the second file can hold plain "dih" strings, I can also have "bon" and "ang" in the 3rd and 5th columns.
|
|
|||
|
Pretty much done here:
Code:
#!/bin/tcsh
set FILE = $1
set N = `cat -n $FILE | awk '{print $1}' | tail -n 1`
set i = 0
# First Atom:
head -n 1 $FILE
# Second Atom:
sed -n -e 2,2p $FILE | awk '{print "", $1, " ", $2, "bon2"}'
sed -n -e 2,2p $FILE | awk '{print "bon2", " ", $3}' > tmp
# Third Atom:
sed -n -e 3,3p $FILE | awk '{print "", $1, " ", $2, "bon3", " ", $4, "ang3"}'
sed -n -e 3,3p $FILE | awk '{print "bon3", " ", $3}' >> tmp
sed -n -e 3,3p $FILE | awk '{print "ang3", " ", $5}' >> tmp
# All The Others:
set i = 4
while ($i <= $N)
    sed -n -e $i,$i\p $FILE | awk '{print "", $1, " ", $2, "bon" "'"$i"'", " ", $4, "ang" "'"$i"'", " ", $6, "dih" "'"$i"'"}'
    sed -n -e $i,$i\p $FILE | awk '{print "bon" "'"$i"'", " ", $3}' >> tmp
    sed -n -e $i,$i\p $FILE | awk '{print "ang" "'"$i"'", " ", $5}' >> tmp
    sed -n -e $i,$i\p $FILE | awk '{print "dih" "'"$i"'", " ", $7}' >> tmp
    @ i = ( $i + 1 )
end
# Now The Variables And Their Values List:
echo ''
cat tmp
rm tmp
#EOF
Thanks a lot!
|
|||
|
Good that you worked it out.
However, you may find in future that it is more elegant, more efficient and less error-prone to do it all inside awk instead of mixing awk and shell. After all, awk is a programming language. To help you, here are some features of awk:
Associative arrays, e.g.:
value["cc2"] = $2; var="cc2"; print value[var];
BEGIN and END blocks. They are executed, respectively, before any line of input is read and after the last line has been read. You can use an END block to do the final processing after you have accumulated the info.
You would put your awk program in a file and run it like this:
awk -f munge.awk < input > output
You could also do it all in a language like Perl, Python or Ruby.
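To make the two features above concrete, here is a minimal sketch, not a full solution to the task. It assumes input lines of the form "name value"; the function name list_values and the output format are made up for illustration. The associative array keeps the latest value per name, and the END block prints everything only after the whole input has been read.

```sh
#!/bin/sh
# Hypothetical helper (name and output format are illustrative only):
# collect "name value" pairs in an awk associative array and print them
# after all input has been read.
list_values() {
    awk '
        BEGIN { n = 0 }                  # runs before any input line is read
        NF == 2 {
            if (!($1 in value))          # first time this name is seen:
                order[++n] = $1          # remember the order of appearance
            value[$1] = $2               # associative array keyed by name
        }
        END {                            # runs after the last input line
            for (i = 1; i <= n; i++)
                printf "%s = %s\n", order[i], value[order[i]]
        }
    ' "$@"
}
```

With no file arguments it reads standard input, e.g. `printf 'cc2 1.385627\nccc3 120.462\n' | list_values`.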
|
|||
|
Hi ken.
Thanks a lot. I still need time to really test whether it's OK, because I'm not sure how the programs will deal with those "unaligned columns". But most of the work is done.
I never liked perl. I considered python and ruby, but they were left aside because their learning curves take time that I can't afford now. For the same reason, at a certain point I decided not to go for bash just due to RNGs, and to stay on tcsh in order to avoid having to rewrite a lot of work that was already done (despite the "test run" options available in bash, which look *really* nice).
I also thought about programming straight in awk, but I never found a good enough tutorial on that. The ones I've seen always made programming in awk look a bit too "clumsy" for my taste, and really hard to get. Taking the task above as an example, I would have to make awk read the first 3 lines in different ways, then handle the fourth line through the end of the file, and redirect part of the results straight to a file while keeping track of the other part, to be written just after the full input file has been read. I guess a pure awk program for that will easily get *really* nasty to read.
Again, thanks a lot, and if anyone comes up with a simple trick to properly align the columns (which my script doesn't do) I would be really grateful!
|
|
|||
|
Hi johannesrs,
I took your question as a challenge and wrote a script in awk (it was good to remember it after 4-5 years). It's not the best, but it does the job, the same as yours does. This script deals with only 1 input file at a time. If you need more than 1, I might be able to help (time permitting). Here is the script:
Code:
#! /usr/bin/gawk -f
{
    if (NF >= 7) {
        print " "$1,$2,"bon"NR," ",$4,"ang"NR," ",$6,"dih"NR
        variable_values[1,NR]="bon"NR" "$3
        variable_values[2,NR]="ang"NR" "$5
        variable_values[3,NR]="dih"NR" "$7
    } else if (NF == 5) {
        print " "$1,$2,"bon"NR," ",$4,"ang"NR
        variable_values[1,NR]="bon"NR" "$3
        variable_values[2,NR]="ang"NR" "$5
    } else if (NF == 3) {
        print " "$1,$2,"bon"NR
        variable_values[1,NR]="bon"NR" "$3
    } else {
        print " "$0
    }
}
END {
    print ""
    for (col = 1; col <= NR; ++col) {
        for (row = 1; row <= 3; ++row) {
            if ( variable_values[row,col] != "") {
                print (variable_values[row,col])
            }
        }
    }
}
You should run it like this:
awk -f './awkProcessor' ./input.file > ./tempfile
where:
awkProcessor is the script (don't forget to give it execute rights)
./input.file is your input file
./tempfile is your output file
cheers
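On the one-file-at-a-time limitation: one possible way around it, sketched here under assumptions, is to leave the awk script alone and loop over the files in the shell, so that NR starts again at 1 for each file. The function name convert_all and the ".out" suffix are hypothetical choices, not something from the script above.

```sh
#!/bin/sh
# Hypothetical wrapper: run one awk script on each input file separately.
# Usage: convert_all SCRIPT FILE...
# Each FILE produces FILE.out (the ".out" suffix is an arbitrary choice).
convert_all() {
    script=$1
    shift
    for f in "$@"; do
        # A fresh awk process per file, so NR counts lines of that file only.
        awk -f "$script" "$f" > "$f.out" || return 1
    done
}
```

It could then be called as `convert_all ./awkProcessor file1 file2`, with whatever input file names you actually have.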
|
|||
|
@dmera, good work. But also remember you can test NF in the gate expression.
Code:
NF>=7 { ... }
NF==5 { ... }
NF==3 { ... }
(a bit tricky for the else case, but not too bad, the values are 1, 2, 4, and 6.)
END { dump out saved info }
@johannesrs, pardon the frankness, but what a wuss you are. A search would have found you any number of awk tutorials. As for aligning fields, try using printf with \t in the format string.
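Both suggestions can be combined in one rough sketch, offered as an illustration rather than a tested replacement for the scripts above. It uses one pattern rule per field count, with `next` so that the last rule acts as the else case, and printf with fixed field widths instead of \t (a swapped-in choice, since widths keep columns aligned even when names differ in length). The function name zmat_align, the helper save, and the widths 2, 4, 8 and 13 are assumptions.

```sh
#!/bin/sh
# Sketch only: zmat_align, save() and the column widths are illustrative.
zmat_align() {
    awk '
        # Remember one name/value pair for the END block.
        function save(prefix, v) { name[++n] = prefix NR; val[n] = v }

        # One rule per line shape, selected by the number of fields.
        NF >= 7 { printf "%-2s %4s %-8s %4s %-8s %4s %s\n", $1, $2, "bon" NR, $4, "ang" NR, $6, "dih" NR
                  save("bon", $3); save("ang", $5); save("dih", $7); next }
        NF == 5 { printf "%-2s %4s %-8s %4s %s\n", $1, $2, "bon" NR, $4, "ang" NR
                  save("bon", $3); save("ang", $5); next }
        NF == 3 { printf "%-2s %4s %s\n", $1, $2, "bon" NR
                  save("bon", $3); next }
                { print }    # any other field count: copy the line unchanged

        END {
            print ""         # blank line before the variable list
            for (i = 1; i <= n; i++)
                printf "%-8s %13s\n", name[i], val[i]
        }
    ' "$@"
}
```

Run it as `zmat_align input.file > output.file`. The values are printed as right-aligned text, so decimal points line up only if all values have the same number of decimals, as they do in the sample input.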
|
|||
|
Thank you, ken_yap, for the tip. Well, @johannesrs will have to pay for the tips (with 25 OpenSuse community replies to new users). I had nothing to do at work while waiting over a month for some user IDs to be created, since the security-related bureaucracy there is crazy (as it is at all big companies, I think).