curl to awk results in printf overwriting beginning of line?

Hi,
I am currently experimenting a bit with the Yahoo Finance API. Basically, a command like this:

curl -s 'http://download.finance.yahoo.com/d/quotes.csv?s=GOOG+YHOO&f=nsl1c1p2'

gives the following output:

"Google Inc.","GOOG",566.07,-5.53,"-0.97%"
"Yahoo! Inc.","YHOO",35.62,-0.19,"-0.53%"

That is: the name, the stock symbol, the price, the change, and the change in percent of the stock.

My plan was to get access to an arbitrary element of this multiline output (you can add a lot more symbols and get a new line of output for each).
With a small bash script I wanted to pipe the output to awk and then just print out each value separately.
My first attempt was this:

#!/bin/bash
symbols="GOOG+YHOO"
curl -s 'http://download.finance.yahoo.com/d/quotes.csv?s='${symbols}'&f=nsl1c1p2' | awk -F',' '{
    gsub(/"/, "",$0);
    print $0;
    printf "%s %s %s (%s)abc\n", $1, $3, $4, $5;
}'

I simply piped the output of curl to awk, changed the delimiter to ‘,’, removed the double quotes with gsub, and then printed each element separately. The output was this:

Google Inc.,GOOG,566.07,-5.53,-0.97%
)abcle Inc. 566.07 -5.53 (-0.97%
Yahoo! Inc.,YHOO,35.62,-0.19,-0.53%
)abco! Inc. 35.62 -0.19 (-0.53%

To make the problem more obvious I added “abc” to the end of the printf command:
the characters that should be at the end of the line get written to the beginning. I do not know why, so I tried a totally different approach:

#!/bin/bash
symbols="GOOG+YHOO"
curl -s 'http://download.finance.yahoo.com/d/quotes.csv?s='${symbols}'&f=nsl1c1p2' | awk '{
    gsub(/"/, "",$0);
    print $0;
    split($0,a,",");
    print a[1], a[3], a[4], "("a[5]")abc"
}'

This way I do not need the -F argument, and I switched from printf to print; instead I used split to split the string, with ‘,’ as the delimiter.
But the output is exactly the same!

What's the problem here?

Apparently $5 contains a carriage return at the end.
So the cursor jumps back to the beginning of the line when $5 is printed and then the following characters are printed there of course.

Try to pipe the output to a file and open it in a hexeditor, that should make it obvious.

It seems that the data you download with curl has CR+LF (i.e. MSDOS style) line endings, and awk appends that CR to $5 when splitting.
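
If no hex editor is at hand, the same check can be sketched on the command line. The snippet below is only an illustration: `sample_quotes` is a hypothetical helper that fakes the curl output with CR+LF line endings, so it runs without network access. `od -c` prints every byte, and the awk one-liner counts the lines whose last field ends in a carriage return.

```shell
#!/bin/bash
# Hypothetical helper (not from the original script): emits two quote lines
# with CR+LF endings, standing in for the output of the curl command.
sample_quotes() {
    printf '"Google Inc.","GOOG",566.07,-5.53,"-0.97%%"\r\n'
    printf '"Yahoo! Inc.","YHOO",35.62,-0.19,"-0.53%%"\r\n'
}

# od -c shows each byte; a CR+LF ending appears as \r \n at the end of a line.
sample_quotes | od -c

# Count the lines whose last field ends in a carriage return.
cr_lines=$(sample_quotes | awk -F',' '$NF ~ /\r$/ { n++ } END { print n + 0 }')
echo "lines ending in CR: $cr_lines"
```

With the real data, replacing `sample_quotes` with the curl command should show whether the download really uses CR+LF.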

I don’t know how to prevent this at the moment, I would have to google to find out.
I’m not really an awk expert…

Thanks to your hint about the carriage return I found a solution: I simply use another gsub to remove all ‘\r’ characters.

gsub(/\r/, "",$0);

This worked.
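
For reference, here is a minimal sketch of how the whole pipeline might look with the carriage return removed. It assumes the input looks like the two lines quoted above; `sample_quotes` is a hypothetical stand-in for the curl command so that the example runs offline. It uses `sub(/\r$/, "")` to strip only a trailing CR, which is a slight variation on the gsub above.

```shell
#!/bin/bash
# Hypothetical helper (not from the original script): simulates the curl
# output, including the CR+LF line endings that caused the problem.
sample_quotes() {
    printf '"Google Inc.","GOOG",566.07,-5.53,"-0.97%%"\r\n'
    printf '"Yahoo! Inc.","YHOO",35.62,-0.19,"-0.53%%"\r\n'
}

fixed=$(sample_quotes | awk -F',' '{
    sub(/\r$/, "");   # strip the trailing carriage return, if present
    gsub(/"/, "");    # remove the double quotes
    # Modifying $0 makes awk re-split the fields, so $5 no longer ends in CR.
    printf "%s %s %s (%s)abc\n", $1, $3, $4, $5;
}')
printf '%s\n' "$fixed"
```

This prints `Google Inc. 566.07 -5.53 (-0.97%)abc` and `Yahoo! Inc. 35.62 -0.19 (-0.53%)abc`, with “abc” at the end of each line. Piping the data through `tr -d '\r'` before awk would be another way to get the same effect.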

On 2014-08-02 04:16, sabo007 wrote:
> thanks to your hint about the carriage return i found i solution, i
> simply use another gsub to remove all ‘\r’

Good :-)

May I suggest that, if you need more scripting/coding info, you ask a
moderator to move this thread to the programming/scripting forum here?
There you will find most of the people with knowledge of such things as
awk coding ;-)


Cheers / Saludos,

Carlos E. R.

(from 13.1 x86_64 “Bottle” (Minas Tirith))