Hi. I have had a script running on my server for months. Essentially it downloads a web page every night and strips out the data I want, so a lot of filtering happens. Replacing the source data with a simple echo command, I have been doing this:
echo "----><----> 3,818.91 <----><---->" | sed 's/[<>!-\"]*//' | sed 's/ <.*//'
and it stripped out the leading and trailing garbage to give " 3,818.91" for months.
As of last Thursday my system started returning “----><----> 3,818.91”, which caused a cascade of errors in my system.
I can get it to work as I want again with the following (the ‘-’ is moved to the first character in the regex range):
echo "----><----> 3,818.91 <----><---->" | sed 's/[-<>!\"]*//' | sed 's/ <.*//'
I also tried escaping the ‘-’ with a ‘\’ but that made no difference.
Clearly it is interpreting the ‘-’ as a range of characters. Does anyone have any idea what changed? Should it always have failed, and was I exploiting a bug? Any comments or suggestions welcome.
Hm, strange indeed. As you have seen from my prompt, I was user henk.
That is why we ask people to always include the line with the prompt and the command, the output, and the line with the new prompt when posting code from the terminal. Output alone often hides a lot of information that potential helpers need.
Moving the “-” to the first position (or the last) is the correct way to include a literal “-” in a character list. Your original expression is interpreted as a range from “!” to “\"” inclusive. I have no idea why it worked; if it did, it was by accident (or mistake).
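To make the point about placement concrete, here is a minimal sketch of the filter with the dash written first and, as an alternative, last in the bracket expression. The function names strip_garbage and strip_garbage_dash_last are hypothetical, chosen only for this illustration. The backslash before the double quote is dropped because the sed script is already inside single quotes, so the quote needs no escaping.

```shell
# Hypothetical helper: '-' is the FIRST character in the bracket
# expression, so it is a literal dash and cannot start a range.
strip_garbage() {
    sed 's/[-<>!"]*//' | sed 's/ <.*//'
}

# Hypothetical helper: the same list with '-' as the LAST character,
# which is the other position where a dash is taken literally.
strip_garbage_dash_last() {
    sed 's/[<>!"-]*//' | sed 's/ <.*//'
}

echo "----><----> 3,818.91 <----><---->" | strip_garbage
echo "----><----> 3,818.91 <----><---->" | strip_garbage_dash_last
```

Neither variant contains a range, so the result should not depend on the locale: both lines print " 3,818.91" with its leading space.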
Compare environment variables and locale settings. It is possible that for one user the character range between “!” and “\"” includes “-”, and for another it does not.
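One way to run that comparison, as a rough sketch: print the locale-related variables, then run the original expression once with the C locale forced and once with whatever the session inherits. As far as I know a backslash is an ordinary character inside a POSIX bracket expression, so in the C locale the range runs from "!" up to the backslash by byte value, and the dash falls inside it. That is my reading, not something confirmed in this thread.

```shell
input='----><----> 3,818.91 <----><---->'

# Settings that influence how a range such as !-\ is ordered.
echo "LANG=${LANG:-} LC_ALL=${LC_ALL:-} LC_COLLATE=${LC_COLLATE:-}"

# Original expression with the C locale forced for the first sed only:
# ranges follow byte values, so '-' is covered by the range.
echo "$input" | LC_ALL=C sed 's/[<>!-\"]*//' | sed 's/ <.*//'

# Same expression under the locale this user, cron job or service inherits.
# The result here depends on that locale's collation rules.
echo "$input" | sed 's/[<>!-\"]*//' | sed 's/ <.*//'
```

If the two sed lines print different results, the locale is the likely culprit. Setting LC_ALL=C for the sed call in the script, or moving the dash to the first position, should make the behaviour independent of who runs it.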
@avidjaar. Thanks for the explanation. When thinking about collating sequences I always restricted that to the alphabet, e.g. where the é sorts in French. I never considered that the order of non-alphanumeric characters would differ between locales as well.
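A quick way to see how punctuation is ordered is to sort the characters in question. This is only a sketch: sort and regex ranges both take their ordering from LC_COLLATE, but I am assuming, not asserting, that they agree in every locale and implementation.

```shell
# Byte order in the C locale: '!' (0x21), '-' (0x2D), '<' (0x3C), '\' (0x5C).
printf '%s\n' '\' '<' '-' '!' | LC_ALL=C sort

# Order under the inherited locale, which may differ from the above.
printf '%s\n' '\' '<' '-' '!' | sort
```

In the C locale the dash sorts between "!" and the backslash, which matches the range covering it there.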