Help with understanding ls with grep command

Hello,

After fooling around with the command line I am having trouble
understanding why I am getting the following using the “ls” and “grep”
commands

code:

eric:~> ls

bin Music
Desktop osha top 10 violations 2014.pdf
Documents Pictures
Downloads Public
french fry receipe public_html
hard drive return label Templates
kennedy.pdf tv-viewer_diag.out
k-birthday present.pdf Videos

eric:~> ls | grep D*

grep: Documents: Is a directory
grep: Downloads: Is a directory

eric:~> ls | grep Do*

grep: Downloads: Is a directory

I expected the last command to give me the output of the second command
and the second command to also include Desktop but I was wrong.

Does it have to do with the fact that “ls D*” also lists the contents of
the sub-directories of Documents and Downloads? I probably need to go
back to basics of the command line.

I am running openSUSE 13.2 using the bash shell with the Gnome Terminal.

Can anyone explain what is going on?

Thank you,

Eric

Mmm I am seeing this too:

cristiano@xmper8q3:~> ls|grep D*
grep: Documents: È una directory
grep: Downloads: È una directory
grep: Dropbox: È una directory

I would have expected to see “Desktop” too here.

It seems like grep is interpreting the results as files/directories to search, instead of just outputting them.

This appears to be confirmed by this experiment:

cristiano@xmper8q3:~> ls > tmpfile
cristiano@xmper8q3:~> grep -e D* tmpfile
grep: Documents: È una directory
grep: Downloads: È una directory
grep: Dropbox: È una directory
tmpfile:Desktop

As you can see “Desktop” is output as a result coming from tmpfile, while the other directories are interpreted as files to search (and grep fails because they are not files but directories).
I still don’t understand why grep is behaving this way, btw.

Cris

Not sure what you are trying to get: see the man page.


REGULAR EXPRESSIONS
...
Repetition
       A regular expression may be followed by one of several repetition operators:
       ?      The preceding item is optional and matched at most once.
       *      The preceding item will be matched zero or more times.
       +      The preceding item will be matched one or more times.
...

Maybe you need “ls | grep Do” or “ls | grep Do*”

While not having the full explanation yet, please understand that there is a difference between

ls

and

ls -1

The latter is implied when the standard output of ls is not a terminal, as is the case when you pipe the output to another program, as you do.
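
The one-name-per-line behaviour can be checked without a terminal in the way. This is only a small sketch; the scratch directory and the file names alpha, beta and gamma are made up for the illustration:

```shell
# Throwaway directory with three hypothetical files.
demo_dir=$(mktemp -d)
touch "$demo_dir/alpha" "$demo_dir/beta" "$demo_dir/gamma"

# When ls writes into a pipe it prints one name per line,
# so counting lines counts directory entries.
line_count=$(ls "$demo_dir" | wc -l)
echo "entries: $line_count"

# Captured output (also not a terminal) is identical to ls -1.
plain=$(ls "$demo_dir")
explicit=$(ls -1 "$demo_dir")
[ "$plain" = "$explicit" ] && echo "ls and ls -1 agree when not on a terminal"

rm -rf "$demo_dir"
```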

Oh, and BTW @Eric, the standard advice to new members:
Please in the future use CODE tags around copied/pasted computer text in a post. It is the # button in the tool bar of the post editor. When applicable copy/paste complete, that is including the prompt, the command, the output and the next prompt.

I see you use NNTP, which makes it a bit more difficult to adhere to this, but many people will not even try to decipher computer text on the forums without the CODE tags. IIRC, you can create them by typing [ CODE] and [ /CODE] (without the spaces I added to “disarm” the tags) before and after the copied/pasted text.

Maybe a bit of further explanation of what OrsoBruno says:

When you type this in bash

ls | grep D*

the shell will first do a lot of things to it, such as word splitting and several expansions. One of these is “Pathname expansion”. This results in D* being replaced by the names of the files in the working directory that start with a D. After this expansion the line will be

ls | grep Desktop Documents Downloads

ls will run (as ls -1) and pipe its output to grep.
grep will run and its first argument is Desktop, which it will take as the string to search for. It then finds argument #2, which is Documents; this means: search for Desktop in Documents (it will thus not read standard input, and thus the output of ls goes into the black hole!). grep will then find out that Documents is a directory and display the error message you see. Likewise for Downloads.

In short, you forgot that what you type is first interpreted by the shell (and the shell has no knowledge of what the commands are going to do). After that, the commands are started and the arguments are given to them (and the commands have no knowledge of how those arguments originated).
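
The expansion step can be made visible by letting the shell print the argument list instead of handing it to grep. A minimal sketch, assuming a scratch directory made with mktemp that holds three directories named like those in this thread:

```shell
# Work in a throwaway directory so the result of the glob is predictable.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkdir Desktop Documents Downloads

# The shell expands D* before any command starts; printf shows
# every resulting argument on its own line.
expanded=$(printf '%s\n' grep D*)
printf 'unquoted:\n%s\n' "$expanded"

# Quoting suppresses pathname expansion, so the pattern arrives intact.
quoted=$(printf '%s\n' grep 'D*')
printf 'quoted:\n%s\n' "$quoted"

# Clean up.
cd / && rm -rf "$demo_dir"
```

The unquoted case shows grep followed by three directory names, which is exactly the argument list grep received in the original post.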

In general, when you want a text to be offered unchanged as an argument to a program, it is best to quote it. Thus in this case you can give D* to grep:

ls | grep 'D*'

But:
Because grep searches for a string “somewhere” in a line, it is of no use to have a pattern like D* when you mean D with zero or more characters after it. When the D is somewhere in the line: Bingo! And thus

ls | grep 'D'

is sufficient (and you can in this case drop the quoting).
It is however possible to tell grep that the line must start with D:

ls | grep '^D'
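
The difference between an unanchored and an anchored pattern shows up as soon as a name has a D that is not at the start. A small sketch, using a fixed list of names (borrowed from listings in this thread) as a stand-in for real ls output:

```shell
# Stand-in for ls output: one name per line.
names=$(printf '%s\n' bin Desktop Documents 'Google Drive' Downloads)

# Unanchored: every line that contains a D anywhere matches,
# including "Google Drive".
anywhere=$(printf '%s\n' "$names" | grep 'D')

# Anchored with ^: only lines that begin with D match.
at_start=$(printf '%s\n' "$names" | grep '^D')

printf 'anywhere:\n%s\n\nat start:\n%s\n' "$anywhere" "$at_start"
```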

HTH

On Tue, 05 Jan 2016 06:30:23 +0000, Eric wrote:

> eric:~> ls | grep Do*

What you’re seeing here is shell filename expansion - grep doesn’t
require a wildcard like that (and as a regex, it would not match what you
expect - it would match D, Do, Doo, Dooo, and so on).

ls | grep Do

would be sufficient to your need here. What’s happening with yours is
the shell is expanding the wildcard to match filenames (using filename
matches rather than a regex), and then trying to use that as a
parameter. What it’s expanding to is:

ls | grep Documents Downloads

And grep is then trying to process “Downloads” as a filename (the second
parameter to grep), and since it’s a directory, it’s telling you it can’t
grep it, because it’s a directory and not a file.
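
To see the two readings of Do* side by side, the same candidate words can be tested once against the regex and once against the shell pattern. This is just an illustrative sketch; the word list is invented, and grep -x is used only to force a whole-line match so the regex meaning is visible:

```shell
# As a regex, Do* means "D followed by zero or more o".
# -x requires the whole line to match the pattern.
regex_hits=$(printf '%s\n' D Do Doo Documents Downloads | grep -x 'Do*')
printf 'regex matches:\n%s\n' "$regex_hits"

# As a shell pattern, Do* means "Do followed by anything".
glob_hits=""
for name in D Do Doo Documents Downloads; do
    case $name in
        Do*) glob_hits="$glob_hits $name" ;;
    esac
done
echo "shell pattern matches:$glob_hits"
```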

Jim

Jim Henderson
openSUSE Forums Administrator
Forum Use Terms & Conditions at http://tinyurl.com/openSUSE-T-C

On 01/05/2016 08:18 AM, Jim Henderson wrote:
> On Tue, 05 Jan 2016 06:30:23 +0000, Eric wrote:
>
>> eric:~> ls | grep Do*
>
> What you’re seeing here is shell filename expansion - grep doesn’t
> require a wildcard like that (and as a regex, it would not match what you
> expect - it would match D, Do, Doo, Dooo, and so on).
>
> ls | grep Do
>
> would be sufficient to your need here. What’s happening with yours is
> the shell is expanding the wildcard to match filenames (using filename
> matches rather than a regex), and then trying to use that as a
> parameter. What it’s expanding to is:
>
> ls | grep Documents Downloads
>
> And grep is then trying to process “Downloads” as a filename (the second
> parameter to grep), and since it’s a directory, it’s telling you it can’t
> grep it, because it’s a directory and not a file.
>
> Jim
>

Thank you all for the replies.

I just did not need the “*” in my search string. Without it I get the
results I was looking for.

Eric

Thank you Henk for your explanation, that is completely satisfactory.

I suspected that shell expansion could get in the way, but when I tried using

ls|grep D\*

I had an even more confusing output (that I still don’t understand):

cristiano@xmper8q3:~> ls|grep D\*
bin
Copy
Desktop
Documents
Downloads
Dropbox
Google Drive
horizon-log_old.tar.gz
horizon-log.tar.gz
Music
Pictures
Projects
Public
public_html
Rimborsi_20140327_FABI.pdf
SoftMaker
Templates
tmp
tor-browser_en-US
Videos
wmsystemtray.kwinrule 

How can e.g. ‘bin’ and ‘Copy’ be a result for that regexp?

OTOH, using

ls|grep '^D'

works perfectly.

Cris

Well, as the shell converts D\* into D* and offers that as an argument to grep, you now have to read

man grep

where it says that the first argument (after those starting with a -, which are option arguments) is a regular expression (a Basic Regular Expression by default, an Extended one with -E).

Further down it explains what a regular expression is. Amongst other things it says about * under Repetition:

The preceding item will be matched zero or more times.

Thus grep will display all lines it reads and that have zero or more D in it. Which are all of them.
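
This is easy to verify by counting. A minimal sketch with a made-up list of four names in place of real ls output:

```shell
# Four sample names, only one of which contains a D.
names=$(printf '%s\n' bin Copy Desktop Music)

# 'D*' also matches the empty string, so every line counts as a match.
matched=$(printf '%s\n' "$names" | grep -c 'D*')

# '^D' needs a real D at the start of the line.
started=$(printf '%s\n' "$names" | grep -c '^D')

echo "lines matching D*: $matched"
echo "lines matching ^D: $started"
```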

Take your time, consider each and every step that is taken by the computer (what happens when I type, who reads what I type, how is that interpreted, what is the next step, who interprets what is left over, …). :wink:

Wow, you’re right!
I’m usually good at searching/replacing using RE in editors, but I guess I’m being confused by the shell because of the “usual” meaning of the characters ‘*’ and ‘?’ as wildcards.

Thank you!
Cris

Linux is very nice to us in having Patterns, REs, EREs, … And we all have to know which one is used where. Keeps us attentive.
