Using wget as an offline browser to download all mp3 files from a website

Hi, I am Rupesh from India. I want to download a website using wget for offline viewing, i.e., I want to mirror a website and maintain an exact copy of it on my hard disk. I have installed openSUSE Leap 42.3 with wget and its GUI.

Previously I downloaded the website using an offline browser called Extreme Picture Finder. About 90% of the files I want were downloaded successfully, so I now want to download the remaining 10%.

I have read the wget manual page and gone through some of the tutorials about wget that I found by searching the web. I tried what the tutorials suggested, and I am providing the output of those commands below.

I issued the command below:


wget -c -t 0 --recursive --force-directories   -o logfile.txt ‐‐recursive ‐‐no-clobber ‐‐accept jpg,gif,png,jpeg,mp3,MP3,pdf  

The above command produced the following output:


idn_encode failed (-304): ‘string contains a disallowed character’
idn_encode failed (-304): ‘string contains a disallowed character’
--2017-09-29 18:08:43--  http://%E2%80%90%E2%80%90recursive/
Resolving ‐‐recursive (‐‐recursive)... failed: Name or service not known.
wget: unable to resolve host address ‘‐‐recursive’
idn_encode failed (-304): ‘string contains a disallowed character’
idn_encode failed (-304): ‘string contains a disallowed character’
--2017-09-29 18:08:43--  http://%E2%80%90%E2%80%90no-clobber/
Resolving ‐‐no-clobber (‐‐no-clobber)... failed: Name or service not known.
wget: unable to resolve host address ‘‐‐no-clobber’
idn_encode failed (-304): ‘string contains a disallowed character’
idn_encode failed (-304): ‘string contains a disallowed character’
--2017-09-29 18:08:43--  http://%E2%80%90%E2%80%90accept/
Resolving ‐‐accept (‐‐accept)... failed: Name or service not known.
wget: unable to resolve host address ‘‐‐accept’
--2017-09-29 18:08:43--  http://jpg,gif,png,jpeg,mp3,mp3,pdf/
Resolving jpg,gif,png,jpeg,mp3,mp3,pdf (jpg,gif,png,jpeg,mp3,mp3,pdf)... failed: Name or service not known.
wget: unable to resolve host address ‘jpg,gif,png,jpeg,mp3,mp3,pdf’
idn_encode failed (-304): ‘string contains a disallowed character’
idn_encode failed (-304): ‘string contains a disallowed character’
--2017-09-29 18:08:43--  http://%E2%80%90%E2%80%90directory-prefix=/mnt/source/downloads/lectures/
Resolving ‐‐directory-prefix= (‐‐directory-prefix=)... failed: Name or service not known.
wget: unable to resolve host address ‘‐‐directory-prefix=’
--2017-09-29 18:08:43--  http://www.pravachanam.com/categorybrowselist/20
Resolving www.pravachanam.com (www.pravachanam.com)... 162.144.54.142
Connecting to www.pravachanam.com (www.pravachanam.com)|162.144.54.142|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘www.pravachanam.com/categorybrowselist/20’

     0K .......... .......... .......... .......... .......... 31.3K
    50K .......... ....                                        1.54M=1.6s

2017-09-29 18:08:46 (40.0 KB/s) - ‘www.pravachanam.com/categorybrowselist/20’ saved [65802]

Loading robots.txt; please ignore errors.
--2017-09-29 18:08:46--  http://www.pravachanam.com/robots.txt
Reusing existing connection to www.pravachanam.com:80.
HTTP request sent, awaiting response... 404 Not Found
2017-09-29 18:08:48 ERROR 404: Not Found.

--2017-09-29 18:08:48--  http://www.pravachanam.com/sites/default/files/favicon.ico
Reusing existing connection to www.pravachanam.com:80.
HTTP request sent, awaiting response... 404 Not Found
2017-09-29 18:08:51 ERROR 404: Not Found.

--2017-09-29 18:08:51--  http://www.pravachanam.com/modules/system/system.base.css?owgg5m
Reusing existing connection to www.pravachanam.com:80.
HTTP request sent, awaiting response... 200 OK
Length: 5428 (5.3K) [text/css]
Saving to: ‘www.pravachanam.com/modules/system/system.base.css?owgg5m’

     0K .....                                                 100% 16.2K=0.3s

2017-09-29 18:08:51 (16.2 KB/s) - ‘www.pravachanam.com/modules/system/system.base.css?owgg5m’ saved [5428/5428]

--2017-09-29 18:08:51--  http://www.pravachanam.com/modules/system/system.menus.css?owgg5m
Reusing existing connection to www.pravachanam.com:80.
HTTP request sent, awaiting response... 200 OK
Length: 2035 (2.0K) [text/css]
Saving to: ‘www.pravachanam.com/modules/system/system.menus.css?owgg5m’

     0K .                                                     100%  236K=0.008s

2017-09-29 18:08:52 (236 KB/s) - ‘www.pravachanam.com/modules/system/system.menus.css?owgg5m’ saved [2035/2035]

--2017-09-29 18:08:52--  http://www.pravachanam.com/modules/system/system.messages.css?owgg5m
Reusing existing connection to www.pravachanam.com:80.
HTTP request sent, awaiting response... 200 OK
Length: 961 [text/css]
Saving to: ‘www.pravachanam.com/modules/system/system.messages.css?owgg5m’

     0K                                                       100%  255M=0s

2017-09-29 18:08:52 (255 MB/s) - ‘www.pravachanam.com/modules/system/system.messages.css?owgg5m’ saved [961/961]

--2017-09-29 18:08:52--  http://www.pravachanam.com/modules/system/system.theme.css?owgg5m
Reusing existing connection to www.pravachanam.com:80.
HTTP request sent, awaiting response... 200 OK
Length: 3711 (3.6K) [text/css]
Saving to: ‘www.pravachanam.com/modules/system/system.theme.css?owgg5m’

     0K ...                                                   100%  374K=0.01s

2017-09-29 18:08:52 (374 KB/s) - ‘www.pravachanam.com/modules/system/system.theme.css?owgg5m’ saved [3711/3711]

--2017-09-29 18:08:52--  http://www.pravachanam.com/sites/all/libraries/mediaelement/build/mediaelementplayer.min.css?owgg5m
Reusing existing connection to www.pravachanam.com:80.
HTTP request sent, awaiting response... 404 Not Found
2017-09-29 18:08:54 ERROR 404: Not Found.

--2017-09-29 18:08:54--  http://www.pravachanam.com/sites/all/modules/views_slideshow/views_slideshow.css?owgg5m
Reusing existing connection to www.pravachanam.com:80.
HTTP request sent, awaiting response... 404 Not Found
2017-09-29 18:08:56 ERROR 404: Not Found.

--2017-09-29 18:08:56--  http://www.pravachanam.com/modules/comment/comment.css?owgg5m
Reusing existing connection to www.pravachanam.com:80.
HTTP request sent, awaiting response... 200 OK
Length: 184 [text/css]
Saving to: ‘www.pravachanam.com/modules/comment/comment.css?owgg5m’

On examining the above output, it is clear that wget is treating the options as website addresses.

After that I issued the command below:


wget ‐‐level=1 ‐‐recursive ‐‐no-parent ‐‐no-clobber   ‐‐accept mp3,MP3  http://www.pravachanam.com/categorybrowselist/20


On executing the above command, it created a file outfile.txt and a directory called www.pravachanam.com under my current directory. wget created some directories, but it did not maintain the same directory structure as the source website.

In outfile.txt I found some lines ending in .mp3, and I tried to locate the corresponding files in the directory created by wget, but I could not find the files or even any directory structure related to the mp3 files.

I have installed and tried gwget, which is GNOME's GUI for wget, and I tried a number of options and settings in it, but it failed to download the site: it downloaded the home page, stopped, and then reported that it had successfully finished downloading the website. The GUI version does not expose all the options available in the command-line version of wget.

Please suggest how to download mp3 files from a website using wget, with the following options:

1) an option to maintain the same directory structure as the source website;
2) an option to skip files that have already been downloaded;
3) since I want to download all mp3 files except folders and files whose names contain certain words such as "xyz", an option to skip any file or folder containing "xyz" in its name;
4) an option to download files recursively without visiting other websites;
5) an option to keep retrying downloads indefinitely in case of network failure;
6) an option to resume files that were previously only partially downloaded;
7) an option to download only mp3 files and reject all other file types, if possible including html, php, and css files.
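For what it's worth, the numbered requirements above map roughly onto wget options as follows. This is only a sketch, not a tested command: the URL is a placeholder, and "xyz" stands for whatever words should be excluded.

```shell
#   1)    --force-directories   keep the source site's directory layout
#   2)+6) --continue            resume partial files; files that are already
#                               complete are not fetched again
#   3)    --reject-regex 'xyz'  skip any URL containing "xyz" (wget >= 1.14)
#   4)    --recursive --no-parent --domains ...   recurse, but stay on one host
#   5)    --tries 0             retry indefinitely on network failure
#   7)    --accept mp3,MP3      keep only .mp3 files
wget --recursive --no-parent --force-directories \
     --continue --tries 0 \
     --reject-regex 'xyz' \
     --domains www.example.com \
     --accept mp3,MP3 \
     http://www.example.com/music/
```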

Many of you may suggest that I read the wget manual page and experiment on my own, but taking advice and help from expert people like you is the key to success. At present I am also reading the wget manuals and guides, but the help provided by you is most valuable. I request as many people as possible to reply to this thread and help me.

Regards,
Rupesh.

Hi
Install and use httrack, which is included in the openSUSE Leap 42.3 release:


zypper in httrack

http://www.httrack.com/

I have used a number of offline browsers previously, including httrack, and all of them behaved the same way: they lack the following capabilities.

  1. They are unable to resume a download once the internet connection comes back; they simply stop downloading when the connection drops.
  2. They do not resume files that were previously partially downloaded; they do not try to fetch the remaining part of a partially downloaded file.

wget does not have the drawbacks mentioned above, so I prefer to use it instead of the others.

Good news… wget is the perfect tool for this job. As you well know, since you've been asking people to write you a script for this for two months now on other forums. Did you pay attention to the error you got? It's telling you that you didn't give it a URL, and it's trying to use one of your switches as one. And failing.

If you spent 5 minutes reading the man page, you would see the proper syntax and get it done. Building a list of the missing files using diff is trivial. Writing a script to loop through that list is ALSO trivial. But you have done nothing to work toward your goal, except post back "please help, I don't understand anything" every time you are told to learn on your own. What is stopping you?

All, please view this poster's questions on LinuxQuestions, LinuxForums, Neowin, and the Ubuntu forums: same user name, same posting pattern. Before wasting time trying to get the poster started, know that it probably will not work. Others have tried for years.

I have now issued the wget command as below, and I am providing the exact command with its options, along with some of its output.


linux-ps66:~ # wget -c -t 0 -v --recursive --force-directories ‐‐recursive ‐‐no-clobber ‐‐accept jpg,gif,png,jpeg,mp3,MP3,pdf  ‐‐directory-prefix=/mnt/source/downloads/lectures/   http://www.pravachanam.com/categorybrowselist/20q
idn_encode failed (-304): ‘string contains a disallowed character’
idn_encode failed (-304): ‘string contains a disallowed character’
--2017-10-02 00:19:52--  http://%E2%80%90%E2%80%90recursive/
Resolving ‐‐recursive (‐‐recursive)... failed: Name or service not known.
wget: unable to resolve host address ‘‐‐recursive’
idn_encode failed (-304): ‘string contains a disallowed character’
idn_encode failed (-304): ‘string contains a disallowed character’
--2017-10-02 00:19:52--  http://%E2%80%90%E2%80%90no-clobber/
Resolving ‐‐no-clobber (‐‐no-clobber)... failed: Name or service not known.
wget: unable to resolve host address ‘‐‐no-clobber’
idn_encode failed (-304): ‘string contains a disallowed character’
idn_encode failed (-304): ‘string contains a disallowed character’
--2017-10-02 00:19:52--  http://%E2%80%90%E2%80%90accept/
Resolving ‐‐accept (‐‐accept)... failed: Name or service not known.
wget: unable to resolve host address ‘‐‐accept’
--2017-10-02 00:19:52--  http://jpg,gif,png,jpeg,mp3,mp3,pdf/
Resolving jpg,gif,png,jpeg,mp3,mp3,pdf (jpg,gif,png,jpeg,mp3,mp3,pdf)... failed: Name or service not known.
wget: unable to resolve host address ‘jpg,gif,png,jpeg,mp3,mp3,pdf’
idn_encode failed (-304): ‘string contains a disallowed character’
idn_encode failed (-304): ‘string contains a disallowed character’
--2017-10-02 00:19:52--  http://%E2%80%90%E2%80%90directory-prefix=/mnt/source/downloads/lectures/
Resolving ‐‐directory-prefix= (‐‐directory-prefix=)... failed: Name or service not known.
wget: unable to resolve host address ‘‐‐directory-prefix=’
--2017-10-02 00:19:52--  http://www.pravachanam.com/categorybrowselist/20q
Resolving www.pravachanam.com (www.pravachanam.com)... 162.144.54.142
Connecting to www.pravachanam.com (www.pravachanam.com)|162.144.54.142|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘www.pravachanam.com/categorybrowselist/20q’

www.pravachanam.com/categorybrowselis        <=>                                                                  ]  64.28K  97.2KB/s    in 0.7s    

2017-10-02 00:19:55 (97.2 KB/s) - ‘www.pravachanam.com/categorybrowselist/20q’ saved [65824]

In the above output I mostly see the word "failed". I also see the message "string contains a disallowed character". What do these mean? At least the tool is now trying to download something; previously even that did not happen.

Please examine the above command and suggest a way to avoid these errors.

After some time I examined the directory /mnt/source/downloads/lectures/, but it has no contents at all, i.e., it is empty. Where is wget storing the downloaded files?

Check that your options start with a double dash and not with characters that merely look like a double dash (a common issue when copy-pasting). In your case you are using the Unicode U+2010 (HYPHEN) character instead of the plain old ASCII 0x2D hyphen-minus.
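A quick way to check for this yourself (a sketch, assuming standard POSIX tools; the file name and sample command are made up) is to save the command to a file and count the bytes that are not plain printable ASCII:

```shell
# Save a sample command containing two U+2010 hyphens
# (octal bytes 342 200 220, i.e. UTF-8 e2 80 90).
printf 'wget \342\200\220\342\200\220recursive http://example.com/\n' > cmd.txt

# Delete every printable ASCII byte and the newline; anything left is suspect.
# Two U+2010 characters are 3 UTF-8 bytes each, so 6 bytes remain here.
tr -d ' -~\n' < cmd.txt | wc -c
```

A count of 0 means the command is pure ASCII; anything else points at a copy-paste look-alike character.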

Sorry, I tried to post this yesterday, but something went wrong. Now, in addition to what @avidjaar says: below you can see that you sometimes do not type (or otherwise insert) - in your command, but something different. Since you do that often, it could very well explain a lot of the other problems you encounter.

I copied your command from the post above:

wget -c -t 0 -v --recursive --force-directories ‐‐recursive ‐‐no-clobber ‐‐accept jpg,gif,png,jpeg,mp3,MP3,pdf  ‐‐directory-prefix=/mnt/source/downloads/lectures/   http://www.pravachanam.com/categorybrowselist/20q

and put it in a file. When I list all the characters of the file with od, it shows several strange characters in the command:

0000000   w   g   e   t       -   c       -   t       0       -   v    
0000020   -   -   r   e   c   u   r   s   i   v   e       -   -   f   o
0000040   r   c   e   -   d   i   r   e   c   t   o   r   i   e   s    
0000060 342 200 220 342 200 220   r   e   c   u   r   s   i   v   e    
0000100 342 200 220 342 200 220   n   o   -   c   l   o   b   b   e   r
0000120     342 200 220 342 200 220   a   c   c   e   p   t       j   p
0000140   g   ,   g   i   f   ,   p   n   g   ,   j   p   e   g   ,   m
0000160   p   3   ,   M   P   3   ,   p   d   f         342 200 220 342
0000200 200 220   d   i   r   e   c   t   o   r   y   -   p   r   e   f
0000220   i   x   =   /   m   n   t   /   s   o   u   r   c   e   /   d
0000240   o   w   n   l   o   a   d   s   /   l   e   c   t   u   r   e
0000260   s   /               h   t   t   p   :   /   /   w   w   w   .
0000300   p   r   a   v   a   c   h   a   n   a   m   .   c   o   m   /
0000320   c   a   t   e   g   o   r   y   b   r   o   w   s   e   l   i
0000340   s   t   /   2   0   q  

0000347

As you can see, the octal sequence

342 200 220 342 200 220

appears several times. In hex that is

e2 80 90 e2 80 90

(the UTF-8 encoding of two U+2010 HYPHEN characters), and that is what you see percent-encoded in your error messages as

/%E2%80%90%E2%80%90

They are not the

--

that should be there.
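You can see the difference for yourself by dumping both variants with od (a small sketch; any POSIX shell will do):

```shell
# A plain ASCII double hyphen-minus is two 0x2d bytes:
printf -- '--' | od -An -tx1
# shows the bytes: 2d 2d

# Two U+2010 HYPHEN characters (written here as octal escapes)
# are six bytes, e2 80 90 twice:
printf '\342\200\220\342\200\220' | od -An -tx1
# shows the bytes: e2 80 90 e2 80 90
```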

Please tell me what to do now.

Isn’t that clear? You should type the command fresh on your keyboard and not copy/paste it from some untrusted source.

READ THE MAN PAGE AND PAY ATTENTION!!!

Sorry, but really!!! You’re putting --recursive in twice, you didn’t specify a URL, and you aren’t bothering to actually try anything. You’re just asking someone to tell you “type this in and press ENTER, and your problem is solved”.

Please tone down. The fact that --recursive appears twice does not do much harm. As for the URL: initially there was none, but there is one now. The use of the wrong type of - is the main cause of this command failing. If, after carefully reading what the other posts say about U+2010 being used instead of U+002D, you think those posters are wrong, then please post your arguments against that analysis.

If you cannot answer in a civil way, please don’t. You are not obliged to answer. When you are frustrated, go somewhere else, take a walk, drink a beer, whatever.

Understood, and my apologies, but this person has been posting about this for two months on various forums. They’ve been given many solutions, but they don’t want to put any effort into reading a man page. What they’re looking for is someone to write a script for them. Take a look here:

…where he explicitly says he’ll just keep changing the wording until someone helps him. He’s been doing this same stuff for at least 5 years…

Let’s see if OP now does what was promised :slight_smile:

If you at least try to explain why it is treating the options as website addresses, I will stop posting anything related to mp3, the internet, etc.

I see that you never started a thread in these forums because you needed help, nor did you ever post to help other members here. My conclusion is that you joined these forums for the sole reason of posting in rupeshforu3’s threads. You may not be happy with his behaviour, or otherwise not be a fan of him, but is that a reason to spend so much effort and time following him through all the forums where he posts?

My advice is that you stop spending time on this and go for more worthy actions in life. In short: forget him.

In any case, the staff here will not tolerate any further the way you post, with all that shouting in bold, etc. We think we are capable enough of managing these forums ourselves.

I have now typed the whole command, including its options and arguments, by hand, and it still did not succeed. I am providing what I typed and its output below.


linux-ps66:~ # wget --convert-links -c -t 0  -v --recursive --force-directories --no-clobber --accept mp3 --directory-prefix=/root/temp http://www.pravachanam.com/categorybrowselist/20
Both --no-clobber and --convert-links were specified, only --convert-links will be used.
--2017-10-02 22:30:48--  http://www.pravachanam.com/categorybrowselist/20
Resolving www.pravachanam.com (www.pravachanam.com)... 162.144.54.142
Connecting to www.pravachanam.com (www.pravachanam.com)|162.144.54.142|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘/root/temp/www.pravachanam.com/categorybrowselist/20.tmp’

www.pravachanam.com/categorybrowselis              <=>                                                            ]  64.26K  18.5KB/s    in 3.5s    

2017-10-02 22:30:53 (18.5 KB/s) - ‘/root/temp/www.pravachanam.com/categorybrowselist/20.tmp’ saved [65802]

Loading robots.txt; please ignore errors.
--2017-10-02 22:30:53--  http://www.pravachanam.com/robots.txt
Reusing existing connection to www.pravachanam.com:80.
HTTP request sent, awaiting response... 200 OK
Length: 1479 (1.4K) [text/plain]
Saving to: ‘/root/temp/www.pravachanam.com/robots.txt.tmp’

www.pravachanam.com/robots.txt.tmp    100%=======================================================================>]   1.44K  --.-KB/s    in 0.003s  

2017-10-02 22:30:54 (437 KB/s) - ‘/root/temp/www.pravachanam.com/robots.txt.tmp’ saved [1479/1479]

Removing /root/temp/www.pravachanam.com/categorybrowselist/20.tmp since it should be rejected.

--2017-10-02 22:30:54--  http://www.pravachanam.com/
Reusing existing connection to www.pravachanam.com:80.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘/root/temp/www.pravachanam.com/index.html.tmp’

www.pravachanam.com/index.html.tmp              <=>                                                               ]  50.83K  19.2KB/s    in 2.7s    

2017-10-02 22:30:57 (19.2 KB/s) - ‘/root/temp/www.pravachanam.com/index.html.tmp’ saved [52054]

Removing /root/temp/www.pravachanam.com/index.html.tmp since it should be rejected.

FINISHED --2017-10-02 22:30:57--
Total wall clock time: 9.4s
Downloaded: 3 files, 117K in 6.1s (19.0 KB/s)
Converted links in 0 files in 0 seconds.
linux-ps66:~ # 

Sometimes wget runs for more than an hour, sometimes for more than 20 minutes, and sometimes for less than a minute. Most of the time it displays messages such as "failed", "something went wrong", etc.

I have read the manual page very carefully and wrote the above command, but it still failed. I can’t understand what’s going wrong, i.e., whether wget is buggy. Every time it either displayed "failed" or stopped after 30 seconds, and I can’t explain how frustrated and depressed I felt. From now on I am going to stop experimenting with wget. In the open source world some people write very complex tools that are difficult to understand. When I can’t understand the functioning of these tools and ask a question, I immediately get the response "you are an idiot" from the lovers of free and open source software.

Try to run the command which I have provided and see whether you succeed or fail. If you fail, please let me know why; if you succeed, please let me know how.

I hope you see that you no longer get the error messages you had earlier. I also hope you understand what the difference is between your earlier command and the command above, and why those error messages are gone. It is important to learn from your errors.

So, now your command seems to be syntactically correct. It starts running and does function: there is no error message of any kind, and it downloads some files.

Why it thinks everything is done, I do not know, because for that you would have to know what is on the other side, and I do not.

That won’t work for HTTP the way you probably expect. In HTTP there is no difference between a directory and a file: everything is just a URL. This means wget will ignore everything that does not end in mp3 on each page it downloads, so it won’t really follow links recursively. This is different from FTP, where wget knows how to traverse directories.

One exception is documents ending in html, which are treated as a sort of “directory” and are always downloaded.

This is actually mentioned in wget documentation, although probably not prominently enough; some examples would be useful as well.
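As a concrete illustration of the point above: suffix-based --accept rules reject this site’s extensionless page URLs, so matching on the whole URL with --accept-regex is one way around it. This is only a sketch, not a tested command; the regex is a guess based on the “…browselist” listing paths visible on this site.

```shell
# Allow the "...browselist" listing pages (so recursion can continue) plus
# anything ending in .mp3; everything else is skipped. Requires wget >= 1.14.
wget --recursive --no-parent --force-directories \
     --continue --tries 0 \
     --accept-regex '(browselist|\.mp3$)' \
     http://www.pravachanam.com/categorybrowselist/20
```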

I have read the manual page very carefully

The manual page is a reference page. You should always read the full documentation if it is available, not limit yourself to the manual page.

I can’t understand what’s going wrong

Well, with the debug option wget tells you quite clearly that it ignores URLs on the page because they do not match your accept/reject rules…

When I can’t understand the functioning of these tools I may ask a question then immediately from the lovers of free and open source software I will get response as “you are an idiot”.

The only word “idiot” in this thread is the one used by you.

Try to run the command which I have provided at present and see if whether you succeed or fail.

Why do you expect someone to do your homework for you?

The tool displayed the message “Removing /root/temp/www.pravachanam.com/index.html.tmp since it should be rejected.”

Previously I added html,php,css,mp3 to the wget accept list and still got the same output as above. Can you suggest how to download the files properly?

Now look here.

I have a directory ~/test/rupesh:

henk@boven:~/test/rupesh> ls
command  www.pravachanam.com
henk@boven:~/test/rupesh> cat command 
wget --convert-links -c -t 0  -v --recursive --force-directories http://www.pravachanam.com/categorybrowselist/20
henk@boven:~/test/rupesh>

And I am running:

. command

which will run what is in command.
It created the directory of the website, and it is filling it as I type. At this moment:

henk@boven:~/test/rupesh> ls -l www.pravachanam.com/
totaal 1252
drwxr-xr-x 229 henk wij  4096 11 okt 18:42 albumfilesbrowselist
drwxr-xr-x   2 henk wij  4096 11 okt 18:38 categorybrowselist
drwxr-xr-x   2 henk wij  4096 11 okt 18:36 comment
-rw-r--r--   1 henk wij 60244 11 okt 18:34 comments
-rw-r--r--   1 henk wij 60666 11 okt 18:36 comments?page=1
-rw-r--r--   1 henk wij 59928 11 okt 18:36 comments?page=2
-rw-r--r--   1 henk wij 61534 11 okt 18:36 comments?page=3
-rw-r--r--   1 henk wij 60999 11 okt 18:36 comments?page=4
-rw-r--r--   1 henk wij 45471 11 okt 18:36 comments?page=44
-rw-r--r--   1 henk wij 60349 11 okt 18:36 comments?page=5
-rw-r--r--   1 henk wij 60662 11 okt 18:36 comments?page=6
-rw-r--r--   1 henk wij 60363 11 okt 18:36 comments?page=7
-rw-r--r--   1 henk wij 61862 11 okt 18:36 comments?page=8
-rw-r--r--   1 henk wij 34658 11 okt 18:34 contact
drwxr-xr-x   2 henk wij  4096 11 okt 18:37 content
-rw-r--r--   1 henk wij 44620 11 okt 18:34 devotional-links2
-rw-r--r--   1 henk wij 37065 11 okt 18:36 devotional-links2?page=1
-rw-r--r--   1 henk wij 52185 11 okt 18:34 events_all
-rw-r--r--   1 henk wij 52233 11 okt 18:37 events_all?qt-events=0
-rw-r--r--   1 henk wij 52233 11 okt 18:37 events_all?qt-events=1
drwxr-xr-x   2 henk wij  4096 11 okt 18:36 filter
drwxr-xr-x   2 henk wij  4096 11 okt 18:34 images
-rw-r--r--   1 henk wij 46064 11 okt 18:34 index.html
-rw-r--r--   1 henk wij 41499 11 okt 18:34 latest30days
-rw-r--r--   1 henk wij 41585 11 okt 18:36 latest30days?order=changed&sort=asc
drwxr-xr-x   2 henk wij  4096 11 okt 18:36 node
drwxr-xr-x  47 henk wij  4096 11 okt 18:38 pravachanambrowselist
-rw-r--r--   1 henk wij  1479  5 mrt  2017 robots.txt
drwxr-xr-x   4 henk wij  4096 11 okt 18:34 sites
drwxr-xr-x  38 henk wij  4096 11 okt 18:36 speakerbrowselist
-rw-r--r--   1 henk wij 56203 11 okt 18:34 v_books
-rw-r--r--   1 henk wij 48780 11 okt 18:34 vedartha_sangraham
-rw-r--r--   1 henk wij 49370 11 okt 18:37 vedartha_sangraham?page=1
-rw-r--r--   1 henk wij 44521 11 okt 18:37 vedartha_sangraham?page=2
henk@boven:~/test/rupesh> 

It seems to work like a charm.