I have an application that creates a number of files. The files are reasonably large, at between 4 and 11 MB. The entire directory tree may contain up to 1500 or so files of this size. There is another application that would add another 1500 files of about half the size. The script that runs this is failing because at some point in the process bash is not able to access files that should exist. Since the error arises late in the process, I am making the assumption that the issue is the file size. There is plenty of space in the partition, and if I decrease the total number of files that are on the system at any one time, the errors go away.
I am just trying to track down what the issue is and what the limitations may be. I am not that familiar with ext4 file systems.
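In case it matters, I can check whether the partition is running out of inodes rather than space with something like this (the mount point is a placeholder for where the files live):
df -h /srv/data    # free space on the partition (placeholder mount point)
df -i /srv/data    # free inodes; ext4 can exhaust inodes even with space left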
There might be a limit, but if speed is what you’re after then the shell is not the tool you should be using. Care to tell us why that script is creating a lot of files? (What type of files?)
And please, when you get an error message, copy/paste it here in a post (between CODE tags). When you want others to interpret such a message, they must see it, not read your vague interpretation of it.
> I am just trying to track down what the issue is and what the
> limitations may be. I am not that familiar with ext4 file systems.
There is no directory size limit. It may become slower, yes. Other
filesystem types cope better: you can easily place a million files in a
single reiserfs directory at top speed.
The limit is typically inside your script: there is a limit on the
command-line size you get when you try to get a directory listing with
something like “ls *”.
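For example, you can check that limit and count the files without expanding a glob at all (the directory path is just an example):
getconf ARG_MAX                                  # maximum size of the argument list, in bytes
find /path/to/dir -maxdepth 1 -type f | wc -l    # count files without putting them on a command line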
–
Cheers / Saludos,
Carlos E. R.
(from 13.1 x86_64 “Bottle” at Telcontar)
Hmmm, OK, you might hit some brick wall if you add the & after the dd inside the for loop.
WARNING: Don’t attempt this if you have a lot of workload going on or some mission-critical work.
for ((i=0;i<3000;i++)); do dd if=/dev/zero of=$i bs=1M count=11 & done
That will run dd in an “asynchronous” mode, i.e. it will not wait for one dd process to finish before it starts another, so imagine 3000 dd processes running at the same time; that could send your system down the rabbit hole.
> So that concludes my test case. In the end, 3000 files of 11 MB each is
> not an issue, even for the shell
But your file names are short, just 1, 2, 3, if I got it right. If the
length of the file names is 12 chars, 3000 of them (plus a separator each)
make for 39,000 chars. I think the limit is 64 KB, perhaps 128 K. I don’t remember.
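You can measure it directly; echo is a shell builtin, so it is not subject to that limit itself:
echo * | wc -c    # size in bytes of the expanded file list in the current directory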
–
Cheers / Saludos,
Carlos E. R.
(from 13.1 x86_64 “Bottle” at Telcontar)
You can launch, say, four such processes, then issue a “wait” for them to
finish before the loop continues. This is faster when the bottleneck is the
processor, but not when it is the hard disk, as is probably the case here.
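A minimal sketch of that batching idea, assuming the same dd test as above:
for ((i=0;i<3000;i++)); do
  dd if=/dev/zero of=$i bs=1M count=11 &
  (( i % 4 == 3 )) && wait    # after every fourth background job, wait for the batch to finish
done
wait                          # catch any jobs still running at the end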
Then you can do it for a million files, and compare ext4, xfs, btrfs,
reiserfs… I crashed btrfs with this test. Others hit a limit at some
number, but not reiserfs: it is both faster and limitless.
I don’t remember whether xfs had an issue or not, because its inodes are
dynamic, or something like that. I think the issue was block size.
–
Cheers / Saludos,
Carlos E. R.
(from 13.1 x86_64 “Bottle” at Telcontar)
Maybe it is a limit on the number of files you can have open at any given time.
Please check with ulimit -n, and do this inside the very shell session that has the problems, as those limits are user dependent.
ulimit -a will give you a list of all the limits in effect at the time.
On my 12.3, the output of ulimit -n is 1024, which is the default, as I never changed it.
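For example, from inside the failing session (raising the limit only works up to the hard limit):
ulimit -n          # open-file-descriptor limit for this shell
ulimit -a          # all limits for this shell
ulimit -n 4096     # raise the soft limit for this session, if the hard limit allows it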
I think the shell has a 32K character limit. With 3000 files, roughly 10 characters per file name is about where one would run into glob trouble. We recompiled our shell at work to significantly raise the maximum line size for a large project (different distro, though).
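One way to stay clear of that is to never put the whole list on one command line, for example (the path and ‘process_one’ are stand-ins, not real names from this thread):
find /path/to/files -maxdepth 1 -type f -print0 | xargs -0 -n 1 process_one    # hand files over one at a time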
My apologies for being away yesterday. It appears I missed quite a bit of activity here.
The files are text files which I am calling input files here for the sake of reference. The input files are generated by modeling software and processed by statistical software. The statistical software creates one output file for each input file that is processed. All of this currently runs out of a bash script.
The input files are larger than I have used before and are running at about 11 MB. Each output file is running at 4 MB. The run I had going was supposed to generate 1500 input files and another 2000 output files. That would be 16.5 GB for the input files and another 6 GB for the generated output files (unless I am doing my math wrong). There is plenty of free space on the partition.
I cannot find the log file that was generated from that run; I likely overwrote it. The error was ‘cannot access file, file not found’. I have solved the issue for now by restructuring the script to work in smaller batches and delete files as it goes. Most of what is generated is not needed, but it needs to be evaluated before it is discarded. Evaluation used to be done at the end, but it was not difficult to break things up to evaluate and clean periodically as the script progresses. The script now runs to completion without error.
I don’t like to leave things unresolved like that and assume the issues have been addressed when I have not confirmed it, so I thought I would post. Since the files that were not found were clearly in the locations where the script was looking for them, I assumed that something had run out of space (I also checked permissions and such first).
The total number of files that are being managed by the shell at any one time is not that large. The shell creates a list that is passed to the C++ modeling software. This software generates one output file for each entry in the list. None of this is being managed in the shell. In this case, there were 1500 entries on the list, so the modeling software produced 1500 files. The 1500 are divided over 10 directories with 150 files each. For processing, the shell loops over the 10 directories. For each directory, a list is made of the input files, and then one input file at a time is passed to the stats app. There shouldn’t be any particularly long lists in scope at any one point. This is the main reason I thought the issue was some fundamental limitation in the file system.
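To illustrate the structure, the processing loop looks roughly like this (directory names, the .inp extension, and ‘stats_app’ are placeholders, not the real script):
for dir in run/{01..10}; do
    # list only this directory's input files
    find "$dir" -maxdepth 1 -type f -name '*.inp' | sort > "$dir/input.list"
    while IFS= read -r infile; do
        # sanity check before handing the file to the stats application
        [ -r "$infile" ] || { echo "cannot access $infile" >&2; continue; }
        stats_app "$infile" "${infile%.inp}.out"    # one input file at a time
    done < "$dir/input.list"
done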
I will find the version of the script that I was using and post it if that would help.