Write the long string to a file and then iterate over that file’s contents
rather than trying to use that as one big list of parameters.
Alternatively, echo the long list and capture it in your script with
something like the ‘read’ command so that you can iterate over the
list that way.
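A minimal sketch of the first suggestion, assuming the long list has already been written one item per line to a hypothetical file named biglist.txt:

```shell
#!/bin/sh
# Illustrative stand-in for your long string, one item per line:
printf '%s\n' alpha beta gamma > biglist.txt

# Iterate over the file's contents instead of expanding one huge
# parameter list on the command line.
count=0
while IFS= read -r item; do
    count=$((count + 1))
    printf 'processing: %s\n' "$item"
done < biglist.txt

printf 'handled %d items\n' "$count"
rm -f biglist.txt
```

Because the data is read line by line, nothing ever has to fit inside a single command line.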
–
Good luck.
Although I have no formal instruction/education in BASH data size limits, I’ve encountered what you describe a few times…
A simple Google search suggests that “ulimit” will display the relevant limits.
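For a quick check (assuming a Linux box where getconf is available):

```shell
# Show the current shell's resource limits:
ulimit -a
# The kernel's limit on the space for argv plus the environment, in bytes:
getconf ARG_MAX
```

Note that ulimit reports per-process resource limits (file size, open files, and so on), while ARG_MAX is the separate kernel limit that produces “Argument list too long” errors.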
In my own words,
What I found was that if the data is processed through an interactive shell (or, in your case, the equivalent: another app like Kate, which reads the data completely before handing the entire data set to your main script), the data is processed as a batch job.
If this is actually what is happening to you, then the links above (particularly the second) describe how to change the buffer size limits. In my case, I decided instead to feed the data directly to the script without going through the intermediate step of an interactive shell, so that the data is streamed and processed FIFO instead of as a batch job. So, for instance, you could do what I opted to do and feed your data directly into your processing script without first opening it in Kate. Or, at least if I understand what your script is trying to do, design your script to store patterns more efficiently before de-duplicating; maybe even use a distributed dedup script instead of a home-made one.
Hash the first entry and write it to an array,
then loop the following:
hash the next entry; if it doesn’t match anything in the array, append the new hash to the array.
If it matches, then you know you have a dupe, so you can do what you want (create a new list? remove it immediately?).
Compared to storing the original full-text strings in your comparison list, my process requires more work up front, but as the enormous list grows, I’d expect comparing fixed-size hashes to become much faster at some point.
Number of arguments and maximum length of one argument
At least on Linux 2.6, there’s also a limit on the maximum number of arguments in argv[].
On Linux 2.6.14, the function do_execve() in fs/exec.c tests if the number exceeds (PAGE_SIZE*MAX_ARG_PAGES - sizeof(void *)) / sizeof(void *). On a 32-bit Linux, this is ARG_MAX/4-1 (32767). This becomes relevant if the average length of arguments is smaller than 4. Since Linux 2.6.23, this function tests if the number exceeds MAX_ARG_STRINGS in <linux/binfmts.h> (2^32-1 = 4294967295).
And as an additional limit since 2.6.23, one argument must not be longer than MAX_ARG_STRLEN (131072 bytes).
This might become relevant if you generate a long call like “sh -c ‘automatically generated command with many arguments’”.
(pointed out by Xan Lopez and Ralf Wildenhues)
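Where these limits bite, a common workaround is to let xargs split the input into as many command lines as will fit; a small illustration with a hypothetical names.txt:

```shell
# xargs batches its input into repeated invocations of the command.
# Here -n 2 forces tiny batches so the splitting is visible; in real
# use xargs picks batch sizes that stay under the system's limits.
printf '%s\n' file1 file2 file3 > names.txt
xargs -n 2 echo < names.txt
rm -f names.txt
```

With -n 2, the three names are delivered as two echo invocations (two names, then one) instead of one over-long command line.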
What ab and I were suggesting is not to invoke kate at all: bypass that step entirely.
Is there some reason why you’re processing your data through kate?
Yes.
I use a script to help me when I install/reinstall Linux on a computer, so I load all the scripts and all the config files at once. That way I have everything at hand if things go wrong.