My computers, files and backups are a mess. I have everything somewhere but can’t easily find it. In addition to large hard drives on three computers, I have several external hard drives and thumb drives with backups going back to 1992. My major project for this year is to organize it all.
My thought is to index all of the media so I can first sort each by year. Will Dolphin do this or do I need a more powerful indexing/search application?
Can I add tags in batches (for example Pictures, Tax Records, Manuals, Computer, Accounting, Personal, etc.), or is it a one-file-at-a-time process? Do the tags stay with the file if it is moved to another system? What about Comments?
Comments and suggestions from anyone who has actually done this would be very welcome.
I use Recoll to manage a library of technical PDF documents; it's a fantastic piece of software. At least one of the moderators on here uses it instead of Baloo/Dolphin for system search. I can't say any more on your particular use case.
Recoll by itself does not have any file management capabilities.
I think that there are two possible approaches which may be of help:
Use the Recoll KIO. This is a KDE plugin, which you can use to perform searches from Dolphin (by typing something like “recoll: some terms” in the address bar). The results are then available for the usual Dolphin file management operations. I’m not too sure of the package availability under OpenSUSE at the moment, but last time I looked, the plugin worked, and I can probably produce a package if needed.
Use the “Export to CSV” feature. This is available by right-clicking the table header when the results are displayed in table mode. It should then be reasonably easy to use the output in a script to do what you want. You will have to be a bit careful about quoting issues though, depending on your file names.
The KIO approach should be more convenient in most cases, but the interface is lacking many things from the Recoll GUI, so which is better probably depends on the use case.
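For the CSV route, a minimal Python sketch could look like the following. It leans on the standard csv module to deal with the quoting issues (commas or quotes in file names). It assumes the export starts with a header row and that one column holds file:// URLs; the column name "Url" and the function name paths_from_export are placeholders for illustration, so check them against an actual export first.

```python
import csv
import io


def paths_from_export(csv_file, url_column="Url"):
    """Return local paths listed in an exported results table.

    Assumptions (check against a real export): the first row is a header,
    and the column named `url_column` holds file:// URLs.
    """
    paths = []
    for row in csv.DictReader(csv_file):
        url = (row.get(url_column) or "").strip()
        if not url:
            continue
        # Strip the scheme only. Whether the rest needs percent-decoding
        # depends on how the URLs were written, so it is left untouched.
        if url.startswith("file://"):
            url = url[len("file://"):]
        paths.append(url)
    return paths


# Small in-memory sample; note the comma inside the quoted file name.
sample = io.StringIO(
    'Title,Url\r\n'
    '"Taxes, 2014","file:///home/me/Taxes, 2014.pdf"\r\n'
    'Notes,file:///home/me/notes.txt\r\n'
)
for path in paths_from_export(sample):
    print(path)
```

With a real file, open it with open(name, newline="", encoding="utf-8") so the csv module sees the line endings unchanged. The resulting list can then be fed to whatever copy, move or tagging step is needed.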
I installed Recoll KIO from the KDE:Extra repository after using Recoll to index my home folder. Presumably (?) KIO uses that index for its search within Dolphin.
Now if I type in Recoll:/<some word or phrase>, the files containing that phrase appear in Dolphin, where they can be managed. It happens lickety-split (translation to standard English = “very fast”).
This indicates to me that Dolphin can act as a file manager for a Recoll index. Is that correct?
If so, what kind of query terms should follow Recoll:/ to extract, for example, all files created in 2014 to Dolphin? I can’t find any documentation on this.
The project I plan is outlined in the first post of this thread. There doesn’t seem to be much interest in organizing old files. That turned out to be a good thing because it forced me to do a lot of searching and reading. I learned a lot that I would like to share with the forum. There may be little interest now, but with terabyte drives and more and more business being conducted online, I think it is a topic that will soon become important to desktop end users. Here are some things I have learned:
Dolphin is a simple lightweight file browser for the home desktop user. It has (deliberately) limited search capabilities to keep it simple.
Konqueror is a much stronger file browser with more powerful search capabilities. It may be suitable for this project. I will give it a try.
Recoll is an amazing piece of software!!! It builds a full-text index with the Xapian search library (not a relational database). It searches for and indexes the content within files, not just metadata (such as date, owner, size, type, tags, etc.). Type in a word or phrase and (once the index has been built) it will find every instance of it in the index, even if it is in an attachment to an email, in Kontact Trash or in a compressed tarball.
Baloo seems to be a simplified version of Recoll, but I don’t find much information about it.
Apparently openSUSE abandoned Strigi/Akonadi/Nepomuk because they were too unwieldy for the average desktop user. Face it, it is going to take some time, CPU cycles and disk space to index a terabyte of data accumulated over 20 years. The average user doesn’t want or need this capability.
I may be absolutely wrong about these observations. If so, I hope one of the experts here will set it right.
How am I going to go about the major organizational task outlined in the first post of this thread?
I am going to use Konqueror to sort everything on the backup external drives by year and save it in a folder for each year.
Delete duplicate files. There will be many because backups overlap. Some of these files were originally saved on floppy disks and copied many times to new computers and backup media. Maybe this can be automated.
Add tags to files to facilitate future data retrieval. Broad category tags should be easy to add in batches.
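The three steps above could probably be scripted. Here is a rough Python sketch, not a finished tool. Assumptions: the modification time stands in for the year (Linux does not reliably expose a creation date, and old copies may have reset it); two files count as duplicates only if their SHA-256 digests match; tags are written to the user.xdg.tags extended attribute, which is where Dolphin/Baloo keep them on Linux. The function names are made up for illustration, and the destination folder should not be inside the source folder.

```python
import hashlib
import os
import shutil
import time
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def organize_by_year(source, dest, dry_run=True):
    """Copy each unique file under `source` into dest/<year>/.

    Returns (copies, duplicates). With dry_run=True nothing is written;
    the two lists only describe what would happen.
    """
    seen = {}     # digest -> first path seen with that content
    used = set()  # target paths already planned
    copies, duplicates = [], []
    for path in sorted(Path(source).rglob("*")):
        if path.is_symlink() or not path.is_file():
            continue
        digest = file_digest(path)
        if digest in seen:
            duplicates.append((path, seen[digest]))  # skipped, not deleted
            continue
        seen[digest] = path
        # Assumption: modification time is the best available "year".
        year = time.localtime(path.stat().st_mtime).tm_year
        target = Path(dest) / str(year) / path.name
        if target in used or target.exists():
            # Same name, different content: keep both by adding a suffix.
            target = target.with_name(
                f"{target.stem}_{digest[:8]}{target.suffix}")
        used.add(target)
        copies.append((path, target))
        if not dry_run:
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 keeps the timestamps
    return copies, duplicates


def merge_tags(current, new_tags):
    """Merge tags into a comma-separated string, keeping order, no repeats."""
    merged = []
    for tag in current.split(",") + list(new_tags):
        tag = tag.strip()
        if tag and tag not in merged:
            merged.append(tag)
    return ",".join(merged)


def add_tags(paths, tags):
    """Batch-tag files via the user.xdg.tags xattr (Linux only).

    Assumption: the filesystem supports user extended attributes.
    """
    for path in paths:
        try:
            current = os.getxattr(path, "user.xdg.tags").decode("utf-8")
        except OSError:
            current = ""  # no tags yet
        merged = merge_tags(current, tags)
        os.setxattr(path, "user.xdg.tags", merged.encode("utf-8"))
```

A cautious way to use it is to run organize_by_year with dry_run=True first and read through the duplicates list before trusting it. It copies rather than moves, so the original backups stay untouched until the result has been checked. Because the tags live in extended attributes, they only survive a move if both the copy tool and the target filesystem preserve them; a FAT-formatted thumb drive, for instance, will drop them.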
I am sure corporate IT people deal with this kind of thing on a much larger scale every day. Any suggestions would be very welcome.