Hi all.
First of all, sorry if this is the wrong place in the forums for this question: I was unsure where it would be most appropriate, especially after it was refused on the mailing lists.
I'm looking for some advice on a test implementation of a distributed filesystem in a computational chemistry student lab. If it is successful, the same solution will be considered for the (small) research lab clusters available.
At the research lab we have been using NFS for quite some time (since the '90s) under SUSE, even for our clusters. The student lab has also usually used similar solutions, but for reasons that are unimportant here a single NFS server won't be an option now. The student lab is also being upgraded at the moment, and will have 6 new i3 computers with 500 GB disks (which will still share the room with 5 old Pentium Ds with 200 GB disks).
The idea that we are considering is the following:
-
The old Pentium Ds will mount an NFS /home from the i3s (with “failure fallback” between them);
-
The i3s will serve the NFS /home for the Pentium Ds;
-
In order to have all computers accessing all files coherently, a distributed filesystem among them for this /home must be used. This will also increase the available disk space due to data striping;
-
Reliability is important, however, so we are considering that, instead of 3 TB (6 × 500 GB) of total space, we would have only 2 TB (4 × 500 GB, plus 2 disks' worth of redundancy) and be able to withstand up to two i3 computers failing;
-
Reliability must go further than just the files: any necessary metadata servers and other services must also be redundant, optimally replicated on every i3 computer;
-
They must also be able to deal with all 11 computers accessing the whole /home, for both reads and writes. From molecular dynamics to quantum mechanics calculations and molecular docking, they must cope with different demands and some big (several GB) files, all simultaneously;
-
We would also prefer not to have to give special instructions to the users, such as "you can't rename a file, but must copy it and delete the original later", as some (old?) Gluster instructions indicate to be necessary;
-
The simpler the implementation and the maintenance, the better it will be for both the current student lab and future use on the production clusters.
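To make the capacity figures in the list above concrete, here is a rough sketch in Python of the arithmetic we have in mind. The function names are just made up for illustration, and it assumes an idealised layout that ignores filesystem overhead: reserving two disks' worth of space for redundancy out of six, compared with keeping three full copies of everything (which would also survive two failures).

```python
def usable_with_parity(nodes: int, disk_gb: int, parity_nodes: int) -> int:
    """Usable space when `parity_nodes` disks' worth of space holds redundancy.

    Idealised assumption: equal disks, no filesystem or metadata overhead.
    """
    if not 0 <= parity_nodes < nodes:
        raise ValueError("parity_nodes must be in the range [0, nodes)")
    return (nodes - parity_nodes) * disk_gb


def usable_with_replication(nodes: int, disk_gb: int, copies: int) -> int:
    """Usable space when every file is stored `copies` times."""
    if copies < 1 or copies > nodes:
        raise ValueError("copies must be between 1 and nodes")
    return (nodes * disk_gb) // copies


# Six i3 machines with 500 GB disks each, as described above.
raw_gb = usable_with_parity(6, 500, 0)             # no redundancy at all
parity_gb = usable_with_parity(6, 500, 2)          # the "4 + 2" layout
replicated_gb = usable_with_replication(6, 500, 3)  # three full copies

print(f"raw: {raw_gb} GB, 4+2: {parity_gb} GB, 3 copies: {replicated_gb} GB")
```

This is only meant to show why the "4 + 2" idea looks attractive to us compared with plain replication; real usable space will depend on the filesystem chosen.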
From what I can gather, the most common suggestions are Ceph, Gluster, or BeeGFS. Which of them are available on openSUSE Leap 15? Are there any other options worth considering? Which ones are recommended for the proposed application?
Any help or direction on this will be really useful, and we would be very grateful for it.
Thanks a lot in advance.