I have contacted the people responsible for mirrorcache.opensuse.org and asked for mirror.firstyear.id.au to be temporarily removed from the listings. I will stop the mirror service in the meantime to prevent further headaches.
What is occurring has two parts, but explaining it requires an understanding of how this mirror works. It is a bit of a “special child” if you want to call it that.
mirror.firstyear.id.au is one of the only caching mirrors in the world. It runs opensuse-proxy-cache (Firstyear/opensuse-proxy-cache on GitHub), which I wrote specifically for opensuse. It acts like an edge CDN: when we first proxy a file, we cache it and serve it from the cache for subsequent hits.
This has worked exceedingly well - with only 140GB of storage, mirror.fy has a 95% hit ratio, which has massively helped the Australian and NZ community, especially for repos that are not commonly cached. Additionally, the hosting of this mirror has a better network path to opensuse’s DCs, whereas other paths have significant issues due to a peering partner on the opensuse side.
This means this mirror has 2 possible access flows.
First, we have not seen the file, so we are streaming and caching it. This requires us to connect to the parent/main mirror and download the file, while also streaming it to you, the consumer.
Second, we have seen the file, so we simply serve it. We perform a background/async refresh to ensure the etags are up to date, so we can prefetch a changed file in the background ahead of time if possible. But normally we just stream from disk to you, the client.
The problem affecting mirror.firstyear.id.au hits both of these flows.
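To make the two flows concrete, here is a minimal Python sketch of the idea. It is not the actual opensuse-proxy-cache code: the `CachingMirror` class, the `fetch_upstream` hook and the in-memory dictionary are all assumptions made for illustration, and the refresh is called directly here rather than run in the background.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterator, Tuple

# Hypothetical upstream hook: path -> (etag, iterator of body chunks).
Upstream = Callable[[str], Tuple[str, Iterator[bytes]]]


@dataclass
class CachingMirror:
    fetch_upstream: Upstream
    # path -> (etag, body); a real mirror would keep bodies on disk.
    cache: Dict[str, Tuple[str, bytes]] = field(default_factory=dict)
    hits: int = 0
    misses: int = 0

    def get(self, path: str) -> Iterator[bytes]:
        entry = self.cache.get(path)
        if entry is not None:
            # Flow 2: already cached, serve straight from local storage.
            self.hits += 1
            yield entry[1]
            return
        # Flow 1: not seen before, stream from upstream while caching.
        self.misses += 1
        etag, chunks = self.fetch_upstream(path)
        body = bytearray()
        for chunk in chunks:
            body.extend(chunk)
            yield chunk
        # Only store the file once it has been received in full.
        self.cache[path] = (etag, bytes(body))

    def refresh(self, path: str) -> bool:
        """Re-check a file upstream; return True if the cache was updated."""
        etag, chunks = self.fetch_upstream(path)
        cached = self.cache.get(path)
        if cached is not None and cached[0] == etag:
            return False  # etag unchanged, keep the cached copy
        self.cache[path] = (etag, b"".join(chunks))
        return True
```

In this sketch the first `get()` for a path is a miss that talks to upstream, and every later `get()` is a hit that never leaves the mirror, which is why a slow upstream link and a slow client link show up as two separate problems.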
Currently, I am seeing a massive collapse in throughput when streaming from the main mirror to mirror.fy. Testing this from other machines does NOT show the same collapse, meaning it is specific to the combination of the software on mirror.fy and the main mirror.
The other half of this is that downloading an already cached file from mirror.fy has an interaction with Linux routers that causes performance degradation. On my Ubiquiti UXG Pro (a 10GbE capable router), if I fetch from mirror.fy directly from the router I see 30MB/s throughput. However, if I fetch from a Linux client behind the router, I see 5MB/s, with very bursty networking patterns.
I have spent multiple hours today trying to isolate the causes of both of these - I have not been successful. I recreated the setup in a lab and started to introduce factors one at a time. I have tcpdumped/wiresharked on every endpoint and router involved that I have access to, and used this to guide research, investigation and tuning.
I have eliminated many potential causes - MTU, MSS, fragmentation etc.
Currently in the lab I am able to see:
- 750MB/s from mirror to localhost
- 300MB/s from mirror to unrouted client
- 200MB/s from mirror to client via ubiquiti router
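For anyone who wants to take comparable numbers on their own network, here is a small Python sketch of one way to time a download. It is only an assumption about how such a measurement could be done, not the tooling used for the figures above, and `measure_throughput` is a name made up for this example.

```python
import time
from typing import BinaryIO, Tuple


def measure_throughput(stream: BinaryIO, chunk_size: int = 64 * 1024) -> Tuple[int, float]:
    """Read a stream to the end; return (bytes read, rate in MB/s)."""
    total = 0
    start = time.perf_counter()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of stream
        total += len(chunk)
    # Guard against a zero interval on very small inputs.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return total, total / elapsed / 1_000_000
```

Any object with a `read()` method works, including the response returned by `urllib.request.urlopen`. Running it once on the router and once on a client behind it should show whether the same gap appears.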
Testing with the true mirror, and applying many of the improvements I was able to achieve today, I was able to see 5MB/s to a client for a cached file, and 200kb/s for an uncached one.
I still have many more network traces to review, but I have also been looking at this for multiple days, and for 8 hours today already.
This is why I’m stopping the mirror in the meantime. I need a break, but I also don’t want the ongoing issues to affect people.
I’m really sorry that this has happened, and I’m working as hard as I can to resolve it.
I hope everyone has a great weekend, and I appreciate the support people have shown despite this issue. I host this mirror from my own wallet because I want opensuse to have a good presence here, and it really does help me to hear that people appreciate it.
Thanks!