Block a specific mirror or region before migration to v16

Hi, I’ve been having trouble with a mirror for a day or so now: https://mirror.firstyear.id.au

It’s insanely slow, like 2hrs to install go. I managed to get around it with mirrorsorcerer, which updated all my existing repos to use cdn.opensuse.org and package installs started working again. But when I try to update with opensuse-migration-tool, I notice in the logs it’s using this terrible mirror again and it gets stuck fetching repository info. I thought zypper was supposed to automatically pick good mirrors but anyway.
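For anyone finding this later: as far as I can tell, all mirrorsorcerer really did for my existing repos was swap their baseurls over to the CDN. Doing the same thing by hand would look roughly like this; I’m assuming your repos point at download.opensuse.org, so check the files first and take a backup:

    # back up the repo definitions before touching them
    sudo cp -a /etc/zypp/repos.d /etc/zypp/repos.d.bak
    # point any download.opensuse.org baseurls at the CDN instead
    sudo sed -i 's#https\?://download\.opensuse\.org#https://cdn.opensuse.org#g' /etc/zypp/repos.d/*.repo
    # refresh metadata against the new URLs
    sudo zypper ref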

I’m guessing it’s because the migration process adds new repos, and they’re not configured to use the CDN. Is there any way I can tell zypper as a whole to ignore this mirror, or at least the whole region, so I can update from 15.6 to 16? All the advice I found online was about editing the URL on specific repos.

Side note: the migration failed because it added a bunch of “\0” to the top of a new repo file (I assume as a side effect of this mirror issue), but the migration tool kept going as though it had succeeded. It gave me the option for post-migration actions like enabling SELinux, and output a ‘migration succeeded, you can reboot’ log. This left me with a bunch of extra repos and god knows what else it managed to update before it failed. The same thing happens if I kill it with Ctrl-C (I tried the update again this morning and killed it when I saw the mirror was still choking). Thankfully I can roll back with snapper, but I feel like this migration tool should be able to detect when something goes wrong, even if it can’t clean up.

@firstyear might need to kick the system?

Oh no, I called it a terrible mirror and now there’s a real person associated with it. Sorry firstyear! I’m sure it’s a fine mirror usually.

It’s okay :slight_smile: no harm done.

The mirror went through a migration in recent days, and since then there has been an observed issue with throughput, but it’s caused by an interaction with routers. I happen to be affected as well, and I have spent 4 days investigating.

Can you let me know if your home router is a Ubiquiti, by any chance? Or is your router running Linux in some form?


Hello! It’s a tp-link archer vx-1800v running stock firmware.

There’s a modem (NTD) between it and the outside world, but I’m out so can’t give any info about it atm.

Gday mate. Thanks again for your mirror.

I’ve been seeing this throughput problem for a while, but I’m starting to get desperate now. I was about to try and ping you on GitHub, since I could tell something out of the ordinary had happened when it went from “takes days” to “won’t finish at all”.

Is there a better way to find you than there or here, if there are problems? I don’t really come here, and I can’t sign up for mailing lists. Glad you’re already aware of it, but sorry to hear it’s being a bit of a devil to fix.

I do have a ubiquiti router.

Pre-existing bug report follows. You can ignore the title: I found the zypper verbose logging env var last night, and I can see it’s resolving to CZ, but then internally zypper redirects to your box. I’m just down the road from it but getting 8k/s, and it eventually times out. But once a file has been downloaded one time, subsequent re-downloads of that file fill my downlink.
https://bugzilla.opensuse.org/show_bug.cgi?id=1247679

I couldn’t get zypper to behave; it always redirected to you from cdn.o.o, download.o.o, or mirrorcache.o.o, and the mirrors listed in the Mirrors Report for TW on openSUSE Download are all wildly out of date (I wanted to roll back packages).

Very happy to dedicate as much time as you need to fixing this. If there’s some kind of temporary workaround, that would be great. But mostly I’d just like to offer to assist you, as your services are critical to Aussie SUSE users and we all appreciate them a lot.

No problem.

As mentioned, I am already affected so I have all the tools needed to analyse this. I’m going to spend a few more hours on it today. I’ll probably write it up on my blog later once I understand the cause - currently the evidence points to the server being fine (server to server within the same DC I see 400MB/s throughput, even via a router). Other nearby Australians with other brands of routers report full line rate (20MB/s etc). The issue seems to be that something is triggering a router/TCP misbehaviour if the router runs Linux. This would apply to the Ubiquiti and potentially the TP-Link as well.

I’m really sorry that this degradation has occurred. :frowning:


I am also having issues.
Trying to do a fresh install, and getting very little action at all.
Running a TP-Link M5 mesh system router.
Anything I can do, let me know…

I have contacted the people responsible for mirrorcache.opensuse.org and asked for mirror.firstyear.id.au to be temporarily removed from the listings. I will stop the mirror service in the meantime to prevent further headaches.

What is occurring has two parts, but requires an understanding of how this mirror works. It is a bit of a “special child”, if you want to call it that.

mirror.firstyear.id.au is one of the only caching mirrors in the world. It runs opensuse-proxy-cache (GitHub: Firstyear/opensuse-proxy-cache), which I wrote specifically for openSUSE. It acts like an edge CDN: when we first proxy a file, we cache it and then serve it for subsequent hits.

This has worked exceedingly well - with only 140GB of storage, mirror.fy has a 95% hit ratio, which has massively helped the Australian and NZ community, especially for repos that are not commonly cached. Additionally, the hosting of this mirror has a better network path to openSUSE’s DCs, whereas other paths have significant issues due to a peering partner on the openSUSE side.

This means this mirror has two possible access flows.

First, we have not seen the file, so we are streaming and caching it. This requires us to connect to the parent/main mirror and download the file, while also streaming it to you, the consumer.

Second, we have seen the file, so we simply serve it. We perform a background/async refresh to ensure the ETags are up to date, so we can prefetch a changed file in the background ahead of time if possible. But normally we just stream from disk to you, the client.
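If you want to see the difference between those two flows yourself, fetching the same file twice is enough; the path below is only an example, any file on the mirror behaves the same way:

    # first fetch: cache miss, the proxy has to pull the file from the parent mirror
    curl -s -o /dev/null -w 'first fetch:  %{speed_download} bytes/s\n' \
      https://mirror.firstyear.id.au/distribution/leap/15.6/repo/oss/repodata/repomd.xml
    # second fetch: cache hit, served straight from local disk
    curl -s -o /dev/null -w 'second fetch: %{speed_download} bytes/s\n' \
      https://mirror.firstyear.id.au/distribution/leap/15.6/repo/oss/repodata/repomd.xml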

What has impacted mirror.firstyear.id.au has affected both sides of this equation.

Currently, I am seeing a massive collapse in throughput when streaming from the main mirror to mirror.fy. Testing this from other machines does NOT show the same collapse, meaning it is specific to the combination of the software on mirror.fy and the main mirror.

The other half of this is that downloading an already cached file from mirror.fy has an interaction with Linux routers that causes performance degradation. On my Ubiquiti UXG Pro (a 10GbE-capable router), if I fetch from mirror.fy directly from the router I see 30MB/s throughput. However, if I fetch from a Linux client behind the router, I see 5MB/s, with very bursty networking patterns.

I have spent multiple hours today trying to isolate the causes of both of these, and I have not been successful. I recreated the setup in a lab and started introducing factors one at a time. I have tcpdumped/wiresharked on every endpoint and router involved that I have access to, and used this to guide research, investigation and tuning.
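For anyone curious what that looks like in practice, the captures themselves are nothing exotic; something along these lines on each box, with the interface name and filter adjusted to taste:

    # capture the mirror traffic on this host for later analysis in wireshark
    sudo tcpdump -i eth0 -s 0 -w mirror-trace.pcap 'host mirror.firstyear.id.au and port 443'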

I have eliminated many potential causes - MTU, MSS, fragmentation, etc.

Currently in the lab I am able to see:

  • 750MB/s from mirror to localhost
  • 300MB/s from mirror to unrouted client
  • 200MB/s from mirror to client via ubiquiti router

Testing against the real mirror, and applying many of the improvements I was able to achieve today, I saw 5MB/s to a client for a cached file, and 200kB/s for an uncached one.

I still have many more network traces to review, but I have also been looking at this for multiple days, including 8 hours today already.

This is why I’m stopping the mirror in the meantime. I need a break, but I also don’t want the ongoing issues to affect people.

I’m really sorry that this has happened, and I’m working as hard as I can to resolve it.

I hope everyone has a great weekend, and I appreciate the support people have shown despite this issue. I host this mirror from my own wallet because I want opensuse to have a good presence here, and it really does help me to hear that people appreciate it.

Thanks!


How is zypper supposed to know whether a mirror is good or not without trying it first? It will likely move to the next one after a timeout. You can try adjusting download.connect_timeout and/or download.transfer_timeout.
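Both options live in /etc/zypp/zypp.conf under the [main] section. Something like the following would make zypper give up on a dead-slow mirror much sooner; the values here are only examples, not recommendations:

    # /etc/zypp/zypp.conf
    [main]
    # give up on connecting to a mirror after 10 seconds
    download.connect_timeout = 10
    # cap how long a single transfer is allowed to take
    download.transfer_timeout = 120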

Mate, that’s an enormous effort you’ve put in. I’m sorry it’s eaten up your weekend. I appreciate the temp fix of disabling the mirror as well. Hope you get to relax a bit!


I’ll check that setting later, ty. Curious to see what its default value is, because I had it sitting on one ‘fetching repository information’ step for an hour+ (based on memory of tailing the zypper log).

Thanks mate!

I get a bit more technical at the bottom of this post and I think you might want to check it out.

TL;DR: this has been a problem for months, not days; my usage has been masking it from you. Also it kinda looks like traffic to you is shaped, or something.

.

.
Quick note for other users: to make this change to the mirrors take effect, you need to run

sudo zypper clean -a

Otherwise zypper thinks the mirrorlist it already has is fine:

Mirror cachefile cookie valid and cache is not too old, skipping download (/var/cache/zypp/raw/openSUSE:update-tumbleweed/mirrorlist)

After this, normal zypper dup is back in action :tada:
It will use the NZ mirror.
.

.

.
Nerdy stuff for @firstyear:

This is in line with what I’ve been reporting for months now (since around the change to the CDN and parallel zypper; I reported this first on the forums, then in the above bugzilla, and recently in this thread).

I do a fair bit of packaging in my home repo, I test for bugs in TW (including monitoring when the snapshot updates), and I keep strange hours, so I am very often the first in the region to cache anything. I see this all the time, and very often everyone else is immune to it.

Reddit had three posts from other aussies when I spent a weekend in hospital.

Basically, the reason nobody else has seen this earlier is that I was often up at 3am getting that first cache done, so they got the fast downloads that come after the first guy (me) gets the slow one.

It’s extremely obvious when I build a package in my home repo: right after it’s built, it’s guaranteed to be a slow download because it’s guaranteed to be a fresh pull from your upstream.

I feel like it might be helpful for you to know that this is not a new thing (it might mislead you if you are looking only at the effects of recent changes).
.

The other noteworthy behaviour I’ve seen is the very specific powers-of-two increments of bandwidth. It’s VERY consistently sitting at exactly 8/16/32/64/128/etc. kB/s. There’s zero chance it’s a coincidence. I don’t know what that means, but it jumps out at me. I mentioned it in the bugzilla as suspected traffic shaping between you and the upstream mirror, but I don’t really know.

.

You deserve as long a break as you like mate. I know what it’s like to be on that end of a problem like this. Gruelling. Really appreciate all the work you’ve done over the years for our local community.

Please, take a break and enjoy what’s left of your weekend. <3


It does not. Also, for some repos, there was no next one.

You can try adjusting download.connect_timeout and/or download.transfer_timeout.

Tried that last night. It either does nothing, or just makes it take a while then fail on every file.

The ungraceful behaviour of zypper during this outage is a matter which deserves attention.

The change to mirror.fy.id.au happened last week, on 2025-12-05. If this has been occurring for a longer time, that is a separate issue and I’m happy to help you investigate it separately.

Thanks everyone for your patience. A good break and a sleep has helped, and today I solved the issue.

What was occurring was a very complex interaction between selective acknowledgements (SACKs) in TCP and congestion control. What was observed is that some routers on the paths between mirror.fy.id.au and other endpoints were mishandling SACKs, and this created a pathological condition that caused throughput collapse both into and out of the mirror. This was further amplified by the congestion control algorithm that was in use, as well as some TCP buffer scaling issues that could interact.

I have now changed the relevant options and I see good throughput into the mirror during file caching, as well as good outbound throughput even via ubiquiti routers. There is likely further tuning to perform but this should resolve the issue.
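For the curious: this kind of tuning is done with sysctls on the mirror host. I’m not going to claim these exact values are what I settled on, but the knobs involved look like this:

    # choose a congestion control algorithm (bbr and cubic behave very differently here)
    sysctl -w net.ipv4.tcp_congestion_control=bbr
    # allow the kernel to scale TCP buffers up for long, fat paths
    sysctl -w net.core.rmem_max=16777216
    sysctl -w net.core.wmem_max=16777216
    sysctl -w net.ipv4.tcp_rmem='4096 131072 16777216'
    sysctl -w net.ipv4.tcp_wmem='4096 131072 16777216'
    # turning SACK off entirely (net.ipv4.tcp_sack=0) is also possible, but it is a very blunt instrument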

For now you can test by directly accessing and downloading via mirror.fy.id.au - I have been using /distribution/leap/15.6/iso/openSUSE-Leap-15.6-DVD-aarch64-Current.iso as a test file, so you should find it’s already “cache hot”, allowing you to do a download/speed test.
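If you want a quick way to run that test without keeping the file, something like this works (wget prints the transfer rate as it goes); this assumes the mirror keeps the standard directory layout:

    # download the cache-hot test ISO, throw the data away, and watch the rate
    wget -O /dev/null https://mirror.firstyear.id.au/distribution/leap/15.6/iso/openSUSE-Leap-15.6-DVD-aarch64-Current.iso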

I will leave things as they are for a few days and do some more testing/tuning and once that settles I’ll re-enable the mirror into the pool.

Thank you all again for your patience - I hope you all have a great week!


Champion, thank you for all your hard work.


NICE!

Thanks mate. If it is a different issue, dw about it, I’m not trying to hijack… buuuut… preeeeety sure that whatever you did to fix this problem fixed that problem.

I just branched an OBS package and grabbed the rpm with wget from mirror.fy; it came in at about 5MB/s, cache cold. It hasn’t done that since ?May? - for me it’s been what you all have seen this week.

The iso you mentioned, cache hot, came in at 15M, which is about right.

I mean, I can’t be certain until the mirror is back in the list, but… It sure looks fixed to me :tada:

Fantastic! Well if you do have issues in future you know where to find me and I’d be happy to help out.