Intermittent no space left on device - Leap 42.2 + docker + btrfs

jimmiebtlr · June 19, 2017, 6:12pm

Installed this weekend, so pretty close to a fresh install. I’m getting the following error, primarily on the very large image jupyter/datascience-notebook. It seems to happen at different points (the below is the furthest it’s been).

Using default tag: latest
latest: Pulling from jupyter/datascience-notebook
693502eb7dfb: Pull complete 
490c0d36e714: Pull complete 
b47c251cda4e: Pull complete 
5f06af7aed8b: Pull complete 
6486d270a020: Pull complete 
825ae89ffbbc: Pull complete 
aa6afd195d29: Pull complete 
74d4b5c1232e: Pull complete 
d2d97ba526ae: Pull complete 
877dfb20778a: Pull complete 
d1f76c86598c: Pull complete 
47584947ddfa: Pull complete 
7950a3634bd0: Pull complete 
73cec490b164: Pull complete 
55e9771cdf97: Pull complete 
9fe1be3663ff: Pull complete 
96936f8a45ae: Pull complete 
59039204a587: Pull complete 
e35b9e8aa2fb: Pull complete 
dc5e7e85cf19: Pull complete 
e9269da4c431: Pull complete 
2cd96abadd7a: Pull complete 
71977f80348a: Pull complete 
8fa0a7391ad3: Pull complete 
e2c348bdf854: Extracting [==================================================>] 162.4 MB/162.4 MB
failed to register layer: ApplyLayer exit status 1 stdout:  stderr: open /opt/julia/v0.5/METADATA/Nettle/versions/0.1.9/requires: no space left on device

Also, I have around 160 GB free, so I don’t think the disk is actually full.

jimmiebtlr · June 19, 2017, 7:03pm

I think every time I run it again, a new file throws the error. So potentially something where it thinks it failed, but didn’t?

hendersj · June 19, 2017, 7:13pm

On Mon, 19 Jun 2017 17:06:02 +0000, jimmiebtlr wrote:

> I think every time I run it again, a new file throws the error. So
> potentially something where it thinks it failed, but didn’t?

You might need to clean your snapper snapshots.

What’s the output from:

snapper list

?

With btrfs, using df to check free space is not sufficient.

Jim

–
Jim Henderson
openSUSE Forums Administrator
Forum Use Terms & Conditions at http://tinyurl.com/openSUSE-T-C

tsu2 · June 20, 2017, 12:46am

Turn snapshotting off, or re-install selecting ext4 instead of btrfs.

There is really no point in snapshotting storage for virtualization or containers: if you ever had to roll back for <any> reason, <every> image would be rolled back, undoing changes in every one.

And, that’s on top of the massive numbers and sizes of snapshots you’d be creating if changes are being made to more than one container at once.

TSU

mkrwc · June 21, 2017, 8:20pm

Same issue here. Docker on btrfs was working fine, but it broke, probably with a kernel/docker update (last week-ish?). It fails even when downloading 47.58 MB of data:

leap: Pulling from library/opensuse
d7ceceec7bf6: Extracting 47.58 MB/47.58 MB
failed to register layer: ApplyLayer exit status 1 stdout:  stderr: open /var/lib/ca-certificates/openssl/T__RKTRUST_Elektronik_Sertifika_Hizmet_Sa__lay__c__s___H5.pem: no space left on device

But I have plenty of space left:


sudo btrfs fi df /var/lib/docker/btrfs
Data, single: total=12.01GiB, used=5.49GiB
System, single: total=4.00MiB, used=16.00KiB
Metadata, single: total=1.25GiB, used=308.91MiB
GlobalReserve, single: total=20.59MiB, used=0.00B

We should probably file a bug if there are two of us.
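
For anyone comparing numbers, here is a minimal sketch that turns `btrfs fi df` output like the above into a "used as a share of allocated" figure per chunk type. The function name usage_pct is made up for illustration. It only looks at space already allocated to chunks; unallocated device space is not in this output (`btrfs filesystem usage` reports that separately).

```shell
#!/bin/sh
# usage_pct (hypothetical helper): read `btrfs filesystem df` output on stdin
# and print, for each chunk type, used space as a percentage of allocated space.
usage_pct() {
    awk '
    # Convert a size such as "12.01GiB" or "0.00B" to bytes.
    function to_bytes(s,    n, unit) {
        n = s + 0                        # leading numeric part
        unit = s
        gsub(/[^A-Za-z]/, "", unit)      # keep only the unit letters
        if (unit == "KiB") return n * 1024
        if (unit == "MiB") return n * 1024 * 1024
        if (unit == "GiB") return n * 1024 * 1024 * 1024
        if (unit == "TiB") return n * 1024 * 1024 * 1024 * 1024
        return n                         # plain bytes
    }
    /total=.*used=/ {
        name = $1
        sub(/,$/, "", name)              # "Data," becomes "Data"
        total = $0
        sub(/.*total=/, "", total)
        sub(/,.*/, "", total)
        used = $0
        sub(/.*used=/, "", used)
        t = to_bytes(total)
        u = to_bytes(used)
        pct = 0
        if (t > 0) pct = 100 * u / t
        printf "%s %.1f%%\n", name, pct
    }'
}
```

Usage would be something like `sudo btrfs fi df /var/lib/docker/btrfs | usage_pct`. For the output quoted above, Metadata comes out at roughly a quarter of its allocation, which fits the observation that the filesystem does not look full.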

Miuku · June 22, 2017, 5:17am

There’s an intermittent issue with BTRFS where it reports being out of space even when there’s plenty left. As a temporary workaround, disable quota on the drive.
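
A rough sketch of that workaround, to be run as root. The function name disable_quota is made up, and the mount point you pass is your own choice. It relies on the assumption that `btrfs qgroup show` exits non-zero when quotas are not enabled, and uses that as a check before changing anything.

```shell
#!/bin/sh
# disable_quota (hypothetical helper): turn off btrfs quotas on the filesystem
# mounted at $1, but only if quota groups are currently listed there.
disable_quota() {
    # Assumption: "btrfs qgroup show" fails when quotas are off, when the
    # path is not on btrfs, or when btrfs-progs is not installed.
    if btrfs qgroup show "$1" >/dev/null 2>&1; then
        btrfs quota disable "$1"
    else
        echo "quotas not enabled (or not a btrfs mount): $1"
    fi
}
```

For the Docker case in this thread that would be something like `disable_quota /var/lib/docker`, assuming that path is on the affected filesystem.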

IMO: btrfs is a piece of ****. Whoever had the great idea of making it the de facto in openSUSE needs to get kicked in the balls, hard.

gogalthorp · June 22, 2017, 1:44pm

You need to supply plenty of space for snapper. Normal Linux tools do not report correct space when snapper is involved. IMHO BTRFS with snapper has a place, but most users in most desktop situations don’t really need it. It is great for developers or Tumbleweed users who may need to roll back often, but Leap users for the most part can get by with EXT4 just fine. There is no real difference in performance between BTRFS and EXT4.

mkrwc · June 22, 2017, 2:28pm

I doubt this problem has anything to do with snapper: I removed all the snapshots and the problem still persists. I agree that tools like df may yield inaccurate results, but btrfs fi df should always be fine. IMO there is a bug in the docker/btrfs kernel driver introduced in one of the recent updates. I will file a Bugzilla ticket later today.

mkrwc · June 22, 2017, 5:59pm

Bug report: https://bugzilla.opensuse.org/show_bug.cgi?id=1045598

jimmiebtlr · June 24, 2017, 4:50am

Awesome, thanks for submitting the bug report.

tsu2 · July 3, 2017, 7:36pm

I’ve come across some info that calls into question my posted recommendation…

BTRFS has the ability to restore individual files with the following command, so you don’t have to roll back an entire partition

btrfs restore

That said,
BTRFS provides this highly desirable feature at a price… Compared to other fs BTRFS can be prohibitively slow, particularly if you’re supporting intensive disk I/O. So, BTRFS is highly discouraged when deploying a relational database or similar. You could probably extrapolate that recommendation to some highly intensive frontend apps. But, many if not most apps don’t behave this way so BTRFS might be “OK” for most apps.

If you want the features BTRFS provides, it is probably a better choice than the others. BTRFS is particularly well known for performing relatively better than all other current filesystems when writing metadata, the stuff that things like snapshots and file recovery rely upon.

Although not ordinarily relevant to Docker, when considering BTRFS for storing virtualization images (e.g. VMware, VBox, KVM, Xen, etc.), there are some articles which highly recommend making the virtual disk format RAW or similar, and specifically avoiding COW and QCOW (the default for KVM systems), to avoid possible write amplification.

Lastly, of course this means that if you implement BTRFS on a multi-OS system like Docker or any of the Virtualization technologies, it’s imperative to create your own BTRFS config to discard old snapshots at a faster rate than default.
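
As a sketch of what such a config change might look like with snapper: the config name "root" is the usual openSUSE default for / but is an assumption here, and the limit values are placeholders rather than recommendations. The function name tighten_snapper and the RUN switch are made up for illustration. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Assumptions: snapper config is named "root"; limit values are illustrative.
CONFIG="${CONFIG-root}"
# RUN defaults to "echo" (dry run). Set RUN to an empty string to execute.
RUN="${RUN-echo}"

# tighten_snapper (hypothetical helper): lower the number of snapshots that
# the cleanup algorithms keep, then run the number cleanup once.
tighten_snapper() {
    $RUN snapper -c "$CONFIG" set-config NUMBER_LIMIT=5 NUMBER_LIMIT_IMPORTANT=3
    $RUN snapper -c "$CONFIG" set-config TIMELINE_LIMIT_DAILY=3 TIMELINE_LIMIT_WEEKLY=0
    $RUN snapper -c "$CONFIG" cleanup number
}
```

Calling `tighten_snapper` as-is shows the three snapper commands. Running it as root with `RUN=` set to empty applies them. `snapper -c root get-config` can be used beforehand to see the current values.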

TSU