
Thread: Does snapshot happen atomically at a SINGLE point in time?

  1. #1

    Does snapshot happen atomically at a SINGLE point in time?

    Suppose I memory map a 100G file and I have a single process which is constantly changing the values everywhere (e.g., a banking app with 1B accounts doing 1M DB/CR per second randomly over 100G of RAM).

    If I do a snapshot on the volume that the file is in, is it done atomically so that all pages of the file are frozen in my app at a specific point in time? Or does it take the snap incrementally over the pages in my file, so that some pages might be snapshotted at a slightly different time (within a msec)?

    I'm guessing that all files in a volume are atomically snapshotted at a single moment in time and that I have to wait for the snapshot to return (a few msec) until I can be sure it is done and I can continue. So I should stop processing, snap and wait for the snap to return, then start processing again, so that I know exactly what my state was when the snap was taken. I don't want to snap in the middle of a DB/CR txn, since that would be an inconsistent state; I always want to snap after a known transaction has completed.

    Is there a way to avoid the 3 msec wait for the snap to complete, or is that the best you can do? I'm guessing I have to wait and should just call it synchronously from within my app at a time when everything is consistent.

    And presumably doing fsync or sync before a snap makes no difference whatsoever as far as what snap is produced.

  2. #2
    Join Date: Aug 2010 | Location: Chicago suburbs | Posts: 12,627 | Blog Entries: 3

    Re: Does snapshot happen atomically at a SINGLE point in time?

    As far as I know, a snapshot is done atomically.

    This is mostly handled by CoW (copy on write). When a disk block has to be updated, the new data is written to a new disk block and the old disk block remains intact. So creating a snapshot is mostly just generating an index for blocks assigned to files.
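
    The copy-on-write idea described above can be sketched with a toy block store. This is only an illustration, not how btrfs lays out data on disk: the class `CowVolume` and its methods are hypothetical names invented for the example. It shows why taking a snapshot amounts to freezing an index rather than copying data.

```python
class CowVolume:
    """Toy copy-on-write block store (hypothetical, not the btrfs format)."""

    def __init__(self):
        self.blocks = {}    # physical block id -> bytes, never overwritten
        self.index = {}     # logical block number -> physical block id
        self.next_id = 0

    def write(self, logical, data):
        # Always write to a fresh physical block; the old block stays intact.
        self.blocks[self.next_id] = data
        self.index[logical] = self.next_id
        self.next_id += 1

    def read(self, logical, index=None):
        # Read through the live index, or through a snapshot's frozen index.
        idx = self.index if index is None else index
        return self.blocks[idx[logical]]

    def snapshot(self):
        # A snapshot is a frozen copy of the index; no data is copied.
        return dict(self.index)


vol = CowVolume()
vol.write(0, b"balance=100")
snap = vol.snapshot()
vol.write(0, b"balance=90")

print(vol.read(0))         # b'balance=90'
print(vol.read(0, snap))   # b'balance=100'
```

    The live volume and the snapshot share every block that has not changed since the snapshot was taken, which is why creating one is cheap.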

    However, I don't personally use "btrfs", and I don't claim any expertise on the details.
    openSUSE Leap 15.1; KDE Plasma 5;
    testing Leap 15.2Alpha

  3. #3
    Join Date: Sep 2012 | Posts: 5,138

    Re: Does snapshot happen atomically at a SINGLE point in time?

    From the btrfs point of view, a snapshot is atomic. But that does not automatically make it consistent from the point of view of applications using the filesystem. So either your application needs to implement extended logic to recover from incomplete data (intent journal, rollback, etc.), or you need to do exactly as you describe: pause processing while the application is in a consistent state, capture this state in a snapshot, then continue processing. That is how it is implemented by virtually every backup application out there.
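
    A minimal sketch of the pause, snapshot, resume pattern for a memory-mapped file might look like the following. Several things here are assumptions for illustration: the helper names, the idea that every writer holds `state_lock` for the duration of one application transaction, and the paths in the comment at the end. The explicit `flush()` (an `msync` on Linux) is a conservative choice; whether it changes what ends up in the snapshot is not settled in this thread. The demo uses a stand-in callable so it runs without btrfs or root.

```python
import mmap
import subprocess
import tempfile
import threading


def btrfs_snapshot(source, dest):
    # Assumes `source` is a btrfs subvolume and the caller has permission.
    subprocess.run(
        ["btrfs", "subvolume", "snapshot", "-r", source, dest], check=True
    )


def snapshot_at_consistent_point(state_lock, mapped, take_snapshot):
    """Pause writers, flush the mapping, take the snapshot, then resume."""
    with state_lock:        # writers hold this lock once per transaction
        mapped.flush()      # push dirty mmap pages to the underlying file
        take_snapshot()     # returns only after the snapshot call completes


# Demo with a temporary file and a fake snapshot function.
calls = []
lock = threading.Lock()
with tempfile.TemporaryFile() as f:
    f.truncate(4096)
    mapped = mmap.mmap(f.fileno(), 4096)
    with lock:              # one complete application "transaction"
        mapped[0:4] = b"\x01\x02\x03\x04"
    snapshot_at_consistent_point(lock, mapped, lambda: calls.append("snap"))
    mapped.close()

print(calls)                # ['snap']

# Real use would pass something like (hypothetical paths):
#   lambda: btrfs_snapshot("/data/ledger", "/data/.snapshots/ledger-0001")
```

    Because the snapshot is taken while the lock is held, no application transaction can be half applied at that moment; the cost is that writers stall for as long as the snapshot call takes.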

  4. #4
    Join Date: Nov 2009 | Location: West Virginia Sector 13 | Posts: 15,744

    Re: Does snapshot happen atomically at a SINGLE point in time?

    Note that snapshots do not image everything on the system; essentially, the target is the system files. Unless you modify the rules or put user data in system areas, your working files are not imaged.

    Snaps are taken automatically before and after an update, and also periodically, once per week I think.

  5. #5
    Join Date: Jun 2008 | Location: San Diego, Ca, USA | Posts: 11,289 | Blog Entries: 2

    Re: Does snapshot happen atomically at a SINGLE point in time?

    Technically,
    Although I think all understand what is meant and intended in the original post,

    Snapshots, and in particular as created by BTRFS cannot be considered atomic.
    If you look up the definition, "atomic" normally describes a specific transaction flow where, if data integrity is threatened (i.e., the snapshot is attempted in the middle of a long-running, incomplete transaction), the transaction is rolled back.

    That doesn't happen when you create a snapshot, instead the state is frozen and changes are managed separately... Then after the snapshot has been created the changes are appended/merged to the snapshot to update to current.

    So, your question should have been more along the line of how and whether a snapshot can guarantee its own integrity.

    The description by @nrickert can be considered mostly correct. My only criticism is that, as I understand it, ordinary file changes without snapshotting can happen that way as well in various filesystems (what actually happens varies from one fs to another). The description by @arvidjaar is accurate.

    TSU
    Beginner Wiki Quickstart - https://en.opensuse.org/User:Tsu2/Quickstart_Wiki
    Solved a problem recently? Create a wiki page for future personal reference!
    Learn something new?
    Attended a computing event?
    Post and Share!

  6. #6
    Join Date: Jun 2008 | Location: San Diego, Ca, USA | Posts: 11,289 | Blog Entries: 2

    Re: Does snapshot happen atomically at a SINGLE point in time?

    Quote: Originally posted by gogalthorp
    Note that snapshots do not image everything on the system; essentially, the target is the system files. Unless you modify the rules or put user data in system areas, your working files are not imaged.

    Snaps are taken automatically before and after an update, and also periodically, once per week I think.
    More accurately, the current default snapper policy creates a snapshot on every bootup, shutdown and whenever zypper executes a package operation (eg install, update) and retains/purges snapshots on its own schedule.

    For more info about BTRFS,
    I've compiled what I consider authoritative references at the following link.
    https://en.opensuse.org/User:Tsu2#BTRFS

    TSU

  7. #7

    Re: Does snapshot happen atomically at a SINGLE point in time?

    Quote: Originally posted by arvidjaar
    From the btrfs point of view, a snapshot is atomic. But that does not automatically make it consistent from the point of view of applications using the filesystem. So either your application needs to implement extended logic to recover from incomplete data (intent journal, rollback, etc.), or you need to do exactly as you describe: pause processing while the application is in a consistent state, capture this state in a snapshot, then continue processing. That is how it is implemented by virtually every backup application out there.
    Great this is what I was looking for. Basically the snap is the state of all files (both mmapped and on disk) in that volume at that EXACT moment in time when it twiddles some magic counter or variable that anyone writing to a page will subsequently reference before modifying a page.

    So at that magical time, I would imagine that any modifications "in process" at the time of the snap will finish up, so that the data being written is included in the snapshot (since otherwise you'd have to check for a new snap for every byte you wrote, which would be insane).

    So I'd guess the snap is technically not atomic (by which I define for my question as "at a single point in time") because it doesn't stop i/o in progress.

    But maybe I am wrong...maybe snap waits for everyone modifying any file to finish before it says "OK, from now on, it's COW".

    I'm guessing halting the system to wait for all writing to cease would be a bit "over the top" so my guess is that snapping is pretty close to an atomic snap WITH the exception of any writes that were in process at the time which would be considered to be part of the snapshot even though they technically did not finish at the moment the snapshot took "effect".

    Practically speaking, I'll halt, snap, and resume, so this doesn't matter to me; it's more a matter of intellectual curiosity. It's a pretty cool feature.

  8. #8
    Join Date: Sep 2012 | Posts: 5,138

    Re: Does snapshot happen atomically at a SINGLE point in time?

    Quote: Originally posted by skirsch
    So I'd guess the snap is technically not atomic (by which I define for my question as "at a single point in time") because it doesn't stop i/o in progress.
    You are assuming there is a continuous flow of IO. That is not how it works. btrfs performs IO in a series of transactions. Either a transaction is completed and all its modifications are applied, or it is not. A transaction is the minimal unit of filesystem changes that can be applied.

    maybe snap waits for everyone modifying any file to finish before it says "OK, from now on, it's COW".
    You are mistaken again. Snapshots do not work at the file level; they work at the block level. When a snapshot is created, new metadata records the shared state of blocks (extents) in the source subvolume, and a transaction is initiated that writes this metadata to disk. So a snapshot is atomic with respect to the on-disk filesystem state: either the transaction is committed and the snapshot is present, or the transaction is aborted for whatever reason and the snapshot is not present. It is impossible (barring bugs) to have a partial snapshot of some subvolume content, or a snapshot that contains data from later transactions.

    That said, there is no serialization with other processes doing IO, which means that if process A writes to a file and process B creates a snapshot at the same time, it is undefined whether the changes made by process A will be part of the snapshot.
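
    The distinction between filesystem-level atomicity and application-level consistency can be shown with a toy model. Everything here is an assumption made for illustration: `ToyFs` is a hypothetical class, and the rule that taking a snapshot first closes the open transaction belongs to the model, not to a claim about btrfs internals. The point is only that the snapshot always falls on a transaction boundary, yet can still land between two related writes of one process.

```python
class ToyFs:
    """Toy transactional store: changes become visible only at commit."""

    def __init__(self):
        self.committed = {}   # state as of the last committed transaction
        self.pending = {}     # changes in the currently open transaction

    def write(self, key, value):
        self.pending[key] = value

    def commit(self):
        # All pending changes are applied together, or not at all.
        self.committed = {**self.committed, **self.pending}
        self.pending = {}

    def snapshot(self):
        # Model assumption: the snapshot closes the open transaction,
        # then captures exactly the committed state.
        self.commit()
        return dict(self.committed)


fs = ToyFs()
fs.write("acct1", 100)
fs.write("acct2", 0)
fs.commit()

fs.write("acct1", 90)      # process A: debit
snap = fs.snapshot()       # process B: snapshot lands between A's two writes
fs.write("acct2", 10)      # process A: credit
fs.commit()

print(snap)                # {'acct1': 90, 'acct2': 0}
print(fs.committed)        # {'acct1': 90, 'acct2': 10}
```

    The snapshot is a complete, valid filesystem state, but 10 units are missing from it as far as the application is concerned, because nothing ordered process B against process A.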

  9. #9
    Join Date: Jan 2014 | Location: Erlangen | Posts: 992

    Re: Does snapshot happen atomically at a SINGLE point in time?

    Quote: Originally posted by skirsch
    Suppose I memory map a 100G file and I have a single process which is constantly changing the values everywhere (e.g., a banking app with 1B accounts doing 1M DB/CR per second randomly over 100G of RAM).

    If I do a snapshot on the volume that the file is in, is it done atomically so that all pages of the file are frozen in my app at a specific point in time? Or does it take the snap incrementally over the pages in my file, so that some pages might be snapshotted at a slightly different time (within a msec)?

    I'm guessing that all files in a volume are atomically snapshotted at a single moment in time and that I have to wait for the snapshot to return (a few msec) until I can be sure it is done and I can continue. So I should stop processing, snap and wait for the snap to return, then start processing again, so that I know exactly what my state was when the snap was taken. I don't want to snap in the middle of a DB/CR txn, since that would be an inconsistent state; I always want to snap after a known transaction has completed.

    Is there a way to avoid the 3 msec wait for the snap to complete, or is that the best you can do? I'm guessing I have to wait and should just call it synchronously from within my app at a time when everything is consistent.

    And presumably doing fsync or sync before a snap makes no difference whatsoever as far as what snap is produced.
    "An "atomic COW snapshot"—easily the most hilarious-sounding feature ever to grace a filesystem—is an image of the entire filesystem in exactly the condition it was in at a given instant in time, no matter what else was transpiring at the time

    So if you take a snapshot of a filesystem at 8:13 and 32 seconds pm on December 19, 2013, that snapshot will contain every single byte of that filesystem at exactly 8:13 and 32 seconds pm on December 19, 2013—period, no ifs, ands, or buts. This helps keep high-activity structures like databases consistent. As long as the database uses journaling (and if it doesn't, upgrade!), its journal will be consistent in the snapshot. Any partially completed transactions can be cleanly rolled back instead of leaving the database in an inconsistent state."

    Bitrot and atomic COWs: Inside “next-gen” filesystems
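
    The journaling argument in the quoted article can be illustrated with a small undo log. This is a sketch under stated assumptions, not how any particular database implements it: `transfer`, `recover`, and the `stop_after_debit` flag are hypothetical names, and the flag simply stands in for "a snapshot was taken at this instant".

```python
import copy


def transfer(db, journal, src, dst, amount, stop_after_debit=False):
    """Move `amount` from src to dst, logging an undo record first."""
    entry = {"undo": {src: db[src], dst: db[dst]}, "done": False}
    journal.append(entry)          # 1. record intent before touching data
    db[src] -= amount              # 2. debit
    if stop_after_debit:
        return                     # stand-in for "snapshot happened here"
    db[dst] += amount              # 3. credit
    entry["done"] = True           # 4. mark the transaction complete


def recover(db, journal):
    """Roll back every transaction that never reached step 4."""
    for entry in reversed(journal):
        if not entry["done"]:
            db.update(entry["undo"])
    journal[:] = [e for e in journal if e["done"]]


db, journal = {"a": 100, "b": 50}, []
transfer(db, journal, "a", "b", 30)
transfer(db, journal, "a", "b", 20, stop_after_debit=True)

# What a snapshot taken at this instant would hold: a half-applied transfer.
snap_db, snap_journal = copy.deepcopy((db, journal))
print(snap_db)                     # {'a': 50, 'b': 80}

recover(snap_db, snap_journal)
print(snap_db)                     # {'a': 70, 'b': 80}
```

    After recovery the total is 150 again. This only works because the data and the journal were captured at the same instant, which is what the single-point-in-time property of the snapshot provides.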
    AMD Athlon 4850e (2009), openSUSE 13.1, KDE 4, Intel i3-4130 (2014), i7-6700K (2016), i5-8250U (2018), openSUSE Tumbleweed, KDE Plasma 5

  10. #10
    Join Date: Jun 2008 | Location: San Diego, Ca, USA | Posts: 11,289 | Blog Entries: 2

    Re: Does snapshot happen atomically at a SINGLE point in time?

    Quote: Originally posted by karlmistelberger
    "An "atomic COW snapshot"—easily the most hilarious-sounding feature ever to grace a filesystem—is an image of the entire filesystem in exactly the condition it was in at a given instant in time, no matter what else was transpiring at the time

    So if you take a snapshot of a filesystem at 8:13 and 32 seconds pm on December 19, 2013, that snapshot will contain every single byte of that filesystem at exactly 8:13 and 32 seconds pm on December 19, 2013—period, no ifs, ands, or buts. This helps keep high-activity structures like databases consistent. As long as the database uses journaling (and if it doesn't, upgrade!), its journal will be consistent in the snapshot. Any partially completed transactions can be cleanly rolled back instead of leaving the database in an inconsistent state."

    Bitrot and atomic COWs: Inside “next-gen” filesystems
    Unfortunately,
    You actually describe the most common scenario that is actually broken by snapshotting...
    When you're talking about a database and database transactions, you can have long-running transactions (sequential flow, very large data changes), and since a snapshot is completely unaware of these kinds of operations, it cannot guarantee data integrity.

    This is why snapshots should expressly never be enabled where there is a database unless some day the BTRFS snapshot is made "application aware."
    So, for example, in the MS Windows world there is a snapshot technology called "Volume Shadow Copy", where plugins are written to make it aware of specific database and mail applications, so that the application (and its activity) is issued a suspend command, and only once the suspend has happened is the snapshot created.

    BTRFS snapshots have no such "application awareness" so will create snapshots immediately regardless of application state and if data is in flight it's anyone's guess what will be in your snapshot.

    So yes... although there is no such thing as an atomic snapshot (go ahead and look that up, you won't find anything except a fairly specialized situation that has nothing to do with what we're talking about here), there definitely is such a thing as an "atomic transaction."

    BTW -
    Disregard anything anyone has to say about bitrot... It's FUD.
    I've spoken to disk manufacturers directly who do extensive testing on their own products, and they say that as long as you care for your disk there is no such thing as bitrot.

    TSU
