Running a MySQL DB in memory

Hi,

My question is purely theoretical; I do not know if anyone has ever made such a setup, or if it is even possible.

My idea is the following:

I would like to run MySQL in memory in order to take advantage of the very high-speed I/O.
I know how to tune a MySQL server, and I also know that a very large memory cache for InnoDB tables lets the engine optimize memory usage; if the cache is big enough, the majority of the I/O is effectively done in memory.
Nonetheless, I would like to find a way to really use only RAM … something like mounting /var/lib in a tmpfs and then using a script to synchronize it with the HDD for persistence.

The ideal setup would be a system that mounts /var/lib in memory (the usual place where MySQL keeps its database files) and allows synchronization with the actual disk, for obvious persistence reasons.

The system would boot from the hard disks, mount the partitions in memory, synchronize them from the local hard drive partitions, and then some script would do the reverse synchronization, from memory/tmpfs back to disk.

What applies to /var/lib could also be done with other partitions … memory size is not a problem …

Is this possible?

Has anyone tried this before?

Regards.

Doesn’t memcached meet your needs?

Hi,

Simple answer :slight_smile: no.
Memcached is not a database.
It does help, no doubt, but I would like to try a lightning-fast solution, one that would not need memcached. And it seems to me that placing the data files in memory is actually very simple on Linux … a tmpfs is, by default, sized at half (!!!) of the computer's RAM, so why not put it to good use?
In today's PCs 16 GB is not a problem, so for development databases this is more than enough to run MySQL …

I know it is easy to use tmpfs … one mounts it at, say, /mnt/memdir and, on boot or before starting MySQL, runs rsync -a /var/lib/mysql/ /mnt/memdir

Then change the data directory in the my.cnf file and that's it.
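As a sketch, the boot-time part could look like this (the mount point, the 16G size and the service name are assumptions; adjust them to your system):

```shell
#!/bin/sh
# Sketch of the tmpfs setup described above.
MEMDIR=/mnt/memdir
DATADIR=/var/lib/mysql

seed_cmd() {
    # The rsync invocation that copies the db files into RAM; echoed
    # here so it can be inspected before being run for real.
    echo "rsync -a $DATADIR/ $MEMDIR/"
}

setup() {
    mount -t tmpfs -o size=16G tmpfs "$MEMDIR"  # RAM-backed filesystem
    eval "$(seed_cmd)"                          # seed it from disk
    # then set  datadir = /mnt/memdir  in my.cnf and:
    systemctl start mysql
}
```

setup would run as root once at boot; the reverse rsync (memdir back to disk) belongs in the shutdown path.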
Before shutting down (and this is really the problem) one must back up the database and also synchronize /mnt/memdir back to the usual /var/lib/mysql, or somewhere else appropriate for the purpose.

Of course, if something goes down all data is lost :frowning: … and that is a really big problem whose only solution is regular backups … but that defeats the purpose of low I/O …

Regards.

Maybe you want the memory storage engine:

MySQL :: MySQL 5.1 Reference Manual :: 13.9 The MEMORY Storage Engine

If not you could develop your own storage engine.
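For reference, a minimal MEMORY-engine example (the table names here are made up; the shadow InnoDB table is one way to fake persistence):

```sql
-- A MEMORY table: its contents live in RAM and vanish on server restart
CREATE TABLE session_cache (
    id      INT NOT NULL PRIMARY KEY,
    payload VARCHAR(255)
) ENGINE = MEMORY;

-- A shadow InnoDB twin used only for persistence
CREATE TABLE session_cache_disk LIKE session_cache;
ALTER TABLE session_cache_disk ENGINE = InnoDB;

-- On shutdown: persist; on boot: reload
REPLACE INTO session_cache_disk SELECT * FROM session_cache;
INSERT INTO session_cache SELECT * FROM session_cache_disk;
```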

I believe there are several in-memory MySQL engines available, such as Blackray:

BlackRay - Wikipedia, the free encyclopedia

In-memory database - Wikipedia, the free encyclopedia

Cheers,
Pete

I would like to run MySQL in memory in order to take advantage of the very high speed IO.

Have a look at MySQL Cluster. It is designed to run in memory and can be installed on a single machine as well (but losing some of its advantages).

Hi Ken,

ummm … not exactly what I was thinking.
I have to be more precise.
The MEMORY engine creates tables in memory, so it is, for all practical purposes, like HSQLDB, Derby and the like (when used in memory-only mode).
There is also no persistence.
Once the computer powers down … so does all the data in those tables.
One could indeed build a setup similar to the one I was describing … when the application boots, load the data into the MEMORY tables from backup tables or from a file.
Both procedures … save to file on shutdown and load from file on boot … would have to be implemented.
In that sense it is like HSQLDB, Derby and the like.
Very good DBs, no doubt, but if I already have all my code optimized and ready for MySQL, why change? Especially if a simple setup like using tmpfs can solve the problem.

If not you could develop your own storage engine.

:slight_smile: wow … hold your horses Ken … !!!
I do not know if I really have the talent necessary for that :slight_smile:
I would much rather take advantage of what people more focused on the issue have to offer :slight_smile:
And if Linux can provide a functionality that solves the problem … excellent!

I am testing it now … I will report back in a couple of hours about my small tests …

Regards.

I think I see what you’re really asking about…

Be aware, though, that your query conflicts with most database requirements… Database engines not only have to be fast, but usually you can't sacrifice reliability, so the more you move data into memory, even temporarily, the more you risk the data's integrity should there be a power glitch.

So, might I recommend…

You can create OLAP cubes and “views” on some database engines as virtual data stores. Note though that these virtual stores do not ordinarily write back to the original data so are used mainly for “read only” and hardly ever for “write” operations.

You can deploy a copy of the data to a virtual disk, but you severely risk the data’s integrity. You’d have to at least invest heavily in critical backup power systems (both within the machine and outside) to decrease the risk but there’s no getting around the overall increased risk.

If you deploy either of the above instances and wish to “write back” changes to disk, many database clustering architectures support “lazy write backs” which generally means that changes can be written as a lower priority process that won’t slow down the overall performance while still providing for the atomicity of transactions… which means in plain language that if a disruption occurs then you’ll be guaranteed to lose some of the latest data but the integrity of your databases are preserved.

Last comment…
Considering how cheap disks are today and some of the advances in massively extendable/expandable databases, both relational (like MySQL) and non-relational (like Cleopatra) which don’t have to risk your data, I’d recommend you first list your project objectives and requirements in concrete terms, then take a look at what technical solution might best address your needs.

HTH a bit,
Tony

Hi,

I get precisely your idea.
But mine will be a lot more simplistic:
Get a 1U server with 256 GB of RAM or more.
Only two hard drives to serve as backup storage. Another 1U slot holds a UPS connected to the server. As soon as there is a power failure, a heartbeat/cron system that checks the input/output of the PSU can back up the DB and stop the application, and things will be OK.
This does lower the availability of the system a lot.
Every time there is a power failure … the system must be stopped somehow and the database backed up to disk.
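A sketch of that heartbeat check, assuming NUT's upsc tool is what talks to the UPS (the UPS name and the sync paths are made up):

```shell
#!/bin/sh
# Run from cron every minute; flushes the tmpfs copy to disk when the
# UPS reports it is running on battery.
UPS=myups@localhost
MEMDIR=/mnt/memdir
SAFEDIR=/var/lib/mysql

on_battery() {
    # NUT reports "OB" (on battery) in ups.status when mains power is lost
    upsc "$UPS" ups.status 2>/dev/null | grep -q OB
}

flush_to_disk() {
    mysqladmin shutdown                      # clean stop, consistent files
    rsync -a --delete "$MEMDIR/" "$SAFEDIR/" # persist the RAM copy
}

if on_battery; then
    flush_to_disk
fi
```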

But also remember … there is a limit to the number of transactions on an InnoDB table on a hard drive (IOPS) … that limit can only be surpassed with memory or SSDs …

Regards.

If your database is small enough, or at least the working part of it, then it might be possible to deploy on a single server as you propose… But depending on your database application, you'd typically do better to deploy a proper RAID system (6, 10 or 01) and take a look at your typical usage and execution plans… for a bigger system, consider clustering across multiple machines.

You really need to properly analyze your database needs before proposing hardware configurations.

Provisioning a database server properly is a very big topic; I recommend you do a little research. Storage I/O is only one possible bottleneck; there are also CPU and RAM issues, reliability and availability on multiple levels, and more.

If possible I also recommend you consult with the architects of your dB. In my experience it’s far more important for the dB to be architected and optimized as best as possible, then from that try to understand the hardware requirements and then provision properly. If you don’t do that or do it in the wrong order, you’ll likely end up wasting a ton of money and likely getting a ton of unacceptable performance along the way.

A shortlist of stuff that should be part of your analysis
Short term vs intermediate term vs long term: how fast and how far might it grow, and what do you need to design for?
Application needs, types of queries. Run actual tests if possible
QoS needs, can the application be down for maintenance and/or emergencies?

And at the beginning and again at the end, How Much Money are you able to or want to spend to achieve your objectives?

HTH,
Tony

Hi,

No doubt RAID can double HDD bandwidth (thinking of RAID 10) and make the data very secure indeed.
But for a 1U solution a RAID card + BBU is a very hot item … it "cramps" the server, making airflow suboptimal. This can be a hazard for long-term operation.
Also, no matter the database tuning or optimization, and the caching done by your ORM layer (Hibernate in my case), one always reaches the same problem.
DB optimization is indeed the single most important part of development, no doubt, but there is a limit to what you can do, not only with your queries and schemas but also, in my case, with the required transactions.

You really need to properly analyze your database needs before proposing hardware configurations.

Provisioning a database server properly is a very big topic; I recommend you do a little research. Storage I/O is only one possible bottleneck; there are also CPU and RAM issues, reliability and availability on multiple levels, and more.

I have quite a few years of experience in app development.
And today one can get a 1U 128 GB DDR3 system (expandable to 256 GB!) for about 5200 euros …

Until recently I was not convinced an in-memory DB was a solution … the price of memory was very, very high.
Maybe something like SSDs could do, although SSDs have "the problem" of certain failure in a very short time … 2-3 years.
So for a very high-intensity transaction application … with very little data per transaction … memory is the solution.
About the CPU: the price quoted was for 24 cores, 2 × AMD Opteron 6174 2.2 GHz 80 W.
A server like this can save a lot of cash, even if it needs a UPS, in the "thought setup" I proposed.
And there is nothing out there that can even get close to the speed of RAM access.
SSDs are not anywhere near as fast as RAM, and they have the "problem" … certain failure.

If possible I also recommend you consult with the architects of your dB. In my experience it’s far more important for the dB to be architected and optimized as best as possible, then from that try to understand the hardware requirements and then provision properly. If you don’t do that or do it in the wrong order, you’ll likely end up wasting a ton of money and likely getting a ton of unacceptable performance along the way.

A shortlist of stuff that should be part of your analysis
Short term vs intermediate term vs long term: how fast and how far might it grow, and what do you need to design for?
Application needs, types of queries. Run actual tests if possible
QoS needs, can the application be down for maintenance and/or emergencies?

And at the beginning and again at the end, How Much Money are you able to or want to spend to achieve your objectives?

HTH,
Tony

I am still analysing all the requirements; this memory solution would be a done deal for the problem of I/O speed in the database.
Meaning: at most, the 1U server would receive 6 Gbps through the dual gigabit Ethernet ports plus a possible 4-port expansion.
So … it can only handle 6 Gbps of queries …

The application is a fast-transaction financial system … intense transactions but very little actual data per transaction … and the competitor's solution allows me to quote even one (maybe two) 2U servers with 512 GB of RAM and 4 CPUs each … 48 cores per server. And that settles the problem.
So in terms of hardware I have a price advantage even if I push much higher.
My problem is that I do not have access to any real data for testing. Hence my caution about the problem. Also, I have some parts of the actual business model, but not all, and that can make a world of difference.
I have tested on my desktop only a development version of what their tables and transactions could be in a functional system. But not having all the requirements makes it impossible to evaluate, and very risky to make estimates. I will not do that … too many years in the business not to know that is unhealthy :slight_smile:
So the price of the hardware really is not a problem.
Actually, the co-location price difference for the solution I am thinking about (in case it is possible) more than compensates for the RAM price over a 4-year period. So the RAM price of the server is just an up-front cost, nothing else.

Regards.

On 2011-10-01 13:26, keyb user wrote:
> Also no matter what is Database tunning or optimization and the caching
> done for you ORM layer (Hibernate in my case ) one always reaches the
> same problem

I think you need a database engine that works in memory and does reliable
backups to disk as a periodic background task. I have seen this in the real
world, but it was an in-house (expensive) app/engine. There must be engines
out there doing it; do not fixate on MySQL if it doesn't do it. The
solution of having a RAM disk and dumping the files to disk on closure or
power-off makes my hair stand up like you cannot imagine, after learning
you want this for real business transactions.


Cheers / Saludos,

Carlos E. R.
(from 11.4 x86_64 “Celadon” at Telcontar)

There must be engines out there doing it

I haven’t really thought this over, but it might be possible to do this by having two MySQL instances running on the same (or different) machines in a circular master/slave configuration, with the master using in-memory only storage and the slave using proper disk storage.

This way, in the event of a sudden loss of power, data loss should be minimised, proportional to the replication lag. In the usual case where you can actually detect a power failure and have enough reaction time, thanks to your UPS, to shut things down cleanly, data loss should be zero.

Resynchronisation on power up is achieved via the circular nature of the replication.
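A rough sketch of the two my.cnf files for such a pair (server IDs, paths and the replication details are invented for illustration):

```ini
# Node A -- "fast" instance, data on tmpfs, binlogging so B can follow
[mysqld]
server-id = 1
datadir   = /mnt/memdir
log-bin   = mysql-bin

# Node B -- "safe" instance, data on disk, also binlogging so A can be
# re-synchronised from it (circular replication)
[mysqld]
server-id = 2
datadir   = /var/lib/mysql
log-bin   = mysql-bin
```

Each instance would then be pointed at the other with CHANGE MASTER TO … followed by START SLAVE; after a power-up, node A would be re-seeded from B (e.g. from a dump) before following B's binlog again.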

Or did I miss your points by a mile? :stuck_out_tongue:

Hi,

In technical terms: other than someone pulling the cables from the PSU to the server, there would never be a power loss for the server.
All power losses are detected in time, and the system would be backed up/halted with more than sufficient time to do it.
But you got it exactly right.
Your solution is correct, and I thought about it … but it either means two database instances running (double the CPU load) or another server (double the servers).
I am confident enough about this setup, given the initial requirements, to dispense with the second server/slave instance.
I just need development and testing time, and more details :slight_smile:

Regards.

Hi,

Like you said … expensive, very expensive. Also, the in-memory solutions out there are not mature/functional enough when compared with MySQL (HSQLDB, Derby, etc.).
And also … I am more experienced with MySQL, and my current code is optimized for MySQL … so for me to embark on a new solution with a new DB just seems too much risk.
I really want something that closes the case on the possible I/O issues.

The solution of having a ram disk and dumping the files to disk on closure or
power off makes my hair stand up like you can not imagine, after learning
you want this for real business transactions.


Cheers / Saludos,

Carlos E. R.
(from 11.4 x86_64 “Celadon” at Telcontar)

:slight_smile:
Let's think in a cold, objective manner.
The vast majority of salespeople today (ignorant commercial folks) are trying to sell everyone SSD-based solutions. That means certain disk failure in a very short time.
That does not seem to make anyone's hair stand up … it might even draw some applause, by the way :slight_smile:
I understand your concern, but the backup is not on power-off … it is on the UPS signaling a power outage … which is more than sufficient time to trigger a backup and an application shutdown to the other app nodes, in case it is needed.
What exactly do you think happens when, say, a single Oracle DB is running and suddenly someone pulls the plug?
If you ever tried that on a real running system … it would be frightening …

The difference between a DB running in volatile memory and one on HDDs, in case of catastrophe, is only the amount of data lost.
And even that could be severely mitigated with a master/slave system.

Actually, in this case data is lost only in the event of real server damage; power outages are not a problem at all.

Regards.

Recommend re-reading my original post responding to your query.

In it, you’ll find I mentioned various means for data to be served from RAM, some <do not> require more than the single database engine instance, others do suggest multiple instances.

I also noted when it would be relatively safe for data to reside in RAM and when it would be more risky, possibly irresponsibly risky.

But, as I noted IMO there’s a lot more than what you’re considering that goes into typical provisioning… Storage I/O may not be the bottleneck and I’d be surprised if anyone would authorize any scenario that puts any data significantly at risk. Whatever you decide to do, you will have to prove your understanding of the job at hand and it’s not a matter of just doing it with less hardware.

Good Luck,
Tony

On 2011-10-02 01:46, keyb user wrote:

> I understand your concern but the backup is not on power off … it is
> on the UPS signaling a power outage … this is more then sufficient
> time to trigger a backup and a application shutdown to the other app
> nodes in case it is needed.

Nevertheless… it is too critical for my liking.

> What exactly do you think happens when, say a single Oracle DB is
> running and suddenly someone pulls the plug ?

I don’t know much about Oracle, but I assume a DB does atomic, consistent
operations. If the power goes off, the files should be consistent and only
the operation in progress could be lost.

> The difference between a DB running on volatile memory or hdd’s in case
> of catastrophe is only the amount of data lost.

More or less.


Cheers / Saludos,

Carlos E. R.
(from 11.4 x86_64 “Celadon” at Telcontar)