August 5
Beware of Rails Optimistic Locking and MySQL
There’s a caveat when using rails optimistic locking inside callbacks in a fault tolerant way with mysql’s default settings. Of course that’s a lot of things and sounds like a very specific scenario, but its not that much. Let me break it down:
Rails optimistic locking
Rails will automatically create locks around a record being updated in parallel if you insert a column named ‘lock_version’ which defaults to 0 in it.
The way it does that is by incrementing the ‘lock_version’ column on every update, and checking this value after each update. If the value read on the time of the update is not the same as the local copy you updated, it throws an ‘ActiveRecord::StaleObjectError’.
This is best exemplified by the following snippet:
>> user1 = User.first => #<User id: 1, user_id: 1, coins: 5247248, lock_version: 0> >> user2 = User.first => #<User id: 1, user_id: 1, coins: 5247248, lock_version: 0> >> user1.coins = 10 => 10 >> user1.save => true >> user2.coins = 20 => 20 >> user2.save ActiveRecord::StaleObjectError: Attempted to update a stale object
Rails callbacks and transactions
So, now that we know how optimistic locking works, lets take a quick look at callbacks and transactions.
In rails, there’s a multitude of callbacks (16 to be precise) that you can use to add hooks on the lifecycle of your object. You can check them out for yourself on the console:
>> ActiveRecord::Callbacks::CALLBACKS => ["after_find", "after_initialize", "before_save", "after_save", "before_create", "after_create", "before_update", "after_update", "before_validation", "after_validation", "before_validation_on_create", "after_validation_on_create", "before_validation_on_update", "after_validation_on_update", "before_destroy", "after_destroy"] >> ActiveRecord::Callbacks::CALLBACKS.size => 16
The interesting part for us is that the whole callback chain (that is, the method being hooked to (create, destroy, update) and all their callbacks (before_create, after_create, etc)) happens in the same database transaction. This means that rails opens a transaction before executing the first callback, and commits the transaction on the database only after all callbacks have been called without being rolled back.
Creating fault tolerant callback for your rails model
There are many different ways to make sure a user’s request won’t fail in the event that a ‘StaleObjectError’ is thrown in a optimistically locked object that’s been updated in parallel. One such approach for fields that are incremented / decremented, like the coins from our user below, is just trying again.
# class User < ActiveRecord::Base
def add_coins(coins)
begin
self.coins += coins
self.save!
rescue ActiveRecord::StaleObjectError
self.reload
retry
end
end
In the case above, if the user happens to have coins be added simultaneously by two different processes, the second will clash and throw a ‘StaleObjectError’. Though, it’ll reload the user, try to increment the coins again and finish successfully (assuming it doesn’t clash with another process again, in which case it will just retry until its the sole process updating the row. Remember this is *optimistic locking* afterall, so this shouldn’t happen so often).
So in the above case we can ensure multiple processes update the user record simultaneously without having to return an error for the user that read the stale object.
The infinite loop
All is not well though. An infinite loop will arise in case the code above is used inside a callback, if you are on a default mysql installation:
#class Monster < ActiveRecord::Base before_destroy :add_coins_to_user def add_coins_to_user user = find_user_which_dealt_last_blow user.add_coins(self.loot) end
In the example above, the ‘Monster’ class will add the amount of coins to the user that it has when its destroyed by it. The ‘add_coins_to_user’ will be called from the ‘before_destroy’ callback. Although it would seem this should work perfectly, and infinite loop will ensue in case two monsters are killed in parallel by the same user. To understand why, we need to revisit transaction isolation levels.
Transaction isolation levels in MySQL
There are four isolation levels available in most commonly in use databases. Those are:
- READ UNCOMMITTED – A transaction can read data from other transactions that have not been committed yet (dirty reads).
- READ COMMITTED – A transaction can read data from other transactions after those transactions have been committed.
- REPEATABLE READ – The data available for the transaction to read will be the same as it was when the transaction began even if other transactions change it.
- SERIALIZABLE – Places a lock on every read so that no transaction can read a row previously read by other transactions.
Although a detailed description of the use of each isolation level is material enough for another post, the important fact here is:
“Most other databases use the READ COMMITTED transaction level, but mysql uses the REPEATABLE READ transaction level by default.”
What this means is that, if two transactions read the same row containing the same version number (say, the version 0), even though the first transaction has saved a new version of the row (version 1), the second transaction will continue to read version 0 until its committed or rolled back. Associated with the retry clause on the rescue, the code will get stuck in the submit, read version 0, rescue, retry loop.
This means that, the algorithm proposed for fault tolerance above is only suited for the READ COMMITTED transaction level. In this level, after transaction 1 changes the version number, transaction 2 will read the version 1 on reload, and all will proceed as expected.
Changing MySQL’s default transaction level
Now, changing the default transaction level is not the only way of fixing this particular issue and might not even be the best way on your situation. You could reeengineer your code so the retry is called outside of a callback (and therefore outside of a transaction) or you could return false on the before filter (therefore rolling back the transaction) and just retrying outside of the callback (where a new transaction will read the new value), etc.
Though, I’ll show you how to update MySQL’s transaction level, so that it behaves like most other databases do by default. I hope I don’t have to tell you that if you attempt this in a codebase that expects the REPEATABLE READ transaction level, that code will break.
The way to change MySQL’s transaction level is my editing it’s config file, either on your user or the global config file. Here are where mine are located.
~ $ locate my.cnf /etc/mysql/my.cnf /home/mtoledo/.my.cnf
It is possible that your version of mysql has a bug where the user version of the file is supposed to be “my.cnf” instead of “.my.cnf”. If that’s the case, workaround it accordingly (or better yet, update your database).
The change is made in the [mysqld] section of the file. You only need to add the differences if its a local file, or add it inside the correct part of the file in the global config. I’ll show you both so you can pick whichever serves you best.
~ $ cat .my.cnf [mysqld] transaction_isolation = READ_COMMITED
~ $ grep -B 10 -A 4 transaction-isolation /etc/mysql/my.cnf [mysqld] # # * Basic Settings # # # * IMPORTANT # If you make changes to these settings and your system uses apparmor, you may # also need to also adjust /etc/apparmor.d/usr.sbin.mysqld. # transaction-isolation = READ-COMMITTED user = mysql pid-file = /var/run/mysqld/mysqld.pid socket = /var/run/mysqld/mysqld.sock port = 3306
With those in place, the code will behave as expected, and no infinite loop will occur.
Finally, if you want to make sure your application has the right transaction isolation level it needs to work, you can add a file in initializers with the following script:
ActiveRecord::Base.connection.execute("select @@global.tx_isolation, @@tx_isolation").all_hashes.first.each do |var, val|
if val != 'READ-COMMITTED'
puts "#{var} != 'READ-COMMITTED'\n\tPlease add 'transaction-isolation = READ-COMMITTED' to your my.cnf"
exit
end
end
This will print a warning message in case the transaction level is not the one expected so that you can act accordingly.
Thanks a lot to coderrr for the help figuring this out, and for the warning script above.
Interesting post man.
Good piece of information.
Tks for sharing.
Useful information. Question about the example, though: in add_coins_to_user(user), where does the user argument come from, since it’s a callback?
@Mark haha, good point. Shouldn’t be making up those examples. Thanks for the heads up. :)
@Mark haha, good point. Shouldn’t be making up those examples. Thanks for the heads up. :)
Ooops, should have mentioned great post! Waiting for the next one!
Wow this is a pretty serious problem.
I think you should also recommend not retrying forever, but only a few times. This should avoid most of the problems without requiring one to change the transaction isolation level.
@Hongli yea definitely, I was going to suggest this at first with my other 2 suggestions (doing it outside of a callback or retrying outside), but retrying for a limited set of time would still eventually throw the error at the user after however many multiple attempts.
In any way, I’m with you that it’s much better than having it loop infinitely, and is good practice to limit the retry in a set number in all situations (specially with optimistic locking, where conflicts shouldn’t be that numerous) just in case something else unexpected goes wrong.
Thanks for your comment. :)
Actually, @Hongli, now that I think about it, retrying just a limited set of times instead of infinitely really does no difference, because if the transaction_isolation is REPEATABLE-READ all retries are going to be the same.
So, the only way of avoiding this situation without changing the isolation level is not retrying at all, and throwing a ActiveRecord::Rollback instead of retrying.
But, like I said above, unless you capture this outside of the callback to retry, this will be shown to the user as an error.