Thursday, March 12, 2009

MySQL optimization 101, part 1

One of MySQL’s key performance factor is disk I/O, and this is particularly felt on big tables where there is lots of data on disk to be scanned. You have to be careful when creating your tables. If you don’t pay attention at this step, you’ll end up having rows bigger than necessary.

You need to make your schema as compact as possible from the start. This is the first absolute rule. It’s like cleaning your teeth. You can’t have white teeth if you don’t clean them. Well you can, but it’s going to cost much more. This said, if you are a smoker, it doesn’t guarantee that you’ll have white teeth.

If you double your row size just because you use default types without paying attention, you won’t notice it for ten rows. However, with a 50 million rows table, that wheighs 3 Go, the overhead will start to be felt.

If your MySQL table is twice bigger on disk than needed, you can assume that a query will be twice as long (well, maybe). And additionnaly your sysadmin will hate you for making him buy new disks all the time.

A little anecdote to prove my point, that I read on MySQL’s website.

A company used VARCHAR(1) fields to store flags, like “Y” or “N” (yes or no). A VARCHAR field uses 1 byte to store the length of the field (if the size is below 255, two bytes otherwise).
A CHAR(1) field uses only one byte, no need to store it’s length since it’s fixed.

If you follow me, each flag used two bytes, instead of one. Taking into account that they had several of these flags in each row, and that they had gigantic tables, this small overhead started to really matter, and they converted all of their tables to use CHAR in order to optimize both size and response time.

Please, keep this in mind, but don’t go for extremes all the time. We once designed a table without foreseeing how much it would grow. At one point, a year later, our platform crashed. The reason was that we had reached a field’s max capacity (65k). Your schema should be able to go on for ten years being unattended.

Next post will continue on schema optimization.

No comments:

Post a Comment

Please leave your comment

Search Results