Thursday, March 26, 2009

SQL Optimization 101 – Part 2

A naming convention for database schemas.

I am often surprised to note that many projects and companies don’t have any naming convention for the fields in their database.

You can for example have a schema as follows:
  • Table user
    iduser_nameuser_passaddress_id

  • Table address
    idstreet_namecitypostal_code

The address_id field in user is a foreign key: the primary key of table address.

I hate when schemas are designed like this. Why can’t I use the same field name when requesting over address_id? Moreover, if there is another table with IP addresses for example, it can spawn confusion.

Finally, when I see the schema of the table, I don’t know the datatypes at a glance.

I have used the following naming convention for a long time and it as proven very clear, if a little verbose, maybe.

Table name: PROJECT_TABLE.
For example: FORUM_USER

Field name: TABLE_NAME_TYPE_FIELD_NAME
Where TYPE can be:
  • i for integer
  • d for date types
  • t for timestamp
  • e for enum
  • s for string types
You have to name the foreign keys exactly as they are named in their own table.
We sometimes use trigrams to shorten long names.

The schema presented above would then look like:
  • Table EXMPL_USER
    user_i_iduser_s_nameuser_s_passaddress_i_id

  • Table EXMPL_ADDRESS
    address_i_idaddress_s_streetaddress_s_cityaddress_i_pcode
It is more verbose, but you see at a glance the global types of your schema, and the relationship between tables. And whatever the table you are in, the address id of an address will be address_i_id, making it a no brainer to write requests.

Amaury (who was the first to use this norm in our team) advocates the use of trigrams. I find it sometimes confusing, so I like better long names. The poet in me, probably.

1 comment:

amaury.net said...

Thanks for citing me ;-)

To be honest, I did some improvements on the standard used in the company where I did my first internship.

Anyway, my convention is a bit different:
- I use an additionnal prefix ('t' for tables, 'p' for stored procedures, 'tr' for triggers).
- I don't repeat the database name in every table name.
- The prefix of field names is always 3 characters long.

It seems obvious when you have to write the full name of a field, using the database and table namespaces.
forum.

Mine : forum.tComment.com_i_id
Yours : forum.forum_comment.comment_i_id

This kind of naming convention is very convenient because you can see foreign keys in just one look ; same thing for the type of data. For exemple, if I have 2 tables, tUser and tCompany :

tUser
-----
use_i_id
use_s_name
use_d_birthday
use_b_admin
com_i_id

tCompany
--------
com_i_id
com_s_name

It's very easy to read :
- use_i_id is the primary key of tUser
- use_s_name is a string
- use_d_birthday is a date
- use_b_admin is a boolean
- com_i_id is a foreign key to tCompany
- com_s_name is a string

About the field type, I use these conventions :
- i for integers (TINYINT, SMALLINT, MEDIUMINT, INT, INTEGER, BIGINT)
- f for float numbers (FLOAT, DOUBLE, REAL)
- s for character strings (VARCHAR, CHAR, TINYTEXT, TEXT, MEDIUMTEXT, LONGTEXT)
- c for one character (CHAR(1) or VARCHAR(1))
- b for booleans (BIT, BOOL, TINYINT(1))
- d for dates (DATE, DATETIME, TIME, YEAR)
- t for timestamps (TIMESTAMP)
- e for enumerations (ENUM)
- ee for sets (SET)
- o for blobs (VARCHAR BINARY, TINYBLOB, BLOB, MEDIUMBLOB, LONGBLOB)

Post a Comment

Please leave your comment

Search Results