Best Practices

Portable code

When developing applications against a single RDBMS there is no need to worry about portability. However, when you are developing with the idea that the application may be deployed against any of a range of RDBMSes then you need to start thinking about how you formulate your queries, table structures, etc. While it is common practice to have separate classes for working with different databases, there are some things you can do to make that even easier.

ENUMs are EVIL

Developers who spend a lot of time developing against MySQL get into the habit of creating ENUM fields. (I will admit a certain amount of guilt here, too). This is not only not portable, but it doesn’t work quite the way you would expect. Even though the values in the ENUM are meant to be the only values you can insert into that field, MySQL does not enforce this quite the way you might think it should.

As an example:

CREATE TABLE enumevil (
    id INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY,
    evil ENUM('foo','bar','baz') NOT NULL DEFAULT 'foo'
);
Query OK, 0 rows affected (0.01 sec)
INSERT INTO enumevil (evil) VALUES ('bip');
Query OK, 1 row affected, 1 warning (0.02 sec)
SELECT * FROM enumevil;
+----+------+
| id | evil |
+----+------+
|  1 |      | 
+----+------+
1 row IN SET (0.00 sec)

Rather than spitting back an error, it quietly sets the value of the ENUM field to the empty string. That is not in the list of allowed values nor is it the default. So you are left with checking the values in code before inserting (which you should do anyway – see the next section on referential integrity.)

Instead of using an ENUM field consider this:

CREATE TABLE noevil_enum_choice (
    choice VARCHAR(3) NOT NULL PRIMARY KEY
);
INSERT INTO noevil_enum_choice VALUES ('foo');
INSERT INTO noevil_enum_choice VALUES ('bar');
INSERT INTO noevil_enum_choice VALUES ('baz');
CREATE TABLE noevil (
    id INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY,
    choice VARCHAR(3) NOT NULL REFERENCES noevil_enum_choice.choice
);

Although MySQL does not enforce referential integrity, this is now something can be easily converted to PostgreSQL as:

CREATE TABLE noevil_enum_choice (
    choice VARCHAR(3) NOT NULL PRIMARY KEY
);
INSERT INTO noevil_enum_choice VALUES ('foo');
INSERT INTO noevil_enum_choice VALUES ('bar');
INSERT INTO noevil_enum_choice VALUES ('baz');
CREATE TABLE noevil (
    id SERIAL PRIMARY KEY,
    choice VARCHAR(3) NOT NULL REFERENCES noevil_enum_choice(choice)
);

And PostgreSQL does enforce referential integrity. This also makes it easier to expand your choice list by just inserting a new row.

Referential Integrity

On the flip-side of the coin, developers who spend a good deal of time developing against PostgreSQL come to rely on the referential integrity built in to that RDBMS. While there is nothing wrong with that, your code should enforce this as well, if you want to be able to move from one RDBMS to another. In the example above we could rely on PostgreSQL’s REFERENCES statement to kick back an error whenever a bad value (like ‘bip’) is inserted into the choice field of the noevil table. However, as soon as you move your application to MySQL or sqlite it will happily insert anything you like into that field (with MySQL truncating it to three characters).

This is why it is important for your applications to take into consideration their own referential integrity. Here’s some python to illustrate:

import string
from spdo import *
 
class noevil(object):
    __init__(self, choice):
        self.db = SPDO('test')
        self.choices = []
        mydict = self.db.select("SELECT * FROM noevil_enum_choice")
        for c in mydict:
            self.choices.append(c['choice'])
        if choice not in choices:
            ex[] = ["Invalid value for choice: ", choice, " Valid options are: "]
            for c in self.choices:
                ex.append[c]
                ex.append[" "]
            raise Exception(string.join(ex, ''))
        # continue with normal processing here

This is by no means the entirety of the topic, or even more than merest tip of the iceberg, but it is a good place to start.

3 comments Portable code

You do know that PostgreSQL supports ENUMs, don’t you? Here is the appropriate reference: http://www.postgresql.org/docs/8.3/static/datatype-enum.html.

I was aware of the ability to use CREATE TYPE to create ENUMs in PostgreSQL. However you still have some of the same issues as with using MySQL ENUMs – namely adding a value requires an ALTER statement, and if the table in which it is used is large this can take quite a bit of time. Using a lookup table allows you to simply insert rows when you need new values. Also keep in mind that the PostgreSQL enum datatype label is not directly treated as text. So you have to explicitly cast it as text before making comparisons of that sort. (WHERE ‘foo’ = evil::text).

I can’t say that I’ve ever ran into the big issues you’re talking about with enum. I think part of it has to be that I was a self learner – and never really read into the features of the db so much as I should have. When I started understanding the 1 to N relationships in dbs, I created fk type relationships all the time -never specifying a enum field. True, there were situations where I had really tiny tables – and that wasn’t the most efficient, but it canned this problem. In the case that I would try to insert something, I started using joins and subselects, and defaulting to zero on not found situations. While this isn’t the best practice either, I would hesitate to say that those who are used to working with mysql find themselves using enum columns a lot.

Either way though, interesting point of view. The way I handle your invalid choice is a left join once again, however, then checking if the return type is null. Seems to be a better way to do it than to create another query.

Comments are closed.