Friday, March 13, 2009

A random day

Funny, yesterday I wrote that note about how I wanted my team members to learn a scripting language.

Today, one of my guys comes to me because he has been struggling with a SQL request for most of the morning.
After a few minutes of struggling with him, I just got fed up with his huge jointure that what taking ages AND not working for some reason.

I wrote 30 lines of perl to compare the results of two simpler requests, found the missing rows we were looking for desperately, needed two minutes to understand why we couldn't find them with our SQL and fixed it.

It turns out that NULL fields don't equal other NULL fields and neither of us were aware of that.
NULL is never equal to anything, not even NULL. Well, it makes sense actually, but we had not thought of that.

Learn a scripting language.

6 comments:

Antoine Larcher said...

Would you achieve your task by using a FULL OUTER JOIN ?

Here's the explanation :

r a school. We might first have a list of courses and instructors, as shown below. (Note that, for simplicity's sake, we are making the assumption that there is only one section of each course and, therefore, that we can use it as the primary key.) In this case, the instructor Adams has not yet been assigned to a course and the Computers course has no instructor assigned yet.

Course Instructor
Math Smith
Science Jones
NULL Adams
Computers NULL
Now, let's assume that we have a table containing course registrations for students. Again, for simplicity's sake, we'll assume that each student must take one and only one course. Every student must take a class, so we will include a NULL record for any student whose course registration information is unknown. In this case, we're not sure what course Alan is taking.

Student Course
Ryan Math
Betty Science
Alan NULL
If we perform an INNER JOIN, NULL values do not match each other. For example, if we perform the following query:

SELECT *
FROM Instructors INNER JOIN Students
ON Instructors.Course = Students.Course

We'd get the following result:

Course Instructor Student
Math Smith Ryan
Science Jones Betty
The OUTER JOIN operations includes data from one or both tables that doesn't match data in the other table. There are three types of OUTER JOINs:

The LEFT OUTER JOIN includes rows from the table specified on the left side of the JOIN statement that don't match with rows from the table on the right side of the JOIN statement.
The RIGHT OUTER JOIN includes rows from the table specified on the right side of the JOIN statement that don't match with rows from the table on the left side of the JOIN statement.
The FULL OUTER JOIN includes rows from both tables that don't match data in the other table.
If we performed the following query:

SELECT *
FROM Instructors LEFT OUTER JOIN Students
ON Instructors.Course = Students.Course

We'd get the following result set:

Course Instructor Student
Math Smith Ryan
Science Jones Betty
NULL Adams NULL
Computers NULL NULL
Similarly, the query:

SELECT *
FROM Instructors RIGHT OUTER JOIN Students
ON Instructors.Course = Students.Course

Would give us:

Course Instructor Student
Math Smith Ryan
Science Jones Betty
NULL NULL Alan
And, finally, the query:

SELECT *
FROM Instructors FULL OUTER JOIN Students
ON Instructors.Course = Students.Course

Would yield:

Course Instructor Student
Math Smith Ryan
Science Jones Betty
Computers NULL NULL
NULL NULL Alan
NULL Adams NULL
And that's a look at how NULL values are treated by the JOIN operation!

Loïc said...

yes, probably a good idea.

Two remarks though:
First: the tables were huge and we didn't know that it was a NULL value problem, at first.

Second, we didn't want all the NULL values actually.

But your point is valid, still.

Antoine Larcher said...

The bise man.

amaury.net said...

Well, you should know that "field IS NULL" and "field IS NOT NULL" are right, whereas "field = NULL" and "field != NULL" are wrong. So, "field1 = field2" doesn't work if one of them is null.

MySQL has optimisations for non-null tables, so you should avoid null-able fields.

These things can be learnt without knowing a scripting language, but sure, every programmer MUST know a scripting language (at least SH).

Loïc said...

my point was more on: knowing a scripting language enabled me to tackle the problem from another simpler angle and solve it quickly.
The SQL problem, I knew it, but didn't think about it because I was a bit overwhelmed by the complexity of the query.

amaury.net said...

Yes, I understand your point.

I just said that the same try/error/retry process would be possible using the simple MySQL command line client.

It is necessary to know a scripting language for several usages:
- Do some tests, like in your example.
- Write a quick and dirty prototype of a new functionnality, before coding it "for real".
- Write some administration scripts. It's fool to put a Java program in crontab when a BASH script can do the same job.

It shouldn't be a technical challenge. It must be natural. At least, as natural as opening a log file to search for an error.

Post a Comment

Please leave your comment

Search Results