View and Union very slow

December 11th, 2008 robin Posted in MySQL No Comments »

I’ve found quite a few articles like this one from MySQL selling views as the perfect way to join archive data with live data to provide seamless access.

What I wanted: keep archive data in a compact, non modifiable table (I chose packed MyISAM tables), and live data in a transactional table (I chose InnoDB), possibly even on two different machines.  I was thrilled when I found the article linked above, with the sweet idea how to create a view which would combine both, and imagined that while it would make a performance hit, it shouldn’t be that bad, because the data in the packed, optimised MyISAM tables should have much faster access than the InnoDB, and given that the volume of the data is in there, and all select conditions are on indexed columns, it should be pretty fast.

WRONG!

Views using a UNION are very poorly optimised, and create complete temporary tables before matching any where conditions.  Even on a COUNT(*) which should not even require /any/ data to be retreived, it took significantly longer.

Here by example:

Get the sample employees database here, and load it.

Create a test database, and a salaries table in it:

(root@127.0.0.1) [employees]>CREATE DATABASE `test`;
(root@127.0.0.1) [employees]>CREATE TABLE `test`.`salaries` (
  `emp_no` int(11) NOT NULL,
  `salary` int(11) NOT NULL,
  `from_date` date NOT NULL,
  `to_date` date NOT NULL,
  PRIMARY KEY (`emp_no`,`from_date`),
  KEY `emp_no` (`emp_no`),
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Then split the table salaries:

(root@127.0.0.1) [employees]>INSERT INTO `test`.`salaries` SELECT * FROM `employees`.`salaries` WHERE emp_no<242293;
(root@127.0.0.1) [employees]>DELETE FROM `employees`.`salaries` WHERE emp_no<242293;

How long does it take to count each one?

(root@127.0.0.1) [employees]>SELECT SQL_NO_CACHE COUNT(*) FROM `employees`.`salaries`;
+----------+
| COUNT(*) |
+----------+
|  1492975 |
+----------+
1 row in set (0.30 sec)
(root@127.0.0.1) [employees]>SELECT SQL_NO_CACHE COUNT(*) FROM `test`.`salaries`;
+----------+
| COUNT(*) |
+----------+
|  1351072 |
+----------+
1 row in set (0.29 sec)

That’s ok, and we see from the explain that it is indeed using a SIMPLE select, using the index on epm_no:

(root@127.0.0.1) [employees]>EXPLAIN SELECT SQL_NO_CACHE COUNT(*) FROM `test`.`salaries`;
+----+-------------+----------+-------+---------------+--------+---------+------+---------+-------------+
| id | select_type | table    | type  | possible_keys | key    | key_len | ref  | rows    | Extra       |
+----+-------------+----------+-------+---------------+--------+---------+------+---------+-------------+
|  1 | SIMPLE      | salaries | index | NULL          | emp_no | 4       | NULL | 1351439 | Using index |
+----+-------------+----------+-------+---------------+--------+---------+------+---------+-------------+
1 row in set (0.01 sec)

So let’s make a view and join the tables, and try that again on the view:

(root@127.0.0.1) [employees]>CREATE VIEW salaries_all AS SELECT * FROM `employees`.`salaries` UNION ALL SELECT * FROM `test`.`salaries`;
(root@127.0.0.1) [employees]>EXPLAIN SELECT SQL_NO_CACHE COUNT(*) FROM `employees`.`salaries_all`;
+----+--------------+------------+------+---------------+------+---------+------+---------+------------------------------+
| id | select_type  | table      | type | possible_keys | key  | key_len | ref  | rows    | Extra                        |
+----+--------------+------------+------+---------------+------+---------+------+---------+------------------------------+
|  1 | PRIMARY      | NULL       | NULL | NULL          | NULL | NULL    | NULL |    NULL | Select tables optimized away |
|  2 | DERIVED      | salaries   | ALL  | NULL          | NULL | NULL    | NULL | 2713588 |                              |
|  3 | UNION        | salaries   | ALL  | NULL          | NULL | NULL    | NULL | 1351439 |                              |
| NULL | UNION RESULT | <union2,3> | ALL  | NULL          | NULL | NULL    | NULL |    NULL |                              |
+----+--------------+------------+------+---------------+------+---------+------+---------+------------------------------+
4 rows in set (9.85 sec)

Oh my god!

Even a query with sub-queries is faster by a factor of 20!

(root@127.0.0.1) [employees]>SELECT SQL_NO_CACHE ( SELECT COUNT(*) FROM `test`.`salaries` ) + ( SELECT COUNT(*) FROM `employees`.`salaries` );
+----------------------------------------------------------------------------------------------+
| ( SELECT COUNT(*) FROM `test`.`salaries` ) + ( SELECT COUNT(*) FROM `employees`.`salaries` ) |
+----------------------------------------------------------------------------------------------+
|                                                                                      2844047 |
+----------------------------------------------------------------------------------------------+
1 row in set (0.56 sec)
(root@127.0.0.1) [employees]>EXPLAIN SELECT SQL_NO_CACHE ( SELECT COUNT(*) FROM `test`.`salaries` ) + ( SELECT COUNT(*) FROM `employees`.`salaries` );
+----+-------------+----------+-------+---------------+--------+---------+------+---------+----------------+
| id | select_type | table    | type  | possible_keys | key    | key_len | ref  | rows    | Extra          |
+----+-------------+----------+-------+---------------+--------+---------+------+---------+----------------+
|  1 | PRIMARY     | NULL     | NULL  | NULL          | NULL   | NULL    | NULL |    NULL | No tables used |
|  3 | SUBQUERY    | salaries | index | NULL          | emp_no | 4       | NULL | 2713588 | Using index    |
|  2 | SUBQUERY    | salaries | index | NULL          | emp_no | 4       | NULL | 1351439 | Using index    |
+----+-------------+----------+-------+---------------+--------+---------+------+---------+----------------+
3 rows in set (0.00 sec)

I guess for now I’ll have to wait till views are optimised before I can use them like this…

AddThis Social Bookmark Button

A flood at work!

October 1st, 2008 robin Posted in Work, panic No Comments »

What a morning!  I was just introducing the new programmer to the company, showing him his desk and explaining the processes, when I heard water running… just a bit unusual in an office…  a glance behind towards my desk, and I see the start of a trickle coming out of the panel above my desk.  I quickly pulled my desk out of the way, as the trickle turned into a torrent!  As we found out later, the building contractors who are renovating the building have been re-connecting old heating pipes, and every time they have done it, a pipe has leaked somewhere… but rather than warn people, and give them a number where to call to have it turned off in an emergency, they just thought they’d try it out… so about 15 minutes later when we finally got the builders to turn off the water we already had several hundred liters of dirty old heating water in an expanding circle, only a small amount of it being caught by the buckets we quickly spread out underneath, or the plastic sheet with which we tried to guide it out the window with…  What a morning!

AddThis Social Bookmark Button