Now, creating the insert statements is a breeze with the "insert into tbl0(...) values(...),(...),..." MySQL syntax. I can push thousands of rows in a single query, the only limit being the MySQL packet size, hence the 'ish' in the title. But what about updates? Before now I've resorted to just running the updates as single queries, which, let's face it, is slow. So let's look at how we can speed this up.
First create a test table:
create table test0(sid int, staffid varchar(255), name varchar(255), unique(sid, staffid));
Insert some data:
insert into test0(sid, staffid, name) values(1, 'ar', 'Andrew'), (1, 'jm', 'John'), (1, 'sl', 'Seb');
Now we want to turn:
update test0 set name = 'Andrew 2' where sid = 1 and staffid = 'ar';
update test0 set name = 'John 2' where sid = 1 and staffid = 'jm';
update test0 set name = 'Seb 2' where sid = 1 and staffid = 'sl';
Into a single query, and we do this like so:
insert into test0(sid, staffid, name) values(1, 'ar', 'Andrew 2'),
(1, 'jm', 'John 2'),
(1, 'sl', 'Seb 2')
on duplicate key update name = values(name);
`test0` should now have been updated with the new values. The key to all this is using "on duplicate key update".
I should point out, for the uninitiated, that you need a unique (or primary) key for this to work. In this case it is (sid, staffid). Note that the unique key columns themselves don't need to appear in the update clause, only the columns you actually want changed.
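Building such a statement from a list of rows is mostly string assembly. Here is a minimal sketch in Python (the `build_upsert` helper and its naive quoting are my own for illustration; in real code use your driver's escaping, e.g. mysqli's escape function or parameterised queries):

```python
def build_upsert(table, key_cols, update_cols, rows):
    """Build a multi-row INSERT ... ON DUPLICATE KEY UPDATE statement.

    rows: list of dicts keyed by column name.
    NOTE: the quoting below is deliberately naive, for illustration only.
    """
    cols = key_cols + update_cols

    def quote(v):
        # Integers pass through; strings get single-quoted with '' escaping.
        return str(v) if isinstance(v, int) else "'" + str(v).replace("'", "''") + "'"

    # One "(v1, v2, ...)" tuple per row, joined into the VALUES list.
    values = ", ".join(
        "(" + ", ".join(quote(r[c]) for c in cols) + ")" for r in rows
    )
    # Only the non-key columns are rewritten on a duplicate key hit.
    updates = ", ".join(f"{c} = values({c})" for c in update_cols)
    return (
        f"insert into {table}({', '.join(cols)}) values {values} "
        f"on duplicate key update {updates};"
    )

rows = [
    {"sid": 1, "staffid": "ar", "name": "Andrew 2"},
    {"sid": 1, "staffid": "jm", "name": "John 2"},
    {"sid": 1, "staffid": "sl", "name": "Seb 2"},
]
sql = build_upsert("test0", ["sid", "staffid"], ["name"], rows)
print(sql)
```

Running this reproduces the statement above, with the three row tuples in one VALUES list and a single `name = values(name)` clause at the end.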
Initial tests show a two-thirds increase in speed when we are:
- Hitting MySQL from a PHP script.
- Breaking the updates into 1024k chunks.
- Connecting to MySQL on localhost.
Drawbacks and Unknowns
- I've not tested this against multiple update queries sent in one go, i.e. "update tbl0 set col1 = 1 where col0 = 0; update tbl0 set col1 = 2 where col0 = 1; ...".
- Increased memory use from the size of the insert statement. However, when I took the chunks down to 500k and even 250k, I lost no more than 5-10% performance.
- If you're using PHP, watch the memory limit. Also keep your chunks below MySQL's max_allowed_packet.
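The chunking itself is easiest to do on byte size rather than row count, so every statement stays safely under the packet limit. A rough sketch in Python (the `chunk_rows` helper and the 1 MB budget are my own; pick a budget below your server's max_allowed_packet):

```python
def chunk_rows(rows, render_row, header, footer, max_bytes=1024 * 1024):
    """Yield lists of rendered row tuples such that header + tuples +
    footer stays under max_bytes per statement."""
    overhead = len(header) + len(footer)
    chunk, size = [], overhead
    for row in rows:
        rendered = render_row(row)
        extra = len(rendered) + 2  # +2 for the ", " separator
        if chunk and size + extra > max_bytes:
            yield chunk
            chunk, size = [], overhead
        chunk.append(rendered)
        size += extra
    if chunk:
        yield chunk

# Hypothetical usage: render 10,000 rows and batch them into statements.
header = "insert into test0(sid, staffid, name) values "
footer = " on duplicate key update name = values(name);"
rows = [(1, f"s{i}", f"Name {i}") for i in range(10000)]
render = lambda r: f"({r[0]}, '{r[1]}', '{r[2]}')"
statements = [
    header + ", ".join(chunk) + footer
    for chunk in chunk_rows(rows, render, header, footer)
]
print(len(statements), max(len(s) for s in statements))
```

Each element of `statements` is one complete bulk upsert ready to send; running them in sequence replaces thousands of individual updates.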