Removing lots of data from a DB table

by Karanbir Singh Email

If you need to remove large amounts of data from a table ( specially when you need to remove *all* the data in a table), remember its faster to drop the table, and recreate it rather than doing a 'delete from tablename;'.

eg: this is against a postgresql db:

drop table rawdata;

Response time: 0m 2s 433ms  Total time: 0m 2s 433ms

CREATE TABLE "rawdata" ("ip" char(33), "sdate" varchar(20) NOT NULL,
 "key" varchar(100) NOT NULL,"country" varchar(2), "region" varchar(2),
  "city" varchar(40), "machine" int);

Response time: 0m 0s 337ms  Total time: 0m 0s 337ms

Whereas:

delete from rawdata;
200000000 row(s) affected

Response time: 0m 15s 353ms  Total time: 0m 15s 353ms

In some cases, where you want to drop most ( eg > 80% ) of the data in the table, its faster to select the data to be retained into a temp table, drop the original table and rebuild the temp table to the original name.

- KB

3 comments

Comment from: artms [Visitor]
artmsYou can also "truncate table rawdata" which is also faster then delete
26/May/2009 @ 11:05
Comment from: Karanbir Singh [Member] Email
Karanbir Singhartms,

'truncate' works quite nicely too. Thanks for pointing that out - however I've had issues with truncate on mysql db's when there were external references / foreign keys.
26/May/2009 @ 12:00
Comment from: Scott H [Visitor]
Scott HJust be careful with this. Obviously if you're deleting a whole lot of data you must be "in charge" of that table: but DELETE is friendlier than DROP to things that rely on triggers such as replication. Marcus Hagander recently posted a good tip for administrators to keep on top of this sort of thing:

http://blog.hagander.net/archives/142-Why-are-you-not-logging-your-DDL.html
26/May/2009 @ 17:39