Please recommend the best bulk-delete option

Time: 2023-01-16 15:10:35

I'm using PostgreSQL 8.1.4. I have 3 tables: one is the core table (table1) and the others are dependents (table2, table3). I inserted 70000 records into table1 and the appropriate related records into the other 2 tables. As I'd used CASCADE, I was able to delete the related records using DELETE FROM table1; It works fine when there are few records. When I have a huge volume of records, it tries to delete them all, but there is no sign of progress for many hours, whereas a bulk import finishes in a few minutes. I'd like the bulk delete to finish in a reasonable number of minutes. I also tried TRUNCATE, like TRUNCATE table3, table2, table1; There was no change in performance: it just takes more time, with no sign of completion. On the net I found a few options, such as dropping all the constraints and then recreating them. But no query seems to run successfully against table1 once it's loaded with more data. Please recommend the best solution for deleting all the records in minutes.


CREATE TABLE table1(
        t1_id   SERIAL PRIMARY KEY,
        disp_name       TEXT NOT NULL DEFAULT '',
        last_updated TIMESTAMP NOT NULL DEFAULT current_timestamp,
        UNIQUE(disp_name)
    ) WITHOUT OIDS;

CREATE UNIQUE INDEX disp_name_index on table1(upper(disp_name));

CREATE TABLE table2 (
        t2_id           SERIAL PRIMARY KEY,
        t1_id   INTEGER REFERENCES table1 ON DELETE CASCADE,
        type    TEXT
    ) WITHOUT OIDS;

CREATE TABLE table3 (
        t3_id           SERIAL PRIMARY KEY,
        t1_id   INTEGER REFERENCES table1 ON DELETE CASCADE,
        config_key      TEXT,
        config_value    TEXT
    ) WITHOUT OIDS;

Regards, Siva.

3 solutions

#1


2  

You can create an index on the columns in the child tables that reference the parent table:

on table2, create an index on the t1_id column

on table3, create an index on the t1_id column

That should speed things up, because without such an index each cascaded delete has to scan the whole child table to find the matching rows.
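As a minimal sketch of the two indexes described above, using the table and column names from the question (the index names table2_t1_id_idx and table3_t1_id_idx are made up for illustration):

```sql
-- Index the foreign-key column of each child table so that a cascaded
-- delete can find the matching child rows without a full table scan.
-- The index names are hypothetical; any unused name will do.
CREATE INDEX table2_t1_id_idx ON table2 (t1_id);
CREATE INDEX table3_t1_id_idx ON table3 (t1_id);
```

PostgreSQL creates an index automatically for the PRIMARY KEY and UNIQUE columns of table1, but not for the referencing columns in table2 and table3, so these have to be added by hand.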

And/or, don't bother with ON DELETE CASCADE: write a delete stored procedure that deletes first from the child tables and then from the parent table. It may be faster than letting PostgreSQL do it for you.
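A rough sketch of such a procedure, assuming the plpgsql language is installed in the database and that the goal is to empty all three tables, as in the question. The function name delete_all_records is hypothetical:

```sql
-- Hypothetical helper: empties the child tables first, then the parent,
-- so the cascade on table1 has nothing left to remove.
CREATE OR REPLACE FUNCTION delete_all_records() RETURNS void AS $$
BEGIN
    DELETE FROM table3;
    DELETE FROM table2;
    DELETE FROM table1;
END;
$$ LANGUAGE plpgsql;

-- Usage: runs all three deletes in a single transaction.
SELECT delete_all_records();
```

If the foreign keys keep their ON DELETE CASCADE clause, the delete on table1 will still check the child tables for each row, so this is likely to help most in combination with indexes on the t1_id columns.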

#2


0  

In SQL, the TRUNCATE TABLE statement is a Data Definition Language (DDL) operation that marks the extents of a table for deallocation (empty for reuse). The result of this operation quickly removes all data from a table, typically bypassing a number of integrity enforcing mechanisms. http://en.wikipedia.org/wiki/Truncate_(SQL)

So TRUNCATE should be very fast. In your case, it looks like you have a transaction that has been neither committed nor rolled back. In that case your delete transaction will never finish.

To solve this problem, you should check the active transactions in your database. The easiest way (at least under SQL Server, where it works) is to write "ROLLBACK COMMIT;" into the query window and execute it. If it executes without throwing an error, there was in fact an active transaction. If there is no active transaction remaining, it will give you an error.
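The check above is described for SQL Server. For PostgreSQL, one possible way to look for a session that is holding or waiting for a lock on these tables is to query the pg_locks system view. This is only a sketch, with the table names taken from the question:

```sql
-- List the locks held or requested on the three tables.
-- granted = false means that backend is still waiting for the lock,
-- for example a TRUNCATE blocked by another open transaction.
SELECT l.pid, c.relname, l.mode, l.granted
FROM pg_locks l
JOIN pg_class c ON c.oid = l.relation
WHERE c.relname IN ('table1', 'table2', 'table3')
ORDER BY c.relname, l.pid;
```

If a row with granted = false appears for the session running the TRUNCATE, the other pid values listed for the same table point at the sessions that may be blocking it.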

#3


0  

I would bet that you are missing some indexes in the database too.

If you issue the delete command from the psql console, just hit Ctrl-C: the statement will be interrupted, and psql should tell you which query was being executed at that moment.

Then use EXPLAIN to check why the query takes so long.
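For instance, assuming the slow part is the per-row lookup that the cascade performs on a child table, a similar query can be examined by hand. The value 1 is just an illustrative t1_id:

```sql
-- EXPLAIN only shows the plan; it does not execute the delete.
-- A "Seq Scan on table2" in the output suggests that every cascaded
-- delete reads the whole child table; with an index on t1_id the
-- planner can use an index scan instead.
EXPLAIN DELETE FROM table2 WHERE t1_id = 1;
```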

I had a similar situation recently, and adding an index solved the problem.
