How do I delete a fixed number of rows in PostgreSQL?

Date: 2022-06-01 18:17:54

I'm trying to port some old MySQL queries to PostgreSQL, but I'm having trouble with this one:

DELETE FROM logtable ORDER BY timestamp LIMIT 10;

PostgreSQL doesn't allow ordering or limits in its delete syntax, and the table doesn't have a primary key, so I can't use a subquery. Additionally, I want to preserve the behavior where the query deletes exactly the given number of records: for example, if the table contains 30 rows but they all have the same timestamp, I still want to delete 10, although it doesn't matter which 10.

So, how do I delete a fixed number of rows with sorting in PostgreSQL?

Edit: No primary key means there's no log_id column or similar. Ah, the joys of legacy systems!

5 answers

#1


104  

You could try using the ctid:

DELETE FROM logtable
WHERE ctid IN (
    SELECT ctid
    FROM logtable
    ORDER BY timestamp
    LIMIT 10
)

The ctid is:

The physical location of the row version within its table. Note that although the ctid can be used to locate the row version very quickly, a row's ctid will change if it is updated or moved by VACUUM FULL. Therefore ctid is useless as a long-term row identifier.

There's also oid, but that only exists if you specifically ask for it when you create the table (WITH OIDS, an option that was removed in PostgreSQL 12).
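To check that the ctid approach deletes exactly 10 rows even when every timestamp is identical, here is a small self-contained sketch. The temporary table definition and the message column are assumptions made up for the demo; only the table name and the timestamp column come from the question.

```sql
-- Hypothetical demo table mirroring the question: no primary key.
-- A TEMP table shadows any real logtable for this session only.
CREATE TEMP TABLE logtable ("timestamp" timestamptz, message text);

-- 30 rows that all share the same timestamp.
INSERT INTO logtable
SELECT TIMESTAMPTZ '2022-06-01 00:00:00+00', 'entry ' || g
FROM generate_series(1, 30) AS g;

-- The delete from the answer: pick 10 physical row locations, then remove them.
DELETE FROM logtable
WHERE ctid IN (
    SELECT ctid
    FROM logtable
    ORDER BY "timestamp"
    LIMIT 10
);

-- Count the rows that are left.
SELECT count(*) FROM logtable;
```

The final count should be 20: each row has a distinct ctid, so the subquery picks exactly 10 rows even though the ordering cannot tell them apart.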

#2


30  

The Postgres docs recommend using an array instead of IN with a subquery. This should work much faster:

DELETE FROM logtable 
WHERE id = any (array(SELECT id FROM logtable ORDER BY timestamp LIMIT 10));

This and some other tricks can be found here
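The table in the question has no id column, so the array form would need some other row identifier. A minimal sketch, assuming ctid is acceptable as a short-lived identifier within a single statement (as in answer #1):

```sql
-- Same array idiom, keyed on ctid instead of an id column.
DELETE FROM logtable
WHERE ctid = ANY (ARRAY(
    SELECT ctid
    FROM logtable
    ORDER BY "timestamp"
    LIMIT 10
));
```

Whether this is actually faster than the IN form depends on the PostgreSQL version and the plan chosen, so it is worth comparing both with EXPLAIN on your own data.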

#3


10  

delete from logtable where log_id in (
    select log_id from logtable order by timestamp limit 10);

#4


2  

Assuming you want to delete ANY 10 records (without the ordering) you could do this:

DELETE FROM logtable AS t1
WHERE t1.ctid < (
    SELECT t2.ctid
    FROM logtable AS t2
    WHERE (
        SELECT count(*)
        FROM logtable t3
        WHERE t3.ctid < t2.ctid
    ) = 10
    LIMIT 1
);

For my use case, deleting 10M records, this turned out to be faster.

#5


1  

You could write a procedure that loops over the delete for individual rows; the procedure could take a parameter to specify the number of rows you want to delete. But that's a bit overkill compared to MySQL.
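A rough sketch of what such a procedure might look like in PL/pgSQL. The function name delete_oldest_logs, the parameter p_limit and the local variables are all hypothetical, and ctid is used to address a single row because the table has no key:

```sql
-- Hypothetical helper: deletes up to p_limit rows, oldest first, one at a time.
CREATE OR REPLACE FUNCTION delete_oldest_logs(p_limit integer)
RETURNS integer
LANGUAGE plpgsql
AS $$
DECLARE
    v_deleted integer := 0;  -- rows removed so far
    v_target  tid;           -- physical location of the next row to remove
BEGIN
    WHILE v_deleted < p_limit LOOP
        -- Pick one row with the lowest timestamp.
        SELECT ctid INTO v_target
        FROM logtable
        ORDER BY "timestamp"
        LIMIT 1;

        -- Stop early if the table is empty.
        EXIT WHEN NOT FOUND;

        DELETE FROM logtable WHERE ctid = v_target;

        -- Only count the row if the DELETE actually removed it.
        IF FOUND THEN
            v_deleted := v_deleted + 1;
        END IF;
    END LOOP;

    RETURN v_deleted;
END;
$$;

-- Usage: delete the 10 oldest rows and return how many were removed.
SELECT delete_oldest_logs(10);
```

This runs one SELECT and one DELETE per row, so it will be much slower than the single-statement answers for large batches; it is mainly useful if you need per-row logic inside the loop.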