Representing ordering in a relational database

Time: 2022-10-03 16:17:44

I have a collection of objects in a database. Images in a photo gallery, products in a catalog, chapters in a book, etc. Each object is represented as a row. I want to be able to arbitrarily order these objects, storing that ordering in the database so that when I display the objects, they are in the right order.

For example, let's say I'm writing a book, and each chapter is an object. I write my book, and put the chapters in the following order:

Introduction, Accessibility, Form vs. Function, Errors, Consistency, Conclusion, Index

It goes to the editor, and comes back with the following suggested order:

Introduction, Form, Function, Accessibility, Consistency, Errors, Conclusion, Index

How can I store this ordering in the database in a robust, efficient way?

I've had the following ideas, but I'm not thrilled with any of them:

  1. Array. Each row has an ordering ID; when the order is changed (via a removal followed by an insertion), the ordering IDs are updated. This makes retrieval easy, since it's just ORDER BY, but it seems easy to break.

    -- REMOVAL
    UPDATE ... SET orderingID=NULL WHERE orderingID=removedID
    UPDATE ... SET orderingID=orderingID-1 WHERE orderingID > removedID
    -- INSERTION (shift the row at insertionID and everything after it)
    UPDATE ... SET orderingID=orderingID+1 WHERE orderingID >= insertionID
    UPDATE ... SET orderingID=insertionID WHERE ID=addedID

  2. Linked list. Each row has a column for the id of the next row in the ordering. Traversal seems costly here, though there may be some way to use ORDER BY that I'm not thinking of.

  3. Spaced array. Set the orderingID (as used in #1) to be large, so the first object is 100, the second is 200, etc. Then when an insertion happens, you just place it at (objectBefore + objectAfter)/2. Of course, this would need to be rebalanced occasionally, so you don't have things too close together (even with floats, you'd eventually run into rounding errors).
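
On the linked-list idea (#2): ORDER BY alone cannot follow the pointers, but a database that supports recursive common table expressions can walk the list in a single query. The following is only a minimal sketch using Python's sqlite3 module; the `chapters` table, the `next_id` column and the `ordered_titles` helper are illustrative names that do not come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chapters (id INTEGER PRIMARY KEY, title TEXT, next_id INTEGER)"
)
# next_id points at the following chapter; NULL marks the tail of the list.
conn.executemany(
    "INSERT INTO chapters VALUES (?, ?, ?)",
    [
        (1, "Introduction", 3),
        (2, "Accessibility", 4),
        (3, "Form", 2),
        (4, "Conclusion", None),
    ],
)


def ordered_titles(conn):
    """Walk the linked list from its head and return the titles in order."""
    rows = conn.execute(
        """
        WITH RECURSIVE walk(id, title, next_id, depth) AS (
            -- head of the list: the only row that no other row points at
            SELECT id, title, next_id, 0
            FROM chapters
            WHERE id NOT IN (
                SELECT next_id FROM chapters WHERE next_id IS NOT NULL
            )
            UNION ALL
            -- follow the pointer one step at a time
            SELECT c.id, c.title, c.next_id, w.depth + 1
            FROM chapters c JOIN walk w ON c.id = w.next_id
        )
        SELECT title FROM walk ORDER BY depth
        """
    ).fetchall()
    return [row[0] for row in rows]


titles = ordered_titles(conn)
```

The traversal still costs one recursion step per row, so this mainly saves round trips rather than work; whether that is acceptable depends on how long the lists get.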

None of these seem particularly elegant to me. Does anyone have a better way to do it?

10 Answers

#1


5  

Another alternative would be (if your RDBMS supports it) to use columns of type array. While this breaks the normalization rules, it can be useful in situations like this. One database which I know about that has arrays is PostgreSQL.

#2


3  

The acts_as_list mixin in Rails handles this basically the way you outlined in #1. It looks for an INTEGER column called position (you can override the name, of course) and uses it to do an ORDER BY. When you want to re-order things, you update the positions. It has served me just fine every time I've used it.

As a side note, you can remove the need to always re-position on INSERTs/DELETEs by using sparse numbering, kind of like BASIC back in the day: you can number your positions 10, 20, 30, etc., and if you need to insert something between 10 and 20, you just insert it with a position of 15. Likewise, when deleting you can just delete the row and leave the gap. You only need to renumber when you actually change the order, or when you try to do an insert and there is no appropriate gap to insert into.

Of course depending on your particular situation (e.g. whether you have the other rows already loaded into memory or not) it may or may not make sense to use the gap approach.
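
The gap approach can be sketched roughly as follows, again with Python's sqlite3 module. This is an illustration under assumptions, not how acts_as_list itself works: the `items` table, the `GAP` constant and the helper names are all made up for the example.

```python
import sqlite3

GAP = 10  # assumed spacing between neighbouring positions


def renumber(conn):
    """Re-space every row to GAP, 2*GAP, ... while keeping the current order."""
    ids = [row[0] for row in conn.execute("SELECT id FROM items ORDER BY position")]
    conn.executemany(
        "UPDATE items SET position = ? WHERE id = ?",
        [((i + 1) * GAP, item_id) for i, item_id in enumerate(ids)],
    )


def neighbours(conn, after_id):
    """Return the position of after_id and of the row that follows it."""
    (lo,) = conn.execute(
        "SELECT position FROM items WHERE id = ?", (after_id,)
    ).fetchone()
    (hi,) = conn.execute(
        "SELECT MIN(position) FROM items WHERE position > ?", (lo,)
    ).fetchone()
    # Past the tail there is no successor, so pretend one sits two gaps away.
    return lo, (hi if hi is not None else lo + 2 * GAP)


def insert_after(conn, title, after_id):
    """Insert a new row right after after_id, renumbering only if no gap is left."""
    lo, hi = neighbours(conn, after_id)
    if hi - lo < 2:  # no free integer strictly between the two neighbours
        renumber(conn)
        lo, hi = neighbours(conn, after_id)
    conn.execute(
        "INSERT INTO items (title, position) VALUES (?, ?)", (title, (lo + hi) // 2)
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT, position INTEGER)")
conn.executemany(
    "INSERT INTO items (title, position) VALUES (?, ?)",
    [("A", 10), ("B", 20), ("C", 30)],
)
insert_after(conn, "D", 1)  # D goes between A (10) and B (20)
```

Most inserts touch a single row; the full renumbering only runs when two neighbours have become adjacent integers.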

#3


3  

Just a thought considering option #1 vs #3: doesn't the spaced array option (#3) only postpone the problem of the normal array (#1)? Whatever algorithm you choose, either it's broken, and you'll run into problems with #3 later, or it works, and then #1 should work just as well.

#4


2  

I'd do a consecutive number, with a trigger on the table that "makes room" for a priority if it already exists.
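
A trigger along these lines might look like the sketch below. It uses SQLite through Python's sqlite3 module purely so the example is runnable; trigger syntax differs between databases, and the `tasks` table, the `priority` column and the `make_room` trigger name are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tasks (
        id INTEGER PRIMARY KEY,
        title TEXT,
        priority INTEGER NOT NULL
    );

    -- If the requested priority is already taken, push that row and every
    -- row after it down by one before the new row is stored.
    CREATE TRIGGER make_room BEFORE INSERT ON tasks
    WHEN EXISTS (SELECT 1 FROM tasks WHERE priority = NEW.priority)
    BEGIN
        UPDATE tasks SET priority = priority + 1 WHERE priority >= NEW.priority;
    END;
    """
)

# "X" asks for priority 2, which "B" already holds.
for title, priority in [("A", 1), ("B", 2), ("C", 3), ("X", 2)]:
    conn.execute("INSERT INTO tasks (title, priority) VALUES (?, ?)", (title, priority))

ordered = conn.execute("SELECT title, priority FROM tasks ORDER BY priority").fetchall()
```

Note that the sketch leaves `priority` without a UNIQUE constraint, since the row-by-row shift could collide with such a constraint while the UPDATE is running.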

#5


2  

If the objects aren't heavily keyed by other tables, and the lists are short, deleting everything in the domain and just re-inserting the correct list is the easiest. But that's not practical if the lists are large and you have lots of constraints to slow down the delete. I think your first method is really the cleanest. If you run it in a transaction you can be sure nothing odd happens while you're in the middle of the update to screw up the order.
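
For what it's worth, here is one way the first method could look when wrapped in a transaction, sketched with Python's sqlite3 module. The `chapters` table, the `position` column and the `move` helper are hypothetical names; the chapter titles are the ones from the question.

```python
import sqlite3


def move(conn, item_id, new_pos):
    """Move one row to new_pos (1-based), shifting only the rows in between."""
    with conn:  # commits on success, rolls back if any statement raises
        (old_pos,) = conn.execute(
            "SELECT position FROM chapters WHERE id = ?", (item_id,)
        ).fetchone()
        if new_pos < old_pos:
            # Moving up: rows in [new_pos, old_pos) slide down by one.
            conn.execute(
                "UPDATE chapters SET position = position + 1 "
                "WHERE position >= ? AND position < ?",
                (new_pos, old_pos),
            )
        elif new_pos > old_pos:
            # Moving down: rows in (old_pos, new_pos] slide up by one.
            conn.execute(
                "UPDATE chapters SET position = position - 1 "
                "WHERE position > ? AND position <= ?",
                (old_pos, new_pos),
            )
        conn.execute(
            "UPDATE chapters SET position = ? WHERE id = ?", (new_pos, item_id)
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chapters (id INTEGER PRIMARY KEY, title TEXT, position INTEGER)")
titles = ["Introduction", "Accessibility", "Form vs. Function", "Errors",
          "Consistency", "Conclusion", "Index"]
conn.executemany(
    "INSERT INTO chapters (title, position) VALUES (?, ?)",
    [(title, i + 1) for i, title in enumerate(titles)],
)
conn.commit()

move(conn, 2, 3)  # Accessibility goes after Form vs. Function
move(conn, 5, 4)  # Consistency goes before Errors
result = [r[0] for r in conn.execute("SELECT title FROM chapters ORDER BY position")]
```

Because the shift and the final assignment share one transaction, another reader never sees two rows with the same position or a hole in the sequence.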

#6


2  

Use a floating point number to represent the position of each item:

    Item 1 -> 0.0
    Item 2 -> 1.0
    Item 3 -> 2.0
    Item 4 -> 3.0

You can place any item between any other two items by simple bisection:

    Item 1 -> 0.0
    Item 4 -> 0.5
    Item 2 -> 1.0
    Item 3 -> 2.0

(Moved item 4 between items 1 and 2.)

The bisection process can continue almost indefinitely due to the way floating point numbers are encoded in a computer system.

    Item 4 -> 0.5
    Item 1 -> 0.75
    Item 2 -> 1.0
    Item 3 -> 2.0

(Moved item 1 to the position just after item 4.)
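
How far "almost indefinitely" stretches can be checked with a short experiment. The sketch below assumes positions are stored as IEEE 754 double-precision values (which is what a Python float is on mainstream platforms) and simulates the worst case, where every new item is inserted directly after the same neighbour; `bisections_available` is a name made up for this example.

```python
def bisections_available(lo, hi):
    """Count how many times a new item can be placed directly after lo
    before the midpoint stops being distinct from its neighbours."""
    count = 0
    while True:
        mid = (lo + hi) / 2
        if not (lo < mid < hi):  # rounding collapsed mid onto a neighbour
            return count
        hi = mid  # the next insert goes between lo and the item just placed
        count += 1


between_one_and_two = bisections_available(1.0, 2.0)
```

Between 1.0 and 2.0 the count comes out at 52, one per mantissa bit, so a list that keeps receiving inserts at the same spot would need re-spacing after a few dozen operations, in line with the rounding concern raised in the question.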

#7


1  

I did this in my last project, but it was for a table that only occasionally needed to be specifically ordered, and wasn't accessed too often. I think the spaced array would be the best option, because reordering would be cheapest in the average case, involving just a change to one value and a query on two.

Also, I would imagine ORDER BY would be pretty heavily optimized by database vendors, so leveraging that function would be advantageous for performance as opposed to the linked list implementation.

#8


1  

I had this problem as well. I was under heavy time pressure (aren't we all) and I went with option #1, and only updated rows that changed.

If you swap item 1 with item 10, just do two updates to update the order numbers of item 1 and item 10. I know it is algorithmically simple, and it is O(n) worst case, but that worst case is when you have a total permutation of the list. How often is that going to happen? That's for you to answer.
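
A swap of that kind might be sketched like this with Python's sqlite3 module; the `items` table, the `position` column and the `swap` helper are assumed names, not anything from my actual project.

```python
import sqlite3


def swap(conn, id_a, id_b):
    """Exchange the positions of two rows; no other row is touched."""
    with conn:  # both updates commit together or not at all
        (pos_a,) = conn.execute(
            "SELECT position FROM items WHERE id = ?", (id_a,)
        ).fetchone()
        (pos_b,) = conn.execute(
            "SELECT position FROM items WHERE id = ?", (id_b,)
        ).fetchone()
        conn.execute("UPDATE items SET position = ? WHERE id = ?", (pos_b, id_a))
        conn.execute("UPDATE items SET position = ? WHERE id = ?", (pos_a, id_b))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, position INTEGER)")
conn.executemany("INSERT INTO items VALUES (?, ?)", [(i, i) for i in range(1, 11)])
conn.commit()

swap(conn, 1, 10)
order = [r[0] for r in conn.execute("SELECT id FROM items ORDER BY position")]
```

The sketch assumes there is no UNIQUE constraint on `position`; with one in place, the first UPDATE would briefly duplicate a value and be rejected.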

#9


1  

Since I've mostly run into this with Django, I've found this solution to be the most workable. It seems that there isn't any "right way" to do this in a relational database.

#10


0  

I had the same issue and have probably spent at least a week worrying about the proper data modeling, but I think I've finally got it. Using the array datatype in PostgreSQL, you can store the primary key of each ordered item and update that array with insertions or deletions when your order changes. Referencing a single row will allow you to map all your objects based on the ordering in the array column.

It's still a somewhat clunky solution, but it will likely work better than option #1, since option #1 requires updating the order number of all the other rows when the ordering changes.
