Ruby Array按2种不同方式排序

I have an array of objects that I'm trying to sort by multiple criteria. Most of the comparisons are just by doing <=> on their hashes, so using sort_by is very fast, but one of them is more complex.

我有一系列对象,我试图按多个标准排序。大多数比较只是对它们的哈希值进行<=>,因此使用sort_by非常快,但其中一个更复杂。

The array is of soccer teams, and it's currently being sorted like this:

这个阵列是足球队的,目前它的排序方式如下:

teams.sort_by { |item| [item.points, item.goal_dif, item.goals] }

However, in case at last 2 teams have identical values on those 3 fields, I want the tiebreaker to be a function that I made, a_beat_b(teamA, teamB).

但是,如果最后2支球队在这3个领域拥有相同的值,我希望决胜局是我所做的一个功能,a_beat_b(teamA,teamB)。

I tried using Array.sort, but it's extremely slow compared to sort_by for those first few... My implementation was like this:

我尝试使用Array.sort,但与sort_by相比,前几个人的速度非常慢......我的实现是这样的:

teams.sort ( |a,b| [a.points, a.goals_dif, a.goals] <=> [b.points, b.goals_dif, b.goals])
It was very slow compared to sort_by. The functions for points, goals_dif and goals require some simple queries, but it gets bogged down if it has to do hundreds.

teams.sort(| a,b | [a.points,a.goals_dif,a.goals] <=> [b.points,b.goals_dif,b.goals])与sort_by相比,它非常慢。 points,goals_dif和目标的函数需要一些简单的查询,但如果它必须做数百个就会陷入困境。

I'm not very good at Ruby, so not sure where to put my a_beats_b in there. (It returns 1, 0 or -1 if A beat, drew or lost to B, repsectively)

我不擅长Ruby,所以不确定把我的a_beats_b放在哪里。 (如果A击败,则返回1,0或-1,对B进行抽取或丢失,具有代表性)

4 个解决方案

#1

I tried using Array.sort, but it's extremely slow compared to sort_by for those first few

我尝试使用Array.sort,但与sort_by相比,前几个版本的速度非常慢

This is because sort calls the given block several times. Here's an example to show what's going on under the hood: (sorting "apple", "pear" and "fig" by length)

这是因为sort多次调用给定的块。这是一个示例,以显示引擎盖下发生的事情:(按长度排序“苹果”,“梨”和“无花果”)

def length(str)
  puts "calculating #{str.inspect}.length"
  str.length
end

array = %w{apple pear fig}
array.sort { |a, b| length(a) <=> length(b) }
#=> ["fig", "pear", "apple"]

Output from our length method:

我们的长度方法输出:

calculating "apple".length
calculating "pear".length
calculating "apple".length
calculating "fig".length
calculating "pear".length
calculating "fig".length

As you can see, length is called multiple times during the sort. Imagine that these are database queries.

如您所见,在排序期间多次调用长度。想象一下,这些是数据库查询。

sort_by on the other hand calls the block once for each element, building an internal mapping:

另一方面,sort_by为每个元素调用一次块,构建一个内部映射:

array.sort_by { |a| length(a) }
#=> ["fig", "pear", "apple"]

Output:

calculating "apple".length
calculating "pear".length
calculating "fig".length

For expensive operations (like database queries), this is much faster. But it's also less flexible – you can't dynamically compare a and b any more.

对于昂贵的操作(如数据库查询),这要快得多。但它也不太灵活 - 你无法再动态比较a和b。

You can however store the results of your (expensive) operation, for example by using a hash: (this is called memoization)

但是,您可以存储(昂贵的)操作的结果,例如使用哈希:(这称为memoization)

hash = Hash.new { |h, k| h[k] = length(k) }

And use the hash within sort:

并在sort中使用hash:

array.sort { |a, b| hash[a] <=> hash[b] }
# calculating "apple".length
# calculating "pear".length
# calculating "fig".length
#=> ["fig", "pear", "apple"]

After the sort, our hash looks like this:

排序后,我们的哈希看起来像这样:

hash #=> {"apple"=>5, "pear"=>4, "fig"=>3}

Applied to your code, something like this should work:

应用于您的代码,这样的东西应该工作:

hash = Hash.new { |h, k| h[k] = [k.points, k.goal_dif, k.goals] }
teams.sort { |a, b| hash[a] == hash[b] ? a_beats_b(a, b) : hash[a] <=> hash[b] }

#2

Implementation of sort with a_beats_b included:

包含a_beats_b的排序的实现:

teams.sort do |a, b|
  result = [a.points, a.goals_dif, a.goals] <=> [b.points, b.goals_dif, b.goals]
  result.zero? ? a_beats_b(a, b) : result
end

#3

Here's another approach that, while somewhat complex, has been designed for efficiency. The method uses the following steps.

这是另一种方法,尽管有些复杂,但它的设计是为了提高效率。该方法使用以下步骤。

Convert each Team instance to to an array containing the instance and the three-element array on which the inexpensive sort is to be done.

将每个Team实例转换为包含实例和三元素数组的数组,在该数组上进行廉价排序。

Use Enumerable#sort_by to sort the arrays by the three-element arrays.

使用Enumerable#sort_by按三元素数组对数组进行排序。

Use Enumerable#chunk to group the two-element arrays that have equal three-element arrays.

使用Enumerable#chunk对具有相同三元素数组的两元素数组进行分组。

Map each chunked array element to the Team instance in the two-element array.

将每个chunked数组元素映射到两元素数组中的Team实例。

Use Enumerable#flat_map to map each chunked group of Team instances after it has been sorted by the method a_beat_b(a, b) (unless the group contains only one team, of course).

使用Enumerable#flat_map在按方法a_beat_b(a,b)对每个已分块的Team实例组进行排序后映射(当然,除非该组只包含一个团队)。

Code

def sort_em(teams)
  teams.map { |t| [t, [t.points, t.goal_dif, t.goals]] }.
        sort_by(&:last).
        chunk(&:last).
        map { |_,tied_teams| tied_teams.map(&:first) }.
        flat_map { |tied_teams| (tied_teams.size == 1) ?
          tied_teams.first : tied_teams.sort { |a,b| a_beat_b(a, b) } }
end

Example

class Team
  attr_reader :name, :points, :goal_dif, :goals
  def initialize(name, points, goal_dif, goals)
    @name, @points, @goal_dif, @goals = name, points, goal_dif, goals
  end
end

teams = [Team.new("bluebirds", 233, 25, 130),
         Team.new("eagles",    233, 18, 105),
         Team.new("jays",      233, 25, 130),
         Team.new("owls",      160, 12, 105),
         Team.new("sparrows",  233, 18, 105)
        ]
  #=> [#<Team:0x007ff2f900e5a8 @name="bluebirds", @points=233, @goal_dif=25, @goals=130>,
  #    #<Team:0x007ff2f900e530 @name="eagles", @points=233, @goal_dif=18, @goals=105>,
  #    #<Team:0x007ff2f900e4b8 @name="jays", @points=233, @goal_dif=25, @goals=130>,
  #    #<Team:0x007ff2f900e440 @name="owls", @points=160, @goal_dif=12, @goals=105>,
  #    #<Team:0x007ff2f900e3c8 @name="sparrows", @points=233, @goal_dif=18, @goals=105>]

def a_beat_b(a, b)
  a.name.size <=> b.name.size
end

sort_em(teams)
  #=> [#<Team:0x007ff2fa845630 @name="owls", @points=160, @goal_dif=12, @goals=105>,
  #    #<Team:0x007ff2fa845720 @name="eagles", @points=233, @goal_dif=18, @goals=105>,
  #    #<Team:0x007ff2fa8455b8 @name="sparrows", @points=233, @goal_dif=18, @goals=105>,
  #    #<Team:0x007ff2fa8456a8 @name="jays", @points=233, @goal_dif=25, @goals=130>,
  #    #<Team:0x007ff2fa8457e8 @name="bluebirds", @points=233, @goal_dif=25, @goals=130>]

Explanation

The steps are as follows.

步骤如下。

a = teams.map { |t| [t, [t.points, t.goal_dif, t.goals]] }
  #=> [[#<Team:0x007ff2fa8457e8 @name="bluebirds", @points=233, @goal_dif=25, @goals=130>,
  #     [233, 25, 130]],
  #    [#<Team:0x007ff2fa845720 @name="eagles", @points=233, @goal_dif=18, @goals=105>,
  #     [233, 18, 105]],
  #    [#<Team:0x007ff2fa8456a8 @name="jays", @points=233, @goal_dif=25, @goals=130>,
  #     [233, 25, 130]],
  #    [#<Team:0x007ff2fa845630 @name="owls", @points=160, @goal_dif=12, @goals=105>,
  #     [160, 12, 105]],
  #    [#<Team:0x007ff2fa8455b8 @name="sparrows", @points=233, @goal_dif=18, @goals=105>,
  #     [233, 18, 105]]] 
b = a.sort_by(&:last)
  #=> [[#<Team:0x007ff2fa845630 @name="owls", @points=160, @goal_dif=12, @goals=105>,
  #    [160, 12, 105]],
  #   [#<Team:0x007ff2fa845720 @name="eagles", @points=233, @goal_dif=18, @goals=105>,
  #    [233, 18, 105]],
  #   [#<Team:0x007ff2fa8455b8 @name="sparrows", @points=233, @goal_dif=18, @goals=105>,
  #    [233, 18, 105]],
  #   [#<Team:0x007ff2fa8457e8 @name="bluebirds", @points=233, @goal_dif=25, @goals=130>,
  #    [233, 25, 130]],
  #   [#<Team:0x007ff2fa8456a8 @name="jays", @points=233, @goal_dif=25, @goals=130>,
  #    [233, 25, 130]]
  #   ] 
c = b.chunk(&:last)
  #=> #<Enumerator: #<Enumerator::Generator:0x007ff2fa81dc20>:each>

To see what values are generated by the enumerator c we can convert it to an array.

要查看枚举器c生成的值,我们可以将其转换为数组。

c.to_a 
  #=> [[[160, 12, 105],
  #     [[#<Team:0x007ff2fa845630 @name="owls",@points=160,@goal_dif=12,@goals=105>,
  #       [160, 12, 105]
  #      ]
  #     ]
  #    ],
  #    [[233, 18, 105],
  #     [[#<Team:0x007ff2fa845720 @name="eagles",@points=233,@goal_dif=18,@goals=105>,
  #       [233, 18, 105]
  #      ],
  #     [#<Team:0x007ff2fa8455b8 @name="sparrows",@points=233,@goal_dif=18,@goals=105>,
  #       [233, 18, 105]
  #     ]
  #    ],
  #    [[233, 25, 130],
  #     [[#<Team:0x007ff2fa8457e8 @name="bluebirds",@points=233,@goal_dif=25,@goals=130>,
  #       [233, 25, 130]
  #      ],
  #      [#<Team:0x007ff2fa8456a8 @name="jays", @points=233,@goal_dif=25,@goals=130>,
  #       [233, 25, 130]
  #      ]
  #     ]
  #    ]
  #   ]

d = c.map { |_,tied_teams| tied_teams.map(&:first) }
  #=> [[#<Team:0x007ff2fa845630 @name="owls", @points=160, @goal_dif=12, @goals=105>],
  #    [#<Team:0x007ff2fa845720 @name="eagles", @points=233, @goal_dif=18, @goals=105>,
  #     #<Team:0x007ff2fa8455b8 @name="sparrows", @points=233, @goal_dif=18, @goals=105>
  #    ],
  #    [#<Team:0x007ff2fa8457e8 @name="bluebirds", @points=233, @goal_dif=25, @goals=130>,
  #     #<Team:0x007ff2fa8456a8 @name="jays", @points=233, @goal_dif=25, @goals=130>
  #    ]
  #   ] 
d.flat_map { |tied_teams| (tied_teams.size == 1) ?
  tied_teams.first : tied_teams.sort { |a,b| a_beat_b(a, b) } }
  #=> [#<Team:0x007ff2fa845630 @name="owls", @points=160, @goal_dif=12, @goals=105>,
  #    #<Team:0x007ff2fa845720 @name="eagles", @points=233, @goal_dif=18, @goals=105>,
  #    #<Team:0x007ff2fa8455b8 @name="sparrows", @points=233, @goal_dif=18, @goals=105>,
  #    #<Team:0x007ff2fa8456a8 @name="jays", @points=233, @goal_dif=25, @goals=130>,
  #    #<Team:0x007ff2fa8457e8 @name="bluebirds", @points=233, @goal_dif=25, @goals=130>
  #   ]

#4

Depending on the size and const of the sorting function this might be an approach:

根据排序函数的大小和常量,这可能是一种方法:

# First group the teams by standard sort:
groups = teams.group_by{|a| [a.points, a.goals_dif, a.goals] }

# For each group that has ties. Run the slow sorter on them:
groups.each{ |_,val| val.sort!{|teamA,teamB| a_beat_b(teamA, teamB)} if val.size > 1 }

# Finally run sort on the keys of the group by:
groups.sort.flat_map(&:last)

#1