我如何比较两组1000个数字之间的差异?

时间:2022-02-08 08:12:55

I must check approximately 1000 numbers against 1000 other numbers.

我必须核对大约1000个数字和1000个其他数字。

I loaded both and compared them server-side:

我将两者都加载并在服务器端进行比较:

foreach( $numbers1 as $n1 ) {
  foreach( $numbers2 as $n2 ) {
    if( $n1 == $n2 ) {
      doBla();
    }
  }
}

This took a long time, so I tried to do the same comparison client side using two hidden div elements. Then compared them using JavaScript. It still takes 45 seconds to load the page (using hidden div elements).

这花了很长时间,所以我尝试使用两个隐藏的div元素在客户端进行相同的比较。然后使用JavaScript对它们进行比较。加载页面仍然需要45秒(使用隐藏的div元素)。

I do not need to load the numbers that are not the same.

我不需要加载不相同的数字。

Is there a faster algorithm? I am thinking of comparing them database side and just load the error numbers, then do an Ajax call for the remaining non-error numbers. But is a MySQL database fast enough?

有没有更快的算法?我正在考虑比较它们的数据库端,只加载错误号,然后对其余的非错误号执行Ajax调用。但是MySQL数据库足够快吗?

26 个解决方案

#1


129  

Sort the lists first. Then you can walk up both lists from the start, comparing as you go.

首先对列表排序。然后你可以从一开始就沿着这两个列表往上走,然后一边走一边比较。

The loop would look something like this:

这个循环看起来是这样的:

var A = getFirstArray().sort(), B = getSecondArray().sort();

var i = 0, j = 0;
while (i < A.length && j < B.length) {
    if (A[i] === B[j]) {
        doBla(A[i]);
        i++; j++;
    }
    else if (A[i] < B[j]) {
        i++;
    }
    else
        j++;
}

(That's JavaScript; you could do it server-side too, but I don't know PHP.)

(这是JavaScript;你也可以在服务器端做,但我不知道PHP)。

Edit — just to be fair to all the hashtable fans (whom I respect of course), it's pretty easy to do that in JavaScript:

编辑——为了公平地对待所有的hashtable粉丝(我当然尊重他们),在JavaScript中很容易做到:

var map = {};
for (var i = 0; i < B.length; ++i) map[B[i]] = true; // Assume integers.
for (var i = 0; i < A.length; ++i) if (map[A[i]]) doBla(A[i]);

Or if the numbers are or might be floats:

或者如果数字是或可能是浮点数:

var map = {};
for (var i = 0; i < B.length; ++i) map['' + B[i]] = true; // Assume integers.
for (var i = 0; i < A.length; ++i) if (map['' + A[i]]) doBla(A[i]);

Since numbers are pretty cheap to hash (even in JavaScript, converting to string before hashing is surprisingly cheap), this would be pretty fast.

由于数字非常便宜(甚至在JavaScript中,在哈希之前转换为字符串非常便宜),这将是非常快的。

#2


85  

#3


27  

In database terms this can a join of 1000 rows to another 1000 rows. Any modern database system can handle this.

在数据库术语中,这可以将1000行连接到另1000行。任何现代数据库系统都可以处理这个问题。

select x from table1
inner join table2
on table1.x = table2.y

where table1 and table2 are the rows concerned and could be the same table.

其中表1和表2是相关的行,可以是同一个表。

#4


26  

What you have shouldnt take that long - what does doBla() do? I suspect that is taking the time? Comparing two sets of 1000000 numbers with the same algorithm takes no time at all..

你所拥有的不应该花那么长时间- doBla()做什么?我怀疑这是在浪费时间?用相同的算法比较两组1000000的数字根本不需要时间。

This is hilarious - the number of optimisation techniques as answers - the problem is not your algorithm - it is whatever doBla() does that is taking the time by a factor many times greater than any optimisation would help you :) esp. given the sets are only 1000 long and you have to sort them first..

这是滑稽——的数量优化技术的答案,这个问题不是你的算法——这是任何doBla()所花时间的一个因素很多倍比优化会帮助你:)特别是考虑到集只有1000长,你必须先排序。

#5


23  

Maybe just intersect the array values to find numbers existing in both arrays?

也许只需要与数组值相交就可以找到两个数组中的数字?

$result = array_intersect($numbers1, $numbers2);
foreach ($result as $val)
  doBla();

#6


9  

If you sort list2 first and then do a binary search for each number in list1 you'll see a huge speed increase.

如果你先对list2进行排序,然后对list1中的每个数字进行二分查找,你会看到一个巨大的速度增长。

I'm not a PHP guy, but this should give you the idea:

我不是一个喜欢PHP的人,但这应该能让你明白:

sort($numbers2);

foreach($numbers1 as $n1)
{
   if (BinarySearch($numbers2, $n1) >= 0) {
     doBla();
 }
}

Obviously not being a PHP guy I don't know the library, but I'm sure sorting and binary searching should be easy enough to find.

显然我不是一个PHP人,我不知道这个库,但我确信排序和二进制搜索应该足够容易找到。

Note: In case you're not familiar with a binary search; you're sorting list2 because binary searches need to operate on sorted lists.

注意:如果您不熟悉二进制搜索;对列表2进行排序是因为二进制搜索需要对已排序的列表进行操作。

#7


5  

Sort them first.

先排序。

#8


5  

I'm not a PHP expert, so this may need some debugging, but you can do this easily in O(n) time:

我不是PHP专家,所以这可能需要一些调试,但是您可以在O(n)时间内轻松完成:

// Load one array into a hashtable, keyed by the number: O(n).
$keys1 = [];
foreach($numbers1 as $n1) $keys1[$n1] = true;

// Find the intersections with the other array:
foreach($numbers2 as $n2) { // O(n)
  if (isset($keys1[$n2]) { // O(1)
     doBla();
  }
}

Regardless, the intersection isn't where your time is going. Even a bad O(n^2) implementation like you have now should be able to go through 1000 numbers in a second.

不管怎样,十字路口并不是你时间的方向。甚至是一个糟糕的O(n ^ 2)实现像你现在应该能够经历1000年数据。

#9


5  

Stop- why are you doing this?

停-你为什么要这么做?

If the numbers are already in a SQL database, then do a join and let the DB figure out the most efficient route.

如果数字已经在SQL数据库中,那么进行连接,让DB找出最有效的路由。

If they aren't in a database, then I'm betting you've gone off track somewhere and really ought to reconsider how you got here.

如果它们不在数据库中,那么我打赌你已经偏离了方向,真的应该重新考虑你是如何到达这里的。

#10


4  

$same_numbers = array_intersect($numbers1, $$numbers2);

foreach($same_numbers as $n)
{
  doBla();
}

#11


3  

Sort both lists, then walk both lists at the same time using the old-master new-master sequential update pattern. As long as you can sort the data it is the fastest way since your really only walking the list once, to the longest length of the largest list.

对这两个列表进行排序,然后使用老主控的新主顺序更新模式同时遍历两个列表。只要你能对数据进行排序,这是最快的方式,因为你真的只遍历一次列表,到最大列表的最长长度。

#12


2  

Your code is simply more complicated then in needs to be.

你的代码比需要的要复杂得多。

Assuming what you're looking for is that the numbers in each position match (and not just that the array contains the same numbers), you can flatten your loop to a single for.

假设您要查找的是每个位置上的数字匹配(而不仅仅是数组包含相同的数字),您可以将循环压平为单个for。

<?php
// Fill two arrays with random numbers as proof.
$first_array = array(1000);
$second_array = array(1000);
for($i=0; $i<1000; $i++) $first_array[$i] = rand(0, 1000);
for($i=0; $i<1000; $i++) $second_array[$i] = rand(0, 1000);

// The loop you care about.
for($i=0; $i<1000; $i++) if ($first_array[$i] != $second_array[$i]) echo "Error at $i: first_array was {$first_array[$i]}, second was {$second_array[$i]}<br>";

?>

Using the code above, you will only loop 1000 times, as opposed to looping 1000000 times.

使用上面的代码,您将只循环1000次,而不是循环1000000次。

Now, if you need to just check that a number appears or does not appear in the arrays, use array_diff and array_intersect as follows:

现在,如果需要检查数组中是否出现了数字,请使用array_diff和array_intersect:

<?php
// Fill two arrays with random numbers as proof.
$first_array = array(1000);
$second_array = array(1000);
for($i=0; $i<1000; $i++) $first_array[$i] = rand(0, 1000);
for($i=0; $i<1000; $i++) $second_array[$i] = rand(0, 1000);

$matches = array_intersect($first_array, $second_array);
$differences = array_diff($first_array, $second_array);

?>

#13


2  

Maybe I'm not seeing something here but this looks like a classic case of set intersection. Here's a few lines in perl that'll do it.

也许我没在这里看到什么但这看起来像是一个典型的集合交叉的例子。下面是perl中的几行代码。

foreach $e (@a, @b) { $union{$e}++ && $isect{$e}++ }

foreach $ e(@a @b){ $联盟{ $ e } + + & & $ isect { $ e } + + }

@union = keys %union; @isect = keys %isect;

@union =键%联盟;@isect = % isect键;

At the end of these lines of code @isect will contain all numbers that are in both @a and @b. I'm sure this is translatable to php more or less directly. FWIW, this is my favorite piece of code from the Perl Cookbook.

在这些代码行末尾,@isect将包含@a和@b中的所有数字。我确信这在php中或多或少是可以直接翻译的。FWIW,这是我最喜欢的Perl Cookbook中的代码片段。

#14


2  

You can do it in O(n) time if you use bucket sort. Assuming you know the maximum value the numbers can take (although there are ways around that).

如果使用bucket排序,可以在O(n)时间内完成。假设你知道数字可以取的最大值(尽管有办法)。

http://en.wikipedia.org/wiki/Bucket_sort

http://en.wikipedia.org/wiki/Bucket_sort

#15


1  

I think it would be much easier to use the built in array_intersect function. Using your example, you could do:

我认为使用内置的array_intersect函数要容易得多。用你的例子,你可以做到:

$results = array_intersect($numbers1, $numbers2);
foreach($results as $rk => $rv) {
    doSomething($rv);
}

#16


1  

A better way would be to do something like this:

更好的办法是这样做:

// 1. Create a hash map from one of the lists.
var hm = { };
for (var i in list1) {
  if (!hm[list1[i]]) {
    hm[list1[i]] = 1;
  } else { hm[list1[i]] += 1; }
}

// 2. Lookup each element in the other list.
for (var i in list2) {
  if (hm[list2[i]] >= 1) {
    for (var j = 0; j < hm[list2[i]]; ++j) {
      doBla();
    }
  }
}

This is guaranteed O(n) [assuming insertion an lookup in a hash map is O(1) amortized].

这是有保证的O(n)[假设在散列映射中插入一个查找是O(1)平摊的]。

Update: The worst case of this algorithm would be O(n2) and there is no way to reduce -- unless you change the semantics of the program. This is because in the worst case, the program will call doBla() n2 number of times if all the numbers in both the lists are the same. However, if both the lists have unique numbers (i.e. generally unique within a list), then the runtime would tend towards O(n).

更新:这种算法最糟糕的情况是O(n2),而且没有办法减少——除非您更改程序的语义。这是因为在最坏的情况下,如果两个列表中的所有数字都相同,程序将调用doBla() n2次。但是,如果两个列表都有唯一的数字(即在列表中通常是唯一的),那么运行时将趋向于O(n)。

#17


1  

I'll create a GUI interface in Visual Basic, see if I can track the numbers

我将在Visual Basic中创建一个GUI界面,看看是否能跟踪这些数字。

#18


1  

Mergesort both lists, start at the beginning of both lists, and then search through each list for similar numbers at the same time.

合并两个列表,从两个列表的开头开始,然后在每个列表中同时搜索相似的数字。

So, in pseudocode, it would go something like...

所以,在伪代码中,它会像。

Mergesort (List A);
Mergesort (list B)

$Apos = 0;
$Bpos = 0;

while( $Apos != A.Length && $Bpos != B.length) // while you have not reached the end of either list
{
if (A[$Apos] == B[$Bpos])// found a match
doSomething();

else if (A[$Apos] > B[$Bpos]) // B is lower than A, so have B try and catch up to A.
$Bpos++;

else if (A[$Apos] < B[$Bpos]) // the value at A is less than the value at B, so increment B
$Apos++;

}

If I'm right, the speed of this algorithm is O(n logn).

如果我是对的,这个算法的速度是O(n logn)

#19


1  

I'm not sure why Mrk Mnl was downvoted but the function call is the overhead here.

我不确定为什么Mrk Mnl被否决,但是函数调用是这里的开销。

Push out the matched numbers into another array and doBla() on them after the comparisons. As a test // out doBla() and see if you are experiencing the same performance issue.

将匹配的数字推入另一个数组,并在比较之后对它们执行doBla()操作。作为测试// out doBla(),看看您是否遇到了相同的性能问题。

#20


0  

Would it be possible to put these numbers into two database tables, and then do an INNER JOIN? This will be very efficient and provide only the numbers which are contained in both tables. This is a perfect task for a database.

是否可能将这些数字放入两个数据库表中,然后进行内部连接?这将非常有效,并且只提供包含在两个表中的数字。对于数据库来说,这是一项完美的任务。

#21


0  

  1. Create two duplicate collections, preferably ones with fast lookup times, like HashSet or perhaps TreeSet. Avoid Lists as they have very poor lookup times.

    创建两个重复的集合,最好是具有快速查找时间的集合,比如HashSet或TreeSet。避免使用列表,因为它们的查找时间很短。

  2. As you find elements, remove them from both sets. This can reduce lookup times by having fewer elements to sift through in later searches.

    当您找到元素时,从两个集合中删除它们。这可以通过在以后的搜索中减少元素的筛选来减少查找时间。

#22


0  

If you're trying to get a list of numbers without any duplicates, you can use a hash:

如果你想要一个没有重复的数字列表,你可以使用散列:

$unique = array();
foreach ($list1 as $num) {
  $unique[$num] = $num;
}
foreach ($list2 as $num) {
  $unique[$num] = $num;
}
$unique = array_keys($unique);

It's going to be slightly (very slightly) slower than the array walk method, but it's cleaner in my opinion.

它会比数组walk慢一点(非常慢),但在我看来它更简洁。

#23


0  

This code will call doBla() once for each time a value in $numbers1 is found in $numbers2:

这个代码将每次调用doBla()一次,在$ s2中发现$ 1中的值:

// get [val => occurences, ...] for $numbers2
$counts = array_count_values($numbers2);
foreach ($numbers1 as $n1) {
    // if $n1 occurs in $numbers2...
    if (isset($counts[$n1])) {
        // call doBla() once for each occurence
        for ($i=0; $i < $counts[$n1]; $i++) {
            doBla();
        }
    }
}

If you only need to call doBla() once if a match is found:

如果找到匹配项,只需调用doBla()一次:

foreach ($numbers1 as $n1) {
    if (in_array($n1, $numbers2))
        doBla();
}

If $numbers1 and $numbers2 will only contain unique values, or if the number of times any specific value occurs in both arrays is not important, array_intersect() will do the job:

如果$numbers1和$numbers2只包含唯一的值,或者如果两个数组中出现任何特定值的次数都不重要,那么array_intersect()将完成该工作:

$dups = array_intersect($numbers1, $numbers2);
foreach ($dups as $n)
    doBla();

I agree with several earlier posts that the calls to doBla() are probably taking more time than iterating over the arrays.

我同意前面的几个帖子,对doBla()的调用可能比遍历数组要花费更多的时间。

#24


0  

This problem can be break into 2 tasks. 1st task is finding all combinations (n^2-n)/2. For n=1000 the solution is x=499500. The 2nd task is to loop through all x numbers and compare them with the function doBla().

这个问题可以分为两个任务。1日的任务是找到所有组合(n ^ 2 n)/ 2。对于n=1000,解是x=499500。第二项任务是遍历所有的x值,并将它们与doBla()函数进行比较。

function getWayStr(curr) {
 var nextAbove = -1;
 for (var i = curr + 1; i < waypoints.length; ++i) {
  if (nextAbove == -1) {
    nextAbove = i;
   } else {
     wayStr.push(waypoints[i]);
     wayStr.push(waypoints[curr]);
   }
  }
  if (nextAbove != -1) {
    wayStr.push(waypoints[nextAbove]);
    getWayStr(nextAbove);
    wayStr.push(waypoints[curr]);
  }
 } 

#25


0  

Merge, sort and then count

<?php
    $first = array('1001', '1002', '1003', '1004', '1005');
    $second = array('1002', '1003', '1004', '1005', '1006');
    $merged = array_merge($first, $first, $second);
    sort($merged);
    print_r(array_count_values($merged));
?>

Output / the values with a count of three are the ones you want

Array
(
    [1001] => 2
    [1002] => 3
    [1003] => 3
    [1004] => 3
    [1005] => 3
    [1006] => 1
)

#26


0  

Use WebAssembly rather than JavaScript.

使用WebAssembly而不是JavaScript。

#1


129  

Sort the lists first. Then you can walk up both lists from the start, comparing as you go.

首先对列表排序。然后你可以从一开始就沿着这两个列表往上走,然后一边走一边比较。

The loop would look something like this:

这个循环看起来是这样的:

var A = getFirstArray().sort(), B = getSecondArray().sort();

var i = 0, j = 0;
while (i < A.length && j < B.length) {
    if (A[i] === B[j]) {
        doBla(A[i]);
        i++; j++;
    }
    else if (A[i] < B[j]) {
        i++;
    }
    else
        j++;
}

(That's JavaScript; you could do it server-side too, but I don't know PHP.)

(这是JavaScript;你也可以在服务器端做,但我不知道PHP)。

Edit — just to be fair to all the hashtable fans (whom I respect of course), it's pretty easy to do that in JavaScript:

编辑——为了公平地对待所有的hashtable粉丝(我当然尊重他们),在JavaScript中很容易做到:

var map = {};
for (var i = 0; i < B.length; ++i) map[B[i]] = true; // Assume integers.
for (var i = 0; i < A.length; ++i) if (map[A[i]]) doBla(A[i]);

Or if the numbers are or might be floats:

或者如果数字是或可能是浮点数:

var map = {};
for (var i = 0; i < B.length; ++i) map['' + B[i]] = true; // Assume integers.
for (var i = 0; i < A.length; ++i) if (map['' + A[i]]) doBla(A[i]);

Since numbers are pretty cheap to hash (even in JavaScript, converting to string before hashing is surprisingly cheap), this would be pretty fast.

由于数字非常便宜(甚至在JavaScript中,在哈希之前转换为字符串非常便宜),这将是非常快的。

#2


85  

#3


27  

In database terms this can a join of 1000 rows to another 1000 rows. Any modern database system can handle this.

在数据库术语中,这可以将1000行连接到另1000行。任何现代数据库系统都可以处理这个问题。

select x from table1
inner join table2
on table1.x = table2.y

where table1 and table2 are the rows concerned and could be the same table.

其中表1和表2是相关的行,可以是同一个表。

#4


26  

What you have shouldnt take that long - what does doBla() do? I suspect that is taking the time? Comparing two sets of 1000000 numbers with the same algorithm takes no time at all..

你所拥有的不应该花那么长时间- doBla()做什么?我怀疑这是在浪费时间?用相同的算法比较两组1000000的数字根本不需要时间。

This is hilarious - the number of optimisation techniques as answers - the problem is not your algorithm - it is whatever doBla() does that is taking the time by a factor many times greater than any optimisation would help you :) esp. given the sets are only 1000 long and you have to sort them first..

这是滑稽——的数量优化技术的答案,这个问题不是你的算法——这是任何doBla()所花时间的一个因素很多倍比优化会帮助你:)特别是考虑到集只有1000长,你必须先排序。

#5


23  

Maybe just intersect the array values to find numbers existing in both arrays?

也许只需要与数组值相交就可以找到两个数组中的数字?

$result = array_intersect($numbers1, $numbers2);
foreach ($result as $val)
  doBla();

#6


9  

If you sort list2 first and then do a binary search for each number in list1 you'll see a huge speed increase.

如果你先对list2进行排序,然后对list1中的每个数字进行二分查找,你会看到一个巨大的速度增长。

I'm not a PHP guy, but this should give you the idea:

我不是一个喜欢PHP的人,但这应该能让你明白:

sort($numbers2);

foreach($numbers1 as $n1)
{
   if (BinarySearch($numbers2, $n1) >= 0) {
     doBla();
 }
}

Obviously not being a PHP guy I don't know the library, but I'm sure sorting and binary searching should be easy enough to find.

显然我不是一个PHP人,我不知道这个库,但我确信排序和二进制搜索应该足够容易找到。

Note: In case you're not familiar with a binary search; you're sorting list2 because binary searches need to operate on sorted lists.

注意:如果您不熟悉二进制搜索;对列表2进行排序是因为二进制搜索需要对已排序的列表进行操作。

#7


5  

Sort them first.

先排序。

#8


5  

I'm not a PHP expert, so this may need some debugging, but you can do this easily in O(n) time:

我不是PHP专家,所以这可能需要一些调试,但是您可以在O(n)时间内轻松完成:

// Load one array into a hashtable, keyed by the number: O(n).
$keys1 = [];
foreach($numbers1 as $n1) $keys1[$n1] = true;

// Find the intersections with the other array:
foreach($numbers2 as $n2) { // O(n)
  if (isset($keys1[$n2]) { // O(1)
     doBla();
  }
}

Regardless, the intersection isn't where your time is going. Even a bad O(n^2) implementation like you have now should be able to go through 1000 numbers in a second.

不管怎样,十字路口并不是你时间的方向。甚至是一个糟糕的O(n ^ 2)实现像你现在应该能够经历1000年数据。

#9


5  

Stop- why are you doing this?

停-你为什么要这么做?

If the numbers are already in a SQL database, then do a join and let the DB figure out the most efficient route.

如果数字已经在SQL数据库中,那么进行连接,让DB找出最有效的路由。

If they aren't in a database, then I'm betting you've gone off track somewhere and really ought to reconsider how you got here.

如果它们不在数据库中,那么我打赌你已经偏离了方向,真的应该重新考虑你是如何到达这里的。

#10


4  

$same_numbers = array_intersect($numbers1, $$numbers2);

foreach($same_numbers as $n)
{
  doBla();
}

#11


3  

Sort both lists, then walk both lists at the same time using the old-master new-master sequential update pattern. As long as you can sort the data it is the fastest way since your really only walking the list once, to the longest length of the largest list.

对这两个列表进行排序,然后使用老主控的新主顺序更新模式同时遍历两个列表。只要你能对数据进行排序,这是最快的方式,因为你真的只遍历一次列表,到最大列表的最长长度。

#12


2  

Your code is simply more complicated then in needs to be.

你的代码比需要的要复杂得多。

Assuming what you're looking for is that the numbers in each position match (and not just that the array contains the same numbers), you can flatten your loop to a single for.

假设您要查找的是每个位置上的数字匹配(而不仅仅是数组包含相同的数字),您可以将循环压平为单个for。

<?php
// Fill two arrays with random numbers as proof.
$first_array = array(1000);
$second_array = array(1000);
for($i=0; $i<1000; $i++) $first_array[$i] = rand(0, 1000);
for($i=0; $i<1000; $i++) $second_array[$i] = rand(0, 1000);

// The loop you care about.
for($i=0; $i<1000; $i++) if ($first_array[$i] != $second_array[$i]) echo "Error at $i: first_array was {$first_array[$i]}, second was {$second_array[$i]}<br>";

?>

Using the code above, you will only loop 1000 times, as opposed to looping 1000000 times.

使用上面的代码,您将只循环1000次,而不是循环1000000次。

Now, if you need to just check that a number appears or does not appear in the arrays, use array_diff and array_intersect as follows:

现在,如果需要检查数组中是否出现了数字,请使用array_diff和array_intersect:

<?php
// Fill two arrays with random numbers as proof.
$first_array = array(1000);
$second_array = array(1000);
for($i=0; $i<1000; $i++) $first_array[$i] = rand(0, 1000);
for($i=0; $i<1000; $i++) $second_array[$i] = rand(0, 1000);

$matches = array_intersect($first_array, $second_array);
$differences = array_diff($first_array, $second_array);

?>

#13


2  

Maybe I'm not seeing something here but this looks like a classic case of set intersection. Here's a few lines in perl that'll do it.

也许我没在这里看到什么但这看起来像是一个典型的集合交叉的例子。下面是perl中的几行代码。

foreach $e (@a, @b) { $union{$e}++ && $isect{$e}++ }

foreach $ e(@a @b){ $联盟{ $ e } + + & & $ isect { $ e } + + }

@union = keys %union; @isect = keys %isect;

@union =键%联盟;@isect = % isect键;

At the end of these lines of code @isect will contain all numbers that are in both @a and @b. I'm sure this is translatable to php more or less directly. FWIW, this is my favorite piece of code from the Perl Cookbook.

在这些代码行末尾,@isect将包含@a和@b中的所有数字。我确信这在php中或多或少是可以直接翻译的。FWIW,这是我最喜欢的Perl Cookbook中的代码片段。

#14


2  

You can do it in O(n) time if you use bucket sort. Assuming you know the maximum value the numbers can take (although there are ways around that).

如果使用bucket排序,可以在O(n)时间内完成。假设你知道数字可以取的最大值(尽管有办法)。

http://en.wikipedia.org/wiki/Bucket_sort

http://en.wikipedia.org/wiki/Bucket_sort

#15


1  

I think it would be much easier to use the built in array_intersect function. Using your example, you could do:

我认为使用内置的array_intersect函数要容易得多。用你的例子,你可以做到:

$results = array_intersect($numbers1, $numbers2);
foreach($results as $rk => $rv) {
    doSomething($rv);
}

#16


1  

A better way would be to do something like this:

更好的办法是这样做:

// 1. Create a hash map from one of the lists.
var hm = { };
for (var i in list1) {
  if (!hm[list1[i]]) {
    hm[list1[i]] = 1;
  } else { hm[list1[i]] += 1; }
}

// 2. Lookup each element in the other list.
for (var i in list2) {
  if (hm[list2[i]] >= 1) {
    for (var j = 0; j < hm[list2[i]]; ++j) {
      doBla();
    }
  }
}

This is guaranteed O(n) [assuming insertion an lookup in a hash map is O(1) amortized].

这是有保证的O(n)[假设在散列映射中插入一个查找是O(1)平摊的]。

Update: The worst case of this algorithm would be O(n2) and there is no way to reduce -- unless you change the semantics of the program. This is because in the worst case, the program will call doBla() n2 number of times if all the numbers in both the lists are the same. However, if both the lists have unique numbers (i.e. generally unique within a list), then the runtime would tend towards O(n).

更新:这种算法最糟糕的情况是O(n2),而且没有办法减少——除非您更改程序的语义。这是因为在最坏的情况下,如果两个列表中的所有数字都相同,程序将调用doBla() n2次。但是,如果两个列表都有唯一的数字(即在列表中通常是唯一的),那么运行时将趋向于O(n)。

#17


1  

I'll create a GUI interface in Visual Basic, see if I can track the numbers

我将在Visual Basic中创建一个GUI界面,看看是否能跟踪这些数字。

#18


1  

Mergesort both lists, start at the beginning of both lists, and then search through each list for similar numbers at the same time.

合并两个列表,从两个列表的开头开始,然后在每个列表中同时搜索相似的数字。

So, in pseudocode, it would go something like...

所以,在伪代码中,它会像。

Mergesort (List A);
Mergesort (list B)

$Apos = 0;
$Bpos = 0;

while( $Apos != A.Length && $Bpos != B.length) // while you have not reached the end of either list
{
if (A[$Apos] == B[$Bpos])// found a match
doSomething();

else if (A[$Apos] > B[$Bpos]) // B is lower than A, so have B try and catch up to A.
$Bpos++;

else if (A[$Apos] < B[$Bpos]) // the value at A is less than the value at B, so increment B
$Apos++;

}

If I'm right, the speed of this algorithm is O(n logn).

如果我是对的,这个算法的速度是O(n logn)

#19


1  

I'm not sure why Mrk Mnl was downvoted but the function call is the overhead here.

我不确定为什么Mrk Mnl被否决,但是函数调用是这里的开销。

Push out the matched numbers into another array and doBla() on them after the comparisons. As a test // out doBla() and see if you are experiencing the same performance issue.

将匹配的数字推入另一个数组,并在比较之后对它们执行doBla()操作。作为测试// out doBla(),看看您是否遇到了相同的性能问题。

#20


0  

Would it be possible to put these numbers into two database tables, and then do an INNER JOIN? This will be very efficient and provide only the numbers which are contained in both tables. This is a perfect task for a database.

是否可能将这些数字放入两个数据库表中,然后进行内部连接?这将非常有效,并且只提供包含在两个表中的数字。对于数据库来说,这是一项完美的任务。

#21


0  

  1. Create two duplicate collections, preferably ones with fast lookup times, like HashSet or perhaps TreeSet. Avoid Lists as they have very poor lookup times.

    创建两个重复的集合,最好是具有快速查找时间的集合,比如HashSet或TreeSet。避免使用列表,因为它们的查找时间很短。

  2. As you find elements, remove them from both sets. This can reduce lookup times by having fewer elements to sift through in later searches.

    当您找到元素时,从两个集合中删除它们。这可以通过在以后的搜索中减少元素的筛选来减少查找时间。

#22


0  

If you're trying to get a list of numbers without any duplicates, you can use a hash:

如果你想要一个没有重复的数字列表,你可以使用散列:

$unique = array();
foreach ($list1 as $num) {
  $unique[$num] = $num;
}
foreach ($list2 as $num) {
  $unique[$num] = $num;
}
$unique = array_keys($unique);

It's going to be slightly (very slightly) slower than the array walk method, but it's cleaner in my opinion.

它会比数组walk慢一点(非常慢),但在我看来它更简洁。

#23


0  

This code will call doBla() once for each time a value in $numbers1 is found in $numbers2:

这个代码将每次调用doBla()一次,在$ s2中发现$ 1中的值:

// get [val => occurences, ...] for $numbers2
$counts = array_count_values($numbers2);
foreach ($numbers1 as $n1) {
    // if $n1 occurs in $numbers2...
    if (isset($counts[$n1])) {
        // call doBla() once for each occurence
        for ($i=0; $i < $counts[$n1]; $i++) {
            doBla();
        }
    }
}

If you only need to call doBla() once if a match is found:

如果找到匹配项,只需调用doBla()一次:

foreach ($numbers1 as $n1) {
    if (in_array($n1, $numbers2))
        doBla();
}

If $numbers1 and $numbers2 will only contain unique values, or if the number of times any specific value occurs in both arrays is not important, array_intersect() will do the job:

如果$numbers1和$numbers2只包含唯一的值,或者如果两个数组中出现任何特定值的次数都不重要,那么array_intersect()将完成该工作:

$dups = array_intersect($numbers1, $numbers2);
foreach ($dups as $n)
    doBla();

I agree with several earlier posts that the calls to doBla() are probably taking more time than iterating over the arrays.

我同意前面的几个帖子,对doBla()的调用可能比遍历数组要花费更多的时间。

#24


0  

This problem can be break into 2 tasks. 1st task is finding all combinations (n^2-n)/2. For n=1000 the solution is x=499500. The 2nd task is to loop through all x numbers and compare them with the function doBla().

这个问题可以分为两个任务。1日的任务是找到所有组合(n ^ 2 n)/ 2。对于n=1000,解是x=499500。第二项任务是遍历所有的x值,并将它们与doBla()函数进行比较。

function getWayStr(curr) {
 var nextAbove = -1;
 for (var i = curr + 1; i < waypoints.length; ++i) {
  if (nextAbove == -1) {
    nextAbove = i;
   } else {
     wayStr.push(waypoints[i]);
     wayStr.push(waypoints[curr]);
   }
  }
  if (nextAbove != -1) {
    wayStr.push(waypoints[nextAbove]);
    getWayStr(nextAbove);
    wayStr.push(waypoints[curr]);
  }
 } 

#25


0  

Merge, sort and then count

<?php
    $first = array('1001', '1002', '1003', '1004', '1005');
    $second = array('1002', '1003', '1004', '1005', '1006');
    $merged = array_merge($first, $first, $second);
    sort($merged);
    print_r(array_count_values($merged));
?>

Output / the values with a count of three are the ones you want

Array
(
    [1001] => 2
    [1002] => 3
    [1003] => 3
    [1004] => 3
    [1005] => 3
    [1006] => 1
)

#26


0  

Use WebAssembly rather than JavaScript.

使用WebAssembly而不是JavaScript。