How to find all the pairs that sum up to a certain values in any random array using Hashmap in C++

时间:2021-12-17 12:03:07

I want to find all the pairs that in an array that sum upto a certain number X using Hashmap.I know the basic solution which has O(n^2) complexity but somewhere I read that hashmap can provide O(n) solution.I have no idea how can I use hashmap to achieve the solution.Can someone provide me a pseudo code on how to do that?

我想找到一个数组中使用Hashmap总计达到一定数量X的所有对。我知道具有O(n ^ 2)复杂度的基本解决方案,但在某处我读到了hashmap可以提供O(n)解.I我不知道如何使用hashmap来实现解决方案。有人能为我提供一个如何做到这一点的伪代码吗?

2 个解决方案

#1


1  

If you were to use something like a std::set, then all values in the set would be unique. This would allow you to loop through the set and do a subtraction from the desired value to determine the other value you would need. You can then test to see if that value exists in the set. The operations would look like this:

如果你使用类似std :: set的东西,那么集合中的所有值都是唯一的。这将允许您遍历集合并从所需值中减去以确定您需要的其他值。然后,您可以测试以查看该值中是否存在该值。操作如下所示:

  1. Set iterator to first element in the set
  2. 将迭代器设置为集合中的第一个元素
  3. Get the value of the first element in the set
  4. 获取集合中第一个元素的值
  5. Subtract the value from the desired value to obtain the needed value
  6. 从期望值中减去该值以获得所需的值
  7. Test the set to see if the needed value is in the set
  8. 测试集合以查看所需的值是否在集合中
  9. If it is record both values
  10. 如果它记录两个值
  11. Increment the iterator and repeat from #2 until you reach the end of the set
  12. 递增迭代器并从#2开始重复,直到到达集合的末尾

#2


0  

For the moment, I'm going to assume you don't really care about including every possible repetition of identical results. If all the items in the input are exactly SUM/2, then you have N2 results (all identical).

目前,我将假设您并不真正关心包含所有可能重复的相同结果。如果输入中的所有项目都是SUM / 2,那么您将获得N2结果(全部相同)。

The basic idea is pretty simple: insert all the data into the map. Then for each item N in in the hash-map, look for sum-N, and if it's present, print out that pair.

基本思路非常简单:将所有数据插入到地图中。然后,对于散列图中的每个项目N in,查找sum-N,如果它存在,则打印出该对。

From a practical viewpoint, if the size of the input isn't immense, you might be better off with an O(N log N) solution. Start by sorting the input. Then start walking from the beginning and end of the sorted data. You start with the first and last items. If the sum is too large, walk backwards from the end until you get a result less than or equal to the desired sum. If it's equal, print out the pair. Either way, you then increment the beginning pointer and continue searching. Continue until the two pointers meet.

从实际的角度来看,如果输入的大小不是很大,那么使用O(N log N)解决方案可能会更好。首先对输入进行排序。然后从排序数据的开头和结尾开始走。您从第一个和最后一个项目开始。如果总和太大,则从末尾向后走,直到得到小于或等于所需总和的结果。如果相等,则打印出该对。无论哪种方式,您然后递增开始指针并继续搜索。继续,直到两个指针相遇。

Since this mostly accesses memory linearly, it tends to be very cache friendly. Accessing data in the hash table tends to have much poorer locality of reference, so even though each access has constant (expected) complexity, the constants involved may be high enough that the O(N log N) version may well be faster in practice.

由于这主要是线性访问内存,因此它往往非常缓存友好。访问散列表中的数据往往具有更差的引用局部性,因此即使每个访问具有恒定(预期)的复杂性,所涉及的常数可能足够高以至于O(N log N)版本在实践中可能更快。

#1


1  

If you were to use something like a std::set, then all values in the set would be unique. This would allow you to loop through the set and do a subtraction from the desired value to determine the other value you would need. You can then test to see if that value exists in the set. The operations would look like this:

如果你使用类似std :: set的东西,那么集合中的所有值都是唯一的。这将允许您遍历集合并从所需值中减去以确定您需要的其他值。然后,您可以测试以查看该值中是否存在该值。操作如下所示:

  1. Set iterator to first element in the set
  2. 将迭代器设置为集合中的第一个元素
  3. Get the value of the first element in the set
  4. 获取集合中第一个元素的值
  5. Subtract the value from the desired value to obtain the needed value
  6. 从期望值中减去该值以获得所需的值
  7. Test the set to see if the needed value is in the set
  8. 测试集合以查看所需的值是否在集合中
  9. If it is record both values
  10. 如果它记录两个值
  11. Increment the iterator and repeat from #2 until you reach the end of the set
  12. 递增迭代器并从#2开始重复,直到到达集合的末尾

#2


0  

For the moment, I'm going to assume you don't really care about including every possible repetition of identical results. If all the items in the input are exactly SUM/2, then you have N2 results (all identical).

目前,我将假设您并不真正关心包含所有可能重复的相同结果。如果输入中的所有项目都是SUM / 2,那么您将获得N2结果(全部相同)。

The basic idea is pretty simple: insert all the data into the map. Then for each item N in in the hash-map, look for sum-N, and if it's present, print out that pair.

基本思路非常简单:将所有数据插入到地图中。然后,对于散列图中的每个项目N in,查找sum-N,如果它存在,则打印出该对。

From a practical viewpoint, if the size of the input isn't immense, you might be better off with an O(N log N) solution. Start by sorting the input. Then start walking from the beginning and end of the sorted data. You start with the first and last items. If the sum is too large, walk backwards from the end until you get a result less than or equal to the desired sum. If it's equal, print out the pair. Either way, you then increment the beginning pointer and continue searching. Continue until the two pointers meet.

从实际的角度来看,如果输入的大小不是很大,那么使用O(N log N)解决方案可能会更好。首先对输入进行排序。然后从排序数据的开头和结尾开始走。您从第一个和最后一个项目开始。如果总和太大,则从末尾向后走,直到得到小于或等于所需总和的结果。如果相等,则打印出该对。无论哪种方式,您然后递增开始指针并继续搜索。继续,直到两个指针相遇。

Since this mostly accesses memory linearly, it tends to be very cache friendly. Accessing data in the hash table tends to have much poorer locality of reference, so even though each access has constant (expected) complexity, the constants involved may be high enough that the O(N log N) version may well be faster in practice.

由于这主要是线性访问内存,因此它往往非常缓存友好。访问散列表中的数据往往具有更差的引用局部性,因此即使每个访问具有恒定(预期)的复杂性,所涉及的常数可能足够高以至于O(N log N)版本在实践中可能更快。