如何找到两个序列之间的最小差异

时间:2023-01-09 21:32:04

I have an assignment for my class that I am completely lost on how to implement. I have added the problem below. I don't need code. I would just like any advice on how to approach the problem. I've googled it, but can't seem to find anything.

我有一个班级的任务,我完全迷失了如何实施。我在下面添加了问题。我不需要代码。我想就如何处理这个问题提出任何建议。我用谷歌搜索了它,但似乎找不到任何东西。

My brute force approach is to loop through every possibility and check to see which subsequence creates the minimum. This works for small inputs, but obviously is way too slow for larger ones. I also did pre-checks on values and cut them out if they became larger than the current minimum.

我的强力方法是遍历每种可能性并检查哪个子序列创建最小值。这适用于小输入,但对于较大的输入显然太慢了。我还对值进行了预先检查,如果它们变得大于当前最小值,则将它们删除。

For the given sequences a and b (not necessarily having the same lengths), find the subsequence a’ or a and subsequence b’ of b such that diff(a’, b’) is minimal. Return the difference.

对于给定的序列a和b(不一定具有相同的长度),找到b的子序列a'或a和子序列b',使得diff(a',b')最小。归还差异。

The difference is the absolute value of a[i] - b[i]. Add these differences for each value in the sequence to get the total. The goal is to find the sequence of 'b' that gives the smallest total absolute difference between it and the sequence for 'a'.

差异是a [i] - b [i]的绝对值。为序列中的每个值添加这些差异以获得总计。目标是找到'b'的序列,它给出了它与'a'序列之间最小的总绝对差。

Example: A = {1, 2, 6}, b = {0, 1, 3, 4, 5}, the output should be smallestDifference(a, b) = 2, where the best subsequence will be b’ = {1, 3, 5}

示例:A = {1,2,6},b = {0,1,3,4,5},输出应为minimumDifference(a,b)= 2,其中最佳子序列为b'= {1 ,3,5}

REQUIREMENTS

  1. Maximum time to compute the solution is 500ms (regardless of input size).
  2. 计算解决方案的最长时间为500毫秒(无论输入大小如何)。

  3. The length of a will be in the range [3, 1000].
  4. a的长度将在[3,1000]的范围内。

  5. The length of b will be in the range [a.length, 1000]
  6. b的长度将在[a.length,1000]的范围内

  7. The values in each array will be in the range [-1000, 1000]
  8. 每个数组中的值将在[-1000,1000]范围内

  9. Order of the sequences matters. Cannot sort
  10. 序列顺序很重要。无法排序

1 个解决方案

#1


0  

EDIT 2: I misunderstood the Questions Initially, the order matters here hence my solution is actually incorrect. I have now updated with a new solution Idea (And left the old one for reference for any one interested)!

编辑2:我误解了问题最初,订单在这里很重要,因此我的解决方案实际上是不正确的。我现在更新了一个新的解决方案Idea(留下旧的解决方案供任何感兴趣的人使用)!

NEW:

So we will define DP[i][j] as the minimum difference between two sub-sequences of A[1..i] and B[1..j].

因此,我们将DP [i] [j]定义为A [1..i]和B [1..j]的两个子序列之间的最小差异。

Now let's consider any two sub sequences of them a', b' respectively. There are 4 scenarios:

现在让我们分别考虑它们的任何两个子序列a',b'。有4种情况:

  1. a' contains A[i], b' contains B[j]
  2. a'包含A [i],b'包含B [j]

  3. a' contains A[i], b' doesn't contain B[j]
  4. a'包含A [i],b'不包含B [j]

  5. a' doesn't contain A[i], b' contains B[j]
  6. a'不包含A [i],b'包含B [j]

  7. a' doesn't contain A[i], b' doesn't contain B[j]
  8. a'不包含A [i],b'不包含B [j]

1 Corresponds to the recurrenceDP[i-1][j-1]+|A[i]-B[j]|. 2 Corresponds to the recurrence DP[i][j-1]. 3 Corresponds to the recurrence DP[i-1][j], finally 3 Corresponds to the recurrence DP[i-1][j-1].

1对应于recurrenceDP [i-1] [j-1] + | A [i] -B [j] |。 2对应于复发DP [i] [j-1]。 3对应于复发DP [i-1] [j],最后3对应于复发DP [i-1] [j-1]。

Since we want the minimum of all these scenario, we have that:

由于我们想要所有这些场景中的最小值,我们有:

DP[i][j]=min(DP[i-1][j-1]+|A[i]-B[j]|, DP[i][j-1], DP[i-1][j], DP[i-1][j-1])

DP [i] [j] = min(DP [i-1] [j-1] + | A [i] -B [j] |,DP [i] [j-1],DP [i-1] [j],DP [i-1] [j-1])

This runs in O(n^2) time. Can you build up from here?

这在O(n ^ 2)时间内运行。你能从这里建立吗?

OLD:

This is a variation of the subset sum problem. We will call Sum(a) as the sum of all elements in a. So we can reword your problem as follows:

这是子集和问题的变体。我们将Sum(a)称为a中所有元素的总和。所以我们可以按如下方式重新编写您的问题:

Given a set A, and a set B, We ideally want to find a sub sequence of B of sum Sum(A). If we can't find any, we want to find a subsequence of B that sums up to Sum(A)-1, ...., and so on.

给定集合A和集合B,理想地希望找到总和(B)的B的子序列。如果我们找不到任何东西,我们想找到B的子序列,它总结为Sum(A)-1,....,依此类推。

So, You first fill a table SubsetSum(i,j) which is true if there is a subset sum of elements B[1..i] with sum j and false otherwise. Fill the entire table. This takes O(nS) Time where n is the number of elements, and S is the total sum of A = Sum(A).

所以,你首先填写一个表SubsetSum(i,j),如果存在元素B [1..i]的子集和,则j为真,否则为false。填写整个表格。这需要O(nS)时间,其中n是元素的数量,S是A = Sum(A)的总和。

Next, start with sum = Sum(A). Then check SubsetSum(n,sum), if it is true, you're done, other wise, try SubsetSum(n, sum-1), if it is true then you're done, other wise, try SubsetSum(n, sum - 2), ..., and so on. This step runs in O(S)

接下来,以sum = Sum(A)开始。然后检查SubsetSum(n,sum),如果是真的,你就完成了,其他方面,尝试SubsetSum(n,sum-1),如果是真的那么你就完成了,其他方面,尝试SubsetSum(n, sum - 2),...等等。此步骤在O(S)中运行

That way, you are minimizing the difference between A and your subsequence of B.

这样,你最小化A和你的子序列之间的差异。

This runs in O(nS).

这在O(nS)中运行。

Does this get you started?

这会让你入门吗?

EDIT:

If you can also choose a sub sequence of A, then you can define S=max(Sum(A), Sum(B)). Then come up with two tables:

如果还可以选择A的子序列,则可以定义S = max(Sum(A),Sum(B))。然后拿出两张桌子:

Subset-Sum-A(i,j) and Subset-Sum-B(i,j)defined as above.

子集Sum-A(i,j)和Subset-Sum-B(i,j)如上定义。

Then set sum=S and check if Subset-Sum-A(n,sum) && Subset-Sum-B(n,sum), if not true, then Subset-Sum-A(n,sum-1) && Subset-Sum-B(n,sum-1) and so on.

然后设置sum = S并检查Subset-Sum-A(n,sum)&& Subset-Sum-B(n,sum),如果不是,则Subset-Sum-A(n,sum-1)&& Subset- Sum-B(n,sum-1)等。

#1


0  

EDIT 2: I misunderstood the Questions Initially, the order matters here hence my solution is actually incorrect. I have now updated with a new solution Idea (And left the old one for reference for any one interested)!

编辑2:我误解了问题最初,订单在这里很重要,因此我的解决方案实际上是不正确的。我现在更新了一个新的解决方案Idea(留下旧的解决方案供任何感兴趣的人使用)!

NEW:

So we will define DP[i][j] as the minimum difference between two sub-sequences of A[1..i] and B[1..j].

因此,我们将DP [i] [j]定义为A [1..i]和B [1..j]的两个子序列之间的最小差异。

Now let's consider any two sub sequences of them a', b' respectively. There are 4 scenarios:

现在让我们分别考虑它们的任何两个子序列a',b'。有4种情况:

  1. a' contains A[i], b' contains B[j]
  2. a'包含A [i],b'包含B [j]

  3. a' contains A[i], b' doesn't contain B[j]
  4. a'包含A [i],b'不包含B [j]

  5. a' doesn't contain A[i], b' contains B[j]
  6. a'不包含A [i],b'包含B [j]

  7. a' doesn't contain A[i], b' doesn't contain B[j]
  8. a'不包含A [i],b'不包含B [j]

1 Corresponds to the recurrenceDP[i-1][j-1]+|A[i]-B[j]|. 2 Corresponds to the recurrence DP[i][j-1]. 3 Corresponds to the recurrence DP[i-1][j], finally 3 Corresponds to the recurrence DP[i-1][j-1].

1对应于recurrenceDP [i-1] [j-1] + | A [i] -B [j] |。 2对应于复发DP [i] [j-1]。 3对应于复发DP [i-1] [j],最后3对应于复发DP [i-1] [j-1]。

Since we want the minimum of all these scenario, we have that:

由于我们想要所有这些场景中的最小值,我们有:

DP[i][j]=min(DP[i-1][j-1]+|A[i]-B[j]|, DP[i][j-1], DP[i-1][j], DP[i-1][j-1])

DP [i] [j] = min(DP [i-1] [j-1] + | A [i] -B [j] |,DP [i] [j-1],DP [i-1] [j],DP [i-1] [j-1])

This runs in O(n^2) time. Can you build up from here?

这在O(n ^ 2)时间内运行。你能从这里建立吗?

OLD:

This is a variation of the subset sum problem. We will call Sum(a) as the sum of all elements in a. So we can reword your problem as follows:

这是子集和问题的变体。我们将Sum(a)称为a中所有元素的总和。所以我们可以按如下方式重新编写您的问题:

Given a set A, and a set B, We ideally want to find a sub sequence of B of sum Sum(A). If we can't find any, we want to find a subsequence of B that sums up to Sum(A)-1, ...., and so on.

给定集合A和集合B,理想地希望找到总和(B)的B的子序列。如果我们找不到任何东西,我们想找到B的子序列,它总结为Sum(A)-1,....,依此类推。

So, You first fill a table SubsetSum(i,j) which is true if there is a subset sum of elements B[1..i] with sum j and false otherwise. Fill the entire table. This takes O(nS) Time where n is the number of elements, and S is the total sum of A = Sum(A).

所以,你首先填写一个表SubsetSum(i,j),如果存在元素B [1..i]的子集和,则j为真,否则为false。填写整个表格。这需要O(nS)时间,其中n是元素的数量,S是A = Sum(A)的总和。

Next, start with sum = Sum(A). Then check SubsetSum(n,sum), if it is true, you're done, other wise, try SubsetSum(n, sum-1), if it is true then you're done, other wise, try SubsetSum(n, sum - 2), ..., and so on. This step runs in O(S)

接下来,以sum = Sum(A)开始。然后检查SubsetSum(n,sum),如果是真的,你就完成了,其他方面,尝试SubsetSum(n,sum-1),如果是真的那么你就完成了,其他方面,尝试SubsetSum(n, sum - 2),...等等。此步骤在O(S)中运行

That way, you are minimizing the difference between A and your subsequence of B.

这样,你最小化A和你的子序列之间的差异。

This runs in O(nS).

这在O(nS)中运行。

Does this get you started?

这会让你入门吗?

EDIT:

If you can also choose a sub sequence of A, then you can define S=max(Sum(A), Sum(B)). Then come up with two tables:

如果还可以选择A的子序列,则可以定义S = max(Sum(A),Sum(B))。然后拿出两张桌子:

Subset-Sum-A(i,j) and Subset-Sum-B(i,j)defined as above.

子集Sum-A(i,j)和Subset-Sum-B(i,j)如上定义。

Then set sum=S and check if Subset-Sum-A(n,sum) && Subset-Sum-B(n,sum), if not true, then Subset-Sum-A(n,sum-1) && Subset-Sum-B(n,sum-1) and so on.

然后设置sum = S并检查Subset-Sum-A(n,sum)&& Subset-Sum-B(n,sum),如果不是,则Subset-Sum-A(n,sum-1)&& Subset- Sum-B(n,sum-1)等。