数组中数字的绝对差值之和

时间:2021-04-19 18:24:30

I want to calculate the sum of absolute differences of a number at index i with all integers up to index i-1 in o(n). But i am not able to think of any approach better than o(n^2) .

我想计算索引i处的数字与o(n)中索引i-1的所有整数的绝对差值之和。但我无法想到任何比o(n ^ 2)更好的方法。

For E.g. :

例如:

[3,5,6,7,1]

array with absolute sum will be(for integer at index i sum will be at index i in another array):

具有绝对和的数组将是(对于索引i的整数,sum将在另一个数组中的索引i处):

[0, 2, 4, 7, 17]

Can anyone help me to reduce the complexity to o(n) (if not possible then at least better optimization in terms of time complexity)?

任何人都可以帮助我降低o(n)的复杂性(如果不可能那么至少在时间复杂度方面更好的优化)?

Here my python code:

这是我的python代码:

a=[3,5,6,7,1]
n=5
absoluteSumArray=[]
for i in range(0,n):
  Sum=0
  for j in range(0,i):
     Sum+=abs(int(a[i])-int(a[j]))
  absoluteSumArray.append(Sum)

2 个解决方案

#1


16  

I can offer an O(n log n) solution for a start: Let fi be the i-th number of the result. We have:

我可以为一个开始提供一个O(n log n)解决方案:让fi成为结果的第i个数字。我们有:

数组中数字的绝对差值之和

When walking through the array from left to right and maintain a binary search tree of the elements a0 to ai-1, we can solve all parts of the formula in O(log n):

当从左到右遍历数组并维护元素a0到ai-1的二叉搜索树时,我们可以在O(log n)中求解公式的所有部分:

  • Keep subtree sizes to count the elements larger than/smaller than a given one
  • 保持子树大小以计算大于/小于给定元素的元素
  • Keep cumulative subtree sums to answer the sum queries for elements larger than/smaller than a given one
  • 保持累积子树总和以回答大于/小于给定元素的元素的总和查询

We can replace the augmented search tree with some simpler data structures if we want to avoid the implementation cost:

如果我们想要避免实现成本,我们可以用一些更简单的数据结构替换增强的搜索树:

  • Sort the array beforehand. Assign every number its rank in the sorted order
  • 事先对数组进行排序。按排序顺序为其排名分配每个数字
  • Keep a binary indexed tree of 0/1 values to calculate the number of elements smaller than a given value
  • 保留0/1值的二进制索引树,以计算小于给定值的元素数
  • Keep another binary indexed tree of the array values to calculate the sums of elements smaller than a given value
  • 保留数组值的另一个二进制索引树,以计算小于给定值的元素总和

TBH I don't think this can be solved in O(n) in the general case. At the very least you would need to sort the numbers at some point. But maybe the numbers are bounded or you have some other restriction, so you might be able to implement the sum and count operations in O(1).

TBH我不认为在一般情况下这可以在O(n)中解决。至少你需要在某个时候对数字进行排序。但也许这些数字有限,或者你有其他限制,所以你可以在O(1)中实现sum和count运算。

An implementation:

实施:

# binary-indexed tree, allows point updates and prefix sum queries
class Fenwick:
  def __init__(self, n):
    self.tree = [0]*(n+1)
    self.n = n
  def update_point(self, i, val):  # O(log n)
    i += 1
    while i <= self.n:
      self.tree[i] += val
      i += i & -i
  def read_prefix(self, i):        # O(log n)
    i += 1
    sum = 0
    while i > 0:
      sum += self.tree[i]
      i -= i & -i
    return sum

def solve(a):
  rank = { v : i for i, v in enumerate(sorted(a)) }
  res = []
  counts, sums = Fenwick(len(a)), Fenwick(len(a))
  total_sum = 0
  for i, x in enumerate(a):
    r = rank[x]
    num_smaller = counts.read_prefix(r)
    sum_smaller = sums.read_prefix(r)
    res.append(total_sum - 2*sum_smaller + x * (2*num_smaller - i))
    counts.update_point(r, 1)
    sums.update_point(r, x)
    total_sum += x
  return res

print(solve([3,5,6,7,1]))  # [0, 2, 4, 7, 17]
print(solve([2,0,1]))      # [0, 2, 2]

#2


5  

Here's an Omega(n log n)-comparison lower bound in the linear decision tree model. This rules out the possibility of a "nice" o(n log n)-time algorithm (two now-deleted answers both were in this class).

这是线性决策树模型中的Omega(n log n) - 比较下界。这排除了“漂亮”o(n log n)时间算法的可能性(两个现在删除的答案都在这个类中)。

There is a trivial reduction to this problem from the problem of computing

从计算问题中解决这个问题是微不足道的

f(x1, ..., xn) = sum_i sum_j |xi - xj|.

The function f is totally differentiable at x1, ..., xn if and only if x1, ..., xn are pairwise distinct. The set where f is totally differentiable thus has n! connected components, of which each leaf of the decision tree can handle at most one.

当且仅当x1,...,xn成对不同时,函数f在x1,...,xn处是完全可微的。 f完全可区分的集合因此具有n!连接组件,决策树的每个叶子最多可以处理一个组件。

#1


16  

I can offer an O(n log n) solution for a start: Let fi be the i-th number of the result. We have:

我可以为一个开始提供一个O(n log n)解决方案:让fi成为结果的第i个数字。我们有:

数组中数字的绝对差值之和

When walking through the array from left to right and maintain a binary search tree of the elements a0 to ai-1, we can solve all parts of the formula in O(log n):

当从左到右遍历数组并维护元素a0到ai-1的二叉搜索树时,我们可以在O(log n)中求解公式的所有部分:

  • Keep subtree sizes to count the elements larger than/smaller than a given one
  • 保持子树大小以计算大于/小于给定元素的元素
  • Keep cumulative subtree sums to answer the sum queries for elements larger than/smaller than a given one
  • 保持累积子树总和以回答大于/小于给定元素的元素的总和查询

We can replace the augmented search tree with some simpler data structures if we want to avoid the implementation cost:

如果我们想要避免实现成本,我们可以用一些更简单的数据结构替换增强的搜索树:

  • Sort the array beforehand. Assign every number its rank in the sorted order
  • 事先对数组进行排序。按排序顺序为其排名分配每个数字
  • Keep a binary indexed tree of 0/1 values to calculate the number of elements smaller than a given value
  • 保留0/1值的二进制索引树,以计算小于给定值的元素数
  • Keep another binary indexed tree of the array values to calculate the sums of elements smaller than a given value
  • 保留数组值的另一个二进制索引树,以计算小于给定值的元素总和

TBH I don't think this can be solved in O(n) in the general case. At the very least you would need to sort the numbers at some point. But maybe the numbers are bounded or you have some other restriction, so you might be able to implement the sum and count operations in O(1).

TBH我不认为在一般情况下这可以在O(n)中解决。至少你需要在某个时候对数字进行排序。但也许这些数字有限,或者你有其他限制,所以你可以在O(1)中实现sum和count运算。

An implementation:

实施:

# binary-indexed tree, allows point updates and prefix sum queries
class Fenwick:
  def __init__(self, n):
    self.tree = [0]*(n+1)
    self.n = n
  def update_point(self, i, val):  # O(log n)
    i += 1
    while i <= self.n:
      self.tree[i] += val
      i += i & -i
  def read_prefix(self, i):        # O(log n)
    i += 1
    sum = 0
    while i > 0:
      sum += self.tree[i]
      i -= i & -i
    return sum

def solve(a):
  rank = { v : i for i, v in enumerate(sorted(a)) }
  res = []
  counts, sums = Fenwick(len(a)), Fenwick(len(a))
  total_sum = 0
  for i, x in enumerate(a):
    r = rank[x]
    num_smaller = counts.read_prefix(r)
    sum_smaller = sums.read_prefix(r)
    res.append(total_sum - 2*sum_smaller + x * (2*num_smaller - i))
    counts.update_point(r, 1)
    sums.update_point(r, x)
    total_sum += x
  return res

print(solve([3,5,6,7,1]))  # [0, 2, 4, 7, 17]
print(solve([2,0,1]))      # [0, 2, 2]

#2


5  

Here's an Omega(n log n)-comparison lower bound in the linear decision tree model. This rules out the possibility of a "nice" o(n log n)-time algorithm (two now-deleted answers both were in this class).

这是线性决策树模型中的Omega(n log n) - 比较下界。这排除了“漂亮”o(n log n)时间算法的可能性(两个现在删除的答案都在这个类中)。

There is a trivial reduction to this problem from the problem of computing

从计算问题中解决这个问题是微不足道的

f(x1, ..., xn) = sum_i sum_j |xi - xj|.

The function f is totally differentiable at x1, ..., xn if and only if x1, ..., xn are pairwise distinct. The set where f is totally differentiable thus has n! connected components, of which each leaf of the decision tree can handle at most one.

当且仅当x1,...,xn成对不同时,函数f在x1,...,xn处是完全可微的。 f完全可区分的集合因此具有n!连接组件,决策树的每个叶子最多可以处理一个组件。