Delete even amounts of duplicates from an array

Time: 2021-12-01 08:26:44

I have an array

[ 1, 0, 0, 0, 5, 2, 4, 5, 2, 2 ]

I need to delete even amounts of duplicates.

That means, if a value appears an even number of times in the array then remove them all, but if it appears an odd number of times then keep just one.

The result from the array above should be

[ 1, 0, 2, 4 ]

How can I do that?

4 solutions

#1


3  

This really isn't difficult, and it is very bad form to show no attempt at all at solving it yourself. I would like someone who posted questions like this to describe how they feel comfortable getting someone else to do their work for them. Even difficult crosswords don't get this flood of requests for a solution, but in this case presumably you are being paid for a solution written by someone else? Why is that not a problem to you?

  • Build a hash to calculate the current count for each value

  • Use the count % 2 to determine the new final count (0 or 1)

  • Deconstruct the hash to a new array

my $array = [ 1, 0, 0, 0, 5, 2, 4, 5, 2, 2 ];

my @new_array = do {

    my %counts;

    ++$counts{$_} for @$array;

    map {
        ( $_ ) x ( $counts{$_} % 2 )
    } sort { $a <=> $b } keys %counts;
};

use Data::Dump;
dd \@new_array;

output

[0, 1, 2, 4]
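
Note that sorting the keys returns the values in numeric order, not in order of first appearance, so this output differs from the [ 1, 0, 2, 4 ] asked for in the question. Below is a minimal sketch of one way to keep first-appearance order while following the same three steps. The helper name odd_once and the extra hash %first_pos are assumptions made for illustration and are not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption, not from the answer above).
sub odd_once {
    my @values = @_;

    my ( %counts, %first_pos );
    for my $i ( 0 .. $#values ) {
        my $v = $values[$i];
        ++$counts{$v};
        $first_pos{$v} //= $i;    # remember where each value first appears
    }

    # Sort the keys by first position instead of by numeric value,
    # then emit each key once if its count is odd, zero times otherwise.
    return map  { ( $_ ) x ( $counts{$_} % 2 ) }
           sort { $first_pos{$a} <=> $first_pos{$b} }
           keys %counts;
}

my @result = odd_once( 1, 0, 0, 0, 5, 2, 4, 5, 2, 2 );
print "@result\n";    # prints "1 0 2 4"
```

The sort is still O(N log N) in the number of distinct values; it is kept here only to stay close to the structure of the code above.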

#2


5  

Removing duplicates is usually done as follows:

use List::Util 1.44 qw( uniqnum );

@a = uniqnum @a;

or

my %seen;
@a = grep { !$seen{$_}++ } @a;

To achieve what you want, we simply need to chain a grep that removes the other undesired elements (the values that appear an even number of times).

use List::Util 1.44 qw( uniqnum );

@a = uniqnum grep { $counts{$_} % 2 } @a;

or

my %seen;
@a = grep { !$seen{$_}++ } grep { $counts{$_} % 2 } @a;

or

my %seen;
@a = grep { ( $counts{$_} % 2 ) && !$seen{$_}++ } @a;

The above solutions rely on having the count of each value. To obtain that, we can use the following:

my %counts;
++$counts{$_} for @a;

All together:

my ( %counts, %seen );
++$counts{$_} for @a;
@a = grep { ( $counts{$_} % 2 ) && !$seen{$_}++ } @a;

Note that these methods of removing duplicates preserve the order of the elements (keeping the first occurrence of each value). This is more efficient (O(N)) than using sort (O(N log N)) to avoid producing a non-deterministic order.
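
For reference, here is a minimal runnable sketch that wraps the combined snippet in a function and runs it on the array from the question. The function name keep_odd_first is an assumption made for illustration; the body is the same count-then-grep logic shown above.

```perl
use strict;
use warnings;

# Hypothetical wrapper (the name is an assumption) around the combined snippet.
sub keep_odd_first {
    my @a = @_;

    my ( %counts, %seen );
    ++$counts{$_} for @a;    # first pass: count every value

    # Second pass: keep a value only if its total count is odd,
    # and only the first time it is encountered.
    return grep { ( $counts{$_} % 2 ) && !$seen{$_}++ } @a;
}

my @a = keep_odd_first( 1, 0, 0, 0, 5, 2, 4, 5, 2, 2 );
print "@a\n";    # prints "1 0 2 4"
```

Both passes are linear in the length of the input, and the result keeps the order of first appearance.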

#3


1  

See the comments in the code to see how this possible solution works.

#!/usr/bin/perl

use strict;
use warnings;

my @a = qw(1 0 0 0 5 2 4 5 2 2);

# Move through the array.
for (my $i = 0; $i < scalar(@a); ) {
  # Move through the positions at and ahead of current position $i
  # and collect all positions $j, that share the value at the
  # current position $i.
  my @indexes;
  for (my $j = $i; $j < scalar(@a); $j++) {
    if ($a[$j] == $a[$i]) {
      push(@indexes, $j);
    }
  }

  if (scalar(@indexes) % 2) {
    # If the number of positions collected is odd remove the first
    # position from the collection. The number of positions in the
    # collection is then even afterwards.
    shift(@indexes);
    # As we will keep the value at the current position $i no new
    # value will move into that position. Hence we have to advance
    # the current position.
    $i++;
  }

  # Move through the collected positions.
  for (my $k = 0; $k < scalar(@indexes); $k++) {
    # Remove the element at the position as indicated by the
    # $k'th element of the collected positions.
    # We have to subtract $k from the collected position, to
    # compensate for the movement of the remaining elements to the
    # left.
    splice(@a, $indexes[$k] - $k, 1);
  }
}

print("@a");

#4


0  

You have a bunch of answers, here's another:

use strict;
use warnings;
use Data::Dumper;

my $input = [ 1, 0, 0, 0, 5, 2, 4, 5, 2, 2 ];
my $output = dedupe_evens($input);

print Data::Dumper->Dump([$input, $output], ['$input', '$output']);

exit;


sub dedupe_evens {
    my($input) = @_;

    my %seen;
    $seen{$_}++ foreach @$input;
    my @output = grep {
        my $count = delete $seen{$_};  # only want first occurrence
        $count && $count % 2;
    } @$input;

    return \@output;
}

Which produces this output (reformatted for brevity):

$input  = [ 1, 0, 0, 0, 5, 2, 4, 5, 2, 2 ];
$output = [ 1, 0, 2, 4 ];
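
Because the values are only ever used as hash keys, the same function should also work for non-numeric values. Here is a small sketch under that assumption. The sub is copied from above so the example is self-contained, and the word list is made-up sample data.

```perl
use strict;
use warnings;

# Copy of dedupe_evens from the answer above, repeated so this runs on its own.
sub dedupe_evens {
    my ($input) = @_;

    my %seen;
    $seen{$_}++ foreach @$input;
    my @output = grep {
        # delete returns the count on the first occurrence and undef afterwards
        my $count = delete $seen{$_};
        $count && $count % 2;
    } @$input;

    return \@output;
}

# Made-up sample data: "a" appears twice, "b" once, "c" three times.
my $words = dedupe_evens( [qw( a b a c c c )] );
print "@$words\n";    # prints "b c"
```

The delete call is what limits each value to its first occurrence: once the key is gone, later occurrences see an undefined count and are filtered out.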
