找到数组中的第n个最小元素而不进行排序?

时间:2021-03-10 21:39:51

I want to write a program to find the n-th smallest element without using any sorting technique..

我想写一个程序来找到第n个最小的元素,而不使用任何排序技术。

Can we do it recursively, divide and conquer style like quick-sort?

我们可以递归地进行,分割和征服风格,如快速排序?

If not, how?

如果没有,怎么样?

6 个解决方案

#1


You can find information about that problem here: Selection algorithm.

您可以在此处找到有关该问题的信息:选择算法。

#2


What you are referring to is the Selection Algorithm, as previously noted. Specifically, your reference to quicksort suggests you are thinking of the partition based selection.

您所指的是选择算法,如前所述。具体来说,您对quicksort的引用表明您正在考虑基于分区的选择。

Here's how it works:

以下是它的工作原理:

  • Like in Quicksort, you start by picking a good pivot: something that you think is nearly half-way through your list. Then you go through your entire list of items swapping things back and forth until all the items less than your pivot are in the beginning of the list, and all things greater than your pivot are at the end. Your pivot goes into the leftover spot in the middle.
  • 就像在Quicksort中一样,你首先要选择一个好的支点:你认为这几乎是你列表中的一半。然后,您将浏览整个项目列表,这些项目来回交换,直到所有项目都不在您的数据库的开头,并且所有大于您的项目的项目都在最后。你的支点进入中间的剩余点。

  • Normally in a quicksort you'd recurse on both sides of the pivot, but for the Selection Algorithm you'll only recurse on the side that contains the index you are interested in. So, if you want to find the 3rd lowest value, recurse on whichever side contains index 2 (because index 0 is the 1st lowest value).
  • 通常在快速排序中你会在枢轴的两侧递归,但是对于选择算法,你只会在包含你感兴趣的索引的那一侧递归。所以,如果你想找到第三个最低值,那么,递归在哪一方包含索引2(因为索引0是第一个最低值)。

  • You can stop recursing when you've narrowed the region to just the one index. At the end, you'll have one unsorted list of the "m-1" smallest objects, and another unsorted list of the "n-m" largest objects. The "m"th object will be inbetween.
  • 当您将区域缩小到一个索引时,可以停止递归。最后,您将拥有一个“m-1”个最小对象的未排序列表,以及另一个“n-m”个最大对象的未排序列表。第m个对象将在其间。

This algorithm is also good for finding a sorted list of the highest m elements... just select the m'th largest element, and sort the list above it. Or, for an algorithm that is a little bit faster, do the Quicksort algorithm, but decline to recurse into regions not overlapping the region for which you want to find the sorted values.

此算法也适用于查找最高m个元素的排序列表...只需选择第m个最大元素,然后对其上方的列表进行排序。或者,对于速度稍快的算法,请执行Quicksort算法,但拒绝递归到与要查找已排序值的区域不重叠的区域。

The really neat thing about this is that it normally runs in O(n) time. The first time through, it sees the entire list. On the first recursion, it sees about half, then one quarter, etc. So, it looks at about 2n elements, therefore it runs in O(n) time. Unfortunately, as in quicksort, if you consistently pick a bad pivot, you'll be running in O(n2) time.

关于这一点的真正好处是它通常在O(n)时间内运行。第一次,它看到整个列表。在第一次递归时,它看到大约一半,然后是四分之一等等。因此,它查看大约2n个元素,因此它在O(n)时间内运行。不幸的是,就像在quicksort中一样,如果你一直选择一个糟糕的支点,你将在O(n2)时间内运行。

#3


This task is quite possible to complete within roughly O(n) time (n being the length of the list) by using a heap structure (specifically, a priority queue based on a Fibonacci heap), which gives O(1) insertion time and O(log n) removal time).

通过使用堆结构(具体地,基于Fibonacci堆的优先级队列),这个任务很可能在大约O(n)时间内完成(n是列表的长度),这给出了O(1)插入时间和O(log n)去除时间)。

Consider the task of retrieving the m-th smallest element from the list. By simply looping over the list and adding each item to the priority queue (of size m), you can effectively create a queue of each of the items in the list in O(n) time (or possibly fewer using some optimisations, though I'm not sure this is exceedingly helpful). Then, it is a straightforward matter of removing the element with lowest priority in the queue (highest priority being the smallest item), which only takes O(log m) time in total, and you're finished.

考虑从列表中检索第m个最小元素的任务。通过简单地循环遍历列表并将每个项目添加到优先级队列(大小为m),您可以在O(n)时间内有效地创建列表中每个项目的队列(或者使用一些优化可能更少,尽管我我不确定这是非常有帮助的。然后,删除队列中具有最低优先级的元素(最高优先级是最小项目)是一件简单的事情,它总共需要O(log m)时间,并且您已完成。

So overall, the time complexity of the algorithm would be O(n + log n), but since log n << n (i.e. n grows a lot faster than log n), this reduces to simply O(n). I don't think you'll be able to get anything significantly more efficient than this in the general case.

总的来说,算法的时间复杂度为O(n + log n),但由于log n << n(即n比log n增长快很多),这简化为O(n)。在一般情况下,我认为你不会得到比这更有效的东西。

#4


You can use Binary heap, if u dont want to use fibonacci heap.

你可以使用Binary堆,如果你不想使用fibonacci堆。

Algo:

  1. Contruct the min binary heap from the array this operation will take O(n) time.

    从数组构造最小二进制堆此操作将花费O(n)时间。

  2. Since this is a min binary heap, the element at the root is the minimum value.

    由于这是一个最小二进制堆,因此根元素是最小值。

  3. So keep on removing element frm root, till u get ur kth minimum value. o(1) operation

    所以继续删除元素frm root,直到你得到你的kth最小值。 o(1)操作

  4. Make sure after every remove you re-store the heap kO(logn) operation.

    确保在每次删除后重新存储堆kO(logn)操作。

So running time here is O(klogn) + O(n)............so it is O(klogn)...

所以这里的运行时间是O(klogn)+ O(n)............所以它是O(klogn)......

#5


Two stacks can be used like this to locate the Nth smallest number in one pass.

可以像这样使用两个堆栈来定位一次传递中的第N个最小数字。

  • Start with empty Stack-A and Stack-B
  • 从空Stack-A和Stack-B开始

  • PUSH the first number into Stack-A
  • 将第一个数字推入Stack-A

  • The next number onwards, choose to PUSH into Stack-A only if the number is smaller than its top
  • 接下来的数字,只有当数字小于其顶部时,才选择将PUSH推入Stack-A

  • When you have to PUSH into Stack-A, run through these steps
    • While TOP of Stack-A is larger than new number, POP TOP of Stack-A and push it into Stack-B
    • 当Stack-A的TOP大于新数字时,Stack-A的POP TOP将其推入Stack-B

    • When Stack-A goes empty or its TOP is smaller than new number, PUSH in the new number and restore the contents of Stack-B over it
    • 当Stack-A变空或其TOP小于新数字时,在新数字中按PUSH并在其上恢复Stack-B的内容

    • At this point you have inserted the new number to its correct (sorted) place in Stack-A and Stack-B is empty again
    • 此时,您已将新数字插入Stack-A中的正确(已排序)位置,而Stack-B再次为空

    • If Stack-A depth is now sufficient you have reached the end of your search
    • 如果Stack-A深度现已足够您已达到搜索结束

  • 当你必须推入Stack-A时,执行这些步骤当Stack-A的TOP大于新数字时,Stack-A的POP TOP并将其推入Stack-B当Stack-A变空或其TOP更小时新数字,新数字中的PUSH以及通过它恢复Stack-B的内容此时您已将新数字插入到Stack-A中的正确(已排序)位置,而Stack-B再次为空如果Stack-A深度已经足够你已经到了搜索的末尾


I generally agree to Noldorins' optimization analysis.
This stack solution is towards a simple scheme that will work (with relatively more data movement -- across the two stacks). The heap scheme reduces the fetch for Nth smallest number to a tree traversal (log m).

我普遍同意Noldorins的优化分析。这个堆栈解决方案是朝着一个简单的方案工作(相对更多的数据移动 - 跨越两个堆栈)。堆方案将第N个最小数的提取减少到树遍历(log m)。

If your target is an optimal solution (say for a large set of numbers or maybe for a programming assignment, where optimization and the demonstration of it are critical) you should use the heap technique.

如果您的目标是一个最佳解决方案(例如,对于大量数字或可能用于编程分配,优化和演示是至关重要的),您应该使用堆技术。

The stack solution can be compressed in space requirements by implementing the two stacks within the same space of K elements (where K is the size of your data set). So, the downside is just extra stack movement as you insert.

通过在K个元素的相同空间内实现两个堆栈(其中K是数据集的大小),可以在空间要求中压缩堆栈解决方案。因此,缺点是插入时额外的堆栈移动。

#6


Here is the Ans to find Kth smallest element from an array:

#include<stdio.h>
#include<conio.h>
#include<iostream>
using namespace std;
int Nthmin=0,j=0,i;
int GetNthSmall(int numbers[],int NoOfElements,int Nthsmall);
int main()
{
    int size;
    cout<<"Enter Size of array\n";
    cin>>size;
    int *arr=(int*)malloc(sizeof(int)*size);
    cout<<"\nEnter array elements\n";
    for(i=0;i<size;i++)
        cin>>*(arr+i);
    cout<<"\n";
    for(i=0;i<size;i++)
        cout<<*(arr+i)<<" ";
    cout<<"\n";
    int n=sizeof(arr)/sizeof(int);
    int result=GetNthSmall(arr,size,3);
    printf("Result = %d",result);
    getch();
    return 0;
}

int GetNthSmall(int numbers[],int NoOfElements,int Nthsmall)
{
    int min=numbers[0];
    while(j<Nthsmall)
    {
        Nthmin=numbers[0];
        for(i=1;i<NoOfElements;i++)
        {
            if(j==0)
            {
                if(numbers[i]<min)
                {
                    min=numbers[i];
                }
                Nthmin=min;
            }
            else
            {
                if(numbers[i]<Nthmin && numbers[i]>min)
                    Nthmin=numbers[i];
            }
        }
        min=Nthmin;
        j++;
    }
    return Nthmin;
}

#1


You can find information about that problem here: Selection algorithm.

您可以在此处找到有关该问题的信息:选择算法。

#2


What you are referring to is the Selection Algorithm, as previously noted. Specifically, your reference to quicksort suggests you are thinking of the partition based selection.

您所指的是选择算法,如前所述。具体来说,您对quicksort的引用表明您正在考虑基于分区的选择。

Here's how it works:

以下是它的工作原理:

  • Like in Quicksort, you start by picking a good pivot: something that you think is nearly half-way through your list. Then you go through your entire list of items swapping things back and forth until all the items less than your pivot are in the beginning of the list, and all things greater than your pivot are at the end. Your pivot goes into the leftover spot in the middle.
  • 就像在Quicksort中一样,你首先要选择一个好的支点:你认为这几乎是你列表中的一半。然后,您将浏览整个项目列表,这些项目来回交换,直到所有项目都不在您的数据库的开头,并且所有大于您的项目的项目都在最后。你的支点进入中间的剩余点。

  • Normally in a quicksort you'd recurse on both sides of the pivot, but for the Selection Algorithm you'll only recurse on the side that contains the index you are interested in. So, if you want to find the 3rd lowest value, recurse on whichever side contains index 2 (because index 0 is the 1st lowest value).
  • 通常在快速排序中你会在枢轴的两侧递归,但是对于选择算法,你只会在包含你感兴趣的索引的那一侧递归。所以,如果你想找到第三个最低值,那么,递归在哪一方包含索引2(因为索引0是第一个最低值)。

  • You can stop recursing when you've narrowed the region to just the one index. At the end, you'll have one unsorted list of the "m-1" smallest objects, and another unsorted list of the "n-m" largest objects. The "m"th object will be inbetween.
  • 当您将区域缩小到一个索引时,可以停止递归。最后,您将拥有一个“m-1”个最小对象的未排序列表,以及另一个“n-m”个最大对象的未排序列表。第m个对象将在其间。

This algorithm is also good for finding a sorted list of the highest m elements... just select the m'th largest element, and sort the list above it. Or, for an algorithm that is a little bit faster, do the Quicksort algorithm, but decline to recurse into regions not overlapping the region for which you want to find the sorted values.

此算法也适用于查找最高m个元素的排序列表...只需选择第m个最大元素,然后对其上方的列表进行排序。或者,对于速度稍快的算法,请执行Quicksort算法,但拒绝递归到与要查找已排序值的区域不重叠的区域。

The really neat thing about this is that it normally runs in O(n) time. The first time through, it sees the entire list. On the first recursion, it sees about half, then one quarter, etc. So, it looks at about 2n elements, therefore it runs in O(n) time. Unfortunately, as in quicksort, if you consistently pick a bad pivot, you'll be running in O(n2) time.

关于这一点的真正好处是它通常在O(n)时间内运行。第一次,它看到整个列表。在第一次递归时,它看到大约一半,然后是四分之一等等。因此,它查看大约2n个元素,因此它在O(n)时间内运行。不幸的是,就像在quicksort中一样,如果你一直选择一个糟糕的支点,你将在O(n2)时间内运行。

#3


This task is quite possible to complete within roughly O(n) time (n being the length of the list) by using a heap structure (specifically, a priority queue based on a Fibonacci heap), which gives O(1) insertion time and O(log n) removal time).

通过使用堆结构(具体地,基于Fibonacci堆的优先级队列),这个任务很可能在大约O(n)时间内完成(n是列表的长度),这给出了O(1)插入时间和O(log n)去除时间)。

Consider the task of retrieving the m-th smallest element from the list. By simply looping over the list and adding each item to the priority queue (of size m), you can effectively create a queue of each of the items in the list in O(n) time (or possibly fewer using some optimisations, though I'm not sure this is exceedingly helpful). Then, it is a straightforward matter of removing the element with lowest priority in the queue (highest priority being the smallest item), which only takes O(log m) time in total, and you're finished.

考虑从列表中检索第m个最小元素的任务。通过简单地循环遍历列表并将每个项目添加到优先级队列(大小为m),您可以在O(n)时间内有效地创建列表中每个项目的队列(或者使用一些优化可能更少,尽管我我不确定这是非常有帮助的。然后,删除队列中具有最低优先级的元素(最高优先级是最小项目)是一件简单的事情,它总共需要O(log m)时间,并且您已完成。

So overall, the time complexity of the algorithm would be O(n + log n), but since log n << n (i.e. n grows a lot faster than log n), this reduces to simply O(n). I don't think you'll be able to get anything significantly more efficient than this in the general case.

总的来说,算法的时间复杂度为O(n + log n),但由于log n << n(即n比log n增长快很多),这简化为O(n)。在一般情况下,我认为你不会得到比这更有效的东西。

#4


You can use Binary heap, if u dont want to use fibonacci heap.

你可以使用Binary堆,如果你不想使用fibonacci堆。

Algo:

  1. Contruct the min binary heap from the array this operation will take O(n) time.

    从数组构造最小二进制堆此操作将花费O(n)时间。

  2. Since this is a min binary heap, the element at the root is the minimum value.

    由于这是一个最小二进制堆,因此根元素是最小值。

  3. So keep on removing element frm root, till u get ur kth minimum value. o(1) operation

    所以继续删除元素frm root,直到你得到你的kth最小值。 o(1)操作

  4. Make sure after every remove you re-store the heap kO(logn) operation.

    确保在每次删除后重新存储堆kO(logn)操作。

So running time here is O(klogn) + O(n)............so it is O(klogn)...

所以这里的运行时间是O(klogn)+ O(n)............所以它是O(klogn)......

#5


Two stacks can be used like this to locate the Nth smallest number in one pass.

可以像这样使用两个堆栈来定位一次传递中的第N个最小数字。

  • Start with empty Stack-A and Stack-B
  • 从空Stack-A和Stack-B开始

  • PUSH the first number into Stack-A
  • 将第一个数字推入Stack-A

  • The next number onwards, choose to PUSH into Stack-A only if the number is smaller than its top
  • 接下来的数字,只有当数字小于其顶部时,才选择将PUSH推入Stack-A

  • When you have to PUSH into Stack-A, run through these steps
    • While TOP of Stack-A is larger than new number, POP TOP of Stack-A and push it into Stack-B
    • 当Stack-A的TOP大于新数字时,Stack-A的POP TOP将其推入Stack-B

    • When Stack-A goes empty or its TOP is smaller than new number, PUSH in the new number and restore the contents of Stack-B over it
    • 当Stack-A变空或其TOP小于新数字时,在新数字中按PUSH并在其上恢复Stack-B的内容

    • At this point you have inserted the new number to its correct (sorted) place in Stack-A and Stack-B is empty again
    • 此时,您已将新数字插入Stack-A中的正确(已排序)位置,而Stack-B再次为空

    • If Stack-A depth is now sufficient you have reached the end of your search
    • 如果Stack-A深度现已足够您已达到搜索结束

  • 当你必须推入Stack-A时,执行这些步骤当Stack-A的TOP大于新数字时,Stack-A的POP TOP并将其推入Stack-B当Stack-A变空或其TOP更小时新数字,新数字中的PUSH以及通过它恢复Stack-B的内容此时您已将新数字插入到Stack-A中的正确(已排序)位置,而Stack-B再次为空如果Stack-A深度已经足够你已经到了搜索的末尾


I generally agree to Noldorins' optimization analysis.
This stack solution is towards a simple scheme that will work (with relatively more data movement -- across the two stacks). The heap scheme reduces the fetch for Nth smallest number to a tree traversal (log m).

我普遍同意Noldorins的优化分析。这个堆栈解决方案是朝着一个简单的方案工作(相对更多的数据移动 - 跨越两个堆栈)。堆方案将第N个最小数的提取减少到树遍历(log m)。

If your target is an optimal solution (say for a large set of numbers or maybe for a programming assignment, where optimization and the demonstration of it are critical) you should use the heap technique.

如果您的目标是一个最佳解决方案(例如,对于大量数字或可能用于编程分配,优化和演示是至关重要的),您应该使用堆技术。

The stack solution can be compressed in space requirements by implementing the two stacks within the same space of K elements (where K is the size of your data set). So, the downside is just extra stack movement as you insert.

通过在K个元素的相同空间内实现两个堆栈(其中K是数据集的大小),可以在空间要求中压缩堆栈解决方案。因此,缺点是插入时额外的堆栈移动。

#6


Here is the Ans to find Kth smallest element from an array:

#include<stdio.h>
#include<conio.h>
#include<iostream>
using namespace std;
int Nthmin=0,j=0,i;
int GetNthSmall(int numbers[],int NoOfElements,int Nthsmall);
int main()
{
    int size;
    cout<<"Enter Size of array\n";
    cin>>size;
    int *arr=(int*)malloc(sizeof(int)*size);
    cout<<"\nEnter array elements\n";
    for(i=0;i<size;i++)
        cin>>*(arr+i);
    cout<<"\n";
    for(i=0;i<size;i++)
        cout<<*(arr+i)<<" ";
    cout<<"\n";
    int n=sizeof(arr)/sizeof(int);
    int result=GetNthSmall(arr,size,3);
    printf("Result = %d",result);
    getch();
    return 0;
}

int GetNthSmall(int numbers[],int NoOfElements,int Nthsmall)
{
    int min=numbers[0];
    while(j<Nthsmall)
    {
        Nthmin=numbers[0];
        for(i=1;i<NoOfElements;i++)
        {
            if(j==0)
            {
                if(numbers[i]<min)
                {
                    min=numbers[i];
                }
                Nthmin=min;
            }
            else
            {
                if(numbers[i]<Nthmin && numbers[i]>min)
                    Nthmin=numbers[i];
            }
        }
        min=Nthmin;
        j++;
    }
    return Nthmin;
}