C中数据结构和算法分析中的二进制搜索

时间:2021-12-21 20:03:18

When addressed the Binary_Search in chapter 2(2.4.4),the author mentioned that "

在第2章(2.4.4)中提到Binary_Search时,提交人提到“

Notice that the variables cannot be declared unsigned(why?).In cases where the unsigned qualifier is questionable, we will not use it. As an example, if the unsigned qualifier is dependent on an array not beginning at zero, we will discard it. we will not use it. As an example, if the unsigned qualifier is dependent on an array not beginning at zero, we will discard it. We will also avoid using the unsigned type for variables that are counters in a for loop, because it is common to change the direction of a loop counter from increasing to decreasing and the unsigned qualifier is typically appropriate for the former case only. For example, the code in Exercise 2.10 does not work if i is declared unsigned. ".

请注意,变量不能声明为无符号(为什么?)。如果unsigned限定符有问题,我们将不会使用它。例如,如果unsigned限定符依赖于一个不从零开始的数组,我们将丢弃它。我们不会用它。例如,如果unsigned限定符依赖于一个不从零开始的数组,我们将丢弃它。我们还将避免将无符号类型用于for循环中的计数器变量,因为通常将循环计数器的方向从增加更改为减少,而无符号限定符通常仅适用于前一种情况。例如,如果i声明为unsigned,则练习2.10中的代码不起作用。 ”。

The code as follow:

代码如下:

int binary_search( input_type a[ ], input_type x, unsigned int n )
{
int low, mid, high; /* Can't be unsigned; why? */
/*1*/ low = 0; high = n - 1;
/*2*/ while( low <= high )
{
/*3*/ mid = (low + high)/2;
/*4*/ if( a[mid] < x )
/*5*/ low = mid + 1;
else
/*6*/ if ( a[mid] < x )
/*7*/ high = mid - 1;
else
/*8*/ return( mid ); /* found */
}
/*9*/ return( NOT_FOUND );
}

Q: I can't understand that the variable cannot be declared unsigned.Why the unsigned qualifier is questionable? And how does unsigned qualifier change the direction of a loop counter from increasing to decreasing?

问:我无法理解变量不能被声明为无符号。为什么无符号限定符有问题?无符号限定符如何将循环计数器的方向从增加变为减少?

2 个解决方案

#1


3  

If mid is 0, you want the line high = mid - 1; to set high to -1, which will cause the loop to stop.

如果mid为0,则需要行high = mid - 1;将high设置为-1,这将导致循环停止。

If the variables were unsigned, high would wrap around to the maximum unsigned value which would cause a read past the end of the buffer and a likely crash.

如果变量是无符号的,则高值将回绕到最大无符号值,这将导致读取超过缓冲区的末尾并可能发生崩溃。

As for for loops which count down, the following loop will never end:

对于倒计时的for循环,以下循环永远不会结束:

for (unsigned i = START_VAL; i >= 0; i--) {...}  //WRONG

Because i is unsigned, the condition i >= 0 will always be true.

因为我是无符号的,条件i> = 0将始终为真。

#2


2  

The author of the book is wrong and it seems he is a weak programmer.:)

这本书的作者是错的,似乎他是一个弱的程序员。:)

First of all it is a bad idea to use the type int as the size of an array. He should use the type size_t or at least the type ptrdiff_t defined in the header <stddef.h> because the value of size of an array can be greater than the value that the type int can accommodate.

首先,使用int类型作为数组的大小是个坏主意。他应该使用类型size_t或至少在头文件 中定义的类型ptrdiff_t,因为数组的大小值可能大于int类型可以容纳的值。

Take into account that all C standard functions (as for example string functions) that deal with array sizes define corresponding parameters as having the type size_t.

考虑到处理数组大小的所有C标准函数(例如字符串函数)将相应的参数定义为类型size_t。

Here is the declaration of the standard function bsearch.

这是标准函数bsearch的声明。

void *bsearch(const void *key, const void *base, size_t nmemb, size_t size, int (*compar)(const void *, const void *));

The both parameters, nmemb and size, have type size_t.

两个参数nmemb和size都具有size_t类型。

The problem is not with signed or unsigned int type used as the type of the array size. The problem is how the algorithm is implemented.

问题不在于使用有符号或无符号的int类型作为数组大小的类型。问题是如何实现算法。

For example he could implement the algorithm the following way as it is shown in the demonstrative program

例如,他可以按照以下方式实现算法,如演示程序中所示

#include <stdio.h>

size_t binary_search( const int a[], int x, size_t n )
{
    size_t low = 0, high = n;

    while ( low < high )
    {
        size_t middle = low + ( high - low ) / 2;

        if ( a[middle] < x )
        {
            low = middle + 1;
        }
        else if ( x < a[middle] )
        {
            high = middle;
        }
        else
        {
            return middle;
        }
    }

    return n;
}   

int main(void) 
{
    int a[] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    size_t N = sizeof( a ) / sizeof( *a );

    for ( int i = -1; i <= a[N - 1] + 1; i++ )
    {
        printf( "x = %d: %zu\n", i, binary_search( a, i, N ) );
    }

    return 0;
}

The program output is

程序输出是

x = -1: 10
x = 0: 0
x = 1: 1
x = 2: 2
x = 3: 3
x = 4: 4
x = 5: 5
x = 6: 6
x = 7: 7
x = 8: 8
x = 9: 9
x = 10: 10

As you see if a value is not found in the array then the function returns the index that is equal to the size of the array.

如您所见,如果在数组中找不到值,则该函数返回的索引等于数组的大小。

Usually the binary search algorithm that returns the index of the target element is defined as the lower bound algorithm.

通常,返回目标元素索引的二进制搜索算法被定义为下限算法。

Here is an example of an implementation of the binary search algorithm that returns the lower position where the target element is present or could be inserted.

以下是二进制搜索算法的实现示例,该算法返回目标元素存在或可插入的较低位置。

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <string.h>

size_t binary_search( const int a[], int x, size_t n )
{
    size_t low = 0, high = n;

    while ( low < high )
    {
        size_t middle = low + ( high - low ) / 2;

        if ( a[middle] < x )
        {
            low = middle + 1;
        }
        else
        {
            high = middle;
        }
    }

    return high;
}   

int main(void) 
{
    const size_t N = 10;
    int a[N];

    srand( ( unsigned int )time( NULL ) );

    for ( size_t i = 0; i < N; i++ )
    {
        int value = rand() % ( int )N;

        size_t n = binary_search( a, value, i );

        if ( n != i )
        {
            memmove( a + n + 1, a + n, ( i - n ) * sizeof( int ) );
        }

        a[n] = value;

        for ( size_t j = 0; j < i + 1; j++ )
        {
            printf( "%d ", a[j] );
        }
        putchar( '\n' );
    }

    return 0;
}

The program output might look like

程序输出可能看起来像

8 
1 8 
1 5 8 
0 1 5 8 
0 1 5 5 8 
0 0 1 5 5 8 
0 0 1 2 5 5 8 
0 0 1 2 2 5 5 8 
0 0 1 2 2 5 5 8 9 
0 0 1 2 2 5 5 5 8 9 

As you see neither signed int type is used in the implementation of the algorithm.

如您所见,在算法的实现中没有使用signed int类型。

As for the loop shown in the other answer like this

至于这样的另一个答案中显示的循环

for (unsigned i = START_VAL; i >= 0; i--) {...}  //WRONG

then again it is just written incorrectly. Instead of the for loop in this case there should be used do-while loop as for example

然后它再次写错了。在这种情况下,应该使用do-while循环代替for循环

unsigned i = START_VAL;
do
{
    // ...
} while ( i-- != 0 );

#1


3  

If mid is 0, you want the line high = mid - 1; to set high to -1, which will cause the loop to stop.

如果mid为0,则需要行high = mid - 1;将high设置为-1,这将导致循环停止。

If the variables were unsigned, high would wrap around to the maximum unsigned value which would cause a read past the end of the buffer and a likely crash.

如果变量是无符号的,则高值将回绕到最大无符号值,这将导致读取超过缓冲区的末尾并可能发生崩溃。

As for for loops which count down, the following loop will never end:

对于倒计时的for循环,以下循环永远不会结束:

for (unsigned i = START_VAL; i >= 0; i--) {...}  //WRONG

Because i is unsigned, the condition i >= 0 will always be true.

因为我是无符号的,条件i> = 0将始终为真。

#2


2  

The author of the book is wrong and it seems he is a weak programmer.:)

这本书的作者是错的,似乎他是一个弱的程序员。:)

First of all it is a bad idea to use the type int as the size of an array. He should use the type size_t or at least the type ptrdiff_t defined in the header <stddef.h> because the value of size of an array can be greater than the value that the type int can accommodate.

首先,使用int类型作为数组的大小是个坏主意。他应该使用类型size_t或至少在头文件 中定义的类型ptrdiff_t,因为数组的大小值可能大于int类型可以容纳的值。

Take into account that all C standard functions (as for example string functions) that deal with array sizes define corresponding parameters as having the type size_t.

考虑到处理数组大小的所有C标准函数(例如字符串函数)将相应的参数定义为类型size_t。

Here is the declaration of the standard function bsearch.

这是标准函数bsearch的声明。

void *bsearch(const void *key, const void *base, size_t nmemb, size_t size, int (*compar)(const void *, const void *));

The both parameters, nmemb and size, have type size_t.

两个参数nmemb和size都具有size_t类型。

The problem is not with signed or unsigned int type used as the type of the array size. The problem is how the algorithm is implemented.

问题不在于使用有符号或无符号的int类型作为数组大小的类型。问题是如何实现算法。

For example he could implement the algorithm the following way as it is shown in the demonstrative program

例如,他可以按照以下方式实现算法,如演示程序中所示

#include <stdio.h>

size_t binary_search( const int a[], int x, size_t n )
{
    size_t low = 0, high = n;

    while ( low < high )
    {
        size_t middle = low + ( high - low ) / 2;

        if ( a[middle] < x )
        {
            low = middle + 1;
        }
        else if ( x < a[middle] )
        {
            high = middle;
        }
        else
        {
            return middle;
        }
    }

    return n;
}   

int main(void) 
{
    int a[] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    size_t N = sizeof( a ) / sizeof( *a );

    for ( int i = -1; i <= a[N - 1] + 1; i++ )
    {
        printf( "x = %d: %zu\n", i, binary_search( a, i, N ) );
    }

    return 0;
}

The program output is

程序输出是

x = -1: 10
x = 0: 0
x = 1: 1
x = 2: 2
x = 3: 3
x = 4: 4
x = 5: 5
x = 6: 6
x = 7: 7
x = 8: 8
x = 9: 9
x = 10: 10

As you see if a value is not found in the array then the function returns the index that is equal to the size of the array.

如您所见,如果在数组中找不到值,则该函数返回的索引等于数组的大小。

Usually the binary search algorithm that returns the index of the target element is defined as the lower bound algorithm.

通常,返回目标元素索引的二进制搜索算法被定义为下限算法。

Here is an example of an implementation of the binary search algorithm that returns the lower position where the target element is present or could be inserted.

以下是二进制搜索算法的实现示例,该算法返回目标元素存在或可插入的较低位置。

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <string.h>

size_t binary_search( const int a[], int x, size_t n )
{
    size_t low = 0, high = n;

    while ( low < high )
    {
        size_t middle = low + ( high - low ) / 2;

        if ( a[middle] < x )
        {
            low = middle + 1;
        }
        else
        {
            high = middle;
        }
    }

    return high;
}   

int main(void) 
{
    const size_t N = 10;
    int a[N];

    srand( ( unsigned int )time( NULL ) );

    for ( size_t i = 0; i < N; i++ )
    {
        int value = rand() % ( int )N;

        size_t n = binary_search( a, value, i );

        if ( n != i )
        {
            memmove( a + n + 1, a + n, ( i - n ) * sizeof( int ) );
        }

        a[n] = value;

        for ( size_t j = 0; j < i + 1; j++ )
        {
            printf( "%d ", a[j] );
        }
        putchar( '\n' );
    }

    return 0;
}

The program output might look like

程序输出可能看起来像

8 
1 8 
1 5 8 
0 1 5 8 
0 1 5 5 8 
0 0 1 5 5 8 
0 0 1 2 5 5 8 
0 0 1 2 2 5 5 8 
0 0 1 2 2 5 5 8 9 
0 0 1 2 2 5 5 5 8 9 

As you see neither signed int type is used in the implementation of the algorithm.

如您所见,在算法的实现中没有使用signed int类型。

As for the loop shown in the other answer like this

至于这样的另一个答案中显示的循环

for (unsigned i = START_VAL; i >= 0; i--) {...}  //WRONG

then again it is just written incorrectly. Instead of the for loop in this case there should be used do-while loop as for example

然后它再次写错了。在这种情况下,应该使用do-while循环代替for循环

unsigned i = START_VAL;
do
{
    // ...
} while ( i-- != 0 );