在C中优化搜索算法

时间:2022-09-13 08:01:01

Can the performance of this sequential search algorithm (taken from The Practice of Programming) be improved using any of C's native utilities, e.g. if I set the i variable to be a register variable ?

可以使用C的任何本机实用程序来改进这种顺序搜索算法(取自编程实践)的性能,例如:如果我将i变量设置为寄存器变量?

int lookup(char *word, char*array[])
{
    int i

    for (i = 0; array[i] != NULL; i++)
        if (strcmp(word, array[i]) == 0)
            return i;

    return -1;
}

10 个解决方案

#1


24  

Yes, but only very slightly. A much bigger performance improvement can be achieved by using better algorithms (for example keeping the list sorted and doing a binary search).

是的,但只是非常轻微。通过使用更好的算法可以实现更大的性能提升(例如,保持列表排序并进行二分查找)。

In general optimizing a given algorithm only gets you so far. Choosing a better algorithm (even if it's not completely optimized) can give you a considerable (order of magnitude) performance improvement.

一般来说,优化给定的算法只能让你到目前为止。选择更好的算法(即使它没有完全优化)可以为您提供相当大的(数量级)性能改进。

#2


2  

I think, it will not make much of a difference. The compiler will already optimize it in that direction.

我认为,它不会产生太大的影响。编译器已经在那个方向上优化它。

Besides, the variable i does not have much impact, word stays constant throughout the function and the rest is too large to fit in any register. It is only a matter how large the cache is and if the whole array might fit in there.

此外,变量i没有太大的影响,单词在整个函数中保持不变,其余的太大而不适合任何寄存器。只有缓存有多大以及整个阵列是否适合那里。

String comparisons are rather expensive computationally.

字符串比较在计算上相当昂贵。

Can you perhaps use some kind of hashing for the array before searching?

您可以在搜索之前对阵列使用某种散列吗?

#3


2  

There is well-known technique as sentinal method. To use sentinal method, you must know about the length of "array[]". You can remove "array[i] != NULL" comparing by using sentinal.

作为sentinal方法有众所周知的技术。要使用sentinal方法,您必须知道“array []”的长度。您可以使用sentinal删除“array [i]!= NULL”进行比较。

int lookup(char *word, char*array[], int array_len)
{
    int i = 0;
    array[array_len] = word;
    for (;; ++i)
        if (strcmp(word, array[i]) == 0) 
            break;
    array[array_len] = NULL;
    return (i != array_len) ? i : -1;
}

#4


1  

If you're reading TPOP, you will next see how they make this search many times faster with different data structures and algorithms.

如果您正在阅读TPOP,接下来将会看到他们如何使用不同的数据结构和算法将搜索速度提高许多倍。

But you can make things a bit faster by replacing things like

但是你可以通过替换类似的东西来加快速度

for (i = 0; i < n; ++i)
    foo(a[i]);

with

char **p = a;
for (i = 0; i < n; ++i)
    foo(*p);
    ++p;

If there is a known value at the end of the array (e.g. NULL) you can eliminate the loop counter:

如果数组末尾有一个已知值(例如NULL),则可以消除循环计数器:

for (p = a; *p != NULL; ++p)
    foo(*p)

Good luck, that's a great book!

祝你好运,这是一本好书!

#5


0  

To optimize that code the best bet would be to rewrite the strcmp routine since you are only checking for equality and don't need to evaluate the entire word.

要优化该代码,最好的办法是重写strcmp例程,因为您只检查相等性而不需要评估整个单词。

Other than that you can't do much else. You can't sort as it appears you are looking for text within a larger text. Binary search won't work either since the text is unlikely to be sorted.

除此之外,你不能做太多其他事情。您无法排序,因为您在更大的文本中寻找文本。二进制搜索也不起作用,因为文本不太可能被排序。

My 2p (C-psuedocode):

我的2p(C-psuedocode):

wrd_end = wrd_ptr + wrd_len;
arr_end = arr_ptr - wrd_len;
while (arr_ptr < arr_end)
{
    wrd_beg = wrd_ptr; arr_beg = arr_ptr;
    while (wrd_ptr == arr_ptr)
    {
        wrd_ptr++; arr_ptr++;
        if (wrd_ptr == wrd_en)
            return wrd_beg;
    }
    wrd_ptr++;
}

#6


0  

Mark Harrison: Your for loop will never terminate! (++p is indented, but is not actually within the for :-)

马克哈里森:你的for循环永远不会终止! (++ p缩进,但实际上不在for :-)

Also, switching between pointers and indexing will generally have no effect on performance, nor will adding register keywords (as mat already mentions) -- the compiler is smart enough to apply these transformations where appropriate, and if you tell it enough about your cpu arch, it will do a better job of these than manual psuedo-micro-optimizations.

此外,在指针和索引之间切换通常不会影响性能,也不会添加注册关键字(如已经提到的那样) - 编译器足够聪明,可以在适当的地方应用这些转换,如果你对cpu arch有足够的了解,它会比手动的伪微优化更好地完成这些工作。

#7


0  

A faster way to match strings would be to store them Pascal style. If you don't need more than 255 characters per string, store them roughly like this, with the count in the first byte:

匹配字符串的更快方法是将它们存储为Pascal样式。如果每个字符串不需要超过255个字符,请将它们大致存储起来,并将计数放在第一个字节中:

char s[] = "\x05Hello";

Then you can do:

然后你可以这样做:

for(i=0; i<len; ++i) {
    s_len = strings[i][0];
    if(
        s_len == match_len
        && strings[i][s_len] == match[s_len-1]
        && 0 == memcmp(strings[i]+1, match, s_len-1)
    ) {
        return 1;
    }
}

And to get really fast, add memory prefetch hints for string start + 64, + 128 and the start of the next string. But that's just crazy. :-)

为了获得非常快的速度,请为字符串start + 64,+ 128和下一个字符串的开头添加内存预取提示。但那太疯狂了。 :-)

#8


0  

Another fast way to do it is to get your compiler to use a SSE2 optimized memcmp. Use fixed-length char arrays and align so the string starts on a 64-byte alignment. Then I believe you can get the good memcmp functions if you pass const char match[64] instead of const char *match into the function, or strncpy match into a 64,128,256,whatever byte array.

另一种快速方法是让编译器使用SSE2优化的memcmp。使用固定长度的char数组并对齐,以便字符串以64字节对齐方式开始。然后我相信你可以获得良好的memcmp函数,如果你将const char匹配[64]而不是const char * match传递给函数,或者strncpy匹配到64,128,256,无论字节数组是什么。

Thinking a bit more about this, these SSE2 match functions might be part of packages like Intel's and AMD's accelerator libraries. Check them out.

考虑到这一点,这些SSE2匹配功能可能是英特尔和AMD的加速器库等软件包的一部分。去看一下。

#9


0  

Realistically, setting I to be a register variable won't do anything that the compiler wouldn't do already.

实际上,将I设置为寄存器变量将不会执行编译器不会执行的任何操作。

If you are willing to spend some time upfront preprocessing the reference array, you should google "The World's Fastest Scrabble Program" and implement that. Spoiler: it's a DAG optimized for character lookups.

如果你愿意花一些时间预先处理参考数组,你应该谷歌“世界上最快的拼字游戏程序”并实现它。 Spoiler:它是针对字符查找优化的DAG。

#10


-1  

/* there is no more quick  */
int lookup(char *word, char*array[])
{
    int i;
    for(i=0; *(array++) != NULL;i++)
        if (strcmp(word, *array) == 0)
            return i;
    return -1;
}

#1


24  

Yes, but only very slightly. A much bigger performance improvement can be achieved by using better algorithms (for example keeping the list sorted and doing a binary search).

是的,但只是非常轻微。通过使用更好的算法可以实现更大的性能提升(例如,保持列表排序并进行二分查找)。

In general optimizing a given algorithm only gets you so far. Choosing a better algorithm (even if it's not completely optimized) can give you a considerable (order of magnitude) performance improvement.

一般来说,优化给定的算法只能让你到目前为止。选择更好的算法(即使它没有完全优化)可以为您提供相当大的(数量级)性能改进。

#2


2  

I think, it will not make much of a difference. The compiler will already optimize it in that direction.

我认为,它不会产生太大的影响。编译器已经在那个方向上优化它。

Besides, the variable i does not have much impact, word stays constant throughout the function and the rest is too large to fit in any register. It is only a matter how large the cache is and if the whole array might fit in there.

此外,变量i没有太大的影响,单词在整个函数中保持不变,其余的太大而不适合任何寄存器。只有缓存有多大以及整个阵列是否适合那里。

String comparisons are rather expensive computationally.

字符串比较在计算上相当昂贵。

Can you perhaps use some kind of hashing for the array before searching?

您可以在搜索之前对阵列使用某种散列吗?

#3


2  

There is well-known technique as sentinal method. To use sentinal method, you must know about the length of "array[]". You can remove "array[i] != NULL" comparing by using sentinal.

作为sentinal方法有众所周知的技术。要使用sentinal方法,您必须知道“array []”的长度。您可以使用sentinal删除“array [i]!= NULL”进行比较。

int lookup(char *word, char*array[], int array_len)
{
    int i = 0;
    array[array_len] = word;
    for (;; ++i)
        if (strcmp(word, array[i]) == 0) 
            break;
    array[array_len] = NULL;
    return (i != array_len) ? i : -1;
}

#4


1  

If you're reading TPOP, you will next see how they make this search many times faster with different data structures and algorithms.

如果您正在阅读TPOP,接下来将会看到他们如何使用不同的数据结构和算法将搜索速度提高许多倍。

But you can make things a bit faster by replacing things like

但是你可以通过替换类似的东西来加快速度

for (i = 0; i < n; ++i)
    foo(a[i]);

with

char **p = a;
for (i = 0; i < n; ++i)
    foo(*p);
    ++p;

If there is a known value at the end of the array (e.g. NULL) you can eliminate the loop counter:

如果数组末尾有一个已知值(例如NULL),则可以消除循环计数器:

for (p = a; *p != NULL; ++p)
    foo(*p)

Good luck, that's a great book!

祝你好运,这是一本好书!

#5


0  

To optimize that code the best bet would be to rewrite the strcmp routine since you are only checking for equality and don't need to evaluate the entire word.

要优化该代码,最好的办法是重写strcmp例程,因为您只检查相等性而不需要评估整个单词。

Other than that you can't do much else. You can't sort as it appears you are looking for text within a larger text. Binary search won't work either since the text is unlikely to be sorted.

除此之外,你不能做太多其他事情。您无法排序,因为您在更大的文本中寻找文本。二进制搜索也不起作用,因为文本不太可能被排序。

My 2p (C-psuedocode):

我的2p(C-psuedocode):

wrd_end = wrd_ptr + wrd_len;
arr_end = arr_ptr - wrd_len;
while (arr_ptr < arr_end)
{
    wrd_beg = wrd_ptr; arr_beg = arr_ptr;
    while (wrd_ptr == arr_ptr)
    {
        wrd_ptr++; arr_ptr++;
        if (wrd_ptr == wrd_en)
            return wrd_beg;
    }
    wrd_ptr++;
}

#6


0  

Mark Harrison: Your for loop will never terminate! (++p is indented, but is not actually within the for :-)

马克哈里森:你的for循环永远不会终止! (++ p缩进,但实际上不在for :-)

Also, switching between pointers and indexing will generally have no effect on performance, nor will adding register keywords (as mat already mentions) -- the compiler is smart enough to apply these transformations where appropriate, and if you tell it enough about your cpu arch, it will do a better job of these than manual psuedo-micro-optimizations.

此外,在指针和索引之间切换通常不会影响性能,也不会添加注册关键字(如已经提到的那样) - 编译器足够聪明,可以在适当的地方应用这些转换,如果你对cpu arch有足够的了解,它会比手动的伪微优化更好地完成这些工作。

#7


0  

A faster way to match strings would be to store them Pascal style. If you don't need more than 255 characters per string, store them roughly like this, with the count in the first byte:

匹配字符串的更快方法是将它们存储为Pascal样式。如果每个字符串不需要超过255个字符,请将它们大致存储起来,并将计数放在第一个字节中:

char s[] = "\x05Hello";

Then you can do:

然后你可以这样做:

for(i=0; i<len; ++i) {
    s_len = strings[i][0];
    if(
        s_len == match_len
        && strings[i][s_len] == match[s_len-1]
        && 0 == memcmp(strings[i]+1, match, s_len-1)
    ) {
        return 1;
    }
}

And to get really fast, add memory prefetch hints for string start + 64, + 128 and the start of the next string. But that's just crazy. :-)

为了获得非常快的速度,请为字符串start + 64,+ 128和下一个字符串的开头添加内存预取提示。但那太疯狂了。 :-)

#8


0  

Another fast way to do it is to get your compiler to use a SSE2 optimized memcmp. Use fixed-length char arrays and align so the string starts on a 64-byte alignment. Then I believe you can get the good memcmp functions if you pass const char match[64] instead of const char *match into the function, or strncpy match into a 64,128,256,whatever byte array.

另一种快速方法是让编译器使用SSE2优化的memcmp。使用固定长度的char数组并对齐,以便字符串以64字节对齐方式开始。然后我相信你可以获得良好的memcmp函数,如果你将const char匹配[64]而不是const char * match传递给函数,或者strncpy匹配到64,128,256,无论字节数组是什么。

Thinking a bit more about this, these SSE2 match functions might be part of packages like Intel's and AMD's accelerator libraries. Check them out.

考虑到这一点,这些SSE2匹配功能可能是英特尔和AMD的加速器库等软件包的一部分。去看一下。

#9


0  

Realistically, setting I to be a register variable won't do anything that the compiler wouldn't do already.

实际上,将I设置为寄存器变量将不会执行编译器不会执行的任何操作。

If you are willing to spend some time upfront preprocessing the reference array, you should google "The World's Fastest Scrabble Program" and implement that. Spoiler: it's a DAG optimized for character lookups.

如果你愿意花一些时间预先处理参考数组,你应该谷歌“世界上最快的拼字游戏程序”并实现它。 Spoiler:它是针对字符查找优化的DAG。

#10


-1  

/* there is no more quick  */
int lookup(char *word, char*array[])
{
    int i;
    for(i=0; *(array++) != NULL;i++)
        if (strcmp(word, *array) == 0)
            return i;
    return -1;
}