为什么std::string操作执行不好?

时间:2022-10-24 19:57:10

I made a test to compare string operations in several languages for choosing a language for the server-side application. The results seemed normal until I finally tried C++, which surprised me a lot. So I wonder if I had missed any optimization and come here for help.

我做了一个测试来比较几种语言中的字符串操作,以便为服务器端应用程序选择一种语言。结果似乎很正常,直到我终于尝试了c++,这让我很惊讶。所以我想知道我是否错过了优化,来这里寻求帮助。

The test are mainly intensive string operations, including concatenate and searching. The test is performed on Ubuntu 11.10 amd64, with GCC's version 4.6.1. The machine is Dell Optiplex 960, with 4G RAM, and Quad-core CPU.

测试主要是密集的字符串操作,包括连接和搜索。测试是在Ubuntu 11.10 amd64上执行的,GCC的版本是4.6.1。这台机器是Dell Optiplex 960,有4G内存和四核CPU。

in Python (2.7.2):

def test():
    x = ""
    limit = 102 * 1024
    while len(x) < limit:
        x += "X"
        if x.find("ABCDEFGHIJKLMNOPQRSTUVWXYZ", 0) > 0:
            print("Oh my god, this is impossible!")
    print("x's length is : %d" % len(x))

test()

which gives result:

使结果:

x's length is : 104448

real    0m8.799s
user    0m8.769s
sys     0m0.008s

in Java (OpenJDK-7):

public class test {
    public static void main(String[] args) {
        int x = 0;
        int limit = 102 * 1024;
        String s="";
        for (; s.length() < limit;) {
            s += "X";
            if (s.indexOf("ABCDEFGHIJKLMNOPQRSTUVWXYZ") > 0)
            System.out.printf("Find!\n");
        }
        System.out.printf("x's length = %d\n", s.length());
    }
}

which gives result:

使结果:

x's length = 104448

real    0m50.436s
user    0m50.431s
sys     0m0.488s

in Javascript (Nodejs 0.6.3)

function test()
{
    var x = "";
    var limit = 102 * 1024;
    while (x.length < limit) {
        x += "X";
        if (x.indexOf("ABCDEFGHIJKLMNOPQRSTUVWXYZ", 0) > 0)
            console.log("OK");
    }
    console.log("x's length = " + x.length);
}();

which gives result:

使结果:

x's length = 104448

real    0m3.115s
user    0m3.084s
sys     0m0.048s

in C++ (g++ -Ofast)

It's not surprising that Nodejs performas better than Python or Java. But I expected libstdc++ would give much better performance than Nodejs, whose result really suprised me.

Nodejs performas比Python或Java更好一点也不奇怪。但我希望libstdc++能比Nodejs提供更好的性能,其结果确实让我感到惊讶。

#include <iostream>
#include <string>
using namespace std;
void test()
{
    int x = 0;
    int limit = 102 * 1024;
    string s("");
    for (; s.size() < limit;) {
        s += "X";
        if (s.find("ABCDEFGHIJKLMNOPQRSTUVWXYZ", 0) != string::npos)
            cout << "Find!" << endl;
    }
    cout << "x's length = " << s.size() << endl;
}

int main()
{
    test();
}

which gives result:

使结果:

x length = 104448

real    0m5.905s
user    0m5.900s
sys     0m0.000s

Brief Summary

OK, now let's see the summary:

好的,现在让我们看一下总结:

  • javascript on Nodejs(V8): 3.1s
  • Nodejs javascript(V8):3.1秒
  • Python on CPython 2.7.2 : 8.8s
  • Python对CPython 2.7.2: 8.8s。
  • C++ with libstdc++: 5.9s
  • c++ libstdc + +:5.9 s
  • Java on OpenJDK 7: 50.4s
  • Java在OpenJDK 7: 50.4s。

Surprisingly! I tried "-O2, -O3" in C++ but noting helped. C++ seems about only 50% performance of javascript in V8, and even poor than CPython. Could anyone explain to me if I had missed some optimization in GCC or is this just the case? Thank you a lot.

令人惊讶的!我在c++中尝试了“-O2, -O3”,但没有注意到。在V8中,c++似乎只有50%的性能,甚至比CPython还差。有没有人能跟我解释一下,如果我错过了海湾合作委员会的一些优化,或者仅仅是这个案例?谢谢你很多。

12 个解决方案

#1


70  

It's not that std::string performs poorly (as much as I dislike C++), it's that string handling is so heavily optimized for those other languages.

不是std::string的性能很差(就像我不喜欢c++一样),它是字符串处理对其他语言进行了大量优化。

Your comparisons of string performance are misleading, and presumptuous if they are intended to represent more than just that.

你对字符串性能的比较是有误导性的,如果他们想要代表的不仅仅是这个,那么就太冒昧了。

I know for a fact that Python string objects are completely implemented in C, and indeed on Python 2.7, numerous optimizations exist due to the lack of separation between unicode strings and bytes. If you ran this test on Python 3.x you will find it considerably slower.

我知道,Python字符串对象完全是在C中实现的,实际上在Python 2.7中,由于unicode字符串和字节之间缺乏分离,所以存在许多优化。如果您在python3上运行这个测试。你会发现它慢得多。

Javascript has numerous heavily optimized implementations. It's to be expected that string handling is excellent here.

Javascript有大量优化的实现。这里的字符串处理非常好。

Your Java result may be due to improper string handling, or some other poor case. I expect that a Java expert could step in and fix this test with a few changes.

您的Java结果可能是由于字符串处理不当或其他一些糟糕的情况造成的。我希望Java专家能够介入并通过一些更改来修复这个测试。

As for your C++ example, I'd expect performance to slightly exceed the Python version. It does the same operations, with less interpreter overhead. This is reflected in your results. Preceding the test with s.reserve(limit); would remove reallocation overhead.

对于您的c++示例,我希望性能略微超过Python版本。它执行相同的操作,开销更小。这反映在你的结果中。在测试前与。reserve(限制);将重新分配开销。

I'll repeat that you're only testing a single facet of the languages' implementations. The results for this test do not reflect the overall language speed.

我将重申,您只是在测试语言实现的一个方面。这个测试的结果并不反映整体的语言速度。

I've provided a C version to show how silly such pissing contests can be:

我提供了一个C版本,以显示这样的pissing竞赛是多么愚蠢:

#define _GNU_SOURCE
#include <string.h>
#include <stdio.h>

void test()
{
    int limit = 102 * 1024;
    char s[limit];
    size_t size = 0;
    while (size < limit) {
        s[size++] = 'X';
        if (memmem(s, size, "ABCDEFGHIJKLMNOPQRSTUVWXYZ", 26)) {
            fprintf(stderr, "zomg\n");
            return;
        }
    }
    printf("x's length = %zu\n", size);
}

int main()
{
    test();
    return 0;
}

Timing:

时间:

matt@stanley:~/Desktop$ time ./smash 
x's length = 104448

real    0m0.681s
user    0m0.680s
sys     0m0.000s

#2


34  

So I went and played a bit with this on ideone.org.

所以我在ideone.org上玩了一些。

Here a slightly modified version of your original C++ program, but with the appending in the loop eliminated, so it only measures the call to std::string::find(). Note that I had to cut the number of iterations to ~40%, otherwise ideone.org would kill the process.

这里有一个稍微修改过的c++程序版本,但是在循环中添加了append,所以它只测量了std:: find()的调用。请注意,我必须将迭代次数减少到~40%,否则ideone.org将会扼杀这个过程。

#include <iostream>
#include <string>

int main()
{
    const std::string::size_type limit = 42 * 1024;
    unsigned int found = 0;

    //std::string s;
    std::string s(limit, 'X');
    for (std::string::size_type i = 0; i < limit; ++i) {
        //s += 'X';
        if (s.find("ABCDEFGHIJKLMNOPQRSTUVWXYZ", 0) != std::string::npos)
            ++found;
    }

    if(found > 0)
        std::cout << "Found " << found << " times!\n";
    std::cout << "x's length = " << s.size() << '\n';

    return 0;
}

My results at ideone.org are time: 3.37s. (Of course, this is highly questionably, but indulge me for a moment and wait for the other result.)

我在ideone.org的结果是时间:3.37秒。(当然,这是很有问题的,但请先让我享受一下,等待另一个结果。)

Now we take this code and swap the commented lines, to test appending, rather than finding. Note that, this time, I had increased the number of iterations tenfold in trying to see any time result at all.

现在,我们使用这个代码并交换注释的行,以测试appending,而不是查找。注意,这一次,我在尝试查看任何时间结果时增加了十倍的迭代次数。

#include <iostream>
#include <string>

int main()
{
    const std::string::size_type limit = 1020 * 1024;
    unsigned int found = 0;

    std::string s;
    //std::string s(limit, 'X');
    for (std::string::size_type i = 0; i < limit; ++i) {
        s += 'X';
        //if (s.find("ABCDEFGHIJKLMNOPQRSTUVWXYZ", 0) != std::string::npos)
        //    ++found;
    }

    if(found > 0)
        std::cout << "Found " << found << " times!\n";
    std::cout << "x's length = " << s.size() << '\n';

    return 0;
}

My results at ideone.org, despite the tenfold increase in iterations, are time: 0s.

我在ideone.org网站上的结果,尽管迭代次数增加了十倍,但却是时间:0。

My conclusion: This benchmark is, in C++, highly dominated by the searching operation, the appending of the character in the loop has no influence on the result at all. Was that really your intention?

我的结论是:在c++中,这个基准是由搜索操作高度控制的,在循环中添加字符对结果没有影响。那真的是你的意图吗?

#3


14  

The idiomatic C++ solution would be:

惯用的c++解决方案是:

#include <iostream>
#include <string>
#include <algorithm>

int main()
{
    const int limit = 102 * 1024;
    std::string s;
    s.reserve(limit);

    const std::string pattern("ABCDEFGHIJKLMNOPQRSTUVWXYZ");

    for (int i = 0; i < limit; ++i) {
        s += 'X';
        if (std::search(s.begin(), s.end(), pattern.begin(), pattern.end()) != s.end())
            std::cout << "Omg Wtf found!";
    }
    std::cout << "X's length = " << s.size();
    return 0;
}

I could speed this up considerably by putting the string on the stack, and using memmem -- but there seems to be no need. Running on my machine, this is over 10x the speed of the python solution already..

我可以通过将字符串放在堆栈上,并使用memmem来加快这个速度,但是似乎没有必要。在我的机器上运行,这已经超过了python解决方案速度的10x。

[On my laptop]

(在我的笔记本电脑)

time ./test X's length = 104448 real 0m2.055s user 0m2.049s sys 0m0.001s

时间。/测试X的长度= 104448 real 0m2.055s用户0m2.049s sys 0m0.001s。

#4


8  

That is the most obvious one: please try to do s.reserve(limit); before main loop.

这是最明显的一个:请试着做。在主循环。

Documentation is here.

文档在这里。

I should mention that direct usage of standard classes in C++ in the same way you are used to do it in Java or Python will often give you sub-par performance if you are unaware of what is done behind the desk. There is no magical performance in language itself, it just gives you right tools.

我应该提到,在c++中直接使用标准类,就像在Java或Python中使用一样,如果您不知道在这张桌子后面所做的事情,那么它通常会给您低于标准的性能。语言本身并没有什么神奇的表现,它只是给你正确的工具。

#5


6  

What you are missing here is the inherent complexity of the find search.

这里缺少的是find search的内在复杂性。

You are executing the search 102 * 1024 (104 448) times. A naive search algorithm will, each time, try to match the pattern starting from the first character, then the second, etc...

您正在执行搜索102 * 1024(104 448)次。一个简单的搜索算法,每次尝试匹配从第一个字符开始的模式,然后第二个,等等…

Therefore, you have a string that is going from length 1 to N, and at each step you search the pattern against this string, which is a linear operation in C++. That is N * (N+1) / 2 = 5 454 744 576 comparisons. I am not as surprised as you are that this would take some time...

因此,你有一个从1到N的字符串,在每一步中,你都要在这个字符串中搜索这个模式,这是一个c++的线性操作。这是N * (N+1) / 2 = 5 454 744 576比较。我不像你那么惊讶这需要一些时间…

Let us verify the hypothesis by using the overload of find that searches for a single A:

让我们用“find”的重载来验证这个假设:

Original: 6.94938e+06 ms
Char    : 2.10709e+06 ms

About 3 times faster, so we are within the same order of magnitude. Therefore the use of a full string is not really interesting.

大约快3倍,所以我们的数量级是相同的。因此,使用完整字符串并不十分有趣。

Conclusion ? Maybe that find could be optimized a bit. But the problem is not worth it.

结论?也许这个发现可以优化一点。但这个问题不值得。

Note: and to those who tout Boyer Moore, I am afraid that the needle is too small, so it won't help much. May cut an order of magnitude (26 characters), but no more.

注意:对于那些兜售摩尔的人,我担心针太小了,所以不会有太大帮助。可以切割一个数量级(26个字符),但不再。

#6


5  

My first thought is that there isn't a problem.

我的第一个想法是没有问题。

C++ gives second-best performance, nearly ten times faster than Java. Maybe all but Java are running close to the best performance achievable for that functionality, and you should be looking at how to fix the Java issue (hint - StringBuilder).

c++提供了第二佳的性能,几乎是Java的10倍。也许除了Java之外,几乎所有的功能都接近于该功能的最佳性能,您应该考虑如何修复Java问题(提示- StringBuilder)。

In the C++ case, there are some things to try to improve performance a bit. In particular...

在c++的例子中,有一些东西可以尝试提高性能。特别是……

  • s += 'X'; rather than s += "X";
  • s + =“X”;而不是s += "X";
  • Declare string searchpattern ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"); outside the loop, and pass this for the find calls. An std::string instance knows it's own length, whereas a C string requires a linear-time check to determine that, and this may (or may not) be relevant to std::string::find performance.
  • 声明字符串searchpattern(“ABCDEFGHIJKLMNOPQRSTUVWXYZ”);在循环之外,并将其传递给find调用。一个std::string实例知道它自己的长度,而C字符串需要一个线性时间检查来确定它,而这个可能(或者可能不)与std有关::查找性能。
  • Try using std::stringstream, for a similar reason to why you should be using StringBuilder for Java, though most likely the repeated conversions back to string will create more problems.
  • 尝试使用std::stringstream,因为类似的原因,为什么您应该使用StringBuilder作为Java,尽管很有可能重复的转换会导致更多的问题。

Overall, the result isn't too surprising though. JavaScript, with a good JIT compiler, may be able to optimise a little better than C++ static compilation is allowed to in this case.

总的来说,结果并不令人惊讶。有一个好的JIT编译器的JavaScript,在这种情况下可能会比c++静态编译好一点。

With enough work, you should always be able to optimise C++ better than JavaScript, but there will always be cases where that doesn't just naturally happen and where it may take a fair bit of knowledge and effort to achieve that.

有了足够的工作,您应该总是能够比JavaScript更好地优化c++,但是总会有这样的情况,不只是自然发生,而且可能需要一些知识和努力来实现这一点。

#7


4  

For C++, try to use std::string for "ABCDEFGHIJKLMNOPQRSTUVWXYZ" - in my implementation string::find(const charT* s, size_type pos = 0) const calculates length of string argument.

对于c++,尝试使用std::string For“ABCDEFGHIJKLMNOPQRSTUVWXYZ”——在我的实现字符串中:find(const charT* s, size_type pos = 0) const计算字符串参数的长度。

#8


3  

I just tested the C++ example myself. If I remove the the call to std::sting::find, the program terminates in no time. Thus the allocations during string concatenation is no problem here.

我自己测试了c++的例子。如果我把呼叫转移到std::sting::发现,程序马上终止。因此,在字符串连接期间的分配是没有问题的。

If I add a variable sdt::string abc = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" and replace the occurence of "ABC...XYZ" in the call of std::string::find, the program needs almost the same time to finish as the original example. This again shows that allocation as well as computing the string's length does not add much to the runtime.

如果我添加一个变量sdt::string abc = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"并替换" abc…"在std的调用中:::find,该程序几乎需要与原始示例相同的时间完成。这再次说明了分配以及计算字符串长度对运行时的影响不大。

Therefore, it seems that the string search algorithm used by libstdc++ is not as fast for your example as the search algorithms of javascript or python. Maybe you want to try C++ again with your own string search algorithm which fits your purpose better.

因此,libstdc++所使用的字符串搜索算法似乎不像javascript或python的搜索算法那样快。也许你想用你自己的字符串搜索算法来尝试c++,它更适合你的目的。

#9


3  

C/C++ language are not easy and take years make fast programs.

C语言是不容易的,需要几年的时间来制定快速的程序。

with strncmp(3) version modified from c version:

从c版本修改的strncmp(3)版本:

#define _GNU_SOURCE
#include <string.h>
#include <stdio.h>

void test()
{
    int limit = 102 * 1024;
    char s[limit];
    size_t size = 0;
    while (size < limit) {
        s[size++] = 'X';
        if (!strncmp(s, "ABCDEFGHIJKLMNOPQRSTUVWXYZ", 26)) {
            fprintf(stderr, "zomg\n");
            return;
        }
    }
    printf("x's length = %zu\n", size);
}

int main()
{
    test();
    return 0;
}

#10


1  

It seems that in nodejs better algorithm for search substrings. You can implement self and try it.

在nodejs中,搜索子字符串的算法似乎更好。您可以实现self并尝试它。

#11


1  

Your test code is checking a pathological scenario of excessive string concatenation. (The string-search part of the test could have probably been omitted, I bet you it contributes almost nothing to the final results.) Excessive string concatenation is a pitfall that most languages warn very strongly against, and provide very well known alternatives for, (i.e. StringBuilder,) so what you are essentially testing here is how badly these languages fail under scenarios of perfectly expected failure. That's pointless.

您的测试代码正在检查过度字符串连接的病理情况。(测试的字符串搜索部分可能被省略了,我打赌它对最终结果几乎没有贡献。)过多的字符串连接是大多数语言强烈反对的一个陷阱,并且提供了非常有名的替代方法(例如StringBuilder),所以您实际上在测试的是这些语言在完全预期的失败场景下会有多么糟糕。这是毫无意义的。

An example of a similarly pointless test would be to compare the performance of various languages when throwing and catching an exception in a tight loop. All languages warn that exception throwing and catching is abysmally slow. They do not specify how slow, they just warn you not to expect anything. Therefore, to go ahead and test precisely that, would be pointless.

一个类似无意义的测试的例子是,在一个紧凑的循环中抛出和捕获一个异常时,比较各种语言的性能。所有语言都警告说,抛出和捕获异常缓慢。他们没有具体说明他们的反应有多慢,只是警告你不要期待任何事情。因此,要进行精确的测试,就毫无意义了。

So, it would make a lot more sense to repeat your test substituting the mindless string concatenation part (s += "X") with whatever construct is offered by each one of these languages precisely for avoiding string concatenation. (Such as class StringBuilder.)

因此,如果使用这些语言中的每一种语言提供的任何构造都可以精确地避免字符串连接,那么重复您的测试就更有意义了。(如类StringBuilder。)

#12


0  

As mentioned by sbi, the test case is dominated by the search operation. I was curious how fast the text allocation compares between C++ and Javascript.

正如sbi所提到的,测试用例以搜索操作为主。我很好奇,在c++和Javascript之间的文本分配有多快。

System: Raspberry Pi 2, g++ 4.6.3, node v0.12.0, g++ -std=c++0x -O2 perf.cpp

系统:树莓Pi 2, g++ 4.6.3,节点v0.12.0, g++ -std=c++0x -O2性能。

C++ : 770ms

c++:770 ms

C++ without reserve: 1196ms

毫无保留地c++:1196 ms

Javascript: 2310ms

Javascript:2310毫秒

C++

c++

#include <iostream>
#include <string>
#include <chrono>
using namespace std;
using namespace std::chrono;

void test()
{
    high_resolution_clock::time_point t1 = high_resolution_clock::now();
    int x = 0;
    int limit = 1024 * 1024 * 100;
    string s("");
    s.reserve(1024 * 1024 * 101);
    for(int i=0; s.size()< limit; i++){
        s += "SUPER NICE TEST TEXT";
    }

    high_resolution_clock::time_point t2 = high_resolution_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::milliseconds>( t2 - t1 ).count();
    cout << duration << endl;
}

int main()
{
    test();
}

JavaScript

JavaScript

function test()
{
    var time = process.hrtime();
    var x = "";
    var limit = 1024 * 1024 * 100;
    for(var i=0; x.length < limit; i++){
        x += "SUPER NICE TEST TEXT";
    }

    var diff = process.hrtime(time);
    console.log('benchmark took %d ms', diff[0] * 1e3 + diff[1] / 1e6 );
}

test();

#1


70  

It's not that std::string performs poorly (as much as I dislike C++), it's that string handling is so heavily optimized for those other languages.

不是std::string的性能很差(就像我不喜欢c++一样),它是字符串处理对其他语言进行了大量优化。

Your comparisons of string performance are misleading, and presumptuous if they are intended to represent more than just that.

你对字符串性能的比较是有误导性的,如果他们想要代表的不仅仅是这个,那么就太冒昧了。

I know for a fact that Python string objects are completely implemented in C, and indeed on Python 2.7, numerous optimizations exist due to the lack of separation between unicode strings and bytes. If you ran this test on Python 3.x you will find it considerably slower.

我知道,Python字符串对象完全是在C中实现的,实际上在Python 2.7中,由于unicode字符串和字节之间缺乏分离,所以存在许多优化。如果您在python3上运行这个测试。你会发现它慢得多。

Javascript has numerous heavily optimized implementations. It's to be expected that string handling is excellent here.

Javascript有大量优化的实现。这里的字符串处理非常好。

Your Java result may be due to improper string handling, or some other poor case. I expect that a Java expert could step in and fix this test with a few changes.

您的Java结果可能是由于字符串处理不当或其他一些糟糕的情况造成的。我希望Java专家能够介入并通过一些更改来修复这个测试。

As for your C++ example, I'd expect performance to slightly exceed the Python version. It does the same operations, with less interpreter overhead. This is reflected in your results. Preceding the test with s.reserve(limit); would remove reallocation overhead.

对于您的c++示例,我希望性能略微超过Python版本。它执行相同的操作,开销更小。这反映在你的结果中。在测试前与。reserve(限制);将重新分配开销。

I'll repeat that you're only testing a single facet of the languages' implementations. The results for this test do not reflect the overall language speed.

我将重申,您只是在测试语言实现的一个方面。这个测试的结果并不反映整体的语言速度。

I've provided a C version to show how silly such pissing contests can be:

我提供了一个C版本,以显示这样的pissing竞赛是多么愚蠢:

#define _GNU_SOURCE
#include <string.h>
#include <stdio.h>

void test()
{
    int limit = 102 * 1024;
    char s[limit];
    size_t size = 0;
    while (size < limit) {
        s[size++] = 'X';
        if (memmem(s, size, "ABCDEFGHIJKLMNOPQRSTUVWXYZ", 26)) {
            fprintf(stderr, "zomg\n");
            return;
        }
    }
    printf("x's length = %zu\n", size);
}

int main()
{
    test();
    return 0;
}

Timing:

时间:

matt@stanley:~/Desktop$ time ./smash 
x's length = 104448

real    0m0.681s
user    0m0.680s
sys     0m0.000s

#2


34  

So I went and played a bit with this on ideone.org.

所以我在ideone.org上玩了一些。

Here a slightly modified version of your original C++ program, but with the appending in the loop eliminated, so it only measures the call to std::string::find(). Note that I had to cut the number of iterations to ~40%, otherwise ideone.org would kill the process.

这里有一个稍微修改过的c++程序版本,但是在循环中添加了append,所以它只测量了std:: find()的调用。请注意,我必须将迭代次数减少到~40%,否则ideone.org将会扼杀这个过程。

#include <iostream>
#include <string>

int main()
{
    const std::string::size_type limit = 42 * 1024;
    unsigned int found = 0;

    //std::string s;
    std::string s(limit, 'X');
    for (std::string::size_type i = 0; i < limit; ++i) {
        //s += 'X';
        if (s.find("ABCDEFGHIJKLMNOPQRSTUVWXYZ", 0) != std::string::npos)
            ++found;
    }

    if(found > 0)
        std::cout << "Found " << found << " times!\n";
    std::cout << "x's length = " << s.size() << '\n';

    return 0;
}

My results at ideone.org are time: 3.37s. (Of course, this is highly questionably, but indulge me for a moment and wait for the other result.)

我在ideone.org的结果是时间:3.37秒。(当然,这是很有问题的,但请先让我享受一下,等待另一个结果。)

Now we take this code and swap the commented lines, to test appending, rather than finding. Note that, this time, I had increased the number of iterations tenfold in trying to see any time result at all.

现在,我们使用这个代码并交换注释的行,以测试appending,而不是查找。注意,这一次,我在尝试查看任何时间结果时增加了十倍的迭代次数。

#include <iostream>
#include <string>

int main()
{
    const std::string::size_type limit = 1020 * 1024;
    unsigned int found = 0;

    std::string s;
    //std::string s(limit, 'X');
    for (std::string::size_type i = 0; i < limit; ++i) {
        s += 'X';
        //if (s.find("ABCDEFGHIJKLMNOPQRSTUVWXYZ", 0) != std::string::npos)
        //    ++found;
    }

    if(found > 0)
        std::cout << "Found " << found << " times!\n";
    std::cout << "x's length = " << s.size() << '\n';

    return 0;
}

My results at ideone.org, despite the tenfold increase in iterations, are time: 0s.

我在ideone.org网站上的结果,尽管迭代次数增加了十倍,但却是时间:0。

My conclusion: This benchmark is, in C++, highly dominated by the searching operation, the appending of the character in the loop has no influence on the result at all. Was that really your intention?

我的结论是:在c++中,这个基准是由搜索操作高度控制的,在循环中添加字符对结果没有影响。那真的是你的意图吗?

#3


14  

The idiomatic C++ solution would be:

惯用的c++解决方案是:

#include <iostream>
#include <string>
#include <algorithm>

int main()
{
    const int limit = 102 * 1024;
    std::string s;
    s.reserve(limit);

    const std::string pattern("ABCDEFGHIJKLMNOPQRSTUVWXYZ");

    for (int i = 0; i < limit; ++i) {
        s += 'X';
        if (std::search(s.begin(), s.end(), pattern.begin(), pattern.end()) != s.end())
            std::cout << "Omg Wtf found!";
    }
    std::cout << "X's length = " << s.size();
    return 0;
}

I could speed this up considerably by putting the string on the stack, and using memmem -- but there seems to be no need. Running on my machine, this is over 10x the speed of the python solution already..

我可以通过将字符串放在堆栈上,并使用memmem来加快这个速度,但是似乎没有必要。在我的机器上运行,这已经超过了python解决方案速度的10x。

[On my laptop]

(在我的笔记本电脑)

time ./test X's length = 104448 real 0m2.055s user 0m2.049s sys 0m0.001s

时间。/测试X的长度= 104448 real 0m2.055s用户0m2.049s sys 0m0.001s。

#4


8  

That is the most obvious one: please try to do s.reserve(limit); before main loop.

这是最明显的一个:请试着做。在主循环。

Documentation is here.

文档在这里。

I should mention that direct usage of standard classes in C++ in the same way you are used to do it in Java or Python will often give you sub-par performance if you are unaware of what is done behind the desk. There is no magical performance in language itself, it just gives you right tools.

我应该提到,在c++中直接使用标准类,就像在Java或Python中使用一样,如果您不知道在这张桌子后面所做的事情,那么它通常会给您低于标准的性能。语言本身并没有什么神奇的表现,它只是给你正确的工具。

#5


6  

What you are missing here is the inherent complexity of the find search.

这里缺少的是find search的内在复杂性。

You are executing the search 102 * 1024 (104 448) times. A naive search algorithm will, each time, try to match the pattern starting from the first character, then the second, etc...

您正在执行搜索102 * 1024(104 448)次。一个简单的搜索算法,每次尝试匹配从第一个字符开始的模式,然后第二个,等等…

Therefore, you have a string that is going from length 1 to N, and at each step you search the pattern against this string, which is a linear operation in C++. That is N * (N+1) / 2 = 5 454 744 576 comparisons. I am not as surprised as you are that this would take some time...

因此,你有一个从1到N的字符串,在每一步中,你都要在这个字符串中搜索这个模式,这是一个c++的线性操作。这是N * (N+1) / 2 = 5 454 744 576比较。我不像你那么惊讶这需要一些时间…

Let us verify the hypothesis by using the overload of find that searches for a single A:

让我们用“find”的重载来验证这个假设:

Original: 6.94938e+06 ms
Char    : 2.10709e+06 ms

About 3 times faster, so we are within the same order of magnitude. Therefore the use of a full string is not really interesting.

大约快3倍,所以我们的数量级是相同的。因此,使用完整字符串并不十分有趣。

Conclusion ? Maybe that find could be optimized a bit. But the problem is not worth it.

结论?也许这个发现可以优化一点。但这个问题不值得。

Note: and to those who tout Boyer Moore, I am afraid that the needle is too small, so it won't help much. May cut an order of magnitude (26 characters), but no more.

注意:对于那些兜售摩尔的人,我担心针太小了,所以不会有太大帮助。可以切割一个数量级(26个字符),但不再。

#6


5  

My first thought is that there isn't a problem.

我的第一个想法是没有问题。

C++ gives second-best performance, nearly ten times faster than Java. Maybe all but Java are running close to the best performance achievable for that functionality, and you should be looking at how to fix the Java issue (hint - StringBuilder).

c++提供了第二佳的性能,几乎是Java的10倍。也许除了Java之外,几乎所有的功能都接近于该功能的最佳性能,您应该考虑如何修复Java问题(提示- StringBuilder)。

In the C++ case, there are some things to try to improve performance a bit. In particular...

在c++的例子中,有一些东西可以尝试提高性能。特别是……

  • s += 'X'; rather than s += "X";
  • s + =“X”;而不是s += "X";
  • Declare string searchpattern ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"); outside the loop, and pass this for the find calls. An std::string instance knows it's own length, whereas a C string requires a linear-time check to determine that, and this may (or may not) be relevant to std::string::find performance.
  • 声明字符串searchpattern(“ABCDEFGHIJKLMNOPQRSTUVWXYZ”);在循环之外,并将其传递给find调用。一个std::string实例知道它自己的长度,而C字符串需要一个线性时间检查来确定它,而这个可能(或者可能不)与std有关::查找性能。
  • Try using std::stringstream, for a similar reason to why you should be using StringBuilder for Java, though most likely the repeated conversions back to string will create more problems.
  • 尝试使用std::stringstream,因为类似的原因,为什么您应该使用StringBuilder作为Java,尽管很有可能重复的转换会导致更多的问题。

Overall, the result isn't too surprising though. JavaScript, with a good JIT compiler, may be able to optimise a little better than C++ static compilation is allowed to in this case.

总的来说,结果并不令人惊讶。有一个好的JIT编译器的JavaScript,在这种情况下可能会比c++静态编译好一点。

With enough work, you should always be able to optimise C++ better than JavaScript, but there will always be cases where that doesn't just naturally happen and where it may take a fair bit of knowledge and effort to achieve that.

有了足够的工作,您应该总是能够比JavaScript更好地优化c++,但是总会有这样的情况,不只是自然发生,而且可能需要一些知识和努力来实现这一点。

#7


4  

For C++, try to use std::string for "ABCDEFGHIJKLMNOPQRSTUVWXYZ" - in my implementation string::find(const charT* s, size_type pos = 0) const calculates length of string argument.

对于c++,尝试使用std::string For“ABCDEFGHIJKLMNOPQRSTUVWXYZ”——在我的实现字符串中:find(const charT* s, size_type pos = 0) const计算字符串参数的长度。

#8


3  

I just tested the C++ example myself. If I remove the the call to std::sting::find, the program terminates in no time. Thus the allocations during string concatenation is no problem here.

我自己测试了c++的例子。如果我把呼叫转移到std::sting::发现,程序马上终止。因此,在字符串连接期间的分配是没有问题的。

If I add a variable sdt::string abc = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" and replace the occurence of "ABC...XYZ" in the call of std::string::find, the program needs almost the same time to finish as the original example. This again shows that allocation as well as computing the string's length does not add much to the runtime.

如果我添加一个变量sdt::string abc = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"并替换" abc…"在std的调用中:::find,该程序几乎需要与原始示例相同的时间完成。这再次说明了分配以及计算字符串长度对运行时的影响不大。

Therefore, it seems that the string search algorithm used by libstdc++ is not as fast for your example as the search algorithms of javascript or python. Maybe you want to try C++ again with your own string search algorithm which fits your purpose better.

因此,libstdc++所使用的字符串搜索算法似乎不像javascript或python的搜索算法那样快。也许你想用你自己的字符串搜索算法来尝试c++,它更适合你的目的。

#9


3  

C/C++ language are not easy and take years make fast programs.

C语言是不容易的,需要几年的时间来制定快速的程序。

with strncmp(3) version modified from c version:

从c版本修改的strncmp(3)版本:

#define _GNU_SOURCE
#include <string.h>
#include <stdio.h>

void test()
{
    int limit = 102 * 1024;
    char s[limit];
    size_t size = 0;
    while (size < limit) {
        s[size++] = 'X';
        if (!strncmp(s, "ABCDEFGHIJKLMNOPQRSTUVWXYZ", 26)) {
            fprintf(stderr, "zomg\n");
            return;
        }
    }
    printf("x's length = %zu\n", size);
}

int main()
{
    test();
    return 0;
}

#10


1  

It seems that in nodejs better algorithm for search substrings. You can implement self and try it.

在nodejs中,搜索子字符串的算法似乎更好。您可以实现self并尝试它。

#11


1  

Your test code is checking a pathological scenario of excessive string concatenation. (The string-search part of the test could have probably been omitted, I bet you it contributes almost nothing to the final results.) Excessive string concatenation is a pitfall that most languages warn very strongly against, and provide very well known alternatives for, (i.e. StringBuilder,) so what you are essentially testing here is how badly these languages fail under scenarios of perfectly expected failure. That's pointless.

您的测试代码正在检查过度字符串连接的病理情况。(测试的字符串搜索部分可能被省略了,我打赌它对最终结果几乎没有贡献。)过多的字符串连接是大多数语言强烈反对的一个陷阱,并且提供了非常有名的替代方法(例如StringBuilder),所以您实际上在测试的是这些语言在完全预期的失败场景下会有多么糟糕。这是毫无意义的。

An example of a similarly pointless test would be to compare the performance of various languages when throwing and catching an exception in a tight loop. All languages warn that exception throwing and catching is abysmally slow. They do not specify how slow, they just warn you not to expect anything. Therefore, to go ahead and test precisely that, would be pointless.

一个类似无意义的测试的例子是,在一个紧凑的循环中抛出和捕获一个异常时,比较各种语言的性能。所有语言都警告说,抛出和捕获异常缓慢。他们没有具体说明他们的反应有多慢,只是警告你不要期待任何事情。因此,要进行精确的测试,就毫无意义了。

So, it would make a lot more sense to repeat your test substituting the mindless string concatenation part (s += "X") with whatever construct is offered by each one of these languages precisely for avoiding string concatenation. (Such as class StringBuilder.)

因此,如果使用这些语言中的每一种语言提供的任何构造都可以精确地避免字符串连接,那么重复您的测试就更有意义了。(如类StringBuilder。)

#12


0  

As mentioned by sbi, the test case is dominated by the search operation. I was curious how fast the text allocation compares between C++ and Javascript.

正如sbi所提到的,测试用例以搜索操作为主。我很好奇,在c++和Javascript之间的文本分配有多快。

System: Raspberry Pi 2, g++ 4.6.3, node v0.12.0, g++ -std=c++0x -O2 perf.cpp

系统:树莓Pi 2, g++ 4.6.3,节点v0.12.0, g++ -std=c++0x -O2性能。

C++ : 770ms

c++:770 ms

C++ without reserve: 1196ms

毫无保留地c++:1196 ms

Javascript: 2310ms

Javascript:2310毫秒

C++

c++

#include <iostream>
#include <string>
#include <chrono>
using namespace std;
using namespace std::chrono;

void test()
{
    high_resolution_clock::time_point t1 = high_resolution_clock::now();
    int x = 0;
    int limit = 1024 * 1024 * 100;
    string s("");
    s.reserve(1024 * 1024 * 101);
    for(int i=0; s.size()< limit; i++){
        s += "SUPER NICE TEST TEXT";
    }

    high_resolution_clock::time_point t2 = high_resolution_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::milliseconds>( t2 - t1 ).count();
    cout << duration << endl;
}

int main()
{
    test();
}

JavaScript

JavaScript

function test()
{
    var time = process.hrtime();
    var x = "";
    var limit = 1024 * 1024 * 100;
    for(var i=0; x.length < limit; i++){
        x += "SUPER NICE TEST TEXT";
    }

    var diff = process.hrtime(time);
    console.log('benchmark took %d ms', diff[0] * 1e3 + diff[1] / 1e6 );
}

test();