Are there any practical limitations to using only std::string instead of char arrays, and std::vector/list instead of arrays, in C++?

Time: 2022-09-01 23:51:31

I use vectors, lists, strings and wstrings obsessively in my code. Are there any catch 22s involved that should make me more interested in using arrays from time to time, chars and wchars instead?

Basically, when working in an environment that supports the Standard Template Library, is there any case where using the primitive types is actually better?

7 answers

#1


For 99% of the time and for 99% of Standard Library implementations, you will find that std::vectors will be fast enough, and the convenience and safety you get from using them will more than outweigh any small performance cost.

For those very rare cases when you really need bare-metal code, you can treat a vector like a C-style array:

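std::vector<int> v(100);  // 100 value-initialized ints
int* p = &v[0];           // pointer to the first element of the vector's buffer
p[3] = 42;                // equivalent to v[3] = 42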

The C++ standard (since C++03) guarantees that a vector's elements are stored contiguously, so this is guaranteed to work for any non-empty vector other than std::vector<bool>.
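
The usual reason to want that raw pointer is to hand the vector's storage to code that only understands C arrays. A minimal sketch, assuming C++11 for data(); fill_with_squares and make_squares are hypothetical names made up for this illustration:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical C-style function that only understands raw arrays.
void fill_with_squares(int* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = static_cast<int>(i * i);
    }
}

// Builds a vector and lets the C-style function write directly into its storage.
std::vector<int> make_squares(std::size_t n) {
    std::vector<int> v(n);
    if (!v.empty()) {
        // data() needs C++11; &v[0] is the pre-C++11 spelling for a non-empty vector.
        fill_with_squares(v.data(), v.size());
    }
    return v;
}
```

The vector still owns the memory, so nothing has to be freed by hand after the call.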

Regarding strings, the convenience factor becomes almost overwhelming, and the performance issues tend to go away. If you go back to C-style strings, you are also going back to functions like strlen(), which are inherently inefficient because they have to scan the whole string to find its length.
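
To make the strlen() point concrete, here is a small sketch with two hypothetical helpers that do the same job. The C version asks for the length on every iteration, and each call walks the string unless the optimizer manages to hoist it; std::string::size() just reads a stored length (constant time, guaranteed since C++11).

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Counts spaces in a C string. strlen() in the loop condition rescans
// the string on each iteration unless the compiler can hoist the call.
std::size_t count_spaces_c(const char* s) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < std::strlen(s); ++i) {
        if (s[i] == ' ') ++n;
    }
    return n;
}

// Same job with std::string: size() reads a stored length.
std::size_t count_spaces_cpp(const std::string& s) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] == ' ') ++n;
    }
    return n;
}
```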

As for lists, you should think twice, and probably thrice, before using them at all, whether your own implementation or the standard one. The vast majority of computing problems are better solved using a vector/array. Lists appear so often in the literature in large part because they are a convenient data structure for textbook and training course writers to use to explain pointers and dynamic allocation in one go. I speak here as an ex-training-course writer.
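
One job people often reach for a list to do is inserting in the middle of a sequence. A vector can do that as well; a minimal sketch, with insert_sorted being a hypothetical helper name. Whether it beats a list for your element type and sizes is something to measure rather than assume.

```cpp
#include <algorithm>
#include <vector>

// Inserts value into an already sorted vector, keeping it sorted.
// lower_bound finds the position by binary search; insert shifts the tail.
void insert_sorted(std::vector<int>& v, int value) {
    v.insert(std::lower_bound(v.begin(), v.end(), value), value);
}
```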

#2


I would stick to STL classes (vectors, strings, etc.). They are safer, easier to use, more productive, and less prone to memory leaks, and, AFAIK, they perform some additional run-time bounds checking, at least in debug builds (Visual C++).
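
Independently of any debug-build checks, the containers also give you a checked accessor on every compiler: at() throws std::out_of_range on a bad index, whereas operator[] on a vector (or an index into a raw array) is undefined behaviour. A small sketch; index_is_rejected is a hypothetical helper written only for this illustration:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if reading v.at(i) throws std::out_of_range.
bool index_is_rejected(const std::vector<int>& v, std::size_t i) {
    try {
        (void)v.at(i);  // checked access
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```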

Then, measure the performance. If you find that the bottleneck really is in the STL classes, then move to C-style strings and arrays.
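
For a first rough measurement you don't need much more than a clock. A minimal sketch, assuming C++11 (for std::chrono and lambdas); time_us and sum_vector are hypothetical names, and a real profiler will tell you far more than this does:

```cpp
#include <chrono>
#include <numeric>
#include <vector>

// Runs `work` once and returns the elapsed wall-clock time in microseconds.
template <typename F>
long long time_us(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}

// Example workload to time: sums all elements of a vector.
long long sum_vector(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0LL);
}
```

Something like `time_us([&] { total = sum_vector(v); })` then gives a number to compare against the array-based variant, ideally averaged over many runs in an optimized build.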

In my experience, the chances that the bottleneck lies in vector or string usage are very low.

#3


One problem is the overhead of accessing elements. Even with vector and string, when you access an element by index the code must first load the buffer address and then add the offset (you don't do it manually, but the compiler emits such code). With a raw array the buffer address is already known. This extra indirection can lead to significant overhead in certain cases, so it is worth profiling when you want to improve performance.
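
If profiling does point at indexing in a hot loop, the buffer address can be fetched once up front while the data stays in a vector. A sketch of the two forms, assuming C++11 for data(); both function names are hypothetical. Optimizing compilers often generate the same loop for both, so treat any difference as something to verify, not a given.

```cpp
#include <cstddef>
#include <vector>

// Indexes through the vector on every iteration.
long long sum_by_index(const std::vector<int>& v) {
    long long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) total += v[i];
    return total;
}

// Fetches the buffer address and size once, then indexes the raw pointer.
long long sum_by_pointer(const std::vector<int>& v) {
    long long total = 0;
    const int* p = v.data();
    const std::size_t n = v.size();
    for (std::size_t i = 0; i < n; ++i) total += p[i];
    return total;
}
```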

#4


If you don't need real-time responses, stick with your approach. The standard containers and strings are safer than raw char buffers.

#5


You can occasionally encounter scenarios where you'll get better performance or memory usage from doing some things yourself (for example, std::string typically has about 24 bytes of overhead: 12 bytes for the pointers in the std::string itself, plus a header block on its dynamically allocated piece).
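
The exact figures depend on the platform and the standard library, so it is worth checking them on your own toolchain. A minimal sketch; Footprint and measure_footprint are hypothetical names, and the allocator's per-block header is not visible from portable code, so it is not included:

```cpp
#include <cstddef>
#include <string>

// Per-object sizes on the current toolchain (heap buffers not included).
struct Footprint {
    std::size_t string_object;  // sizeof(std::string)
    std::size_t raw_pointer;    // sizeof(const char*)
};

Footprint measure_footprint() {
    Footprint f;
    f.string_object = sizeof(std::string);
    f.raw_pointer = sizeof(const char*);
    return f;
}
```

Multiplying the difference by the number of strings you expect to hold gives a rough idea of whether the overhead matters for your project.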

I have worked on projects where converting from std::string to const char* saved a noticeable amount of memory (tens of MB). I don't believe these projects are what you would call typical.

Oh, using STL will hurt your compile times, and at some point that may be an issue. When your project results in over a GB of object files being passed to the linker, you might want to consider how much of that is template bloat.

#6


I've worked on several projects where the memory overhead for strings has become problematic.

It's worth considering in advance how your application needs to scale. If you need to store an unbounded number of strings, using const char* pointers into a globally managed string table can save you huge amounts of memory.
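
A string table of this kind can itself be built from standard containers. This is only a sketch of the idea, assuming C++11; StringTable and intern are hypothetical names, not the design used in those projects. Each distinct string is stored once, and callers keep a plain const char* to the shared copy.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical string table: stores each distinct string exactly once.
class StringTable {
public:
    // Returns a pointer to the table's copy of s, adding it if needed.
    // Elements of an unordered_set are never moved or modified after
    // insertion, so the pointer stays valid for the table's lifetime.
    const char* intern(const std::string& s) {
        return table_.insert(s).first->c_str();
    }

    std::size_t size() const { return table_.size(); }

private:
    std::unordered_set<std::string> table_;
};
```

Equal strings then share one pointer, so comparing two interned strings for equality is a pointer comparison.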

But generally, definitely use STL types unless there's a very good reason to do otherwise.

#7


I believe the default growth strategy for the buffers behind vectors and strings is to allocate double the amount of memory each time the currently allocated memory gets used up. This can be wasteful. You can provide a custom allocator, of course...
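
The growth factor is up to the implementation, but when the final size is known in advance, reserve() sidesteps the question entirely. A minimal sketch; count_reallocations is a hypothetical helper that counts how often the capacity changes while elements are appended:

```cpp
#include <cstddef>
#include <vector>

// Counts how many times the capacity changes while appending n ints.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<int> v;
    if (reserve_first) v.reserve(n);  // one allocation up front
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

With reserve_first set, the count is zero, because push_back never has to grow a buffer that is already large enough.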

The other thing to consider is stack vs. heap. Statically sized arrays and strings can sit on the stack, or at least the compiler handles the memory management for you. Some compilers will handle dynamically sized arrays for you too if they support C99-style variable-length arrays, which are not part of standard C++ and exist there only as an extension. Vectors and strings use the heap by default (apart from short strings on implementations with a small-string optimization), and this can introduce performance issues if you have really tight constraints.
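
When the size is fixed at compile time, you can get stack storage without giving up the container interface. A sketch assuming a C++11 standard library, which provides std::array; sum_small_buffer is a hypothetical function for illustration:

```cpp
#include <array>
#include <cstddef>

// Sums a fixed-size buffer that lives in automatic (stack) storage.
// std::array wraps a plain array, so no heap allocation is involved.
int sum_small_buffer() {
    std::array<int, 4> buf = {{1, 2, 3, 4}};
    int total = 0;
    for (std::size_t i = 0; i < buf.size(); ++i) total += buf[i];
    return total;
}
```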

As a rule of thumb, use what's already there unless it hurts your project with its speed/memory overhead... you'll probably find that for 99% of stuff the STL-provided classes save you time and effort with little to no impact on your application's performance (i.e. "avoid premature optimisation").
