Difference between string and char[] types in C++

Time: 2022-09-18 16:07:12

I know a little C and now I'm taking a look at C++. I'm used to char arrays for dealing with C strings, but while I look at C++ code I see there are examples using both string type and char arrays:

#include <iostream>
#include <string>
using namespace std;

int main () {
  string mystr;
  cout << "What's your name? ";
  getline (cin, mystr);
  cout << "Hello " << mystr << ".\n";
  cout << "What is your favorite team? ";
  getline (cin, mystr);
  cout << "I like " << mystr << " too!\n";
  return 0;
}

and

#include <iostream>
using namespace std;

int main () {
  char name[256], title[256];

  cout << "Enter your name: ";
  cin.getline (name,256);

  cout << "Enter your favourite movie: ";
  cin.getline (title,256);

  cout << name << "'s favourite movie is " << title;

  return 0;
}

(both examples from http://www.cplusplus.com)

I suppose this is a widely asked and answered (obvious?) question, but it would be nice if someone could tell me exactly what the difference is between those two ways of dealing with strings in C++ (performance, API integration, the cases where each one is better, ...).

Thank you.

7 answers

#1


146  

A char array is just that - an array of characters:

  • If allocated on the stack (like in your example), it will always occupy e.g. 256 bytes, no matter how long the text it contains is.
  • If allocated on the heap (using malloc() or new char[]), you're responsible for releasing the memory afterwards and you will always have the overhead of a heap allocation.
  • If you copy a text that needs more than 256 chars (including the terminating \0) into the array, it might crash, produce ugly assertion messages or cause unexplainable (mis-)behavior somewhere else in your program.
  • To determine the text's length, the array has to be scanned, character by character, for a \0 character.
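
To make the points above concrete, here is a minimal sketch. The helper names `copy_bounded` and `buffer_bytes` are made up for illustration and are not part of any library. It shows that sizeof reports the fixed buffer size, that strlen has to scan for the \0, and one way to copy into a fixed buffer without overrunning it:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies src into dst without ever writing more than
// dst_size bytes, and always leaves dst null-terminated.
// Returns true if the whole text fit, false if it had to be truncated.
bool copy_bounded(char* dst, std::size_t dst_size, const char* src) {
    if (dst_size == 0) return false;
    std::size_t len = std::strlen(src);   // O(n): scans until it finds '\0'
    std::size_t n = len < dst_size - 1 ? len : dst_size - 1;
    std::memcpy(dst, src, n);
    dst[n] = '\0';
    return n == len;
}

// sizeof on an array gives the size of the buffer, not of the text in it.
std::size_t buffer_bytes() {
    char name[256] = "Bob";
    return sizeof name;                   // 256, even though the text is 3 chars
}
```

Calling `copy_bounded(buf, sizeof buf, "Robert")` on a `char buf[4]` leaves "Rob" in the buffer and returns false, instead of writing past the end.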

A string is a class that contains a char array, but automatically manages it for you. Most string implementations have a small built-in buffer (around 16 characters; the exact size depends on the implementation) so short strings don't fragment the heap, and use the heap for longer strings.
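
If you are curious whether your implementation keeps short strings inside the object, something like the following sketch can tell you. `stored_inline` is a hypothetical helper that peeks at implementation details, so treat it as an experiment rather than something to rely on in real code:

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: returns true if the characters of s live inside the
// std::string object itself (the built-in buffer) rather than on the heap.
// For illustration only; the result depends on the library implementation.
bool stored_inline(const std::string& s) {
    auto obj   = reinterpret_cast<std::uintptr_t>(&s);
    auto chars = reinterpret_cast<std::uintptr_t>(s.data());
    return chars >= obj && chars < obj + sizeof(std::string);
}
```

A string of 1000 characters cannot fit inside the object, so `stored_inline` returns false for it. What it returns for a short string such as "hi" depends on your standard library.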

You can access a string's char array like this:

std::string myString = "Hello World";
const char *myStringChars = myString.c_str();

C++ strings can contain embedded \0 characters, know their length without counting, are faster than heap-allocated char arrays for short texts and protect you from most buffer overruns (operations that grow the string reallocate as needed, although operator[] itself is not bounds-checked). Plus they're more readable and easier to use.

- - - - - -

However, C++ strings are not (very) suitable for use across DLL boundaries, because every user of such a DLL function would have to use the exact same compiler and C++ runtime implementation; otherwise their string class may behave differently.

Normally, a string class would also release its heap memory on the calling heap, so it will only be able to free memory again if you're using a shared (.dll or .so) version of the runtime.

In short: use C++ strings in all your internal functions and methods. If you ever write a .dll or .so, use C strings in your public (dll/so-exposed) functions.
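
As a rough sketch of that advice: the function below uses std::string internally but only lets C types cross the boundary. The name `make_greeting` and its signature are hypothetical, and platform-specific export decoration (such as `__declspec(dllexport)` on Windows) is left out.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical exported function: only C types appear in the signature.
// Writes a greeting into the caller-owned buffer `out` (capacity `out_size`)
// and returns the length of the full greeting, excluding the '\0'.
// If the return value is >= out_size, the text in `out` was truncated.
extern "C" std::size_t make_greeting(const char* name, char* out, std::size_t out_size) {
    std::string greeting = "Hello " + std::string(name) + ".";  // std::string stays internal
    if (out != nullptr && out_size > 0) {
        std::size_t n = greeting.size() < out_size - 1 ? greeting.size() : out_size - 1;
        std::memcpy(out, greeting.c_str(), n);
        out[n] = '\0';
    }
    return greeting.size();
}
```

Because the caller owns the buffer, nothing is allocated in one module and freed in another, which sidesteps the heap problem described above.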

#2


6  

Well, the string type is a completely managed class for character strings, while char[] is still what it was in C: a byte array representing a character string for you.

In terms of API and standard library, everything is implemented in terms of strings and not char[]. There are still lots of functions from the libc that receive char[], so you may need to use it for those; apart from that, I would always use std::string.

In terms of efficiency, of course a raw buffer of unmanaged memory will almost always be faster for lots of things. But take comparing strings into account, for example: std::string always has the size available to check first, while with char[] you need to compare character by character.
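
A small sketch of both points, with made-up function names: handing a std::string to a libc function through c_str(), and how comparison differs between the two representations.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>

// Passing a std::string to a C library function that expects const char*.
long parse_number(const std::string& text) {
    return std::strtol(text.c_str(), nullptr, 10);
}

// Equality for C strings needs strcmp; == on two char* compares addresses.
bool same_text(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;
}

// Equality for std::string uses ==. The lengths are known up front, so an
// implementation can reject strings of different length without reading
// the characters.
bool same_text(const std::string& a, const std::string& b) {
    return a == b;
}
```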

#3


6  

Arkaitz is correct that string is a managed type. What this means for you is that you never have to worry about how long the string is, nor do you have to worry about freeing or reallocating the memory of the string.

On the other hand, the char[] notation in the case above has restricted the character buffer to exactly 256 characters. If you try to write more than 256 characters into that buffer, at best you will overwrite other memory that your program "owns". At worst, you will try to overwrite memory that you do not own, and your OS will kill your program on the spot.
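
Worth noting: cin.getline(name, 256) from the question does respect the limit, because it is told the buffer size. The sketch below (function names are made up, and a string stream stands in for cin) compares it with std::getline on a buffer of only 8 chars:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads one line into a fixed buffer of 8 chars. istream::getline stores at
// most 7 characters plus '\0'; if the line is longer it sets failbit rather
// than writing past the end of the buffer.
// (failbit is also set when nothing could be read at all.)
std::string read_fixed(std::istream& in, bool& failed) {
    char buf[8];
    in.getline(buf, sizeof buf);
    failed = in.fail();
    return buf;
}

// Reads one line into a std::string, which grows to fit the input.
std::string read_growing(std::istream& in) {
    std::string line;
    std::getline(in, line);
    return line;
}
```

With the input "Alexandria", `read_fixed` returns "Alexand" and reports a failure, while `read_growing` returns the whole name. The overwrite described above happens with functions that do not know the buffer size, such as strcpy.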

Bottom line? Strings are a lot more programmer friendly, char[]s are a lot more efficient for the computer.

#4


5  

I personally do not see any reason why one would like to use char* or char[] except for compatibility with old code. std::string is no slower than using a c-string, except that it will handle re-allocation for you. You can set its size when you create it, and thus avoid re-allocation if you want. Its indexing operator ([]) provides constant time access (and is in every sense of the word the exact same thing as using a c-string indexer). Using the at method gives you bounds-checked safety as well, something you don't get with c-strings unless you write it yourself. Your compiler will most often optimize out the indexer use in release mode. It is easy to make mistakes with c-strings: delete vs delete[], exception safety, even how to reallocate a c-string.
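
For example (a minimal sketch with hypothetical function names), reserving capacity up front and the difference between [] and at():

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Builds a string of n 'x' characters. reserve() requests the capacity up
// front, so the appends below do not need to reallocate.
std::string make_filled(std::size_t n) {
    std::string s;
    s.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        s.push_back('x');
    }
    return s;
}

// at() checks the index and throws std::out_of_range when it is too large;
// operator[] performs no such check.
bool index_is_valid(const std::string& s, std::size_t i) {
    try {
        (void)s.at(i);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```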

And when you have to deal with advanced concepts like copy-on-write (COW) strings, or non-COW strings for multithreaded (MT) code, you will need std::string.

If you are worried about copies, as long as you use references, and const references wherever you can, you will not have any overhead due to copies, and it's the same thing as you would be doing with the c-string.
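
A quick sketch of what that looks like (both function names are made up): the std::string version takes a const reference, the c-string version takes a pointer, and neither copies the text.

```cpp
#include <cstddef>
#include <string>

// Takes the string by const reference: no copy is made, the function reads
// the caller's object directly.
std::size_t count_spaces(const std::string& text) {
    std::size_t n = 0;
    for (char c : text) {
        if (c == ' ') ++n;
    }
    return n;
}

// The c-string equivalent: only a pointer is passed, no characters copied.
std::size_t count_spaces_c(const char* text) {
    std::size_t n = 0;
    for (; *text != '\0'; ++text) {
        if (*text == ' ') ++n;
    }
    return n;
}
```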

#5


1  

Strings have helper functions and manage char arrays automatically. You can concatenate strings; for a char array you would need to copy it to a new array. Strings can change their length at runtime. A char array is harder to manage than a string, and certain functions may only accept a string as input, requiring you to convert the array to a string. It's better to use strings; they were made so that you don't have to use arrays. If arrays were objectively better we wouldn't have strings.
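
Here is roughly what concatenation looks like both ways. The names `join` and `join_c` are hypothetical, and the char array version is only one of several ways to do it:

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Concatenation with std::string: the result grows as needed.
std::string join(const std::string& first, const std::string& last) {
    return first + " " + last;
}

// Concatenation with char arrays: allocate a new buffer that is large
// enough, copy both parts into it, and leave it to the caller to delete[].
char* join_c(const char* first, const char* last) {
    std::size_t a = std::strlen(first);
    std::size_t b = std::strlen(last);
    char* out = new char[a + 1 + b + 1];    // both parts + space + '\0'
    std::memcpy(out, first, a);
    out[a] = ' ';
    std::memcpy(out + a + 1, last, b + 1);  // b + 1 also copies the '\0'
    return out;
}
```

The caller of `join_c` has to remember `delete[]` on the result; the std::string returned by `join` cleans up after itself.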

#6


0  

Think of (char *) as string.begin(). The essential difference is that (char *) is an iterator and std::string is a container. If you stick to basic strings, a (char *) will give you what std::string::iterator does. You could use (char *) when you want the benefit of an iterator and also compatibility with C, but that's the exception and not the rule. As always, be careful of iterator invalidation. When people say (char *) isn't safe, this is what they mean. It's as safe as any other C++ iterator.
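
To illustrate the iterator view, here is a small sketch (function names are hypothetical) that runs the same standard algorithm over a char* range and over string iterators:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// A pair of char pointers works with standard algorithms like any iterator.
std::size_t count_in_c_string(const char* text, char wanted) {
    const char* end = text + std::strlen(text);  // one past the last character
    return static_cast<std::size_t>(std::count(text, end, wanted));
}

// The same algorithm with std::string iterators.
std::size_t count_in_string(const std::string& text, char wanted) {
    return static_cast<std::size_t>(std::count(text.begin(), text.end(), wanted));
}
```

Both return 3 for the letter 'l' in "Hello World". The invalidation caveat applies to both as well: a pointer obtained from c_str() stops being valid once the string is modified, just like an iterator.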

#7


0  

One of the differences is null termination (\0).

In C and C++, functions that work on a char* or char[] take a pointer to a single char as a parameter and walk along the memory until a 0 value is reached (often called the null terminator).

C++ strings can contain embedded \0 characters and know their length without counting.

#include <cstdio>
#include <cstring>
#include <iostream>
#include <string>

using namespace std;

// Takes the string by value, so the caller's string is not modified.
void NullTerminatedString(string str){
   int NUll_term = 3;
   str[NUll_term] = '\0';       // the \0 is stored as an ordinary character; size() is unchanged
   cout << str << endl <<endl <<endl;   // all 11 characters are written, including the \0
}

void NullTerminatedChar(char *str){
   int NUll_term = 3;
   str[NUll_term] = 0;     // the \0 ends the C string here; everything after it is ignored
   cout << str << endl;
}

int main(){
  string str = "Feels Happy";
  printf("string = %s\n", str.c_str());
  printf("strlen = %zu\n", strlen(str.c_str()));  // %zu is the printf format for size_t
  printf("size = %zu\n", str.size());
  printf("sizeof = %zu\n", sizeof(str)); // size of the std::string object itself; compiler dependent
  NullTerminatedString(str);


  char str1[12] = "Feels Happy";
  printf("char[] = %s\n", str1);
  printf("strlen = %zu\n", strlen(str1));
  printf("sizeof = %zu\n", sizeof(str1));    // size of the char array: 11 characters + \0
  NullTerminatedChar(str1);
  return 0;
}

Output:

string = Feels Happy
strlen = 11
size = 11
sizeof = 32  
Fee s Happy


char[] = Feels Happy
strlen = 11
sizeof = 12
Fee
