Why does the start of my string disappear?

Time: 2023-01-13 20:46:37

In the following C++ code, I realised that gcount() was returning a larger number than I wanted, because getline() extracts the final newline character from the input stream but doesn't store it in the buffer.

What I still don't understand is the program's output, though. For input "Test\n", why do I get " est\n"? How come my mistake affects the first character of the string rather than adding unwanted rubbish onto the end? And how come the program's output is at odds with the way the string looks in the debugger ("Test\n", as I'd expect)?

#include <fstream>
#include <vector>
#include <string>
#include <iostream>

using namespace std;

int main()
{
    const int bufferSize = 1024;
    ifstream input( "test.txt", ios::in | ios::binary );

    vector<char> vecBuffer( bufferSize );
    input.getline( &vecBuffer[0], bufferSize );
    string strResult( vecBuffer.begin(), vecBuffer.begin() + input.gcount() );
    cout << strResult << "\n";

    return 0;
}

4 solutions

#1


I've also reproduced this result on Windows Vista with Visual Studio 2005 SP2.

When I figure out what the heck is happening, I'll update this post.

Edit: Okay, there we go. The problem (and the different results people are getting) comes from the \r. You call input.getline and put the result in vecBuffer. The getline function strips off the \n but leaves the \r in place.

You then copy vecBuffer into a string, but take the length from input.gcount(), which is one char too many: gcount() counts the \n that getline removed from the stream, while vecBuffer holds a terminating null in its place.

The resulting strResult is:

-       strResult   "Test"
        [0] 84 'T'  char
        [1] 101 'e' char
        [2] 115 's' char
        [3] 116 't' char
        [4] 13 '␍'  char
        [5] 0   char

So "Test" is printed, followed by a carriage return (which puts the cursor back at the start of the line), a null character (which this console draws as a blank, overwriting the T), and finally the \n, which correctly moves the cursor to the next line.

So you either have to strip out the \r, or write a function that gets the string length directly from vecBuffer, checking for null characters.
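
Both suggestions can be combined in one small helper. What follows is only a sketch: the name readLineWithoutTerminator and the default buffer size are made up for illustration and do not come from the question. It searches for the terminating null instead of trusting gcount(), then drops a trailing \r if one is present.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): reads one line with
// istream::getline and returns only the characters that belong to the line.
std::string readLineWithoutTerminator(std::istream& input, std::size_t bufferSize = 1024)
{
    assert(bufferSize > 0);
    std::vector<char> buffer(bufferSize);
    input.getline(&buffer[0], static_cast<std::streamsize>(bufferSize));

    // gcount() also counts the extracted '\n', which getline did not store,
    // so look for the terminating '\0' inside the counted range instead.
    const std::size_t extracted = static_cast<std::size_t>(input.gcount());
    std::vector<char>::iterator last = buffer.begin() + std::min(extracted, bufferSize);
    std::vector<char>::iterator end = std::find(buffer.begin(), last, '\0');

    std::string line(buffer.begin(), end);

    // In binary mode a Windows "\r\n" ending leaves the '\r' behind.
    if (!line.empty() && line[line.size() - 1] == '\r')
        line.erase(line.size() - 1);
    return line;
}
```

Called on a stream holding "Test\r\n", this should return a four-character "Test". If the last line has no newline at all, gcount() equals the number of stored characters and nothing is trimmed by mistake.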

#2


I've duplicated Tommy's problem on a Windows XP Pro Service Pack 2 system with the code compiled using Visual Studio 2005 SP2 (actually, it says "Version 8.0.50727.879"), built as a console project.

If my test.txt file contains just "Test" and a CR, the program spits out " est" (note the leading space) when run.

If I had to take a wild guess, I'd say that this version of the implementation has a bug where it is treating the Windows newline character like it should be treated in Unix (as a "go to the front of the same line" character), and then it wipes out the first character to hold part of the next prompt or something.

Update: After playing with it a bit, I'm positive that is what is going on. If you look at strResult in the debugger, you will see that it copied over a decimal 13 value near the end. That's CR ('\r'), which on Windows is the first half of the "\r\n" line ending, and which a terminal treats as "return to the beginning of the line". If I instead change your constructor to read:

string strResult( vecBuffer.begin(), vecBuffer.begin() + input.gcount() - 1 );

...(so that the trailing '\0' isn't copied; the CR still is, but it is harmless right before the newline) then it prints out "Test" like you'd expect.
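
A quick way to check exactly which bytes each version of the constructor copies is to print their decimal values. The sketch below assumes the input holds "Test\r\n" and reads from any std::istream. The helper names copyFromBuffer and byteValues are invented for this illustration. With that input, the `- 1` removes the '\0' that getline wrote, while the 13 (CR) stays in the string.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: builds the string the way the question does, but
// with an adjustable trim so the two constructor calls can be compared.
std::string copyFromBuffer(std::istream& input, std::streamsize trim)
{
    const int bufferSize = 1024;
    std::vector<char> vecBuffer(bufferSize);
    input.getline(&vecBuffer[0], bufferSize);
    assert(trim >= 0 && trim <= input.gcount());
    return std::string(vecBuffer.begin(),
                       vecBuffer.begin() + (input.gcount() - trim));
}

// Renders each byte of s as its decimal value, separated by spaces.
std::string byteValues(const std::string& s)
{
    std::ostringstream out;
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        if (i != 0)
            out << ' ';
        out << static_cast<int>(static_cast<unsigned char>(s[i]));
    }
    return out.str();
}
```

For "Test\r\n", byteValues(copyFromBuffer(in, 0)) gives "84 101 115 116 13 0", matching the debugger view, and a trim of 1 gives "84 101 115 116 13".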

#3


I am pretty sure that the T is actually getting written and then overwritten. Running the same program in an rxvt window (cygwin) produces the expected output. You can do a couple things. If you get rid of the ios::binary in your open it will autoconvert \r\n to \n and things will work like you expect.

You can also open up your text file in the binary editor by clicking on the little down arrow on the open file dialog's open button and selecting open with...->Binary Editor. This will let you look at your file and confirm that it does indeed have \r\n and not just \n.

Edit: I redirected the output to a file and it is writing out:

Test\r\0\r\n

The reason you are getting the \0 is that gcount returns 6 (6 characters were removed from the stream), but the final delimiter is not copied to the buffer; a '\0' is stored instead. When you construct the string, you are actually telling it to include the '\0'. std::string has no problem with the embedded 0 and outputs it as asked. Some shells apparently render it as a blank character that overwrites the T, while others don't do anything and the output looks okay, but is still probably wrong because it has the embedded '\0'.

cout << strResult.c_str() << "\n";

Changing the last line to this makes the output stop at the \0 and gives the expected result.
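
The difference between the two output statements can be checked without a console by writing to an in-memory stream. This is a minimal sketch with made-up helper names (writtenFromString, writtenFromCStr), and it uses std::ostringstream as a stand-in for cout, so it does not show the \n to \r\n conversion that a text-mode file on Windows adds.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: captures what `cout << strResult << "\n"` would write.
std::string writtenFromString(const std::string& s)
{
    std::ostringstream out;
    out << s << "\n";          // writes all s.size() characters, '\0' included
    return out.str();
}

// Hypothetical helper: captures what `cout << strResult.c_str() << "\n"` would write.
std::string writtenFromCStr(const std::string& s)
{
    std::ostringstream out;
    out << s.c_str() << "\n";  // a C string ends at the first '\0'
    return out.str();
}
```

For a six-character strResult holding "Test\r" plus the embedded '\0', the first helper returns seven characters and the second returns six, with no null byte in the result.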

#4


I tested your code using Visual Studio 2005 SP2 on Windows XP Pro SP3 (32-bit), and everything works fine.
