Why is the gets function so dangerous that it should not be used?

Date: 2022-03-14 09:32:01

When I try to compile C code that uses the gets() function with GCC, I get this warning:

(.text+0x34): warning: the `gets' function is dangerous and should not be used.

I remember this has something to do with stack protection and security, but I'm not sure exactly why.

Can someone help me remove this warning and explain why there is such a warning about using gets()?

If gets() is so dangerous, then why can't we remove it?

11 answers

#1


128  

In order to use gets safely, you have to know exactly how many characters you will be reading, so that you can make your buffer large enough. You can only know that if you know exactly what data you will be reading.

Instead of using gets, you want to use fgets, which has the signature

char* fgets(char *string, int length, FILE * stream);

(fgets, if it reads an entire line, will leave the '\n' in the string; you'll have to deal with that.)

gets remained an official part of the language up to the 1999 ISO C standard, but it was officially removed by the 2011 standard. Most C implementations still support it, but at least gcc issues a warning for any code that uses it.
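One common way to deal with the '\n' that fgets leaves in the string is strcspn. The following is a minimal sketch, not part of the original answer; the helper name read_line_stripped is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from fp into buf (capacity size)
 * and remove the trailing newline, if any.
 * Returns buf on success, or NULL on EOF or read error. */
char *read_line_stripped(char *buf, int size, FILE *fp)
{
    assert(buf != NULL && size > 0);
    if (fgets(buf, size, fp) == NULL)
        return NULL;
    /* strcspn returns the index of the first '\n', or the string length
     * if there is none, so this assignment is always in bounds. */
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}
```

A call such as read_line_stripped(line, sizeof line, stdin) reads from standard input. If the line is longer than the buffer, the remainder stays in the stream and is returned by the next call.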

#2


114  

Why is gets() dangerous

The first internet worm (the Morris Internet Worm) escaped on 1988-11-02, and it used gets() and a buffer overflow as one of its methods of propagating from system to system. The basic problem is that the function doesn't know how big the buffer is, so it continues reading until it finds a newline or encounters EOF, and may overflow the bounds of the buffer it was given.

You should forget you ever heard that gets() existed.

The C11 standard ISO/IEC 9899:2011 eliminated gets() as a standard function, which is A Good Thing™. Sadly, it will remain in libraries for many years (meaning 'decades') for reasons of backwards compatibility. If it were up to me, the implementation of gets() would become:

char *gets(char *buffer)
{
    assert(buffer != 0);
    abort();
    return 0;
}

Given that your code will crash anyway, sooner or later, it is better to head the trouble off sooner rather than later. I'd be prepared to add an error message:

fputs("obsolete and dangerous function gets() called\n", stderr);

Modern versions of the Linux compilation system generate warnings if you link gets(), and also for some other functions that have security problems (mktemp(), …).

Alternatives to gets()

fgets()

As everyone else said, the canonical alternative to gets() is fgets(), specifying stdin as the file stream.

char buffer[BUFSIZ];

while (fgets(buffer, sizeof(buffer), stdin) != 0)
{
    /* ...process line of data... */
}

What no-one else has mentioned yet is that gets() does not include the newline but fgets() does. So, you might need to use a wrapper around fgets() that deletes the newline:

char *fgets_wrapper(char *buffer, size_t buflen, FILE *fp)
{
    if (fgets(buffer, buflen, fp) != 0)
    {
        size_t len = strlen(buffer);
        if (len > 0 && buffer[len-1] == '\n')
            buffer[len-1] = '\0';
        return buffer;
    }
    return 0;
}

Also, as caf points out in a comment and paxdiablo shows in his answer, with fgets() you might have data left over on a line. My wrapper code leaves that data to be read next time; you can readily modify it to gobble the rest of the line of data if you prefer:

        if (len > 0 && buffer[len-1] == '\n')
            buffer[len-1] = '\0';
        else
        {
             int ch;
             while ((ch = getc(fp)) != EOF && ch != '\n')
                 ;
        }

The residual problem is how to report the three different result states: EOF or error, line read and not truncated, and partial line read but data was truncated.

This problem doesn't arise with gets() because it doesn't know where your buffer ends and merrily tramples beyond the end, wreaking havoc on your beautifully tended memory layout, often messing up the return stack (a Stack Overflow) if the buffer is allocated on the stack, or trampling over the control information if the buffer is dynamically allocated, or copying data over other precious global (or module) variables if the buffer is statically allocated. None of these is a good idea; they epitomize the phrase 'undefined behaviour'.


There is also TR 24731-1 (a Technical Report from the C Standard Committee), which provides safer alternatives to a variety of functions, including gets():

§6.5.4.1 The gets_s function

Synopsis

#define __STDC_WANT_LIB_EXT1__ 1
#include <stdio.h>
char *gets_s(char *s, rsize_t n);

Runtime-constraints

2 s shall not be a null pointer. n shall neither be equal to zero nor be greater than RSIZE_MAX. A new-line character, end-of-file, or read error shall occur within reading n-1 characters from stdin.[25]

3 If there is a runtime-constraint violation, s[0] is set to the null character, and characters are read and discarded from stdin until a new-line character is read, or end-of-file or a read error occurs.

Description

4 The gets_s function reads at most one less than the number of characters specified by n from the stream pointed to by stdin, into the array pointed to by s. No additional characters are read after a new-line character (which is discarded) or after end-of-file. The discarded new-line character does not count towards number of characters read. A null character is written immediately after the last character read into the array.

5 If end-of-file is encountered and no characters have been read into the array, or if a read error occurs during the operation, then s[0] is set to the null character, and the other elements of s take unspecified values.

Recommended practice

6 The fgets function allows properly-written programs to safely process input lines too long to store in the result array. In general this requires that callers of fgets pay attention to the presence or absence of a new-line character in the result array. Consider using fgets (along with any needed processing based on new-line characters) instead of gets_s.

[25] The gets_s function, unlike gets, makes it a runtime-constraint violation for a line of input to overflow the buffer to store it. Unlike fgets, gets_s maintains a one-to-one relationship between input lines and successful calls to gets_s. Programs that use gets expect such a relationship.
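On systems that lack Annex K, the behaviour quoted above can be approximated with standard I/O. The following is only a sketch under stated assumptions: the name bounded_gets is hypothetical, it takes the stream as a parameter (the real gets_s always reads stdin), it does not compare n against RSIZE_MAX, and it does not invoke a runtime-constraint handler. For a null s or a zero n it simply returns NULL without consuming input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical approximation of the gets_s behaviour described above.
 * Reads at most n-1 characters of one line from fp into s and discards
 * the new-line. If the line does not fit, s[0] is set to '\0', the rest
 * of the line is discarded, and NULL is returned. */
char *bounded_gets(char *s, size_t n, FILE *fp)
{
    size_t len = 0;
    int ch;

    assert(fp != NULL);
    if (s == NULL || n == 0)
        return NULL;            /* nothing can be stored */

    while ((ch = getc(fp)) != EOF && ch != '\n') {
        if (len == n - 1) {
            /* Line too long: empty the result and skip to end of line. */
            s[0] = '\0';
            while ((ch = getc(fp)) != EOF && ch != '\n')
                ;
            return NULL;
        }
        s[len++] = (char)ch;
    }

    if (ch == EOF && (len == 0 || ferror(fp))) {
        s[0] = '\0';            /* EOF with no data, or a read error */
        return NULL;
    }

    s[len] = '\0';
    return s;
}
```

Because an over-long line is consumed in full, each successful call corresponds to exactly one input line, which is the one-to-one relationship the footnote describes.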

The Microsoft Visual Studio compilers implement an approximation to the TR 24731-1 standard, but there are differences between the signatures implemented by Microsoft and those in the TR.

The C11 standard, ISO/IEC 9899:2011, includes TR 24731 in Annex K as an optional part of the library. Unfortunately, it is seldom implemented on Unix-like systems.


getline() — POSIX

POSIX 2008 also provides a safe alternative to gets() called getline(). It allocates space for the line dynamically, so you end up needing to free it. It therefore removes the limitation on line length. It also returns the length of the data that was read, or -1 (and not EOF!), which means that null bytes in the input can be handled reliably. There is also a 'choose your own single-character delimiter' variation called getdelim(); this can be useful if you are dealing with the output from find -print0, where the ends of the file names are marked with an ASCII NUL '\0' character, for example.
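A typical getline() loop might look like the sketch below. It is illustrative only: the helper name longest_line is hypothetical, and the feature-test macro assumes a POSIX.1-2008 system.

```c
#define _POSIX_C_SOURCE 200809L  /* expose getline() in <stdio.h> */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>           /* ssize_t */

/* Hypothetical helper: return the length in bytes of the longest line in
 * fp, not counting the trailing newline. Lines may be arbitrarily long. */
size_t longest_line(FILE *fp)
{
    char *line = NULL;   /* getline() allocates and grows this buffer */
    size_t cap = 0;      /* current capacity of line, managed by getline() */
    size_t longest = 0;
    ssize_t len;

    assert(fp != NULL);
    while ((len = getline(&line, &cap, fp)) != -1) {
        /* len counts every byte read, including any embedded '\0'. */
        if (len > 0 && line[len - 1] == '\n')
            len--;
        if ((size_t)len > longest)
            longest = (size_t)len;
    }
    free(line);          /* one free() covers all reallocations */
    return longest;
}
```

The same buffer is reused across iterations, so the caller frees it once after the loop rather than once per line.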

#3


21  

Because gets doesn't do any kind of bounds check while reading bytes from stdin and storing them in your buffer. A simple example:

char array1[] = "12345";
char array2[] = "67890";

gets(array1);

Now, first of all, you are allowed to input as many characters as you want; gets won't care. Secondly, the bytes beyond the size of the array in which you put them (in this case array1) will overwrite whatever they find in memory, because gets will write them. In the previous example, this means that if you input "abcdefghijklmnopqrts", it may, unpredictably, also overwrite array2 or whatever else is nearby.

The function is unsafe because it assumes consistent input. NEVER USE IT!
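To put numbers on the example above, the sketch below computes how far an unbounded copy would run past the end of a buffer. The helper bytes_past_end is hypothetical and only does the arithmetic; it never performs the overflowing write itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: how many bytes a gets()-style unbounded copy of
 * `input` would write beyond a buffer of `buffer_size` bytes.
 * The terminating '\0' is counted, because gets() stores one. */
size_t bytes_past_end(const char *input, size_t buffer_size)
{
    assert(input != NULL);
    size_t needed = strlen(input) + 1;   /* characters plus terminator */
    return needed > buffer_size ? needed - buffer_size : 0;
}
```

Here array1 holds 6 bytes ("12345" plus its terminator) and the 20-character input needs 21, so bytes_past_end("abcdefghijklmnopqrts", sizeof "12345") evaluates to 15. What those 15 bytes land on depends on the compiler and platform, which is why the result is unpredictable.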

#4


15  

You should not use gets since it has no way to stop a buffer overflow. If the user types in more data than can fit in your buffer, you will most likely end up with corruption or worse.

In fact, ISO have actually taken the step of removing gets from the C standard (as of C11, though it was deprecated in C99) which, given how highly they rate backward compatibility, should be an indication of how bad that function was.

The correct thing to do is to use the fgets function with the stdin file handle since you can limit the characters read from the user.

But this also has its problems such as:

  • extra characters entered by the user will be picked up the next time around.
  • there's no quick notification that the user entered too much data.

To that end, almost every C coder at some point in their career will write a more useful wrapper around fgets as well. Here's mine:

#include <stdio.h>
#include <string.h>

#define OK       0
#define NO_INPUT 1
#define TOO_LONG 2
static int getLine (char *prmpt, char *buff, size_t sz) {
    int ch, extra;
    size_t len;

    // Get line with buffer overrun protection.
    if (prmpt != NULL) {
        printf ("%s", prmpt);
        fflush (stdout);
    }
    if (fgets (buff, (int) sz, stdin) == NULL)
        return NO_INPUT;

    // If it was too long, there'll be no newline. In that case, we flush
    // to end of line so that excess doesn't affect the next call.
    // The len == 0 check avoids indexing buff[-1] on an empty string.
    len = strlen (buff);
    if (len == 0 || buff[len-1] != '\n') {
        extra = 0;
        while (((ch = getchar()) != '\n') && (ch != EOF))
            extra = 1;
        return (extra == 1) ? TOO_LONG : OK;
    }

    // Otherwise remove newline and give string back to caller.
    buff[len-1] = '\0';
    return OK;
}

with some test code:

// Test program for getLine().

int main (void) {
    int rc;
    char buff[10];

    rc = getLine ("Enter string> ", buff, sizeof(buff));
    if (rc == NO_INPUT) {
        printf ("No input\n");
        return 1;
    }

    if (rc == TOO_LONG) {
        printf ("Input too long\n");
        return 1;
    }

    printf ("OK [%s]\n", buff);

    return 0;
}

It provides the same protections as fgets in that it prevents buffer overflows, but it also notifies the caller as to what happened and clears out the excess characters so that they do not affect your next input operation.

Feel free to use it as you wish, I hereby release it under the "do what you damn well want to" licence :-)

#5


10  

fgets.

To read from stdin:

char string[512];

if (fgets(string, sizeof(string), stdin) != NULL) {
    /* no buffer overflows here, you're safe! */
}

#6


6  

You can't remove API functions without breaking the API. If you did, many applications would no longer compile or run at all.

This is the reason that one reference gives:

Reading a line that overflows the array pointed to by s results in undefined behavior. The use of fgets() is recommended.

#7


4  

I read recently, in a USENET post to comp.lang.c, that gets() is getting removed from the Standard. WOOHOO

You'll be happy to know that the committee just voted (unanimously, as it turns out) to remove gets() from the draft as well.

#8


4  

In C11 (ISO/IEC 9899:201x), gets() has been removed. (It was deprecated in ISO/IEC 9899:1999/Cor.3:2007(E).)

In addition to fgets(), C11 introduces a new safe alternative, gets_s():

C11 K.3.5.4.1 The gets_s function

#define __STDC_WANT_LIB_EXT1__ 1
#include <stdio.h>
char *gets_s(char *s, rsize_t n);

However, in the Recommended practice section, fgets() is still preferred.

The fgets function allows properly-written programs to safely process input lines too long to store in the result array. In general this requires that callers of fgets pay attention to the presence or absence of a new-line character in the result array. Consider using fgets (along with any needed processing based on new-line characters) instead of gets_s.

#9


3  

gets() is dangerous because it is possible for the user to crash the program by typing too much into the prompt. It can't detect the end of the available memory, so if you allocate an amount of memory too small for the purpose, it can cause a segmentation fault and crash. Sometimes it seems very unlikely that a user will type 1000 letters into a prompt meant for a person's name, but as programmers, we need to make our programs bulletproof. (It may also be a security risk if a user can crash a system program by sending too much data.)

fgets() allows you to specify how many characters are taken out of the standard input buffer, so they don't overrun the variable.

#10


2  

I would like to extend an earnest invitation to any C library maintainers out there who are still including gets in their libraries "just in case anyone is still depending on it": Please replace your implementation with the equivalent of

char *gets(char *str)
{
    strcpy(str, "Never use gets!");
    return str;
}

This will help make sure nobody is still depending on it. Thank you.

#11


2  

The C gets function is dangerous and has been a very costly mistake. Tony Hoare singles it out for specific mention in his talk "Null References: The Billion Dollar Mistake":

http://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare

The whole hour is worth watching, but for his comments on this topic, watch from 30 minutes on; the specific gets criticism comes at around 39 minutes.

Hopefully this whets your appetite for the whole talk, which draws attention to how we need more formal correctness proofs in languages and how language designers, not programmers, should be blamed for the mistakes in their languages. Pushing the blame onto programmers seems to have been the whole dubious purpose behind designers of bad languages invoking 'programmer freedom'.
