What is the correct/safest way to escape input in a forum?

Date: 2022-02-12 19:48:20

I am creating a forum software using php and mysql backend, and want to know what is the most secure way to escape user input for forum posts.

I know about htmlentities() and strip_tags() and htmlspecialchars() and mysql_real_escape_string(), and even javascript's escape() but I don't know which to use and where.

What would be the safest way to process these three different types of input (by process, I mean get, save in a database, and display):

  1. A title of a post (which will also be the basis of the URL permalink).

  2. The content of a forum post limited to basic text input.

  3. The content of a forum post which allows html.

I would appreciate an answer that tells me how many of these escape functions I need to use in combination and why. Thanks!

7 answers

#1


8  

When generating HTML output (as you do to get data into the form's fields when someone is editing a post, or when you need to re-display the form because the user forgot one field, for instance), you'd probably use htmlspecialchars() : it will escape <, >, ", ', and & -- depending on the options you give it.

strip_tags will remove tags if user has entered some -- and you generally don't want something the user typed to just disappear ;-)
At least, not for the "content" field :-)


Once you have what the user entered in the form (i.e., when the form has been submitted), you need to escape it before sending it to the DB.
That's where functions like mysqli_real_escape_string become useful : they escape data for SQL.

You might also want to take a look at prepared statements, which might help you a bit ;-)
They are available with mysqli, and with PDO.
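
For what it's worth, a prepared statement with PDO could look roughly like the sketch below. The in-memory SQLite database is only an assumption so that the snippet runs without a server (with MySQL, only the DSN and credentials passed to the PDO constructor would differ), and the `posts` table and its columns are made up for illustration.

```php
<?php
declare(strict_types=1);

// Assumption: SQLite in memory stands in for the MySQL backend.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table layout for forum posts.
$pdo->exec('CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, body TEXT)');

// Hostile-looking input: it is sent as bound data and never becomes SQL text.
$title = "Robert'); DROP TABLE posts;--";
$body  = '<script>alert("xss")</script>';

$insert = $pdo->prepare('INSERT INTO posts (title, body) VALUES (:title, :body)');
$insert->execute([':title' => $title, ':body' => $body]);

// Reading back also goes through a placeholder.
$select = $pdo->prepare('SELECT title, body FROM posts WHERE title = :title');
$select->execute([':title' => $title]);
$row = $select->fetch(PDO::FETCH_ASSOC);
```

Note that the body is stored exactly as typed; making it safe for HTML is a separate step that happens at display time.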

You should not use anything like addslashes : the escaping it does doesn't depend on the database engine ; it is better/safer to use a function that fits the engine (MySQL, PostgreSQL, ...) you are working with : it'll know precisely what to escape, and how.


Finally, to display the data inside a page :

  • for fields that must not contain HTML, you should use htmlspecialchars() : if the user did input HTML tags, those will be displayed as-is, and not injected as HTML.

  • for fields that can contain HTML... This is a bit trickier : you will probably only want to allow a few tags, and strip_tags (which can do that) is not really up to the task (it will let through the attributes of the allowed tags)
    • You might want to take a look at a tool called HTML Purifier : it will allow you to specify which tags and attributes should be allowed -- and it generates valid HTML, which is always nice ^^

    • This might take some time to compute, and you probably don't want to re-generate that HTML each time it has to be displayed ; so you can think about storing it in the database (either only keeping that clean HTML, or keeping both it and the not-clean one, in two separate fields -- might be useful to allow people editing their posts ? )
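
To make the two display cases a bit more concrete, here is a small sketch using only built-in PHP functions. The sample input and variable names are made up; the second half only shows why strip_tags with a list of allowed tags is not enough on its own.

```php
<?php
declare(strict_types=1);

$input = '<b onclick="alert(1)">bold</b> & <script>alert(2)</script>';

// Field that must not contain HTML: escape on output, so the tags are
// shown as text instead of being interpreted by the browser.
$asText = htmlspecialchars($input, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

// Field that may contain HTML: strip_tags() drops <script> (but keeps its
// text content) and leaves the onclick attribute on the allowed <b> tag.
$stripped = strip_tags($input, '<b>');

echo $asText, "\n";
echo $stripped, "\n";
```

The second line of output still carries the onclick handler, which is the gap a dedicated sanitizer with an attribute allow-list is meant to close.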


Those are only a few pointers... hope they help you :-)
Don't hesitate to ask if you have more precise questions !

#2


4  

mysql_real_escape_string() escapes everything you need to put in a mysql database. But you should use prepared statements (in mysqli) instead, because they're cleaner and do any escaping automatically.

Anything else can be done with htmlspecialchars(), to neutralize HTML in the input when it is displayed, and urlencode(), to put things in a format suitable for URLs.
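
As a rough illustration for the post title that becomes a permalink: the sketch below shows what urlencode() and rawurlencode() produce, plus a hypothetical `slugify()` helper (not a PHP built-in, just one common way to derive a readable URL fragment, and it assumes ASCII titles).

```php
<?php
declare(strict_types=1);

$title = 'Hello World & more?';

// Query-string style encoding: spaces become "+".
$query = urlencode($title);      // Hello+World+%26+more%3F

// Path-segment style encoding: spaces become "%20".
$path = rawurlencode($title);    // Hello%20World%20%26%20more%3F

// Hypothetical helper: lower-case, collapse anything that is not a-z or 0-9
// into a single dash, then trim dashes from both ends.
function slugify(string $text): string
{
    $slug = preg_replace('~[^a-z0-9]+~', '-', strtolower($text)) ?? '';
    return trim($slug, '-');
}

$slug = slugify($title);         // hello-world-more
```

Whichever form ends up in the URL, the title should still go through htmlspecialchars() when it is printed inside the page.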

#3


3  

There are two completely different types of attack you have to defend against:

  • SQL injection: input that tries to manipulate your DB. mysql_real_escape_string() and addslashes() are meant to defend against this. The former is better, but parameterized queries are better still.

  • Cross-Site scripting (XSS): input that, when displayed on your page, tries to execute JavaScript in a visitor's browser to do all kinds of things (like steal the user's account data). htmlspecialchars() is the definite way to defend against this.

Allowing "some HTML" while avoiding XSS attacks is very, very hard, because there are endless ways of smuggling JavaScript into HTML. If you decide to do this, the safe way is to use BBCode or Markdown, i.e. a limited set of non-HTML markup that you then convert to HTML, while neutralizing all real HTML with htmlspecialchars(). Even then you have to be careful not to allow javascript: URLs in links. Actually allowing users to input HTML is something you should only do if it's absolutely crucial for your site. And then you should spend a lot of time making sure you understand HTML and JavaScript and CSS completely.
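
A minimal sketch of that idea, assuming a tiny made-up BBCode subset ([b], [i] and [url=...]); the `renderBbcode()` function is hypothetical and a real forum would need more tags and more testing. Everything is escaped first, then only the known tags are turned into HTML, and links are accepted only when they start with http:// or https://.

```php
<?php
declare(strict_types=1);

function renderBbcode(string $raw): string
{
    // 1. Neutralize all real HTML before doing anything else.
    $html = htmlspecialchars($raw, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

    // 2. Convert the small set of allowed tags.
    $html = preg_replace('~\[b\](.*?)\[/b\]~is', '<strong>$1</strong>', $html) ?? $html;
    $html = preg_replace('~\[i\](.*?)\[/i\]~is', '<em>$1</em>', $html) ?? $html;

    // 3. Links: the pattern only matches http:// and https:// URLs, so a
    //    javascript: URL is never turned into a link and stays inert text.
    //    The URL was already escaped in step 1, so it is safe inside href="".
    $html = preg_replace(
        '~\[url=(https?://[^\]\s]+)\](.*?)\[/url\]~is',
        '<a href="$1" rel="nofollow">$2</a>',
        $html
    ) ?? $html;

    return $html;
}

echo renderBbcode('[b]hi[/b] <script>'), "\n";
echo renderBbcode('[url=javascript:alert(1)]x[/url]'), "\n";
```

The first call prints the bold text followed by the escaped script tag; the second prints the BBCode back unchanged because its URL scheme is not on the allow-list.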

#4


1  

The answer to this post is a good answer

Basically, using the PDO interface to parameterize your queries is much safer and less error prone than escaping your inputs manually.

#5


0  

I have a tendency to escape all characters that would be problematic in page display, JavaScript and SQL all at the same time. It leaves the text readable on the web and in HTML email and at the same time removes any problems with the code. A VB.NET line of code would be:

SafeComment = Replace( _
              Replace(Replace(Replace( _
              Replace(Replace(Replace( _
              Replace(Replace(Replace( _
              Replace(Replace(Replace( _
                HttpUtility.HtmlEncode(Trim(strInput)), _
                  ":", "&#x3A;"), "-", "&#x2D;"), "|", "&#x7C;"), _
                  "`", "&#x60;"), "(", "&#x28;"), ")", "&#x29;"), _
                  "%", "&#x25;"), "^", "&#x5E;"), """", "&#x22;"), _
                  "/", "&#x2F;"), "*", "&#x2A;"), "\", "&#x5C;"), _
                  "'", "&#x27;")

#6


0  

First of all, general advice: don't escape variables by hand when inserting in the database. There are plenty of solutions that let you use prepared statements with variable binding. The reason not to escape explicitly is that it is only a matter of time before you forget it just once.

If you're inserting plain text in the database, don't try to clean it on insert, but instead clean it on display. That is to say, use htmlentities to encode it as HTML (and pass the correct charset argument). You want to encode on display because then you're no longer trusting that the database contents are correct, which isn't necessarily a given.

If you're dealing with rich text (html), things get more complicated. Removing the "evil" bits from HTML without destroying the message is a difficult problem. Realistically speaking, you'll have to resort to a standardized solution, like HTMLPurifier. However, this is generally too slow to run on every page view, so you'll be forced to do this when writing to the database. You'll also have to ensure that the user can see their "cleaned up" html and correct the cleaned up version.

Definitely try to avoid "rolling your own" filter or encoding solution at any step. These problems are notoriously tricky, and you run a large risk of overlooking some minor detail that has big security implications.

#7


0  

I second Joeri: do not roll your own. Go here to see some of the many possible XSS attacks:

http://ha.ckers.org/xss.html

htmlentities() -> turns text into html, converting characters to entities. If using UTF-8 encoding then use htmlspecialchars() instead as the other entities are not needed. This is the best defence against XSS. I use it on every variable I output regardless of type or origin unless I intend it to be html. There is only a tiny performance cost and it is easier than trying to work out what needs escaping and what doesn't.
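
A quick sketch of the difference, assuming the page and the source file are both UTF-8 (the sample string is made up): both functions escape the characters that matter for XSS, while htmlentities() additionally rewrites accented letters as named entities.

```php
<?php
declare(strict_types=1);

$text = 'café <b>';

$special  = htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
$entities = htmlentities($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

echo $special, "\n";   // café &lt;b&gt;
echo $entities, "\n";  // caf&eacute; &lt;b&gt;
```

Passing the charset explicitly avoids relying on the default, which has changed between PHP versions.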

strip_tags() - turns html into text by removing all html tags. Use this to ensure that there is nothing nasty in your input, as an adjunct to escaping your output.

mysql_real_escape_string() - escapes a string for mysql and is your defence against SQL injections from little Bobby tables (better to use mysqli and prepare/bind as escaping is then done for you and you can avoid lots of messy string concatenations)

The advice given above about avoiding HTML input unless it is essential, and opting for BBCode or similar (make up your own if need be), is very sound indeed.
