Should I sanitize HTML markup for a hosted CMS?

I am looking at starting a hosted CMS-like service for customers.

As such, it would require customers to input text that is served to anyone who visits their site. I am planning on using Markdown, possibly in combination with WMD (the live Markdown preview that SO uses), for the big blocks of text.

Now, should I be sanitizing their input for HTML? Given that only a handful of people would be editing their 'CMS', all of them paying customers, should I be stripping out the bad HTML, or should I just let them run wild? After all, it is their 'site'.

Edit: The main reason I would do it is to let them use their own JavaScript, and have their own CSS, divs, and whatnot in the output.

5 Solutions

#1

Why wouldn't you sanitize the input?

If you don't, you're inviting calamity - to either your customer or yourself or both.

#2

Your question asks:

"Edit: The main reason I would do it is to let them use their own JavaScript, and have their own CSS, divs, and whatnot in the output."

If you allow users to supply arbitrary JavaScript, then sanitizing input is not worth the effort. The definition of Cross-Site Scripting (XSS) is basically "users can supply JavaScript and some users are bad".

Now, some websites do allow users to supply JavaScript and they mitigate the risk in one of two ways:

Host each user's CMS under a different domain. Blogger and Tumblr do this (e.g. myblog.blogspot.com vs. blogger.com) to prevent one user's templates from stealing other users' cookies. You have to know what you are doing and never host any user content under the root domain.

If user content is never shared between users, then it does not matter what script malicious users supply. However, CMSs are about sharing content, so this probably doesn't apply here.
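
As a rough sketch of the first option, the check below refuses to serve user-supplied markup from the host where customers log in. The domain names `app.cms-admin.example` and `cms-sites.example` and both function names are hypothetical, chosen only for illustration; a real deployment would also need to keep session cookies scoped to the admin host.

```python
# Hypothetical hosts: neither name comes from the answer above.
ADMIN_HOST = "app.cms-admin.example"   # where customers log in and edit
SITES_SUFFIX = ".cms-sites.example"    # where published customer sites live


def classify_host(host: str) -> str:
    """Return 'admin', 'site', or 'unknown' for a request's Host header."""
    # Normalize case, drop an optional port and a trailing dot.
    host = host.lower().split(":", 1)[0].rstrip(".")
    if host == ADMIN_HOST:
        return "admin"
    if host.endswith(SITES_SUFFIX):
        label = host[: -len(SITES_SUFFIX)]
        # Accept exactly one non-empty label, e.g. "acme" in
        # acme.cms-sites.example; reject nested or empty labels.
        if label and "." not in label:
            return "site"
    return "unknown"


def may_serve_user_content(host: str) -> bool:
    """User-supplied HTML/JS is only ever served from a customer site host."""
    return classify_host(host) == "site"
```

With this split, a script on `acme.cms-sites.example` runs in a different origin from the admin application, so it cannot read the admin session directly.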

There are some blacklist filters out there that may work, but they only work today. The HTML spec and browsers change regularly, which makes such filters almost impossible to maintain. Blacklisting is a surefire way to end up with both security and functional problems.
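
To illustrate why blacklists tend to fail, here is a deliberately naive filter (a made-up example, not any real library's filter) that strips only `<script>` blocks. It handles the obvious case and misses two well-known bypasses.

```python
import re

# A deliberately naive blacklist: remove <script>...</script> blocks only.
SCRIPT_BLOCK = re.compile(r"<script\b.*?</script\s*>", re.IGNORECASE | re.DOTALL)


def naive_blacklist(html: str) -> str:
    """Strip script blocks in a single pass and leave everything else alone."""
    return SCRIPT_BLOCK.sub("", html)


# The obvious payload is removed.
blocked = naive_blacklist("<p>hi</p><script>alert(1)</script>")

# An event-handler attribute is not a <script> tag, so it passes untouched.
bypassed = naive_blacklist('<img src="x" onerror="alert(1)">')

# Removing the inner block splices the outer fragments into a new script tag.
reassembled = naive_blacklist("<scr<script></script>ipt>alert(1)</script>")
```

Each bypass can be patched individually, but the list of patterns to block keeps growing, which is the maintenance problem described above.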

When dealing with user data, always treat it as untrusted. If you don't address this early in the product and your scenarios later change, it is almost impossible to go back and find all of the XSS points, or to modify the product to prevent XSS, without upsetting your users.
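
One way to treat data as untrusted from the start is to escape every user-supplied value at the point where it is written into HTML. The sketch below uses Python's standard `html.escape`; the `render_field` helper and its parameters are hypothetical.

```python
from html import escape


def render_field(label: str, value: str) -> str:
    """Render a labelled value, escaping both parts so markup stays inert."""
    return f"<p><b>{escape(label)}</b>: {escape(value)}</p>"


rendered = render_field("Bio", "<script>alert(1)</script>")
```

Here the script tag reaches the browser as visible text rather than as executable markup.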

#3

You would also be protecting against disgruntled employees, cross-customer attacks, or any other sort of idiotic behavior.

You should always sanitize, no matter the users or viewers.

#4

At least parse their entry and only allow a certain "safe" subset of HTML tags.
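
A minimal sketch of that idea, using only Python's standard `html.parser`. The allowlist contents and the `sanitize` function are assumptions made for illustration, and a maintained sanitizer library is usually a safer choice than hand-rolled code like this.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist; a real one would be chosen per product.
ALLOWED_TAGS = {"p", "b", "i", "em", "strong", "ul", "ol", "li", "a", "br"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_SCHEMES = ("http://", "https://", "mailto:")  # relative URLs are dropped too
VOID_TAGS = {"br"}
DROP_CONTENT = {"script", "style"}  # drop these tags and everything inside them


class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tags are dropped, their text is kept
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, set()) or value is None:
                continue
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue  # rejects javascript: and other unexpected schemes
            kept.append(f' {name}="{escape(value)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if not self.skip_depth and tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data))  # text is always re-escaped


def sanitize(html: str) -> str:
    """Return html with only allowlisted tags and attributes kept."""
    parser = AllowlistSanitizer()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Because the filter names what is allowed rather than what is forbidden, a new browser feature or tag is rejected by default until someone deliberately adds it.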

#5

I think you should always sanitize the input. Most people use a CMS because they don't want to create their own website from scratch and they want easy access to edit their pages. These users most likely will not be trying to put in text that would get sanitized, but by protecting against it you are protecting their users.
