Security benefits of encoding HTML special characters in JSON responses

Question:

I recently received a recommendation from a third party to encode HTML special characters in all server responses "for security reasons". So:

' --> &#x27;
& --> &amp;

For example:

{ "id": 1, "name": "Miles O&#x27;Brien" }

Is there a security gain in doing this, or is it just paranoia?

& --> &amp;

Are you sure this was the kind of encoding they meant?

There is a reason to encode HTML-special characters returned inside JSON responses: to avoid XSS caused by unwanted type-sniffing. For example, if you had:

{ "name": "<body>Mister <script>...</script>" }

and an attacker included a link to your JSON-returning resource in an HTML context (e.g. iframe src), then a stupid browser might decide that, because of the giveaway string <body>, your document was not a JSON object but an HTML document. It could then execute the script in your security context, leading to an XSS vulnerability.

The solution to this is to use JSON string literal escaping, for example:

{ "name": "\u003Cbody\u003EMister \u003Cscript\u003E...\u003C/script\u003E" }
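A minimal sketch of how a server could produce output like that, assuming a JavaScript runtime such as Node.js. The helper name toSafeJson is hypothetical, and escaping & alongside < and > is an assumption rather than something stated above. It relies on the fact that in serialized JSON these characters can only occur inside string literals, so replacing them after JSON.stringify leaves the structure untouched.

```javascript
// Hypothetical helper: serialize to JSON, then replace the HTML-significant
// characters with JSON \uXXXX string escapes.
function toSafeJson(value) {
  return JSON.stringify(value).replace(/[<>&]/g, (ch) =>
    "\\u" + ch.charCodeAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

const body = toSafeJson({ name: "<body>Mister <script>...</script>" });
console.log(body);

// A JSON parser decodes the escapes, so the client still sees the original string.
console.log(JSON.parse(body).name === "<body>Mister <script>...</script>");
```

Because the escapes are part of JSON itself, no client-side change is needed: any conforming parser returns the same string it would have returned without them.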

Using HTML-escaping in this context, whilst it avoids the problem, has the side effect of changing the meaning of the strings. "Miles O&#x27;Brien" read by a JSON parser is still Miles O&#x27;Brien with the ampersand-hash-x-twenty-seven in it, so if you write that value to the page using the likes of .value, .textContent or jQuery .text(), it's going to look weird.
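To make the difference concrete, here is a small sketch that parses the same record twice, once with the apostrophe HTML-escaped and once with it JSON-escaped. The variable names are illustrative only.

```javascript
// The server HTML-escaped the apostrophe before placing it in the JSON.
const htmlEscaped = JSON.parse('{ "id": 1, "name": "Miles O&#x27;Brien" }');

// The server used a JSON string escape for the same character instead.
const jsonEscaped = JSON.parse('{ "id": 1, "name": "Miles O\\u0027Brien" }');

// The JSON parser knows nothing about HTML entities, so the entity text
// stays in the value; the \u0027 escape is decoded to a real apostrophe.
console.log(htmlEscaped.name, htmlEscaped.name.length);
console.log(jsonEscaped.name, jsonEscaped.name.length);
```

Writing the first value to the page through a plain-text property would show the entity characters literally, which is the weirdness described above.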

Now, if you were assigning that string to .innerHTML or jQuery .html() instead, then yes, you'd definitely need to HTML-escape it at some point, regardless of the JSON XSS problem. However, I'd suggest that, for separation-of-concerns reasons, that point should be at the client end, where you're actually injecting the content into HTML markup, rather than on the server side generating the JSON. In general it is better to avoid injecting strings into markup at all when safer DOM-style methods are available.
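A rough sketch of what escaping at the client end could look like. The names escapeHtml, showNameAsText and showNameAsMarkup are hypothetical, and the set of escaped characters is an assumption suited to element content and quoted attribute values, not a complete sanitizer.

```javascript
// Hypothetical helper: HTML-escape text right before it goes into markup.
const HTML_ESCAPES = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#x27;" };

function escapeHtml(text) {
  // One pass over the string, so already-produced entities are not escaped twice.
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// Preferred: a DOM-style property treats the value as plain text,
// so no escaping is needed at all.
function showNameAsText(element, name) {
  element.textContent = name;
}

// Only when markup really has to be built from strings:
// escape at the point of injection.
function showNameAsMarkup(element, name) {
  element.innerHTML = "<b>" + escapeHtml(name) + "</b>";
}
```

In a browser, element would be a node obtained from the page (for example via document.querySelector), and name would come straight from the parsed JSON, unescaped.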