What is the default value of the charset attribute on the script tag?

Time: 2022-11-27 07:49:35

Say I have a script like this: <script type="text/javascript" src="myScript.js"></script>

I've seen some sources online that claim that if the charset attribute is omitted, it defaults to ISO-8859-1. I've seen others that claim it assumes the same encoding as the HTML page that contains the script tag. What's the truth?

I need to know because my JavaScript file contains literal strings that will be inserted into the HTML, and which include non-ASCII characters like the Euro symbol (€). I realize that adding a charset attribute or just HTML encoding these characters should solve my problem, but I'd still like to understand the default behavior.

EDIT: To clarify one point, I need to know not just what the standards say, but how browsers actually act. The behavior described here: http://joconner.com/2008/09/javascript-file-encoding/ seems to suggest that browsers don't always assume ISO-8859-1.

4 answers

#1


6  

The W3C defines a standard way for a browser to determine the character encoding; you can read about it here: http://www.w3.org/TR/html4/charset.html#spec-char-encoding

To sum up, conforming user agents must observe the following priorities when determining a document's character encoding (from highest priority to lowest):

  1. An HTTP "charset" parameter in a "Content-Type" field.
  2. A META declaration with "http-equiv" set to "Content-Type" and a value set for "charset".
  3. The charset attribute set on an element that designates an external resource.

In addition to this list of priorities, the user agent may use heuristics and user settings. For example, many user agents use a heuristic to distinguish the various encodings used for Japanese text. Also, user agents typically have a user-definable, local default character encoding which they apply in the absence of other indicators.
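As a rough illustration of that priority order, here is a minimal sketch. The function name resolveEncoding, its parameter names, and the "windows-1252" local default are all assumptions made up for this example; real browsers also apply heuristics and user settings that are not modelled here.

```javascript
// Hypothetical helper (not a browser API): returns the first encoding label
// that is present, following the quoted priority order, or a local default.
// The "windows-1252" default is only an assumed example of a local setting.
function resolveEncoding({
  httpCharset,
  metaCharset,
  charsetAttribute,
  localDefault = "windows-1252",
} = {}) {
  // Highest priority first: HTTP header, then META, then the charset attribute.
  const candidates = [httpCharset, metaCharset, charsetAttribute];
  const found = candidates.find(
    (label) => typeof label === "string" && label.trim() !== ""
  );
  return (found ?? localDefault).trim().toLowerCase();
}

// The META declaration outranks the charset attribute on the element.
console.log(resolveEncoding({ metaCharset: "UTF-8", charsetAttribute: "ISO-8859-1" }));
// With no indicators at all, the local default is used.
console.log(resolveEncoding());
```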

#2


2  

According to w3schools.com the value is ISO-8859-1 and this is supported across all major browsers.

According to the HTTP 1.1 specification:

When no explicit charset parameter is provided by the sender, media subtypes of the "text" type are defined to have a default charset value of "ISO-8859-1" when received via HTTP. Data in character sets other than "ISO-8859-1" or its subsets MUST be labeled with an appropriate charset value. See section 3.4.1 for compatibility problems.

So anything that doesn't conform to this does not technically follow the HTTP 1.1 specification.
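To make the quoted rule concrete, here is a small sketch of how a receiver could read the charset from a Content-Type value and apply that ISO-8859-1 default for "text" types. The helper name charsetFromContentType is hypothetical, and the parsing is deliberately simplified; it only shows what the quoted text describes, not what any particular browser actually does.

```javascript
// Hypothetical helper: extract the charset parameter from a Content-Type value.
// For "text/*" types without a charset, fall back to ISO-8859-1 as the quoted
// HTTP 1.1 text describes; for other types return null (no default implied).
function charsetFromContentType(contentType) {
  const [mediaType, ...params] = contentType
    .split(";")
    .map((part) => part.trim());
  for (const param of params) {
    const match = /^charset\s*=\s*"?([^";]+)"?$/i.exec(param);
    if (match) return match[1].trim().toLowerCase();
  }
  return mediaType.toLowerCase().startsWith("text/") ? "iso-8859-1" : null;
}

console.log(charsetFromContentType("text/javascript; charset=UTF-8"));
console.log(charsetFromContentType("text/javascript"));
console.log(charsetFromContentType("application/javascript"));
```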

#3


1  

HTML5 4.11.1 The script element:

If the script element has a charset attribute, then let the script block's character encoding for this script element be the result of getting an encoding from the value of the charset attribute.

Otherwise, let the script block's fallback character encoding for this script element be the same as the encoding of the document itself.

The quote links to the DOM Document object, which has an encoding property (document.characterSet).
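The two quoted rules can be sketched roughly as follows. The function scriptFallbackEncoding and the plain-object stand-ins are made up for this example; the label handling is a simplified stand-in for the spec's "getting an encoding" step, which does more normalization than lowercasing.

```javascript
// Hypothetical helper mirroring the two quoted rules. `scriptEl` and `doc` are
// expected to look like an HTMLScriptElement and a Document, respectively.
function scriptFallbackEncoding(scriptEl, doc) {
  const label = scriptEl.getAttribute("charset");
  if (label !== null && label.trim() !== "") {
    // Simplified: the real "getting an encoding" step maps labels to encodings.
    return label.trim().toLowerCase();
  }
  // No charset attribute: use the encoding of the containing document.
  return doc.characterSet.toLowerCase();
}

// Plain-object stand-ins so the sketch also runs outside a browser.
const fakeDoc = { characterSet: "UTF-8" };
const withCharset = {
  getAttribute: (name) => (name === "charset" ? "ISO-8859-1" : null),
};
const withoutCharset = { getAttribute: () => null };

console.log(scriptFallbackEncoding(withCharset, fakeDoc));
console.log(scriptFallbackEncoding(withoutCharset, fakeDoc));
```

In a browser you would pass a real script element and the global document instead of the stand-ins.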

TODO: find how the encoding of that object is determined from the standards.

#4


0  

HTML-encoding strings and passing them into JavaScript variables can cause problems, especially if you use hex codes, since I'm told JS prefers octal.
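For comparison, here is a small sketch of the escape forms involved; the variable names are just for illustration. As far as I know, JavaScript string literals accept \uXXXX and \xXX escapes, while legacy octal escapes are rejected in strict mode, and an HTML entity inside a JavaScript string stays literal text unless something parses it as HTML.

```javascript
"use strict";

// "\u20AC" is a Unicode escape for the Euro sign; the source file stays pure ASCII,
// so the result does not depend on how the script file is decoded.
const euroFromEscape = "\u20AC";

// "\xA3" is a two-digit hex escape (covers U+0000 to U+00FF only), here the pound sign.
const poundFromHex = "\xA3";

// An HTML entity inside a JavaScript string is just six ordinary characters
// until an HTML parser processes it (for example via innerHTML).
const entityText = "&euro;";

console.log(euroFromEscape.codePointAt(0).toString(16)); // prints "20ac"
console.log(poundFromHex.codePointAt(0).toString(16)); // prints "a3"
console.log(entityText.length); // prints 6
```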

If you can use UTF-8 as the charset of your web pages, then JS works with these just fine. I use this a lot and have never needed to define a charset for the included script files.
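To see why a mismatch matters, here is a sketch of what happens to the Euro sign when a UTF-8 script file is read as ISO-8859-1. The decodeLatin1 helper is written for this example and does strict ISO-8859-1 decoding (one byte, one code point); browsers may treat that label differently, so take it as an illustration only.

```javascript
// UTF-8 bytes of the Euro sign, as they would sit in a UTF-8 encoded script file.
const utf8Bytes = new TextEncoder().encode("\u20AC"); // bytes E2 82 AC

// Strict ISO-8859-1 decoding: each byte maps to the code point with the same value.
function decodeLatin1(bytes) {
  return Array.from(bytes, (b) => String.fromCharCode(b)).join("");
}

// Reading the bytes with the wrong encoding yields three unrelated characters.
const misread = decodeLatin1(utf8Bytes);

// Reading them as UTF-8 gives back the single Euro sign.
const correct = new TextDecoder("utf-8").decode(utf8Bytes);

console.log(misread.length); // prints 3
console.log(correct === "\u20AC"); // prints true
```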
