Why does HTML require that multiple spaces show up as a single space in the browser?

Time: 2023-01-14 15:05:09

I have long recognized that any run of whitespace in an HTML file is displayed as a single space. For instance, this:

<p>Hello.        Hello. Hello. Hello.                       Hello.</p>

displays as:

Hello. Hello. Hello. Hello. Hello.

This is perfectly fine: if you need multiple spaces in pre-formatted text, you can just use the <pre> tag. But what is the reason? More precisely, why is this in the HTML specification?

12 answers

#1


33  

Spaces are compacted in HTML because there's a distinction between how HTML is formatted and how it should be rendered. Consider a page like this:

<html>
    <body>
        <a href="mylink">A link</a>
    </body>
</html>

If spaces were not compacted and the HTML was indented using spaces, as in this example, the rendered link would be preceded by several spaces.

#2


15  

To try to address the "why": it may be because HTML was based on SGML, which specified it that way. SGML was in turn based on GML, which dates from the late 1960s. The white space handling could very well stem from data being entered one "card" at a time back then, which could otherwise break up sentences and paragraphs in undesired ways. One difference in the old GML is that it specified two spaces between sentences (like the old typewriter rules), which may have established a precedent that spaces are independent of the markup.

#3


11  

Not only is it in the specification, but there is some sense to it. If spaces weren't compacted, you would have to put all your HTML on a single line. So something like this:

<div>
    <h1>Title</h1>
    <p>
       This is some text
       <a href="#">Read More</a>
    </p>
</div>

Would have some strange alignment with spaces all over the place. The only way to get it right would be to compact that code, which would be difficult to maintain.

#4


11  

"Why are multiple spaces converted to single spaces?"

First, "why" questions are hard to answer. It's in the spec. That's pretty much the end of it.

Consider that there are several kinds of white space.

  • White space between tags. <p>\n<b>hi</b>\n</p>

  • White space in the content within a tag. <p>Hi <i>everyone</i>.</p>

  • White space in a <pre> or CDATA section.

The first two are hard to distinguish. Whitespace between tags, even in XML, is "optional". But when you have what is called a "mixed content model" -- tags intermixed with content -- the subtlety of "between tags" and "in the content but between tags" and "in the content but not between tags" is impossible to sort out.

So they don't sort it out. Whitespace between tags and whitespace in the content is all optional.

#5


9  

As others have said, it's in the HTML specification.

If you want to preserve whitespace in output, you can use the <pre> tag:

<pre>This     text has              extra spaces

and

    newlines</pre>

But this will also generally display the text in a different font.

#6


7  

If browsers did not do this, it could be difficult to format your HTML code to make it easily readable. For example, you might want to format your code like this:

<html>
<body>
    <div>
        I like to indent all content that is inside div tags.
    </div>
</body>
</html>

If the browser does not ignore the eight or so spaces before the text inside the div tag, your webpage might not look the way you intended it to look.

#7


4  

Usually, these design decisions are not documented in any specification and can only be gleaned from working group discussion archives that happen to be publicly accessible, or explained by the spec authors themselves. However, in this particular case, HTML 3.2 does state the following:

Except within literal text (e.g. the PRE element), HTML treats contiguous sequences of white space characters as being equivalent to a single space character (ASCII decimal 32). These rules allow authors considerable flexibility when editing the marked-up text directly. Note that future revisions to HTML may allow for the interpretation of the horizontal tab character (ASCII decimal 9) with respect to a tab rule defined by an associated style sheet.
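
The quoted rule can be sketched in a few lines of Python. This is only an illustration, not how any browser implements it: the function name `collapse_whitespace` is made up here, and the set of characters treated as white space (space, tab, line feed, form feed, carriage return) is an assumption rather than something the quoted passage spells out.

```python
import re

# Assumption: "white space" means these five ASCII characters.
_WHITESPACE_RUN = re.compile(r"[ \t\n\f\r]+")


def collapse_whitespace(text: str) -> str:
    """Replace every contiguous run of white space with one ASCII space (decimal 32)."""
    return _WHITESPACE_RUN.sub(" ", text)


# The paragraph text from the question, with its long runs of spaces.
source = "Hello.        Hello. Hello. Hello.                       Hello."
print(collapse_whitespace(source))  # prints: Hello. Hello. Hello. Hello. Hello.
```

Real rendering is more involved (content inside PRE is left alone, for one), so treat this as a model of the sentence above and nothing more.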

The behavior you see today is of course much more complicated than what was specified in HTML 3.2, but I believe the reasoning still applies. One example of where this flexibility can be useful is when you have a long paragraph that you intend to hard-wrap and indent:

<H1>Lorem ipsum</H1>
<P>Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Fastidii oportere
   consulatu no quo. Vix saepe labores an, pri illud mentitum et, ex suas quas
   duo. Sit utinam volutpat ea, id vis cibo meis dolorum, eam docendi
   accommodare voluptatibus no. Id quaeque electram vim, ut sed singulis
   neglegentur, ne graece alterum has. Simul partiendo quaerendum et his.

If whitespace wasn't collapsed, you would end up with a paragraph with unusually large gaps where the text is hard-wrapped due to the indentation.

No other HTML specification suggests any sort of reasoning behind this design decision. In particular HTML 4 only describes the collapsing behavior, and HTML5 and the living spec both defer to CSS, which doesn't explain anything either. Earlier versions of HTML also do not contain any explanation, although the following excerpt does appear in an example snippet in HTML 2.0:

<OL>
...
  <UL COMPACT>
  ...
  <LI> Whitespace may be used to assist in reading the
       HTML source.
  </UL>
...
</OL>

#8


3  

It's in the HTML spec. It's the part about inter-word spaces being rendered as an ASCII space.

http://www.w3.org/TR/html401/struct/text.html

#9


3  

Simple, it's in the specification.

From the HTML 4.01 specification, section 9.1:

In particular, user agents should collapse input white space sequences when producing output inter-word space.

#10


3  

To answer "why is this in the specification for HTML?", you have to consider the origins of HTML.

Tim Berners-Lee designed HTML for sharing of scientific documents. He based it on pre-existing syntax ideas in SGML, which also has similar treatments of whitespace.

One can imagine that earlier writers of HTML at CERN did so without the aid of WYSIWYG tools, and so the ability to treat whitespace in this way aids legibility of such hand-written source files.

#11


3  

There's also a typographic answer: words and sentences should have only one space between them, regardless of what your typing teacher in school may have told you.

Use One Space Between Sentences

Use A Single Word Space Between Sentences

#12


2  

The HTML specification clearly states that excess whitespace is to be ignored.

If you want to include extra spaces, use either the <pre> tag or &nbsp;
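
Why &nbsp; survives where ordinary spaces do not can be sketched in Python. The helper name `render_text` is hypothetical, and the set of collapsible characters is an assumption; the point is only that &nbsp; decodes to U+00A0, which is not one of the characters being collapsed.

```python
import html
import re

# Assumption: only these ASCII characters are collapsible; U+00A0 is not among them.
COLLAPSIBLE_RUN = re.compile(r"[ \t\n\f\r]+")


def render_text(markup_text: str) -> str:
    """Decode character references, then collapse runs of ordinary white space."""
    decoded = html.unescape(markup_text)  # "&nbsp;" becomes "\u00a0"
    return COLLAPSIBLE_RUN.sub(" ", decoded)


plain = render_text("wide     gap")                # five ordinary spaces
padded = render_text("wide&nbsp;&nbsp;&nbsp;gap")  # three non-breaking spaces

print(repr(plain))   # 'wide gap'
print(repr(padded))  # 'wide\xa0\xa0\xa0gap'
```

The ordinary spaces shrink to one, while the three non-breaking spaces are all kept.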
