Using UTF-8 characters in a Jinja2 template

Date: 2022-08-08 16:51:35

I'm trying to use UTF-8 characters when rendering a template with Jinja2. Here is what my template looks like:

<!DOCTYPE HTML>
<html manifest="" lang="en-US">
<head>
    <meta charset="UTF-8">
    <title>{{title}}</title>
...

The title variable is set something like this:

index_variables = {'title':''}
index_variables['title'] = myvar.encode("utf8")

template = env.get_template('index.html')
index_file = open(preview_root + "/" + "index.html", "w")

index_file.write(
    template.render(index_variables)
)
index_file.close()

Now, the problem is that myvar is a message read from a message queue and can contain non-ASCII characters (e.g. "Séptimo Cine").

The rendered template looks something like:

...
    <title>S\u00e9ptimo Cine</title>
...

and I want it to be:

...
    <title>Séptimo Cine</title>
...

I have tried several things, but I can't get this to work.

  • I tried setting the title variable without .encode("utf8"), but that raises an exception (ValueError: Expected a bytes object, not a unicode object), so my guess is that the initial message is unicode.

  • I used chardet.detect to get the encoding of the message (it reports "ascii"), then tried myvar.decode("ascii").encode("cp852"), but the title is still not rendered correctly.

  • I also made sure that my template is a UTF-8 file, but it didn't make a difference.

Any ideas on how to do this?

4 answers

#1


28  

TL;DR:

  • Pass unicode to template.render()
  • Encode the rendered unicode result to a bytestring before writing it to a file

This had me puzzled for a while. Because you do

index_file.write(
    template.render(index_variables)
)

in one statement, that's basically just one line as far as Python is concerned, so the traceback you get is misleading: the exception I got when recreating your test case didn't happen in template.render(index_variables), but in index_file.write() instead. So splitting the code up like this

output = template.render(index_variables)
index_file.write(output)

was the first step in diagnosing exactly where the UnicodeEncodeError happens.

Jinja returns unicode when you let it render the template. Therefore you need to encode the result to a bytestring before you can write it to a file:

index_file.write(output.encode('utf-8'))
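
As a rough illustration of that write step, here is a minimal sketch that does not use Jinja at all. The `output` string is only a stand-in for what template.render() would return, and the temporary path is made up for the example. It shows two ways to get UTF-8 bytes on disk: encoding by hand and writing in binary mode, or letting io.open do the encoding.

```python
import io
import os
import tempfile

# Stand-in for the text that template.render() would return (an assumption
# for this sketch; any unicode string behaves the same way).
output = u"<title>S\u00e9ptimo Cine</title>"

# Hypothetical output location in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), "index.html")

# Option 1: encode explicitly and write the bytes to a binary-mode file.
with io.open(path, "wb") as index_file:
    index_file.write(output.encode("utf-8"))

with io.open(path, "rb") as f:
    explicit_bytes = f.read()

# Option 2: open the file with an encoding and write the unicode text directly.
with io.open(path, "w", encoding="utf-8") as index_file:
    index_file.write(output)

with io.open(path, "rb") as f:
    implicit_bytes = f.read()
```

Both variants should leave the same bytes in the file; which one to use is mostly a matter of taste.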

The second error is that you pass a UTF-8 encoded bytestring to template.render(), whereas Jinja wants unicode. So assuming your myvar contains UTF-8, you need to decode it to unicode first:

index_variables['title'] = myvar.decode('utf-8')
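
To make the decoding step concrete, here is a small sketch in Python 3 syntax, where bytes plays the role of the bytestring and str is unicode. The sample payloads are invented for the example. The last part is only a guess on my side: since the question says chardet reports "ascii" and the output shows a literal \u00e9, the queue might be delivering escaped text, in which case the "unicode_escape" codec would be the one to try.

```python
# Bytes as they might arrive from the queue (hypothetical sample data).
raw = b"S\xc3\xa9ptimo Cine"

# Decode the UTF-8 bytes to unicode text before handing them to the template.
title = raw.decode("utf-8")

# Decoding with the wrong codec raises an error instead of producing text.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Assumption: if the payload is pure ASCII containing a literal backslash-u
# escape, the "unicode_escape" codec turns that escape into the character.
escaped = b"S\\u00e9ptimo Cine"
unescaped = escaped.decode("unicode_escape")
```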

So, to put it all together, this works for me:

# -*- coding: utf-8 -*-

from jinja2 import Environment, PackageLoader
env = Environment(loader=PackageLoader('myproject', 'templates'))


# Make sure we start with an utf-8 encoded bytestring
myvar = 'Séptimo Cine'

index_variables = {'title':''}

# Decode the UTF-8 string to get unicode
index_variables['title'] = myvar.decode('utf-8')

template = env.get_template('index.html')

with open("index_file.html", "w") as index_file:
    output = template.render(index_variables)

    # jinja returns unicode - so `output` needs to be encoded to a bytestring
    # before writing it to a file
    index_file.write(output.encode('utf-8'))
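
The script above is Python 2. For comparison, a sketch of the same flow in Python 3 might look like the following. It assumes Jinja2 is installed, builds the template from an inline string instead of a package loader to stay self-contained, and writes to a made-up temporary path.

```python
import io
import os
import tempfile

from jinja2 import Template

# Inline template used only for this sketch; the original loads index.html.
template = Template(u"<title>{{ title }}</title>")

# Hypothetical UTF-8 encoded payload, standing in for the queue message.
myvar = b"S\xc3\xa9ptimo Cine"

# Decode the bytes so that Jinja receives unicode text.
index_variables = {"title": myvar.decode("utf-8")}
output = template.render(index_variables)

# Hypothetical output location in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), "index_file.html")

# Let the file object encode the rendered text as UTF-8 on the way out.
with io.open(path, "w", encoding="utf-8") as index_file:
    index_file.write(output)

# Read the file back to check what was written.
with io.open(path, "r", encoding="utf-8") as f:
    written = f.read()
```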

#2


4  

Try changing your render command to this...

template.render(index_variables).encode( "utf-8" )

Jinja2's documentation says "This will return the rendered template as unicode string."

http://jinja.pocoo.org/docs/api/?highlight=render#jinja2.Template.render

Hope this helps!

#3


0  

And if nothing works because you have a mix of languages, as in my case, just replace "utf-8" with "utf-16".

All the encoding options are listed here:

https://docs.python.org/2.4/lib/standard-encodings.html
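
For what it's worth, both codecs can represent any Unicode text, so a quick round-trip check like the sketch below can help tell whether the encoding itself is the problem. The mixed-language sample string is made up for the example. Keep in mind that the template in the question declares charset UTF-8, so a file written as UTF-16 would no longer match that declaration.

```python
# Hypothetical sample mixing Spanish, Chinese and Russian text.
mixed = u"S\u00e9ptimo Cine \u7535\u5f71 \u043a\u0438\u043d\u043e"

# Encode with both codecs and decode again.
as_utf8 = mixed.encode("utf-8")
as_utf16 = mixed.encode("utf-16")

roundtrip_utf8 = as_utf8.decode("utf-8")
roundtrip_utf16 = as_utf16.decode("utf-16")
```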

#4


-3  

Add the following lines to the beginning of your script and it will work fine without any further changes:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import sys
reload(sys)
sys.setdefaultencoding("utf-8")
