How do I write UTF-8 to a CSV file?

Date: 2023-01-05 15:30:13

I am trying to create a text file in CSV format from a PyQt4 QTableWidget. I want to write the text with UTF-8 encoding because it contains special characters. I use the following code:

import codecs
...
myfile = codecs.open(filename, 'w','utf-8')
...
f = result.table.item(i,c).text()
myfile.write(f+";")

It works until a cell contains a special character. I also tried:

myfile = open(filename, 'w')
...
f = unicode(result.table.item(i,c).text(), "utf-8")

But it also stops when a special character appears. I have no idea what I am doing wrong.

6 answers

#1


76  

From your shell run:

pip2 install unicodecsv

And (unlike the original question), presuming you're using Python's built-in csv module, change
import csv
to
import unicodecsv as csv
in your code.

#2


42  

It's very simple for Python 3.x (docs).

import csv

with open('output_file_name', 'w', newline='', encoding='utf-8') as csv_file:
    writer = csv.writer(csv_file, delimiter=';')
    writer.writerow(['my_utf8_string'])  # writerow expects a sequence of fields, not a bare string

For Python 2.x, look here.
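
A slightly fuller Python 3 sketch of the same idea, closer to the table export in the question. The rows list stands in for the QTableWidget contents and the temporary file path is an arbitrary choice; both are assumptions for illustration, not part of the original code.

```python
import csv
import os
import tempfile

# Hypothetical stand-in for the table cells; not taken from the question.
rows = [
    ["Name", "City"],
    ["Zoë", "Köln"],
    ["José", "São Paulo"],
]

path = os.path.join(tempfile.mkdtemp(), "table.csv")

# newline='' lets the csv module control line endings itself.
with open(path, "w", newline="", encoding="utf-8") as csv_file:
    writer = csv.writer(csv_file, delimiter=";")
    writer.writerows(rows)

# Read the file back with the same encoding and delimiter to check the round trip.
with open(path, newline="", encoding="utf-8") as csv_file:
    read_back = list(csv.reader(csv_file, delimiter=";"))
```

If read_back equals rows, the special characters survived the write and the read.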

#3


14  

Use this package, it just works: https://github.com/jdunck/python-unicodecsv.

#4


2  

The examples in the Python documentation show how to write Unicode CSV files: http://docs.python.org/2/library/csv.html#examples

(can't copy the code here because it's protected by copyright)

#5


0  

For me, the UnicodeWriter class from the Python 2 csv module documentation didn't really work, as it breaks the csv.writer.writerow() interface.

For example:

csv_writer = csv.writer(csv_file)
row = ['The meaning', 42]
csv_writer.writerow(row)

works, while:

csv_writer = UnicodeWriter(csv_file)
row = ['The meaning', 42]
csv_writer.writerow(row)

will throw AttributeError: 'int' object has no attribute 'encode'.

As UnicodeWriter obviously expects all column values to be strings, we can convert the values ourselves and just use the default CSV module:

def to_utf8(lst):
    return [unicode(elem).encode('utf-8') for elem in lst]

...
csv_writer.writerow(to_utf8(row))

Or we can even monkey-patch csv_writer to add a write_utf8_row function - the exercise is left to the reader.
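
For comparison, in Python 3 this conversion should not be needed: the built-in csv.writer takes str values directly and stringifies other values with str(). A minimal sketch, assuming an in-memory io.StringIO buffer and a made-up row:

```python
import csv
import io

# In-memory text buffer; newline="" leaves the csv module's line endings untouched.
buffer = io.StringIO(newline="")
csv_writer = csv.writer(buffer)

# Hypothetical row mixing plain text, non-ASCII text and an int.
row = ["The meaning", "é", 42]
csv_writer.writerow(row)

line = buffer.getvalue()
```

When writing to a real file instead, the encoding is chosen once in open(..., encoding='utf-8') rather than per value.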

#6


-1  

A very simple hack is to use the json module instead of csv. For example, instead of csv.writer, just do the following:

    import codecs
    import json

    fd = codecs.open(tempfilename, 'wb', 'utf-8')
    for c in whatever:
        # json.dumps(c) produces ["a", ...]; the slice strips the surrounding brackets
        fd.write(json.dumps(c)[1:-1])
        fd.write('\n')
    fd.close()

Basically, given the list of fields in the correct order, the JSON-formatted string is nearly identical to a CSV line, except for the [ and ] at the start and end respectively. And json seems to be robust to UTF-8 in Python 2.*
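
To see how close the two formats really are, here is a small Python 3 sketch that writes the same row both ways. The row is a made-up example chosen to include a double quote and a non-ASCII character; it is not from the answer above.

```python
import csv
import io
import json

# Hypothetical row: one field contains a double quote, one is non-ASCII.
row = ['a"b', "é", 42]

# The hack from this answer: JSON array text without the brackets.
json_line = json.dumps(row)[1:-1]

# ensure_ascii=False keeps non-ASCII characters instead of \uXXXX escapes.
json_line_raw = json.dumps(row, ensure_ascii=False)[1:-1]

# What the csv module writes for the same row.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerow(row)
csv_line = buffer.getvalue()
```

The outputs differ in how an embedded quote is escaped (backslash in JSON, doubled quote in CSV), and json.dumps escapes non-ASCII characters unless ensure_ascii=False is passed, so a strict CSV reader may not parse the JSON-style line the same way.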
