Excel to CSV with UTF-8 encoding

Time: 2023-01-05 19:06:12

I have an Excel file containing some Spanish characters (tildes, etc.) that I need to convert to a CSV file to use as an import file. However, when I do Save As CSV, it mangles the "special" Spanish characters that aren't ASCII. It also seems to do this with the left and right quotes and long dashes, which appear to come from the original user having created the Excel file on a Mac.

Since CSV is just a text file, I'm sure it can handle UTF-8 encoding, so I'm guessing this is an Excel limitation. I'm looking for a way to get from Excel to CSV and keep the non-ASCII characters intact.
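
To make the failure mode concrete, here is a minimal Python sketch of how the same text looks as bytes under Windows-1252 versus UTF-8, and what happens when a reader assumes the wrong one. The sample string is made up for illustration, and cp1252 is only an assumption about which "ANSI" code page Excel used; it is typical for Western European setups but not guaranteed.

```python
# Illustration only: the sample text is hypothetical, and cp1252 is an
# assumed stand-in for whatever "ANSI" code page Excel used on save.
# Escapes: \u00f1 = n with tilde, \u201c/\u201d = curly quotes, \u2014 = long dash.
sample = "a\u00f1o \u201cquoted\u201d \u2014 dash"

as_cp1252 = sample.encode("cp1252")  # one byte per character for this sample
as_utf8 = sample.encode("utf-8")     # several bytes per non-ASCII character

# Reading UTF-8 bytes as cp1252 does not fail; it silently yields mojibake.
mojibake = "\u00f1".encode("utf-8").decode("cp1252")

# Reading cp1252 bytes as UTF-8 is invalid; with errors="replace" the
# character is lost and U+FFFD is substituted instead.
lossy = "\u00f1".encode("cp1252").decode("utf-8", errors="replace")

print(len(as_cp1252), len(as_utf8))
print(repr(mojibake), repr(lossy))
```

Either symptom in the import target (doubled-up accented garbage, or replacement characters) suggests the file and the importer disagree about the encoding, rather than that the data was lost in Excel itself.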

37 solutions

#1


348  

A simple workaround is to use Google Spreadsheet. Paste the data (values only, if you have complex formulas) or import the sheet, then download it as CSV. I just tried a few characters and it works rather well.

NOTE: Google Sheets does have limitations when importing. See here.

NOTE: Be careful with sensitive data in Google Sheets.

EDIT: Another alternative: basically, they use a VB macro or add-ins to force saving as UTF-8. I have not tried any of these solutions, but they sound reasonable.

#2


105  

I've found that OpenOffice's spreadsheet application, Calc, is really good at handling CSV data.

In the "Save As..." dialog, click "Format Options" to choose a different encoding for the CSV. LibreOffice works the same way, AFAIK.

#3


91  

  1. Save the Excel sheet as "Unicode Text (.txt)". The good news is that all the international characters are preserved in UTF-16 (note: not UTF-8). However, the new "*.txt" file is TAB delimited, not comma delimited, and therefore is not a true CSV.

  2. (optional) Unless you can use a TAB-delimited file for import, use your favorite text editor to replace the TAB characters with commas ",".

  3. Import your *.txt file into the target application. Make sure it can accept the UTF-16 format.

If UTF-16 has been properly implemented with support for non-BMP code points, then you can convert a UTF-16 file to UTF-8 without losing information. I leave it to you to find your favourite method of doing so.

I use this procedure to import data from Excel to Moodle.
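
One possible method for that last conversion, sketched in Python: read the tab-delimited "Unicode Text" export as UTF-16 and write it back out as a comma-delimited UTF-8 file. This is a minimal sketch rather than a definitive tool; the function name `unicode_text_to_utf8_csv` and the file names are hypothetical, and it assumes the export starts with a byte-order mark, as Excel's normally does.

```python
import csv


def unicode_text_to_utf8_csv(src_path, dst_path):
    """Convert a tab-delimited UTF-16 text export to a UTF-8 CSV file."""
    # "utf-16" honours the byte-order mark at the start of the file.
    # newline="" lets the csv module deal with line breaks inside quoted cells.
    with open(src_path, "r", encoding="utf-16", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        reader = csv.reader(src, dialect="excel-tab")
        writer = csv.writer(dst)  # default dialect: comma-separated, minimal quoting
        writer.writerows(reader)
```

Called as `unicode_text_to_utf8_csv("file.txt", "file.csv")`. Going through a CSV parser rather than a plain find-and-replace of TAB with "," means a cell that itself contains a comma gets quoted in the output instead of being split into two columns.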

#4


41  

I know this is an old question, but I came upon it while struggling with the same issues as the OP.

Not having found any of the offered solutions a viable option, I set out to discover whether there is a way to do this using just Excel.

Fortunately, I found that the lost-character issue only happens (in my case) when saving from xlsx format to csv format. I tried saving the xlsx file as xls first, then as csv. It actually worked.

Please give it a try and see if it works for you. Good luck.

#5


31  

You can use the iconv command under Unix (also available on Windows as libiconv).

After saving as CSV from Excel, run this on the command line:

iconv -f cp1250 -t utf-8 file-encoded-cp1250.csv > file-encoded-utf8.csv

(remember to replace cp1250 with your encoding).

It is fast and works great for big files such as a post codes database, which cannot be imported into Google Docs (400.000 cells limit).
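
If iconv is not at hand, the same re-encoding can be sketched in a few lines of Python. This is an illustration, not a replacement for iconv; `reencode` is a hypothetical helper name, and the source encoding is something you have to know or guess, exactly as with the `-f` flag above.

```python
def reencode(src_path, dst_path, src_encoding, dst_encoding="utf-8"):
    """Re-encode a text file line by line, similar in spirit to iconv -f ... -t ..."""
    # newline="" on both sides keeps the original line endings untouched.
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        for line in src:  # streaming keeps memory use flat for big files
            dst.write(line)
```

For example, `reencode("file-encoded-cp1250.csv", "file-encoded-utf8.csv", "cp1250")`. A `UnicodeDecodeError` while reading is a hint that the assumed source encoding is wrong.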

#6


21  

The only "easy way" of doing this is as follows. First, realize that there is a difference between what is displayed and what is kept hidden in the Excel .csv file.

(1) Open the Excel file that contains the data (.xls, .xlsx).

(2) In Excel, choose "CSV (Comma Delimited) (*.csv)" as the file type and save as that type.

(3) Open the saved .csv file in Notepad (found under "Programs" and then "Accessories" in the Start menu).

(4) Then choose Save As... At the bottom of the "Save As" box there is a select box labelled "Encoding". Select UTF-8 (do NOT use ANSI or you lose all accents, etc.). After selecting UTF-8, save the file under a slightly different file name from the original.

This file is in UTF-8, retains all characters and accents, and can be imported, for example, into MySQL and other database programs.

This answer is taken from this forum.

#7


20  

Another one I've found useful: "Numbers" allows encoding settings when saving as CSV.

#8


15  

You can do this on a modern Windows machine without third-party software. This method is reliable and handles data that includes quoted commas, quoted tab characters, CJK characters, etc.

1. Save from Excel

In Excel, save the data to file.txt using the type Unicode Text (*.txt).

2. Start PowerShell

Run powershell from the Start menu.

3. Load the file in PowerShell

$data = Import-Csv C:\path\to\file.txt -Delimiter "`t" -Encoding Unicode

4. Save the data as CSV

$data | Export-Csv file.csv -Encoding UTF8 -NoTypeInformation

#9


14  

"nevets1219" is right about Google Docs; however, if you simply "import" the file, it often does not convert it to UTF-8.

But if you import the CSV into an existing Google spreadsheet, it does convert to UTF-8.

Here's a recipe:

  • On the main Docs (or Drive) screen, click the "Create" button and choose "Spreadsheet"
  • From the "File" menu choose "Import"
  • Click "Choose File"
  • Choose "Replace spreadsheet"
  • Choose whichever character you are using as a separator
  • Click "Import"
  • From the "File" menu choose "Download as" -> CSV (current sheet)

The resulting file will be in UTF-8.

#10


8  

For those looking for an entirely programmatic (or at least server-side) solution, I've had great success using catdoc's xls2csv tool.

Install catdoc:

apt-get install catdoc

Do the conversion:

xls2csv -d utf-8 file.xls > file-utf-8.csv

This is blazing fast.

Note that it's important to include the -d utf-8 flag; otherwise it will encode the output in the default cp1252 encoding, and you run the risk of losing information.

Note also that xls2csv only works with .xls files; it does not work with .xlsx files.

#11


7  

What about using PowerShell?

Get-Content 'C:\my.csv' | Out-File 'C:\my_utf8.csv' -Encoding UTF8

#12


5  

As funny as it may seem, the easiest way I found to save my 180MB spreadsheet as a UTF-8 CSV file was to select the cells in Excel, copy them, and paste the content of the clipboard into SublimeText.

#13


3  

A second option to "nevets1219" is to open your CSV file in Notepad++ and do a conversion to ANSI.

Choose in the top menu: Encoding -> Convert to ANSI

#14


3  

I was not able to find a VBA solution for this problem on Mac Excel. There simply seemed to be no way to output UTF-8 text.

So I finally had to give up on VBA, bite the bullet, and learn AppleScript. It wasn't nearly as bad as I had thought.

The solution is described here: http://talesoftech.blogspot.com/2011/05/excel-on-mac-goodbye-vba-hello.html

#15


3  

Assuming a Windows environment, save and work with the file as usual in Excel, but then open the saved Excel file in Gnome Gnumeric (free). Save Gnumeric's spreadsheet as CSV, which (for me, anyway) saves it as UTF-8 CSV.

#16


3  

Easy way to do it: download OpenOffice (here), start the spreadsheet application, and open the Excel file (.xls or .xlsx). Then save it as a Text CSV file. A window opens asking whether to keep the current format or to save in ODF format. Select "keep the current format", and in the next window select the character set that works best for you, according to the language your file is written in. For Spanish, select Western Europe (Windows-1252/WinLatin 1) and the file works just fine. If you select Unicode (UTF-8), it is not going to work with the Spanish characters.

#17


3  

  1. Save the xls file (Excel file) as Unicode text => the file will be saved in text format (.txt)

  2. Change the extension from .txt to .csv (rename the file from XYX.txt to XYX.csv)

#18


2  

Microsoft Excel has an option to export a spreadsheet using Unicode encoding. See the following screenshot.

#19


2  

Easiest way: no need for OpenOffice or Google Docs.

  1. Save your file as "Unicode text file";
  2. now you have a Unicode text file
  3. open it with "notepad" and "Save as" it, selecting "utf-8" or whichever other code page you want
  4. rename the file extension from "txt" to "csv"

Don't open it with MS Office, though! Now you have a tab-delimited CSV file.

#20


2  

I have written a small Python script that can export worksheets as UTF-8.

You just have to provide the Excel file as the first parameter, followed by the sheets that you would like to export (as a comma-separated list). If you do not provide the sheets, the script will export all worksheets present in the Excel file.

#!/usr/bin/env python3

# Export worksheets from an .xlsx file to UTF-8 encoded CSV files.
# Written for Python 3 and assumes a recent openpyxl (one that supports
# read_only=True and iter_rows(values_only=True)).

import csv
import sys

from openpyxl import load_workbook


def get_all_sheets(excel_file):
    # Return the names of every worksheet in the workbook.
    workbook = load_workbook(excel_file, read_only=True, data_only=True)
    return list(workbook.sheetnames)


def csv_from_excel(excel_file, sheets):
    # data_only=True exports cached formula results instead of the formulas.
    workbook = load_workbook(excel_file, read_only=True, data_only=True)
    for worksheet_name in sheets:
        print("Export " + worksheet_name + " ...")

        try:
            worksheet = workbook[worksheet_name]
        except KeyError:
            print("Could not find " + worksheet_name)
            sys.exit(1)

        # The text-mode file does the UTF-8 encoding; newline='' is what the
        # csv module expects. Each file is closed before the next sheet starts.
        with open(worksheet_name + '.csv', 'w', encoding='utf-8', newline='') as csv_file:
            wr = csv.writer(csv_file, quoting=csv.QUOTE_ALL)
            for row in worksheet.iter_rows(values_only=True):
                wr.writerow(row)
        print(" ... done")


if not 2 <= len(sys.argv) <= 3:
    print("Call with " + sys.argv[0] + " <xlsx file> [comma separated list of sheets to export]")
    sys.exit(1)
else:
    if len(sys.argv) == 3:
        sheets = sys.argv[2].split(',')
    else:
        sheets = get_all_sheets(sys.argv[1])
    assert sheets  # at least one sheet name is required
    csv_from_excel(sys.argv[1], sheets)

#21


2  

Under Excel 2016, we have a CSV export option dedicated to the UTF-8 format.

#22


2  

Excel typically saves a csv file with ANSI encoding instead of UTF-8.

One option to correct the file is to use Notepad or Notepad++:

  1. Open the .csv with Notepad or Notepad++.
  2. Copy the contents to your computer clipboard.
  3. Delete the contents from the file.
  4. Change the encoding of the file to UTF-8.
  5. Paste the contents back from the clipboard.
  6. Save the file.

#23


2  

I also came across the same problem, but there is an easy solution for this.

  1. Open your xlsx file in Excel 2016 or higher.
  2. In "Save As" choose this option: "CSV UTF-8 (Comma delimited) (*.csv)"

It works perfectly, and the generated csv file can be imported into any software. I imported this csv file into my SQLite database and it works perfectly with all Unicode characters intact.
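
One detail worth checking with this export: as far as I am aware, Excel's "CSV UTF-8" option writes a byte-order mark at the start of the file, and some importers treat those bytes as part of the first column name. If the consumer is a script you control, a minimal Python sketch for reading such a file could look like this (`read_excel_utf8_csv` is a hypothetical helper name, not part of any library):

```python
import csv


def read_excel_utf8_csv(path):
    """Read a UTF-8 CSV file into a list of rows, tolerating an optional BOM."""
    # "utf-8-sig" strips a leading byte-order mark if present
    # and behaves like plain UTF-8 if there is none.
    with open(path, "r", encoding="utf-8-sig", newline="") as f:
        return list(csv.reader(f))
```

The same codec name works for plain UTF-8 files without the mark, so it is a reasonably safe default when the origin of the CSV is unknown.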

#24


1  

Encoding -> Convert to ANSI encodes the file in the system ANSI code page. UTF-8 is one of the encodings of Unicode. The characters may well be encoded correctly in ANSI, but here we are talking about UTF-8, @SequenceDigitale.

There are faster ways, like exporting as csv (comma delimited), then opening that csv with Notepad++ (free), then Encoding > Convert to UTF-8. But only if you have to do this once per file. If you need to change and export frequently, then the LibreOffice or GDocs solution is best.

#25


1  

Open the .csv file with Notepad++. If you see that your encoding is good (all characters look as they should), press Encoding, then Convert to ANSI. Otherwise, find out what your current encoding is.

#26


1  

Another solution is to open the file in WinWord, save it as txt, and then reopen it in Excel; it should then work.

#27


1  

Save Dialog > Tools Button > Web Options > Encoding Tab

#28


1  

Came across the same problem and googled my way to this post. None of the above worked for me. At last I converted my Unicode .xls to .xml (choose Save as ... XML Spreadsheet 2003) and it produced the correct characters. Then I wrote code to parse the XML and extract the content for my use.
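
The parsing code itself was not posted, so the following is only a sketch of what it might look like in Python, not the author's implementation. It assumes the usual XML Spreadsheet 2003 layout (Workbook > Worksheet > Table > Row > Cell > Data in the `urn:schemas-microsoft-com:office:spreadsheet` namespace); `rows_from_spreadsheetml` is a hypothetical name, and skipped rows (an `ss:Index` on a Row) are not handled.

```python
import xml.etree.ElementTree as ET

SS = "urn:schemas-microsoft-com:office:spreadsheet"
NS = {"ss": SS}


def rows_from_spreadsheetml(source, sheet_index=0):
    """Return one worksheet of an XML Spreadsheet 2003 file as lists of strings.

    source is a file path or a binary file object.
    """
    root = ET.parse(source).getroot()
    worksheet = root.findall("ss:Worksheet", NS)[sheet_index]
    rows = []
    for row in worksheet.iterfind("ss:Table/ss:Row", NS):
        values = []
        for cell in row.iterfind("ss:Cell", NS):
            # ss:Index (1-based) marks a jump over empty cells;
            # pad with empty strings so the columns stay aligned.
            index = cell.get("{%s}Index" % SS)
            if index is not None:
                values.extend([""] * (int(index) - 1 - len(values)))
            data = cell.find("ss:Data", NS)
            values.append("" if data is None else "".join(data.itertext()))
        rows.append(values)
    return rows
```

The returned rows can then be written out with `csv.writer` on a file opened with `encoding="utf-8"` and `newline=""`.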

#29


0  

Another way is to open the UTF-8 CSV file in Notepad, where it will be displayed correctly. Then replace all the "," with tabs. Paste all of this into a new Excel file.

#30


0  

I had the same problem and came across this add-in, and it works perfectly well in Excel 2013, in addition to Excel 2007 and 2010, for which it is advertised.
