What is the best way to generate a CSV (comma-delimited text file) for download in ASP.NET?

Time: 2023-02-05 22:36:31

This is what I've got. It works. But, is there a simpler or better way?

On an ASPX page, I've got the download link...

<asp:HyperLink ID="HyperLinkDownload" runat="server" NavigateUrl="~/Download.aspx">Download as CSV file</asp:HyperLink>

And then I've got the Download.aspx.vb Code Behind...

Public Partial Class Download
    Inherits System.Web.UI.Page

    Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
        'set header
        Response.Clear()
        Response.ContentType = "text/csv"
        Dim FileName As String = "books.csv"
        Response.AppendHeader("Content-Disposition", "attachment;filename=" + FileName)

        'generate file content
        Dim db As New bookDevelopmentDataContext
        Dim Allbooks = From b In db.books _
                       Order By b.Added _
                       Select b
        Dim CsvFile As New StringBuilder
        CsvFile.AppendLine(CsvHeader())
        For Each b As Book In Allbooks
            CsvFile.AppendLine(bookString(b))
        Next

        'write the file
        Response.Write(CsvFile.ToString)
        Response.End()
    End Sub

    Function CsvHeader() As String
        Dim CsvLine As New StringBuilder
        CsvLine.Append("Published,")
        CsvLine.Append("Title,")
        CsvLine.Append("Author,")
        CsvLine.Append("Price")
        Return CsvLine.ToString
    End Function

    Function bookString(ByVal b As Book) As String
        Dim CsvLine As New StringBuilder
        CsvLine.Append(b.Published.ToShortDateString + ",")
        CsvLine.Append(b.Title.Replace(",", "") + ",")
        CsvLine.Append(b.Author.Replace(",", "") + ",")
        CsvLine.Append(Format(b.Price, "c").Replace(",", ""))
        Return CsvLine.ToString
    End Function

End Class

8 solutions

#1


22  

CSV formatting has some gotchas. Have you asked yourself these questions:

  • Does any of my data have embedded commas?
  • Does any of my data have embedded double-quotes?
  • Does any of my data have newlines?
  • Do I need to support Unicode strings?

I see several problems in your code above. The comma thing first of all... you are stripping commas:

CsvLine.Append(Format(b.Price, "c").Replace(",", ""))

Why? In CSV, you should be surrounding anything which has commas with quotes:

CsvLine.Append(String.Format("""{0:c}""", b.Price))

(or something like that... my VB is not very good). If you're not sure whether there are commas, put quotes around it anyway. If there are quotes in the string, you need to escape them by doubling them: " becomes "".

b.Title.Replace("""", """""")

Then surround this by quotes if you want. If there are newlines in your string, you need to surround the string with quotes... yes, literal newlines are allowed in CSV files. It looks weird to humans, but it's all good.
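The quoting rules described here can be pulled together into one small helper. This is only a sketch in C#, not code from the thread: the class name CsvField and the method name Escape are made up for illustration, and it assumes the comma is the delimiter.

```csharp
using System;

public static class CsvField
{
    // Characters that force a field to be quoted.
    private static readonly char[] Special = { ',', '"', '\r', '\n' };

    // Hypothetical helper: returns the value formatted as a single CSV field.
    public static string Escape(string value)
    {
        if (string.IsNullOrEmpty(value))
            return string.Empty;

        // Plain values can be written as-is.
        if (value.IndexOfAny(Special) < 0)
            return value;

        // Double any embedded quotes, then wrap the whole field in quotes.
        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }
}
```

With this, a title such as Smith, John comes out as "Smith, John", and a value containing a line break is kept intact inside its quotes instead of being stripped.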

A good CSV writer requires some thought. A good CSV reader (parser) is just plain hard (and no, regex is not good enough for parsing CSV... it will only get you about 95% of the way there).

And then there is Unicode... or more generally I18N (Internationalization) issues. For example, you are stripping commas out of a formatted price. But that's assuming the price is formatted as you expect it in the US. In France, the number formatting is reversed (periods used instead of commas, and vice versa). Bottom line, use culture-agnostic formatting wherever possible.
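As a rough illustration of culture-agnostic formatting, the sketch below formats a price and a date with CultureInfo.InvariantCulture, so the output does not depend on the server's regional settings. The class and method names (CsvNumbers, FormatPrice, FormatDate) are hypothetical, and dropping the currency symbol is an assumption about what the file's consumer wants.

```csharp
using System;
using System.Globalization;

public static class CsvNumbers
{
    // Hypothetical helper: two decimals, no currency symbol, no thousands separator.
    public static string FormatPrice(decimal price)
    {
        return price.ToString("F2", CultureInfo.InvariantCulture);
    }

    // ISO 8601 style date, written the same way in every culture.
    public static string FormatDate(DateTime date)
    {
        return date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
    }
}
```

Because the "F2" format never emits a group separator, a price like 1234.5 is written as 1234.50 and there is no comma left to strip or quote.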

While the issue here is generating CSV, inevitably you will need to parse CSV. In .NET, the best parser I have found (for free) is Fast CSV Reader on CodeProject. I've actually used it in production code and it is really really fast, and very easy to use!

#2


8  

I pass all my CSV data through a function like this:

Function PrepForCSV(ByVal value As String) As String
    Return String.Format("""{0}""", value.Replace("""", """"""))
End Function

Also, if you're not serving up HTML you probably want an HTTP handler (.ashx file) rather than a full web page. If you create a new handler in Visual Studio, odds are you could just copy and paste your existing code into the main method and it will just work, with a small performance boost for your efforts.
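For reference, a handler version might look roughly like the sketch below. It is in C# rather than the VB of the question, the class name DownloadCsv is invented, and the data rows are left out. It assumes a matching DownloadCsv.ashx file whose only content is the directive <%@ WebHandler Language="C#" Class="DownloadCsv" %>.

```csharp
using System.Web;

// Sketch of a handler class; the name DownloadCsv is illustrative only.
public class DownloadCsv : IHttpHandler
{
    public void ProcessRequest(HttpContext context)
    {
        HttpResponse response = context.Response;
        response.Clear();
        response.ContentType = "text/csv";
        response.AppendHeader("Content-Disposition", "attachment;filename=books.csv");

        // Header row first; the data rows (quoted and escaped) would follow.
        response.Write("Published,Title,Author,Price\r\n");
    }

    // No per-request state is kept, so one instance can serve many requests.
    public bool IsReusable
    {
        get { return true; }
    }
}
```

The download link on the ASPX page would then point at the .ashx file instead of Download.aspx.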

#3


4  

You can create the equivalent of bookString() in the query itself. Here is what I think would be a simpler way.

protected void Page_Load(object sender, EventArgs e)
{
    using (var db = new bookDevelopmentDataContext())
    {
        string fileName = "book.csv";
        // Build one CSV line per book, doubling any quotes inside the text fields.
        var q = from b in db.books
                select string.Format("{0:d},\"{1}\",\"{2}\",{3:F2}", b.Published, b.Title.Replace("\"", "\"\""), b.Author.Replace("\"", "\"\""), b.Price);

        // Records are separated by line breaks, not commas.
        string outstring = string.Join(Environment.NewLine, q.ToArray());

        Response.Clear();
        Response.ClearHeaders();
        Response.ContentType = "text/csv";
        Response.AppendHeader("Content-Disposition", string.Format("attachment;filename={0}", fileName));
        Response.Write("Published,Title,Author,Price" + Environment.NewLine + outstring);
        Response.End();
    }
}

#4


3  

If you want a comma-delimited value converter, there is a third-party open-source library called FileHelpers. I'm not sure which open-source license it is under, but it has helped me quite a lot.

#5


2  

There's a lot of overhead associated with the Page class. Since you're just spitting out a CSV file and have no need for postback, server controls, caching, or the rest of it, you should make this into a handler with an .ashx extension. See here.

#6


1  

In addition to what Simon said, you may want to read the CSV how-to guide and make sure your output doesn't run across any of the gotchas.

To clarify something Simon said:

Then surround this by quotes if you want

Fields that contain doubled up double quotes ("") will need to be completely surrounded with double quotes. There shouldn't be any harm in just wrapping all fields with double quotes, unless you specifically want the parser to strip out leading and trailing whitespace (instead of trimming it yourself).
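If you go the "wrap everything" route, building a whole record is a one-liner. The sketch below is an assumption-laden illustration rather than anything from the thread: CsvRow and Build are made-up names, and null values are written as empty quoted fields.

```csharp
using System;
using System.Linq;

public static class CsvRow
{
    // Hypothetical helper: quotes every field and doubles embedded quotes.
    public static string Build(params string[] fields)
    {
        return string.Join(",", fields
            .Select(f => "\"" + (f ?? string.Empty).Replace("\"", "\"\"") + "\"")
            .ToArray());
    }
}
```

Since every field is quoted, any leading or trailing spaces travel inside the quotes, so trim the values yourself first if you don't want them in the output.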

#7


1  

I use the following method when building a CSV file from a DataTable. ControllerContext just gives access to the response the file is written to. For you it is just going to be the Response object.

public override void ExecuteResult(ControllerContext context)
{
    StringBuilder csv = new StringBuilder(10 * Table.Rows.Count * Table.Columns.Count);

    // Header row: one cleaned column name per column.
    for (int c = 0; c < Table.Columns.Count; c++)
    {
        if (c > 0)
            csv.Append(",");
        DataColumn dc = Table.Columns[c];
        string columnTitleCleaned = CleanCSVString(dc.ColumnName);
        csv.Append(columnTitleCleaned);
    }
    csv.Append(Environment.NewLine);
    foreach (DataRow dr in Table.Rows)
    {
        StringBuilder csvRow = new StringBuilder();
        for (int c = 0; c < Table.Columns.Count; c++)
        {
            if (c != 0)
                csvRow.Append(",");

            object columnValue = dr[c];
            if (columnValue == null)
                csvRow.Append("");
            else
            {
                string columnStringValue = columnValue.ToString();

                string cleanedColumnValue = CleanCSVString(columnStringValue);

                if (columnValue.GetType() == typeof(string) && !columnStringValue.Contains(","))
                {
                    cleanedColumnValue = "=" + cleanedColumnValue; // Prevents a number stored in a string from being shown as 8888E+24 in Excel. Example use is the AccountNum field in CI that looks like a number but is really a string.
                }
                csvRow.Append(cleanedColumnValue);
            }
        }
        csv.AppendLine(csvRow.ToString());
    }

    HttpResponseBase response = context.HttpContext.Response;
    response.ContentType = "text/csv";
    response.AppendHeader("Content-Disposition", "attachment;filename=" + this.FileName);
    response.Write(csv.ToString());
}

// Quotes the value, doubles embedded quotes, and replaces line breaks with spaces.
protected string CleanCSVString(string input)
{
    string output = "\"" + input.Replace("\"", "\"\"").Replace("\r\n", " ").Replace("\r", " ").Replace("\n", " ") + "\"";
    return output;
}

#8


1  

Looking mostly good except in your function "BookString()" you should pass all those strings through a small function like this first:

Private Function formatForCSV(stringToProcess As String) As String
    If stringToProcess.Contains("""") Or stringToProcess.Contains(",") Then
        stringToProcess = String.Format("""{0}""", stringToProcess.Replace("""", """"""))
    End If
    Return stringToProcess
End Function

'So, lines like this:
CsvLine.Append(b.Title.Replace(",", "") + ",")
'would be lines like this instead:
CsvLine.Append(formatForCSV(b.Title) + ",")

The function will format your strings well for CSV. It doubles any quotes and adds quotes around the string if there are either quotes or commas in the string.

Note that it doesn't account for newlines, so it can only safely guarantee good CSV output for strings that you know are free of newlines (inputs from simple one-line text forms, etc.).
