What is the best way to replace text in a file using C# / .NET?

Time: 2022-09-13 08:01:07

I have a text file that is being written to as part of a very large data extract. The first line of the text file is the number of "accounts" extracted.


Because of the nature of this extract, that number is not known until the very end of the process, but the file can be large (a few hundred megs).


What is the BEST way in C# / .NET to open a file (in this case a simple text file), and replace the data that is in the first "line" of text?


IMPORTANT NOTE: - I do not need to replace a "fixed amount of bytes" - that would be easy. The problem here is that the data that needs to be inserted at the top of the file is variable.


IMPORTANT NOTE 2: - A few people have asked about / mentioned simply keeping the data in memory and then replacing it... however, that's completely out of the question. This process is being updated precisely because it sometimes crashes when loading a few gigs into memory.


6 Solutions


If you can, insert a placeholder on the first line and overwrite it at the end with the actual number, padded with spaces to the placeholder's width.


If that is not an option, write your data to a cache file first. When you know the actual number, create the output file, write the number, and append the data from the cache.



BEST is very subjective. For any smallish file, you can easily open the entire file in memory and replace what you want using a string replace and then re-write the file.


Even for largish files, it would not be that hard to load into memory. In the days of multi-gigs of memory, I would consider hundreds of megabytes to still be easily done in memory.


Have you tested this naive approach? Have you seen a real issue with it?


If this is a really large file (gigabytes in size), I would consider writing all of the data first to a temp file and then write the correct file with the header line going in first and then appending the rest of the data. Since it is only text, I would probably just shell out to DOS:


 TYPE temp.txt >> outfile.txt
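
If you would rather stay inside .NET than shell out, the same temp-file idea can be sketched roughly as below. The class name `HeaderPrepender`, the method `WriteWithHeader` and its parameters are made up for illustration, and it assumes the body was already written to a temp file in an ASCII-compatible encoding. `Stream.CopyTo` copies in fixed-size chunks, so the whole file never has to sit in memory.

```csharp
using System.IO;
using System.Text;

public static class HeaderPrepender
{
    // Hypothetical helper: writes the count as the first line of outputPath,
    // then streams the body from tempPath behind it.
    public static void WriteWithHeader(string tempPath, string outputPath, long accountCount)
    {
        using (FileStream output = File.Create(outputPath))
        {
            // leaveOpen: true keeps 'output' usable after the writer is disposed;
            // UTF8Encoding(false) avoids emitting a byte-order mark.
            using (var header = new StreamWriter(output, new UTF8Encoding(false), 1024, leaveOpen: true))
            {
                header.WriteLine(accountCount);
            }

            // Stream.CopyTo moves the data through a fixed-size buffer,
            // so memory use does not grow with the size of the extract.
            using (FileStream body = File.OpenRead(tempPath))
            {
                body.CopyTo(output);
            }
        }
    }
}
```

The cost is that every byte of the extract gets written twice (once to the temp file, once to the final file), which is the trade-off for not needing a fixed-width header.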


I do not need to replace a "fixed amount of bytes"


Are you sure? If you write a big number to the first line of the file (UInt32.MaxValue or UInt64.MaxValue), then when you find the correct actual number, you can replace that number of bytes with the correct number, but left padded with zeros, so it's still a valid integer. e.g.


Replace  999999 - your "large number placeholder"
With     000100 - the actual number of accounts
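
A minimal sketch of that overwrite, in case it helps. `CountPatcher`, `PatchCount` and the 10-digit width are assumptions for illustration (10 digits is enough for UInt32.MaxValue), and it assumes the file starts with the placeholder, has no byte-order mark, and uses an ASCII-compatible encoding.

```csharp
using System.IO;
using System.Text;

public static class CountPatcher
{
    // Assumed width: 10 digits covers UInt32.MaxValue (4294967295).
    public const int Width = 10;

    // Placeholder to write as the first line when the file is created.
    public static readonly string Placeholder = new string('9', Width);

    // Overwrites the first Width bytes of the file with the zero-padded count.
    public static void PatchCount(string path, uint actualCount)
    {
        // "D10" pads the number with leading zeros to exactly 10 digits.
        string digits = actualCount.ToString("D" + Width);
        byte[] bytes = Encoding.ASCII.GetBytes(digits);

        // FileMode.Open does not truncate, so the rest of the file is untouched.
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Write))
        {
            fs.Position = 0;
            fs.Write(bytes, 0, bytes.Length);
        }
    }
}
```

Because the replacement is exactly as long as the placeholder, nothing after the first line has to move, no matter how large the file is.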


If I understand the question correctly:


What is the BEST way in C# / .NET to open a file (in this case a simple text file), and replace the data that is in the first "line" of text?


How about placing a token such as {UserCount} at the top of the file when it is first created?


Then use TextReader to read the file line by line. If it is the first line, look for {UserCount} and replace it with your value. Write out each line you read using TextWriter.



    int lineNumber = 1;
    int userCount = 1234;
    string line = null;

    using(TextReader tr = File.OpenText("OriginalFile"))
    using(TextWriter tw = File.CreateText("ResultFile"))
    {
        while((line = tr.ReadLine()) != null)
        {
            // Only the first line carries the token
            if(lineNumber == 1)
                line = line.Replace("{UserCount}", userCount.ToString());

            // Copy every line, changed or not, to the result file
            tw.WriteLine(line);
            lineNumber++;
        }
    }




If the extracted file is only a few hundred megabytes, then you can easily keep all of the text in-memory until the extraction is complete. Then, you can write your output file as the last operation, starting with the record count.



OK, earlier I suggested an approach that would be better when dealing with existing files.


However, in your situation you want to create the file and, during the creation process, go back to the top and write out the user count. This will do just that.


Here is one way to do it that saves you from having to write a temporary file.


    // Requires: using System.IO; using System.Text;
    private void WriteUsers()
    {
        // Reserved header width: "User Count: " (12 chars) plus up to 10 digits
        const int headerWidth = 22;
        int userCounter = 0;

        using(StreamWriter sw = File.CreateText("myfile.txt"))
        {
            // Write a line of spaces and return
            // Note this line will later contain our user count.
            sw.WriteLine(new string(' ', headerWidth));

            // Write out the records and keep track of the count
            for(int i = 1; i < 100; i++)
            {
                sw.WriteLine("User" + i);
                userCounter++;
            }

            // Flush buffered text, then get the base stream and set the position to 0
            sw.Flush();
            sw.BaseStream.Position = 0;

            // Pad to the reserved width so the line break and the records after it stay intact
            string userCountString = ("User Count: " + userCounter).PadRight(headerWidth);
            byte[] userCountBytes = Encoding.ASCII.GetBytes(userCountString);

            sw.BaseStream.Write(userCountBytes, 0, userCountBytes.Length);
        }
    }


