How to replace lowercase characters with uppercase in a text file

Date: 2022-09-06 22:47:32

I'm converting each lowercase character into uppercase, but it's not replacing in the file. What am I doing wrong?

#include <stdio.h>

int main() {
    FILE *file;
    char ch;
    file = fopen("file.txt", "r+");
    if (file == NULL) {
        printf("Error");
        exit(1);
    } else {
        while ((ch = fgetc(file)) != EOF) {
            if (ch >= 96 && ch <= 123) {
                ch = ch - 32;
                putc(ch, file);
            }
        }
        fclose(file);
        return 0;
    }
}

4 solutions

#1


1  

The issue is that with fopen/'+', switching from reading to writing requires an intermediate call to a file positioning function, e.g. fseek:

7.21.5.3 The fopen function

(7) When a file is opened with update mode ('+' as the second or third character in the above list of mode argument values), both input and output may be performed on the associated stream. However, output shall not be directly followed by input without an intervening call to the fflush function or to a file positioning function (fseek, fsetpos, or rewind), and input shall not be directly followed by output without an intervening call to a file positioning function, unless the input operation encounters end-of-file. Opening (or creating) a text file with update mode may instead open (or create) a binary stream in some implementations.

So you probably have to write something like

fseek(file, -1L, SEEK_CUR);  /* step back over the character just read */
putc(ch, file);
fseek(file, 0L, SEEK_CUR);   /* required before the next read */

Note, however, that replacing characters one by one in a (possibly) large file is rather inefficient. The preferred way would be to read in one file, do the conversions and write to another file, and - in the end - replace the one file with the other one. If that cannot be achieved, try at least to read in / convert / write back larger chunks (not byte by byte).
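
The copy-and-replace approach described above could be sketched roughly as follows. This is only an illustration, not code from the original answer: the function name upcase_via_copy and the tmp path parameter are made up for this example, and it assumes the temporary file can be created next to the original.

```c
#include <ctype.h>
#include <stdio.h>

/* Copy src to tmp with every byte passed through toupper(), then
 * replace src with tmp. Returns 0 on success, -1 on failure.
 * The function name and the tmp parameter are illustrative only. */
int upcase_via_copy(const char *src, const char *tmp) {
    FILE *in = fopen(src, "rb");
    if (in == NULL)
        return -1;
    FILE *out = fopen(tmp, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    int ch;                 /* int, so EOF stays distinguishable */
    int ok = 1;
    while ((ch = fgetc(in)) != EOF) {
        if (fputc(toupper(ch), out) == EOF) {
            ok = 0;
            break;
        }
    }
    if (ferror(in))
        ok = 0;
    fclose(in);
    if (fclose(out) == EOF)
        ok = 0;

    if (!ok) {
        remove(tmp);        /* leave the original untouched on failure */
        return -1;
    }
    /* rename() onto an existing file is implementation-defined in ISO C
     * (POSIX replaces the target), so the original is removed first. */
    remove(src);
    return rename(tmp, src) == 0 ? 0 : -1;
}
```

A call such as upcase_via_copy("file.txt", "file.tmp") would then do the whole conversion. Removing the original before renaming leaves a short window in which only the temporary file exists; on POSIX systems the remove() call can be dropped because rename() replaces the target directly.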

Just to check whether converting byte by byte is really inefficient, I compared it to a chunk-by-chunk approach. It turns out, at least on my machine, that chunk-by-chunk is about 500 times faster than byte-by-byte in this test. The file size is about 100k:

#include <stdio.h>
#include <time.h>

int main(void) {
    FILE *file;
    int ch;
    file = fopen("/Users/stephanl/Desktop/ImplantmedPic_Storeblok.jpg", "rb+");
    if (file == NULL) {
        printf("Error: cannot open file\n");
        return 1;
    } else {
        clock_t t1 = clock();

        // As written, variant 1 leaves the stream at end-of-file,
        // so comment out one variant in order to time the other.

        // Variant 1 (chunk by chunk) takes about 900 ticks:
        unsigned char buffer[1000];
        size_t l;
        while (1) {
            long fpi = ftell(file);    // start of the current chunk
            l = fread(buffer, 1, sizeof buffer, file);
            if (l == 0)
                break;

            for (size_t i = 0; i < l; i++) {
                buffer[i] = ~buffer[i];
            }

            fseek(file, fpi, SEEK_SET);    // read -> write
            fwrite(buffer, 1, l, file);
            fseek(file, 0L, SEEK_CUR);     // write -> read
        }

        // Variant 2 (byte by byte) takes about 500,000 ticks:
        while ((ch = fgetc(file)) != EOF) {
            fseek(file, -1L, SEEK_CUR);
            putc(~ch, file);
            fseek(file, 0L, SEEK_CUR);
        }

        fclose(file);
        clock_t t2 = clock();
        printf("difference: %ld\n", (long)(t2 - t1));
        return 0;
    }
}
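
The benchmark above inverts bytes rather than changing case. Applied to the actual question, the chunk-by-chunk variant might look something like the sketch below. The function name upcase_in_place and the 4096-byte buffer size are arbitrary choices for illustration, not part of the original answer.

```c
#include <ctype.h>
#include <stdio.h>

/* Uppercase a file in place, one buffer at a time.
 * Returns 0 on success, -1 on failure.
 * The name and the buffer size are illustrative only. */
int upcase_in_place(const char *path) {
    FILE *file = fopen(path, "rb+");   /* binary mode: offsets are byte counts */
    if (file == NULL)
        return -1;

    unsigned char buffer[4096];
    int status = 0;
    for (;;) {
        long start = ftell(file);      /* where this chunk begins */
        if (start < 0) {
            status = -1;
            break;
        }

        size_t n = fread(buffer, 1, sizeof buffer, file);
        if (n == 0) {
            if (ferror(file))
                status = -1;
            break;
        }

        for (size_t i = 0; i < n; i++)
            buffer[i] = (unsigned char)toupper(buffer[i]);

        /* read -> write: seek back to the start of the chunk;
         * write -> read: a positioning call is needed again afterwards */
        if (fseek(file, start, SEEK_SET) != 0 ||
            fwrite(buffer, 1, n, file) != n ||
            fseek(file, 0L, SEEK_CUR) != 0) {
            status = -1;
            break;
        }
    }
    if (fclose(file) == EOF)
        status = -1;
    return status;
}
```

Each chunk costs two fseek() calls instead of two per byte, which is presumably where most of the measured difference comes from.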

#2


3  

You have to open another file to write.

fileOut = fopen("fileOut.txt", "w");

ch must be an int.

int ch;

Check the man page like this:

#man fgetc

And:

putc(ch,fileOut); 

should be outside the if block.

#3


1  

There are multiple problems in your code:

  • opening the file in read+update mode ("r+") is tricky: you must call fseek() between read and write operations, otherwise the behavior is undefined. Furthermore, you must open the file in binary mode for fseek() to operate correctly and portably on byte offsets.

  • ch must be defined as int for EOF to be properly detected.

  • hardcoding the values of 'a' and 'z' as 96 and 123 is non-portable and error-prone. In fact, 'a' is 97 in ASCII, not 96, and 'z' is 122, not 123. Use the functions from <ctype.h> for best portability and readability.

  • offsetting by 32 only works for ASCII; use toupper() instead.

  • you forgot to include <stdlib.h> that declares the exit() function.

Here is a corrected (simplistic and inefficient) version:

#include <ctype.h>
#include <stdio.h>

int main(void) {
    FILE *file;
    int ch;
    file = fopen("file.txt", "rb+");
    if (file == NULL) {
        printf("Error: cannot open file.txt\n");
        return 1;
    } else {
        while ((ch = fgetc(file)) != EOF) {
            if (islower(ch)) {
                /* step back over the byte just read and overwrite it */
                fseek(file, -1L, SEEK_CUR);
                putc(toupper(ch), file);
                /* a positioning call is required before reading again */
                fseek(file, 0L, SEEK_CUR);
            }
        }
        fclose(file);
        return 0;
    }
}
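
To illustrate the point about declaring ch as int, here is a small sketch; both helper names are invented for this example. fgetc() returns either a byte value in the range 0..UCHAR_MAX or the negative value EOF, and only an int can hold all of these without collisions.

```c
#include <stdio.h>

/* Count the bytes in a file, keeping fgetc()'s result in an int.
 * Returns -1 if the file cannot be opened. Illustrative helper. */
long count_bytes(const char *path) {
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    long n = 0;
    int ch;                 /* wide enough for 0..UCHAR_MAX and EOF */
    while ((ch = fgetc(f)) != EOF)
        n++;
    fclose(f);
    return n;
}

/* What happens when a byte value is squeezed into a signed char before
 * the comparison: returns nonzero if it then compares equal to EOF. */
int narrowed_equals_eof(int byte) {
    signed char c = (signed char)byte;
    return c == EOF;
}
```

On the common implementations where EOF is -1, narrowed_equals_eof(0xFF) is nonzero, so a loop that stores the result in a signed char stops early at the first 0xFF byte. Where plain char is unsigned, the comparison with EOF never succeeds and the loop does not terminate.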

#4


0  

Use the toupper(3) function (or macro) from <ctype.h>, so you don't have to know which character set your installation uses.

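#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    FILE *file;
    int ch;
    file = fopen("file.txt", "rb+");
    if (file == NULL) {
        printf("Error");
        exit(1);
    } else {
        while ((ch = fgetc(file)) != EOF) {
            /* step back over the byte just read, then overwrite it */
            fseek(file, -1L, SEEK_CUR);
            putc(toupper(ch), file);
            /* a positioning call is required before reading again */
            fseek(file, 0L, SEEK_CUR);
        }
        fclose(file);
        return 0;
    }
}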

NOTE

There's a flaw in your original code: you check whether ch is lowercase and only output it if it is. You have to output all uppercase and caseless characters as well, so don't put the putc(3) call in the if block. My code simply outputs every character after converting it with toupper(3). Also, you have to define ch as int, as required by fgetc(3), toupper(3) and putc(3).
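
To see concretely why toupper() is preferable to the hardcoded range in the question, the two conversions can be put side by side. The function names below are made up for this comparison, and the specific results mentioned afterwards assume an ASCII execution character set.

```c
#include <ctype.h>

/* The conversion exactly as written in the question. */
int question_upper(int ch) {
    if (ch >= 96 && ch <= 123)
        ch = ch - 32;
    return ch;
}

/* The portable equivalent. The argument must be EOF or a value
 * representable as unsigned char, which is what fgetc() returns. */
int portable_upper(int ch) {
    return toupper(ch);
}
```

In ASCII, question_upper() also rewrites the backquote (96) to '@' (64) and '{' (123) to '[' (91), because the range is one too wide at each end. portable_upper() leaves both characters alone in the default "C" locale.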
