Base64 encoding a file in chunks using NSData

Time: 2022-12-02 21:42:59

Update 4
Per Greg's suggestion I've created one pair of image/text that shows the output from a 37k image to base64 encoded, using 100k chunks. Since the file is only 37k it's safe to say the loop only iterated once, so nothing was appended. The other pair shows the output from the same 37k image to base64 encoded, using 10k chunks. Since the file is 37k the loop iterated four times, and data was definitely appended.

Doing a diff on the two files shows that on the 10kb chunk file there's a large difference that begins on line 214 and ends on line 640.

Update 3
Here's where my code is now. Cleaned up a bit but still producing the same effect:

// Read data in chunks from the original file
[originalFile seekToEndOfFile];
NSUInteger fileLength = [originalFile offsetInFile];
[originalFile seekToFileOffset:0];
NSUInteger chunkSize = 100 * 1024;
NSUInteger offset = 0;

while(offset < fileLength) {
    NSData *chunk = [originalFile readDataOfLength:chunkSize];
    offset += chunkSize;

    // Convert the chunk to a base64 encoded string and back into NSData
    NSString *base64EncodedChunkString = [chunk base64EncodedString];
    NSData *base64EncodedChunk = [base64EncodedChunkString dataUsingEncoding:NSASCIIStringEncoding];

    // Write the encoded chunk to our output file
    [encodedFile writeData:base64EncodedChunk];

    // Cleanup
    base64EncodedChunkString = nil;
    base64EncodedChunk = nil;

    // Update progress bar
    [self updateProgress:[NSNumber numberWithInt:offset] total:[NSNumber numberWithInt:fileLength]];
}

Update 2
So it looks like files that are larger than 100 KB get scrambled, but files under 100 KB are fine. It's obvious that something is off on my buffer/math/etc, but I'm lost on this one. Might be time to call it a day, but I'd love to go to sleep with this one resolved.

Here's an example (the screenshot was not preserved):

Update 1
After doing some testing I have found that the same code will work fine for a small image, but will not work for a large image or video of any size. Definitely looks like a buffer issue, right?


Hey there, trying to base64 encode a large file by looping through and doing it one small chunk at a time. Everything seems to work but the files always end up corrupted. I was curious if anyone could point out where I might be going wrong here:

    NSFileHandle *originalFile, *encodedFile;
    self.localEncodedURL = [NSString stringWithFormat:@"%@-base64.xml", self.localURL];

    // Open the original file for reading
    originalFile = [NSFileHandle fileHandleForReadingAtPath:self.localURL];
    if (originalFile == nil) {
        [self performSelectorOnMainThread:@selector(updateStatus:) withObject:@"Encoding failed." waitUntilDone:NO];
        return;
    }
    encodedFile = [NSFileHandle fileHandleForWritingAtPath:self.localEncodedURL];
    if (encodedFile == nil) {
        [self performSelectorOnMainThread:@selector(updateStatus:) withObject:@"Encoding failed." waitUntilDone:NO];
        return;
    }

    // Read data in chunks from the original file
    [originalFile seekToEndOfFile];
    NSUInteger length = [originalFile offsetInFile];
    [originalFile seekToFileOffset:0];
    NSUInteger chunkSize = 100 * 1024;
    NSUInteger offset = 0;
    do {
        NSUInteger thisChunkSize = length - offset > chunkSize ? chunkSize : length - offset;
        NSData *chunk = [originalFile readDataOfLength:thisChunkSize];
        offset += [chunk length];

        NSString *base64EncodedChunkString = [chunk base64EncodedString];
        NSData *base64EncodedChunk = [base64EncodedChunkString dataUsingEncoding:NSASCIIStringEncoding];

        [encodedFile writeData:base64EncodedChunk];

        base64EncodedChunkString = nil;
        base64EncodedChunk = nil;

    } while (offset < length);

2 Answers

#1


2  

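I wish I could give credit to GregInYEG, because his original point about padding was the underlying issue. With base64, every chunk except the last has to be a multiple of 3 bytes long; otherwise each chunk's encoding ends in '=' padding that lands in the middle of the output file. So this resolved the issue: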

chunkSize = 3600
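
The multiple-of-3 rule is easy to check on a tiny buffer. The following is a minimal sketch, not the code from this thread: `EncodeInChunks` is a hypothetical helper, and it assumes Foundation's built-in `base64EncodedStringWithOptions:` (macOS 10.9 / iOS 7 and later) rather than the `base64EncodedString` category method used in the question. It encodes the same 10 bytes whole, in 4-byte chunks, and in 6-byte chunks, so the three results can be compared.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: base64-encode `data` in pieces of `chunkSize` bytes
// and concatenate the encoded pieces, the way the loops in this thread do.
static NSString *EncodeInChunks(NSData *data, NSUInteger chunkSize) {
    NSMutableString *result = [NSMutableString string];
    for (NSUInteger offset = 0; offset < data.length; offset += chunkSize) {
        NSUInteger thisChunk = MIN(chunkSize, data.length - offset);
        NSData *chunk = [data subdataWithRange:NSMakeRange(offset, thisChunk)];
        [result appendString:[chunk base64EncodedStringWithOptions:0]];
    }
    return result;
}

int main(void) {
    @autoreleasepool {
        // Ten arbitrary bytes standing in for a file's contents.
        const uint8_t bytes[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
        NSData *data = [NSData dataWithBytes:bytes length:sizeof(bytes)];

        NSString *whole = [data base64EncodedStringWithOptions:0];
        NSString *by4 = EncodeInChunks(data, 4); // 4 is not a multiple of 3
        NSString *by6 = EncodeInChunks(data, 6); // 6 is a multiple of 3

        NSLog(@"whole:      %@", whole);
        NSLog(@"4-byte:     %@ (matches whole: %d)", by4, [by4 isEqualToString:whole]);
        NSLog(@"6-byte:     %@ (matches whole: %d)", by6, [by6 isEqualToString:whole]);
    }
    return 0;
}
```

With 4-byte chunks every piece ends in '=' padding, so padding shows up in the middle of the concatenated string. With 6-byte chunks only the final, shorter piece is padded, and the concatenation is identical to encoding the buffer in one call.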

Once I had that, the corruption went away. But then I ran into memory leak issues, so I added the autorelease pool approach taken from this post: http://www.cocoadev.com/index.pl?ReadAFilePieceByPiece

Final code:

// Read data in chunks from the original file
[originalFile seekToEndOfFile];
NSUInteger fileLength = [originalFile offsetInFile];
[originalFile seekToFileOffset:0];

// For base64, each chunk *MUST* be a multiple of 3
NSUInteger chunkSize = 24000;
NSUInteger offset = 0;
NSAutoreleasePool *chunkPool = [[NSAutoreleasePool alloc] init];

while(offset < fileLength) {
    // Read the next chunk from the input file
    [originalFile seekToFileOffset:offset];
    NSData *chunk = [originalFile readDataOfLength:chunkSize];

    // Update our offset
    offset += chunkSize;

    // Base64 encode the input chunk
    NSData *serializedChunk = [NSPropertyListSerialization dataFromPropertyList:chunk format:NSPropertyListXMLFormat_v1_0 errorDescription:NULL];
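    // Autorelease the alloc'd string so the per-chunk pool reclaims it (it leaked before)
    NSString *serializedString = [[[NSString alloc] initWithData:serializedChunk encoding:NSASCIIStringEncoding] autorelease];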
    NSRange r = [serializedString rangeOfString:@"<data>"];
    serializedString = [serializedString substringFromIndex:r.location+7];
    r = [serializedString rangeOfString:@"</data>"];
    serializedString = [serializedString substringToIndex:r.location-1];

    // Write the base64 encoded chunk to our output file
    NSData *base64EncodedChunk = [serializedString dataUsingEncoding:NSASCIIStringEncoding];
    [encodedFile truncateFileAtOffset:[encodedFile seekToEndOfFile]];
    [encodedFile writeData:base64EncodedChunk];

    // Cleanup
    base64EncodedChunk = nil;
    serializedChunk = nil;
    serializedString = nil;
    chunk = nil;

    // Update the progress bar
    [self updateProgress:[NSNumber numberWithInt:offset] total:[NSNumber numberWithInt:fileLength]];

    // Drain and recreate the pool
    [chunkPool release];
    chunkPool = [[NSAutoreleasePool alloc] init];
}
[chunkPool release];
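
For comparison, here is a sketch of the same loop on a newer SDK. It is not the code above: it assumes ARC, assumes Foundation's `base64EncodedDataWithOptions:` and `initWithBase64EncodedData:options:` (macOS 10.9 / iOS 7 and later) are available, and uses a hypothetical helper name `EncodeFileAsBase64` and a made-up temporary file for the round-trip check. The chunk size stays at 24000 so every chunk except the last is a multiple of 3 bytes, and `@autoreleasepool` takes the place of the manually recreated pool.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: stream the file at inPath to outPath as base64.
static BOOL EncodeFileAsBase64(NSString *inPath, NSString *outPath) {
    NSFileHandle *input = [NSFileHandle fileHandleForReadingAtPath:inPath];
    if (input == nil) return NO;

    // fileHandleForWritingAtPath: returns nil if the file does not exist,
    // so create (or empty) the output file first.
    [[NSFileManager defaultManager] createFileAtPath:outPath contents:nil attributes:nil];
    NSFileHandle *output = [NSFileHandle fileHandleForWritingAtPath:outPath];
    if (output == nil) {
        [input closeFile];
        return NO;
    }

    const NSUInteger chunkSize = 24000; // must be a multiple of 3
    BOOL done = NO;
    while (!done) {
        @autoreleasepool { // temporaries from each pass are freed here
            NSData *chunk = [input readDataOfLength:chunkSize];
            if (chunk.length == 0) {
                done = YES; // end of file
            } else {
                [output writeData:[chunk base64EncodedDataWithOptions:0]];
            }
        }
    }
    [input closeFile];
    [output closeFile];
    return YES;
}

int main(void) {
    @autoreleasepool {
        // Made-up sample: 50000 patterned bytes, so the loop runs three times.
        NSMutableData *sample = [NSMutableData dataWithLength:50000];
        uint8_t *p = sample.mutableBytes;
        for (NSUInteger i = 0; i < sample.length; i++) p[i] = (uint8_t)(i % 251);

        NSString *inPath = [NSTemporaryDirectory() stringByAppendingPathComponent:@"chunk-demo.bin"];
        NSString *outPath = [inPath stringByAppendingString:@"-base64.txt"];
        [sample writeToFile:inPath atomically:YES];

        if (EncodeFileAsBase64(inPath, outPath)) {
            NSData *encoded = [NSData dataWithContentsOfFile:outPath];
            NSData *decoded = [[NSData alloc] initWithBase64EncodedData:encoded options:0];
            NSLog(@"round trip matches: %d", [decoded isEqualToData:sample]);
        }
    }
    return 0;
}
```

The sketch relies on `readDataOfLength:` returning a full chunk for a regular file until the end is reached; for pipes or sockets, where short reads can happen mid-stream, leftover bytes would have to be carried over to keep each encoded piece a multiple of 3.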

#2


1  

How are you converting back the base64 data to an image? Some implementations limit the maximum line length they will accept. Try inserting a line break every so many characters.
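
If the decoder on the other end does have a line-length limit, the breaks can be added per chunk. The sketch below is only an illustration: `WrapLines` is a hypothetical helper, the 64-character width is an arbitrary choice, and the Foundation base64 methods it uses (macOS 10.9 / iOS 7 and later) are an assumption about the available SDK. It appends a newline after every line, including the last, so chunks whose encoded length is a multiple of the width can be written one after another without producing an over-long line.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: break a base64 string into lines of at most `width`
// characters, ending every line (including the last) with a newline.
static NSString *WrapLines(NSString *base64, NSUInteger width) {
    NSMutableString *wrapped = [NSMutableString string];
    for (NSUInteger i = 0; i < base64.length; i += width) {
        NSUInteger n = MIN(width, base64.length - i);
        [wrapped appendString:[base64 substringWithRange:NSMakeRange(i, n)]];
        [wrapped appendString:@"\n"];
    }
    return wrapped;
}

int main(void) {
    @autoreleasepool {
        // 100 zero bytes stand in for one chunk of file data.
        NSData *chunk = [NSMutableData dataWithLength:100];
        NSString *wrapped = WrapLines([chunk base64EncodedStringWithOptions:0], 64);
        NSLog(@"%@", wrapped);

        // Decoding has to skip the inserted newlines.
        NSData *decoded = [[NSData alloc]
            initWithBase64EncodedString:wrapped
                                options:NSDataBase64DecodingIgnoreUnknownCharacters];
        NSLog(@"round trip matches: %d", [decoded isEqualToData:chunk]);
    }
    return 0;
}
```

A 24000-byte chunk encodes to 32000 characters, which is exactly 500 lines of 64, so that chunk size keeps the line breaks aligned across chunk boundaries.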
