How can I efficiently read a given range of bytes from a large encrypted file in Java?

Time: 2021-06-28 23:47:27

I have a large encrypted file (10 GB+) on a server. I need to transfer the decrypted file to the client in small chunks. When a client makes a request for a range of bytes (say 18 to 45), I have to random-access the file, read the specific bytes, decrypt them, and transfer them to the client using the servlet response's output stream.

But since the file is encrypted I have to read the file as blocks of 16 bytes in order to decrypt correctly.

So if the client requests bytes 18 to 45, the server has to read the file in multiples of 16-byte blocks. That means random-accessing the file from byte 16 to byte 48 and then decrypting that range. After decryption I have to skip 2 bytes at the start and 3 bytes at the end to return the chunk of data the client requested.

Here is what I am trying to do

Adjust start and end for encrypted files

long start = 18; // input from client
long end = 45; // input from client
long skipStart = 0; // bytes to drop from the front of the decrypted data
long skipEnd = 0;   // bytes to drop from the back of the decrypted data

// encrypted files must be accessed in blocks of 16 bytes
if(fileisEncrypted){
   skipStart = start % 16;         // skip 2 bytes at start
   skipEnd = (16 - end % 16) % 16; // skip 3 bytes at end (0 if end is already aligned)
   start = start - skipStart;      // start becomes 16
   end = end + skipEnd;            // end becomes 48
}
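
The same arithmetic can be packaged as a small helper. This is only a minimal sketch: it assumes `end` is exclusive (which is what yields the 2-byte and 3-byte skips in the example above), and the class name `BlockRange` and its fields are hypothetical.

```java
/** Hypothetical helper: widens a byte range [start, endExclusive) to 16-byte block boundaries. */
final class BlockRange {
    static final int BLOCK = 16;

    final long alignedStart; // first byte to read from the file (multiple of 16)
    final long alignedEnd;   // exclusive end of the read (multiple of 16)
    final int skipStart;     // decrypted bytes to drop at the front
    final int skipEnd;       // decrypted bytes to drop at the back

    BlockRange(long start, long endExclusive) {
        if (start < 0 || endExclusive <= start) {
            throw new IllegalArgumentException("bad range");
        }
        alignedStart = start - (start % BLOCK);                   // round down
        alignedEnd = ((endExclusive + BLOCK - 1) / BLOCK) * BLOCK; // round up
        skipStart = (int) (start - alignedStart);
        skipEnd = (int) (alignedEnd - endExclusive);
    }

    public static void main(String[] args) {
        BlockRange r = new BlockRange(18, 45);
        System.out.println(r.alignedStart + ".." + r.alignedEnd
                + ", skip " + r.skipStart + " and " + r.skipEnd);
    }
}
```

Rounding the end up with integer division means a range that already ends on a block boundary gets a `skipEnd` of 0 rather than a full extra block.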

Access the encrypted file data from start to end

try(final FileChannel channel = FileChannel.open(services.getPhysicalFile(datafile).toPath())){
    MappedByteBuffer mappedByteBuffer = channel.map(FileChannel.MapMode.READ_ONLY, start, end-start);

    // *** No idea how to convert MappedByteBuffer into an input stream ***
    // InputStream is = (How do I get an input stream for bytes 16 to 48 here?)

    // the method I used earlier to decrypt the whole file at once; now I somehow need the input stream of a specific range
    is = new FileEncryptionUtil().getCipherInputStream(is,
                        EncodeUtil.decodeSeedValue(encryptionKeyRef), AESCipher.DECRYPT_MODE);

    // transferring the decrypted input stream to the servlet response
    OutputStream outputStream = response.getOutputStream();
    // *** now for chunk transfer, here I also need to
    //     skip 2 bytes at the start and 3 bytes at the end.
    //     How to do it? ***
    org.apache.commons.io.IOUtils.copy(is, outputStream);
}

I am missing a few steps in the code above. I know I could read byte by byte and ignore 2 bytes at the start and 3 bytes at the end, but I am not sure that would be efficient enough. Moreover, the client could request a very large range, close to two gigabytes, which would require reading and decrypting all of that data. I am afraid that creating a large byte array will consume too much memory.

How can I efficiently do it without putting too much pressure on server processing or memory? Any ideas?

2 Solutions

#1


4  

As you haven't specified which cipher mode you're using, I'll assume that you're using AES in CTR mode, as it's designed to read random chunks of big files without having to decrypt them completely.

With AES-CTR, you can stream the file through the decryption code and send the blocks back to the client as soon as they are available. You only need a few arrays the size of the AES block in memory; everything else is read from the disk. You would need to add special logic to skip some bytes in the first and last blocks, but you don't need to load the whole thing into memory.
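
To illustrate the memory point, here is a minimal sketch of a bounded copy that drops the leading bytes and forwards exactly the requested number of bytes through one small, fixed buffer. The class name `RangeCopy`, the method `copyRange`, and the 8 KB buffer size are assumptions for illustration; `in` would be the decrypting stream and `out` the servlet response stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper: streams a sub-range without buffering the whole range in memory. */
final class RangeCopy {

    /** Discards {@code skip} bytes from {@code in}, then copies exactly {@code length} bytes to {@code out}. */
    static void copyRange(InputStream in, OutputStream out, long skip, long length) throws IOException {
        byte[] buf = new byte[8192]; // the only buffer, regardless of range size

        // Read and discard instead of calling skip(), which may skip fewer bytes than asked.
        long toSkip = skip;
        while (toSkip > 0) {
            int n = in.read(buf, 0, (int) Math.min(buf.length, toSkip));
            if (n < 0) throw new EOFException("stream ended while skipping");
            toSkip -= n;
        }

        // Forward the requested bytes; trailing bytes of the last block are never read.
        long remaining = length;
        while (remaining > 0) {
            int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
            if (n < 0) throw new EOFException("stream ended before the range was complete");
            out.write(buf, 0, n);
            remaining -= n;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] decrypted = new byte[32]; // stands in for decrypted bytes 16..47
        for (int i = 0; i < decrypted.length; i++) decrypted[i] = (byte) (16 + i);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copyRange(new ByteArrayInputStream(decrypted), out, 2, 27); // bytes 18..44
        System.out.println(out.size() + " bytes copied");
    }
}
```

Memory use stays constant because only the fixed buffer is allocated, however large the requested range is.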

There's an example of how to do this in another SO question, "Seeking in AES-CTR-encrypted input" (it only performs the seek). After seeking, you can skip the first few bytes, read up to the last block, and trim the result to the number of bytes your client requested.
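
As a rough illustration of the seek itself, the sketch below assumes the file was encrypted with `AES/CTR/NoPadding` and that the 16-byte IV is the initial counter value, incremented by one per block as a big-endian integer. The class `CtrRange`, its method names, and the in-memory `ciphertext` array (standing in for the file) are hypothetical; this is not the code from the linked question.

```java
import java.math.BigInteger;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical sketch: decrypts an arbitrary byte range of AES-CTR ciphertext. */
final class CtrRange {
    static final int BLOCK = 16;

    /** Counter block for {@code blockIndex}: initial IV plus blockIndex, modulo 2^128. */
    static byte[] counterFor(byte[] iv, long blockIndex) {
        byte[] sum = new BigInteger(1, iv).add(BigInteger.valueOf(blockIndex)).toByteArray();
        byte[] out = new byte[BLOCK];
        int n = Math.min(sum.length, BLOCK);
        // keep the low 16 bytes, right-aligned (drops a sign byte or overflow carry)
        System.arraycopy(sum, sum.length - n, out, BLOCK - n, n);
        return out;
    }

    /** Decrypts ciphertext bytes [start, end) without touching anything before the enclosing block. */
    static byte[] decryptRange(byte[] key, byte[] iv, byte[] ciphertext, int start, int end)
            throws GeneralSecurityException {
        int alignedStart = start - (start % BLOCK);
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                new IvParameterSpec(counterFor(iv, alignedStart / BLOCK)));
        // CTR is a stream mode, so only the start has to be block-aligned
        byte[] plain = cipher.doFinal(ciphertext, alignedStart, end - alignedStart);
        return Arrays.copyOfRange(plain, start - alignedStart, plain.length);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        byte[] key = new byte[16]; // demo key and IV only
        byte[] iv = new byte[16];
        byte[] plain = new byte[64];
        for (int i = 0; i < plain.length; i++) plain[i] = (byte) i;

        Cipher enc = Cipher.getInstance("AES/CTR/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        byte[] ciphertext = enc.doFinal(plain);

        byte[] slice = decryptRange(key, iv, ciphertext, 18, 45);
        System.out.println(Arrays.equals(slice, Arrays.copyOfRange(plain, 18, 45)));
    }
}
```

With a real file, the `ciphertext` array would be replaced by a read that starts at `alignedStart`, and the cipher would be fed chunk by chunk with `update` instead of a single `doFinal`.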

#2


0  

After researching for a while, this is how I solved it. First I created a ByteBufferInputStream class to read from the MappedByteBuffer:

import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

public class ByteBufferInputStream extends InputStream {
    private ByteBuffer byteBuffer;

    /** Creates an uninitialized stream that cannot be used until {@link #setByteBuffer(ByteBuffer)} is called. */
    public ByteBufferInputStream () {
    }

    /** Creates a stream with a new non-direct buffer of the specified size. The position and limit of the buffer are zero. */
    public ByteBufferInputStream (int bufferSize) {
        this(ByteBuffer.allocate(bufferSize));
        byteBuffer.flip();
    }

    /** Creates a stream that reads from the given buffer, starting at its current position. */
    public ByteBufferInputStream (ByteBuffer byteBuffer) {
        this.byteBuffer = byteBuffer;
    }

    public ByteBuffer getByteBuffer () {
        return byteBuffer;
    }

    public void setByteBuffer (ByteBuffer byteBuffer) {
        this.byteBuffer = byteBuffer;
    }

    @Override
    public int read () throws IOException {
        if (!byteBuffer.hasRemaining()) return -1;
        // mask to 0..255 so that bytes >= 0x80 are not mistaken for end of stream
        return byteBuffer.get() & 0xFF;
    }

    @Override
    public int read (byte[] bytes, int offset, int length) throws IOException {
        if (length == 0) return 0;
        int count = Math.min(byteBuffer.remaining(), length);
        if (count == 0) return -1;
        byteBuffer.get(bytes, offset, count);
        return count;
    }

    @Override
    public int available () throws IOException {
        return byteBuffer.remaining();
    }
}

Then I created a BlockInputStream class, extending InputStream, which skips the extra bytes and reads the inner input stream in multiples of 16-byte blocks.

import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BlockInputStream extends InputStream {
    private static final int BLOCK_SIZE = 16;

    private final BufferedInputStream inputStream;
    private final long totalLength; // skip + length: bytes to deliver, counting the discarded prefix
    private long read = 0;          // bytes delivered so far, counting the discarded prefix
    private final byte[] buff = new byte[BLOCK_SIZE];
    private ByteArrayInputStream blockInputStream;

    public BlockInputStream(InputStream inputStream, long skip, long length) throws IOException {
        this.inputStream = new BufferedInputStream(inputStream);
        this.totalLength = length + skip;
        // discard the leading bytes that precede the requested range
        byte[] discard = new byte[BLOCK_SIZE];
        long toSkip = skip;
        while (toSkip > 0) {
            int n = read(discard, 0, (int) Math.min(discard.length, toSkip));
            if (n < 0) break; // inner stream ended early
            toSkip -= n;
        }
    }

    /** Reads up to one block from the inner stream; returns -1 at end of stream. */
    private int readBlock() throws IOException {
        int count = inputStream.read(buff);
        // expose only the bytes that were actually read
        blockInputStream = new ByteArrayInputStream(buff, 0, Math.max(count, 0));
        return count;
    }

    @Override
    public int read () throws IOException {
        byte[] b = new byte[1];
        int n = read(b, 0, 1);
        return n < 0 ? -1 : b[0] & 0xFF;
    }

    @Override
    public int read(byte[] b) throws IOException {
        return read(b, 0, b.length);
    }

    @Override
    public int read (byte[] bytes, int offset, int length) throws IOException {
        if (length == 0) return 0;
        long remaining = totalLength - read;
        if (remaining < 1) {
            return -1;
        }
        int bytesToRead = (int) Math.min(length, remaining);
        int n = 0;
        while (bytesToRead > 0) {
            int count;
            if (blockInputStream != null && blockInputStream.available() > 0) {
                // drain the block that is already buffered
                int len = Math.min(bytesToRead, blockInputStream.available());
                count = blockInputStream.read(bytes, offset, len);
            } else if (bytesToRead % BLOCK_SIZE == 0) {
                // whole blocks can go straight into the caller's array
                count = inputStream.read(bytes, offset, bytesToRead);
                if (count < 0) break; // inner stream ended early
            } else {
                if (readBlock() < 0) break; // inner stream ended early
                continue;
            }
            read += count;
            offset += count;
            bytesToRead -= count;
            n += count;
        }
        return n == 0 ? -1 : n;
    }

    @Override
    public int available () throws IOException {
        long remaining = totalLength - read;
        if (remaining < 1) {
            return 0; // available() must never return a negative value
        }
        int buffered = blockInputStream == null ? 0 : blockInputStream.available();
        return (int) Math.min(remaining, (long) buffered + inputStream.available());
    }

    // skip() is inherited from InputStream, which skips by calling read(),
    // so the byte accounting above stays correct.

    @Override
    public void close() throws IOException {
        inputStream.close();
    }

    @Override
    public boolean markSupported() {
        return false; // mark/reset on the inner stream would bypass the byte accounting
    }
}

This is my final working implementation using these two classes:

private RangeData getRangeData(RangeInfo r) throws IOException, GeneralSecurityException, CryptoException {

    // used for encrypted files; both bounds are inclusive
    long blockStart = r.getStart();
    long blockEnd = r.getEnd();
    long blockLength = blockEnd - blockStart + 1;

    // encrypted files must be accessed in blocks of 16 bytes
    if(datafile.isEncrypted()){
        blockStart -= blockStart % 16; // round down to the start of the block
        blockEnd = blockEnd | 15;      // last byte of the block containing blockEnd
        blockLength = blockEnd - blockStart + 1;
    }

    // The mapping stays valid after the channel is closed, so the returned stream remains usable.
    try ( final FileChannel channel = FileChannel.open(services.getPhysicalFile(datafile).toPath()) )
    {
        // FileChannel.map rejects sizes above Integer.MAX_VALUE, so a single range must stay below 2 GB
        MappedByteBuffer mappedByteBuffer = channel.map(FileChannel.MapMode.READ_ONLY, blockStart, blockLength);
        InputStream inputStream = new ByteBufferInputStream(mappedByteBuffer);
        if(datafile.isEncrypted()) {
            String encryptionKeyRef = (String) settingsManager.getSetting(AppSetting.DEFAULT_ENCRYPTION_KEY);
            inputStream = new FileEncryptionUtil().getCipherInputStream(inputStream,
                    EncodeUtil.decodeSeedValue(encryptionKeyRef), AESCipher.DECRYPT_MODE);
            long skipStart = r.getStart() - blockStart;
            // drops skipStart leading bytes and trims the output to r.getLength() bytes
            inputStream = new BlockInputStream(inputStream, skipStart, r.getLength());
        }
        return new RangeData(r, inputStream);
    }
}
