Is it possible to read a file without loading it into memory?
Question:
I want to read a file, but it is too big to load completely into memory.
Is there a way to read it without loading it into memory? Or is there a better solution?
Answer:
I need the content to compute a checksum, so I need the complete message.
Many checksum libraries support incremental updates to the checksum. For example, GLib has g_checksum_update(). So you can read the file one block at a time with fread and update the checksum as you read.
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <stdlib.h>
#include <glib.h>

int main(void) {
    char filename[] = "test.txt";

    // Create a SHA256 checksum
    GChecksum *sum = g_checksum_new(G_CHECKSUM_SHA256);
    if( sum == NULL ) {
        fprintf(stderr, "Could not create checksum.\n");
        exit(1);
    }

    // Open the file we'll be checksumming.
    FILE *fp = fopen( filename, "rb" );
    if( fp == NULL ) {
        fprintf(stderr, "Could not open %s: %s.\n", filename, strerror(errno));
        exit(1);
    }

    // Read one buffer full at a time (BUFSIZ is from stdio.h)
    // and update the checksum.
    unsigned char buf[BUFSIZ];
    size_t size_read = 0;
    while( (size_read = fread(buf, 1, sizeof(buf), fp)) != 0 ) {
        // Update the checksum
        g_checksum_update(sum, buf, (gssize)size_read);
    }

    // fread returns 0 on both end-of-file and error, so tell them apart.
    if( ferror(fp) ) {
        fprintf(stderr, "Error reading %s.\n", filename);
        exit(1);
    }
    fclose(fp);

    // Print the checksum.
    printf("%s %s\n", g_checksum_get_string(sum), filename);

    // Release the checksum.
    g_checksum_free(sum);
    return 0;
}
We can check that it works by comparing the result with sha256sum.
$ ./test
0c46af5bce717d706cc44e8c60dde57dbc13ad8106a8e056122a39175e2caef8 test.txt
$ sha256sum test.txt
0c46af5bce717d706cc44e8c60dde57dbc13ad8106a8e056122a39175e2caef8 test.txt
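The same read-a-block, update-the-state pattern works with any hash that keeps a running state, not only with GLib. As a minimal sketch, the code below uses a hand-written 64-bit FNV-1a hash instead of SHA256. FNV-1a is swapped in purely for illustration because it fits in a few lines; it is not a cryptographic checksum. The names fnv1a_update and fnv1a_stream and the 4096-byte buffer size are assumptions made for this example, not part of any library.

```c
#include <stdint.h>
#include <stdio.h>

/* Standard 64-bit FNV-1a constants. */
#define FNV_OFFSET_BASIS 14695981039346656037ULL
#define FNV_PRIME 1099511628211ULL

/* Hypothetical helper: fold len bytes into a running hash.
 * Start a new hash by passing FNV_OFFSET_BASIS as the initial state. */
static uint64_t fnv1a_update(uint64_t hash, const unsigned char *data, size_t len) {
    for (size_t i = 0; i < len; i++) {
        hash ^= data[i];
        hash *= FNV_PRIME;
    }
    return hash;
}

/* Hypothetical helper: hash a whole stream while holding only one
 * fixed-size buffer in memory. Returns 0 on success, -1 on a read error. */
static int fnv1a_stream(FILE *fp, uint64_t *out) {
    unsigned char buf[4096];
    uint64_t hash = FNV_OFFSET_BASIS;
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
        hash = fnv1a_update(hash, buf, n);
    }
    if (ferror(fp)) {
        return -1;
    }
    *out = hash;
    return 0;
}
```

Because the hash state is carried from one call to the next, feeding the data in several pieces should give the same value as feeding it all at once. To use it, open the file with fopen(filename, "rb"), pass the FILE pointer to fnv1a_stream, and print the result.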