Java resumable hash computation
I would like to compute the hash of a file on the fly while it is being uploaded to the server, and to make that computation resumable. The files are big, so I call the update(byte[]) method of the MessageDigest class (as described, for instance, in "How can I generate an MD5 hash?") as new bytes arrive from the HttpServletRequest's InputStream.
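For reference, the on-the-fly hashing described above might look roughly like the sketch below. The class and method names (StreamingHash, copyAndHash, toHex) are made up for illustration, and a plain InputStream/OutputStream pair stands in for the servlet request stream and the file on disk.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: hashes the upload while it is being written out.
final class StreamingHash {

    /** Copies in to out chunk by chunk and returns the MD5 of every byte read. */
    static byte[] copyAndHash(InputStream in, OutputStream out)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            md.update(chunk, 0, n);   // feed only the bytes actually read
            out.write(chunk, 0, n);   // persist the same bytes
        }
        return md.digest();
    }

    /** Lower-case hex encoding of a digest. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}
```

In a servlet, in would be request.getInputStream() and out a stream onto the target file. The MessageDigest inside copyAndHash is a local object, which is exactly why its state is gone once the request handler returns.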
This works well, but it gets interesting once I add resumable upload support. If an upload is terminated prematurely, the incomplete file is stored on disk. However, the controller (and the underlying service) exits, so the MessageDigest object is lost. Before that happens, can I serialize the MessageDigest object to disk (or to a DB, it doesn't matter) so that, once deserialized, it remembers its intermediate state? The upload resumes from the exact place where it was terminated, so no bytes are redundant and none are missing. If I then continue update()ing the deserialized MessageDigest, I should ultimately get the same hash as if the file had been uploaded in one go.
Grab one of the custom MD5 implementations like this one or this one. Make it serializable or just make its internal state public. Preserve the state when the upload is aborted, and restore it when the upload is resumed.
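To make the idea concrete, here is a minimal sketch of what such a class could look like. It is not either of the implementations linked above: ResumableMd5 and its method names are hypothetical, the algorithm follows RFC 1321, and Java serialization is just one assumed way to persist the state (four 32-bit words, the partially filled 64-byte block, and the byte count).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical minimal MD5 (RFC 1321) whose intermediate state can be serialized. */
final class ResumableMd5 implements Serializable {
    private static final long serialVersionUID = 1L;
    private static final int[] SHIFTS = {7, 12, 17, 22, 5, 9, 14, 20, 4, 11, 16, 23, 6, 10, 15, 21};
    private static final int[] K = new int[64];
    static {
        // K[i] = floor(2^32 * |sin(i + 1)|), as defined by the MD5 specification.
        for (int i = 0; i < 64; i++) {
            K[i] = (int) (long) (Math.abs(Math.sin(i + 1)) * 4294967296.0);
        }
    }

    private final int[] state = {0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476};
    private final byte[] buffer = new byte[64]; // bytes of the current, incomplete block
    private long byteCount;                     // total number of bytes consumed so far

    void update(byte[] data, int off, int len) {
        for (int i = off; i < off + len; i++) {
            buffer[(int) (byteCount & 63)] = data[i];
            byteCount++;
            if ((byteCount & 63) == 0) {
                processBlock();
            }
        }
    }

    /** Pads and finishes a copy of the state, so this object can keep receiving update() calls. */
    byte[] digest() {
        ResumableMd5 copy = new ResumableMd5();
        System.arraycopy(state, 0, copy.state, 0, 4);
        System.arraycopy(buffer, 0, copy.buffer, 0, 64);
        copy.byteCount = byteCount;
        int used = (int) (byteCount & 63);
        int padLen = used < 56 ? 56 - used : 120 - used;
        byte[] tail = new byte[padLen + 8];
        tail[0] = (byte) 0x80;
        long bitLength = byteCount * 8;
        for (int i = 0; i < 8; i++) {
            tail[padLen + i] = (byte) (bitLength >>> (8 * i)); // little-endian length
        }
        copy.update(tail, 0, tail.length);
        byte[] out = new byte[16];
        for (int i = 0; i < 16; i++) {
            out[i] = (byte) (copy.state[i / 4] >>> (8 * (i % 4)));
        }
        return out;
    }

    private void processBlock() {
        int[] m = new int[16];
        for (int i = 0; i < 64; i++) {
            m[i >>> 2] |= (buffer[i] & 0xff) << (8 * (i & 3)); // little-endian words
        }
        int a = state[0], b = state[1], c = state[2], d = state[3];
        for (int i = 0; i < 64; i++) {
            int f, g;
            switch (i >>> 4) {
                case 0:  f = (b & c) | (~b & d); g = i;                break;
                case 1:  f = (d & b) | (~d & c); g = (5 * i + 1) & 15; break;
                case 2:  f = b ^ c ^ d;          g = (3 * i + 5) & 15; break;
                default: f = c ^ (b | ~d);       g = (7 * i) & 15;     break;
            }
            int oldD = d;
            d = c;
            c = b;
            b = b + Integer.rotateLeft(a + f + K[i] + m[g], SHIFTS[(i >>> 4) * 4 + (i & 3)]);
            a = oldD;
        }
        state[0] += a;
        state[1] += b;
        state[2] += c;
        state[3] += d;
    }

    /** Serializes the intermediate state; store the result on disk or in a DB. */
    byte[] saveState() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(this);
        }
        return bytes.toByteArray();
    }

    static ResumableMd5 restoreState(byte[] saved) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(saved))) {
            return (ResumableMd5) in.readObject();
        }
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}
```

When the upload is aborted, call saveState() and store the bytes next to the partial file. On resume, restoreState(...) returns an object that continues from the same point, provided the upload really does continue at the byte offset where it stopped.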
Hashes are cheap to compute (MD5 doubly so; are you sure you don't want SHA1?). I would recommend rehashing everything from the beginning as soon as you detect that an upload has been resumed. The runtime should be low unless the uploads are truly huge, and large interrupted uploads will hopefully be rare.
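A rough sketch of that rehash-on-resume approach, using only the standard MessageDigest API: the class and method names (ResumeByRehash, digestPrimedWith) are hypothetical, and it assumes the partial file on disk contains exactly the bytes received before the interruption.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: rebuilds the digest state from the partial file on disk.
final class ResumeByRehash {

    /** Returns a digest that has already absorbed every byte stored in partialFile. */
    static MessageDigest digestPrimedWith(Path partialFile, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance(algorithm);
        try (InputStream in = Files.newInputStream(partialFile)) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                md.update(chunk, 0, n);
            }
        }
        return md;
    }
}
```

On resume, call digestPrimedWith(partialFile, "MD5") (or "SHA-1"), then keep calling update() on the returned object as the remaining bytes arrive. The cost is one extra sequential read of the partial file per resumed upload.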