Hi all,

I'm trying to write a little script to read files in a directory (x bytes at a time), do an md5 checksum of the bytes and print them to a text file. Everything is working fine at the moment except the reading: I have managed to get it to read the first x bytes from the file, but I'm not sure how to get it to keep reading until EOF is reached.

This is what I want to achieve:

* Specify a blocksize to use (not a problem)
* Read a file in chunks (using the blocksize)
* md5 checksum the bytes (I've worked this part out)
* Write the md5sum to a file (I've got this also)

How can I retrieve the chunks until EOF, maybe returning a smaller chunk at the end if there isn't enough data left?

I hope this post isn't too badly written, it's very late at night and I've been googling this for ages :P

Any help much appreciated.

Matt
on 19.08.2008 02:17
on 19.08.2008 02:24
I've just played around and found this seems to work:
  File.open(path, "r") do |fh|
    while (chunk = fh.read(blocksize))
      outFH.puts Digest::MD5.hexdigest(chunk) + "\n"
    end
  end
Is this a good way to do it?
Thanks
Matt
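
For anyone wondering why that loop terminates and what happens to the last chunk: IO#read with a positive length returns up to that many bytes, and nil once EOF is reached, so the final chunk is simply shorter. A minimal sketch of that behaviour, using a StringIO in place of a real file (the chunk_sizes helper is made up for illustration, it is not from the script above):

```ruby
require 'stringio'

# Hypothetical helper: collect the size of every chunk that
# io.read(blocksize) returns before it signals EOF with nil.
def chunk_sizes(io, blocksize)
  sizes = []
  while (chunk = io.read(blocksize))
    sizes << chunk.bytesize
  end
  sizes
end

# 10 bytes read 4 at a time: two full chunks, then a short one.
p chunk_sizes(StringIO.new("a" * 10), 4)  # prints [4, 4, 2]
```

An empty input yields no chunks at all, because the very first read already returns nil.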
on 19.08.2008 16:56
On 19 Aug., 02:21, Matt Harrison <iwasinnamuk...@genestate.com> wrote:
> I've just played around and found this seems to work:
>
> File.open(path, "r") do |fh|
>   while (chunk = fh.read(blocksize))
>     outFH.puts Digest::MD5.hexdigest(chunk) + "\n"
>   end
> end
>
> Is this a good way to do it?

Somehow my posting from this morning made it neither to Google news nor to the mailing list. Strange...

To sum it up: yes, that's a good way to do it. A few remarks:

You do not need + "\n" because #puts already appends a newline when the string does not end with one.

I prefer to open with "rb" instead of "r" in these cases. This makes scripts more portable and also documents that this is really a binary stream.

You can preallocate the buffer, which saves a bit of GC:

  File.open(path, "rb") do |fh|
    chunk = ""
    while fh.read(blocksize, chunk)
      outFH.puts Digest::MD5.hexdigest(chunk)
    end
  end

Kind regards

robert
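
Putting the pieces from this thread together, a self-contained version might look roughly like the sketch below. The method name chunk_md5s is made up for illustration, and returning the digests as an array (instead of writing to outFH directly) is just an assumption to keep the example easy to try out:

```ruby
require 'digest/md5'

# Hypothetical helper (name not from the thread): returns one MD5 hex
# digest per blocksize-sized chunk of the file at path. The last chunk
# may be shorter than blocksize.
def chunk_md5s(path, blocksize)
  digests = []
  File.open(path, "rb") do |fh|
    chunk = String.new  # reusable buffer, refilled by each read
    # read(length, buffer) returns nil at EOF, which ends the loop
    while fh.read(blocksize, chunk)
      digests << Digest::MD5.hexdigest(chunk)
    end
  end
  digests
end
```

Writing the result out is then a matter of opening the output file for writing and calling puts once per digest. For very large files it may be preferable to write each digest inside the loop rather than collecting them all first.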