I am currently playing with a hashing program to go through our file server and dump all the file hashes into a database so that I can compare them later on and remove duplicates.

However, I have found a handful of files that I cannot hash; more precisely, the open(..., 'rb').read() call fails on them.

Here is my code
Code:
import os, hashlib, time, pyodbc, gc

gc.enable()

StartTime = time.localtime()

cnxn = pyodbc.connect("DSN=Work2", autocommit=True)
cursor = cnxn.cursor()

for root, dirs, files in os.walk('K:\\'):
	for name in files:
		try:
			FileInfo = os.stat(os.path.join(root, name))
			FilePath = open(os.path.join(root, name), 'r')
			FileHash = open(os.path.join(root, name), 'rb').read()
			FileCorrectPath = os.path.join(root, name).replace('"', '`').replace("'", "`")
			cursor.execute("INSERT INTO NetworkFileInfo (FileName, Hash, CreationDate, ModifiedDate, DateStamp) VALUES('" + FileCorrectPath + "', '" + hashlib.md5(FileHash).hexdigest() + "', '" + time.strftime('%Y-%m-%d', time.gmtime(FileInfo.st_ctime)) + "', '" + time.strftime('%Y-%m-%d', time.gmtime(FileInfo.st_mtime)) + "', '" + time.strftime('%Y-%m-%d') + "')")
		except:
			EndTime = time.localtime()
			print time.mktime(EndTime) - time.mktime(StartTime)
			cnxn.close()
			print os.path.join(root, name)
			raise
		try:
			del(FileInfo)
			del(FilePath)
			del(FileHash)
			del(FileCorrectPath)
			gc.collect()
		except:
			raise
I am just starting again at Python, so be gentle.
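
As a side note on the INSERT statement above: replacing quote characters in the path changes the stored file name, and a parameterized query avoids the need for it. The following is only a sketch. It uses sqlite3 from the standard library as a stand-in so that it runs without a DSN, the column types are assumed, and insert_file_row is a hypothetical helper name. pyodbc uses the same "?" placeholder style, so the execute call should carry over to the pyodbc cursor.

```python
import sqlite3

# sqlite3 is a stand-in for the pyodbc connection (assumption for this sketch).
connection = sqlite3.connect(':memory:')
cursor = connection.cursor()

# Column types are assumed; the original table definition is not shown.
cursor.execute(
    "CREATE TABLE NetworkFileInfo "
    "(FileName TEXT, Hash TEXT, CreationDate TEXT, ModifiedDate TEXT, DateStamp TEXT)"
)


def insert_file_row(cursor, path, digest, created, modified, stamp):
    """Insert one row, letting the driver handle quoting of every value."""
    cursor.execute(
        "INSERT INTO NetworkFileInfo "
        "(FileName, Hash, CreationDate, ModifiedDate, DateStamp) "
        "VALUES (?, ?, ?, ?, ?)",
        (path, digest, created, modified, stamp),
    )


# A path containing both quote styles is stored unchanged.
insert_file_row(
    cursor,
    "K:\\it's a \"quoted\" name.txt",
    'd41d8cd98f00b204e9800998ecf8427e',
    '2009-01-01',
    '2009-01-02',
    '2009-01-03',
)
connection.commit()
```

With this form the .replace('"', '`') calls become unnecessary, so the path in the database matches the path on disk.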

My problem comes from this line:
FileHash = open(os.path.join(root, name), 'rb').read()
When I run that on certain files, I receive this error:
K:\Altium Support Files\Altium Designer 6 Updates\Build 6.8.1.11735\AltiumDesigner6Update(9346to11735).exe
Traceback (most recent call last):
File "C:\Python\Hash.py", line 16, in <module>
FileHash = open(os.path.join(root, name), 'rb').read()
MemoryError
This particular file is about 500 MB in size, so I can see why it is dying, but so far I have not found a way to handle these errors gracefully and keep processing, nor a way to fix them.
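
One possible way to keep processing after a failure is to catch the exception per file, record the path, and move on to the next file. The sketch below is an assumption about how that could look, not a tested replacement for the script above: hash_tree is a hypothetical name, the database insert is left out, and the file is still read in one piece. Catching MemoryError usually works for a single oversized read, but recovery is not guaranteed in every situation.

```python
import hashlib
import os


def hash_tree(top):
    """Walk `top` and return (hashes, failures) instead of stopping at the first error."""
    hashes = {}    # path -> MD5 hex digest
    failures = []  # (path, description of the error)
    for root, dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            try:
                # The with-block closes the file even if read() raises.
                with open(path, 'rb') as handle:
                    hashes[path] = hashlib.md5(handle.read()).hexdigest()
            except (IOError, OSError, MemoryError) as error:
                # Remember the failure and carry on with the next file.
                failures.append((path, repr(error)))
    return hashes, failures
```

Calling hash_tree('K:\\') would then return the digests that succeeded together with a list of the files that could not be read, which can be printed or logged at the end of the run. Because the file is closed by the with-block and the local names are rebound on each iteration, explicit del and gc.collect() calls should not be needed.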

I must admit, I don't really understand garbage collection or how to properly clear out all of my variables in Python, so that might be part of the problem.

Is there any other way to pull file hashes without reading the whole file into memory? Or is there a better way to handle exceptions?
Links, books, criticisms, and help are all welcome.
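
For reference, hashlib objects can be fed incrementally through update(), so a file can be hashed without holding all of it in memory. A minimal sketch, assuming a 1 MB chunk size (an arbitrary choice) and a hypothetical helper name md5_of_file:

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of `path`, reading chunk_size bytes at a time."""
    digest = hashlib.md5()
    with open(path, 'rb') as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                # An empty read means end of file.
                break
            digest.update(chunk)
    return digest.hexdigest()
```

The result is the same digest as hashlib.md5(open(path, 'rb').read()).hexdigest(), but memory use stays near the chunk size regardless of how large the file is, which should avoid the MemoryError on the 500 MB file.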