Tuesday, April 4, 2017

[Python] How to anonymize logs - removing email addresses and replacing them with SHA1 hashes

This script reads a log file and replaces every email address in it with the SHA1 hash of that address.



# Written for Python 3
import hashlib
import re

# Matches anything that looks like an email address
EMAIL_RE = re.compile(r'[\w.-]+@[\w.-]+')

def hash_email(match):
    # Replace the matched address with the SHA1 hex digest of its UTF-8 bytes
    return hashlib.sha1(match.group(0).encode('utf-8')).hexdigest()

# Both files are closed automatically when the with block ends
with open('/var/log/z-push/z-push.log') as src, \
     open('/var/log/z-push/z-push-hashed.log', 'w') as dst:
    for line in src:
        # Hash every address on the line; lines without one are copied unchanged
        dst.write(EMAIL_RE.sub(hash_email, line))
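
To try the substitution without touching the real log files, here is a minimal sketch that applies the same idea to a string in memory. The function name anonymize_line and the sample addresses are made up for illustration and are not part of the script above. It assumes Python 3 and passes a function to re.sub so that every address on a line is hashed, not only the first one.

```python
import hashlib
import re

# Same pattern as in the script: anything that looks like an email address
EMAIL_RE = re.compile(r'[\w.-]+@[\w.-]+')

def anonymize_line(line):
    """Return line with every email-like token replaced by its SHA1 hex digest."""
    def _hash(match):
        # hashlib needs bytes in Python 3, so encode the matched text first
        return hashlib.sha1(match.group(0).encode('utf-8')).hexdigest()
    return EMAIL_RE.sub(_hash, line)

# Hypothetical sample line with two addresses
sample = 'login ok for alice@example.com, forwarded to bob@example.org'
print(anonymize_line(sample))
```

The same address always produces the same hash, so you can still follow one user through the anonymized log. Keep in mind that the pattern is deliberately loose: it also swallows a trailing dot or hyphen that directly follows an address.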