Reducing my online imprint

this is not really related to scanlines at all, but i wanted to write it down somewhere and didn't know where else to put it lol…

i had the idea today to try to reduce my online imprint (well, something i have thought about before but never acted on… :sweat_smile: ) - in particular to close as many of the accounts i have opened over the years as possible… and for the ones that are left, change the passwords and add them to my password manager

(an aside: switching to bitwarden now too → i'm sure they are fine but seeing the size of the “security issues” segment on lastpass’ wikipedia page made me nervous haha )

so i started by trying to find a list of everything i have signed up for. found a few articles online about this - seems like if you mostly use google/facebook login to register, this gets a lot easier since they will have that list available for you. unfortunately i never really used those - i mostly signed up with email.

there is a service called Deseat.me that reads all your emails and tells you what you signed up for. sounds a little suspect to let some random app read all my emails, and it doesn't seem to work any more anyway. but it did get me thinking … :thinking: :face_with_monocle:

since i'm still using gmail (finding another email provider is a challenge for another day - recommendations welcome :wink: ) i downloaded a copy of all my emails using takeout.google.com (also found that name much funnier than it probably is :joy: ) - this was a 2.4gb .mbox file.

next i wrote a small python script to collect the unique sender domains of every email i've ever received:

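import mailbox
from email.utils import parseaddr
from tqdm import tqdm

from_set = set()

mbox = mailbox.mbox('email_backup.mbox')  # swap in sample.mbox for quick tests
for i, message in tqdm(enumerate(mbox)):
    try:
        # str() because the header can be a Header object, or None if it is missing
        from_set.add(str(message['From']))
    except Exception:
        print(f'error on item {i}')

# parseaddr pulls the bare address out of 'Some Name <user@host>'
addresses = {parseaddr(a)[1] for a in from_set}
# skip anything without an '@' instead of crashing on it
domain = {addr.rsplit('@', 1)[1].lower() for addr in addresses if '@' in addr}

with open("Output.txt", "w") as text_file:
    for value in domain:
        text_file.write(f'{value}\n')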

the output looked like:

alphabethead.bandcamp.com
spektrum.community
eventcinemas.co.nz
ipom.co.nz
harcourts.co.nz
slack.com
stackoverflow.email
kinoheld.com
dpd.de
aegeanair.com
account.pinterest.com
...

there were 878 unique domains in my inbox ! :astonished:
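with this many domains it might help to know which ones actually send the most mail. a rough sketch, not part of the script above: count messages per sender domain with a `Counter` instead of collecting a set. `domain_of` and `count_domains` are hypothetical helper names:

```python
from collections import Counter
from email.utils import parseaddr


def domain_of(from_header):
    """Return the lowercased domain of a From header, or None if there isn't one."""
    address = parseaddr(str(from_header))[1]
    if '@' not in address:
        return None
    return address.rsplit('@', 1)[1].lower()


def count_domains(from_headers):
    """Count how many From headers belong to each sender domain."""
    counts = Counter()
    for header in from_headers:
        domain = domain_of(header)
        if domain:
            counts[domain] += 1
    return counts
```

something like `count_domains(message['From'] for message in mbox).most_common(20)` would then list the 20 most frequent senders.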

i then tried to filter it down a little by modifying the script slightly:


phrases = [
'welcome',
'password',
'username',
'activate',
'confirm',
'subscription',
'unsubscribe',
'log in',
'verify',
'joining',
'account',
'$',
'free trial',
'register',
'forum',
'thread',
'community',
]
...

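# note: for multipart emails get_payload() returns a list of parts, not a string, so those never match here
payload = message.get_payload()
if isinstance(payload, str) and any(phrase in payload for phrase in phrases):
    from_set.add(message['From'])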
...
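one catch with the filter above: for multipart emails (e.g. many html newsletters) `get_payload()` returns a list of parts instead of a string, so the phrase check never matches them, and the check is also case-sensitive. here is a minimal sketch of a more thorough check, assuming it is fine to search only the text parts. `body_text` and `mentions_signup` are made-up helper names, not part of the original script, and `PHRASES` is just a shortened copy of the list:

```python
from email.message import Message

PHRASES = ['welcome', 'password', 'verify']  # shortened copy of the list above


def body_text(message: Message) -> str:
    """Join the decoded text/* parts of a message into one string."""
    chunks = []
    for part in message.walk():  # yields the message itself and every sub-part
        if part.get_content_maintype() != 'text':
            continue  # skip multipart containers and attachments
        raw = part.get_payload(decode=True)  # bytes, base64/quoted-printable undone
        if raw is None:
            continue
        charset = part.get_content_charset() or 'utf-8'
        try:
            chunks.append(raw.decode(charset, errors='replace'))
        except LookupError:  # charset name python does not know
            chunks.append(raw.decode('utf-8', errors='replace'))
    return '\n'.join(chunks)


def mentions_signup(message: Message, phrases=PHRASES) -> bool:
    """True if any phrase appears in the message text, ignoring case."""
    text = body_text(message).lower()
    return any(phrase in text for phrase in phrases)
```

in the loop this would replace the `if any(...)` line with `if mentions_signup(message):`. the domain count will probably come out different from the one reported here, since more emails can match.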

from all the emails i had that contained any of those phrases i now have 211 unique domains.

i found these sites that provide direct links / info about how to get your accounts removed from various platforms:

i guess the next step is to start combing through my 211 domain list by hand and figure out where i have accounts etc. will keep ya posted :blush:

extra stuff:

  • if you haven't already, definitely try your email on https://haveibeenpwned.com/ (and at least change/manage your passwords !)
  • since loading my >2gb of emails into python takes a few minutes each time, i used this sample.mbox to test my code on first
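if there is no ready-made sample file around, one option is to cut a small one out of the real export. a minimal sketch, assuming the takeout file is called `email_backup.mbox`; `make_sample` is a made-up helper name. note that python's `mailbox` still scans the whole file once to index it, so this step is slow too, but it only has to run once:

```python
import mailbox
from itertools import islice


def make_sample(src_path, dst_path, n=200):
    """Copy the first n messages from src_path into an mbox at dst_path.

    Returns the number of messages copied. If dst_path already exists,
    the messages are appended to it.
    """
    src = mailbox.mbox(src_path, create=False)  # fail loudly if the path is wrong
    dst = mailbox.mbox(dst_path)                # created if it does not exist
    copied = 0
    try:
        for message in islice(src, n):
            dst.add(message)
            copied += 1
        dst.flush()  # write the new messages to disk
    finally:
        dst.close()
        src.close()
    return copied


# example call (file names are assumptions):
# make_sample('email_backup.mbox', 'sample.mbox', n=200)
```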

anyone else thought about doing this ? or has any other thoughts ?


i have been thinking about this a lot too. i have been feeling more and more vulnerable online, like nothing is private anymore, particularly with the idea of ultrasonic tracking. i also don't want to be constantly logged into all of these sites that are communicating and data mining
