Today I Learned (the hard way): The difference between dict and list in Python

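Looking up a key in a dict takes O(1) time on average, no matter how many elements it holds. Finding an element in a list with index() takes O(n) time, because the list is scanned from the start.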
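To make the difference concrete, here is a minimal timing sketch. The synthetic word list, its size, and the names `words` and `index_of` are assumptions for illustration, not the data from this post, and the absolute timings will vary by machine.

```python
import timeit

# Hypothetical data: 200,000 synthetic words standing in for a real word file.
words = ["word%d" % i for i in range(200000)]
# Map each word to its position in the list.
index_of = {word: i for i, word in enumerate(words)}

target = words[-1]  # worst case for the list: the last element

# Both lookups return the same position.
assert words.index(target) == index_of[target]

# Time 20 lookups each way; only the relative gap is meaningful.
list_time = timeit.timeit(lambda: words.index(target), number=20)
dict_time = timeit.timeit(lambda: index_of[target], number=20)
print("list.index: %.6f s, dict lookup: %.6f s" % (list_time, dict_time))
```

The gap should widen as the list grows, since the list scan does more work per lookup while the dict lookup stays roughly constant.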

Given a list of elements in a file, this piece of code:

# Read one word per line and strip the surrounding whitespace
with open('dict.txt') as f:
    lists = [line.strip() for line in f]
# Map each word to its position in the file
dicts = dict(zip(lists, range(len(lists))))

# w is the word to look up
index = dicts[w]

is on the order of n times faster per lookup than this piece of code:

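# Read one word per line into a plain list
with open('dict.txt') as f:
    dicts = [line.strip() for line in f]

# Scans the list from the start until it finds w
index = dicts.index(w)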

I learned that the hard way. My dict has 5,000,000 elements, which means each lookup in the list version can be millions of times slower than it should be. I ran that version on 20 machines, each with 64 cores, and it still had not finished after ~10 hours. After changing from list to dict, it took about 900 seconds on a single machine (with 64 cores, of course).
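One caveat when swapping the list for a dict: if the file can contain duplicate entries, the two versions may disagree. list.index() returns the first position, while a dict built from (word, position) pairs keeps the last one. A minimal sketch of a first-occurrence index follows; the helper name `build_index` and the sample words are hypothetical.

```python
def build_index(words):
    """Map each word to the position of its first occurrence."""
    index_of = {}
    for i, word in enumerate(words):
        # setdefault stores i only if the word is not already a key
        index_of.setdefault(word, i)
    return index_of


words = ["apple", "banana", "apple", "cherry"]  # made-up sample data
index_of = build_index(words)

print(index_of["apple"])                             # 0, same as words.index("apple")
print(dict(zip(words, range(len(words))))["apple"])  # 2, the last occurrence
```

If the entries are known to be unique, the simpler dict(zip(...)) form gives the same result.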

Next time you need to do lookups in Python, use a dict. Happy coding!