Here’s a Python gotcha that I spent some time tracking down. I’m writing it up in the spirit of saving developers a debugging headache in the future.
I had an integer with a single bit set, and I wanted to find the index of that bit. For example, 4 in binary is 00000100. The 1 is at the third position from the right, which should give an index of 2 – since the first position is 0.
You can do this in two ways; either check each bit in turn until you find a 1, or you can use math as a shortcut. I chose the math solution:
>>> import math >>> myint = 4 >>> int(math.log(myint, 2)) 2
Simple right? Finally staying awake in high school maths paid off. So simple that it was the last bit of code I suspected to be broken (spoiler: it was).
This is Python 2.7 which still has two types of integer; type int and arbitrary long integer type long. I was testing with ints because that’s what you get when you type 4. However the numbers I was getting out of the Django db where longs. Look what happens with the above code when you use 4L rather than 4:
>>> myint = 4L
>>> int(math.log(myint, 2))
And that was the result of my headache. Longs and ints are generally interchangeable. But not in this case. math.log gives a different result with long and ints. Which we can see here.
>>> math.log(4, 2)
>>> math.log(4L, 2)
That tiniest of rounding errors for the long version would be insignificant for most applications, but not if you are converting to an integer and discarding the fractional part. The fix is simple. Round the return value of math.log to the nearest whole.
>>> int(round(math.log(4L, 2)))
If you are working with Python 3, this problem goes away. Another reason to migrate if you have the option!