Saturday, December 11, 2010

Consider the tuple...

I remember learning Python and wondering what a tuple was for. Why wouldn't you just use a list? Or a dict?

What I've come down to is a few thoughts such as:
1) if I want sortable items, use a list. (list.sort in Python is way fast!)
2) if I want to use an item as a key and find the related info (value), put the key/value pair into a dict.
3) if I want to have possibly-heterogeneous-but-related items in a specific unchangeable order, consider a tuple.


Tuples are often overlooked as a data structure in Python, but they can be really useful for things like x,y coordinates or GPS coordinates or timestamps or addresses. It's important to have them in order - you don't want to mix up the x and the y, or the hour and the minutes. You can't do much *to* tuples: they are immutable (unchangeable) and have almost no methods (just count and index), although you can search in them with in. If you have a tuple that needs something changed, you'll just have to replace it with a new tuple.
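Here's a quick sketch of what that looks like in practice. The point values and variable names are made up just for illustration:

```python
# A hypothetical (x, y) coordinate.
point = (3, 7)

# Searching with "in" works fine.
has_seven = 7 in point

# Trying to change an item in place raises a TypeError.
try:
    point[0] = 10
    assignment_failed = False
except TypeError:
    assignment_failed = True

# To "change" the point, build a new tuple and use that instead.
moved = (10, point[1])
```

The original point is untouched; moved is a brand new tuple.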

Of course, a kewl thing in Python is that you can mix and match. You can have a list of tuples. Or a dict of tuples. Tuples are really useful as keys in a dict, because they're immutable, unlike lists. So, for example, you could have a dict whose keys are tuples of GPS coordinates and whose values are the place names at those locations, or a dict of house prices keyed by address. You can also loop over the keys and use "in" on each one to find all the keys that have, say, Sunnyvale as the name of the city, and find out some value related to Sunnyvale.
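A small sketch of the dict-of-tuples idea might look like this. The addresses and prices are invented, and the (number, street, city) layout is just one assumption about how you might arrange the tuple:

```python
# Hypothetical data: (street number, street, city) -> house price.
prices = {
    ("213", "1st Street", "Sunnyvale"): 650000,
    ("42", "Main Street", "Sunnyvale"): 720000,
    ("7", "Oak Avenue", "Palo Alto"): 990000,
}

# Exact lookup uses the whole tuple as the key.
one_price = prices[("213", "1st Street", "Sunnyvale")]

# "in" checks whether "Sunnyvale" is one of the items in each key tuple.
sunnyvale_prices = [
    price for address, price in prices.items() if "Sunnyvale" in address
]

# A list can't be used as a key, because it is mutable (unhashable).
try:
    {["213", "1st Street", "Sunnyvale"]: 650000}
    list_key_ok = True
except TypeError:
    list_key_ok = False
```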

Lists and dicts get all the press in Python, because they're really useful and fast. But tuples are pretty awesome in their own way...

EDIT to add examples:

With a list mylist = [2,1], calling mylist.sort() changes it in place to [1,2]. If 2,1 is a coordinate point, then sorting it to 1,2 is a REALLY BAD THING and could lead to really bad bugs.

OTOH: if mytuple = 1,2 and you call mytuple.sort(), you get an error because tuples have no sort method and can't be changed in place. This is a GOOD THING: that immutability is what makes a tuple usable as a hashable key in a dict, and for identifying specific things.
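To see the difference side by side, here's a minimal sketch. I've used (2, 1) for the tuple so the ordering is visible; the variable names are just for illustration:

```python
mylist = [2, 1]
mylist.sort()  # sorts in place; mylist is now [1, 2]

mytuple = (2, 1)
try:
    mytuple.sort()  # tuples have no sort method
    error_name = None
except AttributeError as exc:
    error_name = type(exc).__name__

# sorted() does accept a tuple, but it returns a new list
# and leaves the tuple itself alone.
sorted_copy = sorted(mytuple)

# Because the tuple can't change, it works as a dict key.
lookup = {mytuple: "a point"}
```

Note that sorted(mytuple) won't stop you, so the protection is against accidental in-place changes, not against every possible reordering.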

A list has no business being coerced to a tuple in order to act as a key in a dict. I think that's an abuse of lists. You'd be better off thinking of a tuple as a record in a database. "213 1st Street" means nothing when sorted in a list ['1st', '213', 'Street'], but means everything when left as a tuple ('213', '1st', 'Street') as it should be. Tuples are great for data - when the position *means* something, not just an "ordering". People ask, "why don't you just have named fields?" A phone number doesn't really need named fields for the 800 to mean something and for it to be important that it not have its order rearranged. Take the following two tuples: (408,555,1212) and (555,408,1212). If you treat them just as lists, you could end up sorting them and they'd be "identical". But they're not identical - they're phone numbers for completely different parts of the country and the structure is meaningful. Which means tuples would be appropriate here, and that structure is what makes them good as keys.
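The phone number case can be sketched like this. The owner names are hypothetical, added only to show the two tuples acting as separate keys:

```python
phone_a = (408, 555, 1212)
phone_b = (555, 408, 1212)

# As tuples, position matters, so these are different numbers.
different = phone_a != phone_b

# Sorted, they collapse into the same list and the meaning is lost.
same_when_sorted = sorted(phone_a) == sorted(phone_b)

# As dict keys they stay distinct.
owners = {phone_a: "Alice", phone_b: "Bob"}
```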