Tuesday, April 10, 2012

Python csv reader and namedtuple: together at last

Here is a really tasty code snippet that shows the power of combining named tuples with the Python csv reader:
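# Python 2 syntax (print statement, 'rb' mode for the csv module)
import csv
from collections import namedtuple

# Field names, in the same order as the columns in the file
LogLine = namedtuple('LogLine', 'fqdn,ip,user')

with open(myfilename, 'rb') as f:
  for line in map(LogLine._make, csv.reader(f)):
    print 'user %s visited %s from %s' % (line.user,
                                          line.fqdn,
                                          line.ip)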
What's going on here?
  1. csv.reader(f) reads from the file object f row by row and yields each row as a list of strings, like ['blah.com','192.168.1.1','user']
  2. We use map to call LogLine._make on each item from the csv iterator (i.e. each row), which creates a LogLine tuple from that list.
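To see the two steps in isolation, here is a minimal sketch written for Python 3 (so print is a function). The in-memory sample data and the names alice and bob are made up for illustration and stand in for the real log file.

```python
import csv
import io
from collections import namedtuple

LogLine = namedtuple('LogLine', 'fqdn,ip,user')

# Hypothetical sample data standing in for the log file.
sample = io.StringIO('blah.com,192.168.1.1,alice\n'
                     'example.org,10.0.0.7,bob\n')

# Step 1: csv.reader yields one list of strings per row.
rows = list(csv.reader(sample))

# Step 2: LogLine._make turns each list into a named tuple.
entries = [LogLine._make(row) for row in rows]

for entry in entries:
    print('user %s visited %s from %s' % (entry.user, entry.fqdn, entry.ip))
```

Note that _make expects each row to have exactly as many values as the tuple has fields, so a short or long row raises a TypeError.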
The advantages are:
  • No need to split() and strip(): the csv reader handles the delimiters, quoting, and line endings for you.
  • No referring to line[0], line[1], etc., forgetting which one is which, and having to change a bunch of code if you add a field or the order changes. The tuple gives you human-readable names, and the file structure is defined in one place, the namedtuple definition, so changes to fields are simple.
  • Also, if you aren't familiar with the 'with' statement, it is very nice for working with files since it replaces the 'try: read, finally: close' boilerplate code.
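
As a rough illustration of that last point, the sketch below (Python 3) reads the same file twice, once with 'with' and once with the try/finally boilerplate it replaces. The function names and the temporary file are hypothetical and exist only for this example. Each function also returns the file object so we can check that it was closed.

```python
import os
import tempfile

def read_with(path):
    # The file is closed automatically when the block exits.
    with open(path) as f:
        return f.read(), f

def read_try_finally(path):
    # The boilerplate that 'with' replaces.
    f = open(path)
    try:
        return f.read(), f
    finally:
        f.close()

# Hypothetical temporary file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix='.csv')
with os.fdopen(fd, 'w') as tmp:
    tmp.write('blah.com,192.168.1.1,alice\n')

text_a, file_a = read_with(path)
text_b, file_b = read_try_finally(path)
os.remove(path)

print(text_a == text_b, file_a.closed, file_b.closed)
```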

1 comment:

Anonymous said...

I often create the namedtuple from the header:

header = reader.next()
DataRowTuple = namedtuple('DataRowTuple',header)
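
For anyone who wants to try the header approach, here is a minimal self-contained sketch for Python 3, where reader.next() is spelled next(reader). The sample data is made up, and it assumes every header cell is a valid Python identifier; if that might not hold, namedtuple's rename=True option replaces invalid names with positional ones like _0.

```python
import csv
import io
from collections import namedtuple

# Hypothetical sample data: first row is the header.
sample = io.StringIO('fqdn,ip,user\n'
                     'blah.com,192.168.1.1,alice\n')

reader = csv.reader(sample)
header = next(reader)  # consume the header row
DataRowTuple = namedtuple('DataRowTuple', header)

# The remaining rows become named tuples keyed by the header names.
data_rows = [DataRowTuple._make(row) for row in reader]

for row in data_rows:
    print(row.user, row.fqdn, row.ip)
```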