When I read data back in from a CSV file, every cell is interpreted as a string.
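To make the problem concrete, here is a small sketch of the round trip. The sample data and column names are made up for illustration, and an in-memory `io.StringIO` stands in for a real file.

```python
import csv
import io

# Hypothetical sample data, standing in for a real CSV file on disk.
sample = io.StringIO("name,count,ratio,flag\nwidget,3,0.5,True\n")

rows = list(csv.DictReader(sample))
first = rows[0]

# Every value comes back as str, whatever it looked like when written.
for header, value in first.items():
    print(header, repr(value), type(value).__name__)
```

The `count`, `ratio` and `flag` cells all arrive as `str`, so any typing has to be restored after reading.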
Props to Jon Clements and cortopy for teaching me about ast.literal_eval! Here's what I ended up going with (Python 2; changes for 3 should be trivial):

    from ast import literal_eval
    from csv import DictReader
    import csv


    def csv_data(filepath, **col_conversions):
        """Yield rows from the CSV file as dicts, with column headers as the keys.

        Values in the CSV rows are converted to Python values when possible,
        and are kept as strings otherwise.

        Specific conversion functions for columns may be specified via
        `col_conversions`: if a column's header is a key in this dict, its
        value will be applied as a function to the CSV data. Specify
        `ColumnHeader=str` if all values in the column should be interpreted
        as unquoted strings, but might be valid Python literals (`True`,
        `None`, `1`, etc.).

        Example usage:

        >>> csv_data(filepath,
        ...          VariousWordsIncludingTrueAndFalse=str,
        ...          NumbersOfVaryingPrecision=float,
        ...          FloatsThatShouldBeRounded=round,
        ...          **{'Column Header With Spaces': arbitrary_function})
        """

        def parse_value(key, value):
            if key in col_conversions:
                return col_conversions[key](value)
            try:
                # Interpret the string as a Python literal
                return literal_eval(value)
            except Exception:
                # If that doesn't work, assume it's an unquoted string
                return value

        with open(filepath) as f:
            # QUOTE_NONE: don't process quote characters, to avoid the value
            # `"2"` becoming the int `2`, rather than the string `'2'`.
            for row in DictReader(f, quoting=csv.QUOTE_NONE):
                yield {k: parse_value(k, v) for k, v in row.iteritems()}

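For Python 3, a minimal sketch of the same generator might look like the following. The name `csv_data_py3`, the sample file contents and the column names are hypothetical. The only real changes are `items()` in place of `iteritems()` and opening the file with `newline=''`, as the `csv` module documentation recommends.

```python
import csv
import os
import tempfile
from ast import literal_eval


def csv_data_py3(filepath, **col_conversions):
    """Python 3 variant of csv_data (hypothetical name, same behaviour)."""

    def parse_value(key, value):
        if key in col_conversions:
            return col_conversions[key](value)
        try:
            # Interpret the string as a Python literal
            return literal_eval(value)
        except Exception:
            # If that doesn't work, assume it's an unquoted string
            return value

    # newline='' lets the csv module handle line endings itself.
    with open(filepath, newline='') as f:
        for row in csv.DictReader(f, quoting=csv.QUOTE_NONE):
            # dict.items() replaces Python 2's iteritems()
            yield {k: parse_value(k, v) for k, v in row.items()}


# Hypothetical sample file, written to a temporary location for the demo.
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False,
                                 newline='') as tmp:
    tmp.write('Word,Count,Quoted,Ratio\nTrue,3,"2",0.25\nmaybe,7,"x",1e3\n')
    sample_path = tmp.name

rows = list(csv_data_py3(sample_path, Word=str))
os.remove(sample_path)
print(rows)
```

Here `Word=str` keeps `True` as the string `'True'`, the quoted `"2"` comes back as the string `'2'`, and the unquoted `3`, `0.25` and `1e3` become numbers.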
(I'm a little wary that I might have missed some corner cases involving quoting. Please comment if you see any issues!)
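One quoting corner case that does seem to apply: with `QUOTE_NONE`, quote characters no longer protect delimiters, so a quoted field containing a comma is split in two. A small sketch (Python 3, with a made-up sample line) comparing the default reader against `QUOTE_NONE`:

```python
import csv
import io

# Hypothetical line with a quoted field that contains the delimiter.
line = '"Smith, John",42\n'

# Default quoting: the quotes group the field and are stripped.
default_fields = next(csv.reader(io.StringIO(line)))

# QUOTE_NONE: quotes are ordinary characters, so the comma splits the field.
raw_fields = next(csv.reader(io.StringIO(line), quoting=csv.QUOTE_NONE))

print(default_fields)  # ['Smith, John', '42']
print(raw_fields)      # ['"Smith', ' John"', '42']
```

So the approach above is only safe for files whose quoted fields never contain the delimiter (or embedded newlines). Files that do would need a different way of telling `"2"` apart from `2`.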