I'm using Python 2.6 on a Mac Mini with 1GB RAM. I want to read in a huge text file
$ ls -l links.csv; file links.csv; tail links.csv
-rw-r--r-- 1 user u
All Python objects have a memory overhead on top of the data they actually store. According to getsizeof on my 32-bit Ubuntu system, a tuple has an overhead of 32 bytes and an int takes 12 bytes, so each row in your file takes 56 bytes plus a 4-byte pointer in the list; I presume it will be a lot more on a 64-bit system. This is in line with the figures you gave and means your 30 million rows will take about 1.8 GB.
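As a rough sanity check, here is a minimal sketch of that arithmetic; the exact byte counts come from sys.getsizeof, are specific to a 32-bit CPython 2.x build, and will be roughly double on 64-bit (the (9033, 12500) pair is just a made-up stand-in for one CSV row):

import sys

row = (9033, 12500)                    # hypothetical (id, count) pair from one line
tuple_bytes = sys.getsizeof(row)       # ~32 bytes on 32-bit: the tuple object itself
int_bytes = 2 * sys.getsizeof(row[0])  # ~24 bytes: the two int objects it points to
list_slot = 4                          # the list's own 4-byte pointer to the tuple
per_row = tuple_bytes + int_bytes + list_slot
print per_row                          # ~60 bytes per row
print per_row * 30 * 1000 * 1000       # ~1.8e9 bytes for 30 million rows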
I suggest that instead of using Python you use the unix sort utility. I am not a Mac-head, but I presume the OS X sort options are the same as the Linux version's, so this should work:
sort -n -t, -k2 links.csv
-n means sort numerically
-t, means use a comma as the field separator
-k2 means sort on the second field
This will sort the file and write the result to stdout. You could redirect it to another file or pipe it into your Python program for further processing.
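For example (the output file and script names here are only placeholders):

sort -n -t, -k2 links.csv > links_sorted.csv
sort -n -t, -k2 links.csv | python yourscript.py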
Edit: If you do not want to sort the file before you run your Python script, you could use the subprocess module to spawn the shell sort utility with a pipe, then read the sorted results from the pipe's output.
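A minimal sketch of that approach, assuming the file is named links.csv and has two comma-separated integer columns as in your example:

import subprocess

# Run the external sort and read its output one line at a time, so the
# full file never has to fit in Python's memory at once.
proc = subprocess.Popen(["sort", "-n", "-t,", "-k2", "links.csv"],
                        stdout=subprocess.PIPE)
for line in proc.stdout:
    link_id, count = line.rstrip("\n").split(",")
    # ... process one sorted row at a time here ...
proc.stdout.close()
proc.wait()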