I have to work with a big CSV file, up to 2 GB. More specifically, I have to upload all of this data to a MySQL database, but first I need to do a few calculations on it, so I need to do all of this in MATLAB (my supervisor also wants it done in MATLAB because he is only familiar with MATLAB :( ).
Any idea how I can handle these big files?
You should probably use textscan to read the data in chunks and then process each chunk. This will probably be more efficient than reading a single line at a time. For example, if you have 3 columns of data, you could do:
filename = 'fname.csv';
[fh, errMsg] = fopen( filename, 'rt' );
if fh == -1, error( 'couldn''t open file: %s: %s', filename, errMsg ); end
N = 100; % read 100 rows at a time
while ~feof( fh )
    % read up to N rows of 3 comma-separated numeric columns
    c = textscan( fh, '%f %f %f', N, 'Delimiter', ',' );
    doStuff( c );
end
fclose( fh ); % don't forget to release the file handle
EDIT
These days (R2014b and later), it's easier and probably more efficient to use a datastore.
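For a CSV like the one above, a minimal sketch might look like the following (reusing the placeholder file name and the doStuff callback from the example above):

ds = datastore( 'fname.csv' );   % auto-detects a TabularTextDatastore for a CSV file
ds.ReadSize = 100000;            % rows per read; tune to your memory budget
while hasdata( ds )
    t = read( ds );              % next chunk, returned as a table
    doStuff( t );
end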
There is good advice on handling large datasets in MATLAB in this File Exchange item.
Specific topics include:
* Understanding the maximum size of an array and the workspace in MATLAB
* Using undocumented features to show you the available memory in MATLAB
* Setting the 3GB switch under Windows XP to get 1GB more memory for MATLAB
* Using textscan to read large text files and the memory-mapping feature (memmapfile) to read large binary files (a minimal sketch follows this list)
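To illustrate that last point, here is a minimal memory-mapping sketch; the file name 'data.bin', the record layout, and the field name 'xyz' are assumptions for illustration, not something prescribed by the File Exchange item:

% assume a binary file of double-precision records, 3 values per record
m = memmapfile( 'data.bin', 'Format', { 'double', [3 1], 'xyz' } );
rec = m.Data(1).xyz;        % touches only the bytes backing record 1
nRecords = numel( m.Data ); % number of complete records in the file

Because the mapping is lazy, indexing into m.Data reads just the requested slice from disk rather than loading the whole file into memory.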
Source: https://stackoverflow.com/questions/5702872/working-with-a-big-csv-file-in-matlab