Read a txt file fscanf vs. fread vs. textscan [duplicate]

Submitted by 混江龙づ霸主 on 2019-12-12 20:43:14

Question


I have a .txt file generated from SQL-2005 (in ANSI format). The entire file contains only numeric data. I have tried reading it with both textscan and fscanf.

Online resources suggest that fscanf is faster than textscan, but I found otherwise.

  • Textscan was much faster than fscanf

I want to try this with fread as well but I do not know how to import data using fread. Can you please suggest/comment? Thanks.

fName     = 'Test.txt';   % From SQL in ANSI format, 5 million rows, 5 columns
Numofrows = 1000000;      % Read the first 1 million rows
Numcols   = 5;

% Method 1: textscan (the format is applied up to Numofrows times)
fid = fopen(fName, 'r');
C   = textscan(fid, '%f %f %f %f %f', Numofrows);
C   = cell2mat(C);
fclose(fid);

% Method 2: fscanf (returns a column vector, so reshape and transpose)
fid = fopen(fName, 'r');
[C, Count] = fscanf(fid, '%f %f %f %f %f', Numofrows * Numcols);
C = reshape(C, Numcols, Count / Numcols)';
fclose(fid);

Answer 1:


Ideally you would be able to get your data into a binary format and then use fread to read the double-precision numbers in directly. I would expect fread to be a lot faster in that case. (String-to-number conversions are expensive, and a raw binary format will result in a much smaller file.)
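As a minimal sketch of the binary approach, assuming the data can be converted once and reused: the file name 'Test.bin' and the small sample matrix below are made up for illustration and are not from the original question.

```matlab
% Sketch: write numeric data as raw doubles once, then read it with fread.
% 'Test.bin' and the 3x5 sample matrix are assumptions for illustration.
Numcols = 5;
C = reshape(1:15, Numcols, 3)';      % 3 rows x 5 columns of sample data

% Transpose before writing so that each row's values are contiguous on disk.
fid = fopen('Test.bin', 'w');
fwrite(fid, C', 'double');
fclose(fid);

% fread fills its output column by column, so read Numcols x Inf and transpose.
fid = fopen('Test.bin', 'r');
D = fread(fid, [Numcols, Inf], 'double')';
fclose(fid);

roundTripOk = isequal(C, D);         % true when the data survives the round trip
```

No string-to-number conversion happens on the read side, which is where the expected speedup would come from. Actual timings depend on the machine and file, so they are worth measuring rather than assuming.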

Otherwise you can read characters using fread and then run a string-to-number conversion on the incoming data (sscanf seems to be the best). The only trick is that each batch you read must end on a line break, otherwise the string-to-number conversion is likely to give unpredictable results. You can do that by first reading a large batch of characters, then either backing up until you reach a line break, or reading additional characters until you find the end of the line. I have found this to be slightly faster than either textscan or fscanf, but your timings and mine do not match, so I'm not sure what to believe.
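A minimal sketch of the batching idea described above, using the "read additional characters to the end of the line" variant. The file name 'Sample.txt', its contents, and the batch size are assumptions for illustration. The batch size is deliberately tiny so that batches end in the middle of a line; a real file would use something much larger (for example 1e6 characters).

```matlab
% Sketch: read text in character batches with fread, convert with sscanf.
% 'Sample.txt', its contents, and batchSize are assumptions for illustration.
fName     = 'Sample.txt';
Numcols   = 5;
batchSize = 7;                        % characters per fread call (tiny on purpose)

% Create a small 4x5 numeric text file to read back.
A   = reshape(1:20, Numcols, 4)';
fid = fopen(fName, 'w');
fprintf(fid, '%g %g %g %g %g\n', A');
fclose(fid);

fid   = fopen(fName, 'r');
parts = {};                           % converted numbers from each batch
while true
    txt = fread(fid, batchSize, 'uint8=>char')';
    if isempty(txt), break; end
    % Extend the batch to the end of the current line so that no number
    % is split across two batches (fgets keeps the newline character).
    if txt(end) ~= char(10)
        rest = fgets(fid);
        if ischar(rest), txt = [txt, rest]; end
    end
    parts{end+1} = sscanf(txt, '%f'); %#ok<SAGROW>
end
fclose(fid);

vals = vertcat(parts{:});
C    = reshape(vals, Numcols, [])';   % one row per line of the file
```

Reading forward with fgets avoids having to seek backwards in the file, at the cost of batches that vary slightly in length.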

Example code for the second method, along with some timing results, is included in a previous answer that overlaps heavily with this question: https://stackoverflow.com/a/9441839/931379




Answer 2:


There is another option that you did not list: load

   L = load(fName);

It is very simple and will figure out the format automatically for you. It does have a limitation: every line of the file must contain the same number of values.
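A small sketch of both the convenience and the limitation. The file names 'LoadSample.txt' and 'Ragged.txt' and their contents are made up for illustration.

```matlab
% Sketch: load reads a rectangular numeric text file into a matrix.
% 'LoadSample.txt' and 'Ragged.txt' are assumptions for illustration.
fid = fopen('LoadSample.txt', 'w');
fprintf(fid, '%g %g %g\n', [1 2 3; 4 5 6]');
fclose(fid);

L = load('LoadSample.txt', '-ascii');    % 2x3 double matrix

% A file whose lines hold different numbers of values is rejected.
fid = fopen('Ragged.txt', 'w');
fprintf(fid, '1 2 3\n4 5\n');
fclose(fid);

loadFailed = false;
try
    R = load('Ragged.txt', '-ascii');
catch err
    loadFailed = true;                   % load raised an error for uneven rows
    disp(err.message);
end
```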



Source: https://stackoverflow.com/questions/9543281/read-a-txt-file-fscanf-vs-fread-vs-textscan
