C++ iostream vs. C stdio performance/overhead


I'm trying to comprehend how to improve the performance of this C++ code to bring it on par with the C code it is based on. The C code looks like this:

#inc


                      
3 Answers

遥遥无期 (2020-12-18 07:11)

Update: I did some more testing and, if you have enough memory, there is a surprisingly simple solution that (at least on my machine with VS2015) outperforms the C solution: just buffer the file in a stringstream.

std::ifstream input("biginput.txt");
std::stringstream buffer;
buffer << input.rdbuf();  // copy the whole file into memory
point p;
while (buffer >> p) {
    i++;
}


So the problem seems to be related not so much to the C++ streaming mechanism itself as to the internals of ifstream in particular.
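
A self-contained sketch of the buffering idea above, for anyone who wants to try it in isolation. The `point` struct and its `operator>>` are assumptions (the question's code is cut off), and `count_points` is a hypothetical helper that accepts any `std::istream` so it can be exercised without a file.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>

// Assumed layout: the question's definition of point is not shown.
struct point {
    double x;
    double y;
};

// Assumed extraction operator: two whitespace-separated doubles.
std::istream& operator>>(std::istream& in, point& p) {
    return in >> p.x >> p.y;
}

// Copies everything left in `input` into memory, then parses from the
// in-memory buffer instead of from the original stream.
std::size_t count_points(std::istream& input) {
    std::stringstream buffer;
    buffer << input.rdbuf();  // one bulk copy of the remaining input

    std::size_t count = 0;
    point p;
    while (buffer >> p) {
        ++count;
    }
    return count;
}
```

With a file, this would be called as `std::ifstream input("biginput.txt"); count_points(input);`. Whether it beats parsing directly from the `ifstream` depends on the standard library implementation, so it is worth measuring on your own toolchain.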



Here is my original (outdated) answer:
@Frederik already explained that the performance mismatch is (at least partially) tied to a difference in functionality.

As for how to get the performance back: on my machine with VS2015 the following runs in about 2/3 of the time the C solution requires (although, on my machine, there is "only" a 3x performance gap between your original versions to begin with):

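istream &operator >> (istream &in, point &p) {
    // Reused across calls so the buffers are not reallocated for every line.
    thread_local std::stringstream ss;
    thread_local std::string s;

    if (std::getline(in, s)) {
        ss.clear();  // reset eofbit/failbit left over from the previous line
        ss.str(s);
        ss >> p.x >> p.y;
    }
    return in;
}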


I'm not too happy about the thread_local variables, but they are necessary to eliminate the overhead of repeated dynamic memory allocation.
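
One detail to check when a single stringstream is reused across lines, as in the operator above: once an extraction reaches the end of the buffer, `eofbit` stays set, and `str()` alone does not reset it. A minimal sketch, using a hypothetical `parse_pair` helper that calls `clear()` before each reuse:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: parses "x y" from `line` using a caller-owned,
// reusable stringstream. Returns true if both values were read.
bool parse_pair(std::stringstream& ss, const std::string& line,
                double& x, double& y) {
    ss.clear();    // reset eofbit/failbit left over from the previous line
    ss.str(line);  // replace the buffer contents; the stream state is untouched
    return static_cast<bool>(ss >> x >> y);
}
```

Without the `clear()` call, the second and later lines would typically fail to parse, because the stream is no longer in a good state after the first line has been consumed to its end.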
                                                                        
                                                        
            
            
              
                
误落风尘 (2020-12-18 07:20)

What's causing a significant difference in performance is a significant difference in the overall functionality.

I will do my best to compare both of your seemingly equivalent approaches in detail.

In C, each loop iteration:

1. Reads characters until a newline or end-of-file is detected, or the maximum length (1024) is reached.
2. Tokenizes, looking for the hardcoded whitespace delimiter.
3. Parses into double without any questions.

In C++, each loop iteration:

1. Reads characters until one of the default delimiters is detected. This doesn't limit the detection to your actual data pattern; it checks for more delimiters just in case. Overhead everywhere.
2. Once it has found a delimiter, tries to parse the accumulated string gracefully. It won't assume a pattern in your data. For example, if there are 800 consecutive numeric characters and the input is no longer a good candidate for the type, it must be able to detect that possibility by itself, which adds some overhead.
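
To make the C steps listed above concrete, here is a rough sketch of the tokenize-and-parse part. The question's C code is cut off, so the use of `strtok` and `atof` is an assumption based on the description, and `point` and `parse_line` are hypothetical names.

```cpp
#include <cstdlib>
#include <cstring>

// Assumed layout: the question's definition of point is not shown.
struct point {
    double x;
    double y;
};

// Parses one line of the form "x y\n" in place.
// Note: strtok modifies `line` by writing '\0' over the delimiters.
bool parse_line(char* line, point& p) {
    char* x_tok = std::strtok(line, " ");        // split on the hardcoded space
    if (x_tok == nullptr) return false;
    char* y_tok = std::strtok(nullptr, " \n");   // rest of the line, minus newline
    if (y_tok == nullptr) return false;
    p.x = std::atof(x_tok);                      // no validation of the digits
    p.y = std::atof(y_tok);
    return true;
}
```

In a full program, each line would presumably come from something like `fgets(line, 1024, file)`. The point of the sketch is how little work happens per line: one fixed delimiter and no format checking.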


One way to improve performance that I'd suggest is very close to what Peter said in the comments above. Use getline inside operator>> so you can tell the stream what your data looks like. Something like this should give some of your speed back, though it is somewhat like C-ing a part of your code back:

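istream &operator>>(istream &in, point &p) {
    // Fixed-size buffers: each coordinate may be at most 9 characters long.
    char bufX[10], bufY[10];
    in.getline(bufX, sizeof(bufX), ' ');   // x: everything up to the space
    in.getline(bufY, sizeof(bufY), '\n');  // y: the rest of the line
    p.x = atof(bufX);  // atof (from <cstdlib>) reports no parse errors
    p.y = atof(bufY);
    return in;
}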


Hope it's helpful.

Edit: applied nneonneo's comment
                                                                        
                                                        
            
            
              
                
离开以前 (2020-12-18 07:21)

As noted in the comments, make sure the actual algorithm for reading input is as good in C++ as in C. And make sure that you call
     std::ios::sync_with_stdio(false);
so the iostreams are not slowed down by synchronizing with C stdio.

In my experience, though, C stdio is faster than C++ iostreams; on the other hand, the C library is neither type-safe nor extensible.
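
For reference, a minimal sketch of the sync_with_stdio call mentioned above, wrapped in a hypothetical helper. The `std::cin.tie(nullptr)` line is an extra that this answer does not mention. Note that `sync_with_stdio` only affects the standard stream objects (`std::cin`, `std::cout`, `std::cerr`, `std::clog`), so it should make no difference to a `std::ifstream` reading a file.

```cpp
#include <iostream>

// Hypothetical helper: call once at startup, before any input or output.
void use_unsynced_iostreams() {
    // Stop synchronizing the standard C++ streams with C stdio.
    std::ios::sync_with_stdio(false);
    // Optional addition: do not flush std::cout before every read from std::cin.
    std::cin.tie(nullptr);
}
```

After this call, mixing `printf`/`scanf` with `std::cout`/`std::cin` in the same program may interleave output unpredictably, so it is best used when a program sticks to one of the two libraries.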
                                                                        
                                                        
            
            
              
                