C standard: Character set and string encoding specification


I found the C standard (C99 and C11) vague with respect to character/string code positions and encoding rules:

Firstly the standard defines the source characte


                      
2 Answers

情书的邮戳 (2021-01-04 09:37)

C is not picky about character sets. There is no such thing as a "default character set"; the choice is implementation-defined, although on most modern systems it is ASCII or UTF-8.
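
Since the choice is left to the implementation, a program can probe what it was actually given. The following is a minimal sketch, not a complete detection method: the function names `looks_like_ascii`, `letters_look_contiguous`, and `report_charset` are made up for this illustration, and the probes only check a handful of characters.

```c
#include <stdio.h>

/* Heuristic: returns 1 if a few probe characters have their ASCII values
   in the execution character set, 0 otherwise. */
int looks_like_ascii(void)
{
    return 'A' == 0x41 && 'a' == 0x61 && '0' == 0x30 &&
           ' ' == 0x20 && '\n' == 0x0A;
}

/* Returns 1 if the letter pairs probed here are adjacent.
   The standard only guarantees that the digits '0'..'9' are contiguous;
   letters need not be (in EBCDIC, 'I' and 'J' are not adjacent). */
int letters_look_contiguous(void)
{
    return 'J' - 'I' == 1 && 'S' - 'R' == 1;
}

/* Writes a two-line report to `out`.
   Returns the number of characters written, or a negative value on error. */
int report_charset(FILE *out)
{
    return fprintf(out, "ASCII-compatible: %s\nletters contiguous: %s\n",
                   looks_like_ascii() ? "yes" : "no",
                   letters_look_contiguous() ? "yes" : "no");
}
```

Calling `report_charset(stdout)` from `main` would be expected to print "yes" twice on an ASCII or UTF-8 system, and "no" for both probes on an EBCDIC one.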
                                                                        
                                                        
            
            
              
                
忘了有多久 (2021-01-04 09:48)

The standard doesn't specify a default encoding because existing practice already had C implemented on machines with lots of different encodings, for example Honeywell mainframes and IBM mainframes.

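I would expect gcc to take its default from the current locale (the LC_CTYPE category, normally set through the LC_ALL, LC_CTYPE, or LANG environment variables), but I've never tested it. gcc also has the -finput-charset and -fexec-charset options for setting the input and execution character sets explicitly.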
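
To see which encoding the current locale names, a program can ask the C library directly. This is a small sketch that assumes a POSIX system (`nl_langinfo` and `CODESET` come from POSIX, not from the C standard); the helper name `locale_codeset` is invented here. It reports what the locale says, and makes no claim about what gcc itself does with that information.

```c
#include <locale.h>
#include <langinfo.h>
#include <stddef.h>

/* Adopts the LC_CTYPE locale from the environment and returns the name of
   its character encoding, e.g. "UTF-8". The returned string belongs to the
   C library and may be overwritten by later locale calls. */
const char *locale_codeset(void)
{
    if (setlocale(LC_CTYPE, "") == NULL) {
        /* The environment names a locale that is not installed;
           fall back to the portable "C" locale. */
        setlocale(LC_CTYPE, "C");
    }
    return nl_langinfo(CODESET);
}
```

Running a program that prints `locale_codeset()` under different LANG or LC_ALL settings shows how the reported encoding follows the environment.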

VC++ takes its default from a Control Panel setting. The default value of that setting varies according to which country Windows was purchased in. Most users never change it, but they can change it while installing Windows or change it later.

Trigraphs were invented so that a source program could be copied from an environment with one locale to an environment with a slightly different locale and still be compiled.  For example if a Windows user in China uses trigraphs then a Windows user in Greece would be able to compile the same source program.  However, if the locales differ too much, for example one using EBCDIC and one using EUC, trigraphs won't suffice.
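
The trigraph mechanism is a fixed table of nine three-character sequences, each starting with two question marks, that the compiler replaces with a single character early in translation. The sketch below imitates that replacement on a string. It is an illustration only, not how any particular compiler implements it, and `replace_trigraphs` is a made-up name. The source deliberately avoids writing two adjacent question marks, so that it compiles the same way whether or not the compiler has trigraphs enabled.

```c
#include <stddef.h>
#include <string.h>

/* Copies `in` to `out`, replacing each trigraph with the character it
   stands for. `out` must have room for strlen(in) + 1 bytes.
   Returns `out`. */
char *replace_trigraphs(const char *in, char *out)
{
    /* third[i] is the character that follows the two question marks;
       single[i] is the character the whole trigraph stands for. */
    static const char third[]  = "=/'()!<>-";
    static const char single[] = "#\\^[]|{}~";
    char *dst = out;

    while (*in != '\0') {
        if (in[0] == '?' && in[1] == '?' && in[2] != '\0') {
            const char *hit = strchr(third, in[2]);
            if (hit != NULL) {
                *dst++ = single[hit - third];
                in += 3;          /* consume the whole trigraph */
                continue;
            }
        }
        *dst++ = *in++;           /* ordinary character: copy as-is */
    }
    *dst = '\0';
    return out;
}
```

Given the text `??=define X a??(0??)`, the function produces `#define X a[0]`. Only characters such as `#`, `[`, and `{` are covered, which is why trigraphs help between closely related character sets but not between unrelated ones.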
                                                                        
                                                        
            
            
              
                