Escaping Unicode strings in Python


不要未来只要你来 2021-01-03 04:33

In Python these three commands print the same emoji:

print "\xF0\x9F\x8C\x80"
print u"\ud83c\udf00"
print u"\U0001F300"


      
      
        
4 answers

青春惊慌失措 (OP) 2021-01-03 05:00

The first one is a byte string containing the UTF-8 encoding of the character. Decoding it (Python 2 shown) gives the code point:

>>> "\xF0\x9F\x8C\x80".decode('utf8')
u'\U0001f300'
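
The session above is Python 2. A rough Python 3 equivalent might look like the sketch below, assuming only that bytes literals need a b prefix and that str is Unicode by default. The variable names raw and text are just illustrative.

```python
# Python 3 sketch: the same four bytes, decoded as UTF-8.
raw = b"\xF0\x9F\x8C\x80"            # bytes literal (b prefix required in Python 3)
text = raw.decode("utf-8")           # a str holding a single code point

print(len(raw), len(text))           # 4 bytes, 1 character
print(hex(ord(text)))                # 0x1f300
print(text.encode("utf-8") == raw)   # True: encoding round-trips
```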


The u"\ud83c\udf00" one is the UTF-16 version: a surrogate pair written as two four-digit \u escapes.

The u"\U0001F300" one is the actual code point, written as a single eight-digit \U escape.
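
To make the link between the code point and the surrogate pair concrete, here is a minimal Python 3 sketch of the standard UTF-16 split. The helper name to_surrogate_pair is hypothetical, made up for this illustration.

```python
def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into UTF-16 high and low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only U+10000..U+10FFFF need a surrogate pair")
    offset = code_point - 0x10000        # leaves a 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low


high, low = to_surrogate_pair(0x1F300)
print(hex(high), hex(low))               # 0xd83c 0xdf00

# Cross-check against Python's own UTF-16 encoder (big-endian, no BOM).
print("\U0001F300".encode("utf-16-be").hex())   # d83cdf00
```

The two values match the digits in the \ud83c\udf00 escape, which is why that form looks unrelated to 1F300 at first sight.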



But how do the numbers relate? That is the harder question. The relationship is defined by each encoding, and it is not obvious from looking at the digits. To give you an idea, here is an example of "manually" encoding the code point 0x1F300 into UTF-8:

The cyclone character 
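
As a hedged illustration of the manual UTF-8 encoding mentioned above, the following Python 3 sketch packs a code point's bits into the 1 to 4 byte UTF-8 layouts. The function name utf8_encode_manual is hypothetical, and the sketch deliberately skips the check that rejects surrogate code points.

```python
def utf8_encode_manual(code_point):
    """Encode one code point to UTF-8 by hand (sketch, no surrogate check)."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point out of range")
    if code_point < 0x80:                       # 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:                      # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:                    # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


encoded = utf8_encode_manual(0x1F300)
print(encoded.hex())                            # f09f8c80
print(encoded == "\U0001F300".encode("utf-8"))  # True
```

The result is the same four bytes as the \xF0\x9F\x8C\x80 byte string in the question.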
             
                                                        
            
            
              
                