Linux command (like cat) to read a specified quantity of characters

前端未结

关注

 9  2066

Is there a command like cat in linux which can return a specified quantity of characters from a file?

e.g., I have a text file like:

Hel


                      
              相关标签:


      
      
        
          9条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  不知归路        
                
              
                            
                2020-12-12 16:46
              
            
            
                                                                       
Here's a simple script that wraps up using the dd approach mentioned here:
extract_chars.sh
#!/usr/bin/env bash

function show_help()
{
  IT="
extracts characters X to Y from stdin or FILE
usage: X Y {FILE}

e.g. 

2 10 /tmp/it     => extract chars 2-10 from /tmp/it
EOF
  "
  echo "$IT"
  exit
}

if [ "$1" == "help" ]
then
  show_help
fi
if [ -z "$1" ]
then
  show_help
fi

FROM=$1
TO=$2
COUNT=`expr $TO - $FROM + 1`

if [ -z "$3" ]
then
  dd skip=$FROM count=$COUNT bs=1 2>/dev/null
else
  dd skip=$FROM count=$COUNT bs=1 if=$3 2>/dev/null 
fi

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  不要未来只要你来        
                
              
                            
                2020-12-12 16:49
              
            
            
                                                                       
you could also grep the line out and then cut it like for instance:

grep 'text' filename | cut -c 1-5
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  孤街浪徒        
                
              
                            
                2020-12-12 16:50
              
            
            
                                                                       
Even though this was answered/accepted years ago, the presently accepted answer is only correct for one-byte-per-character encodings like iso-8859-1, or for the single-byte subsets of variable-byte character sets (like Latin characters within UTF-8). Even using multiple-byte splices instead would still only work for fixed-multibyte encodings like UTF-16. Given that now UTF-8 is well on its way to being a universal standard, and when looking at this list of languages by number of native speakers and this list of top 30 languages by native/secondary usage, it is important to point out a simple variable-byte character-friendly (not byte-based) technique, using cut -c and tr/sed with character-classes.

Compare the following which doubly fails due to two common Latin-centric mistakes/presumptions regarding the bytes vs. characters issue (one is head vs. cut, the other is [a-z][A-Z] vs. [:upper:][:lower:]):

$ printf 'Πού μπορώ να μάθω σανσκριτικά;\n' | \
$     head -c 1 | \
$     sed -e 's/[A-Z]/[a-z]/g'
[[unreadable binary mess, or nothing if the terminal filtered it]]


to this (note: this worked fine on FreeBSD, but both cut & tr on GNU/Linux still mangled Greek in UTF-8 for me though):

$ printf 'Πού μπορώ να μάθω σανσκριτικά;\n' | \
$     cut -c 1 | \
$     tr '[:upper:]' '[:lower:]'
π



  Another more recent answer had already proposed "cut", but only because of the side issue that it can be used to specify arbitrary offsets, not because of the directly relevant character vs. bytes issue.


If your cut doesn't handle -c with variable-byte encodings correctly, for "the first X characters" (replace X with your number) you could try:


sed -E -e '1 s/^(.{X}).*$/\1/' -e q - which is limited to the first line though
head -n 1 | grep -E -o '^.{X}' - which is limited to the first line and chains two commands though
dd - which has already been suggested in other answers, but is really cumbersome
A complicated sed script with sliding window buffer to handle characters spread over multiple lines, but that is probably more cumbersome/fragile than just using something like dd


If your tr doesn't handle character-classes with variable-byte encodings correctly you could try:


sed -E -e 's/[[:upper:]]/\L&/g (GNU-specific)

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
   
          
     上一页
1
2
           
           
        
                                  
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复