Delete line from text file with line numbers from another file

北城余情 submitted on 2019-12-03 16:22:16

An awk one-liner should work for you; see the test below:

kent$  head lines.txt doc.txt 
==> lines.txt <==
1
3
5
7

==> doc.txt <==
a
b
c
d
e
f
g
h

kent$  awk 'NR==FNR{l[$0];next;} !(FNR in l)' lines.txt doc.txt
b
d
f
h

As Levon suggested, here is some explanation:

awk                     # the awk command
 'NR==FNR{l[$0];next;}  # first file (lines.txt): store each line (a line number to delete) as a key of array "l"

 !(FNR in l)'           # second file (doc.txt): print the line if its line number is not in "l"
 lines.txt              # 1st argument, file: lines.txt
 doc.txt                # 2nd argument, file: doc.txt
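
To make the two passes concrete, here is a minimal sketch that recreates the test above in a scratch directory and then replaces doc.txt with the filtered result. The temporary directory and the doc.new name are assumptions for illustration, not part of the original answer.

```shell
#!/bin/sh
# Scratch copies of the sample files; the directory and doc.new are hypothetical.
tmpdir=$(mktemp -d)
printf '%s\n' 1 3 5 7 > "$tmpdir/lines.txt"
printf '%s\n' a b c d e f g h > "$tmpdir/doc.txt"

# Pass 1 (NR==FNR): remember each line number as a key of array l.
# Pass 2: print a line only when its number FNR is not a key of l.
awk 'NR==FNR { l[$0]; next } !(FNR in l)' \
    "$tmpdir/lines.txt" "$tmpdir/doc.txt" > "$tmpdir/doc.new" &&
    mv "$tmpdir/doc.new" "$tmpdir/doc.txt"   # replace only if awk succeeded

cat "$tmpdir/doc.txt"
```

One caveat: if lines.txt is empty, NR==FNR also holds for every line of doc.txt, so nothing is printed at all.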

Well, I don't speak Perl, and in bash I only get there by painful trial after trial. However, Rexx would do this easily:

lines_to_delete = ""

do while lines( "lines.txt" )
   lines_to_delete = lines_to_delete linein( "lines.txt" )
end

n = 0
do while lines( "documents.txt" )
   line = linein( "documents.txt" )
   n = n + 1
   if ( wordpos( n, lines_to_delete ) == 0 ) then
      call lineout "temp_out.txt", line
end

This will leave your output in temp_out.txt which you may rename to documents.txt as desired.

Here's a way to do it with sed:

sed ':a;${s/\n//g;s/^/sed \o47/;s/$/d\o47 documents.txt/;b};s/$/d\;/;N;ba' lines.txt | sh

It uses sed to build a sed command and pipes it to the shell to be executed. The resulting sed command simply looks like `sed '3d;5d;11d' documents.txt`.

To build it, the outer sed command adds a d; after each number and loops to the next line, branching back to the beginning (N; ba). When the last line is reached ($), all the newlines are removed, sed ' is prepended and the final d and ' documents.txt are appended (\o47 is the octal escape for a single quote, a GNU sed extension). Then b branches out of the :a - ba loop to the end, since no label is specified.
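
Since the generated command is handed straight to sh, it can help to look at it first. The following is a rough sketch that captures the generator's output instead of piping it, assuming GNU sed; the scratch directory and the twelve-line sample file are made up for illustration.

```shell
#!/bin/sh
# Assumes GNU sed (\o47 and one-line label syntax). Sample files are hypothetical.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf '%s\n' 3 5 11 > lines.txt
printf '%s\n' a b c d e f g h i j k l > documents.txt

# Same generator as above, but captured in a variable instead of piped to sh.
cmd=$(sed ':a;${s/\n//g;s/^/sed \o47/;s/$/d\o47 documents.txt/;b};s/$/d\;/;N;ba' lines.txt)
printf '%s\n' "$cmd"    # inspect the generated command before running it

# Execute it only once it looks right; lines 3, 5 and 11 are dropped.
result=$(printf '%s\n' "$cmd" | sh)
printf '%s\n' "$result"
```

Keeping generation and execution apart like this is just a precaution: anything unexpected in lines.txt ends up inside a shell command.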

Here's how you can do it using join and cat -n (assuming that lines.txt is sorted):

join -t $'\v' -v 2 -o 2.2 lines.txt <(cat -n documents.txt | sed 's/^ *//;s/\t/\v/')

If lines.txt isn't sorted:

join -t $'\v' -v 2 -o 2.2 <(sort lines.txt) <(cat -n documents.txt | sed 's/^ *//;s/\t/\v/')

Edit:

Fixed a bug in the join commands in which the original versions only output the first word of each line in documents.txt.

This might work for you (GNU sed):

sed 's/.*/&d/' lines.txt | sed -i -f - documents.txt

or:

sed ':a;$!{N;ba};s/\n/d;/g;s/^/sed -i '\''/;s/$/d'\'' documents.txt/' lines.txt | sh

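I asked a similar question on Unix SE and got wonderful answers, among them the following awk script. Note that it keeps the listed lines rather than deleting them, and that it expects the line numbers to be sorted: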

#!/bin/bash
#
# filterline keeps a subset of lines of a file.
#
# cf. https://unix.stackexchange.com/q/209404/376
#
set -eu -o pipefail

if [ "$#" -ne 2 ]; then
    echo "Usage: filterline FILE1 FILE2"
    echo
    echo "FILE1: one integer per line indicating line number, one-based, sorted"
    echo "FILE2: input file to filter"
    exit 1
fi

LIST="$1" LC_ALL=C awk '
  function nextline() {
    if ((getline n < list) <=0) exit
  }
  BEGIN{
    list = ENVIRON["LIST"]
    nextline()
  }
  NR == n {
    print
    nextline()
  }' < "$2"
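
The filterline script keeps the listed lines, as its header comment says. For the deletion case asked about here, the same streaming idea can be inverted. This is only a sketch under the same assumptions (one-based line numbers, sorted ascending) plus one more that the original does not state: no duplicate numbers in the list. The sample files are hypothetical.

```shell
#!/bin/sh
# Hypothetical inverse of filterline: drop the listed lines, keep the rest.
# Assumes the list is sorted ascending, one-based, and free of duplicates.
tmpdir=$(mktemp -d)
printf '%s\n' 1 3 5 7 > "$tmpdir/lines.txt"
printf '%s\n' a b c d e f g h > "$tmpdir/doc.txt"

filtered=$(LIST="$tmpdir/lines.txt" LC_ALL=C awk '
  function nextline() {
    # When the list runs out, use 0 so that no later NR can match.
    if ((getline n < list) <= 0) n = 0
  }
  BEGIN {
    list = ENVIRON["LIST"]
    nextline()
  }
  NR == n { nextline(); next }   # listed line: skip it and fetch the next number
  { print }                      # every other line passes through
' < "$tmpdir/doc.txt")
printf '%s\n' "$filtered"
```

Unlike the array-based one-liner, this reads only one number of the list at a time, which is why the sorted order matters.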

And another C version, which is a bit more performant:
