Question
This answer tells me how to find the files with the same filename in two directories in bash:
diff -srq dir1/ dir2/ | grep identical
Now I want to consider only files that satisfy a condition. If I use ls E*, I get back the files starting with E. I want to do the same with the above command: give me the filenames which are the same in dir1/ and dir2/, but consider only those starting with E.
I tried the following:
diff -srq dir1/E* dir2/E* | grep identical
but it did not work, I got this output:
diff: extra operand '/home/pal/konkoly/c6/elesbe3/1/EPIC_212291374-c06-k2sc.dat.flag.spline'
diff: Try 'diff --help' for more information.
(/home/pal/konkoly/c6/elesbe3/1/EPIC_212291374-c06-k2sc.dat.flag.spline is a file in the so-called dir1, but EPIC_212291374-c06-k2sc.dat.flag.spline is not in the so-called dir2.)
How can I solve this?
I tried doing it in the following way, based on this answer:
DIR1=$(ls dir1)
DIR2=$(ls dir2)
for i in $DIR1; do
    for j in $DIR2; do
        if [[ $i == $j ]]; then
            echo "$i == $j"
        fi
    done
done
It works as written above, but if I write DIR1=$(ls path1/E*) and DIR2=$(ls path2/E*), it does not: I get no output.
Answer 1:
This is untested, but I'd try something like:
comm -12 <(cd dir1 && ls E*) <(cd dir2 && ls E*)
Basic idea:
1. Generate a list of filenames in dir1 that satisfy our condition. This can be done with ls E* because we're only dealing with a flat list of files. For subdirectories and recursion we'd use find instead (e.g. find . -name 'E*' -type f).
2. Put the filenames in a canonical order (e.g. by sorting them). We don't have to do anything here because E* expands in sorted order anyway. With find we might have to pipe the output into sort first.
3. Do the same thing to dir2.
4. Only output lines that are common to both lists, which can be done with comm -12.
comm expects to be passed two filenames on the command line, so we use the <( ... ) bash feature to spawn a subprocess and connect its output to a named pipe; the name of the pipe can then be given to comm.
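For the recursive case mentioned in the steps above, a rough sketch could look like the following. The function name common_names and its three arguments are made up for illustration and are not part of the original command. It assumes bash (for the <( ... ) syntax) and filenames without embedded newlines, and it compares paths relative to each directory, so a file only counts as common if it sits at the same relative location in both trees.

```shell
#!/usr/bin/env bash

# common_names DIR1 DIR2 PATTERN
# Hypothetical helper: print the relative paths of regular files whose
# names match PATTERN and which exist in both DIR1 and DIR2.
common_names() {
    local dir1=$1 dir2=$2 pattern=$3
    # find does not guarantee any output order, so each list is sorted
    # before comm -12 keeps only the lines present in both.
    comm -12 \
        <(cd "$dir1" && find . -type f -name "$pattern" | sort) \
        <(cd "$dir2" && find . -type f -name "$pattern" | sort)
}

# Example call (quote the pattern so the shell does not expand it):
# common_names dir1 dir2 'E*'
```

The pattern has to be quoted at the call site; otherwise the shell expands E* against the current directory before find ever sees it.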
Answer 2:
The accepted answer works fine. If someone needs a Python implementation, though, this also works:
import glob
import os
# Collect the bare filenames (directory part stripped) matching E* in each directory.
dir1 = {os.path.basename(path) for path in glob.glob("path/to/dir1/E*")}
dir2 = {os.path.basename(path) for path in glob.glob("path/to/dir2/E*")}
# Report every name that occurs in both directories.
for name in sorted(dir1 & dir2):
    print(name + " is in both directories")
Source: https://stackoverflow.com/questions/52350039/given-two-directory-trees-how-to-find-which-filenames-are-the-same-considering