Split a folder into multiple subfolders in terminal/bash script

痴心易碎 提交于 2020-06-24 07:26:29

问题


I have several folders, each with between 15,000 and 40,000 photos. I want each of these to be split into sub folders - each with 2,000 files in them.

What is a quick way to do this that will create each folder I need on the go and move all the files?

Currently I can only find how to move the first x items in a folder into a pre-existing directory. In order to use this on a folder with 20,000 items... I would need to create 10 folders manually, and run the command 10 times.

ls -1  |  sort -n | head -2000| xargs -i mv "{}" /folder/

I tried putting it in a for-loop, but am having trouble getting it to make folders properly with mkdir. Even after I get around that, I need the program to only create folders for every 20th file (start of a new group). It wants to make a new folder for each file.

So... how can I easily move a large number of files into folders of an arbitrary number of files in each one?

Any help would be very... well... helpful!


回答1:


This solution can handle names with whitespace and wildcards and can be easily extended to support less straightforward tree structures. It will look for files in all direct subdirectories of the working directory and sort them into new subdirectories of those. New directories will be named 0, 1, etc.:

#!/bin/bash

maxfilesperdir=20

# loop through all top level directories:
while IFS= read -r -d $'\0' topleveldir
do
        # enter top level subdirectory:
        cd "$topleveldir"

        declare -i filecount=0 # number of moved files per dir
        declare -i dircount=0  # number of subdirs created per top level dir

        # loop through all files in that directory and below
        while IFS= read -r -d $'\0' filename
        do
                # whenever file counter is 0, make a new dir:
                if [ "$filecount" -eq 0 ]
                then
                        mkdir "$dircount"
                fi

                # move the file into the current dir:
                mv "$filename" "${dircount}/"
                filecount+=1

                # whenever our file counter reaches its maximum, reset it, and
                # increase dir counter:
                if [ "$filecount" -ge "$maxfilesperdir" ]
                then
                        dircount+=1
                        filecount=0
                fi
        done < <(find -type f -print0)

        # go back to top level:
        cd ..
done < <(find -mindepth 1 -maxdepth 1 -type d -print0)

The find -print0/read combination with process substitution has been stolen from another question.

It should be noted that simple globbing can handle all kinds of strange directory and file names as well. It is however not easily extensible for multiple levels of directories.




回答2:


Try something like this:

for i in `seq 1 20`; do mkdir -p "folder$i"; find . -type f -maxdepth 1 | head -n 2000 | xargs -i mv "{}" "folder$i"; done

Full script version:

#!/bin/bash

dir_size=2000
dir_name="folder"
n=$((`find . -maxdepth 1 -type f | wc -l`/$dir_size+1))
for i in `seq 1 $n`;
do
    mkdir -p "$dir_name$i";
    find . -maxdepth 1 -type f | head -n $dir_size | xargs -i mv "{}" "$dir_name$i"
done

For dummies:

  1. create a new file: vim split_files.sh
  2. update the dir_size and dir_name values to match your desires
    • note that the dir_name will have a number appended
  3. navigate into the desired folder: cd my_folder
  4. run the script: sh ../split_files.sh



回答3:


The code below assumes that the filenames do not contain linefeeds, spaces, tabs, single quotes, double quotes, or backslashes, and that filenames do not start with a dash. It also assumes that IFS has not been changed, because it uses while read instead of while IFS= read, and because variables are not quoted. Add setopt shwordsplit in Zsh.

i=1;while read l;do mkdir $i;mv $l $((i++));done< <(ls|xargs -n2000)

The code below assumes that filenames do not contain linefeeds and that they do not start with a dash. -n2000 takes 2000 arguments at a time and {#} is the sequence number of the job. Replace {#} with '{=$_=sprintf("%04d",$job->seq())=}' to pad numbers to four digits.

ls|parallel -n2000 mkdir {#}\;mv {} {#}

The command below assumes that filenames do not contain linefeeds. It uses the implementation of rename by Aristotle Pagaltzis which is the rename formula in Homebrew, where -p is needed to create directories, where --stdin is needed to get paths from STDIN, and where $N is the number of the file. In other implementations you can use $. or ++$::i instead of $N.

ls|rename --stdin -p 's,^,1+int(($N-1)/2000)."/",e'



回答4:


I would go with something like this:

#!/bin/bash
# outnum generates the name of the output directory
outnum=1
# n is the number of files we have moved
n=0

# Go through all JPG files in the current directory
for f in *.jpg; do
   # Create new output directory if first of new batch of 2000
   if [ $n -eq 0 ]; then
      outdir=folder$outnum
      mkdir $outdir
      ((outnum++))
   fi
   # Move the file to the new subdirectory
   mv "$f" "$outdir"

   # Count how many we have moved to there
   ((n++))

   # Start a new output directory if we have sent 2000
   [ $n -eq 2000 ] && n=0
done



回答5:


This solution worked for me on MacOS:

i=0; for f in *; do d=dir_$(printf %03d $((i/100+1))); mkdir -p $d; mv "$f" $d; let i++; done

It creates subfolders of 100 elements each.




回答6:


You'll certainly have to write a script for that. Hints of things to include in your script:

First count the number of files within your source directory

NBFiles=$(find . -type f -name *.jpg | wc -l)

Divide this count by 2000 and add 1, to determine number of directories to create

NBDIR=$(( $NBFILES / 2000 + 1 ))

Finally loop through your files and move them accross the subdirs. You'll have to use two imbricated loops : one to pick and create the destination directory, the other to move 2000 files in this subdir, then create next subdir and move the next 2000 files to the new one, etc...




回答7:


The answer above is very useful, but there is a very import point in Mac(10.13.6) terminal. Because xargs "-i" argument is not available, I have change the command from above to below.

ls -1 | sort -n | head -2000| xargs -I '{}' mv {} /folder/

Then, I use the below shell script(reference tmp's answer)

#!/bin/bash

dir_size=500
dir_name="folder"
n=$((`find . -maxdepth 1 -type f | wc -l`/$dir_size+1))
for i in `seq 1 $n`;
do
    mkdir -p "$dir_name$i";
    find . -maxdepth 1 -type f | head -n $dir_size | xargs -I '{}' mv {} "$dir_name$i"
done



回答8:


This is a tweak of Mark Setchell's

Usage: bash splitfiles.bash $PWD/directoryoffiles splitsize

It doesn't require the script to be located in the same dir as the files for splitting, it will operate on all files, not just the .jpg and allows you to specify the split size as an argument.

#!/bin/bash
# outnum generates the name of the output directory
outnum=1
# n is the number of files we have moved
n=0

if [ "$#" -ne 2 ]; then
    echo Wrong number of args
    echo Usage: bash splitfiles.bash $PWD/directoryoffiles splitsize
    exit 1
fi

# Go through all files in the specified directory
for f in $1/*; do
   # Create new output directory if first of new batch
   if [ $n -eq 0 ]; then
      outdir=$1/$outnum
      mkdir $outdir
      ((outnum++))
   fi
   # Move the file to the new subdirectory
   mv "$f" "$outdir"

   # Count how many we have moved to there
   ((n++))

   # Start a new output directory if current new dir is full
   [ $n -eq $2 ] && n=0
done


来源:https://stackoverflow.com/questions/29116212/split-a-folder-into-multiple-subfolders-in-terminal-bash-script

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!