Is there a workaround for Java's poor performance on walking huge directories?

予麋鹿 2020-12-02 23:30

I am trying to process, one at a time, files that are stored over a network. Reading the files is fast thanks to buffering, so that is not the issue. The problem I have is just listing the directory contents.

10 Answers
  •  时光说笑
    2020-12-03 00:34

    An alternative is to have the files served over a different protocol. As I understand it, you're using SMB, and Java is just trying to list the files as if they were in a regular local directory.

    The problem here might not be Java alone (how does it behave when you open that directory with Windows Explorer, e.g. x:\shared?). In my experience that also takes a considerable amount of time.

    You can switch to a protocol such as HTTP, just to fetch the file names. That way you retrieve the list of files over HTTP (10k lines shouldn't be too much) and let the server deal with the file listing. This will be very fast, since it runs with local resources (those on the server).

    Then, once you have the list, you can process the files one by one exactly the way you're doing right now.

    The key point is to have a helper mechanism on the other side of the connection.

    Is this feasible?

    Today:

    // listFiles() walks the whole remote directory and fetches metadata for every entry
    File[] content = new File("X:\\remote\\dir").listFiles();

    if (content != null) {              // listFiles() returns null on an I/O error
        for (File f : content) {
            process(f);
        }
    }
    

    Proposed:

    // Only the names travel over HTTP; the remote server did the directory walk.
    String[] content = fetchViaHttpTheListNameOf("x:\\remote\\dir");

    for (String fileName : content) {
        process(new File(fileName));
    }
    
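    A possible shape for that helper, assuming the listing server sketched below answers a plain GET with one file name per line (the host name, port, and /list endpoint are made up for the example):

    import java.io.IOException;
    import java.net.URI;
    import java.net.URLEncoder;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.nio.charset.StandardCharsets;

    public class RemoteListing {

        // Fetches only the file names; the server does the expensive directory walk locally.
        public static String[] fetchViaHttpTheListNameOf(String remoteDir)
                throws IOException, InterruptedException {
            String dir = URLEncoder.encode(remoteDir, StandardCharsets.UTF_8);
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://fileserver:8000/list?dir=" + dir)) // hypothetical host and endpoint
                    .GET()
                    .build();
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            // Assumed response format: one file name per line.
            return response.body().split("\\R");
        }
    }

    Any HTTP client would do here; java.net.http.HttpClient (Java 11+) just avoids an extra dependency.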

    The HTTP server could be something very small and simple, for example along the lines of the sketch below.
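
    A minimal sketch using the JDK's built-in com.sun.net.httpserver, under the same assumptions (the /list endpoint and the newline-separated response format are inventions for this example; add real error handling and access control before using it):

    import com.sun.net.httpserver.HttpServer;
    import java.io.File;
    import java.io.OutputStream;
    import java.net.InetSocketAddress;
    import java.net.URLDecoder;
    import java.nio.charset.StandardCharsets;

    public class ListingServer {

        public static void main(String[] args) throws Exception {
            HttpServer server = HttpServer.create(new InetSocketAddress(8000), 0);

            server.createContext("/list", exchange -> {
                // e.g. GET /list?dir=D:\shared -- the walk happens on the machine that owns the disk
                String query = exchange.getRequestURI().getRawQuery();
                String dir = URLDecoder.decode(query.substring("dir=".length()), StandardCharsets.UTF_8);

                String[] names = new File(dir).list();   // fast: local disk access, names only
                String body = names == null ? "" : String.join("\n", names);
                byte[] bytes = body.getBytes(StandardCharsets.UTF_8);

                exchange.sendResponseHeaders(200, bytes.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(bytes);
                }
            });

            server.start();
        }
    }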

    If this is the way you have it right now, you're fetching the information for all 10k files to your client machine (I don't know how much of that info) when you only need the file names for later processing.

    If the processing itself is very fast right now, it may be slowed down a bit, because the prefetched file information is no longer available.

    Give it a try.
