Processing SFTP files using C# Parallel.ForEach loop not processing downloads

Submitted by 大城市里の小女人 on 2020-12-30 03:28:04

Question


I am using the Renci SSH.NET package, version 2016, to download files from an external server. I can usually download about one file every 6 seconds, which is slow when there are thousands of files. I recently changed the foreach loops to Parallel.ForEach, which brought the time per file down to 1.5 seconds. But when I checked the files, they were all 0 KB, so nothing was actually downloaded. Is there anything wrong with the parallel loop? I am new to C# and trying to improve download times.

Parallel.ForEach(summary.RemoteFiles, (f, loopstate) =>
{
    //Are we still connected? If not, reestablish a connection for up to a max of "MaxReconnectAttempts" 
    if (!sftp.IsConnected)
    {
        int maxAttempts = Convert.ToInt32(ConfigurationManager.AppSettings["MaxReconnectAttempts"]);

        StatusUpdate(this, new Types.StatusUpdateEventArgs() { message = "SFTP Service has been connected from remote system, attempting to reconnect (" + sftpConnInfo.Host + ":" + sftpConnInfo.Port.ToString() + remotePath + " - Attempt 1 of " + maxAttempts.ToString() + ")", Location = locationName });

        for (int attempts = 1; attempts <= maxAttempts; attempts++)
        {
            sftp.Connect();

            if (sftp.IsConnected)
            {
                StatusUpdate(this, new Types.StatusUpdateEventArgs() { message = "SFTP Service - Connection reestablished (" + remotePath + ")", Location = locationName });
                break;
            }
            else
            {
                if ((attempts + 1) <= maxAttempts)
                {
                    StatusUpdate(this, new Types.StatusUpdateEventArgs() { message = "SFTP Service still disconnected from remote system, preparing another reconnect attempt (" + sftpConnInfo.Host + ":" + sftpConnInfo.Port.ToString() + remotePath + " - Attempt " + (attempts + 1).ToString() + " of " + maxAttempts.ToString() + ")", Location = locationName });
                    System.Threading.Thread.Sleep(2000);
                }
                else
                {
                    //Max reconnect attempts reached - end the session and ensure the appropriate "failure" workflow is triggered
                    connectionLost = true;
                }
            }
        }
    }

    if (connectionLost)
        loopstate.Break();
       // break;


    totalFileCount++;
    try
    {
      if (!System.IO.File.Exists(localSaveLocation + f.FileName))

        {
            System.Diagnostics.Debug.WriteLine("\tDownloading file " + totalFileCount.ToString() + "(" + f.FileName + ")");

            System.IO.Stream localFile = System.IO.File.OpenWrite(localSaveLocation + f.FileName);
            //Log remote file name, local file name, date/time start
            start = DateTime.Now;
            sftp.DownloadFile(f.FullName, localFile);
            end = DateTime.Now;

            //Log remote file name, local file name, date/time complete (increment the "successful" downloads by 1)
            timeElapsed = end.Subtract(start);
            runningSeconds += timeElapsed.TotalSeconds;
            runningAvg = runningSeconds / Convert.ToDouble(totalFileCount);
            estimatedSecondsRemaining = (summary.RemoteFiles.Count - totalFileCount) * runningAvg;

            elapsedTimeString = timeElapsed.TotalSeconds.ToString("#.####") + " seconds";
            System.Diagnostics.Debug.WriteLine("\tCompleted downloading file in " + elapsedTimeString + " " + "(" + f.FileName + ")");
            downloadedFileCount++;
            ProcessFileComplete(this, new Types.ProcessFileCompleteEventArgs() { downloadSuccessful = true, elapsedTime = timeElapsed.TotalSeconds, fileName = f.FileName, fullLocalPath = localSaveLocation + f.FileName, Location = locationName, FilesDownloaded = totalFileCount, FilesRemaining = (summary.RemoteFiles.Count - totalFileCount), AvgSecondsPerDownload = runningAvg, TotalSecondsElapsed = runningSeconds, EstimatedTimeRemaining = TimeSpan.FromSeconds(estimatedSecondsRemaining) });

            f.FileDownloaded = true;

            if (deleteAfterDownload)
                sftp.DeleteFile(f.FullName);
        }
        else
        {
            System.Diagnostics.Debug.WriteLine("\tFile " + totalFileCount.ToString() + "(" + f.FileName + ") already exists locally");
            downloadedFileCount++;

            ProcessFileComplete(this, new Types.ProcessFileCompleteEventArgs() { downloadSuccessful = true, elapsedTime = 0, fileName = f.FileName + " (File already exists locally)", fullLocalPath = localSaveLocation + f.FileName, Location = locationName, FilesDownloaded = totalFileCount, FilesRemaining = (summary.RemoteFiles.Count - totalFileCount), AvgSecondsPerDownload = runningAvg, TotalSecondsElapsed = runningSeconds, EstimatedTimeRemaining = TimeSpan.FromSeconds(estimatedSecondsRemaining) });
            f.FileDownloaded = true;

            if (deleteAfterDownload)
                sftp.DeleteFile(f.FullName);
        }
    }
    catch (System.Exception ex)
    {
       // We log stuff here
    }

}); 

Answer 1:


I cannot tell why you get empty files, though I suspect it is because you never close the localFile stream.
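A minimal sketch of closing the local stream deterministically, assuming that is the cause. The helper name `LocalFileWriter.Save` and its `write` parameter are hypothetical; in the question's loop, `write` would stand in for `stream => sftp.DownloadFile(f.FullName, stream)`.

```csharp
using System;
using System.IO;

public static class LocalFileWriter
{
    // Creates (or truncates) the local file, lets `write` fill it, and
    // disposes the stream even if `write` throws. Disposing flushes any
    // buffered bytes to disk and releases the file handle.
    public static void Save(string localPath, Action<Stream> write)
    {
        using (Stream localFile = File.Create(localPath))
        {
            write(localFile);
        }
    }
}
```

Unlike `File.OpenWrite`, `File.Create` truncates an existing file, so a leftover partial file from an earlier run cannot keep stale trailing bytes.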

Even if your code worked, you would get hardly any performance benefit from using the same connection for all downloads, as SFTP transfers tend to be limited by network latency or CPU. You have to use multiple connections to overcome that.

See my answer on Server Fault about factors that affect SFTP transfer speed.

Implement a connection pool and pick a free connection for each file.


Simple example:

var clients = new ConcurrentBag<SftpClient>();

Parallel.ForEach(files, (f, loopstate) => {
    if (!clients.TryTake(out var client))
    {
        client = new SftpClient(hostName, userName, password);
        client.Connect();
    }

    string localPath = Path.Combine(destPath, f.Name);
    Console.WriteLine(
        "Thread {0}, Connection {1}, File {2} => {3}",
        Thread.CurrentThread.ManagedThreadId, client.GetHashCode(),
        f.FullName, localPath);

    using (var stream = File.Create(localPath))
    {
        client.DownloadFile(f.FullName, stream);
    }

    clients.Add(client);
});

Console.WriteLine("Closing {0} connections", clients.Count);

foreach (var client in clients)
{
    client.Dispose();
}

You should limit the number of connections somehow, though.
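One way to do that, as a sketch rather than the only option, is to cap the loop itself with `ParallelOptions.MaxDegreeOfParallelism`. The pool in the example above only creates a client when every existing one is held by a running loop body, so the number of connections cannot exceed the number of bodies running at once. The class `BoundedParallel` and the `work` delegate below are hypothetical; `work` stands in for the download body, and the peak counter is there only to make the cap observable.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class BoundedParallel
{
    // Runs `work` for every item with at most `maxParallel` invocations
    // in flight, and returns the highest concurrency actually observed.
    public static int Run(IEnumerable<string> items, int maxParallel, Action<string> work)
    {
        var gate = new object();
        int running = 0, peak = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        Parallel.ForEach(items, options, item =>
        {
            // Count this body as running and record a new peak if needed.
            lock (gate)
            {
                running++;
                if (running > peak) peak = running;
            }

            try
            {
                work(item);
            }
            finally
            {
                lock (gate) { running--; }
            }
        });

        return peak;
    }
}
```

In the pooled example, the equivalent change is passing `new ParallelOptions { MaxDegreeOfParallelism = n }` as the second argument to `Parallel.ForEach`, with `n` chosen to match how many sessions the server tolerates.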


Another approach is to start a fixed number of threads, each with its own connection, and have them pick files from a queue.

For an example implementation, see my article for the WinSCP .NET assembly:
Automating transfers in parallel connections over SFTP/FTP protocol
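A minimal sketch of the fixed-workers-and-queue approach in plain .NET, not taken from the linked article. The names `QueueWorkers`, `connect` and `download` are hypothetical. The connection type is left generic so the pattern is visible without a server: each worker opens exactly one connection, drains the shared queue, then disposes the connection.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class QueueWorkers
{
    // Starts `workerCount` workers. Each one calls `connect` once, then
    // keeps taking file names from the shared queue until it is empty.
    public static void Run<TConn>(
        IEnumerable<string> files, int workerCount,
        Func<TConn> connect, Action<TConn, string> download)
        where TConn : IDisposable
    {
        var queue = new ConcurrentQueue<string>(files);

        Task[] workers = Enumerable.Range(0, workerCount)
            .Select(_ => Task.Factory.StartNew(() =>
            {
                using (TConn connection = connect())
                {
                    while (queue.TryDequeue(out string file))
                    {
                        download(connection, file);
                    }
                }
            }, TaskCreationOptions.LongRunning))
            .ToArray();

        // Blocks until all workers finish; a failure in any worker
        // surfaces here as an AggregateException.
        Task.WaitAll(workers);
    }
}
```

With SSH.NET, `connect` could be `() => { var c = new SftpClient(hostName, userName, password); c.Connect(); return c; }`, and `download` could open the local file with `using (var stream = File.Create(localPath))` and call `client.DownloadFile(remotePath, stream)` inside it. The connection count is then exactly `workerCount`, which addresses the limit mentioned above.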



Source: https://stackoverflow.com/questions/48833005/processing-sftp-files-using-c-sharp-parallel-foreach-loop-not-processing-downloa
