“IOError: size mismatch in get!” when retrieving files via SFTP

Posted by 匆匆过客 on 2020-11-29 03:57:29

Question


I have a script which I use to retrieve specific files via SFTP on a regular basis. On occasion, the script will error out with the following output:

Traceback (most recent call last):
  File "ETL.py", line 304, in <module>
    get_all_files(startdate, enddate, "vma" + foldernumber + "/logs/", txtype[1] + single_date2 + ".log", txtype[2] + foldernumber + "\\", sftp)
  File "ETL.py", line 283, in get_all_files
    sftp.get(sftp_dir + filename, local_dir + filename)
  File "C:\Python27\lib\site-packages\pysftp\__init__.py", line 249, in get
    self._sftp.get(remotepath, localpath, callback=callback)
  File "C:\Python27\lib\site-packages\paramiko\sftp_client.py", line 806, in get
    "size mismatch in get!  {} != {}".format(s.st_size, size)
IOError: size mismatch in get!  950272 != 1018742

I have looked through the Paramiko documentation and do not see an explanation of what would trigger this error. The code often succeeds on a subsequent try, or runs successfully for the first few files in the date range and then fails partway through the downloads. Other answers on SO suggest it might be related to the space available on the drive, but clearing out the destination folder has not helped. I am downloading to a network drive/cloud storage, if that makes any difference.

Here is the function and code I am using to retrieve the files (via pysftp, which wraps Paramiko):

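import pysftp

# The last parameter is named sftp so the function uses the connection it is
# given instead of relying on a global variable of the same name.
def get_all_files(start_date, end_date, sftp_dir, filename, local_dir, sftp):
    sftp.get(sftp_dir + filename, local_dir + filename)

# cnopts, startdate, enddate, foldernumber, txtype and single_date2 are defined elsewhere in the script.
with pysftp.Connection('******.com', username='*****', password='******', cnopts=cnopts) as sftp:
    get_all_files(startdate, enddate, "vma" + foldernumber + "/logs/", txtype[1] + single_date2 + ".log", txtype[2] + foldernumber + "\\", sftp)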

I would like all downloadable files to be retrieved without producing this error.


Answer 1:


The error message IOError: size mismatch in get! 950272 != 1018742 is raised by the get method of Paramiko's SFTP client when the size of the local copy, as reported by os.stat, does not match the byte count returned by getfo (which equals the remote file's size when the transfer completes):

with open(localpath, "wb") as fl:
    size = self.getfo(remotepath, fl, callback)
s = os.stat(localpath)
if s.st_size != size:
    raise IOError(
        "size mismatch in get!  {} != {}".format(s.st_size, size)
    )

Why does this happen when the connection and the transfer themselves are fine?

While reading the Paramiko code and debugging this issue, I noticed odd behaviour in my local file system: after each file was copied from the remote host, the local file system took some time before it reported the correct file size.

This leads me to assume that Paramiko's get method transfers the file correctly but does not wait for the local file system to catch up. It calls s = os.stat(localpath) immediately after getfo has finished, so the status it reads (including the size) can be stale.

The local size reported by os.stat can then differ from the byte count returned by getfo, which triggers the IOError "size mismatch in get! {} != {}".format(s.st_size, size).

It would also explain why the error cannot be reproduced consistently: how quickly the local operating system updates the file's metadata varies from run to run.
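One way to probe this assumption is to poll os.stat for a short while after a failed download and see whether the reported size eventually reaches the expected value. The following is a minimal sketch only; the helper name wait_for_size and its timeout and interval parameters are hypothetical and not part of Paramiko or pysftp.

```python
import os
import time


def wait_for_size(path, expected_size, timeout=5.0, interval=0.1):
    """Poll os.stat(path) until it reports expected_size or the timeout expires.

    Hypothetical helper for illustration; not part of Paramiko or pysftp.
    Returns True if the expected size was observed, False otherwise.
    """
    deadline = time.time() + timeout
    while True:
        if os.stat(path).st_size == expected_size:
            return True
        if time.time() >= deadline:
            return False
        time.sleep(interval)
```

If the size matches after a short wait, that would support the explanation that the file system lags behind; if it never matches, the local file is more likely truncated and should be downloaded again.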

How did I solve this issue?

I modified the get method in Paramiko's source, which can be found on line 785 of sftp_client.py: I added localsize = fl.tell() inside the with block and changed the size check to use it:

with open(localpath, "wb") as fl:
    size = self.getfo(remotepath, fl, callback)
    localsize = fl.tell()
if localsize != size:
    raise IOError(
        "size mismatch  {} != {}".format(localsize, size)
    )

This should replace the unreliable os.stat(localpath) size check with one that reads the size from the file object while it is still open.
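Editing an installed library is fragile, since the change is lost when the package is upgraded or reinstalled. The same check can be sketched in your own code by calling getfo directly, which both paramiko.SFTPClient and pysftp.Connection provide and which returns the number of bytes written. The function name download_checked is hypothetical, and this is an illustration under those assumptions rather than a drop-in replacement.

```python
def download_checked(conn, remotepath, localpath):
    """Download remotepath to localpath and check the size via the open file.

    Hypothetical helper. conn is assumed to expose
    getfo(remotepath, fileobj) returning the number of bytes written,
    as paramiko.SFTPClient and pysftp.Connection do.
    """
    with open(localpath, "wb") as fl:
        size = conn.getfo(remotepath, fl)
        # Read the position from the file object while it is still open,
        # instead of asking the file system with os.stat() afterwards.
        localsize = fl.tell()
    if localsize != size:
        raise IOError("size mismatch  {} != {}".format(localsize, size))
    return size
```

In the script from the question, this could stand in for the sftp.get call, for example download_checked(sftp, sftp_dir + filename, local_dir + filename). As far as I can tell, fl.tell() only counts the bytes written through the file object, so this check no longer verifies what the file system eventually stores; on a network drive it may still be worth confirming the final size separately.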



Source: https://stackoverflow.com/questions/53945594/ioerror-size-mismatch-in-get-when-retrieving-files-via-sftp
