Question
My goal is to extract only the file inside a .tar.gz archive, without the sub-directories leading up to it. My code is based on this question, except that instead of unpacking a .zip I am unpacking a .tar.gz file.
I am asking this question because the error I'm getting is very vague and doesn't identify the problem in my code:
import os
import shutil
import tarfile

with tarfile.open('RTLog_20150425T152948.gz', 'r:gz') as tar:
    for member in tar.getmembers():
        filename = os.path.basename(member.name)
        if not filename:
            continue
        # copy file (taken from zipfile's extract)
        source = member
        target = open(os.path.join(os.getcwd(), filename), "wb")
        with source, target:
            shutil.copyfileobj(source, target)
As you can see I copied the code from the linked question and tried to change it to deal with .tar.gz members instead of .zip members. Upon running the code I get the following error:
Traceback (most recent call last):
  File "C:\Users\dzhao\Desktop\123456\444444\blah.py", line 27, in <module>
    with source, target:
AttributeError: __exit__
From the reading I've done, shutil.copyfileobj takes as input two "file-like" objects. member is a TarInfo object. I'm not sure whether a TarInfo object is a file-like object, so I tried changing this line from:
source = member
to:
source = open(os.path.join(os.getcwd(), member.name), 'rb')
But this understandably raised an error where the file wasn't found.
What am I not understanding?
Answer 1:
This code has worked for me:
import os
import shutil
import tarfile

# fname is the path to the archive
with tarfile.open(fname, "r|*") as tar:
    counter = 0
    for member in tar:
        if member.isfile():
            filename = os.path.basename(member.name)
            if filename != "myfile":  # do your check
                continue
            # extractfile() returns a file object limited to this member's data
            with tar.extractfile(member) as source, open("output.file", "wb") as output:
                shutil.copyfileobj(source, output)
            break  # got our file
        counter += 1
        if counter % 1000 == 0:
            tar.members = []  # free ram... yes we have to do this manually
But your problem might not be the extraction itself: your file may not be a .tar.gz at all, just a plain .gz file.
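One way to check this, as a rough sketch: tarfile.is_tarfile reports whether tarfile can open a path as an archive (compressed or not), and every gzip stream starts with the bytes 0x1f 0x8b. The helper name describe_archive is hypothetical and used only for illustration.

```python
import gzip
import tarfile


def describe_archive(path):
    """Return 'tar' if tarfile can open path, 'gzip' if it is only gzip data, else 'other'."""
    if tarfile.is_tarfile(path):
        return "tar"
    with open(path, "rb") as fh:
        magic = fh.read(2)
    # Every gzip stream starts with the two bytes 0x1f 0x8b.
    return "gzip" if magic == b"\x1f\x8b" else "other"
```

If this returns "gzip" for your file, gzip.open plus shutil.copyfileobj is probably all you need, since there is no tar member to look up.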
Edit: Also, you're getting the error on the with line because Python is trying to use the member object as a context manager, but a TarInfo object has no __enter__ or __exit__ method.
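A minimal sketch of how the question's loop could be adapted, assuming the file really is a .tar.gz: TarFile.extractfile(member) returns a readable file object for a regular file, which is what shutil.copyfileobj expects, whereas the TarInfo only describes the member. The function name extract_flat and its parameters are hypothetical.

```python
import os
import shutil
import tarfile


def extract_flat(archive_path, dest_dir):
    """Copy every regular file in the archive into dest_dir, dropping its sub-directories."""
    written = []
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # skip directories, links and other non-regular members
            filename = os.path.basename(member.name)
            if not filename:
                continue
            target_path = os.path.join(dest_dir, filename)
            # extractfile() gives a file-like object; the TarInfo itself is only metadata
            with tar.extractfile(member) as source, open(target_path, "wb") as target:
                shutil.copyfileobj(source, target)
            written.append(target_path)
    return written
```

Calling extract_flat('RTLog_20150425T152948.gz', os.getcwd()) would mirror the original script. Note that two members with the same base name in different directories would overwrite each other in this sketch.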
Source: https://stackoverflow.com/questions/37752400/how-do-i-extract-only-the-file-of-a-tar-gz-member