问题
I'm trying to inspect my appengine backup files to work out when a data corruption occured. I used gsutil to locate and download the file:
gsutil ls -l gs://my_backup/ > my_backup.txt
gsutil cp gs://my_backup/LongAlphaString.Mymodel.backup_info file://1.backup_info
I then created a small python program, attempting to read the file and parse it using the appengine libraries.
#!/usr/bin/python
APPENGINE_PATH='/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/'
ADDITIONAL_LIBS = [
'lib/yaml/lib'
]
import sys
sys.path.append(APPENGINE_PATH)
for l in ADDITIONAL_LIBS:
sys.path.append(APPENGINE_PATH+l)
import logging
from google.appengine.api.files import records
import cStringIO
def parse_backup_info_file(content):
"""Returns entities iterator from a backup_info file content."""
reader = records.RecordsReader(cStringIO.StringIO(content))
version = reader.read()
if version != '1':
raise IOError('Unsupported version')
return (datastore.Entity.FromPb(record) for record in reader)
INPUT_FILE_NAME='1.backup_info'
f=open(INPUT_FILE_NAME, 'rb')
f.seek(0)
content=f.read()
records = parse_backup_info_file(content)
for r in records:
logging.info(r)
f.close()
The code for parse_backup_info_file
was copied from
backup_handler.py
When I run the program, I get the following output:
./view_record.py
Traceback (most recent call last):
File "./view_record.py", line 30, in <module>
records = parse_backup_info_file(content)
File "./view_record.py", line 19, in parse_backup_info_file
version = reader.read()
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/api/files/records.py", line 335, in read
(chunk, record_type) = self.__try_read_record()
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/api/files/records.py", line 307, in __try_read_record
(length, len(data)))
EOFError: Not enough data read. Expected: 24898 but got 2112
I've tried with a half a dozen different backup_info files, and they all show the same error (with different numbers.)
I have noticed that they all have the same expected length: I was reviewing different versions of the same model when I made that observation, it's not true when I view the backup files of other Modules.
EOFError: Not enough data read. Expected: 24932 but got 911
EOFError: Not enough data read. Expected: 25409 but got 2220
Is there anything obviously wrong with my approach?
I guess the other option is that the appengine backup utility is not creating valid backup files. Anything else you can suggest would be very welcome. Thanks in Advance
回答1:
There are multiple metadata files created when an AppEngine Datastore backup is run:
LongAlphaString.backup_info is created once. This contains metadata about all of the entity types and backup files that were created in datastore backup.
LongAlphaString.[EntityType].backup_info is created once per entity type. This contains metadata about the the specific backup files created for [EntityType] along with schema information for the [EntityType].
Your code works for interrogating the file contents of LongAlphaString.backup_info, however it seems that you are trying to interrogate the file contents of LongAlphaString.[EntityType].backup_info. Here's a script that will print the contents in a human-readable format for each file type:
import cStringIO
import os
import sys
sys.path.append('/usr/local/google_appengine')
from google.appengine.api import datastore
from google.appengine.api.files import records
from google.appengine.ext.datastore_admin import backup_pb2
ALL_BACKUP_INFO = 'long_string.backup_info'
ENTITY_KINDS = ['long_string.entity_kind.backup_info']
def parse_backup_info_file(content):
"""Returns entities iterator from a backup_info file content."""
reader = records.RecordsReader(cStringIO.StringIO(content))
version = reader.read()
if version != '1':
raise IOError('Unsupported version')
return (datastore.Entity.FromPb(record) for record in reader)
print "*****" + ALL_BACKUP_INFO + "*****"
with open(ALL_BACKUP_INFO, 'r') as myfile:
parsed = parse_backup_info_file(myfile.read())
for record in parsed:
print record
for entity_kind in ENTITY_KINDS:
print os.linesep + "*****" + entity_kind + "*****"
with open(entity_kind, 'r') as myfile:
backup = backup_pb2.Backup()
backup.ParseFromString(myfile.read())
print backup
来源:https://stackoverflow.com/questions/21035194/reading-appengine-backup-info-file-gives-eoferror