Question
I have a requirement where a zip file arrives in an S3 bucket. I need to write a Lambda function in Python that reads the zip file, performs some validation, and unzips it to another S3 bucket.
The zip file contains the following:
a.csv b.csv c.csv trigger_file.txt
trigger_file.txt contains the names of the files in the zip and their record counts (example: a.csv:120 , b.csv:10 , c.csv:50 )
So in the Lambda I need to read the trigger file and check whether the number of files in the zip equals the number of files listed in the trigger file. If the check passes, unzip to the S3 bucket.
Below is the code I have prepared:
def write_to_s3(config_dict):
    inp_bucket = config_dict["inp_bucket"]
    inp_key = config_dict["inp_key"]
    out_bucket = config_dict["out_bucket"]
    des_key = config_dict["des_key"]
    processed_key = config_dict["processed_key"]
    obj = S3_CLIENT.get_object(Bucket=inp_bucket, Key=inp_key)
    putObjects = []
    with io.BytesIO(obj["Body"].read()) as tf:
        # rewind the file
        tf.seek(0)
        # Read the file as a zipfile perform transformations and process the members
        with zipfile.ZipFile(tf, mode='r') as zipf:
            for file in zipf.infolist():
                fileName = file.filename
                print("file name before while loop :",fileName)
                try:
                    found = False
                    while not found :
                        if fileName == "Trigger_file.txt" :
                            with zipf.open(fileName , 'r') as thefile:
                                my_list = [i.decode('utf8').split(' ') for i in thefile]
                                my_list = str(my_list)[1:-1]
                                print("my_list :",my_list)
                                print("fileName :",fileName)
                                found = True
                                break
                                thefile.close()
                        else:
                            print("Trigger file not found ,try again")
                except Exception as exp_handler:
                    raise exp_handler
                if 'csv' in fileName :
                    try:
                        if fileName in my_list:
                            print("Validation Success , all files in Trigger file are present procced for extraction")
                        else:
                            print("Validation Failed")
                    except Exception as exp_handler:
                        raise exp_handler
    # *****FUNCTION TO UNZIP ********

def lambda_handler(event, context):
    try:
        inp_bucket = event['Records'][0]['s3']['bucket']['name']
        inp_key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
        config_dict = build_conf_obj(os.environ['config_bucket'],os.environ['config_file'], os.environ['param_name'])
        write_to_s3(config_dict)
    except Exception as exp_handler:
        print("ERROR")
All was going well. The only issue I am facing is in the validation part: I think the while loop is wrong, since it goes into an infinite loop.
Expectation:
Search for trigger_file.txt in the zip. If it is found, break out of the loop, do the validation, and unzip to the S3 folder. If it is not found, keep searching until the end of the zip's entries.
ERROR OUTPUT ( timing out):
Response:
{
"errorMessage": "2020-06-16T20:09:06.168Z 39253b98-db87-4e65-b288-b585d268ac5f Task timed out after 60.06 seconds"
}
Request ID:
"39253b98-db87-4e65-b288-b585d268ac5f"
Function Logs:
again
Trigger file not found ,try again
Trigger file not found ,try again
Trigger file not found ,try again
Trigger file not found ,try again
Trigger file not found ,trEND RequestId: 39253b98-db87-4e65-b288-b585d268ac5f
REPORT RequestId: 39253b98-db87-4e65-b288-b585d268ac5f Duration: 60060.06 ms Billed Duration: 60000 ms Memory Size: 3008 MB Max Memory Used: 83 MB Init Duration: 389.65 ms
2020-06-16T20:09:06.168Z 39253
Answer 1:
In the following while loop from your code, if fileName is not "Trigger_file.txt", the loop never terminates, because nothing in the else branch changes found or fileName.
found = False
while not found:
    if fileName == "Trigger_file.txt":
        with zipf.open(fileName , 'r') as thefile:
            my_list = [i.decode('utf8').split(' ') for i in thefile]
            my_list = str(my_list)[1:-1]
            print("my_list :",my_list)
            print("fileName :",fileName)
            found = True
            break
            thefile.close()
    else:
        print("Trigger file not found ,try again")
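To make the cause concrete, here is a minimal sketch of the same loop shape that runs outside Lambda. The function name and the iteration cap are made up for this illustration and are not part of your code; the cap is only there so the demo can stop.

```python
def search_like_original(file_name, max_iterations=5):
    """Mimic the question's inner while loop, with a hypothetical cap added."""
    found = False
    iterations = 0
    while not found and iterations < max_iterations:
        iterations += 1
        if file_name == "Trigger_file.txt":
            found = True
        # else: nothing here changes file_name or found, so the
        # original loop (which has no cap) would repeat forever.
    return found, iterations
```

For any entry other than Trigger_file.txt, only the cap ends the loop, which matches the repeated "Trigger file not found ,try again" lines in your log until the 60 second timeout.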
I think you can replace part of your write_to_s3 function with the following code:
def write_to_s3(config_dict):
    ######################
    #### Do something ####
    ######################
    # Read the file as a zipfile perform transformations and process the members
    with zipfile.ZipFile(tf, mode='r') as zipf:
        found = False
        for file in zipf.infolist():
            fileName = file.filename
            if fileName == "Trigger_file.txt":
                with zipf.open(fileName, 'r') as thefile:
                    my_list = [i.decode('utf8').split(' ') for i in thefile]
                    my_list = str(my_list)[1:-1]
                    print("my_list :", my_list)
                    print("fileName :", fileName)
                    found = True
                    thefile.close()
                    break
        if found is False:
            print("Trigger file not found ,try again")
            return
        for file in zipf.infolist():
            fileName = file.filename
            if 'csv' in fileName:
                if fileName not in my_list:
                    print("Validation Failed")
                    return
        print("Validation Success , all files in Trigger file are present procced for extraction")
    # *****FUNCTION TO UNZIP ********
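For the check described in the question (files in the zip versus files listed in the trigger file), a rough sketch might look like the following. It assumes the trigger file holds comma-separated name:count pairs, as in the example a.csv:120 , b.csv:10 , c.csv:50, and it takes a hypothetical upload callable in place of the real S3 write so that it can run without AWS. parse_trigger, validate_and_extract and upload are names made up for this illustration.

```python
import io
import zipfile

TRIGGER_NAME = "Trigger_file.txt"  # name used in the question's code


def parse_trigger(text):
    """Parse 'a.csv:120 , b.csv:10' into {'a.csv': 120, 'b.csv': 10}.

    Assumes comma-separated name:count pairs (format taken from the example).
    """
    counts = {}
    for entry in text.replace("\n", ",").split(","):
        entry = entry.strip()
        if not entry:
            continue
        name, _, count = entry.rpartition(":")
        counts[name.strip()] = int(count)
    return counts


def validate_and_extract(zip_bytes, upload):
    """Call upload(name, data) for each CSV if the zip matches the trigger file.

    Returns True on success, False if the trigger file is missing or the
    set of CSV members differs from the names listed in it.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        names = zf.namelist()
        if TRIGGER_NAME not in names:
            print("Trigger file not found")
            return False
        expected = parse_trigger(zf.read(TRIGGER_NAME).decode("utf-8"))
        csv_names = {n for n in names if n.lower().endswith(".csv")}
        if csv_names != set(expected):
            print("Validation failed")
            return False
        # Validation passed: hand every CSV member to the caller's uploader.
        for name in sorted(csv_names):
            upload(name, zf.read(name))
    return True
```

In the Lambda, upload could be a small wrapper around S3_CLIENT.put_object(Bucket=out_bucket, Key=name, Body=data); the key layout is up to you. The record counts are parsed here but not compared against the CSV contents.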
Source: https://stackoverflow.com/questions/62417455/aws-lambda-read-zip-file-perform-validation-and-unzip-to-s3-bucket-if-validation