Question
I am working through a lab on LinuxAcademy.com. The course is Automating AWS with Lambda, Python, and Boto3, and the specific lab I am having trouble with is Lecture: Importing CSV Files into DynamoDB.
In this lab we upload a .csv file to a specified S3 bucket. The upload generates an S3 event, which kicks off the Lambda function shown below. The function parses the .csv and then uploads its contents into DynamoDB.
I was originally having issues with Line 23:
items = read_csv(download_file)
because Python reported that download_file was not defined. After changing it to:
items = read_csv(download_path)
I was able to get past that error.
Now I am having an issue with Line 26:
for item in items:
The new error for line 26, as reported in CloudWatch, is as follows:
[ERROR] TypeError: 'NoneType' object is not iterable
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 26, in lambda_handler
    for item in items:
Here is the code:
import csv
import os
import tempfile
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

import boto3
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('Movies')
s3 = boto3.client('s3')

def lambda_handler(event, context):
    for record in event['Records']:
        source_bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']
        with tempfile.TemporaryDirectory() as tmpdir:
            download_path = os.path.join(tmpdir, key)
            s3.download_file(source_bucket, key, download_path)
            items = read_csv(download_path)

            with table.batch_writer() as batch:
                for item in items:  # line 26: the TypeError is raised here
                    batch.put_item(Item=item)

def read_csv(file):
    items=[]
    with open(file) as csvfile:
        reader = csv.DictReader(csvfile)
        for row in reader:
            data = {}
            data['Meta'] = {}
            data['Year'] = int(row['Year'])
            data['Title'] = row['Title'] or None
            data['Meta']['Length'] = int(row['Length'] or 0)
            #data['Meta']['Length'] = int(row['Length'] or 0)
            data['Meta']['Subject'] = row['Subject'] or None
            data['Meta']['Actor'] = row['Actor'] or None
            data['Meta']['Actress'] = row['Actress'] or None
            data['Meta']['Director'] = row['Director'] or None
            data['Meta']['Popularity'] = row['Popularity'] or None
            data['Meta']['Awards'] = row['Awards'] == 'Yes'
            data['Meta']['Image'] = row['Image'] or None
            data['Meta'] = {k: v for k,
                            v in data['Meta'].items() if v is not None}
I'm starting to think that this is related to the function not reading the .csv properly. The .csv is a small test file, contents below.
Year,Length,Title,Subject,Actor,Actress,Director,Popularity,Awards,Image
1990,111,Tie Me Up, Comedy,"Banderas, Antonio","April, Victoria","Al, Pedreo",68,No,NicholasCage.png
1991,112,Tie Me Up2, Comedy2,"Banderas, Antonio2","April, Victoria2","Al, Pedreo2",682,No2,NicholasCage2.png
1993,113,Tie Me Up3, Comedy3,"Banderas, Antonio3","April, Victoria3","Al, Pedreo3",683,No3,NicholasCage3.png
Answer 1:
The read_csv() function does not contain a return statement.
Therefore this line:
items = read_csv(download_path)
assigns None to the items variable, because a Python function with no return statement returns None.
That is why the error is raised on this line:
for item in items:
Within the read_csv() function you probably want to append data to items after each row and then, at the end of the function, return items.
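As a minimal sketch of that fix, here is one way read_csv() could look with the two missing pieces added. The column names are taken from the question's code; building the Meta dict in one literal and passing newline='' to open() are stylistic choices of this sketch, not part of the original lab code.

```python
import csv


def read_csv(file):
    """Parse the CSV at `file` into a list of dicts, one per data row."""
    items = []
    # newline='' is what the csv module documentation recommends for files
    # handed to csv readers.
    with open(file, newline='') as csvfile:
        reader = csv.DictReader(csvfile)
        for row in reader:
            meta = {
                'Length': int(row['Length'] or 0),
                'Subject': row['Subject'] or None,
                'Actor': row['Actor'] or None,
                'Actress': row['Actress'] or None,
                'Director': row['Director'] or None,
                'Popularity': row['Popularity'] or None,
                'Awards': row['Awards'] == 'Yes',
                'Image': row['Image'] or None,
            }
            data = {
                'Year': int(row['Year']),
                'Title': row['Title'] or None,
                # Drop empty attributes, as the question's comprehension does.
                'Meta': {k: v for k, v in meta.items() if v is not None},
            }
            items.append(data)  # fix 1: collect each parsed row
    return items  # fix 2: without this, the caller receives None
```

With this version, items = read_csv(download_path) in lambda_handler receives a list, so the for item in items: loop has something to iterate over. For the three-row test file in the question it should return three dicts.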
Answer 2:
You can use pandas. It gives you a DataFrame, which you can loop through easily. For an introduction to reading CSV files with the pandas library, see this tutorial: https://www.datacamp.com/community/tutorials/pandas-read-csv
Source: https://stackoverflow.com/questions/59460882/unable-to-parse-csv-in-python