Question
I have a Lambda with an SQS trigger. When it gets hit, a batch of records from SQS comes in (usually about 10 at a time, I think). If I return a failed status code from the handler, all 10 messages will be retried. If I return a success code, they'll all be removed from the queue. What if 1 out of those 10 messages failed and I want to retry just that one?
    exports.handler = async (event) => {
        for (const record of event.Records) {
            try {
                let body = JSON.parse(record.body);
                // do things
            }
            catch (err) {
                // one message failed, i want it to be retried
            }
        }
        // returning this causes ALL messages in
        // this batch to be removed from the queue
        return {
            statusCode: 200,
            body: 'Finished.'
        };
    };
Do I have to manually re-add that one message back to the queue? Or can I return a status from my handler that indicates that one message failed and should be retried?
Answer 1:
Yes, you have to manually re-add the failed messages back to the queue.
What I suggest is keeping a fail count: if every message in the batch failed, you can simply return a failed status so that the whole batch is retried; otherwise, if the fail count is below the batch size (10 here), you can send the failed messages back to the queue individually.
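The fail-count idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: `processBatch`, `processRecord` and `requeue` are hypothetical names, and `requeue` stands in for whatever call you use to send a message body back to the queue (it is injected so the sketch runs without AWS).

```javascript
// Sketch of the fail-count approach. processRecord and requeue are
// hypothetical callbacks supplied by the caller; requeue is assumed to
// send one message body back to the source queue.
async function processBatch(records, processRecord, requeue) {
  const failed = [];

  for (const record of records) {
    try {
      await processRecord(JSON.parse(record.body));
    } catch (err) {
      // Remember the record so it can be retried on its own.
      failed.push(record);
    }
  }

  // Every message failed: throw so the invocation fails and the whole
  // batch stays on the queue to be retried.
  if (records.length > 0 && failed.length === records.length) {
    throw new Error(`All ${failed.length} messages in the batch failed`);
  }

  // Only some failed: send those back individually, then return normally
  // so the original batch is removed from the queue.
  for (const record of failed) {
    await requeue(record.body);
  }

  return {
    processed: records.length - failed.length,
    requeued: failed.length
  };
}
```

Inside a handler this might be called as `await processBatch(event.Records, doThings, sendBackToQueue)`, where both callbacks are your own functions. Note that a re-sent message is a new message, so any retry limit has to be tracked by you.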
Answer 2:
You need to design your app in a different way. Here are a few ideas; they are not the best, but they will solve your problem.
Solution 1:
- Create an SQS delivery queue: sq1
- Create a delay queue, configured with the delay you require: sq2
- Create a dead-letter queue: sdl
Now, inside the Lambda function, if a message from sq1 fails, delete it from sq1 and drop it on sq2 for retry. Any Lambda function invoked asynchronously is retried twice before the event is discarded.
If the message fails again after the given number of retries, move it to the dead-letter queue sdl.
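The routing decision in Solution 1 might look something like the sketch below. It assumes the retry count travels with the message in a message attribute called `retryCount`, and that three retries are allowed; both the attribute name and the limit are assumptions made for illustration, as is the function name `routeFailedMessage`.

```javascript
// Assumption: the answer does not say how many retries are allowed.
const MAX_RETRIES = 3;

// Decide where a failed record should go next: the delay queue (sq2)
// for another attempt, or the dead-letter queue (sdl) once retries are
// used up. The retryCount message attribute is a hypothetical convention.
function routeFailedMessage(record, maxRetries = MAX_RETRIES) {
  const attr = record.messageAttributes && record.messageAttributes.retryCount;
  const retryCount = attr ? Number(attr.stringValue) : 0;

  if (retryCount >= maxRetries) {
    // Retries exhausted: park the message on the dead-letter queue.
    return { queue: 'sdl', body: record.body, retryCount };
  }

  // Still allowed to retry: forward to the delay queue with a bumped count.
  return { queue: 'sq2', body: record.body, retryCount: retryCount + 1 };
}
```

The caller would then send `body` to the returned queue with the updated `retryCount` attribute and let the handler return successfully, so that the original message is removed from sq1.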
- AWS Lambda - processing messages in Batches
- https://docs.aws.amazon.com/lambda/latest/dg/retries-on-errors.html
Note: When an SQS event source mapping is initially created and enabled, or when messages first appear after a period with no traffic, the Lambda service begins polling the SQS queue using five parallel long-polling connections. As per the AWS documentation, the default duration for a long poll from AWS Lambda to SQS is 20 seconds.
https://docs.aws.amazon.com/lambda/latest/dg/lambda-services.html#supported-event-source-sqs
- https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-delay-queues.html
- https://nordcloud.com/amazon-sqs-as-a-lambda-event-source/
Solution 2:
Use AWS Step Functions
- https://aws.amazon.com/step-functions/
Step Functions will call the Lambda and handle the retry logic on failure, with configurable exponential back-off if needed.
- https://docs.aws.amazon.com/step-functions/latest/dg/concepts-error-handling.html
- https://cloudacademy.com/blog/aws-step-functions-a-serverless-orchestrator/
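To make the back-off behaviour in Solution 2 concrete, here is a small sketch of a Task state with a `Retry` block, written as a JavaScript object, plus a helper that works out the waits it implies. The state name, the placeholder ARN, the chosen values and the `retryDelays` helper are all assumptions for illustration; only the `Retry` field names come from the Amazon States Language.

```javascript
// Hypothetical Task state: the Resource ARN is a placeholder and the
// retry values are examples, not recommendations.
const processMessageState = {
  Type: 'Task',
  Resource: 'arn:aws:lambda:REGION:ACCOUNT_ID:function:processMessage',
  Retry: [
    {
      ErrorEquals: ['States.TaskFailed'],
      IntervalSeconds: 2, // wait before the first retry
      MaxAttempts: 3,     // number of retries before giving up
      BackoffRate: 2.0    // multiplier applied to the wait on each retry
    }
  ],
  End: true
};

// Compute the wait (in seconds) before each retry for one Retry entry:
// IntervalSeconds * BackoffRate^(attempt index).
function retryDelays({ IntervalSeconds, MaxAttempts, BackoffRate }) {
  return Array.from(
    { length: MaxAttempts },
    (_, i) => IntervalSeconds * Math.pow(BackoffRate, i)
  );
}

const delays = retryDelays(processMessageState.Retry[0]);
```

With the values above, `delays` holds waits of 2, 4 and 8 seconds before the state is treated as failed.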
Solution 3:
Use a CloudWatch scheduled event to trigger a Lambda function that polls for FAILED.
Error handling for a given event source depends on how Lambda is invoked. Amazon CloudWatch Events invokes your Lambda function asynchronously.
- https://docs.aws.amazon.com/lambda/latest/dg/retries-on-errors.html
- https://engineering.opsgenie.com/aws-lambda-performance-series-part-2-an-analysis-on-async-lambda-fail-retry-behaviour-and-dead-b84620af406
- https://dzone.com/articles/asynchronous-retries-with-aws-sqs
- https://medium.com/@ron_73212/how-to-handle-aws-lambda-errors-like-a-pro-e5455b013d10
Source: https://stackoverflow.com/questions/55497907/how-do-i-fail-a-specific-sqs-message-in-a-batch-from-a-lambda