Parse multipart request string in Python

此生再无相见时 posted on 2019-12-05 06:45:14

Question


I have a string like this:

"--5b34210d81fb44c5a0fdc1a1e5ce42c3\r\nContent-Disposition: form-data; name=\"author\"\r\n\r\nJohn Smith\r\n--5b34210d81fb44c5a0fdc1a1e5ce42c3\r\nContent-Disposition: form-data; name=\"file\"; filename=\"example2.txt\"\r\nContent-Type: text/plain\r\nExpires: 0\r\n\r\nHello World\r\n--5b34210d81fb44c5a0fdc1a1e5ce42c3--\r\n"

I also have the request headers available in other variables.

How do I easily parse this with Python 3?

I am handling a file upload in AWS Lambda via API Gateway; the request body and headers are available via Python dicts.

There are other similar questions on Stack Overflow, but most assume the use of the requests module or other modules and expect the request details to be in a specific object or format.

NOTE: I am aware it's possible to have the user upload to S3 and trigger Lambda, but I am intentionally choosing not to do that in this case.


Answer 1:


It can be parsed with MultipartDecoder from requests-toolbelt, using something like:

from requests_toolbelt.multipart import decoder
multipart_string = "--ce560532019a77d83195f9e9873e16a1\r\nContent-Disposition: form-data; name=\"author\"\r\n\r\nJohn Smith\r\n--ce560532019a77d83195f9e9873e16a1\r\nContent-Disposition: form-data; name=\"file\"; filename=\"example2.txt\"\r\nContent-Type: text/plain\r\nExpires: 0\r\n\r\nHello World\r\n--ce560532019a77d83195f9e9873e16a1--\r\n"
content_type = "multipart/form-data; boundary=ce560532019a77d83195f9e9873e16a1"
decoder.MultipartDecoder(multipart_string, content_type)



Answer 2:


Expanding on sam-anthony's answer (I had to make some fixes for it to work on Python 3.6.8):

from requests_toolbelt.multipart import decoder

multipart_string = b"--ce560532019a77d83195f9e9873e16a1\r\nContent-Disposition: form-data; name=\"author\"\r\n\r\nJohn Smith\r\n--ce560532019a77d83195f9e9873e16a1\r\nContent-Disposition: form-data; name=\"file\"; filename=\"example2.txt\"\r\nContent-Type: text/plain\r\nExpires: 0\r\n\r\nHello World\r\n--ce560532019a77d83195f9e9873e16a1--\r\n"
content_type = "multipart/form-data; boundary=ce560532019a77d83195f9e9873e16a1"

for part in decoder.MultipartDecoder(multipart_string, content_type).parts:
    print(part.text)

Output:

John Smith
Hello World
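
Printing part.text drops the field name and filename, which live in each part's Content-Disposition header. The sketch below shows one way to pull them out of a header value using only the standard library's email.message.Message. The helper name disposition_params is hypothetical, and the comment about how requests_toolbelt exposes headers (bytes keys and values) is an assumption to verify against the installed version.

```python
from email.message import Message


def disposition_params(header_value):
    """Return (name, filename) parsed from a Content-Disposition value.

    filename is None when the part is a plain form field.
    """
    msg = Message()
    msg["Content-Disposition"] = header_value
    # get_param() unquotes the value, so '"file"' comes back as 'file'.
    name = msg.get_param("name", header="content-disposition")
    filename = msg.get_filename()
    return name, filename


# Assumption: with requests_toolbelt the value would come from something like
#   part.headers[b"Content-Disposition"].decode()
# Here a literal header value from the question's sample is used instead.
print(disposition_params('form-data; name="file"; filename="example2.txt"'))
```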

What you'd have to do is install requests-toolbelt into your project directory with pip install requests-toolbelt --target=. and then upload it along with your Lambda script.

Here's a working example:

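from requests_toolbelt.multipart import decoder

def lambda_handler(event, context):
    # The multipart boundary is taken from the Content-Type request header.
    content_type_header = event['headers']['Content-Type']

    # MultipartDecoder needs bytes on Python 3, so encode the body string.
    body = event["body"].encode()

    # Collect the text of every part, one per line.
    response = ''
    for part in decoder.MultipartDecoder(body, content_type_header).parts:
        response += part.text + "\n"

    return {
        'statusCode': 200,
        'body': response
    }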
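
The handler assumes the header key is spelled exactly Content-Type and that the body arrives as plain text. Depending on how API Gateway is configured, the header capitalisation can differ and the body can arrive base64-encoded with the event's isBase64Encoded flag set. A minimal sketch of two helpers that cover both cases follows; get_header and get_body_bytes are hypothetical names, not part of any library.

```python
import base64


def get_header(headers, name, default=None):
    """Look up an HTTP header without depending on its capitalisation."""
    wanted = name.lower()
    for key, value in (headers or {}).items():
        if key.lower() == wanted:
            return value
    return default


def get_body_bytes(event):
    """Return the request body as bytes, undoing base64 when it is flagged."""
    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        return base64.b64decode(body)
    return body.encode("utf-8")
```

Inside the handler, these would replace the direct lookups: pass get_body_bytes(event) and get_header(event['headers'], 'Content-Type') to MultipartDecoder.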



Answer 3:


If you want to use Python's cgi module from the standard library (deprecated since Python 3.11 and removed in 3.13):

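import base64
from cgi import parse_header, parse_multipart
from io import BytesIO

# Split the Content-Type header into its type and parameters (e.g. boundary).
c_type, c_data = parse_header(event['headers']['Content-Type'])
assert c_type == 'multipart/form-data'
# On Python 3, parse_multipart expects the boundary as bytes.
c_data['boundary'] = c_data['boundary'].encode('ascii')
# This assumes API Gateway delivered the body base64-encoded.
decoded_string = base64.b64decode(event['body'])
form_data = parse_multipart(BytesIO(decoded_string), c_data)

# Each field name maps to a list of values; 'file' is the upload field.
for image_str in form_data['file']:
    ...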
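
On interpreters where cgi is unavailable, the standard library's email package can parse the same body. This is a different technique from the ones in the answers, sketched under assumptions: parse_multipart_form is a hypothetical helper, and it has only been reasoned through for small text payloads like the sample in the question, so binary uploads should be tested before relying on it.

```python
import email


def parse_multipart_form(body, content_type):
    """Parse a multipart/form-data body (bytes) into {field name: [parts]}."""
    # Prepend a minimal header block so the email parser sees a MIME message.
    prefix = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n"
    message = email.message_from_bytes(prefix + body)
    if not message.is_multipart():
        raise ValueError("body is not a multipart message")

    fields = {}
    for part in message.get_payload():
        name = part.get_param("name", header="content-disposition")
        fields.setdefault(name, []).append({
            "filename": part.get_filename(),          # None for plain fields
            "content_type": part.get_content_type(),  # text/plain if absent
            "data": part.get_payload(decode=True),    # raw bytes of the part
        })
    return fields


# The sample from the question, shortened to a hypothetical boundary "B".
sample = (
    b'--B\r\nContent-Disposition: form-data; name="author"\r\n\r\n'
    b'John Smith\r\n'
    b'--B\r\nContent-Disposition: form-data; name="file"; '
    b'filename="example2.txt"\r\nContent-Type: text/plain\r\nExpires: 0\r\n\r\n'
    b'Hello World\r\n--B--\r\n'
)
form = parse_multipart_form(sample, "multipart/form-data; boundary=B")
print(form["file"][0]["filename"], form["file"][0]["data"])
```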


Source: https://stackoverflow.com/questions/50925083/parse-multipart-request-string-in-python
