cgi.parse_multipart function throws TypeError in Python 3

别说谁变了你拦得住时间么 submitted on 2019-12-04 04:35:27

I came across this question while trying to solve the same problem, and I found a simple workaround: convert the 'boundary' item in the dictionary from a string to bytes, specifying an encoding.

    ctype, pdict = cgi.parse_header(self.headers['content-type'])
    pdict['boundary'] = bytes(pdict['boundary'], "utf-8")
    if ctype == 'multipart/form-data':
        fields = cgi.parse_multipart(self.rfile, pdict)

In my case, this seems to work properly.
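A minimal sketch of why the conversion matters, assuming the TypeError comes from the parser joining a bytes prefix to the boundary. The boundary value `abc123` is made up for illustration:

```python
# Made-up boundary, held as a str like the value cgi.parse_header returns.
boundary = "abc123"

try:
    # Python 3 refuses to concatenate bytes and str.
    delimiter = b"--" + boundary
    concat_failed = False
except TypeError:
    concat_failed = True

# Encoding the boundary first yields a bytes delimiter.
delimiter = b"--" + bytes(boundary, "utf-8")
print(concat_failed, delimiter)
```

The same conversion is what the `bytes(pdict['boundary'], "utf-8")` line above performs before the dictionary is handed to the parser.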

To get the tutor's code working on Python 3, there are three error messages you'll have to deal with.

If you get any of these error messages

c_type, p_dict = cgi.parse_header(self.headers.getheader('Content-Type'))
AttributeError: 'HTTPMessage' object has no attribute 'getheader'

or

 boundary = pdict['boundary'].decode('ascii')
AttributeError: 'str' object has no attribute 'decode'

or

headers['Content-Length'] = pdict['CONTENT-LENGTH']
KeyError: 'CONTENT-LENGTH'

when running

c_type, p_dict = cgi.parse_header(self.headers.getheader('Content-Type'))
if c_type == 'multipart/form-data':
    fields = cgi.parse_multipart(self.rfile, p_dict)
    message_content = fields.get('message')

this applies to you.

Solution

First of all, change the first line to accommodate Python 3, where the headers object has get() instead of getheader():

- c_type, p_dict = cgi.parse_header(self.headers.getheader('Content-Type'))
+ c_type, p_dict = cgi.parse_header(self.headers.get('Content-Type'))

Secondly, fix the error of the 'str' object not having an attribute 'decode'. It occurs because strings are Unicode strings as of Python 3, instead of being equivalent to byte strings as in Python 2. Add this line just under the one above:

p_dict['boundary'] = bytes(p_dict['boundary'], "utf-8")

Thirdly, to fix the error of not having 'CONTENT-LENGTH' in pdict just add these lines before the if statement:

content_len = int(self.headers.get('Content-length'))
p_dict['CONTENT-LENGTH'] = content_len
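
The second and third fixes can be folded into one small helper. This is only a sketch under stated assumptions: the name `prepare_pdict` is hypothetical, the `HTTPMessage` object stands in for `self.headers`, and the `p_dict` literal is a hand-written stand-in for what cgi.parse_header would return.

```python
from http.client import HTTPMessage


def prepare_pdict(p_dict, headers):
    """Return a copy of p_dict with a bytes boundary and a CONTENT-LENGTH entry."""
    fixed = dict(p_dict)  # copy, so the caller's dict is left untouched
    boundary = fixed.get('boundary')
    if isinstance(boundary, str):
        # The parsed header holds str values; the parser expects bytes here.
        fixed['boundary'] = boundary.encode('utf-8')
    # A missing header falls back to 0 instead of failing on int(None).
    fixed['CONTENT-LENGTH'] = int(headers.get('Content-Length', 0))
    return fixed


# Stand-in for self.headers inside a request handler.
headers = HTTPMessage()
headers['Content-Type'] = 'multipart/form-data; boundary=abc123'
headers['Content-Length'] = '42'

# Hand-written stand-in for the dict returned by cgi.parse_header.
p_dict = {'boundary': 'abc123'}
fixed = prepare_pdict(p_dict, headers)
print(fixed)
```

Inside the handler, the result would be passed on as `cgi.parse_multipart(self.rfile, prepare_pdict(p_dict, self.headers))`.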

Full solution on my Github:

https://github.com/rSkogeby/web-server

Johannes H

I am doing the same course and ran into the same problem. Instead of getting it to work with cgi, I am now using the urllib.parse module, which was shown in the same course just a few lessons earlier.

from urllib.parse import parse_qs

length = int(self.headers.get('Content-length', 0))
body = self.rfile.read(length).decode()
params = parse_qs(body)

messagecontent = params["message"][0]

You also have to remove enctype='multipart/form-data' from your form, so that the browser sends the default URL-encoded body, which is the only format parse_qs can read.
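
To try this approach outside a running server, here is a self-contained sketch. It assumes a made-up URL-encoded body, and an `io.BytesIO` object stands in for `self.rfile`:

```python
import io
from urllib.parse import parse_qs

# Made-up request body, as a browser would send it for a form
# without enctype='multipart/form-data'.
raw_body = b"message=Hello+world&tag=a&tag=b"

rfile = io.BytesIO(raw_body)  # stand-in for self.rfile
length = len(raw_body)        # stand-in for the Content-Length header

body = rfile.read(length).decode()
params = parse_qs(body)

# parse_qs maps each field name to a list of values.
messagecontent = params["message"][0]
print(messagecontent, params["tag"])
```

Every value comes back as a list because a field may appear more than once, which is why the answer indexes with `[0]`.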

Another hack solution is to edit the source of the cgi module.

At the very beginning of parse_multipart (around line 226), convert a str boundary to bytes before it is used, since Python 3 cannot concatenate bytes and str:

...
boundary = b""
if 'boundary' in pdict:
    boundary = pdict['boundary']
if not valid_boundary(boundary):
    raise ValueError('Invalid boundary in multipart form: %r'
                     % (boundary,))

# Added: parse_header yields a str, but the delimiters below are bytes.
if isinstance(boundary, str):
    boundary = boundary.encode('ascii')
nextpart = b"--" + boundary
lastpart = b"--" + boundary + b"--"
...

In my case, I used cgi.FieldStorage instead of cgi.parse_multipart to extract the file and the name:

form = cgi.FieldStorage(
    fp=self.rfile,
    headers=self.headers,
    environ={'REQUEST_METHOD':'POST',
             'CONTENT_TYPE':self.headers['Content-Type'],
             })

print('File', form['file'].file.read())
print('Name', form['name'].value)