Apache, LDAP and WSGI encoding issue

Submitted by 空扰寡人 on 2019-12-13 04:36:46

Question


I am using Apache 2.4.7 with mod_wsgi 3.4 on Ubuntu 14.04.2 (x86_64) and python 3.4.0. My python app relies on apache to perform user authentication against our company’s LDAP server (MS Active Directory 2008). It also passes some additional LDAP data to the python app using the OS environment. In the apache config, I query the LDAP like so:

…
AuthLDAPURL "ldap://server:389/DC=company,DC=lokal?sAMAccountName,sn,givenName,mail,memberOf?sub?(objectClass=*)"
AuthLDAPBindDN …
AuthLDAPBindPassword …
AuthLDAPRemoteUserAttribute sAMAccountName
AuthLDAPAuthorizePrefix AUTHENTICATE_
…

This passes some user data to my WSGI script where I handle the info as follows:

# Make sure the packages from the virtualenv are found
import site
site.addsitedir('/home/user/.virtualenvs/ispot-cons/lib/python3.4/site-packages')

# Patch path for app (so that libispot can be found)
import sys
sys.path.insert(0, '/var/www/my-app/')

import os
from libispot.web import app as _application

def application(environ, start_response):
    os.environ['REMOTE_USER'] = environ.get('REMOTE_USER', "")
    os.environ['REMOTE_USER_FIRST_NAME'] = environ.get('AUTHENTICATE_GIVENNAME', "")
    os.environ['REMOTE_USER_LAST_NAME'] = environ.get('AUTHENTICATE_SN', "")
    os.environ['REMOTE_USER_EMAIL'] = environ.get('AUTHENTICATE_MAIL', "")
    os.environ['REMOTE_USER_GROUPS'] = environ.get('AUTHENTICATE_MEMBEROF', "")
    return _application(environ, start_response)

I can then access this info in my python app using os.environ.get(…). (BTW: If you have a more elegant solution, please let me know!)

The problem is that some of the user names contain special characters (German umlauts, e.g., äöüÄÖÜ) that are not decoded correctly. For example, the name Tölle arrives in my python app as TÃ¶lle.

Obviously, this is an encoding problem, because

$ echo "TÃ¶lle" | iconv --from utf-8 --to latin1

gives me the correct Tölle.

Another observation that might help: in my apache logs I found the character ü represented as \xc3\x83\xc2\xbc.

I told my Apache in /etc/apache2/envvars to use LANG=de_DE.UTF-8 and python 3 is utf-8 aware as well. I can’t seem to specify anything about my LDAP server. So my question is: where is the encoding getting mixed up and how do I mend it?


Answer 1:


It is bad practice to copy the values to os.environ on each request, as this will fail miserably if the WSGI server is running with a multithreaded configuration: concurrent requests will overwrite each other's values. Look at thread locals instead.
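A minimal sketch of the thread-local approach, assuming the LDAP attributes arrive under the same environ keys as in the question. The names `_request_user`, `current_user`, `with_ldap_user` and `demo_app` are hypothetical and only serve the illustration; in the real script the wrapped callable would be the `_application` imported from `libispot.web`.

```python
import threading

# One independent slot per worker thread, so concurrent requests cannot
# overwrite each other's values the way a shared os.environ would.
_request_user = threading.local()


def current_user():
    """Return the LDAP data of the request handled by the current thread."""
    return getattr(_request_user, "info", {})


def with_ldap_user(wsgi_app):
    """Wrap a WSGI app and store the per-request LDAP data in a thread local."""
    def wrapper(environ, start_response):
        _request_user.info = {
            "username": environ.get("REMOTE_USER", ""),
            "first_name": environ.get("AUTHENTICATE_GIVENNAME", ""),
            "last_name": environ.get("AUTHENTICATE_SN", ""),
            "email": environ.get("AUTHENTICATE_MAIL", ""),
            "groups": environ.get("AUTHENTICATE_MEMBEROF", ""),
        }
        return wsgi_app(environ, start_response)
    return wrapper


def demo_app(environ, start_response):
    # Hypothetical stand-in for the real application object.
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [current_user()["username"].encode("utf-8")]


application = with_ldap_user(demo_app)
```

Application code would then call `current_user()` instead of `os.environ.get(…)`. If the framework already exposes the WSGI environ on its request object, reading the values from there directly is probably simpler still.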

As to the issue of encoded data from LDAP, if I understand the problem correctly, you would need to do:

"TÃ¶lle".encode('latin-1').decode('utf-8')
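As for where the mix-up probably happens: under Python 3, PEP 3333 has the WSGI server hand over environ values as str decoded from the raw bytes as ISO-8859-1 (Latin-1), so UTF-8 data from LDAP shows up as mojibake until it is re-decoded. The re-decoding step can be wrapped in a small helper that leaves already-correct values alone. This is a sketch under that assumption; `fix_wsgi_text` is a hypothetical name.

```python
def fix_wsgi_text(value):
    """Re-decode a str that holds UTF-8 bytes which were read as Latin-1."""
    try:
        return value.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not Latin-1-encodable, or the bytes are not valid UTF-8:
        # assume the text is already correct and return it unchanged.
        return value


# "T\u00c3\u00b6lle" is the mojibake form of the name from the question.
print(fix_wsgi_text("T\u00c3\u00b6lle"))

# The byte sequence seen in the Apache log is that same mojibake,
# encoded as UTF-8 once more.
logged = b"\xc3\x83\xc2\xbc".decode("utf-8")
print(fix_wsgi_text(logged))
```

In the WSGI script the helper would be applied to each value read from environ, for example `fix_wsgi_text(environ.get('AUTHENTICATE_SN', ""))`.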


Source: https://stackoverflow.com/questions/31703451/apache-ldap-and-wsgi-encoding-issue
