Django and UTF8
I have to deal with it now, and so do a lot of other people, judging by the utf8/unicode/encoding topics on the Django mailing list. I found this one thread quite interesting, and it looks like it solves the problem, but maybe I also just need to learn a bit more about the bits and pieces that make this whole thing work. This message explains how to make MySQL fully aware of UTF8 and handle it well.
Next thing on the list: the Django setting DEFAULT_CHARSET.
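For reference, here is a minimal sketch of what that setting looks like in a project's settings file. DEFAULT_CHARSET is a real Django setting; the file name and the value shown are just the usual choice, not something taken from the thread above.

```python
# settings.py (fragment, illustrative only)

# Charset Django uses for the responses it generates.
# "utf-8" is assumed here because that is what this post is about.
DEFAULT_CHARSET = "utf-8"
```

This only covers what Django sends out. The database and anything else that touches the text still have to agree on the same encoding.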
Doug Napoleone said,
February 2, 2007 at 6:06 pm
For the PyCon apps, we ran into a lot of UTF8 issues. I thought I was well prepared for them, but that turned out not to be the case, and the thread you mentioned didn’t help me at the time.
There is one MAJOR problem still unaddressed, and that is e-mail. It turns out that Django’s e-mail hooks improperly use DEFAULT_CHARSET and go out of their way to do the wrong thing with the email and smtplib modules. The end result is that your e-mail will have a UTF8-encoded subject. 90% of all spam filters automatically filter such e-mails (or any e-mail containing a subject with its own MIME header). This includes Google, Yahoo, MSN, Comcast, and Verizon. There could be many others; those are the ones we ran into.
This caused major problems for us as many of the people who submitted talk proposals or signed up for login accounts never got their e-mail notifications.
The solution was to write our own version of the Django sendmail interfaces and monkey patch those back into Django. We added two settings: DEFAULT_EMAIL_CHARSET and DEFAULT_EMAIL_SUBJECT_CHARSET.
The code is currently hosted on svn.python.org, but requires ssh access. We hope to get anonymous http access soon.
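That code is not reproduced here. As a rough illustration of the idea only, the sketch below builds a message with the standard library's email package, keeping the body charset and the subject charset separate. The two setting names come from the description above, but their values, the function name build_message, and the fallback behaviour are assumptions made for this example, not the PyCon implementation.

```python
from email.header import Header
from email.mime.text import MIMEText

# Stand-ins for the two settings named above; the values are assumptions.
DEFAULT_EMAIL_CHARSET = "utf-8"
DEFAULT_EMAIL_SUBJECT_CHARSET = "us-ascii"


def build_message(subject, body, sender, recipient):
    """Hypothetical helper: encode the body, but keep the subject plain if possible."""
    # The body always carries an explicit charset.
    msg = MIMEText(body, "plain", DEFAULT_EMAIL_CHARSET)
    try:
        # If the subject fits the subject charset, write it as a plain
        # header with no MIME encoded-word wrapper.
        subject.encode(DEFAULT_EMAIL_SUBJECT_CHARSET)
        msg["Subject"] = subject
    except UnicodeEncodeError:
        # Otherwise fall back to an encoded header so no characters are lost.
        msg["Subject"] = Header(subject, "utf-8")
    msg["From"] = sender
    msg["To"] = recipient
    return msg


plain = build_message("Your talk proposal", "Thanks, José!",
                      "noreply@example.org", "speaker@example.org")
fancy = build_message("Propuesta de José", "Gracias.",
                      "noreply@example.org", "speaker@example.org")
```

Here plain ends up with an unencoded subject while its body is still UTF8, and fancy only gets an encoded subject because it actually needs one. The resulting object could then be handed to smtplib for delivery.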
For more information on what we went through to get UTF8 fully working, you can check out our old bugs wiki:
http://us.pycon.org/TX2007/PyConTechBugs
We ran into more than one UTF8 issue. Setting the PostgreSQL DB encoding to UTF8, setting DEFAULT_CHARSET, and properly dealing with some custom form wrappers to make them UTF8 aware were the least of it. We knew about that out of the box.
Pythoneer » Django’s dev server and UTF-8 said,
November 23, 2007 at 12:35 pm
[...] time to time I got this UnicodeEncodeError, but I had done all the things (sitecustomize.py, some more) right in order to configure the system in UTF-8. I [...]