Multithreading with MySQLdb and weakrefs
I had been fighting a threading problem for four days, and threading problems are known to be hard to track down. I finally found it and learned that I had actually made a beginner’s mistake.
What happened?
From the front end I trigger, via AJAX, a view that in turn starts a thread to do some import work, which might take quite a while. This lets the user keep working while the import runs, without being interrupted. Every once in a while an asynchronous call checks on the state of the import.
And here lies the problem: while the thread is running, busy as a bee adding data to the DB, the asynchronous status check also tries to run a query, and that causes the following exception:
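The setup described above can be sketched roughly as follows. This is a minimal illustration in modern Python, not the code from the project: ImportJob, its methods, and the in-memory counter are hypothetical names, and a real import would write to the database where the comment indicates.

```python
import threading


class ImportJob(object):
    """Hypothetical stand-in for the import work started by the view."""

    def __init__(self, rows):
        self._rows = rows
        self._done = 0
        self._lock = threading.Lock()
        self._thread = threading.Thread(target=self._run)

    def start(self):
        # Called by the view; returns immediately so the user can keep working.
        self._thread.start()

    def _run(self):
        for row in self._rows:
            # A real import would write `row` to the database here.
            with self._lock:
                self._done += 1

    def status(self):
        # Called by the periodic asynchronous status check.
        with self._lock:
            return {"done": self._done, "total": len(self._rows)}

    def wait(self):
        # Block until the import thread has finished.
        self._thread.join()
```

In this sketch the status check only reads a counter guarded by a lock. The trouble starts when both the worker and the status check run queries through the same database connection.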
...
  File "/Users/cain/programming/django/trunk/django/db/backends/mysql/base.py", line 42, in execute
    return self.cursor.execute(sql, params)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/MySQLdb/cursors.py", line 137, in execute
    self.errorhandler(self, exc, value)
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/MySQLdb/connections.py", line 33, in defaulterrorhandler
    raise errorclass, errorvalue
ReferenceError: weakly-referenced object no longer exists
That really made me mad, as you can imagine. The problem was that I didn’t really know where to start debugging and searching. Debugging was hard anyway, since the code was multithreaded and I hadn’t yet spent the time to get WingIDE to run Django so I could debug it, but that’s another problem.
The problem and its solution
So after a lot of trying and thinking I found out that it had something to do with the connection. Eventually I saw that the connection object in MySQLdb/cursors.py was shared between the threads, so it seemed not to be thread-safe. The problem only occurred when args was passed to the execute() method, so I dug deeper in that direction and found that the connection.literal() call was actually causing the problem. My guess now is that the connection got reset by the main thread and was no longer available, so the exception above was thrown.
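One common way to keep threads from sharing a connection is to give each thread its own, for example via threading.local. MySQLdb reports a DB-API threadsafety level of 1, which means threads may share the module but not connections. The following is only a sketch under the assumption that you control where connections are opened, and it is not the fix I ended up with. get_connection and its connect argument are hypothetical names.

```python
import threading

# Each thread sees its own independent set of attributes on this object.
_local = threading.local()


def get_connection(connect):
    """Return the calling thread's own connection, creating it on first use.

    `connect` is any zero-argument callable that opens a connection,
    for example a small function wrapping MySQLdb.connect(...).
    """
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = connect()
        _local.conn = conn
    return conn
```

With this, the import thread and the status check each get a separate connection, so neither can pull the connection out from under the other.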
I had found it, but too late: at that moment it occurred to me that I should upgrade MySQLdb, and I did. (I am working with the Django trunk, so I could not be more up to date on that front.) After the upgrade the problem was gone. How could I forget to try upgrading first? My excuse: I thought the problem was in Django, not in MySQLdb.
Now I only needed to see whether the problem I had found was also the one that got fixed. Since I had learned a few things about weakrefs and threading over these days, I more or less knew what to look for. And there it was.
from weakref import proxy
self.connection = proxy(connection)
That part of the __init__() method of BaseCursor was apparently the bit that solved my problem. I didn’t verify it, but I am pretty sure.
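For anyone unfamiliar with weakref.proxy: a proxy behaves like the object it wraps but does not keep it alive, and using it after the object is gone raises exactly the ReferenceError from the traceback. Here is a minimal sketch of that behaviour in modern Python. The Connection class is a hypothetical stand-in, not MySQLdb's.

```python
import gc
import weakref


class Connection(object):
    """Hypothetical stand-in for a database connection."""

    def literal(self, value):
        return repr(value)


conn = Connection()
conn_proxy = weakref.proxy(conn)

# Works while a strong reference to the connection still exists.
first = conn_proxy.literal("abc")

del conn        # drop the only strong reference
gc.collect()    # not required on CPython, but makes the intent explicit

try:
    conn_proxy.literal("abc")
    error = None
except ReferenceError as exc:
    # "weakly-referenced object no longer exists"
    error = exc
```

So whenever this message shows up, some code is holding a proxy to an object whose last strong reference has already been dropped.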
The upgrade
When it occurred to me that I should upgrade MySQLdb, I checked my current version number, of course.
>>> import MySQLdb
>>> MySQLdb.version_info
(1, 2, 0, 'final', 1)
Note: this is the version that has the threading problem!
So I upgraded to MySQLdb 1.2.1_p2, the version that fixes the problem above.
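If you want to check for this in code rather than by eye, comparing the numeric part of the version tuple is enough. This is a small sketch: has_threading_fix is a hypothetical helper, and the (1, 2, 1) threshold is taken only from my own experience described in this post.

```python
def has_threading_fix(version_info, fixed=(1, 2, 1)):
    """Return True if `version_info` is at least the `fixed` release.

    Only the numeric (major, minor, patch) part is compared, so the
    trailing release level and serial, e.g. ('final', 1), are ignored.
    """
    return tuple(version_info[:3]) >= tuple(fixed)
```

You would call it as has_threading_fix(MySQLdb.version_info).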
But I didn’t say it was easy to “just” upgrade:
>>> import MySQLdb
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/MySQLdb/__init__.py", line 34, in ?
    from sets import ImmutableSet
ImportError: cannot import name ImmutableSet
Grrrr …
But the problem and its solution were quickly found, and on a Django site at that:
http://code.djangoproject.com/wiki/InstallationPitfalls
After removing sets.py and sets.pyc, it worked just fine.
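An ImportError like the one above usually means that a different file than expected is being picked up for the module name. A quick way to see which file Python would load is sketched below for modern Python, with module_origin as a hypothetical helper name. On the Python 2.4 used in this post, importing the module and looking at its __file__ attribute gives you the same information.

```python
import importlib.util


def module_origin(name):
    """Return where a top-level module name would be loaded from, or None.

    For ordinary modules this is a file path; a path you did not expect
    points to a stray file shadowing the module you wanted.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin
```

For example, module_origin("json") returns the path of the standard library's json package unless something else named json comes first on the import path.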
>>> import MySQLdb
>>> MySQLdb.version_info
(1, 2, 1, 'final', 2)
Good luck threading away …