Migrating from CPython to Jython

坚强是说给别人听的谎言 提交于 2019-11-30 04:37:53

First off, I have to say the Jython implementation is very good. Most things "just work".

Here are a few things that I have encountered:

  • C modules are not available, of course.

  • open('file').read() doesn't automatically close the file. This has to do with the difference in the garbage collector. This can cause issues with too many open files. It's better to use the "with open('file') as fp" idiom.

  • Setting the current working directory (using os.setcwd()) works for Python code, but not for Java code. It emulates the current working directory for everything file-related but can only do so for Jython.

  • XML parsing will try to validate an external DTD if it's available. This can cause massive slowdowns in XML handling code because the parser will download the DTD over the network. I reported this issue, but so far it remains unfixed.

  • The __ del __ method is invoked very late in Jython code, not immediately after the last reference to the object is deleted.

There is an old list of differences, but a recent list is not available.

So far, I have noticed two further issues:

  • String interning 'a' is 'a' is not guaranteed (and it is just an implementation fluke on CPython). This could be a serious problem, and really was in one of the libraries I was porting (Jinja2). Unit tests are (as always) your best friends!
Jython 2.5b0 (trunk:5540, Oct 31 2008, 13:55:41)
>>> 'a' is 'a'
True
>>> s = 'a'
>>> 'a' is s
False
>>> 'a' == s   
True
>>> intern('a') is intern(s)
True

Here is the same session on CPython:

Python 2.5.2 (r252:60911, Oct  5 2008, 19:24:49)
>>> 'a' is 'a'
True
>>> s = 'a'
>>> 'a' is s
True
>>> 'a' == s
True
>>> intern('a') is intern(s)
True

  • os.spawn* functions are not implemented. Instead use subprocess.call. I was surprised really, as the implementation using subprocess.call would be easy, and I am sure they will accept patches.

(I have been doing a similar thing as you, porting an app recently)

I'm starting this as a wiki collected from the other answers and my experience. Feel free to edit and add stuff, but please try to stick to practical advice rather than a list of broken things. Here's an old list of differences from the Jython site.

Resource management

Jython does not use reference counting, and so resources are released as they are garbage collected, which is much later then you'd see in the equivalent CPython program

  • open('file').read() doesn't automatically close the file. Better use the with open('file') as fp idiom.
  • The __ del __ method is invoked very late in Jython code, not immediately after the last reference to the object is deleted.

MySQL Integration

mysqldb is a c module, and therefore will not work in jython. Instead, you should use com.ziclix.python.sql.zxJDBC, which comes bundled with Jython.

Replace the following MySQLdb code:

connection = MySQLdb.connect(host, user, passwd, db, use_unicode=True, chatset='utf8')

With:

url = "jdbc:mysql://%s/%s?useUnicode=true&characterEncoding=UTF-8&zeroDateTimeBehavior=convertToNull" % (host, db)
connections = zxJDBC.connect(url, user, passwd, "com.mysql.jdbc.Driver")

You'll also need to replace all _mysql_exception with zxJDBC.

Finally, you'll need to replace the query placeholders from %s to ?.

Unicode

  • You can't express illegal unicode characters in Jython. Trying something like unichr(0xd800) would cause an exception, and having a literal u'\ud800' in your code will just wreak havoc.

Missing things

  • C modules are not available, of course.
  • os.spawn* functions are not implemented. Instead use subprocess.call.

Performance

  • For most workloads, Jython will be much slower than CPython. Reports are anything between 3 to 50 times slower.

Community

The Jython project is still alive, but is not fast-moving. The dev mailing list has about 20 messages a month, and there seem to be only about 2 developers commiting code lately.

When I switched a project from CPython to Jython some time ago I realized a speed-down of up to 50x for time-critical sections. Because of that I stayed with CPython.

However, that might have changed now with the current versions.

You might also want to research JPype. I'm not sure how mature it is compared to Jython, but it should allow CPython to access Java code.

Recently, I worked on a project for a professor at my school with a group. Early on, it was decided that we would write the project in Python. We definitely should have used CPython. We wrote the program in Python and all of our unit tests eventually worked. Because most people already have Java installed on their computers, and not Python, we decided to just deploy it as a Jython jar. We therefore wrote the GUI with Swing, because that's included in Java's standard library.

The first time I ran the program with Jython, it crashed immediately. For one, csv.reader's ".fieldnames" always seemed to be None. Therefore I had to change several parts of our code to work around this.

A different section of my code crashed as well, which worked fine with CPython. Jython accused me of referencing a variable before it was assigned anything (which drove me nuts and really wasn't the case). This is one example: ActiveState's Code Recipe's external sort

Worse yet, the performance was awful. Basically this code combined several CSV files, one of which was about 2 GB. In CPython, it ran in 8.5 minutes. In Jython, it ran in 25 minutes.

These problems happened with 2.5.2rc2 (the latest at the time of writing this post).

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!