17 December 2012

Why I've never really liked the Facebook API

The other day, I got an email from Virgin Media stating that my connection had been "upgraded to 100Mb/s". I went to a bunch of speed-testing websites, and reported speeds were indeed much higher than in the past. I was tempted to brag about it on Facebook, but then I remembered that, last time I did something similar, I was humbled by a bunch of Dutch friends with "big pipes". I wondered what sort of speed they reported then, so I went to Facebook to search for that old status. And that's where my problems started.

The standard FB search UI failed to return anything even vaguely related, as usual. So I started googling for apps that would allow me to search my previous posts, and found a few which just wanted to gather all my personal data (on FB -- you don't say!). Then I found that you can actually request a complete download of all your data from FB (under Settings) and launched the request, but it looked like it would take a long time (for the record, I finally got it about 24 hours later). So I thought "hey, surely I can work with the FB API". How naive of me!

There is, in fact, a straightforward API call to get your statuses: /me/statuses. By default, it will return some 25 records, and links to further paginated results. Except pagination is ridiculously buggy: after the first 50 records, it will just return a blank page. If you try to use the limit parameter, it will return a maximum of 100 records per page, and again it will stop after the second page (i.e. max 200 results, which is actually 199 because everybody knows "there are only two hard things in computer science"). Time-based parameters (until, since) didn't seem to work at all. Using wrappers rather than direct calls didn't seem to make any difference. As it was very late, I gave up and went to sleep.

A day later, still incredulous and obviously fairly frustrated, I googled harder and finally found a relevant question on StackOverflow, which pointed to a Facebook bug about pagination. As the bug says, you can work around the problem by always using offset rather than relying on 'next' and 'previous' links returned in JSON responses. I verified and that's actually the case. By now, my export was available for download anyway. You can imagine how happy I am (not).
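A minimal sketch of that offset workaround, for Python 3 and the standard library only. The function names, the default page size and the injectable fetch parameter are my own choices for illustration; the only things taken from the above are the /me/statuses call and the limit and offset parameters. It assumes each response is a JSON object with a "data" list, and it advances the offset by the number of records actually received, which should also sidestep the 199-vs-200 oddity.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GRAPH_URL = "https://graph.facebook.com/me/statuses"


def fetch_page(access_token, limit, offset):
    """Fetch one page of statuses and return its 'data' list."""
    query = urlencode({"access_token": access_token,
                       "limit": limit,
                       "offset": offset})
    with urlopen(GRAPH_URL + "?" + query) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload.get("data", [])


def iter_statuses(access_token, page_size=25, fetch=fetch_page):
    """Yield every status, moving 'offset' by hand and ignoring 'next' links."""
    offset = 0
    while True:
        page = fetch(access_token, page_size, offset)
        if not page:
            return  # an empty page means we have run out of statuses
        for status in page:
            yield status
        # Advance by what we actually got, not by what we asked for.
        offset += len(page)
```

Looping over iter_statuses(token) and checking each record's "message" field (an assumption about the record layout) would then be enough to grep through old statuses.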

Lessons learnt from this whole debacle:

  • The unofficial facebook-sdk for python doesn't work with Python 3. There is an experimental fork with very limited testing (i.e. it passes 2to3 and that's it).
  • The json module in the Python 3 Standard Library, as used by facebook-sdk, chokes on Facebook data. Don't even ask me how I found out. Trying with a more up-to-date version from the original upstream doesn't help. There is a Python 3 fork which didn't help. Juggling between json.load and json.loads didn't seem to help, and I didn't want to rip the guts out of facebook-sdk for fear of dropping compatibility with 2.x (although I cringed at times: using "file" as variable name? Really?). No wonder @kennethreitz rolled his own JSON parser in Requests.
  • facebook-sdk should probably be rewritten from scratch in Python 3 using Requests. Not that I'll ever do it.
  • After so many years and botched revamps, the Facebook API is still terrible. For something reportedly so essential to "2.0" internet infrastructure, and with so many uber-smart people on their payroll, the whole thing still feels incredibly hackish.

11 December 2012

Public therapy

I just experienced my first real professional failure in almost two years of consulting for my current employer. I'm not saying I've always been perfect before, but it's the first time a customer basically told me to give up and go home.

As tempted as I am to blame the damn tool (which I had never seen before in my life, and unsurprisingly refused to do my bidding), the hard truth is that:

  • I failed to properly and fully "sniff out" customer requirements in advance. This should have been a huge red flag, but I thought I was good enough to just deal with it. Lesson 1: there is a reason hubris is a cardinal sin.
  • On finding myself in trouble, I kept hacking at the problem for days when I should have just taken a step back straight away. I kept googling for a magic bullet, when I should have admitted that I did not know how the product was supposed to work, should have gone back to studying from first principles, and should have built a local proof-of-concept before attempting a real-world deployment. Despite being completely honest with the customer at all times, I ended up over-promising and under-delivering, which is the exact opposite of what I always try to do. Lesson 2: if you find your axe is blunt, hitting faster and from all directions will not compensate; just stop and sharpen up, there is no shame in it.
  • Because of this "just a little hack" attitude, deep down I was not fully committed and concentrated on the problem. I kept assuming that solving a second main task would "make up" for failure on the first one. Unfortunately, this second task depended in part on other people, who also failed to deliver on time. Lesson 3: deus ex machina is a literary device, not an action plan and certainly not a plan B. Also Lesson 4: be brutally honest with yourself at all times.

Obviously, I'm not happy today. Regardless of the actual task at hand, I failed at thinking strategically and being self-aware, and at my age this should not happen; I'm pretty sure I learnt all these lessons when I was 22, and still managed to forget them. I hope this little recap will help me focus... my next engagement looks like a slam-dunk and I owe it to myself to make it so.

21 November 2012

"Mark of the Ninja" does not work under VmWare Fusion 5

I installed "Mark of the Ninja" in an XP image under VmWare Fusion 5, hoping the game would be "simple enough" to run. The image has 4 cores and 4GB of RAM, so plenty of resources. Unfortunately, it looks like it will just crash XP soon after launch. With 3D support turned off, the game doesn't even start. Ah well, I'll have to dig out some old laptop...

12 September 2012

Free disk space by removing TimeMachine local snapshots

Mac OSX TimeMachine's "local snapshots" are basically backup files stored on your local hard-disk (as opposed to regular ones stored on the external TimeMachine server). This is a useful feature for people who travel a lot, so they can revert files to recent versions even when they are not connected to their home TimeMachine.

Unfortunately, this feature tends to take a lot of disk space whenever you deal with big files (like movies etc). In some cases, you might want to temporarily disable it in order to claim back a few GigaBytes. Note that your regular TimeMachine backups will NOT be affected; you'll just lose the ability to revert to recent (i.e. from last week) versions without being connected to the external TimeMachine.

TimeMachine is managed from the command line with the tmutil command. You can type man tmutil to see all options. To disable the snapshots and delete those big files (which are found under /.MobileBackups, by the way), use "sudo tmutil disablelocal". Once snapshots are purged, you can restart the feature (if you need it) with "sudo tmutil enablelocal" (yeah, not exactly rocket science).

In my case, this procedure freed some 139 GB on a 512 GB disk. Not bad!
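If you'd rather script the toggle, here is a small Python 3 sketch wrapping the two tmutil verbs mentioned above. The function names and the injectable run parameter are hypothetical, added only so the command can be inspected without actually running sudo.

```python
import subprocess


def tmutil_command(action):
    """Return the argv list that switches local snapshots on or off."""
    verbs = {"disable": "disablelocal", "enable": "enablelocal"}
    return ["sudo", "tmutil", verbs[action]]


def set_local_snapshots(enabled, run=subprocess.check_call):
    """Enable or disable local snapshots; disabling also purges them."""
    command = tmutil_command("enable" if enabled else "disable")
    run(command)  # sudo will prompt for a password when run for real
    return command
```

Calling set_local_snapshots(False), checking the free space, then calling set_local_snapshots(True) mirrors the manual procedure.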

26 August 2012

VmWare tip

This might be obvious to most, but it's easy for novice VmWare users to forget.

When you take a snapshot, VmWare has to write down the full current status of your image, including RAM. If you take a snapshot while the image is running, VmWare will have to save the full RAM content, which might run up to several GBs. If you take the snapshot after your image has properly shut down, RAM content will simply be discarded, and VmWare will only have to deal with the actual disk.

TL;DR: Always shut down your image before snapshotting, and you'll save a lot of disk space.

25 August 2012

Encrypting and Decrypting SQLDeveloper 3 Passwords

Some Oracle products are fairly sweet, let's be honest. One of them is the revamped SQLDeveloper, which has finally caught up (mostly) with MSSQL Management Studio.

One of its best features is the ability to import and export lists of connections via XML (right-click on Connections to find the relevant menu). The resulting file is very readable, and hence easy to manipulate. The only opaque items are the encrypted passwords, but it turns out that they are not particularly hardened. This is what you have to do to be able to manipulate them for fun and profit.

  1. Get Jython, or your favourite choice of JVM dialect that can work with jars. I picked Jython because 1) it's Python! and 2) Oracle ships it with most products, under oracle_common\util\jython.
  2. Load ojmisc.jar and db-ca.jar. These can be found in different places depending on your SQLDeveloper version.
  3. Import oracle.jdevimpl.db.adapter.DatabaseProviderHelper. That class has the two methods you need: goingOut (i.e. encrypt) and comingIn (i.e. decrypt).

So here's a complete Jython script for the Windows version of SQLDeveloper:

# set this to the path where you extracted SQLDeveloper
SQLDEV_ROOT = r'C:\sqldeveloper' 
# here's the real stuff
import sys
from os.path import join
sys.path.append(join(SQLDEV_ROOT,r'sqldeveloper\extensions\oracle.datamodeler\lib\ojmisc.jar'))
sys.path.append(join(SQLDEV_ROOT,r'sqldeveloper\modules\oracle.adf.model_11.1.1\db-ca.jar'))
from oracle.jdevimpl.db.adapter.DatabaseProviderHelper import goingOut as encrypt, comingIn as decrypt

if __name__=='__main__':
    print "Encrypted 'password': " + encrypt('password')
    print "Decrypted 'password': " + decrypt(encrypt('password'))

24 August 2012

Oracle XE 11.2.0.2 + Oracle Client 11.2.0.1 = cannot "connect / as sysdba"

While having fun with some complicated (and likely illegal!) scenarios I won't go into now, I ended up this morning with an Oracle XE database to which I wanted to connect with the usual DBA routine:

sqlplus /nolog
connect / as sysdba

I kept getting the dreadful "ORA-12560: TNS:protocol adapter error" and couldn't understand why. Surely I didn't break it that much? The service was still up and I could connect regularly in every way except that.

It turns out this is what happens when you try to do too many things on the same machine. I had installed an Oracle Client on top, because I needed a few things from there (OleDB provider etc) and because other products are guaranteed to work with that client rather than XE.

The problem is that the sqlplus version that comes with the Client is older (!) than the one from XE, but because of how the Windows PATH system variable gets manipulated by the installer, the old version takes over and gives you this problem.

Solution? In your PATH, make sure the folder from the oraclexe directory comes before the one from any client you might have there. However, understand that this is likely to break your Client, so either make sure the change is temporary, or set it locally only when you need it (batch files!).
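As an alternative to batch files, the "set it locally" idea can be sketched in Python 3: reorder PATH for one child process only, so the system-wide setting (and the Client) stays untouched. The function names and the "oraclexe" marker are assumptions for illustration; adjust the marker to whatever your XE folder is called.

```python
import os
import shutil
import subprocess


def prioritise(path_value, marker, sep=os.pathsep):
    """Move PATH entries containing `marker` to the front, preserving order."""
    entries = [entry for entry in path_value.split(sep) if entry]
    wanted = [e for e in entries if marker.lower() in e.lower()]
    others = [e for e in entries if marker.lower() not in e.lower()]
    return sep.join(wanted + others)


def run_xe_sqlplus(args=("/nolog",), marker="oraclexe"):
    """Start sqlplus with a reordered PATH that only this child process sees."""
    env = dict(os.environ)
    env["PATH"] = prioritise(env.get("PATH", ""), marker)
    # Resolve the executable against the reordered PATH explicitly, because
    # Windows looks up bare command names using the parent's PATH.
    executable = shutil.which("sqlplus", path=env["PATH"]) or "sqlplus"
    return subprocess.call([executable] + list(args), env=env)
```

Only the process started by run_xe_sqlplus() sees the new ordering; every other program keeps finding the Client first.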

19 August 2012

How to compile PyObjC for Python 3 on OSX 10.8 Mountain Lion

Another one for teh Google...

It so happens that I am curious about PyObjC, the Python bindings for Objective-C, which is the "native" language of choice for OSX/iOS.

As usual, my timing is completely wrong: recent versions of Xcode dropped support for PyObjC, and the project has shrunk to basically one person (that Ronald Oussoren I previously mentioned). The version on PyPI seems to work with Python 2.x only. Even the official page on Sourceforge is basically abandoned, and packages available from there are obsolete. This is a problem because I'm really trying hard to do everything with Python 3 these days, and the PyObjC version shipped with OSX 10.8 "Mountain Lion" is for 2.7 (the only Python version Apple ships and supports).

Luckily, from my past tribulations I knew that Ronald had his own repository on BitBucket, so I tried that and it worked fine. However, the documentation on how to build PyObjC from source is quite scarce (in fine geek tradition), and I had to figure out the following principles the hard way:

  • Ronald's repository is split into many separate packages that have to be individually built. This is very fine-grained, but a bit cumbersome for the general case.
  • Do not use the setup.py script you'll find under /pyobjc. It is just for people pulling from PyPI, i.e. post-release.
  • /pyobjc-xcode is obsolete, and there's nothing to build there.
  • /pyobjc-framework-XgridFoundation simply refuses to build under ML. Xgrid is a somewhat obscure, proprietary Apple technology for highly-parallel computation. If you don't know what it is, chances are that you won't need it. I personally don't care about it.
  • /pyobjc-core is a requirement for all other packages, so it should be built first.
  • In order of importance, /pyobjc-framework-Cocoa, -Quartz and -CoreData are dependencies of other packages, so they should be built in this order before any other pyobjc-framework-*.
  • Python 3 support is occasionally shaky. On one occasion, one file had to be patched to remove unicode literals (the u'mystring' notation from Python 2 that was dropped in Python 3.0), but that's just a temporary snag: Python 3.3 will reintroduce that syntax as a compatibility hack for exactly this type of situation. I've submitted a patch anyway, but if you can't wait for Ronald to consider it, it is available in the below-mentioned repository.
  • Looking at BitBucket, I noticed there's at least one significant fork that is arguably targeting Python 3 more consistently. You might want to try that if Ronald's version is not good enough for you.
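The ordering rules above can be sketched as a small Python 3 function. To be clear, this is an illustration and not the script from my utils repo; the constant and function names are made up here, and it only decides the order, leaving the actual build of each package to that package's own setup.py.

```python
# Packages that must be built first, in this order.
PRIORITY = [
    "pyobjc-core",
    "pyobjc-framework-Cocoa",
    "pyobjc-framework-Quartz",
    "pyobjc-framework-CoreData",
]

# Folders to leave alone: the PyPI meta-package, the obsolete Xcode
# templates, and the one framework that refuses to build under ML.
SKIP = {"pyobjc", "pyobjc-xcode", "pyobjc-framework-XgridFoundation"}


def build_order(folders):
    """Return the folders to build, dependencies first, the rest sorted."""
    available = set(folders) - SKIP
    head = [name for name in PRIORITY if name in available]
    tail = sorted(name for name in available
                  if name not in PRIORITY
                  and name.startswith("pyobjc-framework-"))
    return head + tail
```

Feeding it the output of os.listdir() on the checkout gives a list you can loop over.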

Because I don't plan to do this sort of work every day, I've put together a script so that I won't have to remember all this stuff when starting a new virtualenv environment. It's now available from my utils repo on BitBucket. There is no documentation but OMG IT'S FULL OF COMMENTS so there. As usual, any feedback is more than welcome.

13 August 2012

How to run py2app on OSX 10.8 Mountain Lion and Python 3.2

This post is for teh Google and all poor souls trying to use py2app on Mountain Lion.

To make it short, the latest official release of py2app does not work with ML and Python 3.2; you have to get the current development snapshot. Unfortunately, py2app requires a number of smaller libraries written by its developer, Ronald Oussoren, and most of them have to be upgraded as well (and before you curse his name: he's single-handedly maintaining py2app, pyObjC and virtualenv-mac; what have you done recently for the community?).

So here's my recipe:

  1. Clone all required repos.
    Oussoren uses Bitbucket, which is better accessed through Mercurial (hg); you can get hg from your favourite package manager (Homebrew/MacPorts/Fink/whatever).
    Then:
    hg clone https://bitbucket.org/ronaldoussoren/altgraph
    hg clone https://bitbucket.org/ronaldoussoren/macholib
    hg clone https://bitbucket.org/ronaldoussoren/py2app
    
  2. Install the packages. Since you're basically tracking trunk, you should probably use the develop mode of setuptools:
    cd altgraph && python setup.py develop && cd ..
    cd macholib && python setup.py develop && cd ..
    cd py2app && python setup.py develop && cd ..
    Note that this means you'll have to keep these "source" folders available forever. If you don't like that, you should create an egg (e.g. python setup.py bdist_egg), then install it (easy_install dist/your-resulting.egg).

    For the record, altgraph will present itself as version 0.10, macholib as 0.7, and py2app as 1.5.
  3. Now you should be able to run your python setup.py py2app
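Steps 1 and 2 can be strung together with a few lines of Python 3. This is only a sketch: the function names and the injectable run parameter are hypothetical, and it assumes hg and python are on your PATH and that you start from an empty working folder.

```python
import subprocess

BASE_URL = "https://bitbucket.org/ronaldoussoren/"
REPOS = ["altgraph", "macholib", "py2app"]  # dependencies before py2app


def recipe_commands(python="python"):
    """Return (argv, working_directory) pairs; None means the current folder."""
    steps = [(["hg", "clone", BASE_URL + name], None) for name in REPOS]
    steps += [([python, "setup.py", "develop"], name) for name in REPOS]
    return steps


def run_recipe(run=subprocess.check_call):
    """Clone all three repos, then install each one in develop mode."""
    for argv, cwd in recipe_commands():
        run(argv, cwd=cwd)
```

Since check_call raises on the first failing command, a broken clone won't be followed by a pointless install attempt.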

Bonus achievement: if you're using PyQt, this version of py2app will give you Retina-ready packages, by automatically adding the NSPrincipalClass key to the generated Info.plist and setting it to NSApplication. Nice one, Ronald!

13 July 2012

MacBook Pro "Retina" Quick Review - Or How I'm Learning To Stop Worrying And Love The Mac

Today I finally picked up my new MacBook Pro "Retina" (2.6Ghz/16GB/512GB). I have to say that it's the first Mac I've ever truly wanted; I've used others in the past (the original "60s-tv" iMac, the white iBook, a few G5s and occasional MBP), but they were bought by other people for other people. So I'm not really a "Mac person": since 2002 I've mostly used Linux at home and Windows at work, and kinda abandoned Linux last year for Windows 7.

The first thing I noticed, and a big reason for the switch to The Land Of Steve, is how thin and light this MBP is. It's roughly half as thick as my Dell Latitude E6510 (which is powerful and packs a nice screen, but by God is it bulky), and probably about 35% lighter.

My second thought was about silence. I don't think I've ever owned a laptop this quiet, let alone one with an i7 CPU on board. You can't really appreciate it in a chaotic Apple Store, but when you're all alone at night, the lack of noise is incredibly refreshing. Note that I live in a quiet residential suburb; for any city-dweller this MBP can be considered completely silent.

Most things you've read elsewhere are also true: it boots faster than a phone, and you don't really need to actually shut it down unless forced by installs/updates; the screen is gorgeous (as long as you stick to updated apps like the Developer branch of Google Chrome, avoiding sucky ones like the official Twitter client) and it feels faster than any laptop I've ever used.

I have to say I couldn't notice any lag on my model until now, although I did feel it was a bit sluggish when I first tried one at the Store. I suspect the 2.3Ghz/8Gb configuration (which is what you get there, and what most reviewers have tried) is not enough to smoothly drive the über-screen; but I don't think I'll regret the choice not to shell out another £200 for the 2.7Ghz model, which comes with a larger cache (8Mb vs 6Mb).

And this is all that separates this particular Mac from other Macs out there. But what about differences between OSX and Windows? Obviously there are zillions of flame-threads on this subject, but these are my first thoughts as a switcher:

  • The damn keyboard layout. I must find a way to REMAP ALL THE THINGS!
  • No PgUp/PgDn/Del/PrtScrn (update: for PgUp/Dn use Fn-ArrowUp/ArrowDown). I'm wearing a black armband to mourn them right now.
  • Shortcuts are all fucked up. If you minimize a window and you want to bring it back, you have to Cmd-Tab back to the program, keep Cmd pressed, then press Alt before leaving Cmd. This is just Wrong. To maximize a window without going fullscreen, you have to invoke the "Zoom" feature, which often requires a custom shortcut. And so on and so forth. There is a steep learning curve for keyboard monkeys.
  • Trackpad gestures become necessary to survive. This might not be such a bad thing.
  • Any non-Apple peripheral will require an Internet connection to automatically download a half-decent, probably-uncustomizable driver. But, of course, why would you ever buy anything not Made By Steve?
  • Some DMG files, you launch them and they launch the app right there. Others will show a funny "drag app from this icon to that icon" screen. Others will open a window and expect you to know what to drag where. In comparison, Windows installers look admirably consistent.
  • Many developers (VmWare, Ascendo...) make you pay twice for the same product on a different platform. This is not nice. I should be able to deactivate a product on computer A and activate it on computer B with the same license. I'm still one person using one program.
  • gfxCardStatus, Little Snitch and iTerm2 are lovely. GPlus Tab and Facebook Tab suck.

More to follow in a few weeks; until Mountain Lion is officially out, I'm not going to deploy the full barrage of developer tools -- I'll probably format the disk anyway to install the new OS, so there's no point in wasting time now.

04 June 2012

Django 1.4 help file CHM version (and how to build your own)

UPDATE 2012-06-04: God save the Queen! Thanks to the long "Jubilee Holiday Weekend", I got around to generating an updated version for Django 1.4. Here it is: Djangodocs 1.4 in CHM format. Enjoy!

----------

This is a funny story.

I happen to think Microsoft's proprietary CHM format is lovely. So I went looking for a CHM version of docs for Django, and Google found it for me on this blog. I duly downloaded it, tried to open it and... it wouldn't display. I could only see the TOC, but not the actual documents. I thought this might be a corrupted version, and it was for an alpha release of Django anyway, so I thought I'd compile a version myself. After all, these docs are built with Sphinx, which apparently can generate all sorts of formats...

So here's the procedure to compile django's docs:

  1. download and install Sphinx.
    easy_install Sphinx
    was all I needed. Hurrah for Python.
  2. ADDITIONAL STEP for v1.4: modify _theme\djangodocs\layout.html to remove all javascript tags, otherwise you'll get jQuery-related errors in the final output. This is a known bug.
  3. Run Sphinx to generate the initial files:
    cd Django-1.1/docs
    mkdir _build/html
    %PYTHONDIR%\scripts\sphinx-build.exe -b htmlhelp -d _build\doctrees . _build\html
    
  4. Download and install Htmlhelp.exe from the Microsoft site. This will give you the HTML Help Workshop. Note: it doesn't matter if you get a final message saying you already have a more recent version.
  5. Launch the workshop, File -> Compile..., select the project file Djangodoc.hhp which should now be in _build/html, and this will produce the chm.
  6. ...??? Profit!

... Then I found out the reason that downloaded CHM didn't work was a stupid patch from Microsoft. Ouch.

Anyway, if you need it, here's the file: Djangodocs 1.1 in CHM format. If it doesn't work, make sure you follow this suggested procedure, and save yourself some time...
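For repeat builds, the Sphinx run and the compile step can be described in a few lines of Python 3. This is a sketch under assumptions: the function name is made up, hhc.exe is the command-line compiler that comes with the HTML Help Workshop (you may need its full path), and the project file is assumed to be called Djangodoc.hhp. The function only builds the two command lines, so you can inspect them before running anything.

```python
import os


def htmlhelp_commands(docs_dir, sphinx_build="sphinx-build",
                      hhc="hhc.exe", project="Djangodoc.hhp"):
    """Return the two command lines: Sphinx build, then CHM compilation."""
    out_dir = os.path.join(docs_dir, "_build", "html")
    doctrees = os.path.join(docs_dir, "_build", "doctrees")
    # Same options as in step 3: htmlhelp builder, separate doctree cache.
    build = [sphinx_build, "-b", "htmlhelp", "-d", doctrees, docs_dir, out_dir]
    # The compiler takes the generated project file as its only argument.
    compile_chm = [hhc, os.path.join(out_dir, project)]
    return [build, compile_chm]
```

Each list can be passed to subprocess.call(). As far as I know hhc.exe can return a non-zero exit status even when it succeeds, so check that the .chm file exists rather than trusting the return code.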

27 March 2012

RDP Quick Screenshots, Or: How I've Learnt To Stop Worrying And Reverse The Problem

My work involves installing stuff on customers' servers, mostly running Windows. I usually have very limited access to them, often having to go through the customers' own computers, and what I can or cannot install is regulated by strict policies (which is good practice). And of course, one wants to minimize potential problems and maximize performance, so only the minimum amount of necessary applications and tools are installed. This would all be fine, if I didn't have to take lots and lots of screenshots in order to document (and prove) what I'm doing and how I'm doing it.

This is not a problem if I can work from my laptop, where I can run a powerful app like SnagIt or Camtasia, but it's a real pain if I have to use other hardware. If it's a simple environment with a handful of machines, I can make do with the default Remote Desktop client (mstsc.exe); if I'm lucky, it'll be a modern version that supports the CTRL-ALT-+ shortcut, which takes a screenshot of the active window inside the RDP session. That's not ideal: the resulting images are large BMP files, and you have to manually paste each one into a document right after taking the screenshot; it breaks your flow and there's a good chance you'll forget to paste it right away and lose the image after some careless CTRL-C... but I guess I could live with it.

Unfortunately, I mostly have to work on environments including dozens of machines, so the only practical approach is to use an RDP manager; since I cannot install any fancy app, it usually means I have to make do with the Remote Desktop Console (tsmmc.msc) or its modern equivalent Remote Desktop Manager. That means saying bye-bye to CTRL-ALT-+ and hello PrintScreen and mspaint.exe/Edit/Crop. Argh.

Today I thought I'd solve this problem once and for all. As Bruno Oliveira eloquently illustrated in his chart, automation is The Way of The Geek, and I am a goddamn geek. Embracing my Google-fu, I set off to find The One True Tool for this task.

My first stop was QuickScreenShots. It's a simple screenshotting app that doesn't require installation; just unzip it on the server and off you go. It features shortcuts to take screenshots of an active window, arbitrary region or full desktop; images can be automatically saved to a specific folder; best of all, it's written in (ta-daaa!) Python! w00t!

Unfortunately, it doesn't feature anything similar to CTRL-ALT-+. Not a problem, I thought: where there's Python, there's a way. Except that it didn't turn out to be the case here. RDP deals in graphic screens, not desktop widgets, and it has no concept of something like "the active window"; this is what Raymond Chen himself told me, and Raymond knows a thing or two about Windows (euphemism of the month). Mstsc.exe probably uses an undocumented extension (I guess through the Virtual Channel interfaces for RDP "plugins") to get the active window, and as far as I can see, it doesn't expose the feature through automation objects (although I haven't looked very hard, to be honest; at the end of the day, I figured it would probably be inaccessible when run through tsmmc.msc anyway). At one point I've even tried to hack it by using WshShell.SendKeys to fake a CTRL-ALT-+, but somehow it didn't work (I find SendKeys quite "temperamental" and very dependent on the Windows version; on one XP image, for example, the documented {PRTSC} keycode simply wouldn't work for me).

Sad and lonely, I was almost resigned to long, intimate sessions with mspaint, when I had the most classic epiphany. I realized my problem could be easily solved by reversing the approach: instead of trying to pull screenshots through the RDP client, I could run QuickScreenShots on all machines (after all, it's portable!), inside the RDP server sessions. I just need to point the "autosave folder" to a network share and lo, all my screenshots of the active window should end up there, nicely saved as PNG. It's so easy it almost hurts, considering I've wasted a couple of hours going through MSDN, but I'm happy I've found a decent solution anyway.

19 March 2012

Simple Python script to clean up HTML produced by Excel

Here's a throwaway Python script to clean up HTML produced by Microsoft Excel 2010. I leave it here just so that I can find it later, or if anybody else has the same problem -- for some reason, I couldn't google an easy solution anywhere. I'm sure this doesn't cover all the corner cases and complex layouts, but it's a starting point showing most of the techniques you'll ever need: tag stripping, attribute stripping (either en-masse or selective), and handling crappy declarations ("<!if" tags).

It's for Python 3 (although I think it'll work almost unmodified in 2.7, you'll just have to replace open() calls with codecs.open()) and requires BeautifulSoup 4+, which really does all the magic. I don't know if it's the power of Py3k or BS getting better and better, but it's gone through a dozen files in a blink.
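For a taste of the three techniques without installing anything, here is a separate, much smaller sketch. It is not the script described above: it swaps BeautifulSoup for the standard library's html.parser, and the tag and attribute sets are illustrative guesses rather than a tested inventory of what Excel emits.

```python
from html import escape
from html.parser import HTMLParser

KEEP_ATTRS = {"href", "colspan", "rowspan"}  # selective attribute stripping
UNWRAP_TAGS = {"span", "font", "o:p"}        # drop the tag, keep its text
DROP_TAGS = {"style", "script", "xml"}       # drop the tag and its content


class ExcelHtmlCleaner(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.dropping = 0  # > 0 while inside a DROP_TAGS element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_TAGS:
            self.dropping += 1
        elif not self.dropping and tag not in UNWRAP_TAGS:
            kept = "".join(
                ' %s="%s"' % (name, escape(value or "", quote=True))
                for name, value in attrs if name in KEEP_ATTRS)
            self.parts.append("<%s%s>" % (tag, kept))

    def handle_startendtag(self, tag, attrs):
        # "<br/>"-style tags: emit the opening form only.
        if tag not in DROP_TAGS:
            self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in DROP_TAGS:
            self.dropping = max(0, self.dropping - 1)
        elif not self.dropping and tag not in UNWRAP_TAGS:
            self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        if not self.dropping:
            self.parts.append(escape(data, quote=False))

    # Comments and "<![if ...]>" declarations are never re-emitted, because
    # the parser's default handlers for them are left as no-ops.


def clean(html_text):
    """Return html_text with junk tags, attributes and declarations removed."""
    parser = ExcelHtmlCleaner()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)
```

Anything not listed in KEEP_ATTRS is discarded, so class and style attributes disappear without being named explicitly; that is the "en-masse" half of attribute stripping.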

16 March 2012

Some Useful Windows 7 Utilities

I just finished one of my periodic rounds of "Windows 7 improvements", and I thought I'd share my findings.

First, Text Editor Anywhere.
If you are familiar with the classic Firefox extension "It's All Text", then you know what this is about: TEA will launch an external editor where you can edit the contents of any text area. No more losing long posts because of some random refresh! And the joy of using all the shortcuts you love in your preferred text editor. The beauty of TEA is that it works in *any* text area, regardless of it being in a browser or a program, and you can even invoke different editors.

Another incredibly useful little app is WinLaunch.
It provides a full-screen iOS-style launcher in Windows 7, which you invoke with a custom shortcut (Shift-Tab by default). It's a fantastic way to rid your desktop of all those application icons, so that your Rainmeter skin can look fabulous without sacrificing ease of access. Now, if only I could have some sort of drawer where to drop all the files I casually drop on the desktop...

Users of the Microsoft Touch Mouse will appreciate Touch Mouse Mate.
It currently adds three features to the mouse: middle-click (tapping with three fingers), tap-to-click (IMHO the mouse is a bit too sensitive for that, but at least you have the option), and a left-handed mode. The project is open source and very active, so I expect further improvements will soon follow. Personally, I'd love to be able to define custom gestures, which is the real killer feature this mouse is missing.

And that's it! Any other utility out there that I should know about? :)

10 January 2012

Conversation silos are an anti-pattern

It looks (to me) like people just stopped commenting on blog posts. More and more, I might find some great, very technical post, just to discover it has zero comments; ironically, I'd probably got there through a link on Hacker News or Reddit or G+, where it'd have dozens (if not hundreds) of comments. The original author might or might not know about the conversation; if it's happening on one of the large portals, he'll probably find out because of the traffic surge and its side-effects (db crash, bandwidth bill, etc), but if it's happening in a smaller community he doesn't participate in, he might remain completely oblivious to its existence.

This is a sad anti-pattern, exactly as bad as Disqus; it's just another way of building information silos. It's even sadder to see this model being pushed by the geek community, who should know better. Yes, trackback/pingback and RSS have failed; but there must be a better way to interlink the debate across this bunch of URLs we call "the web".

04 January 2012

IntelliJ PyCharm 2.0 and Jython 2.2 don't really go together

UPDATE: Dmitry Jemerov from IntelliJ responded, explaining what the situation is. TL;DR: 2.2 is simply too old, other features might come if there is demand. I've amended the post to reflect this.

Let me preface this rant by saying that I've been happily using JetBrains PyCharm for a few months, and it's certainly one of the best Python IDEs out there. The price is ridiculously low and if you're serious about Python, buying PyCharm is one of the best investments you can make. It can be used for free for 30 days, so you really should give it a shot.

This said, if you happen to work with Jython 2.2, you'll probably want to use something else. The claim that Jython is fully supported as a runtime, while literally true, is somewhat stretched: it is only valid for 2.5+.

Let's say you work on Windows, and you have your Jython installed under C:\jython-2.2 (yes, it's damn old, but it's still the most widely-deployed release out there -- just ask IBM and Oracle).

You create a new project in PyCharm, then go to Settings -> Python interpreter, remove the preconfigured CPython runtime, then click on Add and point to your C:\jython-2.2\jython.bat. Bang, "SDK is not valid". The list of library paths, which is supposed to be automatically generated by the IDE picking up the environment configuration, is now empty.

Still, PyCharm should be smart enough to parse arbitrary .py files in specific directories, right? So let's click on Add... to point to C:\jython-2.2\Lib, then OK.
Now let's create a .py file, "from pprint import pprint", the module is recognised; "Run" the script, output is correct, life is good: Jython is indeed supported as a Python runtime.

Ok, let's do some file I/O, "import os"... uh, os is not recognised as a valid module. Same for sys. Apparently, they are somewhat special in Jython and are implemented directly into the main jar, so PyCharm can't see them for autocompletion or any other smart feature. I don't know what else is "special" in Jython 2.2, but I'd rather not have to find out.

Which brings us to the main shortcoming of PyCharm as a Jython IDE: it simply won't recognise or parse any Java jar. This is somewhat surprising, considering how the program is basically a spin-off of IntelliJ IDEA, a Java IDE, and is completely built on Java. In fact, it shares the codebase with the Python Plugin for IDEA. One would think PyCharm would be ideally suited to the task of handling the "Python on Java" mesh that is Jython, but alas that's not the case. A quick search on the IntelliJ forum brings up recent posts stating that full Jython support for autocompletion is simply not on the cards; Jython is supported as a runtime and nothing more. In fact, the Python plugin for IDEA probably handles a Jython setup better than PyCharm, and that's not going to change any time soon. The main target for PyCharm is clearly Django/web developers, not integrators. UPDATE: see this blog post for more details on the real situation.

This state of things is a bit saddening. I don't know if this is a way for JetBrains to avoid cannibalizing its main cash-cow (IDEA), or simply a commercial oversight; the fact is that we have a product, ideally positioned to completely own a niche, which simply refuses to do so and actually delivers a second-rate experience. I hope JetBrains will re-evaluate their stance at some point, because it's a bit of a shame really.