04 March 2009

On The Burden Of Legacy

I recently thought it would be good to write a few utilities for one of the products I support. This product was built on DCOM (don't ask me why). Obviously, being a pythonaro, I just had to use Python. And so my tribulations began.

Python supports COM objects thanks to the lovely Win32 package by Mark Hammond. When you have a DLL that you want to use in python, run "makepy.py yourlib.dll" (or just makepy.py, and browse through the registered libraries to choose one) and lo, it's available as a first-class python module. So I do that, and start poking around with various stuff, and eventually I call a method that returns an object whose definition is in another DLL, so it only comes up as a PyIUnknown object. Ah well. I run makepy.py on the second DLL, to no avail.

After a bit of googling and fiddling, I finally stumbled on the right incantation.

myobj = win32com.client.Dispatch(myobj.QueryInterface(pythoncom.IID_IDispatch))
myobj = win32com.client.CastTo(myobj,'MyIClassName')
Basically, I have to find the right interface ID for the object, dispatch it, and then recast the object with a different type which is specified by a string. Man, isn't that ugly. Not because of having to recast (which I can understand) but because the recasting needs the class name as a string, you can't somehow derive it from the object. If you don't know the "secret" class name, you can't use its methods even if you know it's of the right type. From a modern OOP point of view, this is quite absurd... but there is an explanation.

COM was built when reflection was not yet mainstream, and types were static and immutable. Programmers were used to declare types for each variable, and if the type wasn't right, the compiler would cry. Polimorphism, reflection and dynamic interpreters were slowly reaching mainstream acceptance, but Microsoft had to retrofit them on top of existing frameworks, one feature at a time. The result was the incomplete hybrid I now have to deal with.

Well, I thought, this open-source CPython stuff in Redmond is considered akin to communism, no wonder it doesn't play well with old MS frameworks. Let's try to move further down "the Microsoft Way" and see if things improve: I shall use IronPython. After all, it's just a thin layer on top of the .Net CLR, right? Certainly it will be able to guess types a bit better...

So, a few installs later (you need at minimum the "Windows SDK 6.0" for good interop tools for .Net 2.0, plus the IronPython installer), there I was, trying to do the same thing.
The .Net version of makepy.py is called "tlbimp.exe", and it will basically build a new DLL to wrap the old one. You then import the new DLL in IronPython and lo, you can use the objects. But what happens when a function returns a reference to an object not defined in the DLL? Well, more or less the same thing as for win32com -- you get an unknown interface. So again,you have to wrap all the other DLLs, and explicitly instance these objects with the right type. Slightly better but not really that much different (so I'll stick to CPython, thank you).

As I said, this sort of explicit casting was perfectly acceptable 15 or even 10 years ago. But honestly, don't we live so much better without ? Why win32com and (more damning) IronPython cannot yet look up this sort of thing automatically? I understand that I'm working with legacy interop, but still... this sort of "programmer usability improvement" would help a lot in real-world situation where The Big Rewrite is simply not an option.

1 comment:

Michael Foord said...

The type of COM interop you are doing is early binding. There is a more convenient form available with IronPython that doesn't require the interop assemblies - late binding.

See this page on the IronPython cookbook for details:


The cookbook has other examples of COM interop from IronPython.