19 July 2014

Oracle ODBC Connection Strings - how I learnt to stop googling and RTFM

I just wasted four hours on the most idiotic thing, so I thought I'd document it here as self-reference.

Background: to connect to some Oracle db, I'm using the excellent pypyodbc module, which is a pure-Python ODBC implementation - basically a not-so-thin layer on top of your installed ODBC providers - that works great with Python 3. If you have to support multiple database vendors (in my case, Oracle, MSSQL, DB2 and maybe others), it makes sense to avoid packing a module for each product and just let ODBC work its magic.

The main problem with ODBC has always been the dark magic involved in crafting connection strings. Each driver provides different options, and when the syntax is not correct, in most cases there is precious little feedback. This is why we have sites like connectionstrings.com.

In my case, the connection string I was using worked fine with TNS names (the stuff in tnsnames.ora) like this:

Driver={Oracle in OraClient11g_home1};DBQ=myTnsServiceName; Uid=myUsername; Pwd=myPassword;

However, I did not want to rely on that particular catalog (which is often misconfigured/broken in the real world), and would rather specify the usual host, port and SID trimurti. So I went to connectionstrings.com and found the following:

Driver={Oracle in OraClient11g_home1}; Server=serverSID; Uid=myUsername; Pwd=myPassword;

... and then I spent four hours figuring out why it wasn't working. I turned on all tracing options, spent ages reading tracing logs, tried umpteen different values for SERVER... all for nought: from logs, it was clear that my SERVER option was completely disregarded and replaced with some default "orcl" values.

Desperate, I eventually thought of braving the (usually unwieldy) original driver documentation from Oracle. And lo, I found in the FAQ doc for Oracle ODBC, on page 13, a very helpful table listing all the options you can specify in a connection string. "SERVER" was nowhere to be seen. Ouch.

It turns out the trick was to keep using "DBQ" and just replace the TNS name in its value with the standard Oracle network syntax:

Driver={Oracle in OraClient11g_home1}; DBQ=myserver.mydomain.com:1521/mySid; Uid=myUsername; Pwd=myPassword;
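For self-reference, here is a minimal sketch of how that string could be assembled and handed to pypyodbc. The helper names (oracle_dbq_connection_string, connect) are mine, not part of pypyodbc, and the builder assumes none of the values contain semicolons or braces. Host, port, SID and credentials are the same placeholders used above.

```python
def oracle_dbq_connection_string(driver, host, port, sid, user, password):
    """Build an Oracle ODBC connection string that points DBQ at host:port/sid.

    Hypothetical helper: it does no escaping, so values containing
    semicolons or braces are not supported.
    """
    parts = [
        "Driver={%s}" % driver,
        "DBQ=%s:%d/%s" % (host, port, sid),
        "Uid=%s" % user,
        "Pwd=%s" % password,
    ]
    return ";".join(parts) + ";"


def connect(connection_string):
    """Open a connection through pypyodbc.

    The import lives here so the string builder can be used (and tested)
    on machines where pypyodbc or the Oracle driver is not installed.
    """
    import pypyodbc
    return pypyodbc.connect(connection_string)


conn_str = oracle_dbq_connection_string(
    "Oracle in OraClient11g_home1", "myserver.mydomain.com", 1521,
    "mySid", "myUsername", "myPassword")
print(conn_str)
```

Calling connect(conn_str) only works on a box where that exact driver name is registered with ODBC, so check the driver list in the ODBC administrator first.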

In the end, I wasted 4 hours because I thought googling would have been faster than Reading The Fine Manual. Lesson learnt.

06 June 2014

Dash docset for Python 2.2.1 (i.e. Jython for WebLogic / WebSphere)

I use Dash quite a bit, so I just spent a little bit of time creating a docset from Python 2.2.1 documentation. This old Python version matches the Jython implementation shipped with Oracle WebLogic ("WebLogic Scripting Tool", or WLST) and IBM WebSphere.

To install it in your Dash, just click on this link: dash-feed://https%3A%2F%2Fraw.githubusercontent.com%2Ftoyg%2Fpy221dashdocs%2Fmaster%2Ffeed.xml

The source script is in my GitHub repo, and you can manually download resulting packages on the Release page.

As tempting as it is, the idea to repackage webapp-specific documentation (e.g. for connect(), startEdit() etc) is a non-starter due to Oracle and IBM being quite trigger-happy with their copyright lawyers.

03 June 2014

OpenAir API shock

From the NetSuite OpenAir API documentation (PDF):

Since we are using HTTP, each connection is isolated, and must go through authorization each time. This authorization consists of sending the server an XML data structure consisting of company name, user name, and user password.

... really? In 2014? Ever heard of tokens? I'm not asking for full OAuth, but a simple header-based token mechanism is banal, faster and much more secure than sending XML with user and password for each request.

Oh, your API endpoint is a Perl script. That explains it, I guess... you are not "using HTTP", you are using CGI. Badly.

After this gem, I'm not surprised to learn that they implement simple data-retrieval actions with POST (or PUT -- what?) rather than GET, that the whole API basically consists in exposing database tables as they are, and that their XML is entirely custom. Excuse me, I think I've just thrown up in my mouth...
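To show how little is needed for the kind of simple token mechanism I have in mind, here is a rough sketch using only the Python standard library. Everything in it is hypothetical (the secret, the token layout, the function names) and it has nothing to do with how OpenAir actually works; a real service would also want revocation, key rotation and HTTPS.

```python
import base64
import hashlib
import hmac
import time

# Hypothetical server-side secret; in real life this would not live in source.
SECRET_KEY = b"server-side-secret"


def issue_token(user, now=None, ttl=3600):
    """Return a signed token for `user`, valid for `ttl` seconds."""
    start = now if now is not None else time.time()
    expires = int(start + ttl)
    payload = "{}:{}".format(user, expires).encode("utf-8")
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode("ascii") + "." + signature


def verify_token(token, now=None):
    """Return the user name if the token is genuine and unexpired, else None."""
    try:
        encoded, signature = token.split(".")
        payload = base64.urlsafe_b64decode(encoded.encode("ascii"))
        user, expires = payload.decode("utf-8").rsplit(":", 1)
        expires = int(expires)
    except ValueError:
        # Covers malformed layout, bad base64, bad UTF-8 and bad integers.
        return None
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison of the two signatures.
    if not hmac.compare_digest(signature.encode("utf-8"),
                               expected.encode("utf-8")):
        return None
    current = now if now is not None else time.time()
    return user if current < expires else None
```

The client would log in once with user and password, receive the token, and then send it in a request header (for example Authorization) on every later call, so the password crosses the wire once rather than with every request.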

25 May 2014

Python and cmd.exe on Windows - a world of pain.

As I mentioned a few days ago on Hacker News in a Ruby thread, CPython support for Windows is, overall, extraordinarily good for a runtime with clear Unix roots. That said, every now and then you'll hit a wall and find yourself cursing Guido & Bill under your breath. Yesterday was one of those times for me.

I'm currently working on a project using Python 3.3.5 on Windows Server 2008 R2, building a program that will talk to WebLogic 10.3 via the Jython-powered WLST interface (it actually does more than that: it leverages Jython to also instantiate several complex Java classes, launch VBS scripts and so on and so forth). In order to correctly set up WLST/Jython, I have to launch a batch file which in turn calls several other batch files in order to set up all sorts of environment variables. These are all pure-DOS batch files doing very little except creating or reading environment variables, but they're nested two or three levels deep from the entry-point batch.

For some reason, when I launched this batch with Popen(), variables were not set correctly. In fact, it looked like "call" statements in the batch were just silently ignored. I tried using shell=True, and it made no difference whatsoever. I put it down to some weird cmd.exe behaviour. Switching the extension from .bat to .cmd got things moving a bit more (so much for all those posts saying there is no difference between the two), but some stuff still wouldn't work, so I eventually settled for reimplementing the whole batch chain in Python (which is terrible and will likely bite me a year down the line as the version of WebLogic changes, but beggars can't be choosers).
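For the record, the approach usually suggested for this problem is to run the batch and the "set" command in the same cmd.exe invocation, then parse the resulting NAME=value dump into a dict that can be passed to Popen(env=...). The sketch below shows that idea; the function names are mine, capture_batch_env() is Windows-only, and I have not verified that it copes with the nested "call" chain that gave me grief.

```python
import subprocess


def parse_set_output(text):
    """Turn the NAME=value lines printed by cmd.exe's `set` into a dict."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        # Skip lines without '=' and lines with an empty name.
        if sep and name:
            env[name] = value
    return env


def capture_batch_env(batch_path):
    """Run a batch file and return the environment it leaves behind.

    Windows only: shell=True hands the command line to cmd.exe, so the
    batch and `set` share one shell and `set` sees the batch's variables.
    """
    output = subprocess.check_output(
        'call "{}" >nul && set'.format(batch_path),
        shell=True, universal_newlines=True)
    return parse_set_output(output)
```

The returned dict can then be handed to subprocess.Popen(..., env=captured) when launching WLST, which avoids re-implementing the batch logic by hand if it works in your setup.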

The most frustrating thing, however, was that opening an interactive pipe to test and do exploratory programming was just too difficult. There are a lot of examples out there talking about .send(), .communicate() and stdin=subprocess.PIPE, but nobody seems to mention what I experienced: as soon as you call communicate() on a cmd.exe launched with Popen(), all pipes are closed and there is no obvious way to reopen them. I don't think this is due to cmd.exe, because the last output I got was always ">More ?"; I think this is just CPython being too eager to clean up.

Luckily, I found a solution in WinPexpect, a fork of Pexpect that actually deals with Windows weirdness. Processes launched with winpexpect.winspawn() actually keep their stdin pipes up long enough for me to figure out enough stuff to fully re-implement the batch chain.

The result is a Python script three times as long as the original batch and likely to break the first time Oracle changes a line here or there. It will do for now, but the experience left a sour taste in my mouth, so to speak; cmd.exe is a crappy shell, but it shouldn't be that hard to open a long-running prompt-like process piping stuff to it. If I'm missing an obviously-better solution, please let me know and I'll happily blog about it, because clearly Google and DuckDuckGo need to know about it.
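For comparison, this is the plain-subprocess pattern I would expect to work for a long-running prompt-like process: never call communicate(), write one line at a time, flush, and read until a marker line comes back. It is a sketch under assumptions. The child here is a small Python stand-in (so the example runs anywhere), the marker string and function names are made up, and I have not confirmed that cmd.exe plays along in the same way.

```python
import subprocess
import sys

MARKER = "__DONE__"  # hypothetical end-of-reply marker

# Stand-in child: upper-cases each input line, then prints the marker.
DEMO_CHILD = [
    sys.executable, "-u", "-c",
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    print(line.strip().upper())\n"
    "    print('__DONE__')\n"
    "    sys.stdout.flush()\n",
]


def open_session(argv):
    """Start a child process with text-mode pipes that stay open."""
    return subprocess.Popen(
        argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT, universal_newlines=True, bufsize=1)


def send_line(proc, line, marker=MARKER):
    """Send one line and collect output lines until the marker shows up."""
    proc.stdin.write(line + "\n")
    proc.stdin.flush()
    output = []
    while True:
        out_line = proc.stdout.readline()
        if not out_line:  # child closed its stdout
            break
        out_line = out_line.rstrip("\r\n")
        if out_line == marker:
            break
        output.append(out_line)
    return output


def close_session(proc):
    """Close stdin so the child sees EOF, then wait for it to exit."""
    proc.stdin.close()
    code = proc.wait()
    proc.stdout.close()
    return code
```

With cmd.exe the idea would be to open the session on ["cmd.exe", "/Q"] and send something like "set FOO=bar & echo __DONE__" so the shell itself emits the marker; expect to filter out prompt text, and treat that part as untested.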

04 January 2014

How to run Twister on ARM (Debian) and OSX

I'm growing very fond of Twister, a new project from Brazilian developer Miguel Freitas. Miguel basically leveraged Bitcoin and Bittorrent concepts (and code!) to build a fully-decentralised Twitter clone. The P2P architecture means that it's completely immune to censorship and "chilling effects": nobody in the network can hand your account over to the authorities or disclose who you are, nobody can delete your posts -- and neither can you, so be careful what you post :)

It's definitely early days (it's still a bit hacky to set up, and the html interface needs work), but I've decided to help a bit, so I managed to make it work on OSX and now on my little ARM-based always-on home server, the CuBox. Like most lightweight P2P services, Twister is perfect for long-running, low-power ARM instances (Raspberry Pi etc), so this is what you'll need to know if you want to set it up there.

  1. Download all libdb4.8* packages from Bittylicious. Berkeley DB 4.8 is a Bitcoin requirement, and at this point in time there's no way around it. Unfortunately, recent distributions (i.e. Debian Wheezy and newer) replaced it with 5.1, so apt-get won't help you (NOTE: if you don't care about wallet compatibility with already-running instances, you can just use libdb5.1). You can compile libdb4.8 from source if you want, but that will require about 500 MB of dependencies (java, X etc); the Bittylicious packages worked fine for my vanilla Debian install. There's a chance that BDB as a whole will be dropped at some point in the future.
  2. Download other dependencies: openssl-dev, libboost-all-dev, miniupnpc (miniUPNPc is technically optional but strongly recommended, it makes it a breeze to go through firewalls). These should come down fine with apt-get. Note that OpenSSL should be version 1.0+ (check by running openssl version). NOTE: it's been reported that default Fedora openssl packages (built without support for Elliptic Curve crypto) will make Twister crash at the moment. Make sure you find alternatives until the bug is fixed.
  3. Clone the Twister repositories:
    git clone https://github.com/miguelfreitas/twister-core.git
    git clone https://github.com/miguelfreitas/twister-html.git
  4. Prepare your configuration (note: DON'T edit user and pwd, they're hardcoded elsewhere at the moment):
    mkdir ~/.twister
    echo -e "rpcuser=user\nrpcpassword=pwd" > ~/.twister/twister.conf
    ln -s /path/to/your/twister-html ~/.twister/html
  5. Follow these instructions to build twisterd. You might want to link it under /usr/local/bin once done.
  6. Launch it. Give it 5 to 10 minutes to download the full blockchain, then connect to http://your-server:28332/home.html . You'll be prompted to create an ID - do so.
  7. After creation, make sure you back up your key by doing:
    ./twisterd dumpprivkey my-user > my-key.txt
    Back up my-key.txt (or just its content) in a safe place, it's your all-important key: you lose that, you lose access to your ID and there is no way to get it back!
  8. You should also back up ~/.twister/user_data as it contains your direct messages; these can theoretically be retrieved if lost, but it takes a while, so it's probably good practice to just back them up regularly.
  9. Enjoy! The community is very small at the moment, so feel free to go on a follow-spree. There are two mailing lists, Twister-Users (more active) and Twister-Dev (more technical). The project is brand new (Miguel released it at the end of November 2013) so expect a bumpy ride, but the community is very friendly and in need of help with HTML and jQuery as well as C++, if you can spare some cycles :)
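If you'd rather script the boring bits of steps 2 and 4, something along these lines should do. It is only a sketch: the function names are mine, the version check simply parses the first line printed by "openssl version", and the config writer reproduces the two hardcoded lines from step 4 (it does not create the html symlink).

```python
import os
import re
import tempfile


def openssl_is_1_0_or_newer(version_output):
    """Check the text printed by `openssl version` for OpenSSL 1.0 or later."""
    match = re.match(r"OpenSSL (\d+)\.(\d+)", version_output)
    if match is None:
        return False
    return (int(match.group(1)), int(match.group(2))) >= (1, 0)


def write_twister_conf(home_dir):
    """Create <home_dir>/.twister/twister.conf with the hardcoded credentials."""
    conf_dir = os.path.join(home_dir, ".twister")
    if not os.path.isdir(conf_dir):
        os.makedirs(conf_dir)
    conf_path = os.path.join(conf_dir, "twister.conf")
    with open(conf_path, "w") as handle:
        # Same two lines as the echo command in step 4.
        handle.write("rpcuser=user\nrpcpassword=pwd\n")
    return conf_path


# Demo against a throwaway directory rather than the real home directory.
demo_home = tempfile.mkdtemp()
demo_conf = write_twister_conf(demo_home)
print(demo_conf)
```

On a real box you would call write_twister_conf(os.path.expanduser("~")) and feed openssl_is_1_0_or_newer() the output of running "openssl version".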

For OSX, the process is slightly more convoluted at the moment, so I documented it in the official doc/build-osx.md. Note that I'm not a make guru, I'm certain that there's plenty of room for improvement there -- feel free to contribute!