A Taste of Scala

August 30, 2011 at 08:33 PM | categories: scala | View Comments

Recently at the Queensland JVM Meetup group I gave a Scala introduction presentation, with some good 'ol Java bashing to boot.

Check it out. Slides here.

Read and Post Comments

Java is the new COBOL

August 03, 2011 at 08:00 PM | categories: cobol, java | View Comments

Apart from complaining about Clearcase I have actually done some real work the past few years.

Chapter 1: How we converted our entire COBOL code-base to Java (sort-of).

Already you might start to feel a sick feeling in the back of your throat. That's normal and should pass shortly after you finish reading.

COBOL? WTF?!? This is usually the first thing people say to me when I tell them about what I've been working on. I'm sad to say that COBOL is still very much in use.

The dilemma that our company faced was that our main product has been around for many years and over time we have built up a unbelievably large code-base of COBOL, consisting of millions of lines. Re-writing this is both costly and time-consuming, not to mention risky. We needed another option.

Another serious problem is that the upfront cost of CICS, the COBOL 'application server' if you will, running on dedicated Mainframe hardware, plus the cost of Micro Focus licenses, for compiling COBOL, is bloody expensive. If we could run on a 100% Java stack, using open source technologies, we could save ourselves, and our customers, cold, hard cash.

At this point I need to mention something 'special' about how we use COBOL. To support a wide-range of transaction systems and databases we developed a custom variation of the language, which included custom-built 'macros' which generate unique code depending on the environment. While not especially relevant to this article, this leads to larger-than-expected COBOL (which is large enough as it is). The size of the program is significant for a few reasons, which I'll discuss below.

Initially we started with LegacyJ, a commercial product that advertised productive COBOL to Java conversion. What was nice about using LegacyJ was that we quickly discovered that it was, in fact, possible to convert our code successfully and have a running system. However, we ran into a few serious problems that made us hesitate.

Firstly, the Java generated by LegacyJ was quite lengthy and often didn't compile due to the length of some methods and number of fields. Apparently Java has a limit, not that you would ever conceivably reach it. To work around this I had to re-parse the Java to break these methods into smaller chunks and introduce a hierarchy of classes to work-around the field limit. Yuck.

Secondly, the classes generated by LegacyJ didn't separate the idea of 'meta' information such as variable types from runtime data. For each instance of a program, variables had effectively duplicate copies of type information, resulting in an extra-large memory footprint.

The other issue, and perhaps the most compelling, was that of money; LegacyJ was not cheap. We were trading one expensive platform, CICS, with another.

At the same time the following article appeared, introducing an open-source COBOL to Java converter called NACA. I tried it almost immediately but quickly found that many of our COBOL programs didn't compile due to some commands that NACA hadn't implemented. At first I gave up and went back to our LegacyJ integration. It was only later, after taking a second look, that I realised there was much more potential on NACA's generated Java and general approach.

The most obvious was that the Java was actually readable! At least if you count this as readable. NACA actually checked-in their Java files after the conversion, so the code had to be both readable and maintainable. This also had the nice side-effect of allowing our absolutely massive generated COBOL programs to compile (in 99% of cases anyway).

In addition there was a separate and static class structure representing the program definition, meaning that each program required less memory.

I was given some time to investigate the possibility of making NACA work with our unique flavour of COBOL. Fortunately it turned out there wasn't too much missing and I managed to get a working prototype in a reasonably short period of time. After that the decision to switch to a cheaper and open-source alternative which we could control wasn't hard to make and we haven't looked back since.

To avoid making this post longer that it already is I might save the important discussion of performance for another day. In short our pure-Java application runs surprisingly quickly. The biggest bottleneck is, without a doubt, one of memory. Running an entire-COBOL runtime within the JVM is obviously costly in terms of memory, not helped by our generated COBOL and vast code-base.

Do I recommend this approach to others? Absolutely, without a doubt. There seems to be people advising against a direct port, or at least re-thinking the problem first. For us the issue is one of scale. There simply isn't enough time/money to re-write everything, at least not in this decade. We needed to do something now; something we could guarantee would continue to work.

The benefits of running a pure-Java stack are, alone, compelling. One example that springs to mind is that of tracing. Once upon a time we would need to ask customers with a bug to recompile specific applications in trace mode in the vain hope that we actually knew where the problem was. Now we can leverage powerful Java logging (no, not that useless java.util.logging) and have full tracing across the entire stack; something that is invaluable for customer support.

So, while I hate the idea of granting further life to our hideous COBOL demon, from a business point-of-view it has been crucial in the continued success and evolution of our product; giving us breathing room to slowly migrate COBOL logic to 'normal' Java applications while guaranteeing our business logic continues to serve our customers. Or at least that's what our marketing brochures say; for me it was fun.

Read and Post Comments

Blogofile

June 02, 2011 at 08:21 PM | categories: github, blogofile | View Comments

Inspired by my friend OJ, I've decided to make the switch to Blogofile. I guess I have too much time on my hands.

There isn't much I can say on this subject that hasn't already been said. This script came in handy to convert my Blogger posts across.

My biggest hurdle was finding a new host and being a complete cheapskate. I couldn't seem to find a free and simple HTML site. I started to think about Dropbox, until I realised that they don't respect index.html files. Fortunately someone had a neat idea - host it on GitHub!

One thing would have been nice is a selection of out-the-box themes; but completely understandable that there isn't. In the meantime I ported my current Blogger theme, which was surprisingly painless.

I'll probably be tweaking bits-and-pieces of the site in the next few weeks, like a child playing with a new toy. Procrastinating instead of, say, writing new blog posts.

Read and Post Comments

Who needs documentation anyway?

May 20, 2011 at 07:41 PM | categories: restructuredtext, markup, documentation | View Comments

So, we have a dilemma at work. How do the developers and documentation teams collaborate on a single set of documentation files while using familiar tools? Is such a thing possible or am I dreaming?

Currently the documentation team at our work uses Author-it, a full-blown authoring 'solution'. For some time we, the developers, were been sent Word documents which we updated and emailed back. At this point the source control fanatic in me started to twitch. It's a document - just like source code - can I see who did what changes? How do two people work on the same document at the same time, and what about conflicts? Can I have multiple version of the same document? The other problem was one of cost; Author-it is too expensive to justify individual licenses for developers.

We had talked about switching to a more text-based documentation system on-and-off for some time. Finally I cracked and decided to just do it.

Latex was the first possibility that surfaced. However, as I thought about it more the idea of everyone having to learn obtuse Latex syntax before writing a single-line of documentation was a little off-putting. Note: I really do like Latex, but I felt like it might have been overly complicated for our situation.

The second alternative was to use a publication format like Docbook or DITA. While certainly a powerful way to group content, the prospect of living in XML hell didn't enthuse me all that much. I already get enough of that in Java-land thank you very much.

My final option was to use a lightweight markup langague, like Markdown or reStructureText (ReST), which GitHub actively promotes for their README files. This felt like the right fit for us, and without further consultation I converted our current internal word documents to ReST, added them to Git and created a Jenkin job to output PDF and HTML files. Bam!

At this point us developers were happy again. We were back to using our favourite source control for collaboration, without have to stuff around with emailing Word documents. We could quickly edit text files, use source control to track/merge changes and didn't have to worry any more about niggling presentation issues. My manager, on the other hand, who was away at the time, was not so pleased. (I swear this had nothing to do with the timing of my rash decision).

For us developers the idea of using a text editor to edit a document is a fairly comfortable one. For people used to - and who enjoy using - Word this was in some ways a big step backwards. No spell checking, no auto-complete, no drag 'n' drop. All that good stuff. Another complaint he had was about presentation - the PDF output wasn't slick enough to be used officially. Finally, if we wanted to switch to something else, ReST seemed to lack in some of the more powerful concepts required to support single-source publishing.

I could definitely be wrong but it feels like nothing has quite nailed this space in the space, at least in the OSS community. What would certainly help is a powerful and easy to use editor. Perhaps a plugin for Open Office would do the trick? Alternatively a browser-based editor, not too unlike this.

We're currently in a holding pattern at the moment. Developers are still (happily) using ReST, but the 'official' stuff is being written by the documentation team. Us and them.

I see a couple of available options:
  1. What am I talking about, emailing Word documents around is fine - stop complaining.
  2. Embrace Author-it and see if it can support more collaborators. Just looking at their website briefly it seems like they have an online review mode.
  3. Meet my boss half-way, he can live without the nice GUI but we switch to a singe-source markup language like DocBook/DITA.
  4. Everyone else be damned; developers rule; reStructuredText it is!
I'm curious what other people have done in similar situations.
Read and Post Comments

Java logging and per-user tracing

May 14, 2011 at 05:20 AM | categories: logging, java, logback | View Comments

This is a fairly short post about my brief experiences with logging in Java. I realise it's 2011 and logging is very unsexy, but I thought I'd share anyway.

Let me just say this straight-up - Java Logging you are completely fucking useless. It boggles the mind how badly Sun screwed up the implementation of something so simple but yet so fundamental. It's logging for Christ's sake; how hard can it be?!? This captures the essence of my feelings, so I'll leave it at that.

One thing I did want to mention is Logback and the awesome SiftingAppender. Logback is the successor to the much loved Log4J, written by the same dude and addressing some of the problems with the original.

My manager wanted a way for our users to enable individual logging on the server and not require any meddling on the server by an administrator. A quick Google revealed this and from there it wasn't hard to implement a start/stop trace button on each page which harnesses MDC for per-user logging. On completion the trace can either be downloaded or emailed directly from the server to the system administrator or bug tracker.

Honestly, if/when I ever work on another online application I will almost certainly re-implement something very similar again. Being able to capture a trace of everything from the server for a single user session has proven to be an invaluable tool for diagnosing bugs. Give it a whirl!
Read and Post Comments