Evil JVM feature optimizes away error information

The other day I ran across a tricky bug in the Flying Saucer PDF generation library.  It’s been pretty useful so far, but it does not seem to be under active development, so it has the occasional stale bug. In this case, our designer tried to use “display: table”, which crashes Flying Saucer R8.  The workaround is to use a different “display” value, like “display: block”.

The really annoying part was researching the bug.  The HotSpot JVM includes an “optimization” that causes it to lose stack trace information for repeated exceptions in some cases: once the JIT compiler has seen certain built-in exceptions (such as NullPointerException or ClassCastException) thrown over and over from the same spot, it starts throwing a preallocated exception that carries no stack trace.  So the server log showed “Caused by ClassCastException” but without any trace.  You can disable this “optimization” with the JVM argument -XX:-OmitStackTraceInFastThrow.  I keep using scare quotes because I think there is something terribly wrong with optimizing exception control flow at the expense of debugging information.  If your application is throwing exceptions all over the place, you probably want to do something about that, and having the stack trace can make a world of difference.  In the case above, it cost me hours of debugging time (since I had never heard of this “feature”, I assumed the problem was in our codebase); once I had the stack trace, it took only a few minutes to figure out.
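To see the behavior for yourself, here is a minimal Java sketch.  It is not taken from Flying Saucer or from our codebase; the class and method names (FastThrowDemo, castToString, firstTracelessIteration, hasEmptyTrace) are made up for illustration.  It repeatedly triggers a ClassCastException from the same spot and reports the first iteration at which the exception arrives with an empty stack trace.  Whether and when that happens depends on the JVM, the JIT compiler, and the flags in use, so treat the result as something to observe rather than a guarantee.

```java
// Hypothetical demo class; none of these names come from a real library.
class FastThrowDemo {

    // Always fails for non-String arguments; this is the "hot" throw site.
    static String castToString(Object value) {
        return (String) value;
    }

    // Returns true if the throwable, or any throwable in its cause chain,
    // carries no stack frames (the symptom described above).
    static boolean hasEmptyTrace(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c.getStackTrace().length == 0) {
                return true;
            }
        }
        return false;
    }

    // Throws the same ClassCastException repeatedly and returns the index of
    // the first iteration whose exception has no stack trace, or -1 if every
    // exception still had one.
    static int firstTracelessIteration(int iterations) {
        Object notAString = Integer.valueOf(42);
        for (int i = 0; i < iterations; i++) {
            try {
                castToString(notAString);
            } catch (ClassCastException e) {
                if (e.getStackTrace().length == 0) {
                    return i;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int first = firstTracelessIteration(200_000);
        if (first < 0) {
            System.out.println("Every exception kept its stack trace.");
        } else {
            System.out.println("Stack trace first went missing at iteration " + first);
        }
    }
}
```

Running it once normally and once with -XX:-OmitStackTraceInFastThrow should show the difference on a JVM that applies the optimization.  A check like hasEmptyTrace could also be dropped into a logging path to flag traceless exceptions, so you know to go looking for the first occurrence earlier in the log.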

Personally I would rather never see this “optimization” used, but at the very least I think it should not be enabled by default.

Learning about Software Architecture

It’s not trivial to find good sources for learning about software architecture.  Google something specific, like “the foo doesn’t work in the bar system, it throws BazException” and you’re pretty likely to get a specific and correct Stack Overflow answer or blog post. But search for “software architecture” or related terms and you get some good material plus a bunch of garbage disguised as good material.

Part of the problem is that software architecture is a bit like theology.  It’s arcane, deep, and people have lots of varied and strong opinions about it. One way to save yourself some headache is to focus on reading authors whom you trust.  For example, I read some of Martin Fowler’s writing, plus I checked out the reviews of his book on Amazon, and concluded I think he has good ideas.  So I’ll add him to my personal list of architects to follow.  The work of finding trusted sources is something everyone has to do for himself; it can be a good starting point to look for popular writers, but there’s no substitute for critically evaluating the author’s ideas.

What do I look for in an architect?  Mainly, someone who is able to see the value of different solutions to a software design problem.  Design patterns are not formulas to be followed blindly: they are idioms that have benefits and costs.  The complexity costs of patterns and abstractions are often greatly underestimated, even (or especially) by good engineers.  I particularly like Fowler’s discussion of GUI patterns (linked previously).  He’s not afraid to discuss ambiguity and talk about how people sometimes use the same term to mean different things.  Architects have to be able to see and discuss the shades of grey in software design.

Problem scope

Designing code that is flexible, easy to apply for different purposes, is difficult.  Every software engineer has experienced using both highly flexible and inflexible code. Using flexible code is much easier and more fun than the alternative.  So we all learn to nod and smile when people talk about making code flexible and reusable.

But sometimes we forget the purpose of software engineering.  Not all code needs to be flexible.  Sometimes it is enough to dash off a quick piece of code that solves the current problem and be done with it.  Sometimes code needs to be a little bit flexible, but only to solve a few related problems.  The trick is to recognize the scope of the problem at hand.  We often do this by thinking of use cases and trying to identify concrete ways that the code will be used.  Use cases have to be real ways that the system will be used.  (Not cool ways that it could be used “if we did it this way.”)

The temptation for many engineers, including myself, is to take a simple problem, try to think of all the ways a solution could be extended, and try to stretch the solution too far.  But software design is a bit like a balloon: the more you stretch it, the thinner it gets, and if you go too far it may explode and leave you without anything useful.  It’s easy to waste time thinking of harder problems than the one at hand.  But every feature, every concept, every layer of indirection that you add to the solution adds to the complexity, which affects the implementation time, debugging time, and difficulty of modification in the future.

Sometimes we really do solve big problems that require highly flexible software.  But the right approach is not to “think really hard” and take a psychic stab at the answer.  Instead, bigger problems need more use cases; the number and scope of use cases is proportional to (and in fact, is) the size of the problem.  The first step in design is to identify the uses of the software as much as possible, then design software that fits those uses, and no more.

This came up the other day when I was thinking about a mini-UI-framework for an application I was creating.  It needed to be a little flexible to handle several different UI cases, but it seemed like every solution was flexible in some ways and inflexible in other ways.  Then I remembered that for the problem I was solving, I didn’t need an all-powerful framework; I just needed something to handle the half-dozen cases at hand.  I chose a simple abstraction that fit those cases, and moved on.  Any engineer could probably point at several reasons why I should have done it a different way (because of the various inflexibilities I introduced), but the fact is that I solved the problem of today, and we’ll let tomorrow worry about itself.

See related post by Kevlin Henney.

Easy text-based (XML or JSON) serialization in Scala or Java

Trying to find the right software library is always an interesting problem.  Sometimes there are only a few choices, and your constraints dictate what you use, but this week I was looking for a library to solve a rather common problem.  With so many options, I had to research and prototype with a few of the more promising ones and see how they play out.  It’s fun to compare the different approaches and find the right tool for the job.

My primary goals were:

  1. To serialize existing data objects in my Scala program
  2. Into a text-based format (JSON or XML) for easier debugging; size and speed were not an issue
  3. With a minimum amount of pain and effort; I don’t need much control over the format

I started with some simple web searching and reading other people’s comparisons of the libraries.  I found several promising options, focusing first on JSON because it’s the slimmer and hipper format.  I spent quite a bit of time on lift-json because it seemed to be a mature and popular Scala option, and it handled some simple cases rather nicely.

Unfortunately, it choked on several use cases in my program.  For one thing, it requires Maps to have String keys.  I found a supposed workaround that only handled serializing and not deserializing.  This wasn’t the last time I had trouble round-tripping data in one of these libraries.  It also struggled with complex case classes, no matter how I configured it, even though it advertised support for case classes.  And it had some Twilight Zone caching problems in sbt.

So I also tried Jackson and Google’s Gson.  Again, they seemed to work in simple cases, but as soon as I threw a “complex” case class composed of other case classes and maps at them, they couldn’t serialize and then deserialize.

Finally I landed on XStream, a mature Java library for serializing any Java object to XML and back. Thank goodness Scala lives in the Java universe. Despite the intricacies of Scala, XStream had no trouble sending an object to XML and back with a single line of code and no configuration.  The resulting XML is slightly more ugly than XML normally is, but as you see from my goals listed above, I don’t really care.  Even so, it’s not too difficult to add in some extensions similar to mixedbits-webframework and get cleaner XML.
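The test I kept applying to each library was a round trip: serialize an object to text, read it back, and check that the result equals the original.  Here is a rough sketch of that check in Java.  To keep it runnable without extra dependencies, it uses the JDK’s own java.beans.XMLEncoder and XMLDecoder as a stand-in serializer, not XStream or any of the JSON libraries above; the class and method names (RoundTripCheck, roundTrips, toXml, fromXml) are my own inventions for illustration.  The sample data is a map with non-String keys, the kind of thing that tripped up lift-json.

```java
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper class; the names are illustrative only.
class RoundTripCheck {

    // Serializes the value, deserializes the text, and compares with equals().
    static boolean roundTrips(Object value,
                              Function<Object, String> serialize,
                              Function<String, Object> deserialize) {
        String text = serialize.apply(value);
        Object restored = deserialize.apply(text);
        return value.equals(restored);
    }

    // Stand-in serializer: the JDK's XMLEncoder (writes UTF-8 by default).
    static String toXml(Object value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (XMLEncoder encoder = new XMLEncoder(out)) {
            encoder.writeObject(value);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Stand-in deserializer: the JDK's XMLDecoder.
    static Object fromXml(String xml) {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        try (XMLDecoder decoder = new XMLDecoder(new ByteArrayInputStream(bytes))) {
            return decoder.readObject();
        }
    }

    // Builds a small map with Integer keys and list values.
    static Map<Integer, List<String>> sampleData() {
        Map<Integer, List<String>> data = new HashMap<>();
        List<String> first = new ArrayList<>();
        first.add("alpha");
        first.add("beta");
        data.put(1, first);
        data.put(2, new ArrayList<>());
        return data;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> data = sampleData();
        System.out.println(toXml(data));
        System.out.println("Round trip ok: "
                + roundTrips(data, RoundTripCheck::toXml, RoundTripCheck::fromXml));
    }
}
```

Because roundTrips takes the serializer and deserializer as functions, the same check should work for any candidate library.  With XStream, for instance, I would expect passing the toXML and fromXML methods of an XStream instance to be enough, though recent XStream releases also want you to declare which types may be deserialized.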

Sometimes an older solution works perfectly well.  I wanted to believe in the JSON libraries, but they weren’t quite up to the task.  I was particularly annoyed to keep running across surprising API limitations; personally I think any exception to an advertised function should be clearly documented, as XStream does.  If you say you support case classes and maps, either support them 100% or precisely specify which kinds you don’t support.  Don’t make me install the library and play around with it to learn the truth; it breaks my trust in the software.  XStream was a delightful contrast.  It just worked out-of-the-box, even for complex cases.  And if my needs were more complex, I now trust XStream enough to invest the time to learn its API and write extensions.