Planet Classpath
On Thursday night (January 8) I'll be making a short presentation about JFreeChart to the Bordeaux Java User Group. If you are in the area, register and come along.


My French is nowhere near good enough (yet) so the presentation will be in English, but I'm trying to at least write the slides in French. And I'm planning to work through a live demo in code, so it doesn't really matter what language you speak, as long as it is Java!
A new version of JFreeChart was released on the last day of 2008, but I didn't have time to announce it until now. The sample chart below highlights a couple of features that made it into this release. First, the axes now support minor tick marks (a long-outstanding feature request that has at last been implemented). And second, although JFreeChart has supported multiple axes and datasets for longer than I can remember, it hasn't been easy to add *duplicate* axes for a single dataset. That's been dealt with too.


As usual the release contains a number of bug fixes and other small changes. Refer to the NEWS file and ChangeLog for details.

Happy New Year to everyone!
I'm listening to the 'holidays 2008ish' episode of the Javaposse, and in reviewing their last year's predictions they do enough fumbling around the status of OpenJDK that I want to do a little bit of explaining. OpenJDK 6 != JDK 6...

We have moved servers; if it was done correctly, nobody will notice (except for the new server having a totally sweet favicon). But if you do happen to notice anything odd with the planet after the move, then please do yell and scream.

Elyn got me this DVD for Christmas.  Aside from being a “spaghetti eastern” (har, har), it is a lovely meditation on the universal act of eating.

This is such an unusual film that I’ve been thinking a bit about what attracts me to it.  I like the joy and the quietness of it — I like happy endings and mundane, as opposed to extreme, conflict.  Also, I enjoy how very foreign it seems… I know zilch about Japan, and for all I know a truck-driving cowboy is some kind of icon there — but here he just seems bizarre.  And, I like the film’s digressions from the main story, which are entertaining but not excessively distracting.

This is a must-see.

As anyone working on GCC would know, GCC bootstrap times are getting worse. It is so excruciating on some platforms that it is nearly impossible to keep those platforms up-to-date even if people want to. Of course, many more optimisations, new languages and their ever-bloating runtimes, more comprehensive support for language standards, etc. make it inevitable that bootstrap times increase, but do they really have to increase so much?

On my home PC, a "c,c++,java" bootstrap takes more than three hours, and a complete testsuite run takes a lot of time as well. Considering that any change to the main compiler needs a complete bootstrap and testsuite run twice over (once without and once with your patch), and that only in the best case of no regressions, is it any wonder that many people who might otherwise want to volunteer to help with GCC development just cannot afford to? I have only so much free time left after my job and my family, and many a time I feel I am much better off reading a good book or watching a good movie, for example, than literally losing sleep over GCC. Small wonder then that almost all of the prolific contributors to GCC either work on it as part of their job or have really fast machines with loads of memory (or both).

Perhaps it is not a good idea after all to have a single compiler codebase support so many languages and runtimes at the same time. Perhaps it would be better to start over by creating a well-defined (in terms of the structure and contract) set of language and platform-independent intermediate languages (different avatars of GENERIC and RTL) and have the front-ends and the back-ends as separate projects from the core framework. Of course, if things were this simple people would have done it already, but a man can dream, can't he?

(Originally posted on Advogato.)
I hate EWFL tree nodes in GCJ. So many of the ICEs (internal compiler errors) I have seen in GCJ are because some piece of code expects or doesn't expect an EWFL node. To put it simply, the current front-end wants a WFL-wrapped expression node whenever there is a need to emit a warning or an error about that expression, but not otherwise.

This can easily frustrate anyone wishing to fix some of these ICEs in the hope of making GCJ better. For example, here I am discovering that many ICEs in the Jacks testsuite occur because the body of an empty block ({}) or statement is not wrapped in an EWFL for diagnostics about unreachable statements. I find that this is trivially fixed, and then discover that the fix creates a whole mess of new ICEs on other tests, which have to be individually addressed in the same manner, potentially creating yet other ICEs in other places, ad nauseam.

To quote Jeff Law (gcc/ChangeLog.tree-ssa), "Death to WFL nodes"!

(Originally posted on Advogato.)

Have you seen this error before?

FATAL ERROR: JVMPI, an experimental interface, is no longer supported.
Please use the supported interface: the JVM Tool Interface (JVM TI).
For information on temporary workarounds contact:

For a long time now, since we released JDK 1.5, we have been warning people that the VM profiling interface JVMPI is going away. Starting with the JDK 6 update 3 release (JDK6u3), it is gone for good.

If you really need JVMPI, your best bet is to use a JDK 1.5 or older release, and also find out about transitioning to JVM TI. More often than not, you have become dependent on a tool that uses JVMPI, in which case you should try and upgrade that tool to a version that uses JVM TI instead. But if you have written your own JVMPI code, see the JVMPI transition article at for help in transitioning to JVM TI.

NOTE: Getting this message indicates that JVMPI has been requested of the JVM. A request for JVMPI must be made prior to JVM initialization, and regardless of whether JVMPI is eventually used at runtime, the mere request for it will have a negative performance impact on your Java application. In most situations, JVMPI should never be requested unless some kind of performance work is being done and slower performance is considered acceptable. JVM TI does not have many of the JVMPI limitations.

A few references of interest:


I've released IKVM 0.38 to SourceForge. The binaries are identical to the ones in release candidate 2.

Release Notes

This document lists the known issues and incompatibilities.


  • Code unloading (aka class GC) is not supported.
  • In Java, static initializers can deadlock; on .NET, some threads may see uninitialized state in cases where a deadlock would occur on the JVM.
  • JNI
    • Only supported in the default AppDomain.
    • Only the JNICALL calling convention is supported! (On Windows, HotSpot appears to also support the cdecl calling convention).
    • Cannot call string constructors on already existing string instances.
    • A few limitations in Invocation API support
      • The Invocation API is only supported when running on .NET.
      • JNI_CreateJavaVM: init options "-verbose[:class|:gc|:jni]", "vfprintf", "exit" and "abort" are not implemented. The JDK 1.1 version of JavaVMInitArgs isn't supported.
      • JNI_GetDefaultJavaVMInitArgs not implemented
      • JNI_GetCreatedJavaVMs only returns the JavaVM if the VM was started through JNI or a JNI call that retrieves the JavaVM has already occurred.
      • DestroyJVM is only partially implemented (it waits until there are no more non-daemon Java threads and then returns JNI_ERR).
      • DetachCurrentThread doesn't release monitors held by the thread.
    • Native libraries are never unloaded (because code unloading is not supported).
  • The JVM allows any reference type to be passed where an interface reference is expected (and to store any reference type in an interface reference type field), on IKVM this results in an IncompatibleClassChangeError.
  • monitorenter / monitorexit cannot be used on an uninitialized this reference.
  • Floating point is not fully spec compliant.
  • A method declared to return boolean that actually returns an integer other than 0 or 1 behaves differently (this also applies to byte/char/short and to method parameters).
  • Synchronized blocks are not async exception safe.
  • Ghost arrays don't throw ArrayStoreException when you store an object that doesn't implement the ghost interface.
  • Class loading is more eager than on the reference VM.
  • Interface implementation methods are never really final (interface can be reimplemented by .NET subclasses).
  • JSR-133 finalization spec change is not fully implemented. The JSR-133 changes dictate that an object should not be finalized unless the Object constructor has run successfully, but this isn't implemented.
  • When a java.lang.Error (or subclass) is thrown in (and escapes) a static initializer, the stack trace might be (partially) lost.

Static Compiler (ikvmc)

  • Some subtle differences with ikvmc compiled code for public members inherited from non-public base classes (so-called "access stubs"). Because the access stub lives in a derived class, when accessing a member in a base class the derived cctor will be run, whereas Java (and ikvm) only runs the base cctor.
  • Try blocks around base class ctor invocation result in unverifiable code (no known compilers produce this type of code).
  • Try/catch blocks before base class ctor invocation result in unverifiable code (this actually happens with the Eclipse compiler when you pass a class literal to the base class ctor and compile with -target 1.4).
  • Only code compiled in a single assembly fully obeys the JLS binary compatibility rules.
  • An assembly can only contain one resource with a particular name.
  • Passing incorrect command line options to ikvmc may result in an exception rather than a proper error message.
  • Under specific circumstances ikvmc may die with an exception if you're compiling code that references missing classes. As a workaround, supply the missing classes (may be stubs).
  • Under specific circumstances ikvmc may produce unverifiable code if you're compiling code that references missing classes. As a workaround, supply the missing classes (may be stubs).

Class Library

Most class library code is based on OpenJDK 6 build 12. Below is a list of divergences and IKVM-specific implementation notes.

  • Not implemented.
  • java.applet: GNU Classpath implementation. Not implemented.
  • java.awt: GNU Classpath implementation with partial System.Windows.Forms based back-end. Not supported. Not implemented.
  • java.lang.instrument: Not implemented. Not implemented. No IPv6 support implemented. Getting the default system proxy for a URL is not implemented.
  • java.text.Bidi: GNU Classpath implementation. Not supported. Partially based on GNU Classpath implementation.
  • javax.imageio.plugins.jpeg: Not implemented. Not implemented.
  • javax.print: Not implemented.
  • javax.script: Not implemented.
  • javax.smartcardio: Not implemented.
  • javax.sound: Not implemented.
  • javax.swing: GNU Classpath implementation. Not supported. Not implemented.
  • org.ietf.jgss: Not implemented.
  • sun.jdbc.odbc: Not implemented. Audio content handlers not implemented. Image content handlers not implemented.

The entire public API is available, so "Not implemented." for javax.print, for example, means that the API is there but there is no back-end to provide the actual printing support. "Not supported." means that the code is there and probably works at least somewhat, but that I'm less likely to fix bugs reported in these areas.

Specific API notes:

  • java.lang.Thread.stop(Throwable t) doesn't support throwing arbitrary exceptions on other threads (only java.lang.ThreadDeath).
  • java.lang.Thread.holdsLock(Object o) causes a spurious notify on the object (this is allowed by the J2SE 5.0 spec).
  • java.lang.String.intern() strings are never garbage collected.
  • Weak/soft references and reference queues are inefficient and do not fully implement the required semantics.
  • java.lang.ref.SoftReference: Soft references are not guaranteed to be cleared before an OutOfMemoryError is thrown.
  • Threads started outside of Java aren't "visible" (e.g. in ThreadGroup.enumerate()) until they first call Thread.currentThread().
  • java.lang.Thread.getState() returns WAITING or TIMED_WAITING instead of BLOCKING when we're inside Object.wait() and blocking to re-acquire the monitor.
  • Shared locks are only supported on Windows NT derived operating systems.
  • java.lang.SecurityManager: Deprecated methods not implemented: classDepth(String), inClass(String), classLoaderDepth(), currentLoadedClass(), currentClassLoader(), inClassLoader()

Supported Platforms

This release has been tested on the following CLI implementations / platforms:

CLI Implementation    Architecture    Operating System
.NET 2.0 SP2          x86             Windows
.NET 2.0 SP2          x64             Windows

Partial Trust

There is experimental support for running in partial trust.

It's been quite some time since I last updated the blog. The reasons are quite obvious: a heavy workload (I had to port Jogl to an OpenGL-capable VxWorks machine… quite challenging, but the Caciocavallo project helped me quite a lot; I will post some details and screenshots when I am back), and now I am on holiday in the beautiful and hot south of Italy, without access to the Internet or any PC (yay, for 20 days!)...

It's so incredible for me to go out at night at 1 a.m. and still find people around… And no need for super warm clothes!!! Drinking beer on the promenade, staring at the moonlight shadow on the sea… oh, I finally feel alive again!

Ok, I did find a PC to write this little blog post (after all, I am a geek), but at least I cannot program here: first, it's a Windows machine, so by definition no one can ever program on this crap; second, I don't have a Linux DVD at hand, so I cannot fix the software. It's already a miracle I managed to write this stuff, by the way.

I am spending some time with my family and going around taking photos, eating food, drinking wine… Meeting people…

In the idle evenings, I'm trying to learn about the beautiful CELL processor and the PlayStation 3 (yes, for gaming, but also for programming, and no, it's just because I like it, no official work/projects… yet). I really like its design: the idea of having a bunch of Synergistic Processing Elements (SPEs, 8 on the "standard" version, 6 + 1 on the PS3) controlled by a Power Processing Element.

While this makes programming quite difficult, the power unleashed is worthy of the strongest Jedi masters around. "Sadly" the GPU on the PS3 is "only" a G70; as far as I know, this means there is no CUDA support on this processor, but it is PhysX capable, which is quite important as I really think PhysX has been tuned for use on the CELL as well as the G70. Ok, the CELL already provides much of the functionality that you can obtain via CUDA/OpenCL, so perhaps it is not so important anyway. I can't wait to have one to start some hacking!!

What should go with the bundle? Probably Little Big Planet, Mirror's Edge, Dead Space, and, of course, The Force Unleashed. Feel free to google for them :)

Ok, time to eat some pandoro, so better to rush. I wish you all a very great new year!

I've checked in all the changes required to split the class library into ten different assemblies. So here is the first snapshot that contains the split binaries.

This means that the -sharedclassloader ikvmc option has been implemented, but it isn't ready for prime time yet, for now I've only focussed on getting the core class library to build with it.


  • Split IKVM.OpenJDK.ClassLibrary.dll into ten parts.
  • Added -sharedclassloader option to ikvmc.
  • Removed some GNU Classpath build leftovers.
  • Removed workaround for com.sun.beans.ObjectHandler.classForName2() that hopefully isn't necessary any more.
  • Made ikvmc emit a warning whenever it emits code that throws a hard error.
  • Fixed ikvmc to detect access to members in another assembly that expose non-public types from that assembly (the CLR doesn't allow this) and generate java.lang.IllegalAccessError (plus warning during compilation) instead of producing invalid code.
  • Volker Berlin checked in his first set of changes to replace java.awt.image.BufferedImage with the OpenJDK version.

As always with a development snapshot, don't use this in production, but please do try it out and let me know about it. The sources are available in cvs and the binaries here:

The two primary goals of making small language changes in JDK 7 are to:

  1. Make the things programmers do everyday easier.

  2. Support other platform changes in JDK 7.

Over the years, certain common coding patterns have been recognized as needlessly verbose, including:

  • if-equals-X-else-if-equals-Y testing chains on strings

  • duplicated catch blocks for different exception types

  • repeated type parameters when declaring and initializing a variable of parameterized type

These patterns can be replaced with new constructs that are more concise and more clear without fundamentally altering the language. Besides improvements to support existing Java programs, language changes should also be made to allow appropriate access to new JVM capabilities, such as those being enabled by the Da Vinci Machine project.
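To make the three patterns concrete, here is a sketch of the kind of replacements under discussion: strings in switch for the if-equals chains, a combined catch clause for duplicated exception handlers, and a "diamond" shorthand for repeated type parameters. The syntax shown is illustrative rather than final, and the class is my own example, not one from the proposal:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CoinSketch {

    // Strings in switch replaces an if-equals-X-else-if-equals-Y chain.
    static int dayNumber(String day) {
        switch (day) {
            case "monday":  return 1;
            case "tuesday": return 2;
            default:        return -1;
        }
    }

    public static void main(String[] args) {
        // Diamond shorthand: the type parameters are stated once on the
        // left instead of being repeated on the right-hand side.
        Map<String, List<Integer>> byName = new HashMap<>();
        byName.put("a", new ArrayList<Integer>());

        // One catch clause handling several exception types replaces
        // duplicated catch blocks with identical bodies.
        try {
            Integer.parseInt("not a number");
        } catch (NumberFormatException | NullPointerException e) {
            System.out.println("caught: " + e.getClass().getSimpleName());
        }

        System.out.println("tuesday is day " + dayNumber("tuesday"));
    }
}
```

Each construct says the same thing as the verbose form; the language change only removes the repetition, which is why these are "small" changes.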

While language changes can fundamentally improve the modes of expression in a language, language changes have a number of drawbacks as solutions to programming problems:

  • Slow availability: Language changes occur in platform releases, which typically only occur every few years.

  • Heavyweight: The full extent of a language change can affect multiple components of the platform.

  • Changes may be needed at multiple points in the toolchain: Even after a language change is fully available in the JDK, independent libraries and tools may need to be updated as well before the changes can be fully utilized.

Therefore, language changes are rarely the preferred solution if other workable solutions are available. Since IDEs are now commonly used for Java development, mitigating or solving problems using IDE tooling is one possibility. As of Java SE 6, compliant compilers are required to support annotation processing as standardized by JSR 269, see javax.annotation.processing and javax.lang.model. Annotation processing provides a general meta-programming framework; beyond processing annotations directly, annotation processors can be used to implement many currently extra-lingual checks based on a program's structure. Checks which previously would have required language changes can now be implemented by developers and just used by convention. JSR 308, Annotations on Java Types, would enable more detailed checking by allowing annotations in more program locations.
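As a sketch of what such an extra-lingual check might look like under JSR 269, the following processor simply emits a compiler note for every @Deprecated element it encounters. The processor class and the check itself are my own invention for illustration; in practice it would be registered via javac's -processor option or a META-INF/services entry:

```java
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.annotation.processing.SupportedSourceVersion;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.Diagnostic;

// A minimal JSR 269 annotation processor: reports a note for every
// element annotated with java.lang.Deprecated in the sources being compiled.
@SupportedAnnotationTypes("java.lang.Deprecated")
@SupportedSourceVersion(SourceVersion.RELEASE_6)
public class DeprecationNoteProcessor extends AbstractProcessor {

    @Override
    public boolean process(Set<? extends TypeElement> annotations,
                           RoundEnvironment roundEnv) {
        for (TypeElement annotation : annotations) {
            for (Element e : roundEnv.getElementsAnnotatedWith(annotation)) {
                processingEnv.getMessager().printMessage(
                    Diagnostic.Kind.NOTE,
                    "deprecated element: " + e.getSimpleName(), e);
            }
        }
        // Don't claim the annotation, so other processors still see it.
        return false;
    }
}
```

A real check would use javax.lang.model to inspect program structure (for example, flagging uses of a banned API), which is exactly the kind of convention-based checking the text describes.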

When judging whether or not any change to the platform is worthwhile, a useful notion is estimating the feature's "thrust to weight ratio," that is estimating whether the benefits of making the change exceed the full cost of implementing the change. For language changes, this metric is improved by having a larger fraction of programs potentially benefiting from the change. For example, it would be roughly the same amount of engineering to add numerical operator overloading support for classes like BigInteger and BigDecimal as to add support for bracket, "[]", syntax for Lists and Maps. Besides complications with the == operator in the numerical case, bracket syntax for Maps and Lists has much higher utility since many more Java programs use Collections than large numbers.

Especially given the maturity of the Java platform, the onus is on the proposer to make the case that a language change should go in; the onus is not on others to prove that the change should stay out.

Given the upcoming holidays, the language change proposal form and the seeding proposals will both be coming in January 2009.

Today I made my second official patch to OpenJDK. I forgot how to make the jtreg test and had to figure it out all over again, so here’s my quick and dirty guide for the future:

  1. Build jtreg. I use the IcedTea one, because it’s there:
    make jtreg
  2. Make a test root and copy your test into it:
    mkdir -p tests/tests
    touch tests/TEST.ROOT
    mv ~/ tests/tests
  3. Run the tests:
    openjdk-ecj/control/build/linux-ppc/j2sdk-image/jre/bin/java -jar test/jtreg.jar -v1 -s tests
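For reference, the test itself is just a Java source file whose leading comment carries jtreg tags; a minimal, hypothetical example of the kind of file the steps above would pick up (the class name and the check are placeholders, not a real regression test):

```java
/* @test
 * @summary Minimal example regression test for the jtreg harness.
 * @run main HelloTest
 */
public class HelloTest {
    public static void main(String[] args) throws Exception {
        // jtreg considers a "main" test passed if main returns normally,
        // and failed if it throws an exception.
        String s = "hello".toUpperCase();
        if (!s.equals("HELLO"))
            throw new RuntimeException("unexpected result: " + s);
        System.out.println("Passed.");
    }
}
```

Dropping a file like this under tests/tests is all the harness needs; the empty TEST.ROOT file marks the directory above it as the test root.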

In other news it’s over a year since I started hacking on Zero. I was hoping to be able to announce a TCK-passing build before Christmas but that’s not going to happen. Oh well.


There have been interesting questions in the comments section of my last post. I'll take the time to answer them here:

Mark Wielaard asked: Finally, do you have any speed comparisons of the hotspot/zero and cacao on arm ?

Unfortunately not. Subjectively, it feels as if the SwingSet2 demo runs better with Cacao. The Freerunner and its software are also not in good enough shape to make performance tests meaningful. Even native C applications show unacceptably slow behavior.

Xerxes Rånby said: Your combined effort have provided the key to unlock the possibility to run OpenJDK on all embedded Linux NAS storage devices, home routers and more.

Indeed, and I very much welcome people to try compiling the recipes for their MIPS devices as well. I also wonder whether OpenJDK can be compiled against uClibc and uClibc++. If that is the case, then we can also have OpenJDK+HotSpot/Zero on AVR32 (though a few more patches to the build system will be needed for this).

Andrew John Hughes wrote: What version of IcedTea did you use? I think some issues have changed/been fixed. Certainly --with-openjdk-src-dir isn't broken any more.

I used the latest from the stable release: 1.3.1. Good to hear that some issues are already fixed. I am looking forward to integrating some patches after Xmas.

Eric Herman wrote: My sense is that Java is not truly Free if I can not bootstrap it from C, and I was of the impression that I still needed a working Java in order to build Java. I look forward to learning more about the excellent work you’ve been doing.

Actually, with OpenEmbedded this is the case: OpenJDK is bootstrapped without the need for a pre-installed Java runtime. What I left out of my explanation is that all of this builds on top of my previous work on a completely self-hosting Java^H^H^H^HJava-like toolchain based on GNU Classpath, JamVM/Cacao, ECJ and eventually Jikes. I have written down the gory details in the OE wiki so that anyone can pick up my work, enhance it, or re-use it in another build environment.

Finally Wladimir Mutul asked: Why not use Scratchbox(.org) for your cross-compilation ?

Before starting with OpenEmbedded, Jalimo was only targeting the Nokia Internet tablets, and as such we built our stuff with Scratchbox. First of all: I really like the Debian way of packaging things (all of the Java-related recipes in OE resemble Debian conventions, for instance). However, three points were most important in the decision to leave the Scratchbox path:

  • At the time we worked with SB, it was not possible to get either JamVM or Cacao working reliably inside SB, so we had to use Jikes as our Java compiler. GNU Classpath adopting Java 1.5 syntax would have made it necessary to use ECJ instead, which needs a Java runtime.
  • OpenMoko appeared on the scene and we wanted to provide binaries for this platform. Additionally we had an Irex Iliad (an Ebook reader) and wanted to support it, too. OpenEmbedded built for those out of the box and it turned out that with little work OE could be modified to support Maemo as well.
  • Last but not least, with SB a lot of manual work was involved when integrating a new patch or updating to a new version. With OE it is a matter of renaming a recipe and checking which patches are still needed. Gentoo packagers do it the same way.

It should go without saying that I miss things that would be available by basing everything on Debian, e.g. source packages, direct support for Ant builds, many, many tried and tested packages, and of course GCJ. Furthermore, it isn't all bright and shiny with OE either. For instance, the project is an extremely fast-moving target. People add new stuff to it every day. Some things have effects on your recipes and may cause build problems. However, this is why projects like Poky exist, which make stable releases on top of OE snapshots.

I hope my explanations cleared things up a bit.

Happy holidays!

I was working in my yard trimming bushes when I heard a buzzing sound. At first I thought maybe one of the automatic valves had gone haywire, so I opened up the box that had the valves in it, and ran back into the house as hundreds of bees started flying around me. Later I went back and took a few pictures.

You can see where I dropped the clippers. This happened a while back but I just recently found the images on my camera while doing my pool pump blog. Here is a close up of the hive:

Underneath the bees are two pancake sized honeycombs, which I saw as I ran into the house, but wasn't able to get pictures before the bees all came back.

I was at a loss as to what to do with the bees; I didn't want to kill them. Eventually we found a local bee person who gladly came to get them, and we donated $50 to his 'save the bees' fund. These are called feral bees, and we were told that there is probably a larger hive somewhere close by, and that this was a new colony that had split off from the larger hive. Bees are disappearing, and we were glad we managed to save these. The bee person almost talked me into keeping my own beehive as a hobby; it seems like it would be an interesting one.


This really is about pool pumps, not Java. ;^)

Almost 10 years ago we upgraded to a new house, and when planning out the landscaping we decided on a pool. I'd never owned a pool before and had no idea what I was getting into. Growing up in Southern California, I spent a great deal of my childhood summers in the local city pool. The thought of having my own pool seemed like a cool idea, and it is great to have, and it looks really cool:

But... pools do not come cheap. One expense is the pool service, which runs me over $130 per month. As an expense compromise, we decided to do the yard work ourselves but pay to have the pool taken care of.

Anyway, more to the point of this blog. Pools use electric pumps to circulate and filter the water. Depending on your electric rates, the cost of running these pumps can easily run to thousands of dollars a year. In my area of California, electricity is priced in 5 tiers, with the highest (5th) tier costing twice the lowest or baseline tier. Larger homes are pretty much guaranteed to enter the 3rd tier, many get into the 4th and 5th, and the pool pump alone could cost people $4,000 per year. And that is even after cutting back the time the pump runs in the winter; these pool pumps use lots of electricity.

So about 9 months ago, a loud screeching out by the pool equipment area announced that our pool pump had lost a bearing and managed to destroy itself badly enough that it needed to be replaced. Just replacing the pump was going to be over $500, but having recently been given some variable speed pump information from Allan Freeman at Alliance Solar, I decided to wait and investigate. Luckily we have two pool pumps, one for the waterfalls and one for the pool filtering, so we had the pool people swap the pumps, temporarily giving up the waterfalls (not a big deal). This bought us some time.

Then recently Allan contacted me with an estimate to install a variable speed pool pump including the interface to the pool automation system. His estimate also included a predicted electricity savings of somewhere between $700 to $1500 per year! These variable speed pool pumps can potentially pay for themselves in roughly 2 years. So we went for it. Go green! ;^)

The new interface box is on the left of the pool control box, and the new variable speed pump is the left pump, the right one is the pump for the waterfall. The interface box is what somehow maps the pump settings to the older pool control system settings, different speeds are needed for different pumping situations.

Basically, the old pump was drawing 9 amps all the time. The new pump draws anywhere from 0.7 amps to 5 amps at the highest speed setting. The really big savings come from the fact that basic pool filtering can use the lower, if not the lowest, pump speeds, and basic filtering is probably 80% of the pump's usage. What a deal!
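A back-of-the-envelope sketch of where savings like that come from; every input here is an assumption (230 V supply, 6 hours of runtime a day, $0.30/kWh at the upper rate tiers), not a measured figure:

```java
public class PumpCost {

    // Annual running cost of a pump: amps * volts gives watts, scaled
    // to kilowatt-hours over a year and priced per kWh.
    static double annualCost(double amps, double volts,
                             double hoursPerDay, double dollarsPerKwh) {
        double kilowatts = amps * volts / 1000.0;
        return kilowatts * hoursPerDay * 365 * dollarsPerKwh;
    }

    public static void main(String[] args) {
        double oldPump = annualCost(9.0, 230, 6, 0.30); // single-speed pump
        double newPump = annualCost(0.7, 230, 6, 0.30); // variable pump at low speed
        System.out.printf("old: $%.0f/yr, new at low speed: $%.0f/yr, saving: $%.0f/yr%n",
                          oldPump, newPump, oldPump - newPump);
    }
}
```

With these assumed numbers the difference works out to roughly $1,250 a year, which is at least in the same ballpark as the $700 to $1500 estimate quoted above.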

So if you have a pool, and you want to save on your electric bill, before you go buy solar electric panels, investigate these new variable speed pool pumps. Dollar for dollar, these new pumps could pay for themselves well before solar electric panels could.

Just to note, we have had solar electric and solar pool water panels for many years:

The panels on the right are only used in the summer, heating the pool water; we had those installed probably 8 years ago. The panels on the left are solar electric panels we have had for 4 years or so. The goal of the solar electric panels was to get us out of the more expensive 4th and 5th tiers of the electricity rates, which they have done, and they save us maybe $150 a month, but the system cost close to $8,000 after all the rebates and tax credits (the rebates/credits change from year to year, so investigate this carefully before you buy anything). The panels send DC electricity to the Sunny Boy converter in the garage:

The AC electricity is mostly consumed, but if there is excess it spins the electric meter backwards (no batteries on ours), giving us a credit and in a sense using PG&E (the electric company) as our battery. We generate far less than we consume, but that was the plan when it was installed: to generate the electricity we would otherwise have paid a premium for. Of course, that's why our new pool pump won't save us as much as it would a neighbor who doesn't have solar electric panels. Still, it saves us money; it just takes much longer to get your money back from a solar electric system. Don't get me wrong, I'm glad we did it, but people need to understand that these systems cost quite a bit to have installed. First, go for the variable speed pool pump (well, assuming you have a pool). Then look into solar electric panels.

Allan Freeman and his excellent professional crew from Alliance Solar Services in Alameda installed all our solar panels and the new variable speed pump. They can be reached at (510)-523-2833 and I HIGHLY recommend them.


As Ken and Sebastian have already announced, OpenJDK integration into Jalimo is finished. However, there was a bit of work to do to not only compile OpenJDK but also package it nicely. This work is now complete as well.

Additionally, and this has not been mentioned anywhere else yet, we can now also offer Cacao+OpenJDK. So anyone who needs a decent JIT for their target platform can now build this combination, too.

Those who do not know OpenEmbedded may wonder what is so special about the work I have done in the last weeks. Well, the special thing is that we are cross-compiling OpenJDK. That means the machine on which the JDK is built is of a different kind than the one on which we want to run it later. The difficulty stems from the fact that the OpenJDK build system is not designed for this (in contrast to the one used by PhoneME, by the way).

Before I tell you about the guts of cross-compiling OpenJDK, let's enjoy some screenshots:

JTuner, Metalworks, FontDemo, NetX (warning and about dialogs), JDiskReport, SwingSet2 (OpenJDK on the Freerunner)

People may have seen these apps on their desktops, and as such they are not very exciting. However, for me they have a special meaning: as a contributor to GNU Classpath it was always my wish to be able to run any Java (=J2SE) program on any device running a Free operating system. I was contributing to the Free Swing implementation, and together with the enormous work done by Roman, Thomas, Lillian and many others we were able to run a few Swing and AWT programs. Still, the performance, completeness and correctness of our implementation were limited in many places and would have required more years of dedicated effort. Thanks to Sun releasing Java as free software and the important work done by the IcedTea team, we can now have the real thing on our devices and suddenly get 100% compatibility. :-)

Now on to the cross-compilation guts:

First of all, compared to the work that would have been required with plain OpenJDK, IcedTea made cross-compiling the thing a breeze. The best thing IcedTea provides is the ability to use a GNU Classpath-based toolchain to build OpenJDK. On the major distros, GCJ is used as the runtime and ECJ as the bootstrap Java compiler. In OpenEmbedded we have JamVM, Cacao and plain GNU Classpath as runtime options, which work equally well. JamVM does not understand some of the "-X" options, so I had to patch their use away.

Although IcedTea is a nice environment, I had to patch a few things to get started. For example, IcedTea requires you to point to a GCJ home directory and then creates symlinks to the libraries and header files contained within it. This is a problem when cross-compiling: first, the libraries provided by your system's GCJ do not make sense for your target (e.g. your system GCJ has AMD64 binaries while your target requires ARM ones). Second, in a cross-compilation environment like OpenEmbedded you do not want to rely on anything outside the environment. So I had to modify this behavior to link to the header files (jni.h and the like) provided by the OE-built GNU Classpath.

There were other problems, like IcedTea not respecting the --with-openjdk-src-dir option: it will still try to download the OpenJDK sources itself (which is not allowed in OE, because downloads are done through the environment).

All in all, I collected my changes in patches which I called 'build hacks'. I consider these hacks fixable and will work together with the IcedTea team to resolve them for future releases, in order to make cross-compiling OpenJDK even easier.

Besides IcedTea, I had to patch OpenJDK itself to get things compiled properly. The first thing that causes trouble is that you cannot choose the compiler being used: OpenJDK contains some complicated makefiles that check your system environment and then decide which compiler to use. I used good old sed to replace the respective part of the makefile. I believe this behavior could be added to IcedTea without causing any harm to other IcedTea users.

The next problem is the sanity-checking step. This one compiles a few binaries (for the target platform!), runs them (impossible when cross-compiling) and decides from their output whether the build can continue. These checks are done for CUPS and Alsa. The fix is to patch them away. I think an optional --disable-sanity-checks would be OK for IcedTea.

The final big problem with OpenJDK's build system is that it uses the result of 'uname -m' for its decisions. This is troublesome because 'uname -m' only tells you the architecture of your build machine, not the target one. I solved this by replacing all these calls with a variable and allowing IcedTea's makefile to provide a value for it.

The remaining issues were minor: some unsuitable paths here, a sizer.32 binary that I had to take from a previous non-cross build there.

Apropos the non-cross build: this was the really ambitious undertaking. But before I explain it, here are some cross-compilation basics that I learned in my years of contributing to OpenEmbedded:

Non-cross-compilation-aware projects are troublesome when they create binaries whose output is used directly in the build. The general approach to this problem is to first compile such a binary for your build system and use that in the cross-build. In OpenEmbedded this often results in a separate build recipe. If a project is cross-compilation aware, it will allow you to specify the location of such binaries; e.g. Cacao 0.99.x has the --with-cacaoh option to point to the location of the non-cross-compiled header generator. In non-cross-compilation-aware projects you need to patch the makefiles accordingly.

The IcedTea build normally consists of two builds of the JDK. The first one creates a bootstrap JDK, which is heavily patched, stripped down, and can be built using GNU Classpath-based software. In the second step the bootstrap JDK's java, javac and a few other binaries are run to build the final JDK. For our cross-compilation effort this means we need to build this bootstrap JDK for the build machine, and as said above, this is where things get troublesome.

The reason for this is that OpenJDK depends on a few libraries like CUPS, Alsa, libjpeg and giflib which I either had to provide in their native (= for the build system) form or patch their use away. Remember that I cannot just take e.g. Debian’s libcupsys-dev because that would be outside the OpenEmbedded build environment (and would make people unhappy who use OE on a different distro).

I decided to go the way of patching the bootstrap JDK build since printing and sound would not be needed for bootstrap purposes anyway. Originally I also disabled most of the AWT (and as such the libxt, xproto dependencies). However it turned out that at one point the OpenJDK build converts a bunch of GIF pictures into Java byte array source code using javax.imageio which requires the headless variant of the AWT.

A few lines above I told you that the bootstrap JDK is compiled separately for the build machine. In a simple project we could now skip this part of the cross-compilation build. However, the bootstrap JDK build also creates libraries (for the target machine) against which the final JDK links. That is why we cannot skip this part.

So cross-compiling OpenJDK consists of these major steps:

  • build a GNU Classpath-based toolchain (gjar, gjavah), ECJ and Ant
  • build the bootstrap JDK for the build machine ('make icedtea-against-ecj')
  • build the bootstrap JDK for the target machine
  • replace the binaries in bootstrap/icedtea/bin with those of your native bootstrap JDK
  • build the final JDK

and, as I wrote in the first part of this posting, you need to patch heavily to clear away the problems that stem from OpenJDK not being cross-compilation aware.

Finally a screenshot showing Cacao+OpenJDK:

cacao + openjdk

The next step will be integrating Hotspot-Shark into OpenEmbedded and of course getting as many of my patches upstream as possible.

Compared to GNU Classpath + Cacao/JamVM, the OpenJDK packages are ~6 times larger. I am curious to see how the modularization effort for Java 7 works out. With the work done now it will be much easier to follow those developments. :-)

Edit: Fixed images links.

The excellent Cairo graphics library has a simple function to draw arcs; in C it’s cairo_arc(); from java-gnome it’s Context’s arc() method, etc.

Quite unsurprisingly, it defines increasing angles as going from the positive x axis toward the positive y axis. Nothing unusual about that. The only thing that surprised me was that they even mention this in their documentation.

I should know better.

What I totally missed was the implication of this. I didn't quite clue in that the positive y direction in screen positioning and page drawing is down, and so increasing angles go clockwise. Using cr.arc() to go from 0 to, say, π/3 radians does not give a rise of 60° like I expected; it gives this:

cairo arc positive

Whoa. This is not the counter-clockwise increasing θ like we’re all used to seeing in normal Cartesian or Polar co-ordinates. But it is indeed increasing toward the positive y axis. Oops. Oh well :)
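The surprise is easy to reproduce with nothing but trigonometry. This little sketch is plain Java, not the java-gnome API; the class and method names are made up for illustration. It computes where an arc swept from 0 to π/3 ends up in screen coordinates, where +y points down:

```java
// Plain trigonometry, not the java-gnome API; the class and method names
// here are made up for illustration. Cairo computes arc points the same
// way: x grows with cos(theta), y grows with sin(theta) -- but on screen,
// growing y means moving DOWN the page.
public class ArcDirection {
    static double[] pointOnArc(double cx, double cy, double r, double theta) {
        return new double[] { cx + r * Math.cos(theta), cy + r * Math.sin(theta) };
    }

    public static void main(String[] args) {
        double[] start = pointOnArc(100, 100, 50, 0);
        double[] end = pointOnArc(100, 100, 50, Math.PI / 3);
        // The endpoint's y is larger than the start's, i.e. it sits lower
        // on the page: the "positive" sweep appears clockwise.
        System.out.printf("start (%.1f, %.1f), end (%.1f, %.1f)%n",
                start[0], start[1], end[0], end[1]);
    }
}
```

Because the endpoint's y value is larger than the start's, the endpoint sits lower on the page, which is exactly the clockwise sweep pictured.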

So I made this illustration and added it to the documentation for Context’s arc() method. Really it’s mostly about pointing out which direction positive y is, but when I’ve learned something like this the hard way, I do my best to try and incorporate that knowledge into our public API. With any luck others can be spared my folly.

Drawn with Cairo, of course!


Update: Some people have pointed out that you can use a transformation matrix, and if you happen to (say) mirror across the horizontal axis then the clockwise notion would no longer apply. Fair enough; but if you have forgotten that +y starts out going down, then you’re not going to think to do such a flip in the first place.
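To illustrate that point about transformation matrices, here is a sketch using Java2D's AffineTransform rather than Cairo (the geometry carries over): mirroring across the horizontal axis flips the sweep back to counter-clockwise.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

// Java2D's AffineTransform, not Cairo, but the same geometry: after a
// scale(1, -1) mirror, the point reached by sweeping a positive angle
// ends up above the x axis instead of below it.
public class FlipY {
    static Point2D mirrorAcrossX(Point2D p) {
        AffineTransform flip = AffineTransform.getScaleInstance(1, -1);
        return flip.transform(p, null);
    }

    public static void main(String[] args) {
        double theta = Math.PI / 3; // the 60-degree sweep from the post
        Point2D p = new Point2D.Double(Math.cos(theta), Math.sin(theta));
        Point2D q = mirrorAcrossX(p);
        // In screen terms +y is down, so p (y > 0) lies below the axis and
        // q (y < 0) lies above it: the sweep looks counter-clockwise again.
        System.out.println(p.getY() + " -> " + q.getY());
    }
}
```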

I was pleased to find a decent l’Entrecôte restaurant in east Berlin around the corner from where I was staying.

I ordered what looked like it might be a promising little Côtes du Rhône. Somewhat to my chagrin, a bottle of Côte de Beaune showed up instead. Which turned out to be delightful.

Hautes-Côtes De Beaune 2005
“Clos De La Perrière”
Domaine Parigot Père et Fils

Which just goes to show that what you ask for has little to do with what’s actually going to work with the meal you’re having.


Today we're happy to announce we are working with Robert Schuster and Sebastian Mancke of tarent and Jalimo on porting OpenJDK to the BUG.  We have watched excitedly as various developers made progress in this effort, and wanted to get involved.  A few days back Robert reported his initial success with HotSpot and OpenJDK, built from OpenEmbedded, running on an OpenMoko phone.  This will have a big impact on BUG, and on others as well.

What this means for BUG: The BUG, being a platform for building custom gadgets and applications using Java and OSGi, has been served well by PhoneME.  It's fast, has excellent support, and is very compatible and stable.  However, the BUG is a cutting-edge platform and we'd like to offer cutting-edge Java features as well.  Based on tarent's work, we plan to offer a version of BUG that has all the latest language and classpath enhancements in OpenJDK.  This future version will not have GPL classpath restrictions, and will have the potential of being tested as a compliant Java environment.  This is important for those wishing to extend the product.

What this means for Linux devices: Java on embedded devices is a controversial subject.  Some say it's too big, some say it's not "real" Java.  Everyone has an opinion.  This is an interesting time because it means we will shortly have access to the same open Java environments on mobile devices that developers have gotten used to on the desktop and server.  Embedded systems and general-purpose computer categories have been converging for a while now, and this seems like another step in that progression.

What it means for ARM Linux Distros:  It was important for BUG and tarent to do the development in the open, and to use an existing community-based build system: Poky Linux/OpenEmbedded.  As work proceeds, the customers of and contributors to open source projects and products such as Gumstix, OpenMoko, and the BeagleBoard can easily and quickly benefit.  We have been using a lot of great stuff from these projects, and it's great to be able to work with tarent in giving back.

We encourage others to get involved!  The Jalimo source repository is here:   Open source Java is a great thing, and we're really looking forward to BUG + OpenJDK! 

This year's Devoxx ended a week ago, so after catching up on my mail queue (still some way to go until Inbox Zero, but also nowhere near having to declare e-mail bankruptcy yet), flushing my Community One talk ideas out, booking my FOSDEM trip, and penning another little piece in German, it's time to reflect on the conference that was.

And it was great! It was the first time I went to Devoxx as a speaker on the regular track, so I was a little nervous about how my talk would play to an audience that I didn't really expect to be using GNU/Linux much yet. (The talk describes how we're right in the middle of progressing from a world, a few years ago, in which Java was a second-class citizen on GNU/Linux that had to be tamed and updated manually, into one in which more and more of the rich commons of Free Software and Open Source projects written in the Java language is available out of the box along with an integrated JVM, and what kinds of technological and cultural challenges await the unwary developer & packager.)

Well, was I in for a surprise! Contrary to my fears, my talk was well attended, with many attendees using Linux themselves, and some good conversations and feedback spinning out of it. In the hallways, there was the usual set of whiteboards, including one with a favourite operating system poll - Linux collected the most votes on that one. This year, familiar Fedora and Ubuntu desktop themes also showed up alongside Macs wherever notebooks were opened up to get hold of the WiFi. And thanks to Mark Reinhold's keynote session, Linux had at least 25% of the Devoxx keynote desktop operating system market share, possibly even more ...

Looking back at the conference, it seems as if 'Open Source is simply the way we do things around here these days' is quietly moving to center stage in the Java world, just as Linux seems to be gradually maturing into a good developer desktop choice without much fanfare. Many of the sessions in the conference schedule covered open source technologies - I think it's been the majority of the content this year, and I suspect it'll continue to grow further over the coming years. That's reflected in some of the conversations I had - the JavaFX ones were all about Linux support, and in particular support for use of free media formats.

One of the surprisingly interesting BOFs I attended was the JCP one, organized by Corina Ulescu. Alex Buckley, Brian Goetz and others debated how to make the specification development process more transparent to more developers without turning it into a mess. Making it easier for JUGs to join the JCP came up as one way of adding more transparency to the process; using more modern tools for collaboration came up as another. There is a lot of experimentation going on, apparently, but it's all moving in the right direction from where I stand, and seems to be increasingly driven by developers participating in the JCP themselves.

What I enjoy most about conferences like Devoxx is the opportunity to explore ideas and have conversations with old friends and to make new ones. This one was no exception, the hallway track was excellent - many thanks to Stephan & BeJUG for putting a great event together year after year!

I've been doing lots of modularization work, but I'm not quite ready to check it in yet. I've now split the class library assembly into ten assemblies and I've got "shared class loader" support working (at least for the core class library scenario).

I've been positively surprised by how many scenarios can be supported while only loading IKVM.OpenJDK.Core.dll. As the graph below shows, it has more (necessarily circular) dependencies than I would like, but for a number of scenarios I've carefully tweaked the set of classes in Core to enable lazy loading the dependencies only when they are really needed.

I'm not yet ready to commit to a supported set of “Core-only” scenarios, but here's a flavor of some things I've been able to do:

  • Running "Hello, World!" in dynamic mode
  • Serialization
  • Reflection
  • File I/O & Socket I/O (both classic and nio)

Packages included in Core:

  • java.lang
  • java.lang.annotation
  • java.lang.ref
  • java.lang.reflect
  • java.math
  • java.nio
  • java.nio.channels
  • java.nio.channels.spi
  • java.nio.charset
  • java.nio.charset.spi
  • java.util
  • java.util.concurrent
  • java.util.concurrent.atomic
  • java.util.concurrent.locks
  • java.util.regex

Here is the assembly dependency graph (click the image to enlarge):

Here are the current assembly file sizes:

IKVM.OpenJDK.Core.dll        3,278,848
IKVM.OpenJDK.Security.dll    2,646,016
IKVM.OpenJDK.Util.dll        1,111,040
IKVM.OpenJDK.Xml.dll         8,497,664
IKVM.OpenJDK.SwingAWT.dll    3,040,768
IKVM.OpenJDK.Charsets.dll    5,017,088
IKVM.OpenJDK.Corba.dll       2,335,232
IKVM.OpenJDK.Management.dll  1,180,160
IKVM.OpenJDK.Misc.dll        2,668,032
IKVM.OpenJDK.Text.dll          628,736

It's interesting to see that the total is slightly less than the previous size of IKVM.OpenJDK.ClassLibrary.dll (30,472,704).

Quite a few people have reported problems on our IRC channel building icedtea6 or its dependencies, and as the build is quite resource intensive, Caster has now made binary builds of icedtea6. The package is available via layman using:

layman -a java-overlay
emerge icedtea6-bin

The binary package should also make it easier to bootstrap the from-source build. The binaries are built in stable chroots, so they should run for our stable users too. Please report any problems, with [java-overlay] in the subject. For amd64 users this should be the easiest way to get a 64-bit browser plugin.

I’m not a big fan of UML diagrams, but in this case I think it really helps to explain how Cacio works (and to recognize its beauty ;-) ).

Caciocavallo Architecture Overview UML Diagram

Let me go from top to bottom and explain the parts that make up Caciocavallo:

  • Swing: Everybody knows it. It is a universe of its own. Basically, it builds on AWT, is implemented in 100% Java, only uses so-called lightweight (non-native) components, has a lot of Look & Feel fluff, etc.
  • AWT: Slightly less known than Swing, this dinosaur is the foundation of Swing. It provides another, so-called heavyweight set of widgets, that are usually implemented by the corresponding platform widgets, as well as the toplevel containers (windows, dialogs, frames). It’s still 100% Java, but talks to AWT peers…
  • AWT peers: a set of interfaces that are used by AWT for the platform dependent parts. AWT doesn’t care who it talks to, as long as it provides the implementations for all the widgets in AWT. OpenJDK has two implementations of the peers, one for Win32, one for X11. If you happen to have a system that has all the required widgets and stuff available, this is the place to plug in. Caciocavallo is yet another one that helps for the cases where you don’t have native widgets.
  • The Cacio peers are another set of peers for AWT. They implement all the widgets by using Swing for drawing and logic. The idea is that each AWT widget should live in its own window. This means we have the obvious toplevel windows plus nested windows for all the components and containers. This makes sure that the widgets behave as they should - heavyweight. However, this windowing behaviour is not implemented in the peers directly, but instead hides behind the PlatformWindow interface.
  • PlatformWindow is an interface that provides all the windowing behaviour that we need for the AWT widgets and toplevels. It looks very similar to ComponentPeer aggregated with WindowPeer, but there are also some differences. The PlatformWindow implementation to be used for a widget is created by a PlatformWindowFactory that lives in the CacioToolkit. If you have a system that has no widgets, but supports (nestable) windows, you implement PlatformWindow and get all the AWT widgets for free.
  • ManagedWindow is an implementation of PlatformWindow for the case where you don’t have any native support for windows, for example a plain framebuffer. It implements all the necessary windowing behaviour, including nested and overlapping windows in Java, and builds only on another interface ManagedWindowContainer. Interestingly, it implements this same interface itself. This makes sense so that windows can be nested. ManagedWindows can also be useful on systems where you have support for toplevel windows, but not for nested windows. All you need to do is to implement the ManagedWindowContainer for the topmost container (e.g. the screen or the native toplevel windows).
  • ManagedWindowContainer is a really small interface, it only has a handful of methods, the most important being getGraphics(). If you implement this correctly (e.g. the example PlatformScreen class in the diagram), you can serve everything that builds on it - you get windows, you get heavyweight widgets, and of course all the heavyweight and lightweight AWT/Swing stuff.

To summarize, there are 3 points to plug in potential implementations: at the peer level for systems with full widget sets, on the PlatformWindow level for systems w/o widgets but with windows or on the ManagedWindowContainer level for bare bones systems w/o anything. It’s also possible to start a full implementation on ManagedWindowContainer and then work the way up, because the interfaces are similar and higher-level interfaces are more or less supersets of the lower-level interfaces (it should be possible to transform a ManagedWindowContainer into a PlatformWindow, and to transform a PlatformWindow into a ComponentPeer and WindowPeer, etc). To give you a feel of the size: ManagedWindowContainer has ~4 methods, PlatformWindow ~20 methods, the whole set of peer interfaces ~100 methods.
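As a rough sketch of the lowest plug-in level, here is what a minimal ManagedWindowContainer-style backend might look like. The interface and class names echo the ones above, but the method signatures are invented for illustration and are not the actual Cacio API:

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical sketch of the lowest Cacio plug-in layer. The names echo
// the real ones, but the signatures here are illustrative only.
interface ManagedWindowContainer {
    Graphics2D getGraphics(); // the most important of its handful of methods
}

// A bare-bones "screen" backed by an off-screen image, playing the role of
// the PlatformScreen example from the diagram: the topmost container on a
// system with no native windowing support at all.
class PlatformScreen implements ManagedWindowContainer {
    private final BufferedImage framebuffer =
        new BufferedImage(640, 480, BufferedImage.TYPE_INT_RGB);

    public Graphics2D getGraphics() {
        return framebuffer.createGraphics();
    }
}

public class CacioSketch {
    public static void main(String[] args) {
        ManagedWindowContainer screen = new PlatformScreen();
        Graphics2D g = screen.getGraphics();
        g.drawRect(10, 10, 200, 100); // a managed "window" painted onto the screen
        g.dispose();
        System.out.println("painted a window onto the managed container");
    }
}
```

Everything above this layer (managed windows, widget peers, AWT, Swing) would then draw through that single getGraphics() entry point.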

Of course, this overview leaves out a lot of details, but in general it's that easy. The implementation is not complete though: the peers don't support all the widgets yet, and the managed window doesn't support every feature yet (e.g. restacking), but the general architecture is in place now, and it will most likely only change in the details.

Yesterday evening I got the managed windows in Caciocavallo to the point where they draw correctly even in the case of obscuring overlapping windows. This is pretty neat:

Caciocavallo Managed Windows

You can see that the dialog in the middle is a heavyweight window overlapping the main window beneath it, which shows an animation. This behaviour is very important for supporting toplevel windows as well as AWT heavyweight widgets.

Implementing AWT on a fullscreen system using Caciocavallo is now as easy as implementing a Java2D pipeline for that screen, and letting Cacio do all the funny stuff like handling windows, implementing the widget peers, etc. And implementing the pipeline in the easiest (unoptimized) case boils down to implementing SurfaceData for the system, which is one day of work or so. Rapid prototyping, yay!

There are some dependencies on licensed code that cannot be open sourced. We are working towards decoupling the dependencies so that the non-proprietary portions can be open sourced. Currently the JavaFX compiler, Netbeans JavaFX plugin and Eclipse JavaFX plugin are already being developed in the open source. The scene graph is out in the open. We will put the core runtime out in the open over time.

Jeet Kaul, VP of the Client Software Group at Sun Microsystems, mapping out the road ahead towards an open source JavaFX runtime.

Our friend Jennifer recommended this the other week.

Initially I put this book into the same category as The Yiddish Policemen’s Union — which is to say, tough competition.  And, while enjoyable, The Eyre Affair is not really up to the same standard; the writing is decent but not popping, the ideas are fun but, after a while, perhaps a bit obvious.

By midway through I decided that this book fits more into the genre of The Hitchhiker’s Guide.  It has a similar approach to logic and reality, and I found it enjoyable in a similar sort of way.  Where Union, improbably, is a serious book in goofy trappings, Affair makes no excuses for its goofiness — every character has an absurd, jokey name.

Affair is the first of a series.  I read the second (good as well) and the third (less good).  The series went meta — events happening inside of books in the book — and I lost my connection with the characters.  Though… even the third has some gems, like the discussion of “had had” and “that that”.

I realized after a while that sometimes I am not positive enough about the good books, or detailed enough about my reasons for liking them.  You really ought to read Union.  It is great.  Soon I Will Be Invincible is another one — I wrote about it tepidly, but it really is a must-read.

We’ve covered many of the features of python-gdb:

  • Writing new commands
  • Convenience functions
  • Pretty-printing
  • Auto-loading of Python code
  • Scripting gdb from Python
  • Bringing up a GUI

In fact, that is probably all of the user-visible things right now.  There are classes and methods in the Python API to gdb that we have not covered, but you can read about those when you need to use them.

What next?  There are a few things to do.  There are probably bugs.  As we saw in some earlier sections, support for I/O redirection is not there.  We need better code for tracking the inferior’s state.  Barring the unexpected, all this will be done in the coming months.

Now is an exciting time to be working on gdb.  There are a number of very interesting projects underway:

  • Reversible debugging is being developed.  The idea here is that gdb can record what your program does, and then you can step backward in time to find the bug.
  • Sérgio Durigan Júnior, at IBM, has been working on syscall tracing support.  This will let us do strace-like tracing in gdb.  What’s nice about this is that all the usual gdb facilities will also be available: think of it as a Python-enabled strace, with stack dump capability.
  • The excellent folks at Code Sourcery (I would name names, but I’m afraid of leaving someone out) are working on multi-process support for gdb.  This is the feature I am most looking forward to.  In the foreseeable future, gdb will be able to trace both the parent and the child of a fork.  The particular “wow” use-case is something I read on the frysk web site: run “make check” in gdb, and have the CLI fire up whenever any program SEGVs.  No more futzing with setting up the debug environment!  In fact, no more figuring out how to get past libtool wrapper scripts — we could add a little hack so that you can just run them in gdb and the right thing will happen.

Naturally, we’ll be wiring all this up to Python, one way or another.

I’ve also got some longer-term plans for the Python support.  I’m very interested in extending gdb to debug interpreted languages.  As with most computer problems, this means inserting a layer of indirection in a number of places: into expression parsing, into symbol lookup, into breakpoints, into watchpoints, etc.  The goal here is to be able to write support for, say, debugging Python scripts, as a Python extension to gdb.  Then, users could switch back and forth between “raw” (debugging the C implementation) and “cooked” (debugging their script) views easily.

I have two basic models I use when thinking about python-gdb: valgrind and emacs.

Emacs is a great example of managing the split between the core implementation and scripts.  Emacs developers prefer to write in elisp when possible; the core exists, more or less, to make this possible for a wide range of uses.  I’m trying to steer gdb in this direction.  That is, push Python hooks into all the interesting places in gdb, and then start preferring Python over C.  (Mozilla might have been another good example here — but I am more familiar with Emacs.)

Naturally, we’ll pursue this with extraordinary wisdom and care.  Cough cough.  Seriously, there are many areas of gdb which are not especially performance sensitive.  For example, consider the new commands we wrote during this series.  Even support for a new language would not require anything that could not be comfortably — and excellently — done in Python.

Valgrind taught me the Field of Dreams model: even a fairly esoteric area of programming can attract a development community, provided that you build the needed infrastructure.  In other words, just look at all those cool valgrind skins.  This library orientation, by the way, is something I would like to see GCC pursue more vigorously.

I’m very interested to hear your feedback.  Feel free to post comments here, or drop us a line on the Archer list.

We’ve come to the end of this series of posts.  I’m sad to see it end, but now it is time to stop writing about python-gdb features, and to go back to writing the features themselves.  I’ll write more when there is more to be said.

Soon a project will be starting to consider adding a to-be-determined set of small language changes to JDK 7. Given the rough timeline for JDK 7 and other on-going efforts to change the language, such as modules and annotations on types, only a limited number of small changes can be considered for JDK 7. That does not imply that larger changes aren't appropriate or worthwhile at some point in the future; in the mean time such changes can be explored and honed for JDK 8 or later.

Separate from its size, criteria to evaluate the utility of a language change will be discussed in a future blog entry.

The JCP process defines three deliverables for a JSR:

  • Specification.

  • Reference Implementation

  • Compatibility Tests

These three distinct aspects of a language change, specification, implementation, and general testing, exist whether or not the change is managed under a JSR. For this project, a language change will be judged small if it is simultaneously a small-enough effort under all three of specification, implementation, and testing. In other words, if a change is medium-sized or larger in a single area, it is not a small change. (This corresponds to using an infinity norm to measure size; see "Norms: How to Measure Size".) Another concern is the size of the change for developers, but if the change is small in these three areas, it is likely to be small for developers to learn and adopt too. Because there is limited fungibility between the people working on specification, implementation, and testing, a single oversized component can't necessarily be compensated for by the other two components being small enough to manage on their own.
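As a toy illustration of the infinity-norm rule (the numeric sizes below are invented purely for illustration), the overall size of a change is the maximum of its per-area sizes:

```java
// A toy model of the sizing rule: a change's overall size is the infinity
// norm (the maximum) of its specification, implementation, and testing
// sizes. The numeric scale here is invented purely for illustration.
public class ChangeSize {
    static int overallSize(int spec, int impl, int test) {
        return Math.max(spec, Math.max(impl, test));
    }

    public static void main(String[] args) {
        // A tiny spec change that needs a big implementation is still big:
        System.out.println(overallSize(1, 8, 2));
        // A change that is small in every area is genuinely small:
        System.out.println(overallSize(2, 2, 1));
    }
}
```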

The size of a specification change is not just related to the amount of text that is altered; it also depends on which text, how many new concepts are needed, and the complexity of those concepts. Similarly, the implementation effort can be large if a limited amount of tricky code is involved as well as if a large volume of prosaic code is needed. An estimate of the future maintenance effort should factor into judging the net implementation cost too. The specification size and implementation size are often not closely related; a small spec change can require large implementation efforts and vice versa. JCK-style conformance testing is based on testing assertions in the specification, so the size of this kind of testing effort should have some positive correlation with the size of the specification change. Likewise, regression testing should have at least a weak positive correlation with the size of the implementation change. However, adequate conformance testing can be disproportionately large compared to the size of the specification change depending on how the assertions interact and how many programs they affect.

Due to complexity of the Java type system and the desire to maintain backwards compatibility, almost any type system change will be at least a medium-sized effort for the implementation, specification, or both. Each new feature of the type system can interact with all the existing features, as well as all the future ones, so type system changes must be approached with healthy skepticism.

As a point of reference, the set of Java SE 5 language features will be sized according to the above criteria; from smallest to largest:

  • Normal maintenance, Size: Tiny
    In the course of maintaining the platform, small changes and corrections are made to the Java Language Specification (JLS) and javac. These changes, even taken together, are not large enough to warrant a JSR separate from the platform umbrella JSR.

  • Hexadecimal floating-point literals, Size: Very small
    Hexadecimal floating-point literals were a small new feature added to the language in JDK 5 under maintenance. Only very localized grammatical changes were needed in the JLS together with well-bounded supporting library methods.

  • for-each loop, Size: Small
    Part of JSR 201, the enhanced for statement required a new section in the JLS and a straightforward desugaring by the compiler. However, there were still complications; calamity was narrowly averted in the new libraries needed to support the for loop. A new java.lang.Iterator type that would have broken migration compatibility was dropped in favor of reusing the less than ideal java.util.Iterator.

  • static import, Size: Small, but more complicated than expected
    Static import added more ways to influence the mapping of simple names in source code to the binary names in class files. The mapping already had complexities, including rules for hiding, shadowing, and obscuring; static import introduced more interactions.

  • enum types, Size: Medium
    By introducing a new kind of type, enums entailed a type system modification and so were a medium-sized change. While the normative JLS text devoted to enums is brief, JVMS changes were also required, as well as surprisingly time-consuming and intricate libraries work, including interactions with IIOP serialization.

  • autoboxing and unboxing, Size: Medium
    The complications with autoboxing and unboxing come not from the feature directly, but from its interactions with generics and method resolution.

  • Annotation types, Size: Large
    Just as enums were a new kind of specialized class, annotation types, introduced in JSR 175, were a new kind of specialized interface. Besides being a type system change, annotation types required coordinated JVM and library modifications as well as a new tool and framework, and a subsequent standardization, to fulfill the potential of the feature.

  • Generics, Size: Huge
    Generics were a pervasive change to the platform, introducing many new concepts in the specification, considerable change to the compiler, and far-reaching libraries updates.

Some examples of bigger-than-small language changes that have been discussed in the community include:

  • BGGA closures: Independent of the technical merit of the proposal, BGGA closures would be a large change to the language and platform.

  • Properties: While a detailed judgment would have to be made against a specific proposal, as a new kind of type, properties would most likely be at least medium-sized.

  • Reification: The addition of information about the type parameters of objects at runtime would involve language changes and nontrivial JVM changes to maintain efficiency, and would raise compatibility issues.

Specific small language changes we at Sun are advocating for JDK 7 will be discussed in the near future.

The last couple of days I came around to work a little more on Caciocavallo. I added two notable features:

Event Handling

Event handling is now done in Caciocavallo itself. We now have a generic event pump that pulls event data out of a CacioEventSource implementation. CacioEventSource is an interface that has to be provided by the target implementation and delivers event data. This data is then processed, possibly transformed in places, and eventually an AWT event is generated from it and posted to the AWT event queue. On the target side, only the CacioEventSource has to be implemented, which usually simply polls the native event queue and fills in the event data.

Managed Windows

Cacio will soon support ‘managed windows’; the code only needs some cleanup before committing. A little background is probably in order: the idea in Cacio is that all the AWT widgets live in their own heavyweight windows. Those windows are nested and laid out according to the structure and layout of the heavyweight AWT widgets. The peers implement the painting and logic using Swing, and delegate the windowing work to an interface called PlatformWindow. A target implementation basically only needs to implement the PlatformWindow interface and gets all the AWT widgets for free.

But what if your target system doesn’t support any windows? Think of a plain framebuffer, for example. In this case you need to implement the windowing logic yourself. To make this easier, I added a generic ‘window manager’ (it’s not like the X11 window managers; it’s more like what the X server itself does: handle rectangular, possibly nested areas on the screen). It implements the PlatformWindow interface and only needs an implementation of an even simpler interface (called ManagedWindowContainer) as the backend.

This will also take care of a lot of the event work: the target implementation only needs to provide mouse and keyboard events, and the window manager will generate focus, window and component events for you.

With this window manager in place it will be possible to support all kinds of setups I can think of:

  1. A fullscreen target with no window support at all. The window manager then manages all the toplevel AND nested windows.
  2. Basic toplevel window support on the target. Let your target handle the toplevel windows and use the window manager to handle the nested windows.
  3. Full window support on the target (think X11). You don’t need the window manager then and implement PlatformWindow directly.

Last time I promised something flashy in this post.  What could be flashier than a GUI?

Here’s some code to get you started:

from threading import Thread

import gdb
import gtk

def printit ():
    print "Hello hacker"

class TestGtkThread (Thread):
    def destroy (self, *args):
        gtk.main_quit ()

    def hello (self, *args):
        gdb.post_event (printit)

    def run (self):
        gtk.gdk.threads_init ()

        self.window = gtk.Window(gtk.WINDOW_TOPLEVEL)
        self.window.connect("destroy", self.destroy)

        button = gtk.Button("Hello World")
        # connects the 'hello' function to the clicked signal from the button
        button.connect("clicked", self.hello)
        self.window.add(button)

        self.window.show_all()
        gtk.main()


class TestGtk (gdb.Command):
    def __init__ (self):
        super (TestGtk, self).__init__ ("testgtk", gdb.COMMAND_NONE)
        self.init = False

    def invoke (self, arg, from_tty):
        if not self.init:
            self.init = True
            v = TestGtkThread()
            v.setDaemon (True)
            v.start ()

# instantiate to register the command with gdb
TestGtk ()


Note that we finesse the problem of main loop integration by simply starting a separate thread.  My thinking here is to just use message passing: keep gdb operations in the gdb thread, and gtk operations in the GUI thread, and send active objects back and forth as needed to do work.  The function gdb.post_event (git pull to get this) arranges to run a function during the gdb event loop; I haven’t really investigated sending events the other direction.
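The message-passing idea can be sketched without gdb or gtk at all. Here is a minimal stand-alone illustration in modern Python, with a plain queue standing in for both event loops (none of these names are gdb API; the pattern is the point):

```python
# Illustrative sketch of cross-thread message passing, using only the
# standard library: each side owns a queue, and work is shipped across
# as callables, much as gdb.post_event ships work into gdb's loop.
import queue
import threading

gui_queue = queue.Queue()   # events destined for the "GUI" thread
results = []

def gui_loop():
    # Stand-in for the gtk main loop: drain callables until told to stop.
    while True:
        fn = gui_queue.get()
        if fn is None:
            break
        fn()

gui = threading.Thread(target=gui_loop)
gui.start()

# The "gdb" thread posts work instead of touching GUI state directly.
gui_queue.put(lambda: results.append("hello from the gdb side"))
gui_queue.put(None)   # shut the loop down
gui.join()

print(results[0])   # -> hello from the gdb side
```

The same discipline applies in both directions: state owned by a thread is only ever touched by callables running on that thread.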

The above isn’t actually useful — in fact it is just a simple transcription of a python-gtk demo I found somewhere in /usr/share.  However, the point is that the addition of Python cracks gdb open: now you can combine gdb’s inferior-inspection capabilities with Python’s vast suite of libraries.  You aren’t tied to the capabilities of a given gdb GUI; you can write custom visualizers, auto-load them or load them on demand, and use them in parallel with the CLI.  If your GUI provides a CLI, you can do this without any hacks there at all; for example, this kind of thing works great from inside Emacs.

The next post is the final one in this series, I’m sorry to say.

I've been following Bug Labs' choice of JVM quite closely. After a series of comparisons between JamVM, CacaoVM and PhoneME they adopted PhoneME (initial test here and the follow-up). I blogged on the results of the first test, which were favourable to JamVM. However, for the second test, they sorted out the problems with running PhoneME's JIT, and the positions of JamVM and PhoneME were reversed.

This was disheartening, but the results spoke for themselves. However, one odd fact is that the second test did not give any details of start-up time. JamVM clearly won this in the first test, and it's unlikely enabling PhoneME's JIT would have changed this.

So, I read with great interest the recent blog entry where they've got CacaoVM/GNU Classpath running on the BUG. It appears they will still ship with PhoneME, but CacaoVM/GNU Classpath will be an option for customers who require the Classpath exception.

So what? Well, I'd like an explanation why they seem so reluctant to use JamVM. From their own tests, JamVM came out on top for start-up, and came second in performance to PhoneME with JIT.

Perhaps they've finally cracked the performance problems with CacaoVM. But JamVM is not configured for top performance on ARM either (by default, the inlining interpreter is disabled on ARM).

Of course, there are many other advantages to JamVM on embedded systems besides start-up time. It has its own compacting garbage collector with full support for soft, weak and phantom references in addition to class unloading. CacaoVM relies on the Boehm GC, which can exhibit memory fragmentation issues, and it has no support for soft/weak/phantom references or class unloading.

Things like this make me very disheartened. As I've said before, it makes me wonder why I continue to work on JamVM at all. However, giving up would be a case of "cutting my nose off to spite my face".

If they've hit any problems with JamVM I'll be quite happy to work with them to fix them, but I've received no feedback or requests. Unfortunately, I have been unable to leave any comments on the blog entry itself. On enquiring with the webmaster, it appears that this is new software which is at an early stage. However, they've put this functionality at the top of their TODO list, and I can expect it in a day or two (thanks Brian).

To finish on a positive note, I've done quite a lot of work on JamVM over the last few months, including memory footprint and performance improvements over JamVM 1.5.1. Hopefully I'll make a new release before Christmas.

I see it! It actually exists! Yes Virginia, There Is a Santa Claus!

Actually that's a whale we saw in Alaska this summer. No I'm not trying to insult Santa. ;^)

Seriously, we have our first cut at generated OpenJDK6 repositories:

You can browse all 7 repositories:

Or you can clone them, or do a forest clone to get the entire forest:

hg fclone yourjdk6
(See the OpenJDK Developer Guide for more information on how to setup Mercurial and the forest extension).

A few important notes:

  • These should be treated as experimental and read-only, official ones should be next week
  • They should match the contents of the OpenJDK6 source bundles, except:
    • No control directory, these files are in the top repository now
    • Previously you had to 'cd control/make && gnumake', now just 'cd . && gnumake'
    • README-builds.html is in the top repository; its movement has created a little confusion in the changesets, but ultimately we will have one copy.
  • Contributed changes should be documented in the changeset comments, if the contribution information is missing please let me know
  • These repositories were created from the TeamWare workspaces and a set of patches and documentation on those patches, we may have to re-create them. If we re-create repositories again, the old ones will not be related to the new ones. So any changesets you create with your clones should be viewed as temporary until the final repositories are put in place.
  • The hotspot repository may be completely replaced when we upgrade to HS14, so when that happens you may need to re-clone the hotspot repository.

Please let me know if you see anything wrong with these repositories.

The target date for official repositories is next week, once it is official we can add more changesets to correct problems, but we can't go back and change the changesets already created.


So far we’ve concentrated on ways to use Python to extend gdb: writing new commands, writing new functions, and customized pretty-printing.  In this post I want to look at gdb from a different angle: as a library.  I’ve long thought it would be pretty useful to be able to use gdb as a kind of scriptable tool for messing around with running programs, or even just symbol tables and debug info; the Python work enables this.

One word of warning before we begin: we’re starting to get into the work-in-progress parts of python-gdb.  If you play around here, don’t be surprised if it is not very polished.  And, as always, we’re interested in your feedback; drop us a line on the Archer list.

For historical and technical reasons, it is pretty hard to turn gdb into an actual loadable Python library.  This might be nice to do someday; meanwhile we’ve made it possible to invoke gdb as an interpreter: add the “-P” (or “--python“) option.  Anything after this option will be passed to Python as sys.argv.  For example, try this script:

#!/home/YOURNAME/archer/install/bin/gdb -P
print "hello from python"

Ok… so far so good.  Now what?  How about a little app to print the size of a type?

#!/home/YOURNAME/archer/install/bin/gdb -P
import sys
import gdb
gdb.execute("file " + sys.argv[1])
type = gdb.Type (sys.argv[0])
print "sizeof %s = %d" % (sys.argv[0], type.sizeof ())

You can script that with gdb today, though the invocation is uglier unless you write a wrapper script.  More complicated examples are undeniably better.  For instance, you can write a “pahole” clone in Python without much effort.
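To give a flavor of what such a "pahole" clone computes, here is a gdb-free sketch of just the hole-finding arithmetic. In a real script the (name, offset, size) tuples would be pulled from a gdb.Type's fields; here they are hard-coded, and `find_holes` is my own hypothetical helper, not gdb API:

```python
# A gdb-free sketch of the arithmetic a "pahole" clone performs.
def find_holes(fields, total_size):
    """Return (start, length) pairs for the padding gaps in a struct.

    fields is a list of (name, offset, size) tuples, in any order;
    total_size is sizeof the whole struct.
    """
    holes = []
    expected = 0
    for name, offset, size in sorted(fields, key=lambda f: f[1]):
        if offset > expected:                    # gap before this field
            holes.append((expected, offset - expected))
        expected = offset + size
    if total_size > expected:                    # trailing padding
        holes.append((expected, total_size - expected))
    return holes

# struct { char c; /* 3-byte hole */ int i; short s; /* 2-byte pad */ }
fields = [("c", 0, 1), ("i", 4, 4), ("s", 8, 2)]
print(find_holes(fields, 12))   # -> [(1, 3), (10, 2)]
```

Wiring this to gdb is then just a loop over `type.fields()` to build the tuples.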

That invocation of gdb.execute is a bit ugly.  In the near future (I was going to do it last week, but I got sick) we are going to add a new class to represent the process (and eventually processes) being debugged.  This class will also expose some events related to the state of the process — e.g., an event will be sent when the process stops due to a signal.

The other unfinished piece in this area is nicer I/O control.  The idea here is to defer gdb acquiring the tty until it is really needed.  With these two pieces, you could run gdb invisibly in a pipeline and have it bring up the CLI only if something goes wrong.

It will look something like:

#!/home/YOURNAME/archer/install/bin/gdb -P
import sys
import gdb

def on_stop(p):
  (status, value) = p.status
  if status != gdb.EXIT:
    gdb.cli ()
    sys.exit (value)

process = gdb.Inferior(sys.argv)
process.connect ("stop", on_stop)

I’ll probably use python-gobject-like connect calls, unless Python experts speak up and say I should do something different.

The next post will cover a flashier use of Python in gdb.  Stay tuned.

Thanks to Kelly, trial Mercurial repositories for OpenJDK 6 are now available for evaluation. These trial repositories will be available read-only for about a week to find any problems before creating the final live repositories; for details, see Kelly's email to the jdk6-dev alias.

In order to support a partner, we needed to get a JVM on the BUG that had the commercial-friendly Classpath exception clause to the GPL.  Unfortunately phoneME does not have this; however, GNU Classpath does.  I know of two JVMs we could use: JamVM and CACAO.  The Jalimo people have done a good job updating OpenEmbedded with the latest CACAO sources.  Marcin completed the work in getting a build image from sources with CACAO and all the BUG OSGi code including JNI support, and poof!  Java 1.5 on BUG!  Of course we will continue to ship phoneME on the BUG, but it's great to give our customers and collaborators the option of dropping something else in on a whim.  Without the OpenEmbedded, Jalimo, GNU Classpath, CACAO, and other Java FOSS communities we could never have done this!

root@bug:~# java -version

java version "1.5.0"

CACAO version 0.99.3 Copyright (C) 1996-2005, 2006, 2007, 2008 CACAOVM - Verein zur Foerderung der freien virtuellen Maschine CACAO This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


Recently, we've been working to raise the quality bar for the code in the OpenJDK langtools repository.

Before OpenJDK, the basic quality bar was set by the JDK's product team and SQE team. They defined the test suites to be run, how to run them, and the target platforms on which they should be run. The test suites included the JDK regression tests, for which the standard was to run each test in its own JVM (simple and safe, but slow), and the platforms were the target platforms for the standard Sun JDK product.

Even so, the bar was somewhat higher in selected areas. The javac team has pushed the use of "same JVM" mode for running the javac regression tests, because it is so much faster. Starting up a whole JVM to compile a three line program to verify that a particular error message is generated is like using a bulldozer to crack an egg. Likewise, as a pure Java program, it has been reasonable to develop the compiler and related tools, and to run the regression tests, on non-mainstream supported platforms.

With the advent of OpenJDK, the world got a whole lot bigger, and expectations got somewhat higher, at least for the langtools component. If nothing else, there's a bigger family of developers these days, with a bigger variety of development environments, to be used for building and testing OpenJDK.

We've been steadily working to make it so that all the langtools regression tests can be run in "same JVM" mode. This has required fixes in a number of areas:

  • in the regression test harness (jtreg)
  • in tools like javadoc, which used to be neither reusable nor re-entrant; this made it hard to test it with different tests in the same VM. javadoc is now reusable, and re-entrancy is coming soon
  • in the tests themselves: some tests we changed to make them same-VM safe; others, like the apt tests, we simply marked as requiring "othervm" mode. Marking a test as requiring "othervm" allows these tests to succeed when the default mode for the rest of the test suite is "samevm".

We've also made it so that you can run the langtools tests without building a full JDK, by using the -Xbootclasspath option. For a while, that left one compiler test out in the cold, but that test was finally rewritten recently.

We've been working to use Hudson to build and test the langtools repository, in addition to the standard build and test done by the Release Engineering and QA teams. This allows us (developers) to perform additional tests more easily, such as running FindBugs, or testing "developer" configurations (i.e. the configurations an OpenJDK developer might use) as well as "product" configurations. This has also made us pay more attention to the documented way to run the langtools regression tests, using the standard Ant build file. In practice, Sun's "official" test runs are done using jtreg from the command line, and speaking for myself, I prefer to run the tests from the command line as well, to have more control over which tests to run or rerun, and how to run them.

The net result of all of this is that the langtools regression tests should all always pass, however they are run. This includes

  • as part of testing a fully built JDK
  • as part of testing a new version of langtools, using an earlier build of JDK as a baseline
  • from the jtreg command line in "other vm" mode
  • from the jtreg command line in "same vm" mode
  • from the <jtreg> Ant task, such as used in the standard build.xml file
  • on all Java platforms

I'm happy to announce that I'll be leading up Sun's efforts to develop a set of small language changes in JDK 7; we intend to submit a JSR covering those changes during the first half of 2009. However, before the JSR proposal is drafted and submitted to the JCP, we'll first be running a call for proposals so Java community members can submit detailed, thoughtful changes for consideration too. We'll be seeding the discussion with a few proposals we think would improve the language. More information on our proposed changes, guidance for measuring the size of a change, and criteria for judging the desirability of a language change will be coming over the next several weeks.

I've proposed an OpenJDK project to host the discussion of the proposals and potentially some prototype implementations.

Suggested Reading
So you want to change the Java Programming Language...

Recently Mark Reinhold started blogging about modularizing the JDK. I've been getting requests to split up IKVM.OpenJDK.ClassLibrary.dll for a long time. It's good to see that the Java platform is also moving in that direction, but we needn't wait for that.

I've been working on making ikvmc support "multi target" mode for a while now. Last week I used this mode to compile the OpenJDK classes into 743 different assemblies (one per package). I then tried to use NDepend to dig through the dependencies, but it turns out that 743 assemblies is a bit too much to handle (NDepend handled it reasonably well, but the dependency graph was way too large to be useful). So I started moving some obvious stuff together. The result was this NDepend report. Still a lot of data and a lot of dependencies, but some patterns are starting to emerge.

Here's my preliminary view of how things could be split:

IKVM.OpenJDK.Core.dll 5 MB
IKVM.OpenJDK.Security.dll 3 MB
IKVM.OpenJDK.Util.dll 1 MB
IKVM.OpenJDK.Xml.dll 8 MB
IKVM.OpenJDK.SwingAWT.dll 3 MB
IKVM.OpenJDK.Charsets.dll 5 MB
IKVM.OpenJDK.Corba.dll 2 MB
IKVM.OpenJDK.Management.dll 2 MB
IKVM.OpenJDK.Misc.dll 3 MB

(The sizes are only approximate.)

I had originally hoped to make IKVM.OpenJDK.Core.dll smaller by keeping java.util and java.security out of it, but it looks like you won't be able to run anything non-trivial without requiring classes from these packages, so it makes more sense to put them into the core assembly. The IKVM.OpenJDK.Security.dll and IKVM.OpenJDK.Util.dll assemblies contain other util and security related packages that shouldn't be needed as often.

It is possible to split packages across assemblies (e.g. java.awt.AWTPermission will be in IKVM.OpenJDK.Core.dll because java.lang.SecurityManager depends on it), but given the potential for confusion, my current thinking is that it is probably best to move individual classes into Core only when necessary (because, realistically, you can't develop without a reference to Core, so you're less likely to be confused when trying to locate such a class).

To avoid confusion or expectations that are too high: I haven't yet built the runtime infrastructure to support this. So while I can compile the class library into all these parts, the runtime won't actually be able to work correctly, because it still expects all the boot class loader classes in a single assembly.

As always, feedback on the proposed split is very welcome.


I really like NDepend's ability to show the dependencies in different ways, and the interactive dependency matrix that allows you to drill down into a dependency to see exactly where it comes from. Beyond dependency analysis it also has a powerful SQL-like query language that allows you to query dependencies and compute all the code metrics you can come up with. It also includes a number of code metrics out of the box, but those aren't really my cup of tea, so I can't comment on how useful they are.

One other small but really nice thing about it is that you can run it without installing. IMO this is very nice compared with installers that do who knows what to your system (and require administrator access).

An evaluation copy can be downloaded from their website.

Full disclosure: I was given a free copy of NDepend Professional Edition.

Vision

The JDK is big—and hence it ought to be modularized. Doing so would enable significant improvements to the key performance metrics of download size, startup time, and memory footprint.

Java libraries and applications can also benefit from modularization. Truly modular Java components could leverage the performance-improvement techniques applicable to the JDK and also be easy to publish in the form of familiar native packages for many operating systems.

Finally, in order to realize the full potential of a modularized JDK and of modularized applications the Java Platform itself should also be modularized. This would allow applications to be installed with just those components of the JDK that they actually require. It would also enable the JDK to scale down to smaller devices, yet still offer conformance to specific Profiles of the Platform Specification.

Okay—so where do we start?

JDK 7

As a first step toward this brighter, modularized world, Sun’s primary goal in the upcoming JDK 7 release will be to modularize the JDK.

There will be other goals, to be sure—more on those later—but the modularization work will drive the release, which we hope to deliver early in 2010.

Tools

Modularizing the JDK requires a module system capable of supporting such an effort. It requires, in particular, a module system whose core can be implemented directly within the Java virtual machine, since otherwise the central classes of the JDK could not themselves be packaged into meaningful modules.

Modularizing the JDK—or indeed any large code base—is best done with a module system that’s tightly integrated with the Java language, since otherwise the compile-time module environment can differ dramatically from the run-time module environment and thereby make the entire job far more difficult.

Now—which module system should we use?

JSR 277

The current draft of this JSR proposes the JAM module system, which has been the subject of much debate and is far from finished. This system is intended to be at least partly integrated with the Java language. Owing to some of its rich, non-declarative features, however, it would be impossible to implement its core functionality directly within the Java virtual machine.

Sun has therefore decided to halt development of the JAM module system, and to put JSR 277 on hold until after Java SE 7.

JSR 294

This JSR, Improved Modularity Support in the Java Programming Language, is chartered to extend the Java language and the Java virtual machine to support modular programming. Its Expert Group has already discussed language changes that have been well received for their simplicity as well as their utility to existing module systems such as OSGi.

Earlier this year JSR 294 was effectively folded into the JSR 277 effort. Sun intends now to revive 294 as a separate activity, with an expanded Expert Group and greater community involvement, in support of the immediate JDK 7 modularization work as well as the larger goal of modularizing the Java SE Platform itself.

OSGi

If JSR 277’s JAM module system is an unsuitable foundation for modularizing the JDK, what about the OSGi Framework? This module system is reasonably mature, stable, and robust. Its core has even already been implemented within a Java virtual machine, namely that of Apache Harmony. OSGi is not at all integrated with the Java language, however, having been built atop the Java SE Platform rather than from within it.

This last problem can be fixed. Sun plans now to work directly with the OSGi Alliance so that a future version of the OSGi Framework may fully leverage the features of JSR 294 and thereby achieve tighter integration with the language.

Jigsaw

In order to modularize JDK 7 in the next year or so, and in order better to inform the work of JSR 294, Sun will shortly propose to create Project Jigsaw within the OpenJDK Community.

This effort will, of necessity, create a simple, low-level module system whose design will be focused narrowly upon the goal of modularizing the JDK. This module system will be available for developers to use in their own code, and will be fully supported by Sun, but it will not be an official part of the Java SE 7 Platform Specification and might not be supported by other SE 7 implementations.

If and when a future version of the Java SE Platform includes a specific module system then Sun will provide a means to migrate Jigsaw modules up to that standard. In the meantime we’ll actively seek ways in which to interoperate with other module systems, and in particular with OSGi.

All work on Project Jigsaw will be done completely in the open, in as transparent a manner as possible. We hope you’ll join us!

My thanks to Alex Buckley for comments on drafts of this entry.

In the previous entry we covered the basics of pretty-printing: how printers are found, the use of the to_string method to customize display of a value, and the usefulness of autoloading.  This is sufficient for simple objects, but there are a few additions which are helpful with more complex data types.  This post will explain the other printer methods used by gdb, and will explain how pretty-printing interacts with MI, the gdb machine interface.

Python-gdb’s internal model is that a value can be printed in two parts: its immediate value, and its children.  The immediate value is whatever is returned by the to_string method.  Children are any sub-objects associated with the current object; for instance, a structure’s children would be its fields, while an array’s children would be its elements.

When pretty-printing from the CLI, gdb will call a printer’s “children” method to fetch a list of children, which it will then print.  This method can return any iterable object which, when iterated over, returns pairs. The first item in the pair is the “name” of the child, which gdb might print to give the user some help, and the second item in the pair is a value. This value can be a string, or a Python value, or an instance of gdb.Value.
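The shape of that protocol can be mimicked outside gdb with ordinary Python objects. A minimal stand-alone sketch (the class and the dict stand-in are illustrative, not gdb API):

```python
# A printer-shaped class outside gdb: to_string gives the summary line,
# and children yields (name, value) pairs.  A real printer would wrap a
# gdb.Value; here a plain dict stands in.
class PointPrinter:
    def __init__(self, val):
        self.val = val                      # e.g. {'x': 1, 'y': 2}

    def to_string(self):
        return 'Point'

    def children(self):
        for name in ('x', 'y'):
            yield (name, self.val[name])    # (name, value) pair

p = PointPrinter({'x': 1, 'y': 2})
print(p.to_string(), list(p.children()))    # -> Point [('x', 1), ('y', 2)]
```

gdb itself would render this as something like `Point = {x = 1, y = 2}`, formatting the pairs according to the user's print settings.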

Notice how “pretty-printers” don’t actually print anything?  Funny.  The reason for this is to separate the printing logic from the data-structure-dissection logic.  This way, we can easily implement support for gdb options like “set print pretty” (which itself has nothing to do with this style of pretty-printing — sigh. Maybe we need a new name) or “set print elements“, or even add new print-style options, without having to modify every printer object in existence.

Gdb tries to be smart about how it iterates over the children returned by the children method.  If your data structure potentially has many children, you should write an iterator which computes them lazily.  This way, only the children which will actually be printed will be computed.

There’s one more method that a pretty-printer can provide: display_hint.  This method can return a string that gives gdb (or the MI user, see below) a hint as to how to display this object.  Right now the only recognized hint is “map”, which means that the children represent a map-like data structure.  In this case, gdb will assume that the elements of children alternate between keys and values, and will print appropriately.
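Purely as an illustration of the alternating layout the “map” hint implies (stand-alone modern Python, with a plain dict in place of a gdb.Value):

```python
# With display_hint() == 'map', gdb expects children to alternate
# key, value, key, value...  A plain dict stands in for the target map.
class MapPrinter:
    def __init__(self, d):
        self.d = d

    def display_hint(self):
        return 'map'

    def children(self):
        for i, (k, v) in enumerate(self.d.items()):
            yield ('key%d' % i, k)          # names are ignored for maps
            yield ('val%d' % i, v)

mp = MapPrinter({23: 'maude', 5: 'liver'})
flat = [value for _, value in mp.children()]
print(flat)                                 # -> [23, 'maude', 5, 'liver']
```

gdb pairs the alternating values back up and prints them as `[key] = value` entries.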

We’ll probably define a couple more hint types.  I’ve been thinking about “array” and maybe “string”; I assume we’ll find we want more in the future.

Here’s a real-life printer showing the new features.  It prints a C++ map, specifically a std::tr1::unordered_map.  Please excuse the length — it is real code, printing a complex data structure, so there’s a bit to it.  Note that we define a generic iterator for the libstdc++ hash table implementation — this is for reuse in other printers.

import gdb
import itertools

class Tr1HashtableIterator:
    def __init__ (self, hash):
        self.count = 0
        self.n_buckets = hash['_M_bucket_count']
        if self.n_buckets == 0:
            self.node = False
        else:
            self.bucket = hash['_M_buckets']
            self.node = self.bucket[0]
            self.update ()

    def __iter__ (self):
        return self

    def update (self):
        # If we advanced off the end of the chain, move to the next
        # bucket.
        while self.node == 0:
            self.bucket = self.bucket + 1
            self.node = self.bucket[0]
            self.count = self.count + 1
            # If we advanced off the end of the bucket array, then
            # we're done.
            if self.count == self.n_buckets:
                self.node = False

    def next (self):
        if not self.node:
            raise StopIteration
        result = self.node.dereference()['_M_v']
        self.node = self.node.dereference()['_M_next']
        self.update ()
        return result

class Tr1UnorderedMapPrinter:
    "Print a tr1::unordered_map"

    def __init__ (self, typename, val):
        self.typename = typename
        self.val = val

    def to_string (self):
        return '%s with %d elements' % (self.typename, self.val['_M_element_count'])

    @staticmethod
    def flatten (list):
        for elt in list:
            for i in elt:
                yield i

    @staticmethod
    def format_one (elt):
        return (elt['first'], elt['second'])

    @staticmethod
    def format_count (i):
        return '[%d]' % i

    def children (self):
        counter = itertools.imap (self.format_count, itertools.count())
        # Map over the hash table and flatten the result.
        data = self.flatten (itertools.imap (self.format_one, Tr1HashtableIterator (self.val)))
        # Zip the two iterators together.
        return itertools.izip (counter, data)

    def display_hint (self):
        return 'map'

If you plan to write lazy children methods like this, I recommend reading up on the itertools package.
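As a plain-Python illustration of that lazy style, here is a self-contained sketch of the counter-plus-flatten pattern used in children above, with ordinary tuples standing in for gdb values.  (itertools.imap and itertools.izip are the Python 2 spellings of what are simply the built-in map and zip in Python 3.)

```python
import itertools

def flatten(iterable):
    # Turn an iterator of (key, value) pairs into a flat key, value, ... stream.
    for elt in iterable:
        for i in elt:
            yield i

def children(pairs):
    # Lazily pair '[0]', '[1]', ... names with the flattened stream.
    # itertools.count() is infinite; zip stops when the data runs out.
    counter = map(lambda i: '[%d]' % i, itertools.count())
    return zip(counter, flatten(pairs))

# For a two-element "map", this yields the alternating name/value
# sequence that the 'map' display hint expects.
result = list(children([(23, 'maude'), (5, 'liver')]))
```

Nothing is materialized until gdb actually walks the children, which is what makes this approach cheap for large containers.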

Here’s how a map looks when printed.  Notice the effect of the “map” hint:

(gdb) print uomap
$1 = std::tr1::unordered_map with 2 elements = {
  [23] = 0x804f766 "maude",
  [5] = 0x804f777 "liver"
}

The pretty-printer API was designed so that it could be used from MI.  This means that the same pretty-printer code that works for the CLI will also work in IDEs and other gdb GUIs — sometimes the GUI needs a few changes to make this work properly, but not many.  If you are an MI user, just note that the to_string and children methods are wired directly to varobjs; the change you may have to make is to handle the fact that a varobj’s children can now change dynamically.  We’ve also added new varobj methods to request raw printing (bypassing pretty-printers), to allow efficient selection of a sub-range of children, and to expose the display_hint method so that a GUI may take advantage of customized display types.  (This stuff is all documented in the manual.)

Next we’ll learn a bit about scripting gdb.  That is, instead of using Python to extend gdb from the inside, we’ll see how to use Python to drive gdb.

I've been toying with doing a blog or podcast aggregator with JavaFX. I have a feeling that the strengths of JavaFX, animation, graphics, media, etc, could be put to good use in such an app. Additionally most podcasters put little...
You can view the updated site, launch videos, sample apps, and more, at .. well .. when the server comes back up, it's having a little bit of trouble at this moment. Anyway, I want to post a few...

Consider this simple C++ program:

#include <string>
std::string str = "hello world";
int main ()
{
  return 0;
}

Compile it and start it under gdb.  Look what happens when you print the string:

(gdb) print str
$1 = {static npos = 4294967295,
  _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x804a014 "hello world"}}

Crazy!  And worse, if you’ve done any debugging of a program using libstdc++, you’ll know this is one of the better cases — various clever implementation techniques in the library will send you scrambling to the gcc source tree, just to figure out how to print the contents of some container.  At least with string, you eventually got to see the contents.

Here’s how that looks in python-gdb:

(gdb) print str
$1 = hello world

Aside from the missing quotes (oops on me), you can see this is much nicer.  And, if you really want to see the raw bits, you can use “print /r”.

So, how do we do this?  Python, of course!  More concretely, you can register a pretty-printer class under a regular expression matching the name of a type; any time gdb tries to print a value whose type name matches that regular expression, your printer will be used instead.

Here’s a quick implementation of the std::string printer (the real implementation is more complicated because it handles wide strings, and encodings — but those details would obscure more than they reveal):

class StdStringPrinter:
    def __init__(self, val):
        self.val = val

    def to_string(self):
        return self.val['_M_dataplus']['_M_p'].string()

gdb.pretty_printers['^std::basic_string<char,.*>$'] = StdStringPrinter

The printer itself is easy to follow — an initializer that takes a value as an argument, and stores it for later; and a to_string method that returns the appropriate bit of the object.

This example also shows registration.  We associate a regular expression, matching the full type name, with the constructor.
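To make the matching concrete, here is a self-contained sketch (plain Python, no gdb required) of the kind of lookup gdb performs against a dictionary like gdb.pretty_printers.  The lookup_printer helper and the use of a plain string in place of a gdb.Value are illustrative assumptions, not gdb's actual code:

```python
import re

# A registration dictionary mapping type-name regexps to printer classes,
# mimicking the gdb.pretty_printers dict from the example above.
pretty_printers = {}

class StdStringPrinter:
    def __init__(self, val):
        self.val = val

    def to_string(self):
        # A plain string stands in for the gdb.Value field access.
        return self.val

pretty_printers['^std::basic_string<char,.*>$'] = StdStringPrinter

def lookup_printer(type_name, value):
    # Hypothetical lookup: instantiate the first printer whose regexp
    # matches the value's full type name; None means "no printer, print raw".
    for pattern, printer in pretty_printers.items():
        if re.search(pattern, type_name):
            return printer(value)
    return None
```

So printing a std::string value would go through StdStringPrinter.to_string, while an int would fall through to gdb's normal formatting.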

One thing to note here is that the pretty-printer knows the details of the implementation of the class.  This means that, in the long term, printers must be maintained alongside the applications and libraries they work with.  (Right now, the libstdc++ printers are in archer.  But, that will change.)

Also, you can see how useful this will be with the auto-loading feature.  If your program uses libstdc++ — or uses a library that uses libstdc++ — the helpful pretty-printers will automatically be loaded, and by default you will see the contents of containers, not their implementation details.

See how we registered the printer in gdb.pretty_printers?  It turns out that this is second-best — it is nice for a demo or a quick hack, but in production code we want something more robust.

Why?  In the near future, gdb will be able to debug multiple processes at once.  In that case, you might have different processes using different versions of the same library.  But, since printers are registered by type name, and since different versions of the same library probably use the same type names, you need another way to differentiate printers.

Naturally, we’ve implemented this.  Each gdb.Objfile — the Python wrapper class for gdb’s internal objfile structure (which we briefly discussed in an earlier post) — has its own pretty_printers dictionary.  When an objfile’s auto-load file is loaded, gdb makes sure to set the “current objfile”, which you can retrieve with “gdb.get_current_objfile”.  Pulling it all together, your auto-loaded code could look something like:

import gdb.libstdcxx.v6
gdb.libstdcxx.v6.register_libstdcxx_printers (gdb.get_current_objfile ())

Where the latter is defined as:

def register_libstdcxx_printers(objfile):
    objfile.pretty_printers['^std::basic_string<char,.*>$'] = StdStringPrinter

When printing a value, gdb first searches the pretty_printers dictionaries associated with the program’s objfiles — and when gdb has multiple inferiors, it will restrict its search to the current one, which is exactly what you want.  A program using the v6 library will print using the v6 printers, and (presumably) a program using the v7 library will use the v7 printers.

As I mentioned in the previous post, we don’t currently have a good solution for statically-linked executables.  That is, we don’t have an automatic way to pick up the correct printers.  You can always write a custom auto-load file that imports the right library printers.  I think at the very least we’ll publish some guidelines for naming printer packages and registration functions, so that this could be automated by an IDE.

The above is just the simplest form of a pretty-printer.  We also have special support for pretty-printing containers.  We’ll learn about that, and about using pretty-printers with the MI interface, next time.

Apparently Fedora 10’s eclipse-ecj doesn’t have gcj-compiled libraries any more. Never mind:

mkdir /usr/lib/gcj/eclipse-ecj
aot-compile -c "-O3" /usr/lib/eclipse/dropins/jdt/plugins /usr/lib/gcj/eclipse-ecj

Also, whilst I’m messing with my system, I’ve always had to do the following for ppc64 builds to work:

mkdir -p /usr/lib/jvm/java-gcj/jre/lib/ppc64/server
ln -s /usr/lib64/gcj-4.3.2/ /usr/lib/jvm/java-gcj/jre/lib/ppc64/server

I never figured out how anyone else manages without this. Maybe nobody else is trying to build two platforms on the one box.

One of the more obscure language changes included back in JDK 5 was the addition of hexadecimal floating-point literals to the platform. As the name implies, hexadecimal floating-point literals allow literals of the float and double types to be written primarily in base 16 rather than base 10. The underlying primitive types use binary floating-point so a base 16 literal avoids various decimal ↔ binary rounding issues when there is a need to specify a floating-point value with a particular representation.

The conversion rule for decimal strings into binary floating-point values is that the binary floating-point value nearest the exact decimal value must be returned. When converting from binary to decimal, the rule is more subtle: the shortest string that allows recovery of the same binary value in the same format is to be used. While these rules are sensible, surprises are possible from the differing bases used for storage and display. For example, the numerical value 1/10 is not exactly representable in binary; it is a binary repeating fraction just as 1/3 is a repeating fraction in decimal. Consequently, the numerical values of 0.1f and 0.1d are not the same; the exact numerical value of the comparatively low precision float literal 0.1f is 0.100000001490116119384765625, and the shortest string that will convert to this value as a double is 0.10000000149011612. This in turn differs from the exact numerical value of the higher precision double literal 0.1d, 0.1000000000000000055511151231257827021181583404541015625. Therefore, based on decimal input, it is not always clear what particular binary numerical value will result.
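Python doubles use the same IEEE 754 binary64 format as Java's double, so (as a side note, not from the original post) the standard decimal module can show the same effect directly:

```python
from decimal import Decimal

# Constructing a Decimal from a float shows the exact binary value that
# was actually stored for the literal 0.1 -- not one tenth.
exact = Decimal(0.1)
assert exact == Decimal('0.1000000000000000055511151231257827021181583404541015625')

# Yet the shortest round-tripping string for that value is still "0.1",
# which is what repr (the binary-to-decimal direction) produces.
assert repr(0.1) == '0.1'
```

The asymmetry between the two assertions is exactly the storage-versus-display gap the paragraph above describes.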

Since floating-point arithmetic is almost always approximate, dealing with some rounding error on input and output is usually benign. However, in some cases it is important to exactly specify a particular floating-point value. For example, the Java libraries include constants for the largest finite double value, numerically equal to (2 - 2^-52)·2^1023, and the smallest nonzero value, numerically equal to 2^-1074. In such cases there is only one right answer and these particular limits are derived from the binary representation details of the corresponding IEEE 754 double format. Just based on those binary limits, it is not immediately obvious how to construct a minimal length decimal string literal that will convert to the desired values.

Another way to create floating-point values is to use a bitwise conversion method, such as doubleToLongBits and longBitsToDouble. However, even for numerical experts this interface is inhumane since all the gory bit-level encoding details of IEEE 754 are exposed and values created in this fashion are not regarded as constants. Therefore, for some use cases it is helpful to have a textual representation of floating-point values that is simultaneously human readable, clearly unambiguous, and tied to the binary representation in the floating-point format. Hexadecimal floating-point literals are intended to have these three properties, even if the readability is only in comparison to the alternatives!
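For illustration (a sketch of mine, not from the original post), the bit-level reinterpretation that Java exposes through doubleToLongBits and longBitsToDouble can be reproduced in Python with the struct module:

```python
import struct

# Rough Python equivalents of Java's Double.doubleToLongBits and
# Double.longBitsToDouble: reinterpret the 8 bytes of an IEEE 754
# binary64 value as a 64-bit signed integer, and back.
def double_to_long_bits(x):
    return struct.unpack('>q', struct.pack('>d', x))[0]

def long_bits_to_double(bits):
    return struct.unpack('>d', struct.pack('>q', bits))[0]

# 1.0 encodes as sign 0, biased exponent 0x3ff, zero fraction:
assert double_to_long_bits(1.0) == 0x3ff0000000000000
assert long_bits_to_double(0x3ff0000000000000) == 1.0
```

The round trip works, but the caller has to know the sign/exponent/fraction layout to make any sense of the integer, which is precisely the "inhumane" part.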

Hexadecimal floating-point literals originated in C99 and were later included in the recent revision of the IEEE 754 floating-point standard. The grammar for these literals in Java is given in JLSv3 §3.10.2:


HexSignificand BinaryExponent FloatTypeSuffix_opt

This readily maps to the sign, significand, and exponent fields defining a finite floating-point value: sign 0x significand p exponent. This syntax allows the literal 0x1.8p1 to be used to represent the value 3: 1.8_hex × 2^1 = 1.5_decimal × 2 = 3. More usefully, the maximum value of (2 - 2^-52)·2^1023 can be written as 0x1.fffffffffffffP1023 and the minimum value of 2^-1074 can be written as 0x1.0P-1074 or 0x0.0000000000001P-1022, which are clearly mappable to the various fields of the floating-point representation while being much more scrutable than a raw bit encoding.
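As a cross-check (a side note of mine, not from the original post): Python's float.fromhex and float.hex accept and produce the same C99-style hexadecimal syntax, minus Java's type suffixes, so the identities above can be verified directly:

```python
import sys

# 0x1.8p1 is 1.5 * 2^1 = 3.
assert float.fromhex('0x1.8p1') == 3.0

# The largest finite double, (2 - 2^-52) * 2^1023:
assert float.fromhex('0x1.fffffffffffffp1023') == sys.float_info.max

# The smallest nonzero double, 2^-1074, in both spellings:
assert float.fromhex('0x1.0p-1074') == float.fromhex('0x0.0000000000001p-1022')

# float.hex plays the role of Java's Double.toHexString:
assert float.hex(3.0) == '0x1.8000000000000p+1'
```

Note that Python normalizes the output to a full 13-hex-digit fraction, whereas Java's Double.toHexString trims trailing zeros.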

Retroactively reviewing the possible steps needed to add hexadecimal floating-point literals to the language:

  1. Update the Java Language Specification: As a purely syntactic change, only a single section of the JLS had to be updated to accommodate hexadecimal floating-point literals.

  2. Implement the language change in a compiler: Just the lexer in javac had to be modified to recognize the new syntax; javac used new platform library methods to do the actual numeric conversion.

  3. Add any essential library support: While not strictly necessary, the usefulness of the literal syntax is increased by also recognizing the syntax in Double.parseDouble and similar methods and outputting the syntax with Double.toHexString; analogous support was added in corresponding Float methods. In addition the new-in-JDK 5 Formatter "printf" facility included the %a format for hexadecimal floating-point.

  4. Write tests: Regression tests (under test/java/lang/Double in the JDK workspace/repository) were included as part of the library support (4826774).

  5. Update the Java Virtual Machine Specification: No JVMS changes were needed for this feature.

  6. Update the JVM and other tools that consume classfiles: As a Java source language change, classfile-consuming tools were not affected.

  7. Update the Java Native Interface (JNI): Likewise, new literal syntax was orthogonal to calling native methods.

  8. Update the reflective APIs: Some of the reflective APIs in the platform came after hexadecimal floating-point literals were added; however, only an API modeling the syntax of the language, such as the tree API, might need to be updated for this kind of change.

  9. Update serialization support: New literal syntax has no impact on serialization.

  10. Update the javadoc output: One possible change to javadoc output would have been supplementing the existing entries for floating-point fields in the constant fields values page with hexadecimal output; however, that change was not done.

In terms of language changes, adding hexadecimal floating-point literals is about as simple as a language change can be: only straightforward and localized changes were needed to the JLS and compiler, and the library support was clearly separated. Hexadecimal floating-point literals aren't applicable to that many programs, but when they can be used, they have extremely high utility in allowing the source code to clearly reflect the precise numerical intentions of the author.

Since the OpenJDK source was released, there have been various discussions about aspects of the build architecture, including how dependencies on third-party libraries should be managed. For example, on a Linux system should the JDK use its own libzip or the libzip that comes with the distribution? I think the appropriate answers to these and related questions hinge on whether the final deliverable from the OpenJDK project is viewed as the source code itself or a binary built from the source.

Traditionally, before OpenJDK, the end result of Sun's JDK project that most people used was the JDK and JRE binaries. These binaries were meant to be universal in the sense of being usable on a given processor family across a version range of operating systems. For example, there was a single Windows x86 binary for use on, say, Windows NT, Windows XP, etc., a single Solaris SPARC binary for use across Solaris 8 through Solaris 10, and effectively a single Linux x86 binary for use across different Linux distributions.

This "single binary" model drives decisions about what platform to produce the binary on, generally an older release, and what assumptions can be made about available system resources, generally rather weak ones. With a single binary deliverable, since fewer environment resources can be relied on, making the JDK build self-contained is necessary for it to be reliable in a wide variety of environments. With this delivery architecture, there is some justification for, for example, including a copy of a library like libzip in the JDK build rather than relying on a system library, even though there are increased maintenance costs.

However, when an OpenJDK/IcedTea binary is being built on a particular Linux distribution for use only on that distribution, the constraints are different. If the build is being done by the OS vendor, the vendor controls the OS contents and knows whether or not system libraries like libzip are reliable and kept up to date. Since stronger assumptions can be made about the host environment, weaker conditions need to be fulfilled by the JDK source tree. For an OS vendor, relying on a single copy of native libraries for the OS and the JDK is preferable to building (and maintaining) multiple copies.

Going forward, I'd expect the JDK build to evolve to better accommodate options to use host platform resources. Perhaps module systems in the future can help manage such dependencies more transparently.

If you are not reading the Kaffe, gcj, Classpath or OpenJDK mailing lists on a daily basis, you may have missed that we are continuing a fine tradition of meeting in a developer room at FOSDEM again - our developer room request was approved earlier this week by the FOSDEM organizers.

With my friendly ad hoc FOSDEM meeting committee hat, I'll say that FOSDEM is just plain great, there is no other way to put it. In the past years we've turned our dev room into a lightning talk fest on all kinds of Java Libre projects issues (and more!), and I hope we'll continue that fine tradition, while making more room for discussions and debates in the schedule this year.

Since we don't know yet how large the room we get will be, and other small details, the wiki is currently light on content, but as we are using the tried and proven procedure of letting the dev room content evolve on the wiki, while politely poking people to sign up and offer subjects they want to talk about, it'll all appear as we go in the coming weeks. Meanwhile, if you're planning to come, please sign up so that we can have an estimate of how many people will join us for the talks and our regular dev room dinner event. If you are a FOSDEM Java Libre dev room regular, make sure to notice the shift from the regular pattern in the past: FOSDEM '09 is on February 7th/8th, rather than at the end of the month. It's still in Brussels, Belgium, of course!

Coming up a lot sooner in Belgium is Devoxx, next week in Antwerp. I'll have a little talk there, and I'm sure Mark Reinhold, Alex Buckley and others will have interesting things to say in their sessions on Project Jigsaw. I'll be there from Tuesday until Friday, and hope to catch up with a lot of people I haven't met in a while.

Finally, falling about right between Devoxx and FOSDEM (except that it's on another continent) is the M3DD conference that I mentioned in the previous entry. I forgot to mention that the early bird registration deadline for M3DD is in two days, on Friday. The deadlines for submissions for CommunityOne and JavaOne are approaching very quickly, too.

The JCK tests probe the conformance of a Java platform implementation to a specification. For example, JCK 6b is the current test suite to determine conformance to the Java SE 6 spec. Official claims of conformance require not only passing the complete set of relevant tests, but also meeting the other requirements spelled out in the JCK user's guide.

Regarding the JCK, conformance is measured with respect to a binary rather than to a source base directly, which is sensible since a Java platform implementation will typically rely on and be affected by the properties of the host environment, including the OS, the C compiler, and the processor.

Previously Red Hat announced that using OpenJDK sources augmented with IcedTea patches, an OpenJDK binary on Fedora 9 passed the JCK and met the other conformance requirements.

Amongst other changes, after incorporating community-developed patches (notably 6748251 in b13) and following the OpenJDK 6 build instructions, inside Sun a binary resulting from building the unmodified OpenJDK 6 b13 sources on Red Hat Enterprise AS 2.1 with gcc 2.95 (the official Linux build platform for Sun's 6 update releases) passed all the JCK 6b tests when run on Fedora Core 8 x86. That binary also meets all the other JCK requirements.

OpenJDK 6 binaries built from b13 (and later) sources on and for different host environments are now more likely to share those favorable conformance properties, but testing would be necessary to verify conformance status and to make any formal statements.

More information

Running with the usual jtreg flags, -a -ignore:quiet always and -s for the langtools area, the basic regression test results on Linux for OpenJDK 6 build 14 are:

  • HotSpot, 3 tests passed.

  • Langtools, 1,351 tests passed.

  • JDK, 3,077 tests passed, 26 tests failed, 3 tests had errors.

In this build, we upgraded from a HotSpot 10 base to HotSpot 11. HotSpot 11 is also being used in the 6u10 release. The set of included tests for the HotSpot versions differ:

0: b13-hotspot/summary.txt  pass: 5
1: b14-hotspot/summary.txt  pass: 3

0      1      Test
pass   ---    compiler/6547163/
pass   ---    compiler/6563987/
pass   ---    compiler/6571539/
pass   ---    compiler/6595044/
---    pass   compiler/6663621/
---    pass   compiler/6724218/

6 differences

In langtools all the tests continue to pass:

0: ./b13-langtools/summary.txt  pass: 1,351
1: ./b14-langtools/summary.txt  pass: 1,351

No differences

And in jdk, a few new tests were added in b14 and the existing tests have generally consistent results:

0: b13-jdk/summary.txt  pass: 3,072; fail: 23; error: 3
1: b14-jdk/summary.txt  pass: 3,077; fail: 26; error: 3

0      1      Test
---    fail   com/sun/org/apache/xml/internal/ws/server/
pass   fail   java/awt/TextArea/UsingWithMouse/SelectionAutoscrollTest.html
---    pass   java/awt/image/ConvolveOp/
---    pass   javax/management/monitor/
pass   fail   javax/swing/JColorChooser/
---    pass   javax/swing/JFileChooser/6484091/
---    pass   sun/management/jmxremote/
---    pass   sun/nio/cs/
---    pass   sun/security/ssl/com/sun/net/ssl/internal/ssl/SSLEngineImpl/
---    pass   tools/pack200/

10 differences