Improving Swing Performance: JIT vs AOT Compilation

Linux native band plays fast

The JFC/Swing API, natively precompiled on Linux for the first time, delivers measurable improvement in Java GUI performance.

The Excelsior Engineering Team has ported Excelsior JET, a Java Virtual Machine (JVM) with an ahead-of-time compiler, to the Linux/x86 platform. Because the JET JVM supports the entire J2SE platform API, including the Java Foundation Classes (JFC/Swing), Excelsior engineers had an opportunity to evaluate the responsiveness of natively compiled JFC/Swing on Linux. The results of the comparison with conventional, dynamically optimizing JVMs were encouraging: benchmark scores improved by 40%, and on some benchmarks they doubled. More important, real-world Swing applications performed perceptibly faster.

This article describes Excelsior JET JVM and JFCMark, free benchmark software by Excelsior that measures Swing-based GUI performance. Moreover, the authors share their technical experience in optimizing JFC/Swing and argue why ahead-of-time Java compilation has advantages over dynamic compilation for certain application types.

Two Recipes for Java

The definition of the term "virtual machine" has been revised in the past few years. Modern Java Virtual Machines are no longer just interpreters of Java bytecode. High-performance, state-of-the-art implementations are built around optimizing compilers that translate bytecode instructions down to native code that runs directly on the hardware. However, one technical decision distinguishes contemporary JVMs: when is the best time to run the performance engine, the optimizing native compiler? Two options exist: run it before or after the application starts.

Most JVMs initially interpret the program and analyze how it runs, looking for hot spots, that is, frequently executed portions of bytecode. Hot spots are then compiled to optimized native code during program execution. This approach is called Just-In-Time (JIT) compilation. Other JVMs feature static native compilers similar to traditional C/C++ compilers, enabling developers to optimize their Java applications before execution. For Java, this old trick has a new name: Ahead-Of-Time (AOT) compilation. However, static compilation alone is not enough for Java compatibility, because many Java applications use custom classloaders to load components or plug-ins at runtime. To keep the "J" in "JVM," such virtual machines must also include an interpreter or JIT compiler to handle classes that could not be precompiled.
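
The classloading point can be made concrete with a minimal sketch. The class name below arrives as a string at run time, so a purely static compiler cannot know in advance which class will be requested. The `PluginLoaderSketch` class and its `loadByName` helper are hypothetical names introduced for illustration, and a standard library class stands in for a real plug-in.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical illustration: a class chosen by name at run time.
class PluginLoaderSketch {

    // Loads a class by name through a fresh URLClassLoader and instantiates it.
    // With an empty URL list the loader simply delegates to its parent, so the
    // "plug-in" here must be a class that is already visible to the parent.
    static Object loadByName(String className) throws Exception {
        try (URLClassLoader loader = new URLClassLoader(
                new URL[0], PluginLoaderSketch.class.getClassLoader())) {
            Class<?> cls = Class.forName(className, true, loader);
            return cls.getDeclaredConstructor().newInstance();
        }
    }

    public static void main(String[] args) throws Exception {
        // The name could just as well come from a config file or user input.
        String name = args.length > 0 ? args[0] : "java.util.ArrayList";
        Object plugin = loadByName(name);
        System.out.println("Loaded " + plugin.getClass().getName());
    }
}
```

In a real application the URL list would point at plug-in JAR files that did not exist when the application was compiled, which is the case a precompiled program needs a runtime interpreter or JIT compiler to handle.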

Either approach alone is not a silver bullet for Java performance. JIT-oriented JVMs can "see through" program execution, which may help them optimize hot code better than static compilers do. In return, AOT-oriented JVMs do not spend execution time on interpretation, profiling, and compilation, so optimized programs run fast from start-up. Not surprisingly, single-loop benchmarks fail to reveal a clear performance winner between the two approaches. Instead of carrying the "microbenchmark war" into the Linux camp, let's take a look at a vital example: the performance of Java GUI applications based on JFC/Swing.
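
The start-up cost mentioned above can be observed with a rough sketch like the following, which times the same fixed workload in consecutive batches. `WarmUpSketch`, `work`, and `timeBatches` are hypothetical names; this is not part of any benchmark discussed here, and the timings depend entirely on the JVM and machine used.

```java
// Hypothetical micro-harness: times the same fixed workload in consecutive batches.
class WarmUpSketch {

    // Written to after each run so the computed values are observably used.
    static volatile long sink;

    // A small, deterministic piece of work.
    static long work(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (i % 7) * (long) i;
        }
        return sum;
    }

    // Runs `batches` batches of `callsPerBatch` calls to work(n) and returns
    // the elapsed nanoseconds of each batch, in order.
    static long[] timeBatches(int batches, int callsPerBatch, int n) {
        long[] elapsed = new long[batches];
        long acc = 0;
        for (int b = 0; b < batches; b++) {
            long start = System.nanoTime();
            for (int c = 0; c < callsPerBatch; c++) {
                acc += work(n);
            }
            elapsed[b] = System.nanoTime() - start;
        }
        sink = acc;
        return elapsed;
    }

    public static void main(String[] args) {
        long[] t = timeBatches(10, 200, 10_000);
        for (int b = 0; b < t.length; b++) {
            System.out.printf("batch %d: %d us%n", b, t[b] / 1000);
        }
    }
}
```

On a JVM that interprets first and compiles later, the earliest batches would typically be expected to take longer than the later ones, while precompiled code has no such ramp. A loop this small says little about real applications, which is why the rest of this article looks at Swing instead.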

Penguin-Driven JVMs

Both JIT and AOT-oriented implementations are available for Linux. The well-known Sun HotSpot Client and Server JVMs are powered by JIT compilers. BEA WebLogic JRockit and IBM Java 2 Runtime Environment also play in the JIT team. GCJ, the GNU compiler for Java, now supported by Red Hat, and Excelsior JET feature AOT compilation.

At the core of Excelsior JET is a static optimizing compiler that enables developers to transform their Java applications into native executables or shared libraries (.so) on Linux. The AOT compiler comes with the JET Control Panel (see Figure 1), a graphical front end that makes the product easy to learn and use. A command-line interface allows Excelsior JET to be integrated into automated builds. The redistributable JET runtime system includes a JIT compiler to support Java dynamic classloading. Another graphical tool, JetPackII (see Figure 2), enables the rapid creation of installation packages for optimized applications. Excelsior JET supports all Java 2 platform packages up to version 1.4.2.

Another static Java compiler is GCJ, a member of the GNU Compiler Collection. Following the FSF philosophy, GCJ uses a clean-room, free implementation of the Java 2 Platform API. Although most packages are now supported, there are noticeable exceptions such as the Abstract Window Toolkit (AWT), Swing, and some of the APIs introduced in J2SE 1.4. Contributors to the GNU Classpath project are currently implementing the missing packages. The GNU Interpreter for Java complements GCJ to enable Java dynamic loading. A JIT compiler is planned for the future.

Accelerated Swing Tempo

GUI response time is in the eye of the beholder, which makes it tough to obtain objective GUI performance scores. To address this problem, Excelsior has developed JFCMark, a free benchmark suite that measures the performance of the JFC/Swing API. The included tests cover manipulating frames, trees, and tables; switching look-and-feels; decoding and drawing images; displaying and scrolling HTML text; and using Swing layout managers. JFCMark requires a JVM that supports the Java 2 Platform at the level of J2SE 1.3, 1.4, or 1.5.

Each test performs its scenario in a main loop for a given number of iterations, which allows you to obtain performance scores in short- and long-running modes. Upon completion, each test reports performance in units specific to its scenario, for instance, frames opened per second. When designing JFCMark, we paid close attention to benchmarking accuracy. For a particular configuration, every benchmark always performs the same number of operations regardless of the JVM under test. This is achieved through synchronous processing of Swing events, that is, the next event is sent only after the previous one has been processed. Therefore, any difference in reported speed depends solely on the benchmark's execution time.
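
The idea of a fixed number of synchronously processed operations can be sketched as follows. This is not JFCMark's source code: `SyncBenchmarkSketch`, `SyncDispatcher`, and `run` are hypothetical names, and a single worker thread stands in for the Swing event dispatch thread so that the sketch runs without a display.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a fixed-iteration benchmark loop with synchronous dispatch.
class SyncBenchmarkSketch {

    // Hands a task to some dispatch thread and blocks until the task has run.
    interface SyncDispatcher {
        void dispatchAndWait(Runnable task) throws Exception;
    }

    // Runs exactly `iterations` operations, one at a time, and returns
    // the achieved rate in operations per second.
    static double run(SyncDispatcher dispatcher, Runnable operation, int iterations)
            throws Exception {
        long start = System.nanoTime();
        for (int j = 0; j < iterations; j++) {
            // The next operation is sent only after this one has completed.
            dispatcher.dispatchAndWait(operation);
        }
        long elapsedNanos = Math.max(1, System.nanoTime() - start);
        return iterations * 1e9 / elapsedNanos;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: a single worker thread plays the role of the event thread.
        ExecutorService worker = Executors.newSingleThreadExecutor();
        AtomicInteger processed = new AtomicInteger();
        try {
            double rate = run(task -> worker.submit(task).get(),
                              processed::incrementAndGet, 600);
            System.out.printf("%d operations, %.0f ops/sec%n", processed.get(), rate);
        } finally {
            worker.shutdown();
        }
    }
}
```

Because the operation count is fixed by the caller, the reported rate can vary only with elapsed time. In a Swing program, `javax.swing.SwingUtilities::invokeAndWait` has a compatible shape and could be passed as the dispatcher, provided `run` is not itself called on the event dispatch thread.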

Let's consider the performance of the Swing windowing system, which provides operations on frames. We used a part of JFCMark to test typical frame manipulations such as opening/closing, dragging, and selecting. The testbed configuration was as follows:

  • CPU: AMD Athlon running at 1.8 GHz
  • RAM: 512MB DDR SDRAM
  • Video: NVidia GeForce2 MX-200 at 1024x768 with 65,536 colors
  • OS: Red Hat Linux release 8.0 (Psyche)
  • Linux Kernel: 2.4.18-14

We ran these benchmarks using a spectrum of JVMs available for Linux:

  • Excelsior JET 3.6 Professional Edition with JRE 1.4.2_04
  • Sun Java HotSpot Client VM 1.4.2_04
  • IBM J2RE 1.4.2 Classic with JIT enabled
  • Sun Java HotSpot Server VM 1.4.2_04
  • BEA WebLogic JRockit 8.1 with JRE 1.4.2_04

The server-oriented JVMs (Sun HotSpot Server and BEA WebLogic JRockit) are included for reference; they are not expected to shine in client-side application performance. GCJ is not included because it does not yet support JFC/Swing. To play fair, we provide results for both the long- and short-running configurations of the benchmarks. The figures show the composite performance index relative to the reference implementation (Sun HotSpot Client VM). Longer bars mean better performance.
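
To illustrate what an index relative to a reference implementation means, here is a small sketch. The article does not state how the individual test results are combined, so the geometric mean used below is an assumption, `CompositeIndexSketch` and `compositeIndex` are hypothetical names, and the numbers in `main` are made up rather than measured.

```java
// Hypothetical sketch: a composite index relative to a reference JVM.
class CompositeIndexSketch {

    // Geometric mean of the per-test ratios scores[i] / reference[i].
    // Assumption: the geometric mean is one common way to combine ratios;
    // the actual JFCMark formula is not given in the article.
    static double compositeIndex(double[] scores, double[] reference) {
        if (scores.length == 0 || scores.length != reference.length) {
            throw new IllegalArgumentException(
                "score arrays must be non-empty and equal in length");
        }
        double logSum = 0.0;
        for (int i = 0; i < scores.length; i++) {
            logSum += Math.log(scores[i] / reference[i]);
        }
        return Math.exp(logSum / scores.length);
    }

    public static void main(String[] args) {
        // Made-up scores for illustration only, not measurements.
        double[] reference = {10.0, 20.0, 40.0};
        double[] candidate = {15.0, 30.0, 60.0};
        System.out.println("reference vs itself: " + compositeIndex(reference, reference));
        System.out.println("candidate vs reference: " + compositeIndex(candidate, reference));
    }
}
```

With this definition the reference implementation always scores 1, and a JVM that is 50% faster on every test scores about 1.5, which matches the "longer bars are better" reading of the charts.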

Figure 3 shows Swing performance scores for the long-running configuration of JFCMark; for example, one of the tests opens and closes 600 frames. In this scenario, JIT-based JVMs have a chance to "warm up," that is, to optimize hot code for maximum speed. Note, however, that in real-world Java GUI applications this level of performance may be reached only after a "good amount of mouse-clicking." Nevertheless, Excelsior JET still outperforms the runner-up (HotSpot Client VM) by 40%.

To see how Swing works on dynamic JVMs that have not been "warmed up" yet, look at Figure 4. It shows out-of-the-box performance scores for the short-running configuration, which corresponds to the way Java GUI applications perform right after start-up. As can be seen, the JET-compiled Swing runs fast, and it runs fast from the start. The other test participants run at least twice as slowly in this scenario. Note that short-running benchmarks matter most for client-side application performance. For instance, if you start a GUI application and drag something with the mouse, you don't want to see it stumble just because the JVM has not yet done its job. People talking about "snail Java" often ignore the fact that many JVMs take time to warm up. It's not an issue for server-side applications, which typically run for hours or days. However, the performance of client-side applications is most heavily affected by the JVM warm-up cycle.

jEdit, a Practical Example

If you don't trust vendor benchmarks, check the results yourself. One of the samples that come with Excelsior JET is a project for compiling jEdit, an open source, cross-platform text editor written in Java. jEdit has many advanced features that make text editing easier, such as syntax highlighting, auto indent, abbreviation expansion, registers, macros, regular expressions, and multiple file search/replace. It's a good example of a full-featured Swing-based application for evaluating GUI response time.

Behind the Performance Figures

An interesting question is why the JIT-powered JVMs don't hit the performance bar raised by Excelsior JET. Of course, aggressive static optimizations and the removal of bytecode interpretation make Swing work smoothly. However, the long-running mode of JFCMark should be comfortable for the JIT compilers, so where does the 40% performance win come from? We found the answer unexpectedly while trying to improve Swing performance further: the absence of hot methods.

Under the covers, JFC/Swing is quite a complex event-driven system implemented on top of AWT. The implementation consists of several hundred classes which, in turn, use a variety of other core classes. Many thousands of Java methods are executed when, for example, you open a Swing frame. When we obtained the Swing execution profile, it proved to be almost flat. Lots of methods were executed, but each of them accounted for only hundredths of one percent of total execution time. Only a few methods took more than 1-2%. At this point, we encountered an interesting problem: what should we improve in the absence of clear performance bottlenecks? As you may have guessed, the same problem exists for profile-guided JIT compilers. Instead of a few hot spots to be aggressively optimized, there are plenty of "warm spots" that are left unoptimized. A flat execution profile is an application-specific property that some JVMs cannot handle well enough to achieve top performance.
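
The notion of a flat profile can be expressed as a simple check on per-method timings. The sketch below is illustrative only: `FlatProfileSketch` and its methods are hypothetical names, the 2% threshold is borrowed from the "1-2%" figure above, and the profiles in `main` are made-up data, not the Swing profile we measured.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: checks whether an execution profile has any hot spot.
class FlatProfileSketch {

    // Returns the largest share of total time taken by any single method,
    // as a fraction between 0 and 1.
    static double largestShare(Map<String, Long> nanosByMethod) {
        long total = 0;
        long max = 0;
        for (long t : nanosByMethod.values()) {
            total += t;
            max = Math.max(max, t);
        }
        return total == 0 ? 0.0 : (double) max / total;
    }

    // A profile counts as "flat" here if no method exceeds the given share.
    static boolean isFlat(Map<String, Long> nanosByMethod, double hotThreshold) {
        return largestShare(nanosByMethod) <= hotThreshold;
    }

    public static void main(String[] args) {
        // Made-up profile: 1,000 methods taking 0.1% of the time each.
        Map<String, Long> flat = new LinkedHashMap<>();
        for (int i = 0; i < 1000; i++) {
            flat.put("method" + i, 100L);
        }
        // Same profile plus one method that dominates the run.
        Map<String, Long> peaked = new LinkedHashMap<>(flat);
        peaked.put("hotLoop", 900_000L);

        System.out.println("flat:   " + isFlat(flat, 0.02));   // prints "flat:   true"
        System.out.println("peaked: " + isFlat(peaked, 0.02)); // prints "peaked: false"
    }
}
```

A compiler that spends its optimization budget only on methods above some hotness threshold has little to work with in the first case, which is the situation described above.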

Conclusion

This article is not about fast and slow JVMs. Rather, it demonstrates that for some Java applications, a JVM with AOT compilation can work faster than JIT-based JVMs. The main lesson we have learned from this study is that one size does not fit all and JFC/Swing is not the only example. One way or another, the Java platform wins and we were happy to make Excelsior JET JVM available to Java/Linux developers.

Resources

  • Java Foundation Classes (JFC/Swing): http://java.sun.com/products/jfc/
  • Excelsior JET: www.excelsior-usa.com/jet.html
  • JFCMark: JFC/Swing benchmark with source code: www.excelsior-usa.com/jfcmark.html
  • Sun HotSpot JVMs: http://java.sun.com/j2se/1.4.2
  • IBM J2RE: www-106.ibm.com/developerworks/java/jdk/linux140/
  • BEA WebLogic JRockit: www.bea.com/framework.jsp?CNT=index.htm&FP=/content/products/jrockit
  • GCJ, the GNU compiler for Java: http://gcc.gnu.org/java/
  • GNU Classpath: www.gnu.org/software/classpath/
  • jEdit: www.jedit.org

By Vitaly Mikheev

    Vitaly Mikheev is the chief technology officer for Excelsior, LLC, a company focusing on the design and development of optimizing compilers. Vitaly has been involved in software development since 1987 and has focused on compiler construction technologies for the last decade. He started working with Java in 1998 as the architect of the Excelsior Java Virtual Machine. Before that, he worked on proprietary optimizing compilers for Nortel Networks. Vitaly is a member of ACM and a co-author of the patent on the garbage collector algorithm implemented in Samsung's J2ME CDC virtual machine. He holds an MS in computer science from Novosibirsk State University, Russia.

    Most Recent Comments
    Blind Earl 11/08/04 09:04:35 AM EST

    Once I actually found the article it was an interesting read. However, this website looks *terrible*. When a column is only 11 characters wide due to advertising squishing it down, it is time to find some other magazine to read.

    the author 11/08/04 07:14:08 AM EST

    To angel'o'sphere:

    1)

    >>A VM based on a JIT compiler loads always only byte code
    >>and compiles the byte-code to native code on the fly
    >>depending on several algorithms/heuristics.
    >>A AOT compiler does JUST THE SAME on the first invocation
    >>of the byte code program, but it saves the generated
    >>native code to be available for later invocations of the
    >>same program. So for the later invocations, the
    >>compilation was ahead of time, not for the first one

    The thing you are talking about is called Caching JIT compilation (at least HotSpot and BEA engineers called it so when I talked with them at JavaOne 2004)

    Besides, Caching JIT compilation does not occur on the first invocation - it would take long long time and miss profile information useful for optimizations.

    2)

    >>The Excelsior JET is no AOT compiler but just a static
    >>compiler. However they seem to jump on the bandwagon
    >>of a new buzzword, because the term AOT is only used
    >>in academic circles currently

    visit http://gcc.gnu.org/java/ (GNU compiler for Java). It's described as an AOT compiler...

    3)

    >>The next flaw is this: they measure a GUI application
    >>over only 600 "loops". As the exact form of the loop
    >>is not published, I have the impression they made the
    >>loop in a way that a "standard" HotSpot JIT does
    >>predictable not even attempt to compile much of the
    >>covered code

    In this test, HotSpot shows 28,9 frames/sec for short-running configuration and 44,6 frames/sec for long-running one. It does not look like "not even attempt to compile much of the covered code". Increasing the number of iteration to open more frames does not improve the HotSpot results. BTW, did you ever open 600+ frames in a Java application?

    4)

    >>In the "benchmark" of the article,
    >>you neither have a loop (that one is in the external
    >>driver program hidden) nor stack samples which occur
    >>often enough to cause a JIT compilation.

    Check the source code of the benchmark. It includes the loop

    for(int j=0;j [...]

    5)

    >>IMHO: the Excelsior people tricked the HotSpot/JIT VMs
    >>in not doing their job, and thus you see the big
    >>performance gain.
    >> There are other tricks thinkable, e.g. not using
    >>the standard SWING library but a tweaked one,
    >>just remove some "synchronized" keywords if
    >> you are sure the benchmark does not need them,
    >> and voila ...

    The standard Swing library and the public JET version were used for benchmarking. As for "synchronized" keywords - some of them are *safely* removed by the JET compiler during the course of escape analysis. There is excellent literature about this optimization technique (for example, see ACM OOPSLA'99 - about 5 papers were devoted to it).

    Take care,

    --Vitaly

    P.S. JITs work well on servers and reusing the same machinery on the desktop is not always appropriate.

    angel'o'sphere 11/08/04 05:28:31 AM EST

    I don't like the article.
    For several reasons I think they cheated and they coin old terms into new terms for no reason (except publicity). E.g. they mix up AOT with static compilation.
    AOT is an extension of JIT compilation, and not a compilation done ahead of the first execution.
    A VM based on a JIT compiler loads always only byte code and compiles the byte-code to native code on the fly depending on several algorithms/heuristics.
    A AOT compiler does JUST THE SAME on the first invocation of the byte code program, but it saves the generated native code to be available for later invocations of the same program. So for the later invocations, the compilation was ahead of time, not for the first one.
    The Excelsior JET is no AOT compiler but just a static compiler. However they seem to jump on the bandwagon of a new buzzword, because the term AOT is only used in academic circles currently (besides that Java 5.0 tries to cash compile informations and even native code over invocations).
    The next flaw is this: they measure a GUI application over only 600 "loops". As the exact form of the loop is not published, I have the impression they made the loop in a way that a "standard" HotSpot JIT does predictable not even attempt to compile much of the covered code.
    To understand that you have to know how a HotSpot JIT works. Two "big" strategies are used: fiddling with loops, this is a short code fragment, inside of one method, consider it as a for loop or while loop. You can do different things with it ... not interesting here.
    The second big thing is call stack sampling. Suppose you have very often the method calls A, B, C on the call stack, the VM decides to check if anything can be inlined or if any of those methods is jitted.
    In the "benchmark" of the article, you neither have a loop (that one is in the external driver program hidden) nor stack samples which occur often enough to cause a JIT compilation.
    Excellent literature about JIT compilation and strategies can be found here: http://www.research.ibm.com/jalapeno/publication.html and here: http://www-124.ibm.com/developerworks/oss/jikesrvm/info/papers.shtml
    IMHO: the Excelsior people tricked the HotSpot/JIT VMs in not doing their job, and thus you see the big performance gain. There are other tricks thinkable, e.g. not using the standard SWING library but a tweaked one, just remove some "synchronized" keywords if you are sure the benchmark does not need them, and voila ...

    Seun Osewa 11/07/04 02:32:53 PM EST

    combining the two approaches will lead to overhead similar to a JIT approach (hence, there's nothing to be gained for a long-running program)

    RAMMS+EIN 11/06/04 05:40:50 AM EST

    Why compare JIT against AOT? Why not have both?

    AOT compilation makes for fast start up time and fast run time. JIT advocates claim that it can lead to better performance, as more optimizations can be performed with run-time information. So why not combine the two? Compile it before the first run, and further optimize it at run-time where appropriate. That way, you get the best of both worlds.
