Java performance
The performance of a compiled Java program depends on how well the host JVM manages its particular tasks, and how well the JVM takes advantage of the features of the hardware and OS in doing so. Thus, any Java performance test or comparison should always report the version, vendor, OS and hardware architecture of the JVM used. Similarly, the performance of an equivalent natively compiled program depends on the quality of its generated machine code, so the test or comparison should also report the name, version and vendor of the compiler used, and its activated optimization directives.

Historically, Java programs' execution speed improved significantly due to the introduction of just-in-time compilation (in 1997/1998 for Java 1.1), the addition of language features supporting better code analysis, and optimizations in the Java Virtual Machine itself (such as HotSpot becoming the default for Sun's JVM in 2000). Hardware execution of Java bytecode, such as that offered by ARM's Jazelle, can also offer significant performance improvements.

Virtual machine optimization techniques

Many optimizations have improved the performance of the Java Virtual Machine over time. However, although Java was often the first virtual machine to implement them successfully, they have often been used in other similar platforms as well.

Just-In-Time compilation

Early Java Virtual Machines always interpreted bytecodes. This had a huge performance penalty (a factor of between 10 and 20 for Java versus C in average applications). To combat this, a just-in-time (JIT) compiler was introduced into Java 1.1. Due to the high cost of compilation, an additional system called HotSpot was introduced into Java 1.2 and was made the default in Java 1.3. Using this framework, the virtual machine continually analyzes the program's performance for "hot spots" which are frequently or repeatedly executed. These are then targeted for optimization, leading to high-performance execution with a minimum of overhead for less performance-critical code.
Some benchmarks show a 10-fold speed gain from this technique. However, due to time constraints, the compiler cannot fully optimize the program, and therefore the resulting program is slower than native code alternatives.
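
The warm-up behaviour described above can be observed with a small timing harness. The following is a minimal sketch, not a rigorous benchmark: the class name JitWarmup, the workload sumOfSquares and the round count are illustrative assumptions, and the timings it prints vary with the JVM, vendor, OS and hardware. On a JIT-enabled JVM the first rounds are typically slower than later ones, because the loop starts out interpreted and is compiled once it is detected as hot.

```java
public class JitWarmup {
    // Written to after every round so the work is not discarded as dead code.
    static volatile long sink;

    // Hypothetical workload: a small arithmetic loop that quickly becomes "hot".
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    // Runs the workload `rounds` times and returns the elapsed nanoseconds of each round.
    static long[] timeRounds(int rounds, int n) {
        long[] elapsed = new long[rounds];
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            sink = sumOfSquares(n);
            elapsed[r] = System.nanoTime() - start;
        }
        return elapsed;
    }

    public static void main(String[] args) {
        long[] elapsed = timeRounds(10, 5_000_000);
        for (int r = 0; r < elapsed.length; r++) {
            System.out.println("round " + r + ": " + elapsed[r] + " ns");
        }
    }
}
```

Running the same class with the -Xint option, which forces interpretation only, gives a rough idea of the cost that the JIT compiler removes.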

Adaptive optimization

Adaptive optimization is a technique in computer science that performs dynamic recompilation of portions of a program based on the current execution profile. With a simple implementation, an adaptive optimizer may simply make a trade-off between just-in-time compilation and interpreting instructions. At another level, adaptive optimization may take advantage of local data conditions to optimize away branches and to use inline expansion.

A virtual machine like HotSpot is also able to deoptimize previously JIT-compiled code. This allows it to perform aggressive (and potentially unsafe) optimizations, while still being able to deoptimize the code and fall back on a safe path later on.
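
One commonly described case of such speculation is a virtual call site that has only ever seen one receiver class. The sketch below is illustrative only: the names DeoptSketch, Shape, Square and Circle are made up for this example, and whether a given JVM actually inlines the call and later deoptimizes it depends on the implementation, its version and its settings. The program's result is the same either way; only the generated machine code changes.

```java
import java.util.Arrays;

public class DeoptSketch {
    interface Shape { double area(); }

    static final class Square implements Shape {
        final double side;
        Square(double side) { this.side = side; }
        public double area() { return side * side; }
    }

    static final class Circle implements Shape {
        final double radius;
        Circle(double radius) { this.radius = radius; }
        public double area() { return Math.PI * radius * radius; }
    }

    // Hot method: s.area() is a virtual call whose target the JIT may guess.
    static double totalArea(Shape[] shapes) {
        double total = 0.0;
        for (Shape s : shapes) {
            total += s.area();
        }
        return total;
    }

    public static void main(String[] args) {
        // Phase 1: the call site only ever sees Square, so a JIT compiler
        // may speculate that area() always means Square.area() and inline it.
        Shape[] squares = new Shape[1000];
        Arrays.fill(squares, new Square(2.0));
        double warmup = 0.0;
        for (int i = 0; i < 20_000; i++) {
            warmup += totalArea(squares);
        }
        System.out.println("warm-up total: " + warmup);

        // Phase 2: a Circle reaches the same call site. If the JVM speculated,
        // the guess is now wrong and the compiled code has to be abandoned
        // (deoptimized) in favour of a safe path before being recompiled.
        Shape[] mixed = { new Square(2.0), new Circle(1.0) };
        System.out.println("mixed total: " + totalArea(mixed));
    }
}
```

On HotSpot, running with -XX:+PrintCompilation typically shows totalArea being compiled during the first phase and marked "made not entrant" around the time the second class appears, although the exact output is not guaranteed.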

Garbage collection

The 1.0 and 1.1 virtual machines used a mark-sweep collector, which could fragment the heap after a garbage collection.
Starting with Java 1.2, the virtual machines switched to a generational collector, which has much better defragmentation behaviour.
Modern virtual machines use a variety of techniques that have further improved garbage collection performance.
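
Which collectors a running JVM uses, and how much work they have done, can be inspected through the standard java.lang.management API. The sketch below is an illustration under stated assumptions: the class name GcReport and the churn workload are hypothetical, and the collector names and counts printed depend entirely on the JVM and its configuration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcReport {
    // Holding the latest array in a field makes each allocation escape the method.
    static byte[] last;

    // Names of the collectors the running JVM reports (implementation-specific).
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    // Hypothetical workload: allocates many short-lived 1 KB arrays.
    static long churn(int objects) {
        long checksum = 0;
        for (int i = 0; i < objects; i++) {
            byte[] scratch = new byte[1024];
            scratch[0] = (byte) i;
            last = scratch;
            checksum += scratch[0];
        }
        return checksum;
    }

    public static void main(String[] args) {
        churn(2_000_000);
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}
```

With a generational collector, most of the reported collections after such a workload are usually attributed to the young-generation collector, since the arrays die almost immediately.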

Split bytecode verification

Prior to executing a class, the Sun JVM verifies its bytecodes (see Bytecode verifier). This verification is performed lazily: a class's bytecodes are only loaded and verified when the specific class is loaded and prepared for use, and not at the beginning of the program. (Note that other verifiers, such as the Java/400 verifier for IBM System i, can perform most verification in advance and cache verification information from one use of a class to the next.) However, as the Java class libraries are also regular Java classes, they must also be loaded when they are used, which means that the start-up time of a Java program is often longer than for C++ programs, for example.

A technique named split-time verification, first introduced in the Java Platform, Micro Edition (J2ME), has been used in the Java Virtual Machine since Java version 6. It splits the verification of bytecode into two phases:
  • Design-time - during the compilation of the class from source to bytecode
  • Runtime - when loading the class.


In practice this technique works by capturing knowledge that the Java compiler has of class flow and annotating the compiled method bytecodes with a synopsis of the class flow information. This does not make runtime verification appreciably less complex, but does allow some shortcuts.

Escape analysis and lock coarsening

Java is able to manage multithreading at the language level. Multithreading is a technique that allows programs to operate faster on computer systems that have multiple CPUs. Also, a multithreaded application has the ability to remain responsive to input, even when it is performing long-running tasks.

However, programs that use multithreading need to take extra care of objects shared between threads, locking access to shared methods or blocks when they are used by one of the threads. Locking a block or an object is a time-consuming operation due to the nature of the underlying operating system-level operation involved (see concurrency control and lock granularity).

As the Java library does not know which methods will be used by more than one thread, the standard library always locks blocks when necessary in a multithreaded environment.

Prior to Java 6, the virtual machine always locked objects and blocks when asked to by the program, even if there was no risk of an object being modified by two different threads at the same time. For example, in this case, a local Vector was locked before each of the add operations to ensure that it would not be modified by other threads (Vector is synchronized), but because it is strictly local to the method this is not necessary:

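public String getNames() {
    // v never leaves this method, so no other thread can ever see it
    Vector<String> v = new Vector<String>();
    v.add("Me");   // each add() is synchronized on v
    v.add("You");
    v.add("Her");
    return v.toString();
}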

Starting with Java 6, code blocks and objects are locked only when necessary, so in the above case, the virtual machine would not lock the Vector object at all.

As of version 6u14, Java includes experimental support for escape analysis.
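
The effect of lock coarsening on the Vector example can be pictured at source level. This is only a hand-written illustration: the class LockSketch and its method names are hypothetical, and the JVM performs such transformations on compiled code, not on source. The first method is the code as written; the second takes the Vector's monitor once around the three calls, which is roughly what coarsening achieves (the synchronized add calls inside then re-enter a lock the thread already holds). With escape analysis, a JVM may go further and drop the locking altogether, since v never leaves the method.

```java
import java.util.Vector;

public class LockSketch {
    // As written: every add() acquires and releases the Vector's monitor.
    static String fineGrained() {
        Vector<String> v = new Vector<String>();
        v.add("Me");
        v.add("You");
        v.add("Her");
        return v.toString();
    }

    // Source-level picture of coarsening: one outer acquisition covers
    // the three adjacent operations on the same object.
    static String coarsened() {
        Vector<String> v = new Vector<String>();
        synchronized (v) {
            v.add("Me");
            v.add("You");
            v.add("Her");
        }
        return v.toString();
    }

    public static void main(String[] args) {
        // Both variants build the same list; only the locking pattern differs.
        System.out.println(fineGrained().equals(coarsened()));
    }
}
```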

Register allocation improvements

Prior to Java 6, allocation of registers was very primitive in the "client" virtual machine (they did not live across blocks), which was a problem on architectures which did not have a lot of registers available, such as x86. If there are no more registers available for an operation, the compiler must copy from register to memory (or memory to register), which takes time (registers are significantly faster to access). However, the "server" virtual machine used a graph-coloring allocator and did not suffer from this problem.

An optimization of register allocation was introduced in Sun's JDK 6; it was then possible to use the same registers across blocks (when applicable), reducing accesses to the memory. This led to a reported performance gain of approximately 60% in some benchmarks.

Class data sharing

Class data sharing (called CDS by Sun) is a mechanism which reduces the startup time for Java applications, and also reduces memory footprint. When the JRE is installed, the installer loads a set of classes from the system jar file (the jar file containing all the Java class library, called rt.jar) into a private internal representation, and dumps that representation to a file, called a "shared archive". During subsequent JVM invocations, this shared archive is memory-mapped in, saving the cost of loading those classes and allowing much of the JVM's metadata for these classes to be shared among multiple JVM processes.

The corresponding improvement for start-up time is more noticeable for small programs.

Sun Java versions performance improvements

Apart from the improvements listed here, each of Sun's Java versions introduced many performance improvements in the Java API.

JDK 1.1.6 : First just-in-time compilation (Symantec's JIT-compiler)

J2SE 1.2 : Use of a generational collector.

J2SE 1.3 : Just-in-time compilation by HotSpot.

J2SE 1.4 : See the Sun overview of performance improvements between the 1.3 and 1.4 versions.

Java SE 5.0 : Class Data Sharing

Java SE 6 :
  • Split bytecode verification
  • Escape analysis and lock coarsening
  • Register allocation improvements


Other improvements:
  • Java OpenGL Java 2D pipeline speed improvements
  • Java 2D performance has also improved significantly in Java 6


See also 'Sun overview of performance improvements between Java 5 and Java 6'.

Java SE 6 Update 10

  • Java Quick Starter reduces application start-up time by preloading part of the JRE data into the disk cache at OS startup.
  • Parts of the platform that are necessary to execute an application accessed from the web when the JRE is not installed are now downloaded first. The entire JRE is 12 MB, but a typical Swing application only needs to download 4 MB to start. The remaining parts are then downloaded in the background.
  • Graphics performance on Windows improved by extensively using Direct3D by default, and by using shaders on the GPU to accelerate complex Java 2D operations.

Future improvements

Future performance improvements are planned for an update of Java 6 or Java 7:
  • Provide JVM support for dynamic languages, following the prototyping work currently done on the Multi Language Virtual Machine,
  • Enhance the existing concurrency library by managing parallel computing on multi-core processors,
  • Allow the virtual machine to use both the Client and Server compilers in the same session with a technique called tiered compilation:
    • The Client compiler would be used at startup (because it is good at startup and for small applications),
    • The Server compiler would be used for long-term running of the application (because it outperforms the Client compiler for this).
  • Replace the existing concurrent low-pause garbage collector (also called CMS or Concurrent Mark-Sweep collector) with a new collector called G1 (or Garbage First) to ensure consistent pauses over time.

Comparison to other languages

Objectively comparing the performance of a Java program and an equivalent one written in another programming language such as C++ requires a carefully and thoughtfully constructed benchmark which compares programs expressing algorithms written in as identical a manner as technically possible. The target platform of Java's bytecode compiler is the Java platform, and the bytecode is either interpreted or compiled into machine code by the JVM. Other compilers almost always target a specific hardware and software platform, producing machine code that will stay virtually unchanged during its execution. Very different and hard-to-compare scenarios arise from these two different approaches: static vs. dynamic compilations and recompilations, the availability of precise information about the runtime environment, and others.

Java is often Just-in-time compiled
Just-in-time compilation
In computing, just-in-time compilation , also known as dynamic translation, is a method to improve the runtime performance of computer programs. Historically, computer programs had two modes of runtime operation, either interpreted or static compilation...

 at runtime by the Java Virtual Machine
Virtual machine
A virtual machine is a "completely isolated guest operating system installation within a normal host operating system". Modern virtual machines are implemented with either software emulation or hardware virtualization or both together.-VM Definitions:A virtual machine is a software...

, but may also be compiled ahead-of-time
AOT compiler
An ahead-of-time compiler is a compiler that implements ahead-of-time compilation. This refers to the act of compiling an intermediate language, such as Java bytecode, .NET Common Intermediate Language , or IBM System/38 or IBM System i "Technology Independent Machine Interface" code, into a...

, just like C++. When Just-in-time compiled, its performance is generally:
  • moderately slower than compiled languages such as C or C++,
  • similar to other just-in-time compiled languages such as C#,
  • much faster than languages without an effective native-code compiler (JIT or AOT), such as Perl, Ruby, PHP and Python.

Program speed

Java is in some cases equal to C++ on low-level and numeric benchmarks.

Benchmarks often measure performance for small, numerically intensive programs. In some real-life programs, Java outperforms C. One example is the benchmark of Jake2 (a clone of Quake II written in Java by translating the original GPL C code). The Java 5.0 version performs better than its C counterpart on some hardware configurations. It is not specified how the data was measured (for example, whether the original Quake II executable compiled in 1997 was used, which would be a weak baseline because current C compilers may optimize Quake better). The benchmark does show, however, that the same Java source code can gain a large speed boost just from a VM update, something impossible to achieve with a purely static approach. For other programs, the C++ counterpart runs significantly faster than the Java equivalent.

Some optimizations that are possible in Java and similar languages may not be possible in C++:
  • C-style pointers make optimization hard in languages that support them,
  • escape analysis is of limited use in C++, for example, because the compiler cannot determine as accurately where an object will be used (again because of pointers).


The JVM can also perform processor-specific optimizations and inline expansion. Because it can deoptimize code that was previously compiled or inlined, it can apply more aggressive optimizations than a static compiler can.

Results for microbenchmarks between Java and C++ depend heavily on which operations are compared. For example, when comparing with Java 5.0:
  • 32-bit and 64-bit arithmetic operations, file I/O and exception handling perform similarly to comparable C++ programs,
  • array operations perform better in C,
  • trigonometric functions perform much better in C.
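
Because such results depend on the operation measured and on whether the JIT compiler has already optimized it, a microbenchmark usually runs the workload several times before trusting a timing. The following is a minimal sketch of that idea, not a rigorous harness. The names `WarmupBenchmarkSketch`, `sumTo` and `timeLastRound` are hypothetical, and the workload is an arbitrary stand-in.

```java
final class WarmupBenchmarkSketch {
    // Hypothetical workload: sums the integers 0 .. n-1.
    static long sumTo(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    // Runs the workload `rounds` times and returns the duration of the
    // final round in nanoseconds; the earlier rounds serve as warm-up.
    static long timeLastRound(int rounds, int n) {
        long lastNanos = 0;
        long sink = 0; // accumulate results so the work is not dead code
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            sink += sumTo(n);
            lastNanos = System.nanoTime() - start;
        }
        if (sink < 0) {
            System.out.println("unexpected overflow: " + sink);
        }
        return lastNanos;
    }

    public static void main(String[] args) {
        System.out.println("single cold round (ns): " + timeLastRound(1, 1_000_000));
        System.out.println("round after warm-up (ns): " + timeLastRound(50, 1_000_000));
    }
}
```

The two printed timings will differ from machine to machine and from JVM to JVM, which is why the version, vendor, OS and hardware should accompany any reported figure.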

Multi-core performance

The scalability and performance of Java applications on multi-core systems is limited by the object allocation rate. This effect is sometimes called an "allocation wall". Also, applications that have not been tuned for multi-core systems may suffer from lock contention.
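
As an illustration of the lock contention mentioned above, the sketch below counts events from several threads with `java.util.concurrent.atomic.LongAdder`, a standard class intended for sums that many threads update concurrently. The class name `ContentionSketch`, the thread count and the iteration count are arbitrary choices for illustration, not a tuning recommendation.

```java
import java.util.concurrent.atomic.LongAdder;

final class ContentionSketch {
    // Increments a shared counter from `threads` threads, `perThread`
    // times each, and returns the final total.
    static long countWithLongAdder(int threads, int perThread) {
        LongAdder counter = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait for every worker before reading the sum
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counter.sum();
    }

    public static void main(String[] args) {
        System.out.println(countWithLongAdder(4, 100_000));
    }
}
```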

Startup time

Java startup is often much slower than that of C or C++ programs, because many classes (first of all, classes from the platform class libraries) must be loaded before they can be used.
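
The number of classes already loaded when application code starts running can be observed with the standard `java.lang.management` API. This is a small sketch; the class name `LoadedClassesSketch` is hypothetical, and the count it prints varies by JVM, version and launch options.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

final class LoadedClassesSketch {
    // Returns the number of classes currently loaded in this JVM.
    static int loadedClassCount() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return bean.getLoadedClassCount();
    }

    public static void main(String[] args) {
        System.out.println("classes loaded at start of main: " + loadedClassCount());
    }
}
```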

Compared against similar popular runtimes, for small programs running on a Windows machine, the startup time appears to be similar to Mono's and a little slower than .NET's.

It seems that much of the startup time is due to I/O-bound operations rather than JVM initialization or class loading (the rt.jar class data file alone is 40 MB, and the JVM must seek a lot of data in this huge file). Some tests showed that although the new split bytecode verification technique improved class loading by roughly 40%, this translated to only about a 5% startup improvement for large programs.

Although the improvement is small, it is more visible in small programs that perform a simple operation and then exit, because loading the Java platform data can represent many times the load of the program's actual operation.

Beginning with Java SE 6 Update 10, the Sun JRE comes with a Quick Starter that preloads class data at OS startup, so that the data is read from the disk cache rather than from the disk.

Excelsior JET approaches the problem from the other side. Its Startup Optimizer reduces the amount of data that must be read from the disk on application startup, and makes the reads more sequential.

Memory usage

Java memory usage is heavier than that of C++, because:
  • there is an 8-byte overhead for each object and a 12-byte overhead for each array in Java (on 32-bit JVMs; twice as much on 64-bit JVMs). If the size of an object is not a multiple of 8 bytes, it is rounded up to the next multiple of 8. This means that an object containing a single byte field occupies 16 bytes and requires a 4-byte reference. However, C++ also allocates a pointer (usually 4 or 8 bytes) for every object of a class that declares virtual functions,
  • parts of the Java class library must be loaded before program execution (at least the classes that the program uses "under the hood"). This leads to a significant memory overhead for small applications compared with its best-known competitors, Mono and .NET,
  • both the Java bytecode and its native recompilations are typically in memory at the same time,
  • the virtual machine itself consumes memory,
  • in Java, a composite object (class A that uses instances of B and C) is created using references to separately allocated instances of B and C. In C++, the cost of these references can be avoided,
  • the lack of address arithmetic makes it impossible to create some memory-efficient containers, such as tightly spaced structures and XOR linked lists.
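
The rounding rule in the first item can be worked through numerically. The sketch below hard-codes the 32-bit figures quoted above (8-byte object overhead, 12-byte array overhead, rounding to a multiple of 8) as assumptions; actual layouts differ between JVMs, and the class and method names are hypothetical.

```java
final class ObjectSizeSketch {
    static final int OBJECT_HEADER_BYTES = 8;  // assumed: 32-bit figure quoted in the text
    static final int ARRAY_HEADER_BYTES = 12;  // assumed: 32-bit figure quoted in the text
    static final int ALIGNMENT_BYTES = 8;

    // Rounds a size up to the next multiple of the alignment.
    static int roundUp(int bytes) {
        return (bytes + ALIGNMENT_BYTES - 1) / ALIGNMENT_BYTES * ALIGNMENT_BYTES;
    }

    // Estimated size of an object whose fields total `fieldBytes` bytes.
    static int objectBytes(int fieldBytes) {
        return roundUp(OBJECT_HEADER_BYTES + fieldBytes);
    }

    // Estimated size of an array of `length` elements of `elementBytes` each.
    static int arrayBytes(int length, int elementBytes) {
        return roundUp(ARRAY_HEADER_BYTES + length * elementBytes);
    }

    public static void main(String[] args) {
        // One byte field: 8 + 1 = 9 bytes, rounded up to 16.
        System.out.println(objectBytes(1));
        // int[3]: 12 + 3 * 4 = 24 bytes, already a multiple of 8.
        System.out.println(arrayBytes(3, 4));
    }
}
```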


However, it can be difficult to strictly compare the memory impact of using Java versus C++. Some reasons are:
  • In C++, memory deallocation happens synchronously. In contrast, Java can deallocate asynchronously, possibly when the program is otherwise idle. A program that has periods of inactivity might perform better with Java, because it does not deallocate memory during its active phase.
  • In C++, there can be a question of which part of the code "owns" an object and is therefore responsible for deallocating it. For example, a container of objects might make copies of the objects inserted into it, relying on the calling code to free its own copy. Alternatively, it might insert the original object, which creates an ambiguity: the calling code may be handing the object off to the container (in which case the container should free the object when it is removed), or asking the container only to remember the object (in which case the calling code, not the container, will free it later). The C++ standard containers (in the STL), for instance, make copies of inserted objects. In Java, none of this is necessary because neither the calling code nor the container "owns" the object. So while the memory needed for a single object can be heavier than in C++, actual Java programs may create fewer objects, depending on the memory strategies of the C++ code. If so, the time required for creating, copying, and deleting these objects is also absent from the Java program.
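
The point about ownership can be seen in a few lines: a Java collection stores a reference to the inserted object rather than a copy of it. The class and method names in this sketch are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

final class SharedReferenceSketch {
    // Inserts a builder into a list, mutates it through the original
    // reference, and returns what the list now sees.
    static String mutateAfterInsert() {
        List<StringBuilder> container = new ArrayList<>();
        StringBuilder original = new StringBuilder("a");
        container.add(original);  // stores a reference; no copy is made
        original.append("b");     // the change is visible through the container
        return container.get(0).toString();
    }

    // Reports whether the element held by the list is the very same object.
    static boolean holdsSameObject() {
        List<StringBuilder> container = new ArrayList<>();
        StringBuilder original = new StringBuilder("x");
        container.add(original);
        return container.get(0) == original;
    }

    public static void main(String[] args) {
        System.out.println(mutateAfterInsert());
        System.out.println(holdsSameObject());
    }
}
```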

The consequences of these and other differences are highly dependent on the algorithms involved, the actual implementations of the memory allocation systems (free, delete, or the garbage collector), and the specific hardware.

As a result, for applications in which memory is a critical factor in choosing between languages, a deep analysis is required.

One should also keep in mind that a program that uses a garbage collector can need about five times the memory of a program that uses explicit memory management in order to reach the same performance.

Trigonometric functions

Trigonometric functions can perform poorly compared to C, because Java has strict specifications for the results of mathematical operations, which may not correspond to the underlying hardware implementation. On the x87, Java since version 1.4 performs argument reduction for sin and cos in software, causing a big performance hit for values outside the range that the hardware instructions handle accurately.
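
Argument reduction means mapping a large angle to an equivalent small one before evaluating the function. The sketch below illustrates the idea with the standard `Math.IEEEremainder` method. It is not how any particular JVM implements `Math.sin`, and this naive reduction loses accuracy for very large arguments; the name `reducedSin` is hypothetical.

```java
final class ArgumentReductionSketch {
    // Reduces x to the interval [-pi, pi] and then evaluates the sine.
    static double reducedSin(double x) {
        double reduced = Math.IEEEremainder(x, 2.0 * Math.PI);
        return Math.sin(reduced);
    }

    public static void main(String[] args) {
        double x = 100.0;
        System.out.println("direct:  " + Math.sin(x));
        System.out.println("reduced: " + reducedSin(x));
    }
}
```

For moderate arguments the two printed values agree to many decimal places; the extra reduction step is the kind of software work that the hardware instruction alone would not perform.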

Java Native Interface

The Java Native Interface (JNI) has a high overhead, making it costly to cross the boundary between code running on the JVM and native code. Java Native Access (JNA) gives Java programs easy access to native shared libraries (DLLs on Windows) without writing anything but Java code; no JNI or native code is required. This functionality is comparable to Windows' Platform Invoke (P/Invoke) and Python's ctypes. Access is dynamic at runtime, without code generation. This convenience comes at a cost, however: JNA is usually slower than JNI.

User interface

Swing has been perceived as slower than native widget toolkits, because it delegates the rendering of widgets to the pure-Java Java 2D API. However, benchmarks comparing the performance of Swing with that of the Standard Widget Toolkit, which delegates rendering to the native GUI libraries of the operating system, show no clear winner, and the results depend greatly on the context and the environment.

Use for high performance computing

Recent independent studies seem to show that Java performance for high performance computing (HPC) is similar to that of Fortran on computation-intensive benchmarks, but that JVMs still have scalability issues when performing intensive communication on a grid network.

However, high performance computing applications written in Java have recently won benchmark competitions. In 2008 and 2009, a cluster based on Apache Hadoop (an open-source high performance computing project written in Java) sorted a terabyte and a petabyte of integers the fastest. The hardware setup of the competing systems was not fixed, however.

In programming contests

Because Java solutions often run slower than solutions in other compiled languages, it is not uncommon for online judges to allow longer time limits for Java solutions, to be fair to contestants using Java.

See also

  • Java Platform
  • Java Virtual Machine
  • HotSpot
  • Java Runtime Environment
  • Java version history
  • Virtual Machine
  • Common Language Runtime
  • Compiler optimization
  • Performance analysis
  • JStik - An embedded processor running Java bytecode natively.
  • Comparison of Java and C++

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 