How to create an executable jar file?

Method 1: I used to create executable jar files with the Eclipse export wizard “Runnable JAR file”

It’s very handy, and it has 3 options to handle libraries:

  • Extract required libraries into generated JAR
  • Package required libraries into generated JAR
  • Copy required libraries into sub folder next to the generated JAR.

I really like it. You don’t need to configure anything, it only takes a few clicks. I think it’s the best way if the jar is just for testing. It works well except when I use the 1st option, where I ran into this problem:

org.xml.sax.SAXParseException: schema_reference.4: Failed to read schema document 'http://www.springframework.org/schema/beans/spring-beans-4.0.xsd', because 1) could not find the document; 2) the document could not be read; 3) the root element of the document is not <xsd:schema>.

This happens because the xsd file goes missing or is overwritten when Eclipse extracts the libraries, and my runtime host cannot connect to the Internet, so it cannot download the schema either. This is not really an Eclipse bug, and the 2nd option avoids it: that option does not extract the libraries but includes the jar files as they are.
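A minimal sketch for checking this on a generated jar. Spring normally maps schema URLs to copies bundled inside its own jars through META-INF/spring.schemas files. When several Spring jars are extracted into one jar, these same-named files can overwrite each other. That this is the cause here is an assumption, not something confirmed above. The helper name has_schema_mapping and the jar name app.jar are made up for illustration.

```shell
#!/bin/sh
# has_schema_mapping URL
# Reads the content of a META-INF/spring.schemas file on stdin and succeeds
# (exit status 0) if it contains a mapping for URL.
# The file uses Java properties syntax, so ':' in a key is written as '\:';
# the backslashes are stripped before matching.
has_schema_mapping() {
  sed 's/\\//g' | grep -F -q "$1="
}

# Usage (app.jar is a hypothetical jar name):
#   unzip -p app.jar META-INF/spring.schemas |
#     has_schema_mapping 'http://www.springframework.org/schema/beans/spring-beans-4.0.xsd' ||
#     echo 'no local mapping for this schema in the merged jar'
```

If the mapping is missing, Spring falls back to fetching the URL over the network, which fails on a host without Internet access.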

I wondered why Eclipse goes to so much effort to extract the libraries (this option is obviously slower than the other two). At first I thought it was meant to build a smaller jar package (the result is indeed smaller). Later I found out the real reason: Java can NOT load classes from a jar inside a jar! The standard class loader is designed this way: it only reads classes stored directly in the jars on the classpath, not in jars nested inside them. (In the real world, who would put an even larger jar inside a jar? You must be kidding!)

Now the question is: how does Eclipse overcome this jar-inside-jar problem in the 2nd option? I unzipped the generated jar file and found a jarinjarloader class that does the trick behind the scenes. See more below.

Method 2: Using onejar-maven-plugin
The network between my laptop and the target machine is slow, so it takes a long time to transfer the runnable jar file, even though it’s not big. So I need to build it on the remote machine without Eclipse, using Maven instead. This plugin uses a trick similar to jarinjarloader, the 2nd option in Eclipse. Basically, Java calls the plugin’s main function, and that function loads all the dependency jar files, including your own jar. Here is the configuration snippet:

		  	<plugin>
		  	    <groupId>com.jolira</groupId>
		  	    <artifactId>onejar-maven-plugin</artifactId>
		  	    <version>1.4.4</version>
		  	    <executions>
		  	        <execution>
		  	            <configuration>
		  	                <attachToBuild>true</attachToBuild>
		  	                <classifier>onejar</classifier>
		  	            </configuration>
		  	            <goals>
		  	                <goal>one-jar</goal>
		  	            </goals>
		  	        </execution>
		  	    </executions>
		  	</plugin>

The following associated plugins may also be needed:

			<plugin>
				<groupId>org.apache.maven.plugins</groupId>
				<artifactId>maven-compiler-plugin</artifactId>
				<version>2.3.2</version>
				<configuration>
					<source>1.6</source>
					<target>1.6</target>
					<encoding>UTF-8</encoding>
				</configuration>
			</plugin>
		   	<plugin>
		  	    <groupId>org.apache.maven.plugins</groupId>
		  	    <artifactId>maven-jar-plugin</artifactId>
		  	    <configuration>
		  	        <archive>
		  	            <manifest>
		  	                <mainClass>com.wordpress.alexzeng.automation</mainClass>
		  	            </manifest>
		  	        </archive>
		  	    </configuration>
		  	</plugin>

It works well as long as the dependencies are configured correctly in pom.xml. I missed ojdbc6.jar at first.

Method 3: Using maven-dependency-plugin, corresponding to the 3rd option in Eclipse.
I tried this in my project, but it failed somehow. I have 23 jar files in the final lib directory, but only 19 of them were listed in the Class-Path of META-INF/MANIFEST.MF. It refused to work even after I added the jar files to the CLASSPATH environment variable. That is expected if the jar is started with java -jar, because then the CLASSPATH variable is ignored and only the manifest Class-Path counts. I still think this method should work, and it is the best one for big projects (you don’t want to have a very big single jar file). You can refer to http://www.mkyong.com/maven/how-to-create-a-jar-file-with-maven/ for a simple case.
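To find out which jars are missing from the manifest, the Class-Path entries can be listed and compared with the contents of the lib directory. The sketch below only handles the manifest format itself, where long lines are wrapped and continuation lines start with a single space. The helper name manifest_classpath and the paths in the usage comment are hypothetical.

```shell
#!/bin/sh
# manifest_classpath
# Reads a MANIFEST.MF on stdin and prints its Class-Path entries, one per line.
# Wrapped (continuation) lines start with one space and are joined first.
manifest_classpath() {
  tr -d '\r' |
  awk '
    /^Class-Path:/ { in_cp = 1; sub(/^Class-Path: ?/, ""); buf = $0; next }
    in_cp && /^ /  { buf = buf substr($0, 2); next }
    in_cp          { in_cp = 0 }
    END            { n = split(buf, parts, " "); for (i = 1; i <= n; i++) print parts[i] }
  '
}

# Usage (target/app.jar and target/lib are hypothetical paths):
#   unzip -p target/app.jar META-INF/MANIFEST.MF | manifest_classpath | sort
#   ls target/lib | sort
```

Comparing the two sorted lists shows which jar files are in the lib directory but not in the Class-Path.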

Last but not least, there is also a Maven counterpart of the 1st Eclipse option: maven-assembly-plugin with its predefined descriptor jar-with-dependencies. It’s really simple and handy, as long as extracting the dependency jars doesn’t cause trouble.

			<plugin>
				<groupId>org.apache.maven.plugins</groupId>
				<artifactId>maven-assembly-plugin</artifactId>
				<version>2.4</version>
				<configuration>
					<descriptorRefs>
						<descriptorRef>jar-with-dependencies</descriptorRef>
					</descriptorRefs>
					<archive>
						<manifest>
							<mainClass>com.wordpress.alexzeng.automation</mainClass>
						</manifest>
					</archive>
				</configuration>
				<executions>
					<execution>
						<id>make-assembly</id>
						<phase>package</phase>
						<goals>
							<goal>single</goal>
						</goals>
					</execution>
				</executions>
			</plugin>

As a newbie, I spent a lot of time getting this clear. I hope it can save you some time too if you need to do the same.

Debug Cassandra JVM thread 100% CPU usage issue

Recently, one of our Cassandra nodes ran into an issue: one CPU thread was at 100% utilization while the others were almost idle. The node showed “Down” in nodetool now and then.

Java supports threads, and in theory its threads can run on different CPU threads without problems. So why is one CPU at 100% while the others are idle? What is the 100% busy CPU doing? Here is what I did:

1. Find out the thread ID that uses 100% CPU
On this Linux host, I run top to see the CPU usage, then press 1 to show the usage of each CPU thread, then press H to show the threads of each process. When one CPU thread is at 100% usage, I can see the top thread as below.

 35376 cassandr  20   0 28.8g  10g 1.2g R 99.7  8.0  11:13.08 java

Its CPU usage is 99.7%. With H enabled, the first column (35376) is the thread ID.
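The same information can be collected without the interactive keys, which is handy in a script. A minimal sketch, assuming the procps version of ps found on most Linux hosts. The helper name busiest_tid is made up, and the PID in the usage comment is only an example. Note that ps reports %CPU averaged over the lifetime of the thread, so for a short spike the batch mode of top (top -H -b -n 1 -p PID) is closer to what the interactive screen shows.

```shell
#!/bin/sh
# busiest_tid
# Reads "TID %CPU" pairs on stdin (one thread per line) and prints the TID
# of the thread with the highest %CPU.
busiest_tid() {
  sort -k2,2nr | head -n 1 | awk '{ print $1 }'
}

# Usage on Linux (35339 is an example PID of the java process):
#   ps -L -o tid=,pcpu= -p 35339 | busiest_tid
```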

2. Find out what the thread is doing
I use “jstack -l <pid>” to dump the call stacks of all JVM threads, then look for the thread with ID 35376. The thread ID in the jstack dump file (the nid field) is in hexadecimal format, while it’s in decimal format in the top output. Decimal 35376 equals 8a30 in hex. Here it is in the output:

...
"VM Thread" prio=10 tid=0x00007f2a78313000 nid=0x8a30 runnable
...
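The decimal-to-hex conversion can be done with printf instead of a calculator. A small sketch; the function name tid_to_nid is made up for illustration and the PID in the usage comment is a placeholder.

```shell
#!/bin/sh
# tid_to_nid TID
# Prints the decimal thread ID from top in the "nid=0x..." form used by jstack.
tid_to_nid() {
  printf 'nid=0x%x\n' "$1"
}

tid_to_nid 35376   # prints nid=0x8a30

# Usage (replace <pid> with the PID of the java process):
#   jstack -l <pid> | grep "$(tid_to_nid 35376)"
```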

So I know that “VM Thread” is the culprit. From my basic understanding, its main job is GC (garbage collection). If there is no memory leak in the application’s Java code, GC can be improved by adjusting the JVM heap size parameters. First I need to check the current heap usage.

3. Get JVM heap usage
We can just run “jmap -heap <pid>”:

$ ./jmap -heap 122576
Attaching to process ID 122576, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 20.10-b01

using parallel threads in the new generation.
using thread-local object allocation.
Concurrent Mark-Sweep GC

Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize      = 8589934592 (8192.0MB)
   NewSize          = 2147483648 (2048.0MB)
   MaxNewSize       = 2147483648 (2048.0MB)
   OldSize          = 5439488 (5.1875MB)
   NewRatio         = 2
   SurvivorRatio    = 8
   PermSize         = 21757952 (20.75MB)
   MaxPermSize      = 85983232 (82.0MB)

Heap Usage:
New Generation (Eden + 1 Survivor Space):
   capacity = 1932787712 (1843.25MB)
   used     = 1932773920 (1843.2368469238281MB)
   free     = 13792 (0.013153076171875MB)
   99.99928641930438% used
Eden Space:
   capacity = 1718091776 (1638.5MB)
   used     = 1718091768 (1638.4999923706055MB)
   free     = 8 (7.62939453125E-6MB)
   99.99999953436713% used
From Space:
   capacity = 214695936 (204.75MB)
   used     = 214682152 (204.73685455322266MB)
   free     = 13784 (0.01314544677734375MB)
   99.99357975737371% used
To Space:
   capacity = 214695936 (204.75MB)
   used     = 0 (0.0MB)
   free     = 214695936 (204.75MB)
   0.0% used
concurrent mark-sweep generation:
   capacity = 6442450944 (6144.0MB)
   used     = 6442450896 (6143.999954223633MB)
   free     = 48 (4.57763671875E-5MB)
   99.99999925494194% used
Perm Generation:
   capacity = 58068992 (55.37890625MB)
   used     = 34732264 (33.123268127441406MB)
   free     = 23336728 (22.255638122558594MB)
   59.81206630898639% used

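The New Generation is 99.999% used. This could be a problem, because when the code creates a new object it needs to get memory from this area. If it’s full, the JVM needs to scan the memory area to release memory, and until that is done the program can do nothing but wait. The old (concurrent mark-sweep) generation is also 99.99% used, so there is no room to promote surviving objects either.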
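The sizes in the jmap output follow from NewSize and SurvivorRatio: the new generation is split into eden plus two survivor spaces, and SurvivorRatio is the size of eden relative to one survivor space. A rough sketch of that arithmetic in whole megabytes; gen_layout is a made-up helper name. The values jmap reports differ slightly (1638.5MB and 204.75MB), presumably because of rounding inside the JVM.

```shell
#!/bin/sh
# gen_layout NEW_MB SURVIVOR_RATIO
# Splits the new generation into eden plus two survivor spaces.
# "usable" is eden + one survivor space, which is what jmap reports as the
# New Generation capacity. Integer arithmetic, so results are rounded down.
gen_layout() {
  new_mb=$1
  ratio=$2
  survivor=$(( new_mb / (ratio + 2) ))
  eden=$(( new_mb * ratio / (ratio + 2) ))
  echo "eden=${eden}MB survivor=${survivor}MB usable=$(( eden + survivor ))MB"
}

gen_layout 2048 8   # prints eden=1638MB survivor=204MB usable=1842MB
```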

4. Adjust JVM heap size
I set these parameters in cassandra-env.sh:

MAX_HEAP_SIZE="16G"
HEAP_NEWSIZE="4G"

The script then puts them into the JVM options as follows:

JVM_OPTS="$JVM_OPTS -Xms${MAX_HEAP_SIZE}"
JVM_OPTS="$JVM_OPTS -Xmx${MAX_HEAP_SIZE}"
JVM_OPTS="$JVM_OPTS -Xmn${HEAP_NEWSIZE}"

5. Check result
After restarting Cassandra and letting it run for a while, I ran jmap again:

$ ./jmap -heap 35339
Attaching to process ID 35339, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 20.10-b01

using parallel threads in the new generation.
using thread-local object allocation.
Concurrent Mark-Sweep GC

Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize      = 17179869184 (16384.0MB)
   NewSize          = 4294967296 (4096.0MB)
   MaxNewSize       = 4294967296 (4096.0MB)
   OldSize          = 5439488 (5.1875MB)
   NewRatio         = 2
   SurvivorRatio    = 8
   PermSize         = 21757952 (20.75MB)
   MaxPermSize      = 85983232 (82.0MB)

Heap Usage:
New Generation (Eden + 1 Survivor Space):
   capacity = 3865509888 (3686.4375MB)
   used     = 1271912480 (1212.9902648925781MB)
   free     = 2593597408 (2473.447235107422MB)
   32.90413210294704% used
Eden Space:
   capacity = 3436052480 (3276.875MB)
   used     = 1013629048 (966.671989440918MB)
   free     = 2422423432 (2310.203010559082MB)
   29.499812761881913% used
From Space:
   capacity = 429457408 (409.5625MB)
   used     = 258283432 (246.31827545166016MB)
   free     = 171173976 (163.24422454833984MB)
   60.14180386428449% used
To Space:
   capacity = 429457408 (409.5625MB)
   used     = 0 (0.0MB)
   free     = 429457408 (409.5625MB)
   0.0% used
concurrent mark-sweep generation:
   capacity = 12884901888 (12288.0MB)
   used     = 3744285600 (3570.8290100097656MB)
   free     = 9140616288 (8717.170989990234MB)
   29.059480875730515% used
Perm Generation:
   capacity = 62611456 (59.7109375MB)
   used     = 37563512 (35.82335662841797MB)
   free     = 25047944 (23.88758087158203MB)
   59.99463101449038% used

In another case, with Tomcat and a home-grown Java application, we got a similar 100% CPU usage on one thread. The solution was the opposite: I reduced the JVM heap size, because the default maximum JVM heap size there was 32GB, which is much more than enough. The program performed well at the start, but it ran slower and slower over time. Setting the heap size too large may cause a similar problem, because the bigger the JVM heap, the more work GC needs to do, similar to the Oracle shared pool size.
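To watch how the heap fills up over time, instead of taking a single jmap snapshot, jstat from the JDK can print utilization at regular intervals. A minimal sketch, assuming the jstat -gcutil layout where the fourth column (O) is the old generation utilization in percent. The helper name old_gen_full and the 95% limit are arbitrary choices for illustration.

```shell
#!/bin/sh
# old_gen_full LIMIT
# Reads `jstat -gcutil` output on stdin (first line is the header) and
# succeeds if any sample shows old generation utilization (4th column, "O")
# at or above LIMIT percent.
old_gen_full() {
  awk -v limit="$1" '
    NR > 1 && $4 + 0 >= limit + 0 { found = 1 }
    END { exit found ? 0 : 1 }
  '
}

# Usage (replace <pid> with the java PID; 5 samples, one per second):
#   jstat -gcutil <pid> 1000 5 | old_gen_full 95 && echo 'old generation is nearly full'
```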

jmap and jstack are included in the JDK. If you only have a JRE, you need to download the JDK of exactly the same version (including the minor version) as that JRE, otherwise the tools will complain about an incompatible version.

A kind reminder: I am a newbie to Java, so my understanding may be wrong. Please use your own judgement on the contents.
