What Is an Uber Jar

What is an uber jar?

Über is the German word for above or over (it's actually cognate with the English over).

Hence, in this context, an uber-jar is an "over-jar", one level up from a simple JAR (a), defined as one that contains both your package and all its dependencies in one single JAR file. The name can be thought to come from the same stable as ultrageek, superman, hyperspace, and metadata, which all have similar meanings of "beyond the normal".

The advantage is that you can distribute your uber-jar and not care at all whether or not dependencies are installed at the destination, as your uber-jar actually has no dependencies.

All the dependencies of your own stuff within the uber-jar are also within that uber-jar. As are all dependencies of those dependencies. And so on.


(a) I probably shouldn't have to explain what a JAR is to a Java developer but I'll include it for completeness. It's a Java archive, basically a single file that typically contains a number of Java class files along with associated metadata and resources.

What is a shaded jar? And what are the differences and similarities between an uber jar and a shaded jar?

I'll explain what an uber JAR is first because this underpins the shading explanation.

Uber JAR

An uber JAR is a JAR which contains the contents of multiple JARs (or, less commonly, multiple other JARs themselves).

Your application will almost certainly use other packages and these packages might be provided as JARs. When using Maven these dependencies would be expressed as follows:

<dependency>
    <groupId>...</groupId>
    <artifactId>...</artifactId>
    <version>...</version>
</dependency>

At runtime your application will expect to find the classes contained in this JAR on its classpath.

Rather than shipping each of these dependent JARs along with your application, you could create an uber JAR which contains all of the classes and resources from these dependent JARs, and then simply run your application from this uber JAR.
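
As a rough sketch of how this can look in a Maven build, the fragment below uses the maven-assembly-plugin with its predefined jar-with-dependencies descriptor. The plugin version and the main class com.example.Main are assumptions for illustration; substitute whatever your project actually uses.

```xml
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-assembly-plugin</artifactId>
    <!-- Version is an assumption; use the one your build pins. -->
    <version>3.6.0</version>
    <configuration>
        <descriptorRefs>
            <!-- Predefined descriptor: unpacks all runtime dependencies into one JAR. -->
            <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
        <archive>
            <manifest>
                <!-- Hypothetical entry point, so the JAR can be started with java -jar. -->
                <mainClass>com.example.Main</mainClass>
            </manifest>
        </archive>
    </configuration>
    <executions>
        <execution>
            <id>make-assembly</id>
            <!-- Build the uber JAR as part of "mvn package". -->
            <phase>package</phase>
            <goals>
                <goal>single</goal>
            </goals>
        </execution>
    </executions>
</plugin>
```

With this in place, mvn package should produce an additional artifact with the suffix -jar-with-dependencies next to the regular JAR.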

Shading

Shading provides a way of creating an uber JAR and renaming the packages which that uber JAR contains. If your uber JAR is likely to be used as a dependency in another application then there's a risk that the versions of the dependent classes in the uber JAR might clash with versions of those same dependencies in this other application. Shading helps to avoid any such issue by renaming the packages within the uber JAR.

For example:

  1. You create an uber JAR which contains v1.0.0 of the Foo library.
  2. Someone else uses your uber JAR in their application, Bar.
  3. The Bar application has its own dependency on Foo but on v1.2.0 of that library.

Now, if versions 1.0.0 and 1.2.0 of Foo are incompatible in any way, we may have a problem: the owner of Bar cannot know which version will be loaded, so either their code or your code (when running within their application) may misbehave.

Shading helps to avoid issues such as this and also allows the provider of the uber JAR to be explicit about the versions of the dependent libraries it uses.

The maven-shade-plugin allows you to (a) create an uber JAR and (b) shade its contents.
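
A minimal sketch of a shading setup for the Foo example above might look like the following. The package names com.foo and com.mylib.shaded.com.foo are hypothetical, and the plugin version is an assumption; only the relocation mechanism itself is the point here.

```xml
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <!-- Version is an assumption; use the one your build pins. -->
    <version>3.5.1</version>
    <executions>
        <execution>
            <!-- Run the shade goal during "mvn package". -->
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <relocations>
                    <relocation>
                        <!-- Hypothetical package of the bundled Foo v1.0.0 classes. -->
                        <pattern>com.foo</pattern>
                        <!-- Hypothetical private package they are moved to in the uber JAR. -->
                        <shadedPattern>com.mylib.shaded.com.foo</shadedPattern>
                    </relocation>
                </relocations>
            </configuration>
        </execution>
    </executions>
</plugin>
```

The plugin rewrites both the bundled classes and the references to them in your own bytecode, so Bar's copy of Foo v1.2.0 under com.foo no longer collides with the relocated copy.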

Summary

Creating an uber JAR is a useful technique for simplifying your deployment process.

Shading is an extension to the uber JAR idea which is typically limited to use cases where:

  • The JAR is a library to be used inside another application/library
  • The authors of the JAR want to be sure that the dependencies used by the JAR are in their control
  • The authors of the JAR want to avoid 'version clash' issues for any applications/libraries using the JAR

What is a fat JAR?

A fat jar is a jar that contains the classes from all the libraries your project depends on and, of course, the classes of the current project itself.

Each build system creates a fat jar differently. In Gradle, for example, one would create it with a task like this:

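// Targets recent Gradle versions, where 'baseName' and the 'compile'
// configuration are no longer available.
task fatJar(type: Jar) {
    manifest {
        attributes 'Main-Class': 'com.example.Main'
    }
    archiveBaseName = project.name + '-all'
    // Dependencies often ship overlapping files (e.g. under META-INF).
    duplicatesStrategy = DuplicatesStrategy.EXCLUDE
    // Unpack every runtime dependency into the archive.
    from { configurations.runtimeClasspath.collect { it.isDirectory() ? it : zipTree(it) } }
    // Add the project's own classes and resources.
    with jar
}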

In Maven it can be done this way (after setting up the regular jar):

<pluginRepositories>
    <pluginRepository>
        <id>onejar-maven-plugin.googlecode.com</id>
        <url>http://onejar-maven-plugin.googlecode.com/svn/mavenrepo</url>
    </pluginRepository>
</pluginRepositories>

<plugin>
    <groupId>org.dstovall</groupId>
    <artifactId>onejar-maven-plugin</artifactId>
    <version>1.4.4</version>
    <executions>
        <execution>
            <configuration>
                <onejarVersion>0.97</onejarVersion>
                <classifier>onejar</classifier>
            </configuration>
            <goals>
                <goal>one-jar</goal>
            </goals>
        </execution>
    </executions>
</plugin>

Why do two Java/Scala uber jars running on the same cluster bump into shading issues?

A JAR is just a ZIP archive of directories containing .class files and application resources.

An uber JAR simply takes all your dependencies (other JARs, extracted) together with your compilation output and puts them in a single archive, so that whatever uses it doesn't have to fetch other JARs.

If you build 2 uber JARs with different versions of the same dependency and then try to load them in the same ClassLoader at once, you will run into issues because there will be 2 .class files for the same class.

So if you always intend to deploy to the same cluster, just bundle them together. And if you want to deploy them separately, it would be easier not to build 2 uber JARs, because their dependencies will overlap. You could, for example, build 2 uber JARs of dependencies and make your code depend on them (and then have 2 JARs on the class path), or use whatever other strategy avoids conflicts.
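
One such strategy, offered here as an assumption rather than something the answer above prescribes, is to keep a library the cluster already supplies out of each uber JAR by declaring it with Maven's provided scope. The coordinates com.example:foo:1.2.0 below are hypothetical.

```xml
<dependency>
    <!-- Hypothetical coordinates of a library that is already on the cluster's classpath. -->
    <groupId>com.example</groupId>
    <artifactId>foo</artifactId>
    <version>1.2.0</version>
    <!-- Available at compile time, but left out of the packaged uber JAR. -->
    <scope>provided</scope>
</dependency>
```

Plugins such as maven-shade-plugin bundle only compile- and runtime-scoped dependencies, so both uber JARs would then rely on the single copy of the library that the cluster provides instead of each carrying its own.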


