How to Make It Easy and Simple to Start Java Processes in Linux/Docker
Check out this tutorial that will help you simplify the process of starting Java programs by cutting down on some of the scripts and syntax.
One of our colleagues works as a DevOps engineer and often deals with automating the installation and configuration of various IT systems in environments ranging from containers to the cloud. He has worked with many systems based on the Java stack, from small ones (like Tomcat) to large ones (Hadoop, Cassandra, etc.).
Almost every such system, even the simplest, for some reason had a complex, unique launch system. At a minimum, these were multi-line shell scripts, as in Tomcat; at worst, entire frameworks, as in Hadoop. Our current "patient" in this series, the one that inspired us to write this article, is the Nexus OSS 3 artifact repository, whose launch script takes about 400 lines of code.
The opacity, redundancy, and entanglement of startup scripts create problems even when manually installing a single component on a local system. Now imagine that a set of such components and services needs to be packaged into a Docker container, wrapped in yet another layer of abstraction for more or less adequate orchestration, deployed to a Kubernetes cluster, and the whole process implemented as a CI/CD pipeline.
In short, using the Nexus 3 example mentioned above, let's look at how to get from the labyrinth of shell scripts back to something closer to java -jar <program.jar>, given the convenient modern DevOps tools.
Why Such Complexity?
In a nutshell: in ancient times, when a mention of UNIX was not yet met with "you mean Linux?", there was no Systemd, no Docker, and so on. To manage processes, we used portable shell scripts (init scripts) and PID files. Init scripts set the necessary environment settings, which differed from one UNIX to another, and, depending on the arguments, started the process or restarted/stopped it using the ID from the PID file. The approach is simple and straightforward, but these scripts break in every non-standard situation and then require manual intervention, they do not allow running several copies of the process, and so on. But that is beside the point.
So, if you look closely at the above-mentioned startup scripts in Java projects, you can see obvious signs of this prehistoric approach, down to mentions of SunOS, HP-UX, and other UNIXes. Typically, these scripts do something like this:
- Use POSIX shell syntax with all its crutches for UNIX/Linux portability
- Determine the version and release of the OS via uname, /etc/*release, etc.
- Look for a JRE/JDK in secluded corners of the file system and choose the most appropriate version by tricky rules, sometimes specific to each OS
- Calculate numerical JVM parameters, for example, memory sizes (-Xms, -Xmx), the number of GC threads, etc.
- Tune the JVM via -XX parameters, taking into account the specifics of the selected JRE/JDK version
- Find their own components, libraries, and the paths to them via surrounding directories, configuration files, etc.
- Configure the environment: ulimits, environment variables, etc.
- Generate the CLASSPATH with a loop such as: for f in $path/*.jar; do CLASSPATH="${CLASSPATH}:$f"; done
- Parse command-line arguments: start|stop|restart|reload|status|...
- Assemble the Java command that ultimately needs to be executed from all of the above
And, finally, execute this Java command. Often the same notorious PID files are used along the way, explicitly or implicitly, together with nohup, special TCP ports, and other tricks from the last century (see Karaf example)
The Nexus 3 startup script is a good example of such a script.
In effect, all this scripted logic tries to replace the system administrator, who would otherwise install and configure everything manually for a particular system from beginning to end. But it is, in principle, impossible to anticipate every requirement of every possible system. So the result is the opposite: a headache both for developers, who have to maintain these scripts, and for system engineers, who later have to make sense of them. From my point of view, it is much easier for a system engineer to understand the JVM parameters once and adjust them as necessary than to untangle the intricacies of the startup scripts every time a new system is installed.
What To Do?
Simplify! KISS and YAGNI to the rescue. Especially since it is 2018 outside, which means that:
- With very few exceptions, UNIX == Linux
- The task of process management is solved both for a single server (Systemd, Docker) and for clusters (Kubernetes, etc.)
- A bunch of convenient configuration management tools have appeared (Ansible, etc.)
- Total automation has arrived in administration and is firmly entrenched: instead of manually configuring fragile, unique "snowflake servers", you can now automatically create unified, reproducible virtual machines and containers with a number of convenient tools, including the Ansible and Docker mentioned above
- Tools for collecting runtime statistics are used everywhere, both for the JVM itself (example) and for the Java application (example)
- And, most importantly, specialists have appeared: system and DevOps engineers who know how to use the above technologies and understand how to properly install the JVM on a specific system and then tune it based on collected runtime statistics
So let's go through the functionality of startup scripts once more with the listed points in mind, without trying to do the system engineer's job, and remove everything "extra" from them.
- POSIX shell syntax ⇒ /bin/bash
- OS version definition ⇒ UNIX == Linux, if there are OS-specific parameters, you can describe them in the documentation
- Search for JRE/JDK ⇒ we have only one version, and it is OpenJDK (or Oracle JDK, if really necessary); java and company are in the standard system path
- Calculation of numerical JVM parameters, JVM tuning ⇒ this can be described in the documentation for application scaling
- Search for your components and libraries ⇒ describe the structure of the application and how to configure it in the documentation
- Setting the environment ⇒ describe the requirements and features in the documentation
- Generation of CLASSPATH ⇒ -cp /path/to/my/jars/* or even, better still, an Uber-JAR
- Parsing command-line arguments ⇒ there will be no arguments; everything except startup will be handled by the process manager
- Building a Java command
- Executing a Java command
In the end, we just need to assemble and execute a Java command of the form java <opts> -jar <program.jar> using the selected process manager (Systemd, Docker, etc.). All parameters and options (<opts>) are left to the discretion of the system engineer, who will tailor them to a specific environment. If the list of options <opts> is long, you can return to the idea of a startup script, but this time as compact and declarative as possible, i.e. containing no program logic.
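To make the CLASSPATH point concrete, here is a minimal sketch that contrasts the legacy loop with the wildcard form. The function names build_classpath and wildcard_classpath and the file names a.jar and b.jar are hypothetical and exist only for this illustration; the wildcard relies on the JVM expanding dir/* to the JAR files in that directory, which is why the pattern must reach java unexpanded.

```shell
#!/bin/bash
# Legacy approach: build the class path by looping over every JAR in a directory.
build_classpath() {
  local dir=$1 cp='' f
  for f in "$dir"/*.jar; do
    [ -e "$f" ] || continue   # skip the literal pattern when no JARs exist
    cp="${cp:+$cp:}$f"        # join with ':' and avoid a leading colon
  done
  printf '%s\n' "$cp"
}

# Simplified approach: pass the wildcard through and let the JVM expand it.
wildcard_classpath() {
  printf '%s\n' "$1/*"
}

# Demo on a throwaway directory (hypothetical file names).
demo_dir=$(mktemp -d)
touch "$demo_dir/a.jar" "$demo_dir/b.jar"
build_classpath "$demo_dir"
wildcard_classpath "$demo_dir"
rm -rf "$demo_dir"
```

The second form stays the same length no matter how many JARs the directory holds, which is what makes a declarative launch script practical.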
Example
As an example, let's see how you can simplify the Nexus 3 launch script.
The easiest option is to stay out of the jungle of this script: just run it under real conditions (./nexus start) and look at the result. For example, you can find the full list of arguments of the running application in the process table (via ps -ef), or run the script in debug mode (bash -x ./nexus start) to observe the entire execution and, at the very end, the start command.
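As an alternative to reading a long ps -ef line, the arguments of a running process can be listed one per line. This is a sketch that assumes a Linux system with /proc mounted; the helper name show_cmdline is hypothetical.

```shell
#!/bin/bash
# Print the command line of a process, one argument per line.
# /proc/<pid>/cmdline stores the arguments separated by NUL bytes (Linux only).
show_cmdline() {
  local pid=$1
  tr '\0' '\n' < "/proc/$pid/cmdline"
}

# Demo: inspect the shell that is running this snippet.
show_cmdline "$$"
```

For the Nexus case you would pass the PID of the java process instead of $$, which gives a list that is already in a convenient shape for sorting and cleaning up.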
In the end, we got the following Java command:
/usr/java/jdk1.8.0_171-amd64/bin/java -server -Dinstall4j.jvmDir=/usr/java/jdk1.8.0_171-amd64 -Dexe4j.moduleName=/home/nexus/nexus-3.12.1-01/bin/nexus -XX:+UnlockDiagnosticVMOptions -Dinstall4j.launcherId=245 -Dinstall4j.swt=false -Di4jv=0 -Di4jv=0 -Di4jv=0 -Di4jv=0 -Di4jv=0 -Xms1200M -Xmx1200M -XX:MaxDirectMemorySize=2G -XX:+UnlockDiagnosticVMOptions -XX:+UnsyncloadClass -XX:+LogVMOutput -XX:LogFile=../sonatype-work/nexus3/log/jvm.log -XX:-OmitStackTraceInFastThrow -Djava.net.preferIPv4Stack=true -Dkaraf.home=. -Dkaraf.base=. -Dkaraf.etc=etc/karaf -Djava.util.logging.config.file=etc/karaf/java.util.logging.properties -Dkaraf.data=../sonatype-work/nexus3 -Djava.io.tmpdir=../sonatype-work/nexus3/tmp -Dkaraf.startLocalConsole=false -Di4j.vpt=true -classpath /home/nexus/nexus-3.12.1-01/.install4j/i4jruntime.jar:/home/nexus/nexus-3.12.1-01/lib/boot/nexus-main.jar:/home/nexus/nexus-3.12.1-01/lib/boot/org.apache.karaf.main-4.0.9.jar:/home/nexus/nexus-3.12.1-01/lib/boot/org.osgi.core-6.0.0.jar:/home/nexus/nexus-3.12.1-01/lib/boot/org.apache.karaf.diagnostic.boot-4.0.9.jar:/home/nexus/nexus-3.12.1-01/lib/boot/org.apache.karaf.jaas.boot-4.0.9.jar com.install4j.runtime.launcher.UnixLauncher start 9d17dc87 '' '' org.sonatype.nexus.karaf.NexusMain
First, let's apply a couple of simple tricks to it:
- Change /the/long/and/winding/road/to/my/java to java, because it is in the system path
- Put the list of Java parameters in a separate array, sort it, and remove duplicates
We get something more digestible:
JAVA_OPTS=(
'-server'
'-Dexe4j.moduleName=/home/nexus/nexus-3.12.1-01/bin/nexus'
'-Di4j.vpt=true'
'-Di4jv=0'
'-Dinstall4j.jvmDir=/usr/java/jdk1.8.0_171-amd64'
'-Dinstall4j.launcherId=245'
'-Dinstall4j.swt=false'
'-Djava.io.tmpdir=../sonatype-work/nexus3/tmp'
'-Djava.net.preferIPv4Stack=true'
'-Djava.util.logging.config.file=etc/karaf/java.util.logging.properties'
'-Dkaraf.base=.'
'-Dkaraf.data=../sonatype-work/nexus3'
'-Dkaraf.etc=etc/karaf'
'-Dkaraf.home=.'
'-Dkaraf.startLocalConsole=false'
'-XX:+LogVMOutput'
'-XX:+UnlockDiagnosticVMOptions'
'-XX:+UnsyncloadClass'
'-XX:-OmitStackTraceInFastThrow'
'-XX:LogFile=../sonatype-work/nexus3/log/jvm.log'
'-XX:MaxDirectMemorySize=2G'
'-Xms1200M'
'-Xmx1200M'
'-classpath /home/nexus/nexus-3.12.1-01/.install4j/i4jruntime.jar:/home/nexus/nexus-3.12.1-01/lib/boot/nexus-main.jar:/home/nexus/nexus-3.12.1-01/lib/boot/org.apache.karaf.main-4.0.9.jar:/home/nexus/nexus-3.12.1-01/lib/boot/org.osgi.core-6.0.0.jar:/home/nexus/nexus-3.12.1-01/lib/boot/org.apache.karaf.diagnostic.boot-4.0.9.jar:/home/nexus/nexus-3.12.1-01/lib/boot/'
)
java ${JAVA_OPTS[*]} com.install4j.runtime.launcher.UnixLauncher start 9d17dc87 '' '' org.sonatype.nexus.karaf.NexusMain
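The sort-and-deduplicate step does not have to be done by hand. Below is a minimal sketch, assuming bash 4 or newer (for mapfile); the function name dedupe_opts is hypothetical. It is only safe for self-contained options: a flag that takes a separate value, such as -classpath followed by its path, should be kept out of the sorted list.

```shell
#!/bin/bash
# Sort JVM options and drop exact duplicates; prints one option per line.
dedupe_opts() {
  printf '%s\n' "$@" | LC_ALL=C sort -u
}

# Sample taken from the command above, including the repeated flags.
mapfile -t JAVA_OPTS < <(dedupe_opts \
  -Di4jv=0 -Di4jv=0 -Xms1200M \
  -XX:+UnlockDiagnosticVMOptions -Xmx1200M \
  -XX:+UnlockDiagnosticVMOptions)

printf '%s\n' "${JAVA_OPTS[@]}"
```

LC_ALL=C pins the collation order, so the result does not depend on the locale of the machine where the script is run.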
Now we can dig deeper.
Install4j is a graphical Java installer. It appears to be used for the initial installation of the system. We do not need it on the server, so we remove it.
Let's agree on the location of the Nexus components and data on the file system:
- Put the application itself in /opt/nexus-<version>
- For convenience, create a symbolic link /opt/nexus -> /opt/nexus-<version>
- Place the script itself in place of the original, as /opt/nexus/bin/nexus
- Keep all Nexus data on a separate file system, mounted as /data/nexus
Creating the directories and links is a job for configuration management systems (all of it takes 5-10 lines in Ansible), so let's leave this task to system engineers.
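For readers who want to see the layout spelled out, here is a rough shell equivalent of what such a configuration management task would do. It is a sketch, not a replacement for Ansible: the helper create_layout and its root-prefix argument are hypothetical (the prefix lets the demo run against a scratch directory), the log, tmp, and data subdirectories are assumed from the JVM options used later in the final script, and mounting a separate file system at /data/nexus is left out.

```shell
#!/bin/bash
# Create the agreed directory layout under a root prefix.
create_layout() {
  local root=$1 version=$2
  mkdir -p "$root/opt/nexus-$version/bin" \
           "$root/data/nexus/log" "$root/data/nexus/tmp" "$root/data/nexus/data"
  # -s: symbolic link, -f: replace an existing link,
  # -n: treat an existing link to a directory as a file instead of descending into it
  ln -sfn "nexus-$version" "$root/opt/nexus"
}

# Demo against a scratch directory rather than the real /opt and /data.
scratch=$(mktemp -d)
create_layout "$scratch" 3.12.1-01
ls -l "$scratch/opt"
rm -rf "$scratch"
```

Because the link is relative and replaced in place, switching to a new version is a matter of unpacking it next to the old one and calling the helper again.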
Let our script change the working directory to /opt/nexus at startup; then we can make the paths to Nexus components relative.
Options like -Dkaraf.* are settings for Apache Karaf, the OSGi container in which our Nexus is evidently packaged. Let's change karaf.home, karaf.base, karaf.etc, and karaf.data according to the placement of components, using relative paths where possible.
Seeing that the CLASSPATH consists of a list of JAR files that all lie in the same lib/boot/ directory, we replace the entire list with lib/boot/* (we also have to turn off the shell's wildcard expansion using set -o noglob).
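Why noglob matters is easy to check in isolation. The sketch below uses a hypothetical helper, count_args, and a scratch directory with two empty files standing in for JARs; it counts how many words an unquoted expansion produces with and without globbing.

```shell
#!/bin/bash
# Report how many arguments were received.
count_args() { echo "$#"; }

demo_dir=$(mktemp -d)
touch "$demo_dir/a.jar" "$demo_dir/b.jar"
cp_opt="-cp $demo_dir/*"

# Default behaviour: the shell expands the wildcard into one word per file.
with_glob=$(count_args $cp_opt)

# With noglob the pattern survives intact, and the JVM expands it instead.
set -o noglob
without_glob=$(count_args $cp_opt)
set +o noglob

echo "glob on: $with_glob words, glob off: $without_glob words"
# prints "glob on: 3 words, glob off: 2 words"
rm -rf "$demo_dir"
```

With globbing left on, java would receive only the first JAR as the class path and would treat the second one as the name of the main class.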
Let's change java to exec java so that our script does not run Java as a child process (which the process manager would not see) but instead "replaces" itself with java (exec description).
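The effect of exec can be observed through process IDs. In this sketch, the helper names without_exec and with_exec are hypothetical, and an inner bash stands in for java; each helper prints the PID of the launcher and then the PID of the program it starts.

```shell
#!/bin/bash
# Without exec: the inner program runs as a child and gets a new PID.
# The trailing ':' keeps bash from optimizing the last command into an exec.
without_exec() {
  bash -c 'echo "$$"; bash -c "echo \$\$"; :'
}

# With exec: the launcher is replaced in place, so the PID stays the same.
with_exec() {
  bash -c 'echo "$$"; exec bash -c "echo \$\$"'
}

echo "without exec:"; without_exec
echo "with exec:";    with_exec
```

Keeping the same PID means that the process Systemd or Docker started is the JVM itself, so stop signals go straight to it rather than to a wrapper shell.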
Let's see what happened:
#!/bin/bash
JAVA_OPTS=(
'-Xms1200M'
'-Xmx1200M'
'-XX:+UnlockDiagnosticVMOptions'
'-XX:+LogVMOutput'
'-XX:+UnsyncloadClass'
'-XX:LogFile=/data/nexus/log/jvm.log'
'-XX:MaxDirectMemorySize=2G'
'-XX:-OmitStackTraceInFastThrow'
'-Djava.io.tmpdir=/data/nexus/tmp'
'-Djava.net.preferIPv4Stack=true'
'-Djava.util.logging.config.file=etc/karaf/java.util.logging.properties'
'-Dkaraf.home=.'
'-Dkaraf.base=.'
'-Dkaraf.etc=etc/karaf'
'-Dkaraf.data=/data/nexus/data'
'-Dkaraf.startLocalConsole=false'
'-server'
'-cp lib/boot/*'
)
set -o noglob
cd /opt/nexus \
&& exec java ${JAVA_OPTS[*]} org.sonatype.nexus.karaf.NexusMain
A total of 27 lines instead of 400: transparent, clear, declarative, with no superfluous logic. If necessary, this script can easily be turned into a template for Ansible/Puppet/Chef, adding only the logic needed for a particular situation.
This script can be used as the ENTRYPOINT in a Dockerfile or called from a Systemd unit file, where ulimits and other system parameters can be adjusted at the same time, for example:
[Unit]
Description=Nexus
After=network.target
[Service]
Type=simple
LimitNOFILE=1048576
ExecStart=/opt/nexus/bin/nexus
User=nexus
Restart=on-abort
[Install]
WantedBy=multi-user.target
Conclusion
What conclusions can be drawn from this article? In principle, it all boils down to a couple of points:
- Each system has its own purpose; there is no need to hammer nails with a microscope.
- Simplicity (KISS, YAGNI) rules: implement only what is needed for the particular situation.
And most importantly: it's great that there are IT specialists of different profiles. Let's work together and make our IT systems simpler, clearer, and better!