Kubernetes Demystified: Restrictions on Java Application Resources
Take a look at how to fix a common issue Java developers face: containerized Java applications that consume too much memory despite container resource limits.
This series of articles explores some of the common problems enterprise customers encounter when using Kubernetes.
As container technology becomes increasingly sophisticated, more and more enterprise customers are choosing Docker and Kubernetes as the foundation for their application platforms. However, these customers encounter many problems in practice. This series of articles presents some insights and best practices drawn from the Alibaba Cloud container service team's experience in helping customers navigate this process.
Regarding the containerized deployment of Java applications, some users have reported that, although they have set container resource restrictions, their active Java application containers are inexplicably killed by OOM Killer.
This problem is the result of a very common mistake: the failure to correctly set container resource restrictions and the corresponding JVM heap size.
Here, we use a Tomcat application as an example. You can obtain its instance code and Kubernetes deployment file from GitHub.
git clone https://github.com/denverdino/system-info
cd system-info
We use the following Kubernetes pod definition:
- The "app" container in the pod is an initialization container, responsible for copying the JSP application to the "webapps" directory of the Tomcat container. Note: In the image, the JSP application index.jsp is used to display JVM and system resource information.
- The Tomcat container remains active, and we have restricted its maximum memory usage to 256 MB.
apiVersion: v1
kind: Pod
metadata:
name: test
spec:
initContainers:
- image: registry.cn-hangzhou.aliyuncs.com/denverdino/system-info
name: app
imagePullPolicy: IfNotPresent
command:
- "cp"
- "-r"
- "/system-info"
- "/app"
volumeMounts:
- mountPath: /app
name: app-volume
containers:
- image: tomcat:9-jre8
name: tomcat
imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /usr/local/tomcat/webapps
name: app-volume
ports:
- containerPort: 8080
resources:
requests:
memory: "256Mi"
cpu: "500m"
limits:
memory: "256Mi"
cpu: "500m"
volumes:
- name: app-volume
emptyDir: {}
We execute the following command to deploy and test the application:
$ kubectl create -f test.yaml
pod "test" created
$ kubectl get pods test
NAME READY STATUS RESTARTS AGE
test 1/1 Running 0 28s
$ kubectl exec test curl http://localhost:8080/system-info/
...
Now we can see the system CPU, memory, and other information displayed in HTML format. We can use the html2text command to convert the information to text format.
Note: Here, we test the application on a 2C 4G node. Testing in different environments may produce slightly different results.
$ kubectl exec test curl http://localhost:8080/system-info/ | html2text
Java version Oracle Corporation 1.8.0_162
Operating system Linux 4.9.64
Server Apache Tomcat/9.0.6
Memory Used 29 of 57 MB, Max 878 MB
Physica Memory 3951 MB
CPU Cores 2
**** Memory MXBean ****
Heap Memory Usage init = 65011712(63488K) used = 19873704(19407K) committed
= 65536000(64000K) max = 921174016(899584K)
Non-Heap Memory Usage init = 2555904(2496K) used = 32944912(32172K) committed =
33882112(33088K) max = -1(-1K)
As we can see, the system memory in the container is 3,951 MB, but the maximum JVM heap size is 878 MB. Why is this the case? Didn't we set the container's memory limit to 256 MB? In this situation, once the application's memory usage exceeds 256 MB, the JVM does not trigger garbage collection (GC); instead, the JVM process is killed outright by the system's OOM killer.
The root cause of the problem:
- If we do not set a JVM heap size, the maximum heap size is set by default based on the memory size of the host environment.
- Docker containers use cgroups to limit the resources used by processes. Therefore, if the JVM in the container still uses the default settings based on the host environment memory and CPU cores, this results in incorrect JVM heap calculation.
Likewise, the default JVM GC and JIT compiler thread counts are determined by the number of host CPU cores. If we run multiple Java applications on a single node, even with CPU limits set, the GC threads of different applications may still preempt one another, affecting application performance.
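A quick back-of-the-envelope check makes the mismatch concrete. On JDK 8, ergonomics size the default maximum heap at roughly a quarter of the physical memory the JVM can see (the exact figure varies with survivor-space accounting), and without cgroup awareness that means the host's memory:

```shell
# JDK 8 ergonomics pick a default max heap of roughly 1/4 of
# the physical memory the JVM sees -- here, the entire host.
host_mem_mb=3951                       # host RAM reported in the test above
default_heap_mb=$(( host_mem_mb / 4 ))
echo "Expected default max heap: ~${default_heap_mb} MB"
```

That is the same order of magnitude as the 878 MB the JVM actually reported, and far above the 256 MB cgroup limit.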
Now that we understand the root cause of the problem, it is easy to solve it.
Solutions
Enable cgroup Resource Awareness
The Java community is also aware of this problem, and Java SE 8u131+ and JDK 9 support automatic detection of container resource restrictions. To enable this behavior, add the following parameters:
java -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap ⋯
Continuing with the preceding Tomcat
container example, we add the environment variable "JAVA_OPTS":
apiVersion: v1
kind: Pod
metadata:
name: cgrouptest
spec:
initContainers:
- image: registry.cn-hangzhou.aliyuncs.com/denverdino/system-info
name: app
imagePullPolicy: IfNotPresent
command:
- "cp"
- "-r"
- "/system-info"
- "/app"
volumeMounts:
- mountPath: /app
name: app-volume
containers:
- image: tomcat:9-jre8
name: tomcat
imagePullPolicy: IfNotPresent
env:
- name: JAVA_OPTS
value: "-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap"
volumeMounts:
- mountPath: /usr/local/tomcat/webapps
name: app-volume
ports:
- containerPort: 8080
resources:
requests:
memory: "256Mi"
cpu: "500m"
limits:
memory: "256Mi"
cpu: "500m"
volumes:
- name: app-volume
emptyDir: {}
Now, we deploy a new pod and repeat the test:
$ kubectl create -f cgroup_test.yaml
pod "cgrouptest" created
$ kubectl exec cgrouptest curl http://localhost:8080/system-info/ | html2text
Java version Oracle Corporation 1.8.0_162
Operating system Linux 4.9.64
Server Apache Tomcat/9.0.6
Memory Used 23 of 44 MB, Max 112 MB
Physica Memory 3951 MB
CPU Cores 2
**** Memory MXBean ****
Heap Memory Usage init = 8388608(8192K) used = 25280928(24688K) committed =
46661632(45568K) max = 117440512(114688K)
Non-Heap Memory Usage init = 2555904(2496K) used = 31970840(31221K) committed =
32768000(32000K) max = -1(-1K)
As we can see, the maximum JVM heap size has changed to 112 MB, ensuring the application will not be killed by the OOM killer. But this raises another question: why do we only set the maximum JVM heap memory to 112 MB if we set the maximum container memory limit to 256 MB?
The answer involves the finer points of JVM memory management. Memory consumption in the JVM includes both heap and non-heap memory. The memory required for class metadata, JIT-compiled code, thread stacks, GC, and other such purposes is allocated from non-heap memory. Therefore, based on the cgroup resource restrictions, the JVM reserves a portion of the memory for non-heap use to ensure system stability. (In the preceding example, we can see that, after starting Tomcat, non-heap memory occupies nearly 32 MB.)
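Plugging in the numbers from the test above shows how much of the limit the JVM held back for non-heap use (illustrative arithmetic only, not a JVM API):

```shell
limit_mb=256    # container memory limit
heap_mb=112     # max heap the JVM chose under the cgroup limit
echo "Reserved for non-heap and native memory: $(( limit_mb - heap_mb )) MB"
```

Roughly 144 MB remains for metaspace, thread stacks, the JIT code cache, and other native allocations.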
In the latest version, JDK 10, further optimizations and enhancements were made to JVM operations in containers.
Detecting cgroup Resource Restrictions in the Container
If we cannot use the new features of JDK 8 and 9 (for example, if we are still using old JDK 6 applications), we can use a script in the container to obtain the container's cgroup resource restrictions and use this to set the JVM heap size.
Starting with Docker 1.7, container cgroup information is mounted inside the container, allowing applications to obtain the memory, CPU, and other settings from /sys/fs/cgroup/memory/memory.limit_in_bytes
and other files. Therefore, the launch command for an application in the container can set -Xmx, -XX:ParallelGCThreads, and other parameters correctly based on the cgroup configuration.
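A minimal launch-wrapper sketch, assuming cgroup v1 paths and an arbitrary 50% heap/non-heap split (both are assumptions; tune the ratio for your workload):

```shell
#!/bin/sh
# Derive -Xmx from the container's cgroup v1 memory limit.

# Give half of the limit to the heap, keeping the rest for
# metaspace, thread stacks, the JIT code cache, and so on.
heap_mb_for_limit() {
    echo $(( $1 / 1024 / 1024 / 2 ))
}

LIMIT_FILE=/sys/fs/cgroup/memory/memory.limit_in_bytes
if [ -r "$LIMIT_FILE" ]; then
    LIMIT_BYTES=$(cat "$LIMIT_FILE")
else
    LIMIT_BYTES=$(( 256 * 1024 * 1024 ))   # fallback for illustration
fi

HEAP_MB=$(heap_mb_for_limit "$LIMIT_BYTES")
echo "java -Xmx${HEAP_MB}m ..."
```

With a 256 MiB limit this yields -Xmx128m. On JDK 10 and later, such scripts are generally unnecessary, as container support is built in and enabled by default.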
Conclusion
This article looks at a common heap setting problem arising when running Java applications in containers. Containers differ from virtual machines in that their resource restrictions are implemented using cgroups. Moreover, if internal container processes are not aware of the cgroup restrictions, memory and CPU allocation can produce resource conflicts and problems.
It is very easy to solve this problem by using the new JVM features or a custom script to correctly set the resource restrictions. These solutions address the vast majority of resource restriction problems.
However, these solutions leave one resource restriction problem unresolved for container applications. Some older monitoring tools and system commands, such as "free" and "top", still read the host's CPU and memory settings when run in a container. This means that such tools cannot accurately compute resource consumption inside containers. The common solution proposed in the community is to use LXCFS to make the container's resource visibility behave consistently with a virtual machine's. A subsequent article will discuss the use of this method on Kubernetes.
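You can see the root of this behavior on any Linux machine: /proc/meminfo is not namespaced, so tools that parse it report host totals regardless of any cgroup limit:

```shell
# free and top derive their numbers from /proc/meminfo, which always
# reflects the host, even inside a memory-limited container.
grep MemTotal /proc/meminfo
```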
Published at DZone with permission of Leona Zhang.