A simple setup to troubleshoot pod-level performance issues

Shi
Published in CI/CD/DevOps
Oct 3, 2023

Sometimes, when developing a containerised application, we need to collect pod-level metrics to study the impact of a design decision. In this experiment, we collect CPU and memory metrics to study the impact of a JVM agent on the application, comparing usage with and without the agent.

I will start by building a Docker image of WebGoat 8, the popular Java security benchmark. I am building a new image from scratch because the existing image on Docker Hub (https://hub.docker.com/r/webgoat/webgoat-8.0/) uses a very old base image, which makes any dependency update very difficult (yes, trust me, I learnt the hard way).

FROM debian:11-slim
LABEL NAME="WebGoat: A deliberately insecure Web Application"

# Install tools and the Java runtime in one layer, then clear the apt cache
RUN apt-get update && apt-get -yq install \
    unzip \
    curl \
    wget \
    openjdk-17-jre \
    && rm -rf /var/lib/apt/lists/*

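# Create an unprivileged user; give group 0 the same rights on its home directory
RUN useradd -ms /bin/bash webgoat && \
    chgrp -R 0 /home/webgoat && \
    chmod -R g=u /home/webgoat
USER webgoat

# Assumption: URL (base address of the server hosting the agent) is supplied
# at build time with --build-arg URL=...
ARG URL
WORKDIR /tmp
# -k skips TLS certificate verification for the download
RUN curl -k -o agent.zip "${URL}/rest/api/latest/installers/agents/binaries/JAVA"
RUN mkdir -p /tmp/ss && unzip -d /tmp/ss agent.zip
# The JVM reads JAVA_TOOL_OPTIONS at startup, so the agent is attached by default
ENV JAVA_TOOL_OPTIONS="-javaagent:/tmp/ss/agent.jar"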

WORKDIR /home/webgoat
RUN wget https://github.com/WebGoat/WebGoat/releases/download/v2023.4/webgoat-2023.4.jar -O /home/webgoat/webgoat.jar

# Each --add-opens opens a JDK-internal package to reflection from the unnamed module
ENTRYPOINT [ "java", \
    "-Duser.home=/home/webgoat", \
    "-Dfile.encoding=UTF-8", \
    "--add-opens", "java.base/java.lang=ALL-UNNAMED", \
    "--add-opens", "java.base/java.util=ALL-UNNAMED", \
    "--add-opens", "java.base/java.lang.reflect=ALL-UNNAMED", \
    "--add-opens", "java.base/java.text=ALL-UNNAMED", \
    "--add-opens", "java.desktop/java.beans=ALL-UNNAMED", \
    "--add-opens", "java.desktop/java.awt.font=ALL-UNNAMED", \
    "--add-opens", "java.base/sun.nio.ch=ALL-UNNAMED", \
    "--add-opens", "java.base/java.io=ALL-UNNAMED", \
    "-Drunning.in.docker=true", \
    "-Dwebgoat.host=0.0.0.0", \
    "-Dwebwolf.host=0.0.0.0", \
    "-Dwebgoat.port=8080", \
    "-Dwebwolf.port=9090", \
    "-jar", "webgoat.jar" ]

Then we can define two pods based on this Dockerfile and put them in the same namespace, behind the same service, with the same resource requests and limits. The first, pod0, runs with the agent; the second, pod1, runs without it because its JAVA_TOOL_OPTIONS is reset to an empty string.

apiVersion: v1
kind: Namespace
metadata:
  name: ss
---
apiVersion: v1
data:
  .dockerconfigjson: xxxxxxxxxxxxxxxx
kind: Secret
metadata:
  name: regcred
  namespace: ss
type: kubernetes.io/dockerconfigjson
---
apiVersion: v1
kind: Pod
metadata:
  name: pod0
  namespace: ss
  labels:
    type: withss
    app: webgoat
spec:
  containers:
  - name: c0
    image: registry.gitlab.com/webgoat8-ss:2023.8.0
    resources:
      limits:
        cpu: 800m
        memory: 1024Mi
      requests:
        cpu: 500m
        memory: 1024Mi
  imagePullSecrets:
  - name: regcred
---
apiVersion: v1
kind: Pod
metadata:
  name: pod1
  namespace: ss
  labels:
    type: noss
    app: webgoat
spec:
  containers:
  - name: c1
    image: registry.gitlab.com/webgoat8-ss:2023.8.0
    env:
    - name: JAVA_TOOL_OPTIONS
      value: ""
    resources:
      limits:
        cpu: 800m
        memory: 1024Mi
      requests:
        cpu: 500m
        memory: 1024Mi
  imagePullSecrets:
  - name: regcred
---
apiVersion: v1
kind: Service
metadata:
  name: webgoatsvc
  namespace: ss
spec:
  ports:
  - name: 80-80
    port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    app: webgoat
  type: NodePort

Then we can deploy the application to Azure Kubernetes Service (AKS).
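
Once both pods are running, it is worth confirming that the agent is attached to pod0 only. The following is a minimal sketch, assuming the JVM's usual startup notice ("Picked up JAVA_TOOL_OPTIONS: ...") shows up in the container log. The function name agent_status is a hypothetical helper, not part of kubectl or WebGoat.

```shell
# Hypothetical helper: read JVM startup output on stdin and report whether
# a -javaagent option was picked up from JAVA_TOOL_OPTIONS.
agent_status() {
    if grep -q 'Picked up JAVA_TOOL_OPTIONS:.*-javaagent'; then
        echo "agent loaded"
    else
        echo "no agent"
    fi
}

# Usage against the two pods (requires access to the cluster):
#   kubectl logs pod0 -n ss | agent_status
#   kubectl logs pod1 -n ss | agent_status
```

If the setup works as intended, pod0 should report "agent loaded" and pod1 "no agent"; any other combination suggests the environment override did not take effect.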

How do we generate the test workload? We can look up the IP of the service and then launch a curl pod that sends a steady stream of requests to the login page. I expect the traffic to be load-balanced roughly evenly (50:50) across the two pods.


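$ kubectl get svc -n ss
NAME         TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
webgoatsvc   NodePort   10.96.142.182   <none>        8080:30029/TCP   9m24s

$ kubectl run curlpod --rm -it --image=curlimages/curl -- sh

# Inside the curl pod: request the login page 100000 times, about 10 ms apart
for i in $(seq 1 100000); do
    curl -s -o /dev/null 10.96.142.182:8080/WebGoat/login   # discard the response body
    echo "$i"                                               # progress counter
    sleep 0.01
done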
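
The loop above discards every response, so failed requests go unnoticed. A small variation, sketched here under the assumption that only the HTTP status code matters, tallies the codes instead. Both fetch_code and tally_codes are hypothetical helper names; curl reports 000 when a request fails or times out.

```shell
# Hypothetical helper: print only the HTTP status code of one request.
# -o /dev/null drops the body, -w prints the status, --max-time bounds a hung request.
fetch_code() {
    curl -s -o /dev/null --max-time 5 -w '%{http_code}\n' "$1"
}

# Hypothetical helper: read one status code per line on stdin and
# print "code count" pairs, sorted by code.
tally_codes() {
    awk '{ count[$1]++ } END { for (code in count) print code, count[code] }' | sort
}

# Example run from inside the curl pod, using the service IP shown earlier:
#   for i in $(seq 1 1000); do
#       fetch_code 10.96.142.182:8080/WebGoat/login
#       sleep 0.01
#   done | tally_codes
```

A tally dominated by a single success code suggests both pods are serving requests; a noticeable share of 000 or 5xx entries is a hint that one pod is struggling under its resource limits.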

Next, we need to visualize the pods' memory and CPU usage in Grafana for easy comparison.

We can use the managed Grafana and Prometheus services attached to an Azure Monitor workspace to monitor the memory and CPU consumption of pods deployed in the AKS cluster.

Note that by default pod0 and pod1 are displayed on two separate pages, so we need to customize the queries to superimpose them in the same chart.

One gotcha here: by default, the two metrics are stacked, not overlaid. If you want them overlaid, there is a Graph style option in Grafana to achieve this.
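
As an illustration of what such customized queries might look like, the sketch below prints two PromQL expressions that return one series per pod, so both pods land in a single panel. It assumes the standard cAdvisor metric names (container_cpu_usage_seconds_total, container_memory_working_set_bytes) and the namespace, pod and container labels are available in the workspace; cpu_query and mem_query are hypothetical helper names.

```shell
# Hypothetical helpers that print PromQL expressions for a Grafana panel.
# Assumption: cAdvisor metrics with namespace/pod/container labels are scraped.
# The container!="" matcher skips the pod-level aggregate series.

# CPU cores used per pod, as a rate over a 5-minute window.
cpu_query() {
    printf 'sum by (pod) (rate(container_cpu_usage_seconds_total{namespace="%s", pod=~"%s", container!=""}[5m]))\n' "$1" "$2"
}

# Working-set memory in bytes per pod.
mem_query() {
    printf 'sum by (pod) (container_memory_working_set_bytes{namespace="%s", pod=~"%s", container!=""})\n' "$1" "$2"
}

# Print the expressions for the two pods in this experiment.
cpu_query ss 'pod0|pod1'
mem_query ss 'pod0|pod1'
```

Pasting each printed expression into its own panel query gives one line per pod, which makes the difference between the agent and no-agent runs easy to read off.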
