Protecting Go Applications: Limiting the Number of Requests and Memory Consumption
Let's discuss how you can limit the number of requests to your Go application on the application side or Istio side, and how to limit the amount of memory consumed.
If you're writing a backend in Go, you've probably thought about how to limit the number of requests to your application. This problem can be solved in several ways. For instance, if you have AWS WAF, Cloudflare WAF, or any other WAF, you can set request limits for a specific endpoint at the WAF level. However, there are other ways to solve this problem. In this article, we'll discuss how to address this issue at the application level or via a proxy in front of the application. We'll also discuss how to limit the amount of memory consumed by your application.
Our Go application will have two endpoints: /foo and /bar. /foo will only accept POST requests, while /bar will only accept GET requests. Here's the initial code of our application.
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"log"
	"net/http"
)

type User struct {
	Name    string `json:"name"`
	Address string `json:"address"`
}

type FooHandler struct{}
func (h *FooHandler) ServeHTTP(writer http.ResponseWriter, request *http.Request) {
	var (
		msgPrefix = "FooHandler.ServeHTTP"
		user      User
	)
	log.Printf("%s: request received\n", msgPrefix)
	// The status codes written in the error branches below are assumed.
	if request.Method != http.MethodPost {
		writer.WriteHeader(http.StatusMethodNotAllowed)
		return
	}
	b, err := io.ReadAll(request.Body)
	if err != nil {
		log.Printf("%s: io.ReadAll error %v\n", msgPrefix, err)
		writer.WriteHeader(http.StatusInternalServerError)
		return
	}
	if err := json.Unmarshal(b, &user); err != nil {
		log.Printf("%s: json.Unmarshal error %v\n", msgPrefix, err)
		writer.WriteHeader(http.StatusBadRequest)
		return
	}
	_, err = fmt.Fprintf(writer, "Hello %s, this is /foo handler\n", user.Name)
	if err != nil {
		log.Printf("%s: fmt.Fprintf error %v\n", msgPrefix, err)
	}
}

type BarHandler struct{}
func (h *BarHandler) ServeHTTP(writer http.ResponseWriter, request *http.Request) {
	var (
		msgPrefix = "BarHandler.ServeHTTP"
	)
	log.Printf("%s: request received\n", msgPrefix)
	if request.Method != http.MethodGet {
		writer.WriteHeader(http.StatusMethodNotAllowed)
		return
	}
	_, err := fmt.Fprint(writer, "response from /bar handler\n")
	if err != nil {
		log.Printf("%s: fmt.Fprint error %v\n", msgPrefix, err)
	}
}

func main() {
	var (
		fooHandler = &FooHandler{}
		barHandler = &BarHandler{}
	)
	mux := http.NewServeMux()
	mux.Handle("/foo", fooHandler)
	mux.Handle("/bar", barHandler)
	log.Println("http server is starting on 9080")
	log.Fatal(http.ListenAndServe(":9080", mux))
}
These are two very simple endpoints. /foo accepts POST requests, reads the entire body of the POST, and deserializes it into a User structure. /bar simply accepts GET requests and always responds with the same data. In the first part of the article, we'll discuss limiting the number of requests to the application, and in the second part, we'll discuss limiting the memory consumed by the POST endpoint.
Limit the Number of Requests
So, let's start with limiting the number of requests to the application.
Application Level Limiter
First, we will try to solve this problem at the application level, using the go-limiter library.
We start by creating an instance of the limiter.Store interface, specifying the time interval and the number of tokens allowed within that interval.
store, err := memorystore.New(&memorystore.Config{
	Tokens:   1,
	Interval: time.Minute,
})
As evident from the name, the store we create holds request statistics in the application's memory. This is a decent default option, but it's important to remember that it can lead to several issues:
- The rate limiter's own data lives in the application's memory, so with many distinct keys the application may consume too much memory
- Each instance of the application keeps its own counters, so two different instances will limit requests independently of each other
- Rate limiting may not work as expected with frequent deployments, since each new deployment results in a new pod, resetting the request counters. The same applies to autoscaling
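To make the "tokens per interval" idea concrete, here is a minimal sketch of an in-memory limiter built only on the standard library. It is not how go-limiter is implemented; the type fixedWindowLimiter and its methods are hypothetical names, and the simple fixed-window logic is an assumption chosen for readability.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// window records when a key's current interval began and how many
// tokens are still available in it.
type window struct {
	start     time.Time
	remaining uint64
}

// fixedWindowLimiter is a hypothetical, simplified stand-in for an
// in-memory store: every key gets `tokens` requests per `interval`.
type fixedWindowLimiter struct {
	mu       sync.Mutex
	tokens   uint64
	interval time.Duration
	windows  map[string]*window
}

func newFixedWindowLimiter(tokens uint64, interval time.Duration) *fixedWindowLimiter {
	return &fixedWindowLimiter{
		tokens:   tokens,
		interval: interval,
		windows:  make(map[string]*window),
	}
}

// take reports whether a request for key is allowed at time now.
// The clock is passed in so the behaviour is easy to test.
func (l *fixedWindowLimiter) take(key string, now time.Time) bool {
	l.mu.Lock()
	defer l.mu.Unlock()

	w, ok := l.windows[key]
	if !ok || now.Sub(w.start) >= l.interval {
		// First request for this key, or the previous interval expired.
		w = &window{start: now, remaining: l.tokens}
		l.windows[key] = w
	}
	if w.remaining == 0 {
		return false
	}
	w.remaining--
	return true
}

// demo replays three requests from one key against a limit of
// one token per minute and returns the decisions.
func demo() []bool {
	l := newFixedWindowLimiter(1, time.Minute)
	start := time.Date(2024, time.March, 29, 16, 0, 0, 0, time.UTC)
	return []bool{
		l.take("10.0.0.1", start),                     // first request in the interval
		l.take("10.0.0.1", start.Add(10*time.Second)), // same interval, no tokens left
		l.take("10.0.0.1", start.Add(time.Minute)),    // new interval
	}
}

func main() {
	fmt.Println(demo()) // [true false true]
}
```

Note that this sketch never removes entries from the map, so it shows the first issue from the list as well: memory grows with the number of distinct keys unless old entries are cleaned up.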
Next, we need to determine what will serve as the key for limiting requests. User's IP address? Session ID? Custom HTTP header? Different rules for different handlers? In this example, we will have a single shared limiter based on the user's IP address.
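As an illustration of choosing a key, the sketch below prefers a client identifier from a custom header and falls back to the caller's IP address. The header name X-Client-ID and the functions keyFor and requestKey are hypothetical; the sketch only assumes that a limiter middleware can be given some function that maps a request to a string key.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
)

// clientIDHeader is a hypothetical header name used only for this sketch.
const clientIDHeader = "X-Client-ID"

// keyFor builds the limiter key from raw values: a non-empty client ID
// wins, otherwise the IP part of remoteAddr ("host:port") is used.
func keyFor(remoteAddr, clientID string) (string, error) {
	if clientID != "" {
		return "client:" + clientID, nil
	}
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		return "", fmt.Errorf("parse remote address %q: %w", remoteAddr, err)
	}
	return "ip:" + host, nil
}

// requestKey has the general shape of a key function: request in, key out.
func requestKey(r *http.Request) (string, error) {
	return keyFor(r.RemoteAddr, r.Header.Get(clientIDHeader))
}

func main() {
	r, err := http.NewRequest(http.MethodPost, "http://example.com/foo", nil)
	if err != nil {
		panic(err)
	}
	// On a real server, net/http fills in RemoteAddr; here it is set by hand.
	r.RemoteAddr = "203.0.113.7:54321"

	key, err := requestKey(r)
	fmt.Println(key, err)

	r.Header.Set(clientIDHeader, "abc")
	key, err = requestKey(r)
	fmt.Println(key, err)
}
```

Keep in mind that behind a load balancer or proxy, RemoteAddr is the address of the proxy rather than of the end user, so an IP-based key needs extra care in that setup.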
We could start using this store directly in request handlers, but from a design perspective, this is not the best solution. It's better to create middleware for the rate limiter. Fortunately, the library already has such middleware, which will use the IP as the key. All we have to do is to connect it. We'll limit the number of requests to the POST endpoint and won't restrict anything for the GET endpoint.
func main() {
	store, err := memorystore.New(&memorystore.Config{
		Tokens:   1,
		Interval: time.Minute,
	})
	if err != nil {
		log.Fatal(err)
	}
	middleware, err := httplimit.NewMiddleware(store, httplimit.IPKeyFunc())
	if err != nil {
		log.Fatal(err)
	}
	var (
		fooHandler = &FooHandler{}
		barHandler = &BarHandler{}
	)
	mux := http.NewServeMux()
	// Only /foo is wrapped by the rate-limiting middleware.
	mux.Handle("/foo", middleware.Handle(fooHandler))
	mux.Handle("/bar", barHandler)
	log.Println("http server is starting on 9080")
	log.Fatal(http.ListenAndServe(":9080", mux))
}
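For readers curious about what such middleware does conceptually, here is a small standard-library sketch. It is not the library's implementation: the limit function and the allow callback are hypothetical, and the counter inside demoStatuses is deliberately naive and not safe for concurrent use.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// limit wraps next and answers 429 Too Many Requests whenever allow
// returns false. allow is a hypothetical callback standing in for a
// real limiter lookup.
func limit(allow func(r *http.Request) bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !allow(r) {
			http.Error(w, http.StatusText(http.StatusTooManyRequests), http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// demoStatuses sends n POST requests to /foo through a mux where /foo
// is wrapped by limit with a budget of `budget` requests, and returns
// the status codes in order.
func demoStatuses(n, budget int) []int {
	used := 0
	// Naive counter: fine for this sequential demo only.
	allow := func(*http.Request) bool {
		if used >= budget {
			return false
		}
		used++
		return true
	}
	foo := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "this is /foo handler")
	})

	mux := http.NewServeMux()
	mux.Handle("/foo", limit(allow, foo))

	statuses := make([]int, 0, n)
	for i := 0; i < n; i++ {
		rec := httptest.NewRecorder()
		mux.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, "/foo", nil))
		statuses = append(statuses, rec.Code)
	}
	return statuses
}

func main() {
	fmt.Println(demoStatuses(3, 1)) // [200 429 429]
}
```

Keeping the check in a wrapper like this means the handlers themselves stay unaware of rate limiting, which is the design benefit of using middleware.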
The application-side rate limiting is ready, but let's think about whether we want to keep this logic inside the application. Yes, having a limiter on the application side gives us flexibility and independence from the surrounding infrastructure, but in most cases, we would like to delegate this task to some proxy in front of the application. Therefore, as the next step, let's try to move the rate limiting to the sidecar proxy level in Kubernetes. That is, in our application pod, in addition to the container with the application itself, another container will appear whose task will be to manage network traffic, including limiting the number of requests.
Further experiments with Kubernetes were conducted using Ubuntu 23.10 + minikube v1.32.0 + Istio 1.21.0.
Istio has two types of rate limiters: local and global. The global limiter, for decision-making, will communicate with a separately deployed service where all request statistics will be stored. The local limiter will not communicate externally and will make decisions at the level of each individual pod.
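The practical difference can be shown with a little arithmetic. The sketch below is a simplified model, not Istio code: it assumes requests are spread round-robin across replicas and that everything happens within one interval. The function names are hypothetical.

```go
package main

import "fmt"

// allowedLocal counts how many of n requests get through when each of
// `pods` replicas enforces `limit` on its own. Requests are assumed to
// be spread round-robin, which is a simplification.
func allowedLocal(n, pods, limit int) int {
	counters := make([]int, pods)
	allowed := 0
	for i := 0; i < n; i++ {
		p := i % pods
		if counters[p] < limit {
			counters[p]++
			allowed++
		}
	}
	return allowed
}

// allowedGlobal counts how many of n requests get through when all
// replicas consult one shared counter.
func allowedGlobal(n, limit int) int {
	if n < limit {
		return n
	}
	return limit
}

func main() {
	// 20 requests in one interval, 3 replicas, limit of 4 per interval.
	fmt.Println(allowedLocal(20, 3, 4)) // 12
	fmt.Println(allowedGlobal(20, 4))   // 4
}
```

Under these assumptions, a local limit effectively scales with the number of pods, while a global limit stays the same no matter how many replicas are running.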
Local Limiter
Let's start with the local limiter.
All configurations can be found in the repository.
1. Create Dockerfile
2. Create deployment.yaml
3. Create service.yaml
4. Create gateway.yaml
5. And the most interesting part: the Envoy configuration with our limiter set to 4 requests per minute. Create envoy.local.yaml
6. Apply the manifests. Before running the commands, point your shell at the Docker daemon inside Minikube so that the image is built where the cluster can find it. This is done with the following command: eval $(minikube docker-env)
docker build --rm -t go-app:v1 .
kubectl apply -f gateway.yaml
kubectl apply -f envoy.local.yaml
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
7. Test
a. Run tunnel
minikube tunnel
b. Get gateway URL
export INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
export INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')
export SECURE_INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="https")].port}')
c. Call the endpoint
curl -v \
--header 'Content-Type: application/json' \
--data '{"name":"Sherlock", "address":"221B Baker Street"}' \
Three more requests will be processed successfully, but the fifth one will receive a 429 Too Many Requests:
curl -v \
--header 'Content-Type: application/json' \
--data '{"name":"Sherlock", "address":"221B Baker Street"}' \
* processing:
* Trying
* Connected to ( port 80
> POST /foo HTTP/1.1
> Host:
> User-Agent: curl/8.2.1
> Accept: */*
> Content-Type: application/json
> Content-Length: 50
< HTTP/1.1 429 Too Many Requests
< x-local-rate-limit: true
< content-length: 18
< content-type: text/plain
< date: Fri, 29 Mar 2024 16:17:43 GMT
< server: istio-envoy
< x-envoy-upstream-service-time: 1
* Connection #0 to host left intact
Global Limiter
The local limiter is a good tool, but it lacks the capabilities that the global limiter has in terms of configuring rate-limiting rules.
What if we want to create a combination like this:
- For the /foo endpoint, limit the total number of requests to 10 per minute and no more than two requests from a single IP address per minute
- For the /bar endpoint, limit the total number of requests to 20 per minute
- For all other endpoints, add a limit of 100 requests per minute
The local limiter won't help us here, but the global limiter can solve this task.
So, as we have already discussed, unlike the local limiter, the global limiter will use a separate service to make decisions on limiting, which it will call before each request to the application. This can be any service that implements Envoy’s rate limit service protocol. We will use the solution from envoyproxy.
1. Let's clean up the experiment for the local limiter beforehand.
kubectl delete -f gateway.yaml
kubectl delete -f envoy.local.yaml
kubectl delete -f deployment.yaml
kubectl delete -f service.yaml
2. Set up rate limit service. Create rate-limit-service.yaml.
3. For the rate limit service, we need to create a ConfigMap with the limiting rules. Let's limit the total number of requests from one IP address to five per minute and the total number of requests to the /foo endpoint to two. We won't limit /bar. Create config.yaml.
4. New envoy. Create
5. Apply the manifests
kubectl apply -f config.yaml
kubectl apply -f rate-limit-service.yaml
kubectl apply -f gateway.yaml
kubectl apply -f
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
Limit Memory Consumption
We've covered request limiting. Let's consider another issue. In our /foo endpoint, we read the data from the POST request and load it into memory.
b, err := io.ReadAll(request.Body)
What will happen if we send a POST that's too large? For instance, let's try to send 5GB.
rm -f foo-big.txt
truncate -s 5000M foo-big.txt
curl -v \
--request POST \
--header 'Content-Type: application/json' \
--upload-file foo-big.txt \
With such a request on my laptop, either minikube would crash or the container with the application would receive an OOM (Out Of Memory) error.
kubectl get pods
go-app-58899c7bf6-4snf2 1/2 OOMKilled 3 (63s ago) 7m16s
So, just one malicious user can crash our application. Is there a way to protect ourselves and limit the amount of data we are willing to accept from a user? In Go, this can be done with just one line of code.
// 500KB
request.Body = http.MaxBytesReader(writer, request.Body, 512000)
Now, ReadAll will return an error if the request body exceeds the set limit. Let's add handling for this error.
maxBytesError := &http.MaxBytesError{}
if errors.As(err, &maxBytesError) {
	// The body exceeded the limit: reject the request (400 Bad Request in this article).
}
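The behaviour can be checked locally without Kubernetes. The following sketch is a standalone variation on the handler, not the article's final code: limitedHandler and statusForBody are hypothetical names, and answering 500 for other read errors is an assumption. The http.MaxBytesError type requires Go 1.19 or newer.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// maxBodyBytes is the same 500KB limit used in the article.
const maxBodyBytes = 512000

// limitedHandler reads the whole body, but never more than maxBodyBytes.
func limitedHandler(w http.ResponseWriter, r *http.Request) {
	r.Body = http.MaxBytesReader(w, r.Body, maxBodyBytes)
	if _, err := io.ReadAll(r.Body); err != nil {
		var maxErr *http.MaxBytesError
		if errors.As(err, &maxErr) {
			// The body is larger than the limit.
			w.WriteHeader(http.StatusBadRequest)
			return
		}
		// Any other read error (status code assumed for this sketch).
		w.WriteHeader(http.StatusInternalServerError)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// statusForBody posts a body of `size` bytes to limitedHandler using an
// in-memory recorder and returns the response status code.
func statusForBody(size int) int {
	rec := httptest.NewRecorder()
	body := strings.NewReader(strings.Repeat("a", size))
	req := httptest.NewRequest(http.MethodPost, "/foo", body)
	limitedHandler(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusForBody(maxBodyBytes))     // 200: exactly at the limit
	fmt.Println(statusForBody(maxBodyBytes + 1)) // 400: one byte over
}
```

Depending on your API conventions, 413 Request Entity Too Large (http.StatusRequestEntityTooLarge) may be a more descriptive status for this case than 400.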
The final code:
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"log"
	"net/http"
)

type User struct {
	Name    string `json:"name"`
	Address string `json:"address"`
}

type FooHandler struct{}
func (h *FooHandler) ServeHTTP(writer http.ResponseWriter, request *http.Request) {
	var (
		msgPrefix = "FooHandler.ServeHTTP"
		user      User
	)
	log.Printf("%s: request received\n", msgPrefix)
	// Apart from the 400 for an oversized body, the status codes below are assumed.
	if request.Method != http.MethodPost {
		writer.WriteHeader(http.StatusMethodNotAllowed)
		return
	}
	// 500KB
	request.Body = http.MaxBytesReader(writer, request.Body, 512000)
	b, err := io.ReadAll(request.Body)
	if err != nil {
		log.Printf("%s: io.ReadAll error %v\n", msgPrefix, err)
		maxBytesError := &http.MaxBytesError{}
		if errors.As(err, &maxBytesError) {
			writer.WriteHeader(http.StatusBadRequest)
			return
		}
		writer.WriteHeader(http.StatusInternalServerError)
		return
	}
	if err := json.Unmarshal(b, &user); err != nil {
		log.Printf("%s: json.Unmarshal error %v\n", msgPrefix, err)
		writer.WriteHeader(http.StatusBadRequest)
		return
	}
	_, err = fmt.Fprintf(writer, "Hello %s, this is /foo handler\n", user.Name)
	if err != nil {
		log.Printf("%s: fmt.Fprintf error %v\n", msgPrefix, err)
	}
}

type BarHandler struct{}
func (h *BarHandler) ServeHTTP(writer http.ResponseWriter, request *http.Request) {
	var (
		msgPrefix = "BarHandler.ServeHTTP"
	)
	log.Printf("%s: request received\n", msgPrefix)
	if request.Method != http.MethodGet {
		writer.WriteHeader(http.StatusMethodNotAllowed)
		return
	}
	_, err := fmt.Fprint(writer, "response from /bar handler\n")
	if err != nil {
		log.Printf("%s: fmt.Fprint error %v\n", msgPrefix, err)
	}
}

func main() {
	var (
		fooHandler = &FooHandler{}
		barHandler = &BarHandler{}
	)
	mux := http.NewServeMux()
	mux.Handle("/foo", fooHandler)
	mux.Handle("/bar", barHandler)
	log.Println("http server is starting on 9080")
	log.Fatal(http.ListenAndServe(":9080", mux))
}
If you are testing this with minikube, build an image with a new tag (for example, v2) and update the image tag in deployment.yaml, otherwise minikube will deploy the old image from cache.
Now, instead of crashing our application, an oversized request receives a 400 Bad Request:
rm -f foo-big.txt
truncate -s 5000M foo-big.txt
curl -v \
--request POST \
--header 'Content-Type: application/json' \
--upload-file foo-big.txt \
* processing:
* Trying
* Connected to ( port 80
> POST /foo HTTP/1.1
> Host:
> User-Agent: curl/8.2.1
> Accept: */*
> Content-Type: application/json
> Content-Length: 5242880000
> Expect: 100-continue
< HTTP/1.1 100 Continue
< HTTP/1.1 400 Bad Request
< date: Fri, 29 Mar 2024 16:48:14 GMT
< content-length: 0
< server: istio-envoy
< connection: close
* we are done reading and this is set to close, stop send
* Closing connection
So, we've protected our application by limiting the number of requests and the amount of memory consumed by the /foo endpoint. As for the rate limiter, in my opinion, it's best when implemented at the WAF level. However, if for some reason this option doesn't suit you, you can use a sidecar proxy or implement it directly within the application.
Useful Links
- GitHub project with the configs.
- Minikube
- Istio rate limiter