Protecting Go Applications: Limiting the Number of Requests and Memory Consumption
Let's discuss how you can limit the number of requests to your Go application on the application side or Istio side, and how to limit the amount of memory consumed.
If you're writing a backend in Go, you've probably thought about how to limit the number of requests to your application. This problem can be solved in several ways. For instance, if you have AWS WAF, Cloudflare WAF, or any other WAF, you can set request limits for a specific endpoint at the WAF level. However, there are other ways to solve this problem. In this article, we'll discuss how to address this issue at the application level or via a proxy in front of the application. We'll also discuss how to limit the amount of memory consumed by your application.
Our Go application will have two endpoints: /foo and /bar. /foo will only accept POST requests, while /bar will only accept GET requests. Here's the initial code of our application.
package main

import (
    "encoding/json"
    "fmt"
    "io"
    "log"
    "net/http"
)

type User struct {
    Name    string `json:"name"`
    Address string `json:"address"`
}

type FooHandler struct {
}

func (h *FooHandler) ServeHTTP(writer http.ResponseWriter, request *http.Request) {
    var (
        msgPrefix = "FooHandler.ServeHTTP"
        user      User
    )
    log.Printf("%s: request received\n", msgPrefix)
    if request.Method != http.MethodPost {
        writer.WriteHeader(http.StatusMethodNotAllowed)
        return
    }
    b, err := io.ReadAll(request.Body)
    if err != nil {
        log.Printf("%s: io.ReadAll error %v\n", msgPrefix, err)
        writer.WriteHeader(http.StatusInternalServerError)
        return
    }
    if err := json.Unmarshal(b, &user); err != nil {
        log.Printf("%s: json.Unmarshal error %v\n", msgPrefix, err)
        writer.WriteHeader(http.StatusBadRequest)
        return
    }
    _, err = fmt.Fprintf(writer, "Hello %s, this is /foo handler\n", user.Name)
    if err != nil {
        log.Println(err)
    }
}

type BarHandler struct {
}

func (h *BarHandler) ServeHTTP(writer http.ResponseWriter, request *http.Request) {
    var (
        msgPrefix = "BarHandler.ServeHTTP"
    )
    log.Printf("%s: request received\n", msgPrefix)
    if request.Method != http.MethodGet {
        writer.WriteHeader(http.StatusMethodNotAllowed)
        return
    }
    _, err := fmt.Fprint(writer, "response from /bar handler\n")
    if err != nil {
        log.Println(err)
    }
}

func main() {
    var (
        fooHandler = &FooHandler{}
        barHandler = &BarHandler{}
    )
    mux := http.NewServeMux()
    mux.Handle("/foo", fooHandler)
    mux.Handle("/bar", barHandler)
    log.Println("http server is starting on 9080")
    log.Fatal(http.ListenAndServe(":9080", mux))
}
Two very simple endpoints: /foo accepts POST requests, reads the entire request body, and deserializes it into a User structure; /bar accepts GET requests and always responds with the same data. In the first part of the article, we'll discuss limiting the number of requests to the application, and in the second part, we'll limit the amount of memory consumed by the POST endpoint.
Limit the Number of Requests
So, let's start with limiting the number of requests to the application.
Application Level Limiter
First, we will attempt to solve this problem at the application level, starting with the use of the go-limiter library.
First of all, we will need to create an instance of the limiter.Store interface, where we will specify the time interval and the allowed number of tokens within this interval.
store, err := memorystore.New(&memorystore.Config{
    Tokens:   1,
    Interval: time.Minute,
})
As evident from the name, the store we create holds request statistics in the application's memory. This is a decent default option, but it's important to remember that it can lead to several issues:
- The application will consume too much memory due to the data from the rate limiter
- Two different instances of the application will limit requests differently
- The rate-limiting may not work as expected with constant application deployment since each new deployment results in a new pod, resetting the request counters. The same applies to autoscaling
Next, we need to determine what will serve as the key for limiting requests. User's IP address? Session ID? Custom HTTP header? Different rules for different handlers? In this example, we will have a single shared limiter based on the user's IP address.
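For illustration, here is a minimal sketch of what a custom key function might look like if you wanted to limit per session instead of per IP. It assumes go-limiter accepts a key function with the signature func(*http.Request) (string, error); the X-Session-ID header is purely hypothetical.

// keyBySession is a hypothetical key function that limits per session rather
// than per IP. The X-Session-ID header name is just an example.
func keyBySession(r *http.Request) (string, error) {
    if sessionID := r.Header.Get("X-Session-ID"); sessionID != "" {
        return sessionID, nil
    }
    // Fall back to the remote address when no session header is present.
    return r.RemoteAddr, nil
}

Such a function could then be passed to the library's middleware constructor in place of the IP-based key function used below.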
We could start using this store directly in request handlers, but from a design perspective, this is not the best solution. It's better to create middleware for the rate limiter. Fortunately, the library already has such middleware, which will use the IP as the key. All we have to do is to connect it. We'll limit the number of requests to the POST endpoint and won't restrict anything for the GET endpoint.
func main() {
    store, err := memorystore.New(&memorystore.Config{
        Tokens:   1,
        Interval: time.Minute,
    })
    if err != nil {
        log.Fatal(err)
    }

    middleware, err := httplimit.NewMiddleware(store, httplimit.IPKeyFunc())
    if err != nil {
        log.Fatal(err)
    }

    var (
        fooHandler = &FooHandler{}
        barHandler = &BarHandler{}
    )

    mux := http.NewServeMux()
    mux.Handle("/foo", middleware.Handle(fooHandler))
    mux.Handle("/bar", barHandler)

    log.Println("http server is starting on 9080")
    log.Fatal(http.ListenAndServe(":9080", mux))
}
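To quickly verify the behavior, a small test along the following lines can be dropped next to the application code. It's a sketch that assumes the library's import paths are github.com/sethvargo/go-limiter/memorystore and github.com/sethvargo/go-limiter/httplimit, and that the middleware responds with 429 Too Many Requests once the token bucket is empty.

package main

import (
    "io"
    "net/http"
    "net/http/httptest"
    "strings"
    "testing"
    "time"

    "github.com/sethvargo/go-limiter/httplimit"
    "github.com/sethvargo/go-limiter/memorystore"
)

func TestFooRateLimit(t *testing.T) {
    // One token per minute: the first request should pass, the second should be limited.
    store, err := memorystore.New(&memorystore.Config{
        Tokens:   1,
        Interval: time.Minute,
    })
    if err != nil {
        t.Fatal(err)
    }

    middleware, err := httplimit.NewMiddleware(store, httplimit.IPKeyFunc())
    if err != nil {
        t.Fatal(err)
    }

    srv := httptest.NewServer(middleware.Handle(&FooHandler{}))
    defer srv.Close()

    body := `{"name":"Sherlock", "address":"221B Baker Street"}`

    doPost := func() int {
        resp, err := http.Post(srv.URL, "application/json", strings.NewReader(body))
        if err != nil {
            t.Fatal(err)
        }
        // Drain and close the body so the client reuses the same connection
        // and both requests arrive from the same remote address.
        _, _ = io.Copy(io.Discard, resp.Body)
        resp.Body.Close()
        return resp.StatusCode
    }

    if got := doPost(); got != http.StatusOK {
        t.Fatalf("first request: got %d, want %d", got, http.StatusOK)
    }
    if got := doPost(); got != http.StatusTooManyRequests {
        t.Fatalf("second request: got %d, want %d", got, http.StatusTooManyRequests)
    }
}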
The application-side rate limiting is ready, but let's think about whether we want to keep this logic within the application itself. Yes, having a limiter on the application side gives us flexibility and keeps us independent of the surrounding infrastructure, but in most cases we would rather delegate this task to a proxy in front of the application. Therefore, as the next step, let's try to move the rate limiting to the sidecar proxy level in Kubernetes. That is, in addition to the container with the application itself, our pod will get another container whose job is to manage network traffic, including limiting the number of requests.
Further experiments with Kubernetes were conducted using Ubuntu 23.10 + minikube v1.32.0 + Istio 1.21.0.
Istio has two types of rate limiters: local and global. To make its decisions, the global limiter communicates with a separately deployed service that stores all request statistics. The local limiter doesn't communicate with anything external and makes decisions at the level of each individual pod.
Local Limiter
Let's start with the local limiter.
All configurations can be found in the repository.
1. Create Dockerfile
2. Create deployment.yaml
3. Create service.yaml
4. Create gateway.yaml
5. And the most interesting part: the Envoy configuration with our limiter set to 4 requests per minute. Create envoy.local.yaml
6. Apply the manifests (before running the commands, load the environment settings required for working with Docker in the Minikube context: eval $(minikube docker-env))
docker build --rm -t go-app:v1 .
kubectl apply -f gateway.yaml
kubectl apply -f envoy.local.yaml
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
7. Test
a. Run tunnel
minikube tunnel
b. Get gateway URL
export INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
export INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')
export SECURE_INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="https")].port}')
export GATEWAY_URL=$INGRESS_HOST:$INGRESS_PORT
c. Call the endpoint
curl -v \
--header 'Content-Type: application/json' \
--data '{"name":"Sherlock", "address":"221B Baker Street"}' \
http://$GATEWAY_URL/foo
Three more requests will be processed successfully, but the fifth one will receive a 429 Too Many Requests error.
curl -v \
--header 'Content-Type: application/json' \
--data '{"name":"Sherlock", "address":"221B Baker Street"}' \
http://$GATEWAY_URL/foo
* processing: http://10.98.27.132:80/foo
* Trying 10.98.27.132:80...
* Connected to 10.98.27.132 (10.98.27.132) port 80
> POST /foo HTTP/1.1
> Host: 10.98.27.132
> User-Agent: curl/8.2.1
> Accept: */*
> Content-Type: application/json
> Content-Length: 50
>
< HTTP/1.1 429 Too Many Requests
< x-local-rate-limit: true
< content-length: 18
< content-type: text/plain
< date: Fri, 29 Mar 2024 16:17:43 GMT
< server: istio-envoy
< x-envoy-upstream-service-time: 1
<
* Connection #0 to host 10.98.27.132 left intact
local_rate_limited
Global Limiter
The local limiter is a good tool, but it lacks the capabilities that the global limiter has in terms of configuring rate-limiting rules.
What if we want to create a combination like this:
- For the /foo endpoint, limit the total number of requests to 10 per minute and no more than two requests from a single IP address per minute
- For the /bar endpoint, limit the total number of requests to 20 per minute
- For all other endpoints, add a limit of 100 requests per minute
The local limiter won't help us here, but the global limiter can solve this task.
So, as we have already discussed, unlike the local limiter, the global limiter uses a separate service to make limiting decisions, and it calls this service before each request reaches the application. This can be any service that implements Envoy's rate limit service protocol. We will use the solution from envoyproxy.
1. Let's clean up the experiment for the local limiter beforehand.
kubectl delete -f gateway.yaml
kubectl delete -f envoy.local.yaml
kubectl delete -f deployment.yaml
kubectl delete -f service.yaml
2. Set up rate limit service. Create rate-limit-service.yaml.
3. For the rate limit service, we need to create a ConfigMap with the limiting rules. Let's limit the number of requests from one IP address to five per minute and the number of requests to the /foo endpoint to two per minute. We won't limit /bar. Create config.yaml.
4. New envoy. Create envoy.global.yaml.
5. Apply the manifests
kubectl apply -f config.yaml
kubectl apply -f rate-limit-service.yaml
kubectl apply -f gateway.yaml
kubectl apply -f envoy.global.yaml
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
Limit Memory Consumption
We've covered request limiting. Let's consider another issue. In our /foo endpoint, we read the data from the POST request and load it into memory.
b, err := io.ReadAll(request.Body)
What will happen if we send a POST that's too large? For instance, let's try to send 5GB.
rm -f foo-big.txt
truncate -s 5000M foo-big.txt
curl -v \
--request POST \
--header 'Content-Type: application/json' \
--upload-file foo-big.txt \
http://$GATEWAY_URL/foo
With such a request, on my laptop either minikube would crash or the application container would be OOM-killed (Out Of Memory).
kubectl get pods
NAME READY STATUS RESTARTS AGE
go-app-58899c7bf6-4snf2 1/2 OOMKilled 3 (63s ago) 7m16s
So, just one malicious user can crash our application. Is there a way to protect ourselves and limit the amount of data we are willing to accept from a user? In Go, this can be done with just one line of code.
// 500KB
request.Body = http.MaxBytesReader(writer, request.Body, 512000)
Now, ReadAll will return an error if the request body exceeds the set limit. Let's add handling for this error.
maxBytesError := &http.MaxBytesError{}
if errors.As(err, &maxBytesError) {
    writer.WriteHeader(http.StatusBadRequest)
    return
}
The final code:
package main

import (
    "encoding/json"
    "errors"
    "fmt"
    "io"
    "log"
    "net/http"
)

type User struct {
    Name    string `json:"name"`
    Address string `json:"address"`
}

type FooHandler struct {
}

func (h *FooHandler) ServeHTTP(writer http.ResponseWriter, request *http.Request) {
    var (
        msgPrefix = "FooHandler.ServeHTTP"
        user      User
    )
    log.Printf("%s: request received\n", msgPrefix)
    if request.Method != http.MethodPost {
        writer.WriteHeader(http.StatusMethodNotAllowed)
        return
    }
    // 500KB
    request.Body = http.MaxBytesReader(writer, request.Body, 512000)
    b, err := io.ReadAll(request.Body)
    if err != nil {
        log.Printf("%s: io.ReadAll error %v\n", msgPrefix, err)
        maxBytesError := &http.MaxBytesError{}
        if errors.As(err, &maxBytesError) {
            writer.WriteHeader(http.StatusBadRequest)
            return
        }
        writer.WriteHeader(http.StatusInternalServerError)
        return
    }
    if err := json.Unmarshal(b, &user); err != nil {
        log.Printf("%s: json.Unmarshal error %v\n", msgPrefix, err)
        writer.WriteHeader(http.StatusBadRequest)
        return
    }
    _, err = fmt.Fprintf(writer, "Hello %s, this is /foo handler\n", user.Name)
    if err != nil {
        log.Println(err)
    }
}

type BarHandler struct {
}

func (h *BarHandler) ServeHTTP(writer http.ResponseWriter, request *http.Request) {
    var (
        msgPrefix = "BarHandler.ServeHTTP"
    )
    log.Printf("%s: request received\n", msgPrefix)
    if request.Method != http.MethodGet {
        writer.WriteHeader(http.StatusMethodNotAllowed)
        return
    }
    _, err := fmt.Fprint(writer, "response from /bar handler\n")
    if err != nil {
        log.Println(err)
    }
}

func main() {
    var (
        fooHandler = &FooHandler{}
        barHandler = &BarHandler{}
    )
    mux := http.NewServeMux()
    mux.Handle("/foo", fooHandler)
    mux.Handle("/bar", barHandler)
    log.Println("http server is starting on 9080")
    log.Fatal(http.ListenAndServe(":9080", mux))
}
If you are testing this with minikube, build an image with a new tag (for example, v2) and update the image tag in deployment.yaml; otherwise, minikube will deploy the old image from its cache.
Now, instead of crashing our application, the request will receive a 400 Bad Request error.
rm -f foo-big.txt
truncate -s 5000M foo-big.txt
curl -v \
--request POST \
--header 'Content-Type: application/json' \
--upload-file foo-big.txt \
http://$GATEWAY_URL/foo
* processing: http://10.98.27.132:80/foo
* Trying 10.98.27.132:80...
* Connected to 10.98.27.132 (10.98.27.132) port 80
> POST /foo HTTP/1.1
> Host: 10.98.27.132
> User-Agent: curl/8.2.1
> Accept: */*
> Content-Type: application/json
> Content-Length: 5242880000
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
< HTTP/1.1 400 Bad Request
< date: Fri, 29 Mar 2024 16:48:14 GMT
< content-length: 0
< server: istio-envoy
< connection: close
<
* we are done reading and this is set to close, stop send
* Closing connection
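As a side note, if you'd rather enforce the same body-size limit for every endpoint instead of inside each handler, the standard library (since Go 1.18) also offers http.MaxBytesHandler, which wraps a handler and applies the limit to every request body. A minimal sketch of how main could use it:

// Alternative: wrap the whole mux so every endpoint shares the same 500KB limit,
// instead of calling http.MaxBytesReader inside each handler.
mux := http.NewServeMux()
mux.Handle("/foo", fooHandler)
mux.Handle("/bar", barHandler)

log.Println("http server is starting on 9080")
log.Fatal(http.ListenAndServe(":9080", http.MaxBytesHandler(mux, 512000)))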
Conclusion
So, we've protected our application by limiting the number of requests and the amount of memory consumed by the /foo endpoint. As for the rate limiter, in my opinion, it's best when implemented at the WAF level. However, if for some reason this option doesn't suit you, you can use a sidecar proxy or implement it directly within the application.
Useful Links
- GitHub project with the configs.
- Minikube
- Istio rate limiter