Understanding Kubernetes Limits and Requests

Ever been to an all-you-can-eat buffet and realized halfway through that your plate wasn’t big enough for all that sushi? That’s sort of what happens when you mess up Kubernetes resource requests and limits.

In this article, we’ll walk through what Kubernetes “requests” and “limits” really mean, how they affect your applications, and how to avoid some all-too-common production pitfalls.

What Are Kubernetes Requests and Limits?

In Kubernetes, each container in a Pod can specify two types of resource constraints:

    • Request: the amount of CPU/memory the scheduler sets aside for your container when deciding which node to place the Pod on.
    • Limit: the maximum amount of CPU/memory your container is allowed to use at runtime.

    Quick Analogy:

    Imagine you’re booking a table at a restaurant.

    • The request is like telling the host: “I need a table for 2 people.”
    • The limit is like the restaurant saying: “Cool, but we won’t seat more than 4 people at your table.”

    If you suddenly show up with 6 friends, you’re out of luck. The same applies to containers – they can consume resources only up to their limit, no matter how much the node has free.

    Real Example: A Container Spec
    resources:
      requests:
        memory: "512Mi"
        cpu: "500m"
      limits:
        memory: "1Gi"
        cpu: "1"
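For context, the resources block lives under each container in the Pod spec. A minimal Pod manifest using the values above might look like this (the name and image are just placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-app                  # hypothetical name
spec:
  containers:
    - name: app
      image: example/app:latest   # hypothetical image
      resources:
        requests:
          memory: "512Mi"
          cpu: "500m"
        limits:
          memory: "1Gi"
          cpu: "1"
```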

    Breakdown:

    • Request: the scheduler sets aside at least 512 MiB of memory and 0.5 CPU cores (500 millicores) for this container.
    • Limit: it can use up to 1 GiB of memory and 1 full CPU core, if available.

    If the container tries to use more than 1Gi of memory? It gets OOMKilled (Out Of Memory killed). Too much CPU? It gets throttled, not killed.
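The unit suffixes above ("500m", "512Mi") trip a lot of people up. Here's a simplified Python sketch of how those quantity strings map to absolute values — the real parser in Kubernetes handles many more suffixes (Ki, Ti, decimal k/M/G, exponents), so treat this as illustrative only:

```python
# Simplified converter for Kubernetes resource quantity strings.
# Handles only the suffixes used in this article ("m" for millicores,
# "Mi"/"Gi" for memory); the real apimachinery parser supports far more.

def parse_cpu(quantity: str) -> float:
    """Return CPU cores as a float, e.g. "500m" -> 0.5, "1" -> 1.0."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)

def parse_memory(quantity: str) -> int:
    """Return bytes, e.g. "512Mi" -> 536870912, "1Gi" -> 1073741824."""
    units = {"Mi": 2**20, "Gi": 2**30}
    for suffix, factor in units.items():
        if quantity.endswith(suffix):
            return int(quantity[:-len(suffix)]) * factor
    return int(quantity)

print(parse_cpu("500m"))      # -> 0.5
print(parse_memory("512Mi"))  # -> 536870912
```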

    Real-World Scenario: The “Spiky” Service

    Let’s say you’re running an image-processing microservice that usually just waits around but occasionally goes turbo when processing uploads.

    What you might set:

    requests:
      cpu: "100m"
      memory: "256Mi"
    limits:
      cpu: "1"
      memory: "512Mi"

    Why this works:

    • Most of the time, the service is idle and doesn’t waste resources.
    • When it gets a burst of activity, Kubernetes allows it to spike up to 1 CPU core.

    But here’s the catch…

    If you don’t set a limit, it can hog resources and cause your other apps to slow down. If you set too low a limit, it’ll get throttled and your users will notice the lag.

    Real-World Disaster: The OOMKill Mystery

    A team once deployed a Node.js app without setting memory limits. It ran fine in dev.

    One day, in production, the memory usage suddenly ballooned due to a poorly written image conversion function. Since there was no limit, it ate up all available memory on the node and took down unrelated Pods — including the payment gateway.

    Moral of the story: Memory limits = firewalls for bad code.

    CPU Requests vs Limits: The Hidden Throttle

    Here’s something people often miss:

    • CPU requests are used by the Kubernetes scheduler to decide where to place the Pod.
    • CPU limits affect runtime behavior.

    If your container hits its CPU limit, it doesn’t crash — it gets throttled. That’s fine if it’s batch processing, but for latency-sensitive apps? Users will feel the lag.
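Under the hood, CPU limits are enforced by the Linux kernel's CFS bandwidth controller: Kubernetes translates the limit into a quota of CPU time per scheduling period (100ms by default). A quick sketch of the arithmetic:

```python
# How a Kubernetes CPU limit maps onto a CFS bandwidth quota.
# The kernel does the actual enforcement; this only shows the arithmetic.

CFS_PERIOD_US = 100_000  # default cfs_period_us: 100ms

def cfs_quota_us(cpu_limit_cores: float) -> int:
    """Microseconds of CPU time the container may use per 100ms period."""
    return int(cpu_limit_cores * CFS_PERIOD_US)

# A limit of "1" allows 100ms of CPU time every 100ms (one full core);
# "500m" allows 50ms per period. Exceed the quota and the container is
# paused (throttled) until the next period starts -- that pause is the
# latency your users feel.
print(cfs_quota_us(1.0))  # -> 100000
print(cfs_quota_us(0.5))  # -> 50000
```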

    Best Practices (That’ll Save You Headaches)

    • Always set both requests and limits – but don’t blindly make them equal (equal requests and limits place the Pod in the Guaranteed QoS class, trading burst capacity for predictability).
    • Profile your app – use tools like Prometheus and Grafana to observe real usage.
    • Avoid setting CPU limits unless necessary – especially for compute-heavy apps where throttling would hurt more than help.
    • Use vertical pod autoscaling (VPA) – let Kubernetes recommend better values over time.
    • Use ResourceQuotas and LimitRanges – especially in multi-team clusters, to prevent noisy neighbors.
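As a concrete sketch of that last point, namespace-level guardrails might look like this (all names, namespaces, and values here are illustrative):

```yaml
# Default requests/limits for containers that don't specify their own
apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults   # illustrative name
  namespace: team-a          # illustrative namespace
spec:
  limits:
    - type: Container
      defaultRequest:
        cpu: "100m"
        memory: "128Mi"
      default:
        cpu: "500m"
        memory: "512Mi"
---
# Hard cap on what the whole namespace can claim in aggregate
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "10"
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi
```

With this in place, a Pod that omits its resources block picks up the LimitRange defaults, and the namespace as a whole can't starve its neighbors.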

    Final Thought: It’s Not Just About the App

    Resource requests and limits are like the plumbing of your infrastructure. They may seem boring, but when they break or are misconfigured, the whole building feels it.

    Setting them right is a mix of art, science, and experience — like cooking with just the right amount of spice.

    TL;DR

    | Term    | What it Means                 | Impact                                           |
    |---------|-------------------------------|--------------------------------------------------|
    | Request | Minimum guaranteed CPU/memory | Scheduler uses it to place the Pod               |
    | Limit   | Max CPU/memory allowed        | Throttling (CPU) or OOMKill (memory) if exceeded |
