If you have ever built a web application, you know that without a proper mechanism to control traffic, your application can be overwhelmed, leading to slow response times, server crashes, and a poor user experience. Even worse, this can leave you vulnerable to Denial of Service (DoS) attacks. This is where rate limiting comes in.
In this tutorial, you will build a distributed rate limiter. This is the system you need when your application is deployed across several servers or virtual machines and you need to enforce a global threshold on all incoming requests.
You will build a simple URL shortener application and then enforce a robust rate limit on it using a powerful and effective combination of tools:
Python and Flask for your web application.
Redis as your fast, central data store.
Terraform and Proxmox to define and provision your virtual machine infrastructure.
Docker to containerize your application for easy deployment.
Nginx as a load balancer to distribute traffic across your app servers.
k6 to load test your system and prove that your rate limiter actually works.
This guide is aimed at newer developers learning about system design concepts, as well as experienced developers who just want a refresher.
By the end of this guide, you will not only understand the code, but also the system architecture required to deploy a scalable, resilient application.
Let’s start!
Prerequisites
Although it isn't strictly required to follow along, I recommend setting up a Proxmox server on an old laptop so you can try out the tools and code while you read the article. I recommend this YouTube playlist to get started. Please note that I am not affiliated with that channel in any way, I just found it helpful.
However, if you do not have a local Proxmox server, you can skip that part and simply follow along to understand how the rate limiter is built and how it is set up to work correctly across multiple servers.
The Big Picture: Our System Architecture
Before we dive into the code, let's look at the architecture we are building. I will use Proxmox Virtual Environment to set up a server cluster just like you would find in a data center.
How to Configure Proxmox
Proxmox Virtual Environment is an open-source virtualization platform. It lets you easily manage multiple VMs, containers, and other resources. For example, I turned my old gaming computer into a Proxmox server, which lets me run more than 20 virtual machines simultaneously, giving me something like my own data center. This lets me experiment with distributed applications by simulating a data center environment.
You just need an old computer to set up your cluster. You can download the ISO image here and boot from a USB drive. Once you install it, you can manage the host machine through a web browser on another computer on the same network.
For example, my Proxmox server sits at 10.0.0.108, and I can access it through the browser on my laptop.

We define all of our virtual machines in a main.tf file and run a single command, terraform apply, to spin these servers up. To read more about how to use Terraform with Proxmox, I recommend this blog post.
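If you haven't used Terraform before, the workflow is just a few commands run from the directory that contains main.tf. This is a minimal sketch of the typical sequence (the exact variables you pass in will depend on your own variables file):
# Download the Proxmox provider declared in the configuration
terraform init
# Preview which VMs and containers will be created
terraform plan
# Create (or update) everything defined in main.tf
terraform apply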
For our use case, we will have a handful of virtual machines that will act as different kinds of servers:
A load balancer
A rate limiter (a Redis cache)
Two web servers
A PostgreSQL database
A load-testing machine that will stress the system by simulating hundreds of calls per minute
If all this seems daunting, don't worry too much about it. You don't have to set all of this up to follow along.
The Core Rate Limiting Logic
Since our application will run on multiple servers (or "nodes"), we cannot store the request counts in memory on each individual server. Why? Because each server would keep its own separate count, and we would not have a single global rate limit.
The solution is to use a central data store that all of our application nodes can access. This is where Redis comes in.
Here is a diagram of our setup:

User requests first hit our Nginx load balancer.
The load balancer distributes traffic evenly between our two web server VMs. The configuration is simple, using an upstream block to define the servers (see the sketch after this list).
Each web server runs our Flask application inside a Docker container.
Before acting on any request, the Flask app talks to the central Redis rate limiter VM to find out whether the user has exceeded the rate limit.
If the user is within the limit, the app processes the request and talks to the PostgreSQL database. If they are over the limit, it sends back a "429 Too Many Requests" error.
This architecture ensures that no matter which web server handles the request, the rate limit is checked against the same shared data source.
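The nginx.conf contents aren't reproduced at this point in the article, but a minimal sketch of the load balancer configuration might look like the following, assuming the web server IPs (10.0.0.111 and 10.0.0.112) used later in the Terraform config:
# Pool of Flask web servers (round-robin by default)
upstream app_servers {
    server 10.0.0.111;
    server 10.0.0.112;
}

server {
    listen 80;

    location / {
        # Forward each request to one of the app servers
        proxy_pass http://app_servers;
        # Pass the original client IP along in a header
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
One thing to keep in mind: the Flask code below keys the limit on request.remote_addr, which behind a proxy is the load balancer's address unless the app explicitly reads a forwarded header, so in this setup the limit effectively applies to all traffic arriving through the load balancer.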
Step 1: How to Define the Infrastructure with Terraform
Manually setting up multiple virtual machines can be painful and error-prone. That is why we use Terraform, an Infrastructure as Code (IaC) tool. It allows us to define our entire infrastructure in configuration files.
Note: If you just want to see the rate limiter and how it works, you can skip this section.
Our main.tf file defines all the components of our system. Let's look at a key piece: the Redis VM.
resource "proxmox_vm_qemu" "redis_cache" {
vmid = 130
name = "redis-cache-rate-limiter"
target_node = "pve"
agent = 1
cores = 1
memory = 1024
ipconfig0 = "ip=10.0.0.130/24,gw=10.0.0.1"
provisioner "remote-exec" {
inline = (
"sleep 30; sudo apt-get update -y",
"sudo apt-get install -y docker.io docker-compose",
"sudo mkdir -p /opt/redis"
)
}
provisioner "file" {
source = "files/redis-docker-compose.yml"
destination = "/home/${var.ssh_user}/docker-compose.yml"
}
provisioner "remote-exec" {
inline = (
"sudo mv /home/${var.ssh_user}/docker-compose.yml /opt/redis/docker-compose.yml",
"cd /opt/redis && sudo docker-compose up -d"
)
}
}
This block tells Terraform to create a Proxmox QEMU virtual machine with a specific IP address (10.0.0.130). After the VM is created, it uses provisioners to connect over SSH and run commands. Here, it installs Docker, uploads our redis-docker-compose.yml file, and starts the Redis container.
The redis-docker-compose.yml file itself is very straightforward:
version: '3.8'
services:
  redis:
    image: redis:latest
    container_name: redis_cache
    restart: always
    ports:
      - "6379:6379"
    volumes:
      - redisdata:/data
volumes:
  redisdata:
This gives us a persistent, containerized Redis instance that is ready to serve our application. The Terraform configuration also defines our web servers, load balancer, and database.
Step 2: How to Implement the Rate Limiter Logic in Python
Now, for the heart of our system: the code that enforces the rate limit. We are using an elegant and memory-efficient algorithm called the sliding window log.
The idea is simple: for every user, we keep a log of the timestamps of their recent requests. We store this log in a Redis sorted set.
Let's break down the code in app.py.
The Flask @app.before_request Hook
Flask lets us run code before any request is handled by its intended view function. This is the perfect place to put our rate limiting check.
import psycopg2
import string
import random
import redis
import time
from flask import Flask, request, redirect, jsonify

app = Flask(__name__)

DB_HOST = "10.0.0.200"
DB_NAME = "urldb"
DB_USER = "myuser"
DB_PASS = "mypassword"
REDIS_HOST = "10.0.0.130"
RATE_LIMIT_COUNT = 10   # max requests allowed per window
RATE_LIMIT_WINDOW = 60  # window length in seconds

redis_client = redis.Redis(host=REDIS_HOST, port=6379, decode_responses=True)

@app.before_request
def rate_limiter():
    # One sorted set per client IP holds the timestamps of its recent requests
    key = f"rate_limit:{request.remote_addr}"
    now = time.time()
    pipe = redis_client.pipeline()
    pipe.zadd(key, {str(now): now})                          # log this request
    pipe.zremrangebyscore(key, 0, now - RATE_LIMIT_WINDOW)   # drop entries outside the window
    pipe.zcard(key)                                          # count what is left
    pipe.expire(key, RATE_LIMIT_WINDOW)                      # let idle keys expire
    results = pipe.execute()
    request_count = results[2]
    if request_count > RATE_LIMIT_COUNT:
        return jsonify(error="Rate limit exceeded"), 429

def get_db_connection():
    conn = psycopg2.connect(host=DB_HOST, dbname=DB_NAME, user=DB_USER, password=DB_PASS)
    return conn

def init_db():
    conn = get_db_connection()
    cur = conn.cursor()
    cur.execute('''
        CREATE TABLE IF NOT EXISTS urls (
            id SERIAL PRIMARY KEY,
            short_code VARCHAR(6) UNIQUE NOT NULL,
            original_url TEXT NOT NULL
        );
    ''')
    cur.execute('''
        SELECT 1 FROM pg_class c JOIN pg_namespace n ON n.oid = c.relnamespace
        WHERE c.relname = 'idx_original_url' AND n.nspname = 'public';
    ''')
    if cur.fetchone() is None:
        cur.execute('CREATE INDEX idx_original_url ON urls (original_url);')
    conn.commit()
    cur.close()
    conn.close()

def generate_short_code(length=6):
    chars = string.ascii_letters + string.digits
    return ''.join(random.choice(chars) for _ in range(length))

@app.route("/", methods=["GET"])
def index():
    return "URL Shortener is running!\n", 200

@app.route('/shorten', methods=['POST'])
def shorten_url():
    original_url = request.form['url']
    conn = get_db_connection()
    cur = conn.cursor()
    cur.execute("SELECT short_code FROM urls WHERE original_url = %s", (original_url,))
    existing_url = cur.fetchone()
    if existing_url:
        short_code = existing_url[0]
    else:
        short_code = generate_short_code()
        cur.execute("INSERT INTO urls (short_code, original_url) VALUES (%s, %s)", (short_code, original_url))
        conn.commit()
    cur.close()
    conn.close()
    return jsonify(short_url=f"/{short_code}")

@app.route('/<short_code>')
def redirect_to_url(short_code):
    conn = get_db_connection()
    cur = conn.cursor()
    cur.execute("SELECT original_url FROM urls WHERE short_code = %s", (short_code,))
    url_record = cur.fetchone()
    cur.close()
    conn.close()
    if url_record:
        return redirect(url_record[0])
    else:
        return "URL not found", 404

if __name__ == '__main__':
    init_db()
    app.run(host='0.0.0.0', port=5000)
How it works, step by step
Identify the user: We create a unique Redis key for each user based on their IP address: rate_limit:1.2.3.4.
Use a pipeline: Network round trips can become a bottleneck. A Redis pipeline bundles multiple commands into a single request-response cycle, which is far more efficient than sending them one by one. It also guarantees our commands run in order without being interleaved with other clients' commands.
Log the current request (ZADD): We add the current timestamp (as a Unix timestamp) to a sorted set. We use the timestamp as both the "member" and the "score", which lets us easily filter by time.
Clean out old requests (ZREMRANGEBYSCORE): This is the "sliding window" part. We remove any timestamps from the set that are older than our RATE_LIMIT_WINDOW (60 seconds). This effectively discards requests that are no longer relevant to the current rate limit.
Count the recent requests (ZCARD): We get the cardinality (number of items) of the set. After the previous step, this number is the count of our requests within the last 60 seconds.
Mark the key to expire (EXPIRE): We set an expiration on the key. If a user stops making requests, Redis automatically deletes their rate limit data after 60 seconds, preventing memory from filling up with stale keys.
Execute and check: The pipe.execute() command sends all of our bundled commands to Redis. We then check the result of our ZCARD command. If the count is greater than our RATE_LIMIT_COUNT, we immediately return a 429 error.
This approach is incredibly fast and efficient. All the heavy lifting happens inside Redis, which is highly optimized for exactly these kinds of operations.
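If you want to watch the algorithm at work, you can inspect the sorted set by hand on the Redis VM. A rough sketch using redis-cli (the key name is illustrative, it will be whatever client IP the app sees):
# Timestamps currently logged for one client on the rate limiter VM
redis-cli -h 10.0.0.130 ZRANGEBYSCORE rate_limit:10.0.0.100 -inf +inf WITHSCORES
# How many requests fall inside the current window
redis-cli -h 10.0.0.130 ZCARD rate_limit:10.0.0.100
# Seconds until the key expires
redis-cli -h 10.0.0.130 TTL rate_limit:10.0.0.100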
Step 3: Containerization and Testing
To deploy our application consistently across multiple VMs, we use Docker. Our Dockerfile is standard for a Flask application: it starts from a Python base image, installs the dependencies from requirements.txt, copies the application code, and defines the command that runs the app.
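The exact Dockerfile isn't reproduced here, but a minimal sketch matching that description might look like this (the base image tag is an assumption):
# Small Python base image (tag is an assumption)
FROM python:3.11-slim

WORKDIR /app

# Install dependencies first so this layer can be cached
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code
COPY app.py .

# The Terraform provisioner maps host port 80 to container port 5000
EXPOSE 5000
CMD ["python", "app.py"]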
But how do we know that it works? We test it!
We use k6, a modern load testing tool, to simulate heavy traffic. Our test script, rate-test.js, is specifically designed to verify the rate limiter.
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '30s', target: 20 },
    { duration: '1m', target: 20 },
    { duration: '10s', target: 0 },
  ],
};

export default function () {
  const url = 'http://10.0.0.100/shorten';
  const payload = { url: `https://www.test-ratelimit-${Math.random()}.com` };
  const res = http.post(url, payload);
  check(res, {
    'status is 200 (OK)': (r) => r.status === 200,
    'status is 429 (Too Many Requests)': (r) => r.status === 429,
  });
  sleep(1);
}
The stages array gradually ramps the test up to 20 virtual users. Since our rate limit is 10 requests per minute, this load is guaranteed to trip the limit.
The check function is the important part. It verifies that the server's response code is either 200 (meaning the request succeeded) or 429 (meaning our rate limiter correctly blocked the request).
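Once k6 is installed on the load-tester VM (the Terraform provisioner takes care of that), running the test is a single command from the directory where rate-test.js was copied:
k6 run rate-test.js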
We should see that only about 10 requests get through out of the roughly 1,600 requests we send from the same IP address.

We can also check the logs on our web servers to see all the requests that were forwarded to them.

And if we look inside the Redis cache itself, we can see all the keys along with their expiring TTLs.

And that is how we rate limit requests using a central Redis cache server.
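If you would rather verify the limiter by hand before running k6, you can hit the load balancer directly from any machine on the network. A quick sketch using curl (10.0.0.100 is the load balancer address from our Terraform config):
# The first ~10 requests within a minute should return a short URL (HTTP 200)
curl -X POST -d "url=https://example.com" http://10.0.0.100/shorten

# After the limit is hit, the app should start answering with HTTP 429
for i in $(seq 1 15); do
  curl -s -o /dev/null -w "%{http_code}\n" -X POST -d "url=https://example.com" http://10.0.0.100/shorten
done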
Here are the complete files used in the project.
terraform {
required_providers {
proxmox = {
source = "telmate/proxmox"
version = "3.0.2-rc04"
}
}
}
provider "proxmox" {
pm_api_url = var.proxmox_api_url
pm_api_token_id = var.proxmox_api_token_id
pm_api_token_secret = var.proxmox_api_token_secret
pm_tls_insecure = true
}
locals {
connection_settings = {
type = "ssh"
user = var.ssh_user
private_key = file(var.ssh_private_key_path)
}
}
resource "proxmox_lxc" "postgres_db" {
hostname = "postgres-db-lxc"
target_node = var.target_node
ostemplate = var.lxc_template
rootfs {
storage = "local-lvm"
size = "8G"
}
password = "admin"
unprivileged = true
start = true
features {
nesting = true
}
network {
name = "eth0"
bridge = "vmbr0"
ip = "10.0.0.200/24"
gw = "10.0.0.1"
}
provisioner "remote-exec" {
connection {
type = "ssh"
user = var.ssh_user
private_key = file(var.ssh_private_key_path)
host = split("/", self.network[0].ip)[0]
}
inline = [
"sudo apt-get update",
"sudo apt-get install -y docker.io docker-compose python3-setuptools",
"sudo usermod -aG docker ${var.ssh_user}",
"sudo mkdir -p /opt/postgres",
"sudo chown ${var.ssh_user}:${var.ssh_user} /opt/postgres"
]
}
provisioner "file" {
connection {
type = "ssh"
user = var.ssh_user
private_key = file(var.ssh_private_key_path)
host = split("/", self.network[0].ip)[0]
}
source = "../databases/pg-docker-compose.yml"
destination = "/opt/postgres/docker-compose.yml"
}
provisioner "remote-exec" {
inline = ("cd /opt/postgres && sudo docker-compose up -d")
connection {
type = "ssh"
user = var.ssh_user
private_key = file(var.ssh_private_key_path)
host = split("/", self.network[0].ip)[0]
}
}
}
resource "proxmox_lxc" "mongo_db" {
hostname = "mongo-db-lxc"
target_node = var.target_node
ostemplate = var.lxc_template
rootfs {
storage = "local-lvm"
size = "8G"
}
password = "admin"
unprivileged = true
start = true
features {
nesting = true
}
network {
name = "eth0"
bridge = "vmbr0"
ip = "10.0.0.210/24"
gw = "10.0.0.1"
}
provisioner "remote-exec" {
connection {
type = "ssh"
user = var.ssh_user
private_key = file(var.ssh_private_key_path)
host = split("/", self.network[0].ip)[0]
}
inline = [
"sudo apt-get update",
"sudo apt-get install -y docker.io docker-compose python3-setuptools",
"sudo usermod -aG docker ${var.ssh_user}",
"sudo mkdir -p /opt/mongo",
"sudo chown ${var.ssh_user}:${var.ssh_user} /opt/mongo"
]
}
provisioner "file" {
connection {
type = "ssh"
user = var.ssh_user
private_key = file(var.ssh_private_key_path)
host = split("/", self.network[0].ip)[0]
}
source = "../databases/mongo-docker-compose.yml"
destination = "/opt/mongo/docker-compose.yml"
}
provisioner "remote-exec" {
connection {
type = "ssh"
user = var.ssh_user
private_key = file(var.ssh_private_key_path)
host = split("/", self.network[0].ip)[0]
}
inline = ("cd /opt/mongo && docker-compose up -d")
}
}
resource "proxmox_vm_qemu" "redis_cache" {
vmid = 130
name = "redis-cache-rate-limiter"
target_node = "pve"
agent = 1
cpu {
cores = 1
}
memory = 1024
boot = "order=scsi0"
clone = "debian12-cloudinit"
scsihw = "virtio-scsi-single"
vm_state = "running"
automatic_reboot = true
cicustom = "vendor=local:snippets/qemu-guest-agent.yml"
ciupgrade = true
nameserver = "1.1.1.1 8.8.8.8"
ipconfig0 = "ip=10.0.0.130/24,gw=10.0.0.1"
skip_ipv6 = true
ciuser = var.ssh_user
cipassword = var.ssh_password
sshkeys = var.ssh_key
serial {
id = 0
}
disks {
scsi {
scsi0 {
disk {
storage = "local-lvm"
size = "5G"
}
}
}
ide {
ide1 {
cloudinit {
storage = "local-lvm"
}
}
}
}
network {
id = 0
bridge = "vmbr0"
model = "virtio"
}
connection {
type = "ssh"
user = var.ssh_user
private_key = file(var.ssh_private_key_path)
host = "10.0.0.130"
}
provisioner "remote-exec" {
inline = [
"echo 'Waiting for cloud-init to finish...'",
"while [ ! -f /var/lib/cloud/instance/boot-finished ]; do echo 'Still waiting...' && sleep 1; done",
"echo 'Cloud-init finished.'",
"sudo apt-get update -y",
"sudo apt-get install -y docker.io docker-compose",
"sudo mkdir -p /opt/redis",
]
}
provisioner "file" {
source = "../caching/redis-docker-compose.yml"
destination = "/home/${var.ssh_user}/docker-compose.yml"
}
provisioner "remote-exec" {
inline = ( "sudo mv /home/${var.ssh_user}/docker-compose.yml /opt/redis/docker-compose.yml" )
}
provisioner "remote-exec" {
inline = ( "cd /opt/redis && sudo docker-compose up -d" )
}
}
resource "proxmox_vm_qemu" "web-servers" {
count = 2
vmid = count.index + 150
name = "web-server-tf-${count.index + 1}"
target_node = "pve"
agent = 1
cpu {
cores = 1
}
memory = 1024
boot = "order=scsi0"
clone = "debian12-cloudinit"
scsihw = "virtio-scsi-single"
vm_state = "running"
automatic_reboot = true
cicustom = "vendor=local:snippets/qemu-guest-agent.yml"
ciupgrade = true
nameserver = "1.1.1.1 8.8.8.8"
ipconfig0 = "ip=10.0.0.${111 + count.index}/24,gw=10.0.0.1"
skip_ipv6 = true
ciuser = var.ssh_user
cipassword = var.ssh_password
sshkeys = var.ssh_key
serial {
id = 0
}
disks {
scsi {
scsi0 {
disk {
storage = "local-lvm"
size = "5G"
}
}
}
ide {
ide1 {
cloudinit {
storage = "local-lvm"
}
}
}
}
network {
id = 0
bridge = "vmbr0"
model = "virtio"
}
connection {
type = "ssh"
user = var.ssh_user
private_key = file(var.ssh_private_key_path)
host = "10.0.0.${111 + count.index}"
}
provisioner "remote-exec" {
inline = [
"echo 'Waiting for cloud-init to finish...'",
"while [ ! -f /var/lib/cloud/instance/boot-finished ]; do echo 'Still waiting...' && sleep 1; done",
"echo 'Cloud-init finished.'",
"sudo apt-get update -y",
"sudo apt-get install -y docker.io",
"sudo mkdir -p /opt/app",
]
}
provisioner "file" {
source = "../web-servers/app.py"
destination = "/home/${var.ssh_user}/app.py"
}
provisioner "file" {
source = "../web-servers/Dockerfile"
destination = "/home/${var.ssh_user}/Dockerfile"
}
provisioner "file" {
source = "../web-servers/requirements.txt"
destination = "/home/${var.ssh_user}/requirements.txt"
}
provisioner "remote-exec" {
inline = [
"sudo mv /home/${var.ssh_user}/app.py /opt/app/",
"sudo mv /home/${var.ssh_user}/Dockerfile /opt/app/",
"sudo mv /home/${var.ssh_user}/requirements.txt /opt/app/",
"sudo docker build -t my-python-app /opt/app",
"sudo docker stop $(sudo docker ps -q --filter ancestor=my-python-app) 2>/dev/null || true",
"sudo docker rm $(sudo docker ps -aq --filter ancestor=my-python-app) 2>/dev/null || true",
"sudo docker run -d --restart always -p 80:5000 my-python-app"
]
}
depends_on = [
proxmox_lxc.postgres_db,
proxmox_vm_qemu.redis_cache
]
}
resource "proxmox_vm_qemu" "load_balancer" {
name = "lb-1"
target_node = var.target_node
clone = var.vm_template
agent = 1
cpu {
cores = 1
}
memory = 512
boot = "order=scsi0"
scsihw = "virtio-scsi-single"
vm_state = "running"
automatic_reboot = true
cicustom = "vendor=local:snippets/qemu-guest-agent.yml"
ciupgrade = true
nameserver = "1.1.1.1 8.8.8.8"
ipconfig0 = "ip=10.0.0.100/24,gw=10.0.0.1"
skip_ipv6 = true
ciuser = var.ssh_user
cipassword = var.ssh_password
sshkeys = var.ssh_key
serial {
id = 0
}
disks {
scsi {
scsi0 {
disk {
storage = "local-lvm"
size = "5G"
}
}
}
ide {
ide1 {
cloudinit {
storage = "local-lvm"
}
}
}
}
network {
id = 0
bridge = "vmbr0"
model = "virtio"
}
connection {
type = "ssh"
user = var.ssh_user
private_key = file(var.ssh_private_key_path)
host = "10.0.0.100"
}
provisioner "remote-exec" {
inline = [
"echo 'Waiting for cloud-init to finish...'",
"while [ ! -f /var/lib/cloud/instance/boot-finished ]; do echo 'Still waiting...' && sleep 1; done",
"echo 'Cloud-init finished.'",
"sudo apt-get update -y",
"sudo apt-get install -y nginx"
]
}
provisioner "file" {
source = "../web-servers/nginx.conf"
destination = "/tmp/nginx.conf"
}
provisioner "remote-exec" {
inline = [
"sudo mv /tmp/nginx.conf /etc/nginx/sites-available/default",
"sudo systemctl reload nginx"
]
}
}
resource "proxmox_vm_qemu" "load_tester" {
name = "load-tester-vm"
target_node = var.target_node
clone = var.vm_template
agent = 1
cpu {
cores = 1
}
memory = 1024
boot = "order=scsi0"
scsihw = "virtio-scsi-single"
vm_state = "running"
automatic_reboot = true
cicustom = "vendor=local:snippets/qemu-guest-agent.yml"
ciupgrade = true
nameserver = "1.1.1.1 8.8.8.8"
ipconfig0 = "ip=10.0.0.160/24,gw=10.0.0.1"
skip_ipv6 = true
ciuser = var.ssh_user
cipassword = var.ssh_password
sshkeys = var.ssh_key
serial {
id = 0
}
disks {
scsi {
scsi0 {
disk {
storage = "local-lvm"
size = "5G"
}
}
}
ide {
ide1 {
cloudinit {
storage = "local-lvm"
}
}
}
}
network {
id = 0
bridge = "vmbr0"
model = "virtio"
}
provisioner "remote-exec" {
connection {
type = "ssh"
user = var.ssh_user
private_key = file(var.ssh_private_key_path)
host = "10.0.0.160"
}
inline = [
"echo 'Waiting for cloud-init to finish...'",
"while [ ! -f /var/lib/cloud/instance/boot-finished ]; do echo 'Still waiting...' && sleep 1; done",
"echo 'Cloud-init finished.'",
"sudo apt-get update -y",
"sudo apt-get install -y gnupg curl",
"curl -sL | sudo gpg --dearmor -o /usr/share/keyrings/k6-archive-keyring.gpg",
"echo 'deb [signed-by=/usr/share/keyrings/k6-archive-keyring.gpg] stable main' | sudo tee /etc/apt/sources.list.d/k6.list",
"sudo apt-get update",
"sudo apt-get install -y k6"
]
}
provisioner "file" {
connection {
type = "ssh"
user = var.ssh_user
private_key = file(var.ssh_private_key_path)
host = "10.0.0.160"
}
source = "../load-testing/script.js"
destination = "/home/${var.ssh_user}/script.js"
}
provisioner "file" {
connection {
type = "ssh"
user = var.ssh_user
private_key = file(var.ssh_private_key_path)
host = "10.0.0.160"
}
source = "../load-testing/rate-test.js"
destination = "/home/${var.ssh_user}/rate-test.js"
}
}
Conclusion
You have now seen how to build a complete, scalable, and resilient system that includes a critical component of modern web applications: a distributed rate limiter.
We have covered the whole stack:
Infrastructure as Code with Terraform to define our virtual machines. (Check my repo here for all the code and any updates I make.)
A central, high-speed data store with Redis to hold our rate limit data.
An efficient sliding window log algorithm implemented with Flask.
Containerization for consistent deployments with Docker.
Load balancing with Nginx to distribute traffic.
Load testing with k6 to verify our implementation.
If you would like to learn more about the concepts that go into building large-scale systems, please follow me, Srivan Croatory.