🔗 Building a High-Performance URL Shortener with Rate Limiting: 1M RPS 🚀

Designing a URL shortener that can handle millions of requests per second, prevent abuse, and scale dynamically is no small feat. Below is a clean 5-service breakdown — each with its own goal, stack, real-world relevance 🌍, and implementation blueprint 🛠️.



🔹 1. URL Shortening Service

🎯 Goal: Generate unique, collision-free short URLs

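💡 Key Idea: Use a Base62/Hashids encoder on either an incrementing ID or a distributed unique ID (e.g., a Snowflake ID) to create short tokens. A 64-bit ID encodes to at most 11 Base62 characters, whereas a 128-bit UUID needs up to 22, so UUIDs are a poor fit when token length matters. Store the mapping in a high-throughput NoSQL database.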

⚙️ Tech Stack: Java + Spring Boot, Base62/Hashids, Cassandra/ScyllaDB, Kafka (async writes)

🌐 Real-Time Use Case:
A user submits a long product link — the system instantly returns a short URL like https://sh.rt/xyZ123

🔧 Implementation Steps:
  1. Create a REST controller to accept long URLs.
  2. Use a Base62 encoder or Hashids to generate the short token from a DB-generated or Snowflake-style numeric ID (a UUID works but yields a much longer token).
  3. Persist the mapping (short → long) in Cassandra/ScyllaDB.
  4. Publish an async Kafka event to log or track analytics.
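
As a rough illustration of step 2, here is a minimal Base62 sketch. The class name `Base62` and the alphabet ordering (digits, then uppercase, then lowercase) are assumptions made for this example, not something prescribed by Hashids or any specific library.

```java
// Hypothetical helper: Base62-encodes a non-negative numeric ID into a short token.
final class Base62 {
    // Assumed alphabet order: 0-9, A-Z, a-z. Any fixed order works if encode and decode agree.
    private static final String ALPHABET =
            "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    private static final int BASE = ALPHABET.length(); // 62

    private Base62() {}

    // Converts the ID to base 62, most significant digit first.
    static String encode(long id) {
        if (id < 0) {
            throw new IllegalArgumentException("id must be non-negative");
        }
        if (id == 0) {
            return "0";
        }
        StringBuilder sb = new StringBuilder();
        while (id > 0) {
            sb.append(ALPHABET.charAt((int) (id % BASE)));
            id /= BASE;
        }
        return sb.reverse().toString();
    }

    // Reverses encode(); rejects characters outside the alphabet.
    static long decode(String token) {
        long id = 0;
        for (int i = 0; i < token.length(); i++) {
            int digit = ALPHABET.indexOf(token.charAt(i));
            if (digit < 0) {
                throw new IllegalArgumentException("invalid character: " + token.charAt(i));
            }
            id = id * BASE + digit;
        }
        return id;
    }
}
```

With this alphabet, `Base62.encode(62)` returns `"10"`, and even the largest 64-bit ID fits in 11 characters. Sequential IDs produce guessable tokens, which is one reason to consider Hashids or a Snowflake-style ID instead of a plain counter.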



🔹 2. Rate Limiting Layer

🎯 Goal: Prevent abuse by limiting requests per IP/user

💡 Key Idea: Use the token bucket algorithm to limit burst traffic, with Redis to coordinate limits across distributed instances.

⚙️ Tech Stack: Bucket4j (Java filter), Redis (token bucket store), Lua scripts (for atomic operations)

🌐 Real-Time Use Case:
A bot tries to generate 500 links/sec — the system throttles it to 10/sec.

🔧 Implementation Steps:
  1. Add a Bucket4j (or other RateLimiter) check as a Spring filter or interceptor.
  2. Store tokens per IP in Redis keys (e.g., ratelimit:ip:192.168.1.1).
  3. Use Lua scripts so the refill and decrement for each key happen atomically.
  4. Return HTTP 429 if limits are exceeded.
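
To make the token bucket idea concrete, here is a single-process sketch of the algorithm itself. It is not Bucket4j's API and it keeps state in memory rather than in Redis; the class name `TokenBucket` and the injectable clock are assumptions added so the behavior is easy to test.

```java
import java.util.function.LongSupplier;

// In-memory sketch of a token bucket for one client key (e.g., one IP).
// In the design above, this state would live in Redis and be updated by a Lua script.
final class TokenBucket {
    private final long capacity;          // maximum burst size
    private final double refillPerNano;   // tokens added per nanosecond
    private final LongSupplier nanoClock; // injectable time source (assumption, for testing)
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(long capacity, double refillPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.refillPerNano = refillPerSecond / 1_000_000_000.0;
        this.nanoClock = nanoClock;
        this.tokens = capacity; // start full so an initial burst is allowed
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    // Refills based on elapsed time, then tries to take one token.
    synchronized boolean tryConsume() {
        long now = nanoClock.getAsLong();
        tokens = Math.min(capacity, tokens + (now - lastRefillNanos) * refillPerNano);
        lastRefillNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false; // the caller would answer with HTTP 429
    }
}
```

In production code the clock would typically be `System::nanoTime`. A bucket created with capacity 10 and a refill rate of 10 tokens/sec matches the "throttle to 10/sec" example above: a short burst of 10 passes, and sustained traffic is held to roughly 10 requests per second.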




🔹 3. Redirection Service

🎯 Goal: Redirect short URL to original URL in under 20ms

💡 Key Idea: Use Redis as a blazing-fast cache for short URLs and fall back to persistent DB if needed.

⚙️ Tech Stack: Java (Vert.x or Netty), Redis (hot cache), CDN (edge cache)

🌐 Real-Time Use Case:
User clicks https://sh.rt/xyZ123 → instantly redirected to original page

🔧 Implementation Steps:
  1. Expose a GET endpoint like /{token}, so that https://sh.rt/xyZ123 resolves the token xyZ123.
  2. Look up original URL in Redis (hot path).
  3. If not found, fall back to Cassandra and repopulate Redis.
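  4. Send an HTTP redirect to the user: 302 keeps every click flowing through the service (and the analytics pipeline), while 301 lets browsers cache the redirect and skip later lookups.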
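
The lookup in steps 2 and 3 is the cache-aside pattern. Below is a small sketch of it, under the assumption that a `ConcurrentHashMap` stands in for Redis and a plain `Function` stands in for the Cassandra query; `RedirectResolver` is a hypothetical name, and a real service would use its Redis and Cassandra clients here.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Cache-aside sketch: check the hot cache first, fall back to the store, repopulate the cache.
final class RedirectResolver {
    // Stand-in for Redis (assumption for this example).
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // Stand-in for the Cassandra lookup (assumption for this example).
    private final Function<String, Optional<String>> store;

    RedirectResolver(Function<String, Optional<String>> store) {
        this.store = store;
    }

    // Returns the original URL for a token, or empty if the token is unknown.
    Optional<String> resolve(String token) {
        String cached = cache.get(token);
        if (cached != null) {
            return Optional.of(cached); // hot path: no database round trip
        }
        Optional<String> fromStore = store.apply(token);
        fromStore.ifPresent(url -> cache.put(token, url)); // repopulate on a miss
        return fromStore;
    }
}
```

An empty result would map to HTTP 404 in the controller. This sketch never expires entries; with Redis you would normally set a TTL so rarely used links fall out of the hot cache.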




🔹 4. Analytics & Logging Pipeline

🎯 Goal: Track usage, detect abuse, and power dashboards

💡 Key Idea: Use Kafka for real-time event streaming, then process logs with Flink or Spark, and index into Elasticsearch for visualization.

⚙️ Tech Stack: Kafka (event ingestion), Flink/Spark (stream processing), Elasticsearch + Kibana

🌐 Real-Time Use Case:
Marketing dashboard shows top 100 clicked links this hour.

🔧 Implementation Steps:
  1. Publish events (clicks, errors, rate-limit rejections) to Kafka.
  2. Create Flink/Spark jobs to process Kafka streams in real time.
  3. Push processed data into Elasticsearch.
  4. Use Kibana to build dashboards and set up alerts.
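
The "top clicked links this hour" dashboard boils down to counting clicks per token inside an hourly tumbling window. The sketch below shows that aggregation in plain Java rather than in Flink or Spark; `HourlyClickCounter` and its method names are hypothetical, and a real stream job would also handle late events and state cleanup.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Plain-Java sketch of an hourly tumbling-window click count.
final class HourlyClickCounter {
    private static final long HOUR_MILLIS = 3_600_000L;
    // hour bucket (epoch millis / one hour) -> token -> click count
    private final Map<Long, Map<String, Long>> countsByHour = new HashMap<>();

    // Records one click event using its event timestamp.
    void record(String token, long epochMillis) {
        long hour = epochMillis / HOUR_MILLIS;
        countsByHour.computeIfAbsent(hour, h -> new HashMap<>())
                .merge(token, 1L, Long::sum);
    }

    // Returns up to n tokens for the hour containing epochMillis,
    // ordered by click count (descending), ties broken by token.
    List<String> topLinks(long epochMillis, int n) {
        Map<String, Long> counts =
                countsByHour.getOrDefault(epochMillis / HOUR_MILLIS, Map.of());
        return counts.entrySet().stream()
                .sorted((a, b) -> {
                    int byCount = Long.compare(b.getValue(), a.getValue());
                    return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
                })
                .limit(n)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

In the pipeline above, the equivalent logic would run as a keyed, windowed aggregation in the stream processor, with the per-window results written to Elasticsearch for Kibana to chart.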




🔹 5. Load Balancing & Auto-Scaling Layer

🎯 Goal: Balance load and scale dynamically on traffic surges

💡 Key Idea: Use the Kubernetes HPA to scale services out based on CPU, memory, or custom metrics like requests/sec, with NGINX/HAProxy spreading traffic across the pods.

⚙️ Tech Stack: K8s, HAProxy/NGINX, Spring Cloud Gateway, Horizontal Pod Autoscaler (HPA)

🌐 Real-Time Use Case:
A YouTube campaign goes live and traffic surges from 1K RPS to 50K RPS; the HPA adds pods as the metrics cross their targets.

🔧 Implementation Steps:
  1. Deploy all microservices to Kubernetes.
  2. Expose ingress through Spring Cloud Gateway or NGINX.
  3. Enable HPA to scale pods based on Prometheus metrics (RPS, CPU, memory).
  4. Configure HAProxy/NGINX for load distribution.
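
To get a feel for how the HPA reacts to a surge, it helps to look at its core scaling rule: desired replicas = ceil(current replicas × current metric / target metric), clamped to the configured minimum and maximum. The sketch below reproduces only that arithmetic; `HpaMath` is a hypothetical helper, and the per-pod target of 1,000 RPS used in the note afterwards is an assumed value, not one from this design.

```java
// Sketch of the HPA's core scaling arithmetic (tolerance and stabilization are ignored).
final class HpaMath {
    private HpaMath() {}

    // currentMetric and targetMetric are per-pod averages, e.g., requests/sec per pod.
    static int desiredReplicas(int currentReplicas,
                               double currentMetric,
                               double targetMetric,
                               int minReplicas,
                               int maxReplicas) {
        int desired = (int) Math.ceil(currentReplicas * (currentMetric / targetMetric));
        // Clamp to the bounds configured on the autoscaler.
        return Math.max(minReplicas, Math.min(maxReplicas, desired));
    }
}
```

For example, with 2 pods each seeing 25,000 RPS against an assumed target of 1,000 RPS per pod, `HpaMath.desiredReplicas(2, 25000.0, 1000.0, 2, 100)` asks for 50 pods. The real controller also applies a tolerance band and scale-down stabilization, so actual behavior is smoother than this calculation suggests.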




✅ Summary

  • 🔗 Shortener: Unique short URLs using Base62 + NoSQL
  • 🚦 Rate Limiter: Token bucket via Redis + Bucket4j
  • Redirector: Instant redirects via Redis/Netty/CDN
  • 📊 Analytics: Real-time event tracking with Kafka + Flink
  • 🌩️ Auto-Scaling: K8s HPA + HAProxy for scale and uptime






Thanks for reading! 💬 Feel free to clone, tweak, or ask questions anytime 🚀
