The Problem • The Fix • Install • How It Works • Configuration • FAQ
You have background jobs — reports, ETL, batch processing, file uploads. They run fine... until they don't.
```java
@Service
public class OrderService {

    private final ExecutorService backgroundExecutor = Executors.newFixedThreadPool(4);

    // Business logic — must stay responsive
    public Order createOrder(OrderRequest request) {
        return orderRepository.save(new Order(request));
    }

    // Background job — processes thousands of records
    public void generateDailyReport(List<Order> orders) {
        backgroundExecutor.submit(() -> {
            for (Order order : orders) {
                reportService.process(order); // 💀 Competes with createOrder() for CPU/memory
            }
        });
    }
}
```

Then this happens:
- Monday 2:00pm: Marketing kicks off their weekly report
- Monday 2:01pm: CPU hits 95%; `createOrder()` latency spikes from 50ms to 3 seconds
- Monday 2:02pm: Your Kubernetes pod gets OOMKilled
- Monday 2:03pm: PagerDuty goes off, and you stop what you're doing
The root cause: Your background jobs don't know when to back off. They'll happily consume 100% of resources while your users wait.
Throttle is a drop-in replacement for ExecutorService that automatically pauses tasks when your system is under pressure.
```java
// Before: Background jobs compete with your app for resources
ExecutorService executor = Executors.newFixedThreadPool(4);

// After: Background jobs yield when CPU/memory is high
ThrottleService executor = ThrottleServiceFactory.builder()
    .cpuMonitor(75, 50)    // Pause when CPU > 75%, resume when < 50%
    .memoryMonitor(70, 50) // Pause when memory > 70%, resume when < 50%
    .build();
```

That's it. Your background jobs now automatically pause during traffic spikes and resume when things calm down.
```xml
<dependency>
    <groupId>io.github.sdeonvacation</groupId>
    <artifactId>throttle</artifactId>
    <version>1.0.1</version>
</dependency>
```

Zero dependencies. Just add and go.
```java
// 1. Create the executor (once, typically at startup)
ThrottleService executor = ThrottleServiceFactory.builder()
    .cpuMonitor(75, 50)
    .memoryMonitor(70, 50)
    .build();

// 2. Submit chunked tasks (the chunks are pause points)
executor.submit(new AbstractChunkableTask<Order>(orders, Priority.LOW, 100) {
    @Override
    public void processChunk(List<Order> chunk) {
        chunk.forEach(this::processOrder);
        // ↑ After each chunk, Throttle checks CPU/memory.
        // If the system is stressed, it pauses here until resources free up.
    }
});

// 3. Your app stays responsive. Users are happy. You sleep through the night.
```

```
                              YOUR APPLICATION
 ─────────────────────────────────────────────────────────────────────────

  API Requests (business logic)       Background Jobs (reports, ETL, etc.)
              │                                       │
              ▼                                       ▼
     ┌─────────────────┐                  ┌─────────────────────┐
     │ Normal threads  │                  │   ThrottleService   │
     │  (unaffected)   │                  └──────────┬──────────┘
     └─────────────────┘                             │
                                                     ▼
                                       ┌──────────────────────────┐
                                       │  Task split into chunks  │
                                       │   [1] [2] [3] [4] [5]... │
                                       └────────────┬─────────────┘
                                                    │
         ┌──────────────────────────────────────────┘
         ▼
     ┌────────┐          ┌────────┐          ┌────────┐
     │Chunk 1 │─────────▶│Chunk 2 │─────────▶│Chunk 3 │──▶ ...
     └───┬────┘          └───┬────┘          └───┬────┘
         ▼                   ▼                   ▼
    ┌──────────┐        ┌──────────┐        ┌──────────┐
    │  Check   │        │  Check   │        │  Check   │
    │ CPU/Mem  │        │ CPU/Mem  │        │ CPU/Mem  │
    └────┬─────┘        └────┬─────┘        └────┬─────┘
         │                   │                   │
    OK? ─┴─ HOT?        OK? ─┴─ HOT?             │
     │       │           │       │               ▼
     ▼       ▼           ▼       ▼
  Continue  PAUSE ──▶ Wait for resources ──▶ Continue
                         to cool


   SYSTEM LOAD                          TASK BEHAVIOR
 ┌───────────────────┐               ┌───────────────────────┐
 │ ████████░░  80%   │    ─────▶     │  ⏸  PAUSED (waiting)  │
 │ CPU/memory high!  │               └───────────────────────┘
 └───────────────────┘
 ┌───────────────────┐               ┌───────────────────────┐
 │ ████░░░░░░  40%   │    ─────▶     │  ▶  RUNNING (resumed) │
 │ CPU/memory normal │               └───────────────────────┘
 └───────────────────┘
```
1. **You split work into chunks** — Throttle needs natural pause points.
2. **Throttle executes chunks** — just like a normal executor.
3. **Between chunks, Throttle checks resources** — is CPU hot? Is memory tight?
4. **If resources are constrained, Throttle pauses** — your task waits (not spinning; actually blocked).
5. **When resources free up, Throttle resumes** — right where it left off.

**Why chunks?** Java can't pause a thread mid-execution. Chunks give Throttle safe points to pause without losing work.
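The chunk-and-checkpoint idea is easiest to see in code. The sketch below is not Throttle's internals — just a stdlib-only illustration of the pattern, where the `loadPercent` supplier stands in for a real CPU/memory probe:

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.IntSupplier;

class ChunkCheckpointSketch {

    // Process items one chunk at a time; between chunks, back off while load is hot.
    // Returns the number of pause cycles taken.
    static <T> int processInChunks(List<T> items, int chunkSize,
                                   IntSupplier loadPercent, int hotThreshold,
                                   Consumer<T> work) {
        int pauses = 0;
        for (int start = 0; start < items.size(); start += chunkSize) {
            // Checkpoint: the only place the task can pause.
            while (loadPercent.getAsInt() > hotThreshold) {
                pauses++;
                try {
                    Thread.sleep(10); // a real impl would block on a monitor, not poll
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return pauses;
                }
            }
            int end = Math.min(start + chunkSize, items.size());
            for (T item : items.subList(start, end)) {
                work.accept(item); // the chunk itself runs uninterrupted
            }
        }
        return pauses;
    }

    public static void main(String[] args) {
        int[] load = {90, 90, 40}; // hot, hot, then cool
        int[] probeCalls = {0};
        IntSupplier probe = () -> load[Math.min(probeCalls[0]++, load.length - 1)];
        int[] processed = {0};
        int pauses = processInChunks(List.of(1, 2, 3, 4, 5), 2, probe, 75, x -> processed[0]++);
        System.out.println("processed=" + processed[0] + " pauses=" + pauses);
        // → processed=5 pauses=2 : the task waited out the hot spell, then finished all items
    }
}
```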
| Without Throttle | With Throttle |
|---|---|
| Background job runs at full speed | Background job pauses when system is stressed |
| API latency spikes during batch jobs | API stays responsive |
| OOMKilled pods | Graceful degradation |
| PagerDuty at 2am | Sleep |
Perfect for:
- ETL pipelines and batch processing
- Report generation
- Bulk API calls (rate-limited external services)
- File processing (uploads, exports, archives)
- Database migrations and cleanup jobs
- Any "background work" that can be chunked
Not for:
- Sub-millisecond latency requirements (chunk overhead is ~1ms)
- Work that can't be split into chunks
- Simple fire-and-forget tasks with no resource concerns
Every parameter is tunable. Sane defaults, full control when you need it.
```java
ThrottleService executor = ThrottleServiceFactory.builder()
    // Resource thresholds
    .cpuMonitor(75, 50)       // hot=75%, cold=50%
    .memoryMonitor(70, 50)    // hot=70%, cold=50%

    // Threading (bring your own or use defaults)
    .workerThreadPool(Executors.newFixedThreadPool(10))
    .monitoringThreadPool(Executors.newFixedThreadPool(2))

    // Timing
    .hysteresis(Duration.ofSeconds(10))                     // Min time in a state before transitioning
    .coldMonitoringInterval(Duration.ofSeconds(5))          // Resume-detection polling interval
    .hotMonitoringDebounceInterval(Duration.ofMillis(100))  // Debounce between checkpoint samples

    // Queue management
    .queueCapacity(100)
    .overflowPolicy(OverflowPolicy.REJECT)  // or DISCARD_OLDEST, BLOCK

    // Task killing (opt-in, for runaway tasks)
    .maxPauseCount(5)              // Kill a task after 5 pauses
    .taskTerminationEnabled(true)  // Disabled by default; enable explicitly

    // Anti-starvation
    .starvationThreshold(Duration.ofHours(2))  // Boost priority after 2h of waiting
    .build();
```

```java
// High-priority tasks run first
executor.submit(new MyTask(items, Priority.HIGH, 100));

// Low-priority tasks yield to high priority
executor.submit(new MyTask(items, Priority.LOW, 100));
```

```java
ExecutorMetrics metrics = executor.getMetrics();

metrics.getActiveThreads();   // Currently running
metrics.getQueueSize();       // Waiting in queue
metrics.getTasksCompleted();  // Finished successfully
metrics.getTasksFailed();     // Failed with exception
metrics.getTasksKilled();     // Killed for pausing too much
metrics.isPaused();           // Is the system currently paused?
metrics.getPauseCount();      // Total pauses since startup
```

Plug these into Prometheus, Datadog, or whatever you use.
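However you export them, the wiring has the same shape: register each getter as a gauge and let the agent poll it. A JDK-only sketch of that bridge (the metric names are made up for illustration; substitute whatever your registry expects):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Minimal gauge registry: name -> supplier, read on each scrape.
class MetricsBridge {
    private final Map<String, Supplier<Number>> gauges = new LinkedHashMap<>();

    void register(String name, Supplier<Number> source) {
        gauges.put(name, source);
    }

    // Called by your exporter on each scrape/poll (Prometheus, Datadog, ...)
    Map<String, Number> snapshot() {
        Map<String, Number> out = new LinkedHashMap<>();
        gauges.forEach((name, source) -> out.put(name, source.get()));
        return out;
    }

    public static void main(String[] args) {
        MetricsBridge bridge = new MetricsBridge();
        // With a real ThrottleService you'd register method references, e.g.
        // bridge.register("throttle.queue.size", metrics::getQueueSize);
        bridge.register("throttle.queue.size", () -> 3);
        bridge.register("throttle.tasks.completed", () -> 42);
        System.out.println(bridge.snapshot());
    }
}
```

Suppliers (rather than stored values) matter here: the exporter always sees the current reading, not a stale copy.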
| Feature | Throttle | ExecutorService | Resilience4j Bulkhead | Spring Batch |
|---|---|---|---|---|
| Auto-pauses on CPU pressure | ✅ | ❌ | ❌ | ❌ |
| Auto-pauses on memory pressure | ✅ | ❌ | ❌ | ❌ |
| Auto-resumes when clear | ✅ | ❌ | ❌ | ❌ |
| Priority scheduling | ✅ | ❌ | ❌ | Limited |
| Zero dependencies | ✅ | ✅ | ❌ | ❌ |
| Chunked checkpoints | ✅ | ❌ | ❌ | ✅ |
**Why do I need to split work into chunks?**
Java doesn't let you pause a thread mid-execution. Chunks give Throttle safe points to pause without losing progress. Think of them as checkpoints in a video game.
**What's the overhead?**
Minimal. Throttle only checks resources between chunks (not continuously). The check itself is ~1ms. If you're processing items in chunks of 100, that's 0.01ms per item.
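The amortized figure is just the checkpoint cost divided by the chunk size — assuming the ~1 ms check, a quick back-of-envelope (numbers are illustrative, not benchmarks):

```java
class OverheadMath {
    // Per-item overhead = checkpoint cost / items per chunk
    static double perItemOverheadMs(double checkCostMs, int chunkSize) {
        return checkCostMs / chunkSize;
    }

    public static void main(String[] args) {
        System.out.println(perItemOverheadMs(1.0, 100)); // 0.01 ms/item
        System.out.println(perItemOverheadMs(1.0, 10));  // 0.1 ms/item
    }
}
```

The tradeoff: bigger chunks amortize the check further, but the task takes longer to reach its next pause point.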
**My pods auto-scale. Why do I need this?**
Auto-scaling and Throttle solve different problems:
|  | Auto-scaling | Throttle |
|---|---|---|
| What it does | Adds more instances | Prioritizes work within each instance |
| Reaction time | Minutes (spin up new pods) | Milliseconds (pause between chunks) |
| Solves | "Not enough capacity" | "Background jobs starving business logic" |
| Cost | More $$ (more instances) | Zero (just smarter scheduling) |
The gap auto-scaling can't fill: When your pod is at 80% CPU, auto-scaling might spin up another pod. But within that pod, your batch job is still competing with API requests. Throttle makes the batch job yield.
Use both: Auto-scaling for capacity. Throttle for priority.
**What about environments without auto-scaling?**
Throttle becomes critical. On Cloud Foundry, on-prem, or fixed-capacity environments, you can't add instances when load spikes. Throttle is your safety valve — background work automatically backs off before you hit OOM.
**What happens to paused tasks during shutdown?**
`executor.shutdown()` wakes all paused tasks and lets them complete. `shutdownNow()` interrupts them.
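Since `ThrottleService` follows the `ExecutorService` contract, the standard JDK two-phase shutdown idiom applies unchanged. The sketch below runs against a plain `ExecutorService`, but the same sequence is what you'd call on a `ThrottleService` at application shutdown:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class GracefulShutdown {
    // Standard two-phase shutdown: stop accepting work, wait, then force.
    static boolean shutdownGracefully(ExecutorService executor, long timeoutSeconds)
            throws InterruptedException {
        executor.shutdown(); // in Throttle's case this also wakes paused tasks
        if (!executor.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
            executor.shutdownNow(); // interrupt anything still running
            return executor.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(() -> System.out.println("task done"));
        System.out.println("terminated=" + shutdownGracefully(pool, 5));
    }
}
```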
**Can I use this with Spring Boot?**
Yes. Create a @Bean and inject it:
```java
@Bean
public ThrottleService throttleService() {
    return ThrottleServiceFactory.builder()
        .cpuMonitor(75, 50)
        .memoryMonitor(70, 50)
        .build();
}
```

**What if my task keeps getting paused?**
Two mechanisms handle this:

1. **Priority boosting** — Tasks waiting too long (default: 2 hours) get their priority bumped (LOW → MEDIUM → HIGH). This prevents starvation.
2. **Task termination** (optional) — If `taskTerminationEnabled(true)` is set, tasks that pause more than `maxPauseCount` times (default 5) are killed with a `TaskTerminatedException`. Disabled by default.
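Priority boosting is easiest to picture with a small sketch. This is not Throttle's implementation — just the idea, under my assumption that a task gains one level per starvation window it spends waiting:

```java
class StarvationBoostSketch {
    enum Priority { LOW, MEDIUM, HIGH }

    // Effective priority climbs one level per starvation window spent waiting,
    // capped at HIGH. With the default 2h threshold, a LOW task submitted at
    // noon would run as MEDIUM from 2pm and HIGH from 4pm.
    static Priority effectivePriority(Priority base, long waitedMillis, long thresholdMillis) {
        int boosts = (int) (waitedMillis / thresholdMillis);
        int level = Math.min(base.ordinal() + boosts, Priority.HIGH.ordinal());
        return Priority.values()[level];
    }

    public static void main(String[] args) {
        long twoHours = java.time.Duration.ofHours(2).toMillis();
        System.out.println(effectivePriority(Priority.LOW, 0, twoHours));            // LOW
        System.out.println(effectivePriority(Priority.LOW, twoHours, twoHours));     // MEDIUM
        System.out.println(effectivePriority(Priority.LOW, 2 * twoHours, twoHours)); // HIGH
    }
}
```

The cap at HIGH means boosted background work never outranks anything above the scheme's top tier; it just stops losing every tie-break forever.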
Want to see Throttle in action? The simulator lets you watch real-time pause/resume behavior:
```bash
git clone https://github.com/sdeonvacation/throttle.git
cd throttle/simulator
mvn spring-boot:run
# Open http://localhost:8080/api/simulator/dashboard
```

- DESIGN.md — Architecture deep-dive with diagrams
- Javadoc — API reference
- Simulator docs — Test scenarios and edge cases
PRs welcome! See CONTRIBUTING.md.
If Throttle saved you from a 2am page, consider giving it a ⭐
Made by Sambhrant Maurya