Windows¶
Windows are the memory of the circuit breaker. They store recent call outcomes (like successes, failures, and response times) and provide the metrics that trippers use to decide whether to open the circuit.
Fluxgate provides two types of windows:
| Window Type | Tracking Method | Best For... |
|---|---|---|
| CountWindow | Last N calls | Services with stable traffic, where you want to evaluate a fixed number of recent operations. |
| TimeWindow | Last N seconds | Services with variable or bursty traffic, where time-based evaluation is more meaningful. |
CountWindow¶
CountWindow tracks a fixed number of the most recent calls. It's a great choice for services with stable and predictable traffic patterns.
How It Works¶
It maintains a fixed-size circular buffer in memory. When a new call is recorded and the window is already full, the new call overwrites the oldest one. Once N calls have been recorded, the window always contains exactly the last N calls, providing a consistent volume for evaluation.
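The circular-buffer idea can be sketched in a few lines of plain Python. This is an illustrative model only, not Fluxgate's implementation: the names `Call` and `RingCountWindow` are hypothetical, and the running totals are an assumption about how reads could stay cheap.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """Outcome of one call (hypothetical stand-in, not a fluxgate type)."""
    success: bool
    duration: float


class RingCountWindow:
    """Illustrative fixed-size circular buffer over the last `size` calls."""

    def __init__(self, size: int) -> None:
        self._slots = [None] * size
        self._next = 0            # index of the slot to overwrite next
        self.total_count = 0      # running aggregates, updated on every record
        self.failure_count = 0
        self.total_duration = 0.0

    def record(self, call: Call) -> None:
        evicted = self._slots[self._next]
        if evicted is not None:
            # Window is full: subtract the oldest call before overwriting it.
            self.total_count -= 1
            self.failure_count -= 0 if evicted.success else 1
            self.total_duration -= evicted.duration
        self._slots[self._next] = call
        self._next = (self._next + 1) % len(self._slots)
        self.total_count += 1
        self.failure_count += 0 if call.success else 1
        self.total_duration += call.duration


window = RingCountWindow(size=3)
for ok in (False, True, True, True):
    window.record(Call(success=ok, duration=0.1))

# The first (failed) call has been overwritten by the fourth.
print(window.total_count, window.failure_count)  # 3 0
```

Because each `record()` touches one slot and adjusts a few counters, the cost per call does not depend on the window size.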
Basic Usage¶
from fluxgate import CircuitBreaker
from fluxgate.windows import CountWindow
# This breaker will base its decisions on the last 100 calls.
cb = CircuitBreaker(
    name="stable_api",
    window=CountWindow(size=100),
    ...
)
TimeWindow¶
TimeWindow tracks calls that have occurred over the last N seconds. It's ideal for services with irregular or bursty traffic where a time-based perspective is more important than a call count.
How It Works¶
It uses a series of time-based buckets (one for each second in the window). When a call is recorded, its outcome is aggregated into the bucket corresponding to the current timestamp. Old buckets that fall outside the time window expire automatically and are reused.
Because each bucket expires once it is older than N seconds, a sudden burst of failures stops influencing the metrics as soon as it ages out of the window.
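A minimal sketch of per-second buckets, written in plain Python rather than taken from Fluxgate. The class name `BucketTimeWindow` and its methods are hypothetical, and timestamps are passed in explicitly so the example is deterministic. For clarity this sketch sums the live buckets on every read; an implementation that wants constant-time reads would keep running totals instead.

```python
class BucketTimeWindow:
    """Illustrative per-second bucket window (not fluxgate's implementation)."""

    def __init__(self, size: int) -> None:
        self._size = size
        # One slot per second: [epoch_second, total_count, failure_count].
        self._buckets = [[-1, 0, 0] for _ in range(size)]

    def record(self, success: bool, now: float) -> None:
        second = int(now)
        bucket = self._buckets[second % self._size]
        if bucket[0] != second:
            # The slot holds an expired second: reuse it for the current one.
            bucket[0], bucket[1], bucket[2] = second, 0, 0
        bucket[1] += 1
        bucket[2] += 0 if success else 1

    def counts(self, now: float) -> tuple:
        """Return (total, failures) over the last `size` seconds."""
        oldest = int(now) - self._size + 1
        live = [b for b in self._buckets if b[0] >= oldest]
        return sum(b[1] for b in live), sum(b[2] for b in live)


window = BucketTimeWindow(size=3)
window.record(success=False, now=100.2)  # two failures during second 100
window.record(success=False, now=100.7)
window.record(success=True, now=102.0)

print(window.counts(now=102.5))  # (3, 2): second 100 is still in the window
print(window.counts(now=103.5))  # (1, 0): second 100 has expired
```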
Basic Usage¶
from fluxgate import CircuitBreaker
from fluxgate.windows import TimeWindow
# This breaker will base its decisions on calls made in the last 60 seconds.
cb = CircuitBreaker(
    name="variable_traffic_api",
    window=TimeWindow(size=60),
    ...
)
Choosing a Window¶
Comparison¶
| Feature | CountWindow | TimeWindow |
|---|---|---|
| Memory Usage | Proportional to the number of calls (size). | Proportional to the duration in seconds (size). |
| Traffic Spikes | A burst of calls can quickly flush out old data. | Retains data for the full duration, smoothing out bursts. |
| Low Traffic | Gathers a full set of metrics faster. | May take longer to collect enough data to be meaningful. |
| Evaluation Basis | A fixed number of calls. | A fixed duration of time. |
| Granularity | Per-call. | Per-second. |
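The "Traffic Spikes" row can be illustrated with a small simulation. The traffic pattern below is invented for the example, and the two views are modelled with a plain `deque` and a list filter rather than with Fluxgate's window classes.

```python
from collections import deque

# Hypothetical traffic: (timestamp_in_seconds, success) pairs.
# Five failures spread over t=0..4, then a burst of ten successes at t=5.
calls = [(float(t), False) for t in range(5)] + [(5.0, True)] * 10

# Count-based view: keep only the last 10 outcomes.
last_n = deque(maxlen=10)
for _, ok in calls:
    last_n.append(ok)
count_failures = sum(1 for ok in last_n if not ok)

# Time-based view: keep everything from the last 10 seconds (as of t=5).
now, horizon = 5.0, 10.0
recent = [ok for ts, ok in calls if now - ts < horizon]
time_failures = sum(1 for ok in recent if not ok)

print(count_failures, len(last_n))  # 0 10: the burst flushed the failures
print(time_failures, len(recent))   # 5 15: the failures are still visible
```

With the same traffic, the count-based view reports a failure rate of 0% while the time-based view reports 5 failures out of 15 calls.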
When should I use CountWindow?¶
CountWindow is an excellent choice when you have:
- Stable and predictable traffic: The rate of calls doesn't fluctuate dramatically.
- A need for memory efficiency: It often consumes less memory than TimeWindow for equivalent coverage.
- A desire for fast evaluation: It can fill up and provide meaningful metrics quickly.
Common use cases: Internal microservice-to-microservice communication, background processing, or batch jobs.
When should I use TimeWindow?¶
TimeWindow is generally recommended and is a safer default choice, especially when you have:
- Irregular or bursty traffic: It handles sudden spikes in traffic gracefully.
- A need for time-based policies: Your SLOs are likely defined in terms of time (e.g., "99.9% uptime over any 5-minute window").
- A focus on real-time responsiveness: It ensures that decisions are always based on a recent time period, regardless of call volume.
Common use cases: Public-facing APIs, user-facing services, or calls to volatile external services.
Metrics¶
Both window types provide the same rich set of metrics for trippers.
from fluxgate.windows import CountWindow
from fluxgate.metric import Record
window = CountWindow(size=100)
# Record calls manually
window.record(Record(success=True, duration=0.5))
window.record(Record(success=False, duration=1.2))
# Get the aggregated metric object
metric = window.get_metric()
print(f"Total calls: {metric.total_count}")
print(f"Failed calls: {metric.failure_count}")
print(f"Average duration: {metric.avg_duration}")
Available Metrics:
- total_count: Total number of calls recorded in the window.
- failure_count: Number of calls tracked as failures.
- total_duration: Sum of the durations of all calls.
- slow_count: Number of calls that exceeded the slow_threshold.
- avg_duration: The average response time (total_duration / total_count).
- failure_rate: The ratio of failed calls (failure_count / total_count).
- slow_rate: The ratio of slow calls (slow_count / total_count).
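The derived metrics follow directly from the counters. The sketch below uses a hypothetical `WindowMetric` dataclass that mirrors the field names listed above; it is not Fluxgate's metric class. Returning 0.0 for an empty window is an assumption made here to avoid division by zero, and the library may handle that case differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowMetric:
    """Hypothetical metric snapshot (not fluxgate's own class)."""
    total_count: int
    failure_count: int
    slow_count: int
    total_duration: float

    def _per_call(self, value: float) -> float:
        # Assumption: an empty window reports 0.0 instead of raising.
        return value / self.total_count if self.total_count else 0.0

    @property
    def avg_duration(self) -> float:
        return self._per_call(self.total_duration)

    @property
    def failure_rate(self) -> float:
        return self._per_call(self.failure_count)

    @property
    def slow_rate(self) -> float:
        return self._per_call(self.slow_count)


metric = WindowMetric(total_count=4, failure_count=1, slow_count=2, total_duration=6.0)
print(metric.avg_duration)  # 1.5
print(metric.failure_rate)  # 0.25
print(metric.slow_rate)     # 0.5
```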
Automatic Reset¶
Windows automatically clear their metrics when the circuit breaker transitions between states (e.g., OPEN → HALF_OPEN or HALF_OPEN → CLOSED). This ensures that each recovery attempt and each new CLOSED period begins with a clean slate.
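The effect of resetting on a state change can be modelled as follows. Everything in this sketch is hypothetical (`TinyWindow`, `TinyBreaker`, `reset()`, and `transition_to()` are illustrative names, not Fluxgate's API); it only shows why a probe call in HALF_OPEN is not judged against failures recorded earlier.

```python
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class TinyWindow:
    """Hypothetical window that only counts calls and failures."""

    def __init__(self) -> None:
        self.total_count = 0
        self.failure_count = 0

    def record(self, success: bool) -> None:
        self.total_count += 1
        self.failure_count += 0 if success else 1

    def reset(self) -> None:
        self.total_count = 0
        self.failure_count = 0


class TinyBreaker:
    """Hypothetical breaker that clears its window on every state change."""

    def __init__(self) -> None:
        self.state = State.CLOSED
        self.window = TinyWindow()

    def transition_to(self, new_state: State) -> None:
        if new_state is not self.state:
            self.state = new_state
            self.window.reset()  # the next period starts from a clean slate


breaker = TinyBreaker()
for _ in range(5):
    breaker.window.record(success=False)  # failures recorded while CLOSED

breaker.transition_to(State.OPEN)
breaker.transition_to(State.HALF_OPEN)
breaker.window.record(success=True)       # a single probe call

print(breaker.window.total_count, breaker.window.failure_count)  # 1 0
```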
Performance Considerations¶
| Operation | CountWindow | TimeWindow |
|---|---|---|
| Memory | O(N), where N is size | O(N), where N is size |
| record() | O(1) | O(1) |
| get_metric() | O(1) | O(1) |
Both implementations record a call and return metrics in constant time, so the overhead they add to each call is negligible.