Core HR Performance Budget

Purpose: Establish quantitative latency, throughput, and resource guardrails for critical user flows and platform components. Budgets guide design, tuning, regression detection, and release readiness.

1. Scope & Principles

  • User-centric p95 and p99 latency targets for interactive actions.
  • Separation of synchronous UX flows vs asynchronous background work.
  • Budgets are ceilings, not averages; sustained exceedance triggers investigation.
  • Prefer additive caching and incremental optimization before deep refactors.
  • Observability (traces + metrics) required to measure every budget.

2. Flow Latency Budgets

| Flow / Action | Target p95 | Target p99 | Ceiling (hard stop) | Notes |
|---|---|---|---|---|
| Password Login | 400ms | 650ms | 800ms | Includes DB user fetch + token issue |
| SSO Login (OIDC) | 800ms | 1200ms | 1500ms | External IdP median ~300ms; optimize local steps |
| Signup Provisioning | 1200ms | 1800ms | 2000ms | Tenant + first user + baseline events |
| Leave Request Submit | 350ms | 500ms | 650ms | Balance calc + event emission + cache update |
| Leave Cancellation | 250ms | 400ms | 500ms | Simple status + balance reversal |
| Goal Status Change | 200ms | 300ms | 400ms | Includes analytics event (US-332) |
| Document Upload ACK (pre-scan) | 500ms | 800ms | 1000ms | Returns pending status; excludes scan |
| Document Scan Completion (async) | 30s | 45s | 60s | p95 turnaround from enqueue to status Clean |
| Subscription Upgrade (US-205) | 2000ms | 3000ms | 3500ms | Gateway API variability; proration calc local <150ms |
| Profile Read (standard) | 250ms | 400ms | 500ms | Includes permission filtering & cache assist |
| Event Emission Overhead | <10ms | <20ms | 30ms | Serialization + dispatch; measured inside span |
| Analytics Event Write | <15ms | <30ms | 50ms | Non-blocking path preferred |
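
As a rough illustration of how these budgets could be checked in code, the sketch below encodes three rows of the table and classifies a measured p95/p99 pair. The dictionary keys, the `FlowBudget` type, and the `evaluate` function are hypothetical names. Comparing the measured p99 against the hard-stop ceiling is an assumption (the table does not say which statistic the ceiling applies to).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowBudget:
    p95_ms: float
    p99_ms: float
    ceiling_ms: float


# Values copied from the flow latency table; the keys are illustrative names.
BUDGETS = {
    "password_login": FlowBudget(400, 650, 800),
    "leave_request_submit": FlowBudget(350, 500, 650),
    "goal_status_change": FlowBudget(200, 300, 400),
}


def evaluate(flow: str, p95_ms: float, p99_ms: float) -> str:
    """Classify measured percentiles as 'ok', 'over_target', or 'over_ceiling'."""
    budget = BUDGETS[flow]
    # Assumption: the hard-stop ceiling is compared against the measured p99.
    if p99_ms > budget.ceiling_ms:
        return "over_ceiling"
    if p95_ms > budget.p95_ms or p99_ms > budget.p99_ms:
        return "over_target"
    return "ok"
```

For example, `evaluate("password_login", 420, 600)` reports `over_target` because the measured p95 exceeds the 400ms target while the p99 stays under both the 650ms target and the 800ms ceiling.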

3. Component Budgets

| Component | Budget Dimension | Target | Hard Ceiling | Measurement Method |
|---|---|---|---|---|
| API Pod CPU per request | Avg CPU time | <50ms | 80ms | span CPU metrics / eBPF samples |
| API Pod RSS Memory | Steady-state | <300MB | 400MB | container metrics |
| DB Simple Query (indexed) | p95 latency | <20ms | 35ms | query tracing sample |
| DB Multi-Join Query | p95 latency | <60ms | 90ms | trace spans annotated |
| Leave Accrual Transaction | p95 latency | <120ms | 180ms | transaction root span |
| Cache GET Hit | p95 latency | <5ms | 10ms | client metrics |
| Cache Hit Ratio (leave balances) | Percentage | >95% | <90% triggers alert | aggregated counters |
| Queue Publish (domain event) | p95 latency | <25ms | 40ms | publish span |
| Queue Lag (critical flows) | p95 delay | <2s | 5s | consumer metrics |
| Document Scan Worker CPU | Avg/core | <65% | 80% | node exporter metrics |
| Billing Webhook Processing | p95 latency | <500ms | 800ms | webhook handler span |
| Trace Sampling Rate (critical flows) | Percentage retained | 80% | 90% max (avoid overhead) | collector config audit |
| Instrumentation Overhead | CPU impact | <2% | 5% | diff baseline vs instrumented load |
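
Two of the rows above are derived values rather than direct measurements: the cache hit ratio (from aggregated hit/miss counters) and the instrumentation overhead (from a baseline vs instrumented comparison). A minimal sketch of that arithmetic follows. The function names and the three status labels are hypothetical, and the unit of the CPU inputs is assumed to be the same for both runs.

```python
def cache_hit_ratio(hits: int, misses: int) -> float:
    """Hit ratio as a percentage, computed from aggregated counters."""
    total = hits + misses
    if total == 0:
        raise ValueError("no cache lookups recorded")
    return 100.0 * hits / total


def hit_ratio_status(ratio_pct: float) -> str:
    """Compare a hit ratio against the >95% target and the <90% alert line."""
    if ratio_pct < 90.0:
        return "alert"
    if ratio_pct > 95.0:
        return "ok"
    return "below_target"


def instrumentation_overhead_pct(baseline_cpu: float, instrumented_cpu: float) -> float:
    """Relative CPU increase of the instrumented run over the baseline run."""
    if baseline_cpu <= 0:
        raise ValueError("baseline CPU must be positive")
    return 100.0 * (instrumented_cpu - baseline_cpu) / baseline_cpu
```

With 960 hits and 40 misses the ratio is 96%, which meets the target. A baseline of 40 CPU units against 41 instrumented gives 2.5% overhead, which is above the <2% target but under the 5% hard ceiling.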

4. Throughput & Concurrency Assumptions

| Metric | Baseline | Scaled Target (Year 1) | Notes |
|---|---|---|---|
| Peak Concurrent Auth Requests | 50 RPS | 500 RPS | Password + SSO combined |
| Peak Leave Ops (submit/cancel) | 20 RPS | 200 RPS | Seasonal spikes around holidays |
| Peak Document Uploads | 5 RPS | 50 RPS | Burst handling with queue backpressure |
| Peak Goal Status Changes | 10 RPS | 100 RPS | Review cycle bursts |
| Event Throughput (domain+analytics) | 150 RPS | 1500 RPS | Partition & batching strategy evolves |

5. Capacity Guardrails

  • DB connection pool wait p95 < 10ms; alert on sustained >25ms.
  • Queue backlog (age p95) must remain under 2× the flow's p95 latency budget for synchronous dependencies.
  • Cache memory eviction rate <5% per hour for critical keys (leave balances, tenant config).
  • SSO token introspection failures <0.5% of attempts monthly.
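
The first two guardrails reduce to simple arithmetic, sketched below under stated assumptions. The function names and status labels are hypothetical, and the "sustained" qualifier on the pool-wait alert is not modeled here: the check looks at a single p95 reading.

```python
def max_backlog_age_ms(flow_p95_budget_ms: float) -> float:
    """Upper bound on queue backlog age (p95): twice the flow's p95 budget."""
    return 2 * flow_p95_budget_ms


def pool_wait_status(p95_wait_ms: float) -> str:
    """Classify DB connection pool wait against the <10ms target and >25ms alert line."""
    if p95_wait_ms > 25:
        return "alert"
    if p95_wait_ms < 10:
        return "ok"
    return "above_target"
```

For Leave Request Submit, whose p95 budget is 350ms, `max_backlog_age_ms(350)` gives 700, so the backlog age p95 on its synchronous dependencies would need to stay under 700ms.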

6. Regression Detection

| Signal | Threshold | Action |
|---|---|---|
| Any p95 flow > budget for 3 consecutive 5-min windows | Degradation alert | Create performance incident ticket |
| DB query p99 > 2x target | Investigate query plan | Add/adjust index or cache layer |
| Cache hit ratio < target for 15 min | Hot path analysis | Evaluate key expiry & prewarm jobs |
| Event publish latency p95 > 40ms | Broker health check | Scale partitions / investigate network |
| Trace retention < target | Collector config audit | Adjust sampling rules |
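
The "3 consecutive 5-min windows" rule can be sketched as a run-length check over per-window p95 values. This assumes the monitoring system already produces one p95 value per 5-minute window, in time order. The function name `sustained_breach` is hypothetical.

```python
from typing import Iterable


def sustained_breach(window_p95s_ms: Iterable[float], budget_ms: float,
                     consecutive: int = 3) -> bool:
    """Return True if `consecutive` windows in a row exceed the budget."""
    run = 0
    for p95 in window_p95s_ms:
        # A window at or under budget resets the run of breaches.
        run = run + 1 if p95 > budget_ms else 0
        if run >= consecutive:
            return True
    return False
```

Against the 350ms Leave Request Submit budget, the series 360, 340, 370, 380, 390 triggers the alert (the last three windows all breach), while 360, 370, 340, 360, 370 does not, because the run is interrupted after two windows.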

7. Testing Strategy

  • Load Generation: k6 scenarios per critical flow (login, leave submit, doc upload ACK, upgrade).
  • Profiling: CPU & memory under sustained 10-minute peak; capture flamegraphs.
  • Synthetic Daily Run: Executes lightweight smoke + latency assertions (GitHub Action or cron job).
  • Pre-Release Load Test: 2x expected peak for 30 minutes; must remain within p99 ceilings.
  • Performance Test Artifacts: Stored in docs/performance/results/<YYYY-MM-DD>/ with summary JSON.
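
A minimal sketch of how a load-test run might be reduced to a summary JSON under a dated results directory. The nearest-rank percentile method, the `summary.json` file name, and the field names are assumptions. The source only specifies the directory layout and that a summary JSON exists.

```python
import json
import math
from datetime import date
from pathlib import Path


def percentile(samples_ms: list[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list of latency samples."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def write_summary(results_root: Path, run_date: date, flow: str,
                  samples_ms: list[float]) -> Path:
    """Write <results_root>/<YYYY-MM-DD>/summary.json and return its path."""
    out_dir = results_root / run_date.isoformat()
    out_dir.mkdir(parents=True, exist_ok=True)
    summary = {
        "flow": flow,
        "samples": len(samples_ms),
        "p95_ms": percentile(samples_ms, 95),
        "p99_ms": percentile(samples_ms, 99),
    }
    path = out_dir / "summary.json"  # hypothetical file name
    path.write_text(json.dumps(summary, indent=2))
    return path
```

Calling `write_summary(Path("docs/performance/results"), date.today(), "password_login", samples)` would place the file in the dated directory described above. The resulting p95/p99 values can then be compared against the flow budgets.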

8. Optimization Playbook (Ordered)

  1. Confirm measurement accuracy (trace spans & metrics present).
  2. Identify top contributors (flamegraph / span breakdown).
  3. Apply low-risk improvements (indexes, cache TTL tuning, batch writes).
  4. Consider architectural changes (async boundaries, denormalization) if the flow remains more than 25% over budget after step 3.
  5. Re-run load & update budget review notes.
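
The 25% threshold in step 4 is a relative overshoot, which can be computed as below. This is a sketch only: the function names and returned labels are hypothetical, and whether the overshoot "persists" after low-risk improvements has to be judged across re-runs, which the code does not model.

```python
def over_budget_pct(measured_ms: float, budget_ms: float) -> float:
    """Relative overshoot of a measurement over its budget, in percent."""
    return 100.0 * (measured_ms - budget_ms) / budget_ms


def suggested_step(measured_ms: float, budget_ms: float) -> str:
    """Map an overshoot to the playbook step it points at."""
    over = over_budget_pct(measured_ms, budget_ms)
    if over <= 0:
        return "within budget"
    if over > 25:
        return "consider architectural changes"
    return "apply low-risk improvements"
```

A 450ms p95 against a 350ms budget is roughly 28.6% over, which crosses the 25% line. A 400ms p95 against the same budget is roughly 14.3% over and stays in the low-risk tier.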

9. Future Expansion

  • Introduce percentiles for cold-start (deployment rollover) vs warmed state.
  • Separate regional latency budgets when multi-region deployed.
  • Automated SLO derivation & error budget tracking (phase 2).
  • Real-user monitoring (RUM) for frontend correlation.

10. Governance & Review Cadence

  • Monthly review: Adjust budgets based on growth metrics & infra changes.
  • Change control: Any upward adjustment requires documented rationale & mitigation plan.
  • Ownership: Platform engineering maintains doc; each domain owner signs off revisions.

11. Instrumentation Requirements Checklist

  • Trace spans exist for each budgeted flow start/end.
  • Span attributes include tenantId (non-PII surrogate), user role, component.
  • Metrics exported: latency histograms, cache hits, queue lag, DB timings.
  • Alert rules defined for all regression thresholds.
  • Load test scripts stored & versioned.

12. Open Questions

  1. Introduce per-tenant isolation latency budgets? (e.g., queries under tenant predicate.)
  2. Track warm vs cold path separately for document scan worker startup?
  3. Need SLA differentiation between paid tiers (e.g., faster document scan)?

Version: 1.0 (2025-11-22)