Service Discovery: Finding Services in a Dynamic World
When services come and go, hardcoded IPs don't work. Service discovery patterns for dynamic infrastructure.
February 23, 2026 · 7 min · 1393 words · Rob Washington
In static infrastructure, services live at known addresses. Database at 10.0.1.5, cache at 10.0.1.6. Simple, predictable, fragile.
In dynamic infrastructure — containers, auto-scaling, cloud — services appear and disappear constantly. IP addresses change. Instances multiply and vanish. Hardcoded addresses become a liability.
Service discovery solves this: how do services find each other when everything is moving?
    # Hardcoded - works until it doesn't
    DATABASE_URL="postgres://10.0.1.5:5432/mydb"

    # What happens when:
    # - Database moves to a new server?
    # - You add read replicas?
    # - The IP changes after maintenance?
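The simplest step away from a hardcoded IP is to put a name in the config and resolve it when you connect. Here is a minimal sketch using only the Python standard library. The `resolve` helper is my own name, and `localhost` stands in for whatever service name your DNS actually serves.

```python
import socket
from typing import List


def resolve(hostname: str, port: int) -> List[str]:
    """Return the distinct IP addresses DNS currently reports for hostname."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # "localhost" is a placeholder for a real service name.
    print(resolve("localhost", 5432))
```

Because the lookup happens at connection time, the answer can change between calls, which is exactly what you want when the database moves.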
Registration without health checking is dangerous. Dead instances stay in the registry, and clients keep sending requests to them.
Active checks — registry pings services:
    # Registry periodically checks each instance
    for instance in registry.instances("payment-service"):
        if not http.get(f"http://{instance}/health").ok:
            registry.mark_unhealthy(instance)
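The loop above leans on a `registry` and `http` object that are left undefined. A self-contained sketch of the same idea might look like this, assuming each instance exposes `GET /health` and treating any 2xx response as healthy. The `http_probe` and `run_active_checks` names are mine, not from any particular registry.

```python
import http.client
import urllib.request
from typing import Callable, Dict, Iterable


def http_probe(address: str, timeout: float = 2.0) -> bool:
    """Return True if GET http://<address>/health answers with a 2xx status."""
    try:
        with urllib.request.urlopen(f"http://{address}/health", timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, http.client.HTTPException):
        # Refused connections, timeouts, HTTP error statuses and malformed
        # responses all count as "unhealthy".
        return False


def run_active_checks(
    instances: Iterable[str],
    probe: Callable[[str], bool] = http_probe,
) -> Dict[str, bool]:
    """Probe every instance once and return {address: healthy}."""
    return {address: probe(address) for address in instances}
```

Passing the probe in as a parameter keeps the checker testable without a network, and lets you swap in a TCP or gRPC check later.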
Passive checks — services send heartbeats:
    # Service sends heartbeat every 10 seconds
    while running:
        registry.heartbeat("payment-service", my_address)
        time.sleep(10)
    # Miss 3 heartbeats → marked unhealthy
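The registry side of that contract is worth seeing too. This is a rough in-memory sketch of "miss 3 heartbeats and you're out", not how any specific registry implements it. `HeartbeatRegistry` and its parameters are hypothetical, and the clock is injectable so the expiry logic can be exercised without waiting.

```python
import time
from typing import Callable, Dict, List, Tuple


class HeartbeatRegistry:
    """Tracks the last heartbeat per instance and ignores silent ones."""

    def __init__(self, interval: float = 10.0, max_missed: int = 3,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.deadline = interval * max_missed  # seconds of silence tolerated
        self.clock = clock
        self.last_seen: Dict[Tuple[str, str], float] = {}

    def heartbeat(self, service: str, address: str) -> None:
        """Record that this instance checked in just now."""
        self.last_seen[(service, address)] = self.clock()

    def healthy_instances(self, service: str) -> List[str]:
        """Return addresses whose last heartbeat is within the deadline."""
        now = self.clock()
        return sorted(
            addr for (svc, addr), seen in self.last_seen.items()
            if svc == service and now - seen < self.deadline
        )
```

With the defaults, an instance drops out of `healthy_instances` once 30 seconds pass without a heartbeat, and reappears as soon as it checks in again.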
Hybrid — both active and passive:
    # Consul supports both
    check {
      http = "http://localhost:8080/health"
      interval = "10s"
    }
    check {
      ttl = "30s"  # Must call /v1/agent/check/pass/:check_id
    }
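For the TTL check, the service itself has to call the agent before the TTL runs out. A minimal sketch of that call with the standard library, assuming the Consul agent listens on its default HTTP address `http://localhost:8500` with no ACL token required. The check ID `payment-ttl` in the usage note is a made-up example.

```python
import urllib.parse
import urllib.request


def build_ttl_pass_request(check_id: str,
                           agent_url: str = "http://localhost:8500") -> urllib.request.Request:
    """Build the PUT request that marks a TTL check as passing."""
    path = "/v1/agent/check/pass/" + urllib.parse.quote(check_id, safe="")
    return urllib.request.Request(agent_url + path, method="PUT")


def report_passing(check_id: str, agent_url: str = "http://localhost:8500") -> int:
    """Send the request to the local agent and return the HTTP status code."""
    request = build_ttl_pass_request(check_id, agent_url)
    with urllib.request.urlopen(request, timeout=2) as resp:
        return resp.status
```

A service would call something like `report_passing("payment-ttl")` on a timer comfortably shorter than the 30-second TTL, so one slow call doesn't flip the check to critical.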
Registry unavailable:

    def get_instances(service_name):
        try:
            return registry.query(service_name)
        except RegistryUnavailable:
            # Stale but better than nothing
            return cached_instances[service_name]
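The fallback only helps if something fills the cache first. Here is one way to wire that up, as a sketch: `CachingDiscoveryClient` is a hypothetical wrapper, `RegistryUnavailable` is defined locally for illustration, and the `query` callable stands in for whatever your registry client exposes.

```python
from typing import Callable, Dict, List


class RegistryUnavailable(Exception):
    """Raised by the (hypothetical) registry client when it cannot be reached."""


class CachingDiscoveryClient:
    """Wraps a registry query and remembers the last good answer per service."""

    def __init__(self, query: Callable[[str], List[str]]) -> None:
        self._query = query
        self._cache: Dict[str, List[str]] = {}

    def get_instances(self, service_name: str) -> List[str]:
        try:
            instances = self._query(service_name)
        except RegistryUnavailable:
            if service_name not in self._cache:
                raise  # nothing cached yet, so surface the outage
            return self._cache[service_name]  # stale, but better than nothing
        self._cache[service_name] = list(instances)  # refresh on every success
        return instances
```

In practice you would probably also cap how old a cached answer may get, since a list that is hours stale can point at instances that no longer exist.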
All instances unhealthy:
    instances = registry.get_healthy_instances("payment")
    if not instances:
        # Fall back to any instance? Return error? Use circuit breaker?
        raise ServiceUnavailable("No healthy payment instances")
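Two of those options, "return error" and "fall back to any instance", fit in a few lines. This sketch makes the choice an explicit flag. `pick_candidates` and `fail_open` are illustrative names, and the health map is assumed to come from whatever checks you already run.

```python
from typing import Dict, List


class ServiceUnavailable(Exception):
    """Raised when there is no instance worth trying."""


def pick_candidates(health: Dict[str, bool], fail_open: bool = False) -> List[str]:
    """Return healthy addresses, or every address if failing open."""
    healthy = sorted(addr for addr, ok in health.items() if ok)
    if healthy:
        return healthy
    if fail_open and health:
        # Every check failed; the checks themselves may be wrong, so try them all.
        return sorted(health)
    raise ServiceUnavailable("No healthy instances")
```

Failing open bets that the health checks are wrong more often than every instance is truly down. That is a reasonable bet for idempotent reads and a risky one for payments.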
Split brain:
Multiple registries with inconsistent views. Use consensus-based registries (Consul, etcd) or accept eventual consistency.
Modern approach: sidecar proxy handles all discovery and routing.
The service just calls localhost. The sidecar handles discovery, load balancing, retries, mTLS, observability.
Pros: Zero application changes, consistent behavior
Cons: Complexity, resource overhead, another thing to operate
Implementations: Istio, Linkerd, Consul Connect
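From the application's side, "just calls localhost" can look like the sketch below. It assumes a setup where the sidecar exposes each upstream on a local port; some meshes intercept traffic transparently instead, in which case the app keeps calling the service name. The port number and the `SIDECAR_UPSTREAMS` mapping are made up for illustration.

```python
# Hypothetical mapping: upstream service name -> local port the sidecar listens on.
SIDECAR_UPSTREAMS = {"payment-service": 9191}


def upstream_url(service: str, path: str = "/") -> str:
    """Build the URL the application calls; the sidecar forwards it from there."""
    port = SIDECAR_UPSTREAMS[service]
    return f"http://localhost:{port}{path}"


if __name__ == "__main__":
    print(upstream_url("payment-service", "/charge"))
```

The application never sees instance addresses, health state, or certificates. All of that lives in the proxy's configuration.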
Service discovery is infrastructure plumbing — invisible when working, catastrophic when broken. Start with DNS for simple cases. Add a registry when you need health checking and metadata. Consider a service mesh when you need advanced traffic management.
The goal: services find each other reliably, without hardcoded addresses, without human intervention. Everything else is an implementation detail.