Tracing & Observability
This recipe shows how to add structured observability to restgdf requests
using the resilience extra and Python’s standard logging / structlog.
Prerequisites
pip install "restgdf[resilience]"
Enable resilience + logging
import logging
from restgdf import Config, ResilienceConfig
logging.basicConfig(level=logging.DEBUG)
cfg = Config(
    resilience=ResilienceConfig(
        enabled=True,
        rate_per_service_root_per_second=10.0,
    ),
)
The restgdf.resilience module logs every retry attempt, 429 cooldown, and
error mapping through the restgdf.retry logger at DEBUG level. Set
the logger to DEBUG to see each attempt:
logging.getLogger("restgdf.retry").setLevel(logging.DEBUG)
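If the goal is machine-readable output rather than plain text lines, one option is to attach a JSON formatter to the restgdf.retry logger. The sketch below uses only the standard library. JsonFormatter and attach_json_handler are hypothetical helper names, and the debug message is a stand-in, since the exact wording of restgdf's own log lines is not assumed here.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "logger": record.name,
            "level": record.levelname,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def attach_json_handler(logger_name: str, stream) -> logging.Handler:
    """Attach a JSON-formatting stream handler to the named logger."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    logger.addHandler(handler)
    return handler


# An in-memory buffer stands in for sys.stderr or a log file.
buffer = io.StringIO()
attach_json_handler("restgdf.retry", buffer)

# Stand-in message; real retry records come from restgdf itself.
logging.getLogger("restgdf.retry").debug("attempt %d failed", 2)
print(buffer.getvalue().strip())
```

In an application you would pass sys.stderr or a file object instead of the buffer. A structlog processor chain could fill the same role.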
Tracing individual requests
The ResilientSession wrapper is transparent — you use it exactly like a
plain aiohttp.ClientSession, but retries and rate limits happen
automatically:
import asyncio

from aiohttp import ClientSession
from restgdf.resilience import ResilientSession

async def traced_query():
    async with ClientSession() as raw:
        session = ResilientSession(raw, cfg.resilience)
        url = "https://maps1.vcgov.org/arcgis/rest/services/Beaches/MapServer/6/query"
        params = {"where": "1=1", "f": "json", "resultRecordCount": "1"}
        async with session.get(url, params=params) as resp:
            data = await resp.json()
            print(data)

asyncio.run(traced_query())
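Because the wrapper is transparent, any retries and rate-limit waits are folded into the time the caller observes. A small timing context manager is one way to surface that total per request. This is a stdlib-only sketch: timed and the restgdf-app.timing logger name are hypothetical, and asyncio.sleep stands in for the real session.get call.

```python
import asyncio
import logging
import time
from contextlib import asynccontextmanager

log = logging.getLogger("restgdf-app.timing")  # hypothetical application logger


@asynccontextmanager
async def timed(label: str, results: list):
    """Record the wall-clock duration of the enclosed block, even on failure."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        results.append((label, elapsed))
        log.debug("%s took %.3fs", label, elapsed)


async def demo():
    timings = []
    async with timed("arcgis.query", timings):
        # Placeholder for:
        #   async with session.get(url, params=params) as resp: ...
        await asyncio.sleep(0.01)
    return timings


timings = asyncio.run(demo())
```

The recorded duration covers everything inside the block, so with a ResilientSession it would include back-off delays as well as network time.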
Inspecting error attributes
When the retry wrapper raises, the exception carries structured attributes populated automatically:
| Exception | Attributes |
|---|---|
| RateLimitError | url, retry_after |
| RestgdfTimeoutError | url, timeout_kind |
Example:
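from restgdf.errors import RateLimitError, RestgdfTimeoutError

async def checked_query(session, url, params=None):
    # session is a ResilientSession, as created in traced_query() above.
    try:
        async with session.get(url, params=params) as resp:
            return await resp.json()
    except RateLimitError as exc:
        print(f"429 at {exc.url}, retry after {exc.retry_after}s")
    except RestgdfTimeoutError as exc:
        print(f"Timeout ({exc.timeout_kind}) for {exc.url}")
    return None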
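For structured logging it can be convenient to gather these attributes into a dictionary rather than format them by hand. The sketch below is an illustration, not restgdf API: error_fields is a hypothetical helper, FakeRateLimitError is a stand-in exception so the example runs without restgdf installed, and the attribute names are the ones used elsewhere in this recipe.

```python
import logging

log = logging.getLogger("restgdf-app")  # hypothetical application logger

# Attribute names used in this recipe; not every exception carries all of them.
ERROR_ATTRS = ("url", "status_code", "timeout_kind", "retry_after")


def error_fields(exc: BaseException) -> dict:
    """Collect whichever structured attributes the exception actually has."""
    fields = {"error_type": type(exc).__name__}
    for name in ERROR_ATTRS:
        value = getattr(exc, name, None)
        if value is not None:
            fields[name] = value
    return fields


class FakeRateLimitError(Exception):
    """Stand-in for demonstration only; not part of restgdf."""

    def __init__(self, url, retry_after):
        super().__init__(f"429 from {url}")
        self.url = url
        self.retry_after = retry_after


try:
    raise FakeRateLimitError("https://example.com/Demo/MapServer/0/query", 5)
except Exception as exc:
    fields = error_fields(exc)
    # The extra dict becomes an attribute on the LogRecord for formatters to use.
    log.warning("request failed", extra={"restgdf": fields})
```

Using getattr with a default keeps the helper safe for exceptions that carry only some of the attributes.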
Integrating with OpenTelemetry
For full distributed tracing with automatic span creation, install the
restgdf[telemetry] extra; see the observability recipe (/recipes/observability).
The example below shows how to add manual spans around resilience-wrapped
calls using the structured error attributes:
from opentelemetry import trace

tracer = trace.get_tracer("restgdf-app")

async def otel_query(session, url, params):
    with tracer.start_as_current_span("arcgis.query", attributes={"url": url}) as span:
        try:
            async with session.get(url, params=params) as resp:
                span.set_attribute("http.status_code", resp.status)
                return await resp.json()
        except Exception as exc:
            span.set_status(trace.StatusCode.ERROR, str(exc))
            for attr in ("url", "status_code", "timeout_kind", "retry_after"):
                val = getattr(exc, attr, None)
                if val is not None:
                    span.set_attribute(f"restgdf.{attr}", val)
            raise
Monitoring 429 cooldowns
The CooldownRegistry and LimiterRegistry are per-ResilientSession
(not process-global), so each session has independent rate-limit state.
When a 429 response is received, the retry wrapper:
1. Parses the Retry-After header (integer seconds or RFC 7231 HTTP-date).
2. Clamps the value to ResilienceConfig.respect_retry_after_max_s (default 60 s).
3. Sets a cooldown deadline for the service root; subsequent requests to the same FeatureServer/MapServer wait until the deadline passes.
4. Stamina’s exponential back-off then retries the request.
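The parse-and-clamp steps can be sketched with the standard library alone. This is an illustration of the technique under stated assumptions, not restgdf's actual implementation: parse_retry_after and clamp_retry_after are hypothetical names, and MAX_RETRY_AFTER_S simply mirrors the documented 60 s default.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional

MAX_RETRY_AFTER_S = 60.0  # mirrors the documented default of respect_retry_after_max_s


def parse_retry_after(value: str, now: Optional[datetime] = None) -> Optional[float]:
    """Return the delay in seconds for a Retry-After value, or None if unparseable."""
    value = value.strip()
    # Form 1: a non-negative integer number of seconds.
    if value.isascii() and value.isdigit():
        return float(value)
    # Form 2: an HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def clamp_retry_after(seconds: float, max_s: float = MAX_RETRY_AFTER_S) -> float:
    """Limit the delay to the range [0, max_s]."""
    return min(max(seconds, 0.0), max_s)


fixed_now = datetime(2015, 10, 21, 7, 27, 30, tzinfo=timezone.utc)
delay_from_date = parse_retry_after("Wed, 21 Oct 2015 07:28:00 GMT", now=fixed_now)
delay_from_int = clamp_retry_after(parse_retry_after("120"))
```

Passing now explicitly keeps the date branch deterministic in tests. The clamp means a server asking for a 120 s pause results in a wait of at most 60 s under the default setting.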
The cooldown is separate from the token-bucket limiter — a 429 does
not drain AsyncLimiter tokens.
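To make the per-service-root behaviour concrete, here is a toy cooldown registry. It is a sketch only and does not reproduce restgdf's CooldownRegistry: ToyCooldownRegistry, service_root, and the injectable clock are all hypothetical, and the example.com URLs are placeholders. Note that it stores only deadlines and holds no token counts, which is why a cooldown can coexist with a separate token-bucket limiter.

```python
import time
from typing import Callable, Dict


def service_root(url: str) -> str:
    """Truncate a URL just after its FeatureServer/MapServer segment."""
    for marker in ("/FeatureServer", "/MapServer"):
        idx = url.find(marker)
        if idx != -1:
            return url[: idx + len(marker)]
    return url


class ToyCooldownRegistry:
    """Per-service-root cooldown deadlines (illustrative only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._deadlines: Dict[str, float] = {}

    def set_cooldown(self, url: str, delay_s: float) -> None:
        root = service_root(url)
        deadline = self._clock() + delay_s
        # Keep the later deadline if one is already pending.
        self._deadlines[root] = max(deadline, self._deadlines.get(root, deadline))

    def remaining(self, url: str) -> float:
        """Seconds a request to this URL's service root should still wait."""
        root = service_root(url)
        if root not in self._deadlines:
            return 0.0
        return max(0.0, self._deadlines[root] - self._clock())


# A fake clock makes the behaviour deterministic.
fake_time = [0.0]
registry = ToyCooldownRegistry(clock=lambda: fake_time[0])

base = "https://example.com/arcgis/rest/services"
registry.set_cooldown(f"{base}/Beaches/MapServer/6/query", 30.0)

same_root_wait = registry.remaining(f"{base}/Beaches/MapServer/7/query")
other_root_wait = registry.remaining(f"{base}/Parks/FeatureServer/0/query")

fake_time[0] = 31.0
after_deadline_wait = registry.remaining(f"{base}/Beaches/MapServer/6/query")
```

Layers 6 and 7 share one MapServer root, so both wait. The unrelated FeatureServer is unaffected.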