restgdf migration guide
2.0.0 migration notes
restgdf 2.0.0 reshapes install surface, error taxonomy, configuration, authentication, observability, and streaming on top of the typed pydantic models that now ship in the release. The preserved 1.x → 2.0 guide follows below.
Summary
- Install surface: pip install restgdf is now a light-core install (typed metadata, raw rows, directory crawl, token/session helpers). GeoPandas/pandas/pyogrio move behind pip install "restgdf[geo]". New optional extras restgdf[resilience] (stamina + aiolimiter) and restgdf[telemetry] (opentelemetry-api + aiohttp client instrumentation) opt callers into retry/rate-limiting and OpenTelemetry respectively. All extras compose with each other and with restgdf[geo].
- Streaming: FeatureLayer grows three canonical streaming shapes on top of iter_pages (stream_features / stream_feature_batches / stream_rows) with shared on_truncation / order / max_concurrent_pages knobs and the R-61 feature_layer.stream parent span. stream_gdf_chunks remains as the legacy GeoDataFrame-per-page shape (requires restgdf[geo], backed by chunk_generator, completion-order only, does not take the shared knobs). row_dict_generator is deprecated in favour of stream_rows. See docs/recipes/streaming.md.
- Errors: Single taxonomy rooted at restgdf.errors.RestgdfError (public). Every class additively multi-inherits the matching builtin (ValueError, TimeoutError, PermissionError, IndexError, ModuleNotFoundError) so existing except clauses keep catching.
- Configuration: restgdf.Config (eight frozen sub-configs) now supersedes the flat Settings. Six RESTGDF_* env vars are aliased to structured RESTGDF_<CATEGORY>_<FIELD> names; the old names still work and emit DeprecationWarning on read.
- Authentication: Default wire transport is now header-based (X-Esri-Authorization); token refresh is single-flight, bounded-retry, reactive on 498/499, and emits structured restgdf.auth log events. expires_at is a tz-aware UTC datetime.
- Observability: Named loggers under restgdf.<suffix> (transport / retry / limiter / concurrency / auth / pagination / normalization / schema_drift) each attached with a NullHandler. Telemetry emits exactly one INTERNAL feature_layer.stream parent span per iter_pages call.
- Public API additions: adapters subpackage, pandas-first FeatureLayer.get_df(), NormalizedGeometry / NormalizedFeature iterator, AdvancedQueryCapabilities typed companion, PaginationPlan / build_pagination_plan, pure-helper normalize_spatial_reference and normalize_date_fields, and ResilientSession / RestgdfInstrumentor.
Quick upgrading checklist
TL;DR: follow these eight steps to migrate an existing 1.x codebase. Detailed explanations of each breaking change appear in the sections below.
1. Pin the geo extra if you use GeoDataFrames: change restgdf to "restgdf[geo]" in requirements/lockfiles.
2. Widen except clauses: except RuntimeError: → except PaginationError:; replace the FIELDDOESNOTEXIST sentinel with except FieldDoesNotExistError:.
3. Swap get_settings() for get_config() (env vars keep old names during transition).
4. Migrate row_dict_generator to stream_rows and pick an on_truncation policy.
5. Audit auth config: header is now the default token transport; set transport="body" if needed.
6. Split refresh_threshold_seconds into refresh_leeway_seconds + clock_skew_seconds.
7. Attach handlers to the named loggers you care about (restgdf.auth, .pagination, etc.).
8. (Optional) Install extras: restgdf[resilience], restgdf[telemetry].
Breaking changes
Install
- pip install restgdf no longer guarantees that geopandas, pandas, or pyogrio are importable.
- GeoDataFrame and pandas-backed helpers (FeatureLayer.get_gdf(), sample_gdf(), head_gdf(), fieldtypes, get_fields_frame(), multi-field get_unique_values(), get_value_counts(), get_nested_count(), and restgdf.utils.getgdf helpers) now require pip install "restgdf[geo]" and raise OptionalDependencyError (subclass of ConfigurationError / ModuleNotFoundError) with an install hint when the optional stack is missing.
- Light-core workflows keep working without the geo stack: typed response models, FeatureLayer.from_url, .metadata, .count, .get_oids(), raw row iteration, directory crawling, and token helpers.
Errors
- getgdf / _get_sub_features raise restgdf.errors.PaginationError instead of RuntimeError when a page reports exceededTransferLimit=true. The exception now carries batch_index and page_size attributes. PaginationError multi-inherits IndexError (2.x behavior preserved) but no longer multi-inherits RuntimeError; migrate except RuntimeError: call sites to except PaginationError: / except ArcGISServiceError: / except RestgdfError:.
- The legacy FIELDDOESNOTEXIST sentinel (and its re-export through restgdf.utils.getinfo) is gone. Call sites must now catch FieldDoesNotExistError (new SchemaValidationError subclass).
- Metadata/query/crawl helpers that decode a successful JSON body matching the ArcGIS {"error": {...}} envelope now raise RestgdfResponseError immediately with raw attached, instead of silently treating it as schema drift.
Auth
- ArcGISTokenSession now defaults to sending the token via the X-Esri-Authorization header. If your server requires the legacy body/query transport, set AuthConfig(transport="body") or TokenSessionConfig(transport="body").
- refresh_leeway_seconds default raised from 60 → 120: proactive token refresh now fires two minutes before expiry instead of one.
- refresh_threshold_seconds on TokenSessionConfig is retained as a deprecation alias; reads/writes emit DeprecationWarning. Migrate to the explicit refresh_leeway_seconds + clock_skew_seconds split (defaults 120 + 30, total 150 seconds). When the legacy alias is passed to the constructor, the total is split as clock_skew_seconds = min(30, total) and refresh_leeway_seconds = total - clock_skew_seconds so existing tunings survive.
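The legacy-alias split quoted above is plain arithmetic. A minimal sketch, assuming whole-second inputs; the helper name split_refresh_threshold is hypothetical and is not part of restgdf:

```python
def split_refresh_threshold(total_seconds: int) -> tuple[int, int]:
    """Split a legacy refresh_threshold_seconds total.

    Returns (refresh_leeway_seconds, clock_skew_seconds) using the rule
    stated in the notes above. The function name is illustrative only.
    """
    clock_skew_seconds = min(30, total_seconds)
    refresh_leeway_seconds = total_seconds - clock_skew_seconds
    return refresh_leeway_seconds, clock_skew_seconds


print(split_refresh_threshold(150))  # (120, 30): matches the new defaults
print(split_refresh_threshold(60))   # (30, 30): an old 60 s tuning
print(split_refresh_threshold(20))   # (0, 20): totals under 30 s go to skew
```

Under this rule, totals below 30 seconds are assigned entirely to clock skew, so the leeway becomes zero.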
Streaming
- FeatureLayer.row_dict_generator emits DeprecationWarning. It delegates to stream_rows; migrate at your convenience.
- All stream_* methods default to on_truncation="raise". Callers that previously silently ignored exceededTransferLimit=true must either opt into on_truncation="ignore" (warn-and-continue) or on_truncation="split" (OID-bisect and recurse, up to 32 levels).
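To make the "split" policy concrete, here is a rough sketch of OID bisection under stated assumptions. It is not restgdf's implementation: fetch_split, fake_fetch, and the (rows, truncated) return shape are all hypothetical, and a builtin RuntimeError stands in for the library's own error type.

```python
from typing import Callable

# Hypothetical page fetcher: returns (rows, exceeded_transfer_limit) for an
# inclusive OID range. A real caller would issue an ArcGIS query here.
Fetch = Callable[[int, int], tuple[list[int], bool]]


def fetch_split(fetch: Fetch, lo: int, hi: int,
                depth: int = 0, max_depth: int = 32) -> list[int]:
    """Fetch [lo, hi]; on truncation, bisect the OID range and recurse."""
    rows, truncated = fetch(lo, hi)
    if not truncated:
        return rows
    if depth >= max_depth or lo >= hi:
        # Cannot split further: surface the truncation instead of losing rows.
        raise RuntimeError(f"range {lo}-{hi} still truncated at depth {depth}")
    mid = (lo + hi) // 2
    return (fetch_split(fetch, lo, mid, depth + 1, max_depth)
            + fetch_split(fetch, mid + 1, hi, depth + 1, max_depth))


def fake_fetch(lo: int, hi: int, limit: int = 3) -> tuple[list[int], bool]:
    """Stand-in server that returns at most `limit` rows per request."""
    oids = list(range(lo, hi + 1))
    return oids[:limit], len(oids) > limit


print(fetch_split(fake_fetch, 1, 10))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

The depth cap mirrors the 32-level bound mentioned above; a range that cannot be narrowed any further raises rather than silently dropping rows.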
New public APIs
Errors (restgdf.errors)
RestgdfError
├── ConfigurationError(RestgdfError, ValueError)
│ └── OptionalDependencyError(ConfigurationError, ModuleNotFoundError)
├── RestgdfResponseError(RestgdfError, ValueError)
│ ├── SchemaValidationError
│ │ └── FieldDoesNotExistError
│ ├── ArcGISServiceError
│ │ └── PaginationError(ArcGISServiceError, IndexError) # .batch_index, .page_size
│ └── AuthenticationError(RestgdfResponseError, PermissionError)
│ ├── InvalidCredentialsError # HTTP 401
│ ├── TokenExpiredError # HTTP 498 after refresh
│ ├── TokenRequiredError
│ ├── TokenRefreshFailedError # /generateToken retries exhausted
│ └── AuthNotAttachedError # HTTP 499
├── TransportError
│ ├── RestgdfTimeoutError(TransportError, TimeoutError)
│ └── RateLimitError(TransportError) # .retry_after
└── OutputConversionError
- All classes are re-exported from the top-level restgdf package.
- RestgdfResponseError, TransportError, RestgdfTimeoutError, and RateLimitError now accept optional url, status_code, request_id, and timeout_kind kwargs (all default to None).
- RateLimitError.retry_after is populated from the server’s Retry-After header (integer seconds or RFC 7231 HTTP-date) by the resilience wrapper.
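The effect of the additive multi-inheritance can be checked with stand-in classes. The sketch below redefines a small slice of the tree locally so it runs without restgdf installed; real code would import these names from restgdf, and the constructor signature shown is an assumption.

```python
# Stand-in classes mirroring part of the tree above (illustrative only).
class RestgdfError(Exception):
    pass


class RestgdfResponseError(RestgdfError, ValueError):
    pass


class ArcGISServiceError(RestgdfResponseError):
    pass


class PaginationError(ArcGISServiceError, IndexError):
    def __init__(self, message, *, batch_index=None, page_size=None):
        super().__init__(message)
        self.batch_index = batch_index
        self.page_size = page_size


err = PaginationError("transfer limit exceeded", batch_index=3, page_size=1000)

# Legacy handlers written against builtins still match...
print(isinstance(err, IndexError))    # True
print(isinstance(err, ValueError))    # True
# ...but RuntimeError handlers no longer do.
print(isinstance(err, RuntimeError))  # False

try:
    raise err
except RuntimeError:
    caught_by = "RuntimeError"
except RestgdfError as exc:
    caught_by = f"RestgdfError (batch {exc.batch_index})"
print(caught_by)  # RestgdfError (batch 3)
```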
Streaming (FeatureLayer)
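| Method | Yields | Install |
|---|---|---|
| stream_features | one raw ArcGIS feature dict | base |
| stream_feature_batches | | base |
| stream_rows | row-shaped dict (attrs + geometry) | base |
| stream_gdf_chunks | GeoDataFrame per page | restgdf[geo] |
| iter_pages | raw page envelope | base |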
stream_features, stream_feature_batches, stream_rows, and
iter_pages share these knobs:
- on_truncation: "raise" | "ignore" | "split" (default "raise").
- order: "request" | "completion" (default "request"; "completion" may interleave pages, so do not use it for append-ordered downstream writers).
- max_concurrent_pages: int | None (default None, bounded only by ConcurrencyConfig.max_concurrent_requests).
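The difference between the two order values can be illustrated with plain asyncio. This is a sketch of the general technique, not restgdf's internals; fetch_page and the delays are made up to simulate pages that finish out of order.

```python
import asyncio


async def fetch_page(index: int, delay: float) -> int:
    """Simulated page request that takes `delay` seconds."""
    await asyncio.sleep(delay)
    return index


async def in_request_order(delays: list[float]) -> list[int]:
    # Start every request, then await the tasks in the order they were issued.
    tasks = [asyncio.create_task(fetch_page(i, d)) for i, d in enumerate(delays)]
    return [await task for task in tasks]


async def in_completion_order(delays: list[float]) -> list[int]:
    # Start every request, then take results as each one finishes.
    tasks = [asyncio.create_task(fetch_page(i, d)) for i, d in enumerate(delays)]
    return [await fut for fut in asyncio.as_completed(tasks)]


delays = [0.06, 0.02, 0.04]  # page 0 is the slowest
print(asyncio.run(in_request_order(delays)))     # [0, 1, 2]
print(asyncio.run(in_completion_order(delays)))  # [1, 2, 0]
```

Completion order can lower time-to-first-page, but a writer that appends pages as they arrive would store them out of sequence.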
stream_gdf_chunks is backed by the legacy chunk_generator pipeline
(pyogrio + ESRIJSON/GeoJSON parsing) rather than iter_pages. It
yields chunks in completion order and does not accept
on_truncation, order, or max_concurrent_pages, and it does not
emit the R-61 feature_layer.stream parent span. For geo output with
the full knob set, compose stream_rows or stream_features with
your own geometry assembly, or use get_gdf / get_gdf_list.
Each GeoDataFrame yielded by stream_gdf_chunks carries the layer’s
spatial reference in gdf.attrs["spatial_reference"] (R-65).
See docs/recipes/streaming.md for copy-pasteable examples.
Adapters (restgdf.adapters, lazy-loaded via PEP 562)
- restgdf.adapters.dict: feature_to_row, features_to_rows, re-exports as_dict / as_json_dict. Pure-Python, base-install safe.
- restgdf.adapters.stream: async iterators iter_feature_batches, iter_rows, iter_gdf_chunks wrapping the core generator helpers. iter_gdf_chunks requires the geo extra at call time.
- restgdf.adapters.pandas: rows_to_dataframe (sync) + arows_to_dataframe (async). Calls require_pandas(...) before materialization; no pandas import at module load.
- restgdf.adapters.geopandas: rows_to_geodataframe + arows_to_geodataframe. Calls require_geo_stack(...) before materialization; no geopandas/pyogrio import at module load.
Tabular output (FeatureLayer.get_df)
Pandas-first sibling to get_gdf(). Returns a pandas.DataFrame built
from the same row stream; raises OptionalDependencyError
(extra="pandas") if pandas is missing. Geopandas is not required.
Response normalization (restgdf._models.responses)
- NormalizedGeometry / NormalizedFeature + iter_normalized_features(response, *, oid_field=None, sr=None).
- Wire-level FeaturesResponse.features stays list[dict] for perf; normalization is opt-in via the iterator.
- NormalizedGeometry.type is inferred heuristically from the geometry dict shape (point / multipoint / polyline / polygon / envelope / None).
- object_id is hoisted from attributes[oid_field] via int(value) and tolerates unparsable values by leaving object_id=None.
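One plausible way to express the shape heuristic and the tolerant OID hoist is sketched below. The key names (x/y, points, paths, rings, xmin/ymin) follow the ArcGIS JSON geometry conventions; the function names and the exact check order are assumptions, not restgdf's code.

```python
from typing import Any, Optional


def infer_geometry_type(geometry: Optional[dict[str, Any]]) -> Optional[str]:
    """Guess the geometry type from which ArcGIS JSON keys are present."""
    if not geometry:
        return None
    if "x" in geometry and "y" in geometry:
        return "point"
    if "points" in geometry:
        return "multipoint"
    if "paths" in geometry:
        return "polyline"
    if "rings" in geometry:
        return "polygon"
    if "xmin" in geometry and "ymin" in geometry:
        return "envelope"
    return None


def hoist_object_id(attributes: dict[str, Any],
                    oid_field: Optional[str]) -> Optional[int]:
    """Return attributes[oid_field] as an int, or None if that is impossible."""
    if oid_field is None:
        return None
    try:
        return int(attributes[oid_field])
    except (KeyError, TypeError, ValueError):
        return None


print(infer_geometry_type({"x": -122.4, "y": 37.8}))     # point
print(infer_geometry_type({"rings": [[[0, 0], [1, 1]]]}))  # polygon
print(hoist_object_id({"OBJECTID": "7"}, "OBJECTID"))    # 7
print(hoist_object_id({"OBJECTID": "n/a"}, "OBJECTID"))  # None
```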
Metadata helpers (restgdf.utils._metadata)
- normalize_spatial_reference(sr): pure helper returning (epsg_int | None, raw_dict | None). Prefers latestWkid over wkid for EPSG-consuming clients; preserves the original {wkid, latestWkid} mapping as the raw component for round-trip fidelity.
- normalize_date_fields(features, fields): converts ArcGIS esriFieldTypeDate epoch-ms integers to ISO-8601 UTC strings. Defaults preserve the integer representation; opt in with normalize_dates=True on the adapter layer.
- GeoDataFrame.attrs["spatial_reference"] propagation through concat_gdfs and stream_gdf_chunks.
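A small sketch of both conversions, written from the description above rather than from the library source. The function names carry a _sketch suffix to make clear they are illustrative, and the exact ISO-8601 string format restgdf emits is an assumption.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def normalize_spatial_reference_sketch(
    sr: Optional[dict[str, Any]],
) -> tuple[Optional[int], Optional[dict[str, Any]]]:
    """Return (epsg, raw), preferring latestWkid over wkid."""
    if not sr:
        return None, None
    wkid = sr.get("latestWkid") or sr.get("wkid")
    epsg = int(wkid) if wkid is not None else None
    return epsg, dict(sr)  # copy keeps the original mapping for round-trips


def epoch_ms_to_iso_sketch(value: int) -> str:
    """Convert ArcGIS epoch milliseconds to an ISO-8601 UTC string."""
    return datetime.fromtimestamp(value / 1000, tz=timezone.utc).isoformat()


print(normalize_spatial_reference_sketch({"wkid": 102100, "latestWkid": 3857}))
# (3857, {'wkid': 102100, 'latestWkid': 3857})
print(epoch_ms_to_iso_sketch(1_700_000_000_000))
# 2023-11-14T22:13:20+00:00
```

Web Mercator is the usual motivating case: wkid 102100 is the Esri code, while latestWkid 3857 is the EPSG code most downstream tools expect.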
Pagination (restgdf.utils._pagination, re-exported via restgdf.utils.getinfo)
- Frozen PaginationPlan dataclass + sync build_pagination_plan(total_records, max_record_count, *, factor=1.0, advertised_factor=None). Emits (resultOffset, resultRecordCount) tuples byte-identical to the previous inline arithmetic. When factor exceeds advertised_factor the planner clamps to the advertised value and logs a warning via restgdf.pagination.
- In 2.0.0, get_query_data_batches now forwards live advancedQueryCapabilities.maxRecordCountFactor values into build_pagination_plan(advertised_factor=...) when the server advertises a positive numeric factor.
Advanced query capabilities (restgdf._models.responses)
- AdvancedQueryCapabilities: typed PermissiveModel companion for the ArcGIS advancedQueryCapabilities sub-object (five flags restgdf routes on: supportsPagination, supportsQueryByOIDs, supportsReturnExceededLimitFeatures, supportsPaginationOnAggregatedQueries, maxRecordCountFactor). Unknown keys survive via the permissive tier.
- Exposed as the additive companion LayerMetadata.advanced_query_capabilities_typed: AdvancedQueryCapabilities | None. The raw dict on advanced_query_capabilities stays the default representation so permissive-tier consumers keep working byte-for-byte.
Transport protocols (restgdf._client)
- AsyncHTTPSession: typing.Protocol (@runtime_checkable) capturing the get / post / close / closed surface restgdf call sites rely on. isinstance(aiohttp.ClientSession(), AsyncHTTPSession) holds at runtime.
Drift observation (restgdf._models._drift)
- FieldSetDriftObserver: observer class that tracks attribute-key appearance/disappearance across feature-page batches. Emits field_appeared / field_disappeared records via the restgdf.schema_drift logger. Empty pages are skipped. Observation-only; never blocking.
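The observation logic can be pictured as a set difference between consecutive non-empty pages. A minimal sketch, assuming features are dicts with an "attributes" mapping; the class name FieldSetObserverSketch is hypothetical, and it returns events instead of logging them so the behavior is easy to inspect.

```python
from typing import Any, Optional


class FieldSetObserverSketch:
    """Track which attribute keys appear or disappear between pages."""

    def __init__(self) -> None:
        self._known: Optional[set[str]] = None

    def observe(self, features: list[dict[str, Any]]) -> list[tuple[str, str]]:
        if not features:
            return []  # empty pages carry no signal and are skipped
        keys: set[str] = set()
        for feature in features:
            keys.update((feature.get("attributes") or {}).keys())
        if self._known is None:
            self._known = keys  # first page establishes the baseline
            return []
        events = [("field_appeared", k) for k in sorted(keys - self._known)]
        events += [("field_disappeared", k) for k in sorted(self._known - keys)]
        self._known = keys
        return events


observer = FieldSetObserverSketch()
print(observer.observe([{"attributes": {"a": 1, "b": 2}}]))  # []
print(observer.observe([]))                                  # []
print(observer.observe([{"attributes": {"a": 1, "c": 3}}]))
# [('field_appeared', 'c'), ('field_disappeared', 'b')]
```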
Resilience extra (pip install restgdf[resilience])
- restgdf.resilience.ResilientSession wraps any AsyncHTTPSession with stamina-based retry (429/5xx awareness, configurable max-attempts) and per-service-root token-bucket rate limiting.
- Controlled by restgdf.ResilienceConfig (a peer sub-config on Config.resilience). Disabled by default; opt in via RESTGDF_RESILIENCE_ENABLED=1 or ResilienceConfig(enabled=True). When disabled, ResilientSession is a zero-overhead pass-through.
- LimiterRegistry (backed by aiolimiter.AsyncLimiter) and CooldownRegistry provide per-service-root rate limiting and separate 429 back-off. The _service_root() helper truncates the URL at FeatureServer / MapServer / ImageServer / SceneServer to derive the rate-limit key. Cooldown is separate from the token bucket: 429 back-off does NOT drain tokens.
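Deriving a per-service rate-limit key by truncating the URL could be done with a regular expression, as in the sketch below. This is an illustration only: whether the real _service_root() keeps the server-type segment, and what it returns for URLs without one, are assumptions, and example.com is a placeholder host.

```python
import re

# Match everything up to and including the first server-type path segment.
_ROOT_RE = re.compile(
    r"^(.*?/(?:FeatureServer|MapServer|ImageServer|SceneServer))(?=/|\?|$)"
)


def service_root_sketch(url: str) -> str:
    """Return the URL truncated at its server-type segment (or unchanged)."""
    match = _ROOT_RE.match(url)
    return match.group(1) if match else url


layer_query = "https://example.com/arcgis/rest/services/Parcels/FeatureServer/0/query?where=1%3D1"
other_layer = "https://example.com/arcgis/rest/services/Parcels/FeatureServer/3"

# Both layers map to the same key, so they would share one rate limiter.
print(service_root_sketch(layer_query))
# https://example.com/arcgis/rest/services/Parcels/FeatureServer
print(service_root_sketch(layer_query) == service_root_sketch(other_layer))  # True
```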
Telemetry extra (pip install restgdf[telemetry])
- restgdf.telemetry.RestgdfInstrumentor: dynamic subclass of AioHttpClientInstrumentor that adds CLIENT spans for every aiohttp request.
- feature_layer_stream_span: async context manager producing a single INTERNAL feature_layer.stream parent span (R-61). Now wired into iter_pages via _iter_pages_raw so every stream_* call emits exactly one parent span. Uses contextlib.aclosing to ensure span cleanup on early break.
- span_context_fields(): convenience for non-restgdf loggers wanting the current trace_id / span_id.
- _SpanContextFilter is auto-attached to the restgdf root logger; it stamps trace_id / span_id on every log record when a span is active.
- Telemetry is disabled by default (TelemetryConfig.enabled = False). import restgdf.telemetry always succeeds; runtime functions raise OptionalDependencyError when OTel is absent and telemetry is enabled.
Logging surface (restgdf._logging)
- get_logger(suffix: str = "") returns a named restgdf.<suffix> logger with a NullHandler attached. suffix must be "" or one of LOGGER_SUFFIXES (transport, retry, limiter, concurrency, auth, pagination, normalization, schema_drift); unknown suffixes raise ValueError.
- build_log_extra(*, service_root=None, layer_id=None, operation=None, page_index=None, page_size=None, retry_attempt=None, retry_delay_s=None, limiter_wait_s=None, timeout_category=None, result_count=None, exception_type=None) returns a normalized extra= envelope. Unknown keys raise TypeError. service_root is URL-scrubbed internally so ?token=… values never appear in logs.
- get_drift_logger / the restgdf.schema_drift logger name remain unchanged; get_drift_logger() is a thin alias for get_logger("schema_drift").
Deprecations
- restgdf._types re-exports (LayerMetadata, ServiceInfo, etc.): import directly from restgdf or restgdf._models.*. Alias shim emits DeprecationWarning; removal no earlier than 3.x final.
- restgdf.Settings / restgdf.get_settings(): use restgdf.Config / restgdf.get_config() / restgdf.reset_config_cache(). get_settings() emits DeprecationWarning on first call and delegates to get_config(). reset_settings_cache() clears both caches bidirectionally.
- TokenSessionConfig.refresh_threshold_seconds: use refresh_leeway_seconds + clock_skew_seconds.
- restgdf.get_token (synchronous helper): migrate to ArcGISTokenSession for async token lifecycle. get_token now accepts pydantic.SecretStr passwords.
- FeatureLayer.row_dict_generator: use FeatureLayer.stream_rows. Behavior is equivalent; stream_rows adds the on_truncation / order / max_concurrent_pages knobs and emits the R-61 parent span when telemetry is enabled.
All deprecations are shim-backed — existing code keeps working and
emits DeprecationWarning. Removal of any deprecated surface will only
happen in a future major release.
Configuration
restgdf.Config composes eight frozen pydantic sub-configs:
- TransportConfig: user-agent, default headers, verify-SSL.
- TimeoutConfig: total_s (default 30.0); replaces the flat Settings.timeout_seconds.
- RetryConfig: stamina knobs surfaced via the resilience extra.
- LimiterConfig: aiolimiter token-bucket knobs.
- ConcurrencyConfig: max_concurrent_requests (default 8, matches aiohttp TCPConnector default). Enforced at the three internal asyncio.gather sites (orchestrator call paths), not at leaf helpers.
- AuthConfig: token URL, transport (default "header"), refresh_leeway_seconds (default 120), clock_skew_seconds (default 30), referer, credentials.
- TelemetryConfig: enabled (default False), log level, span attributes.
- ResilienceConfig: opt-in wrapper for the stamina-based retry policy and per-service-root token-bucket rate limiter. Disabled by default (enabled=False); toggle via RESTGDF_RESILIENCE_ENABLED=1 or an explicit ResilienceConfig(enabled=True, ...).
Use restgdf.get_config() to resolve the process-wide cached instance
and restgdf.reset_config_cache() to clear it (tests, long-running
processes).
Environment-variable aliases
Old flat RESTGDF_* variables are still honoured with
DeprecationWarning on read. When both old and new are set, the new
name wins and the warning notes the override.
| Deprecated (still honoured) | Replacement |
|---|---|
| | |
| | |
| | |
| | |
| | |
| | |
RESTGDF_CHUNK_SIZE and RESTGDF_DEFAULT_HEADERS_JSON remain on
Settings only for now; they do not yet have a Config home.
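The precedence rule for aliased variables (new name wins, old name warns) can be sketched as follows. The helper read_aliased and the variable names RESTGDF_OLD_NAME / RESTGDF_CATEGORY_FIELD are placeholders chosen for illustration; they are not real restgdf settings.

```python
import warnings
from typing import Mapping, Optional


def read_aliased(env: Mapping[str, str], old_name: str,
                 new_name: str) -> Optional[str]:
    """Resolve a setting that has both a deprecated and a structured name."""
    if new_name in env:
        if old_name in env:
            # Both set: the structured name wins and the override is reported.
            warnings.warn(
                f"{old_name} is deprecated and overridden by {new_name}",
                DeprecationWarning,
                stacklevel=2,
            )
        return env[new_name]
    if old_name in env:
        warnings.warn(
            f"{old_name} is deprecated; use {new_name}",
            DeprecationWarning,
            stacklevel=2,
        )
        return env[old_name]
    return None


env = {"RESTGDF_OLD_NAME": "10", "RESTGDF_CATEGORY_FIELD": "30"}
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = read_aliased(env, "RESTGDF_OLD_NAME", "RESTGDF_CATEGORY_FIELD")

print(value)                     # 30
print(caught[0].category.__name__)  # DeprecationWarning
```

Passing os.environ as env gives the same behavior against the real process environment.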
Resilience / telemetry toggles:
- RESTGDF_RESILIENCE_ENABLED=1 turns on retry + rate-limiting.
- RESTGDF_TELEMETRY_ENABLED=1 turns on OpenTelemetry spans.
Observability
- Named loggers, one per responsibility: restgdf.transport / retry / limiter / concurrency / auth / pagination / normalization / schema_drift. All attached with a NullHandler; opt in by adding a handler.
- Auth log events: update_token emits auth.refresh.start, auth.refresh.success, and auth.refresh.failure at DEBUG level via restgdf.auth.
- Streaming spans: iter_pages emits exactly one INTERNAL feature_layer.stream parent span per call when the telemetry extra is installed and TelemetryConfig.enabled=True. The span is constructed inside restgdf.utils.getgdf._iter_pages_raw. No per-page child spans are emitted.
- Tracing recipe: docs/recipes/tracing.md documents structured observability, error-attribute inspection, and the OpenTelemetry integration.
Transport and auth hardening
- Bounded internal concurrency. Every top-level orchestration call (service_metadata, fetch_all_data, safe_crawl) caps in-flight HTTP fan-out through a single asyncio.BoundedSemaphore sized to ConcurrencyConfig.max_concurrent_requests. Saturation semantics = wait (no new exception). No public signature changed.
- Single-flight token refresh. ArcGISTokenSession.update_token is guarded by a lazily-initialized per-instance asyncio.Lock with double-checked token_needs_update(). Happy-path behavior (no contention, no expiry) is unchanged; under N concurrent requesters exactly one /generateToken POST is issued.
- Reactive 498/499. _call_with_auth_retry intercepts HTTP 498 (token expired) with a single-flight refresh + one automatic retry. HTTP 499 (token not attached) raises AuthNotAttachedError immediately, with no retry.
- Bounded token retry. update_token retries transient network errors up to 3 times with exponential backoff (0.5 s → 1.0 s). Deterministic errors (invalid credentials, content-type mismatches) propagate immediately. After exhaustion, TokenRefreshFailedError is raised with the last exception chained as __cause__.
- Feature-count timeout retry. _feature_count_with_timeout in restgdf.utils.getinfo wraps get_feature_count with exponential-backoff retry on timeout failures only (asyncio.TimeoutError, TimeoutError, aiohttp.ServerTimeoutError). Connection-level failures (aiohttp.ClientConnectionError) and deterministic failures (RestgdfResponseError, validation errors, 4xx) fail fast. Exhausted timeouts raise RestgdfTimeoutError with the original exception chained as __cause__. Inline-only; no soft-dep on the resilience extra.
- verify_ssl plumbed through. ArcGISTokenSession.update_token now forwards ssl=self.verify_ssl to aiohttp, matching the existing behavior of other library-maintained request sites.
- Safe-crawl concurrency bound. Directory.safe_crawl routes the per-layer feature_count probe through a BoundedSemaphore sized from ConcurrencyConfig.max_concurrent_requests.
- Request-verb seam. restgdf.utils._http._choose_verb(url, body=None) returns "POST" for /query and /queryRelatedRecords, "GET" for bare service/layer metadata URLs, and "POST" for unknown URLs. Internal seam only; call sites are unchanged in 2.0.
- Referer binding. When TokenSessionConfig.referer / AuthConfig.referer is set, token_request_payload includes "referer": <url> and switches the ArcGIS client field from "requestip" to "referer".
- UTC wall-clock expiry. ArcGISTokenSession.expires_at returns a tz-aware UTC datetime.datetime (or None). The _utc_now() shim is monkeypatchable for deterministic time tests.
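The single-flight pattern described in the list above (a lazily created lock plus a double check) can be demonstrated with a toy class. TokenHolder below is not ArcGISTokenSession; its attribute names, the fake token request, and the 60-second lifetime are assumptions made for the sketch.

```python
import asyncio
import time
from typing import Optional


class TokenHolder:
    """Toy token holder showing single-flight refresh."""

    def __init__(self, lifetime_s: float = 60.0) -> None:
        self.token: Optional[str] = None
        self.expires_at = 0.0
        self.refresh_calls = 0
        self._lifetime_s = lifetime_s
        self._lock: Optional[asyncio.Lock] = None  # created lazily in the loop

    def token_needs_update(self) -> bool:
        return self.token is None or time.monotonic() >= self.expires_at

    async def _request_token(self) -> str:
        # Placeholder for the network round-trip to the token endpoint.
        self.refresh_calls += 1
        await asyncio.sleep(0.01)
        return f"token-{self.refresh_calls}"

    async def update_token(self) -> str:
        if not self.token_needs_update():  # fast path: no lock taken
            return self.token
        if self._lock is None:
            self._lock = asyncio.Lock()
        async with self._lock:
            # Re-check: another waiter may have refreshed while we queued.
            if self.token_needs_update():
                self.token = await self._request_token()
                self.expires_at = time.monotonic() + self._lifetime_s
            return self.token


async def main() -> tuple[int, set[str]]:
    holder = TokenHolder()
    tokens = await asyncio.gather(*(holder.update_token() for _ in range(10)))
    return holder.refresh_calls, set(tokens)


print(asyncio.run(main()))  # (1, {'token-1'})
```

Ten concurrent callers trigger one simulated token request; the other nine wait on the lock and reuse the fresh token after the second check.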
Upgrading checklist
1. Pin the geo extra if you materialize GeoDataFrames or call any of the pandas/geopandas helpers: change restgdf to "restgdf[geo]" in requirements.txt / pyproject.toml / lockfiles / deployment manifests.
2. Widen except clauses on the small number of legacy shapes: except RuntimeError: around get_gdf / feature-count paths → except PaginationError: (or the broader ArcGISServiceError / RestgdfError). Replace FIELDDOESNOTEXIST sentinel checks with except FieldDoesNotExistError:.
3. Swap get_settings() for get_config() if you touched the flat Settings model directly; same for reset_settings_cache() → reset_config_cache(). Environment variables can stay on their old names during the transition.
4. Migrate row_dict_generator to stream_rows. Pick an on_truncation policy explicitly (default is now "raise"; previously pagination silently continued).
5. Audit auth config. If you relied on the implicit body/query token transport, set transport="body". Bump monitoring windows for the 60 → 120 second proactive-refresh shift if you alert on auth.refresh.start cadence.
6. Split refresh_threshold_seconds into refresh_leeway_seconds + clock_skew_seconds in TokenSessionConfig to silence the deprecation warning.
7. Attach a handler to the named loggers you care about (restgdf.auth, restgdf.pagination, restgdf.schema_drift, restgdf.retry, restgdf.limiter, restgdf.transport, restgdf.concurrency, restgdf.normalization). They are NullHandler-muted by default.
8. (Optional) Install the extras you want: pip install "restgdf[resilience]" for retry + rate limiting, pip install "restgdf[telemetry]" for OpenTelemetry spans.
Migrating from restgdf 1.x to 2.0
restgdf 2.0 replaces the dict / TypedDict public surface with
pydantic 2.13 BaseModel classes. This means
every response and config object you consumed in 1.x is now a typed model:
attribute access instead of dict indexing, runtime validation instead of
silent KeyError, and structured drift logging instead of opaque failures.
This guide lists every breaking change, the migration aids shipped with 2.0, and the new capabilities you can opt into.
Why 2.0
Real-world ArcGIS REST responses vary between vendor versions, deployments,
and service types. In 1.x these variances surfaced as KeyError /
IndexError deep in call stacks, or silently passed through as partially
valid dicts. 2.0 fixes that:
- Pydantic-validated envelopes catch malformed responses at the boundary and raise a typed RestgdfResponseError that carries the raw payload and request context.
- Permissive models (LayerMetadata, ServiceInfo, FieldSpec, crawl models) accept unknown extra keys and tolerate missing optional fields; drift is reported through a dedicated restgdf.schema_drift logger instead of crashing.
- Strict models (CountResponse, ObjectIdsResponse, TokenResponse, ErrorResponse) keep their fail-fast contract on operation-critical payloads.
- Typed credentials: passwords are pydantic.SecretStr, redacted from repr() / logs.
Breaking changes
Every change below is a public-API shape change from 1.x.
| What changed | 1.x (dict) | 2.0 (model) |
|---|---|---|
| | | |
| | | |
| | | |
| | | |
| | | |
| | (did not exist) | |
| | returns | returns |
| | returns | returns |
| | returns | returns |
| | returns | returns |
| | returns raw | unchanged signature, but internally validates; see |
| | plain | |
| | ad-hoc dataclass | backed by |
| | | re-exported pydantic models, emit |
Code rewrites
Dict indexing → attribute access. The examples below are representative, not exhaustive.
FeatureLayer.metadata
# 1.x
layer_name = fl.metadata["name"]
fields = fl.metadata["fields"]
max_record_count = fl.metadata["maxRecordCount"]
# 2.0
layer_name = fl.metadata.name
fields = fl.metadata.fields # list[FieldSpec] | None
max_record_count = fl.metadata.max_record_count
Directory.services
# 1.x
for svc in directory.services:
    print(svc["name"], svc["url"])
# 2.0
for svc in directory.services:  # list[CrawlServiceEntry]
    print(svc.name, svc.url)
    if svc.metadata is not None:  # LayerMetadata | None
        print(svc.metadata.max_record_count)
AGOLUserPass
# 1.x
creds = AGOLUserPass(username="alice", password="hunter2")
token_form["password"] = creds.password
# 2.0
creds = AGOLUserPass(username="alice", password="hunter2")
token_form["password"] = creds.password.get_secret_value()
get_metadata
# 1.x
md = await get_metadata(url, session)
service_type = md.get("type")
# 2.0
md = await get_metadata(url, session) # LayerMetadata
service_type = md.type # str | None
ArcGIS camelCase round-trip
Models accept either camelCase (native ArcGIS) or snake_case input via
pydantic.AliasChoices. To emit camelCase for an ArcGIS round-trip, use
model.model_dump(by_alias=True). To get Python-native snake_case keys,
use model.model_dump() or the restgdf.compat.as_dict helper.
Migration aids
restgdf.compat.as_dict(obj)
Wrap any returned model to get a plain Python dict. Non-model values
(plain dicts, None, primitives) pass through unchanged, so you can
sprinkle it through transitional code without type checks:
from restgdf.compat import as_dict

for entry in directory.services:
    row = as_dict(entry)  # dict whether model or legacy
    save(row["name"], row.get("url"))
as_dict uses model_dump(mode="python", by_alias=False) — snake_case
keys, nested models recursively converted.
restgdf.compat.as_json_dict(obj)
Like as_dict, but mode="json" so every value is JSON-serializable
(SecretStr → "**********" placeholder, datetime → ISO string).
Handy for structured logging:
from restgdf.compat import as_json_dict
logger.info("crawl_result", extra={"payload": as_json_dict(report)})
Deprecated restgdf._types aliases
from restgdf._types import LayerMetadata still works; it now returns
the pydantic class and emits a DeprecationWarning:
import warnings
warnings.filterwarnings("default", category=DeprecationWarning)
from restgdf._types import LayerMetadata # DeprecationWarning
Switch the import to from restgdf import LayerMetadata. The shim will
be removed in 3.x.
New capabilities
Typed response errors
Strict-tier envelopes raise RestgdfResponseError on validation failure,
carrying the raw payload and context:
import logging

from restgdf import RestgdfResponseError, get_feature_count

logger = logging.getLogger(__name__)


async def fetch_count(url, session):
    # Wrapper added so the await is valid; the wrapper name is illustrative.
    try:
        return await get_feature_count(url, session)
    except RestgdfResponseError as exc:
        logger.error(
            "ArcGIS returned malformed count envelope",
            extra={
                "model": exc.model_name,  # "CountResponse"
                "context": exc.context,   # the request URL
                "raw": exc.raw,           # the decoded body
            },
        )
        raise
Permissive payloads still log ordinary vendor drift, but they no longer treat a
top-level ArcGIS JSON error envelope ({"error": {...}}) as harmless metadata
variance. If a metadata/query/crawl helper decodes JSON successfully and the
payload is an ArcGIS error envelope, restgdf now raises
RestgdfResponseError immediately with the original raw body attached.
Schema-drift observability
Permissive models never raise on ordinary vendor variance (unknown extras,
missing optional fields, bad optional field types); instead they log one
record per (model_name, path, kind, value_type) tuple through
restgdf.schema_drift. The logger is installed with a NullHandler by
default — opt in by attaching a handler (see below). Top-level ArcGIS error
envelopes remain fail-fast and are surfaced as RestgdfResponseError rather
than schema drift.
Pydantic round-trip
You can validate, inspect, and re-emit any response payload:
from restgdf import LayerMetadata
md = LayerMetadata.model_validate(raw_dict)
native = md.model_dump(by_alias=True) # camelCase, ArcGIS-compatible
python = md.model_dump() # snake_case, Python-native
Centralized Settings / get_settings()
See the Settings usage section below.
SecretStr on credential passwords
See the SecretStr credentials section below.
Environment variables
All settings are overridable via RESTGDF_* env vars. Unset vars use
the documented default.
| Variable | Field | Type | Default |
|---|---|---|---|
| | | int (>0) | |
| | | float (>0) | |
| | | str (non-empty) | |
| | | one of | |
| | | | |
| | | int (≥0) | |
| | | JSON dict string | |
Malformed values raise RestgdfResponseError at first access.
Drift logger
restgdf.schema_drift is the single logger name. It is silent by
default — install a handler to see what ArcGIS deployments are sending
you:
import logging
drift_logger = logging.getLogger("restgdf.schema_drift")
drift_logger.setLevel(logging.DEBUG)
drift_logger.addHandler(logging.StreamHandler())
# Now any permissive model drift is visible:
# WARNING restgdf.schema_drift: LayerMetadata.max_record_count missing at <url>
# DEBUG restgdf.schema_drift: LayerMetadata unknown extra 'foo' at <url>
Log levels:
- WARNING: a field restgdf actually consumes is missing or has the wrong shape. Library behavior is preserved (defaults to None), but operators likely want to see this.
- DEBUG: an unknown-extra key is present. Purely informational.
Drift events are deduped per process via (model_name, path, kind, value_type) so repeated calls against the same drifty server don’t
spam the log.
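Per-process deduplication of this kind usually comes down to remembering which keys have already been reported. A minimal sketch of that idea, with hypothetical names (report_drift, _seen) and a stand-in logger name so it does not touch the real restgdf.schema_drift logger:

```python
import logging

logger = logging.getLogger("drift_dedupe_sketch")

# Keys already reported in this process: (model_name, path, kind, value_type).
_seen: set[tuple[str, str, str, str]] = set()


def report_drift(model_name: str, path: str, kind: str, value_type: str,
                 level: int = logging.DEBUG) -> bool:
    """Log a drift event once per key; return True only when it was logged."""
    key = (model_name, path, kind, value_type)
    if key in _seen:
        return False
    _seen.add(key)
    logger.log(level, "%s.%s %s (%s)", model_name, path, kind, value_type)
    return True


first = report_drift("LayerMetadata", "foo", "unknown_extra", "str")
repeat = report_drift("LayerMetadata", "foo", "unknown_extra", "str")
other = report_drift("LayerMetadata", "foo", "unknown_extra", "int")
print(first, repeat, other)  # True False True
```

Because value_type is part of the key, the same field arriving with a different type is reported again.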
Settings usage
Settings is a frozen pydantic BaseModel. get_settings() returns a
process-cached instance; tests can reset the cache to pick up environment
changes:
import os
from restgdf import Settings, get_settings
from restgdf._models._settings import reset_settings_cache
# Default: read from os.environ.
settings = get_settings()
print(settings.chunk_size, settings.user_agent)
# Programmatic override.
settings = Settings(chunk_size=250, user_agent="my-app/1.0")
# In tests, mutate env then reset the cache.
os.environ["RESTGDF_CHUNK_SIZE"] = "500"
reset_settings_cache()
assert get_settings().chunk_size == 500
# Bypass the real environment entirely:
settings = Settings.from_env({"RESTGDF_TOKEN_URL": "http://internal/arcgis"})
SecretStr credentials
AGOLUserPass.password is a pydantic.SecretStr: its literal value is
never in repr() or str(), so it is safe for log records, tracebacks,
and error reports.
from restgdf import AGOLUserPass
creds = AGOLUserPass(username="alice", password="hunter2")
print(creds)
# username='alice' password=SecretStr('**********') ...
# Unwrap only at the HTTP-POST boundary.
password_str = creds.password.get_secret_value()
Do not store or log the unwrapped value.
Troubleshooting
AttributeError: 'LayerMetadata' object has no attribute 'get'
You’re calling .get(...) on what used to be a dict. Switch to attribute
access:
# old
name = md.get("name", "unknown")
# new
name = md.name or "unknown"
Or wrap it: as_dict(md).get("name", "unknown").
TypeError: 'LayerMetadata' object is not subscriptable
Indexing (md["name"]) is gone. Use md.name, or as_dict(md)["name"]
during a transitional window.
RestgdfResponseError: Settings validation failed
A RESTGDF_* env var is malformed (for example, RESTGDF_CHUNK_SIZE=0
fails gt=0). Check exc.raw for the offending values and exc.context
for the origin ("Settings.from_env").
RestgdfResponseError from get_feature_count / get_object_ids / update_token
The ArcGIS server returned a payload that did not match the strict
envelope (often an HTML error page or an {"error": {...}} body).
exc.model_name identifies the expected envelope, exc.context holds
the request URL, and exc.raw holds the decoded body for triage.
Silencing the deprecation warnings
During migration you may want to suppress the restgdf._types.*
DeprecationWarning without papering over others:
import warnings
warnings.filterwarnings(
    "ignore",
    message=r"restgdf\._types\..* is deprecated",
    category=DeprecationWarning,
)
Remove the filter once all imports are updated.
SecretStr string coercion
str(creds.password) returns "**********", not the password. If some
library expects a plain str, unwrap explicitly with
creds.password.get_secret_value().