Utilities¶
Low-level helpers used by FeatureLayer and
Directory. Most users will never call these directly, but
they are public API and safe to use.
restgdf.utils.crawl¶
- async restgdf.utils.crawl.fetch_all_data(session, base_url, token=None, return_feature_count=False)[source]¶
Fetch all services and their layers in a highly concurrent manner.
- async restgdf.utils.crawl.safe_crawl(session, base_url, token=None, return_feature_count=False)[source]¶
Crawl an ArcGIS REST directory and aggregate results + errors.
Unlike fetch_all_data(), this function never short-circuits on the first failure. Every recoverable error is captured as a typed CrawlError entry in CrawlReport.errors, and successful services are always present in CrawlReport.services.
The three failure stages are "base_metadata" (root get_metadata call), "folder_metadata" (per-folder get_metadata call), and "service_metadata" (per-service service_metadata call). When a folder’s metadata fails, services discovered in earlier folders (and the base) are still returned.
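The aggregate-instead-of-abort behavior described above can be illustrated with a minimal, self-contained sketch. It does not call restgdf: the CrawlError and CrawlReport classes below are simplified stand-ins whose field names are assumptions, and fake_service_metadata is a hypothetical placeholder for a real per-service request.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class CrawlError:
    # Simplified stand-in; the real model's fields may differ.
    stage: str  # "base_metadata", "folder_metadata", or "service_metadata"
    url: str
    message: str


@dataclass
class CrawlReport:
    # Simplified stand-in; the real model's fields may differ.
    services: list = field(default_factory=list)
    errors: list = field(default_factory=list)


async def fake_service_metadata(url):
    # Hypothetical placeholder for a per-service metadata request.
    if "broken" in url:
        raise RuntimeError("service unavailable")
    return {"url": url, "layers": []}


async def crawl_services(urls):
    report = CrawlReport()
    # return_exceptions=True returns exceptions as results instead of
    # raising the first one, so every URL gets an outcome.
    results = await asyncio.gather(
        *(fake_service_metadata(u) for u in urls), return_exceptions=True
    )
    for url, result in zip(urls, results):
        if isinstance(result, Exception):
            report.errors.append(CrawlError("service_metadata", url, str(result)))
        else:
            report.services.append(result)
    return report


report = asyncio.run(
    crawl_services(["a/MapServer", "broken/MapServer", "c/MapServer"])
)
print(len(report.services), len(report.errors))
```

A caller can then inspect report.errors to decide whether a partial result is acceptable, rather than wrapping the whole crawl in a single try/except.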
restgdf.utils.getgdf¶
Get a GeoDataFrame from an ArcGIS FeatureLayer.
- restgdf.utils.getgdf.combine_where_clauses(base_where, extra_where)[source]¶
Combine where clauses without changing the default all-records predicate.
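One plausible reading of this behavior is sketched below. It assumes the default all-records predicate is "1=1" and that non-default clauses are parenthesized and joined with AND; the actual function may format its output differently. The name combine_where is hypothetical.

```python
DEFAULT_WHERE = "1=1"  # assumed all-records predicate


def combine_where(base_where, extra_where):
    # Drop empty clauses and the default predicate so the result stays
    # "1=1" only when no real filter was supplied.
    clauses = [
        c.strip()
        for c in (base_where, extra_where)
        if c and c.strip() and c.strip() != DEFAULT_WHERE
    ]
    if not clauses:
        return DEFAULT_WHERE
    # Parenthesize each clause so an OR inside one cannot leak into the other.
    return " AND ".join(f"({c})" for c in clauses)


print(combine_where("1=1", "POP > 100"))
print(combine_where("STATE = 'FL'", "POP > 100 OR AREA < 5"))
```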
- restgdf.utils.getgdf.chunk_values(values, chunk_size)[source]¶
Split values into evenly-sized chunks.
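A minimal sketch of fixed-size chunking follows, assuming "evenly-sized" means every chunk holds chunk_size items except possibly the last. The function name chunk is hypothetical, and the library's return type may differ (for example, it may yield rather than return a list).

```python
def chunk(values, chunk_size):
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    values = list(values)
    # Step through the sequence in strides of chunk_size.
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


print(chunk([1, 2, 3, 4, 5], 2))
```

This kind of helper is typically useful for splitting a long list of object ids into batches small enough for one request each.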
- async restgdf.utils.getgdf.get_query_data_batches(url, session, **kwargs)[source]¶
Build query payloads for each request needed to read a layer.
restgdf.utils.getinfo¶
A package for getting GeoDataFrames from ArcGIS FeatureLayers.
Phase 2 split: this module is now a compatibility shim that re-exports public
names from the private submodules _http, _metadata, _query, and
_stats. The orchestration helpers get_offset_range and
service_metadata remain DEFINED here so that patches against
restgdf.utils.getinfo.<helper> continue to intercept their sibling calls
(see tests/test_getinfo_seams.py).
The from aiohttp import ClientSession line is PUBLIC API: tests patch
restgdf.utils.getinfo.ClientSession.post / .get. Do not remove it.
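The reason helpers must be resolved through the module namespace can be shown with a small stand-alone demonstration. It builds a throwaway module named demo_getinfo (a hypothetical name, unrelated to restgdf) whose orchestrator looks up its sibling helper at call time, then patches the helper with unittest.mock.

```python
import sys
import types
from unittest import mock

# Build a throwaway module so the example needs no files on disk.
demo = types.ModuleType("demo_getinfo")
exec(
    "def get_feature_count():\n"
    "    return 100\n"
    "\n"
    "def get_offset_range():\n"
    "    # The helper is looked up in this module's globals at call time.\n"
    "    return range(0, get_feature_count(), 25)\n",
    demo.__dict__,
)
sys.modules["demo_getinfo"] = demo

# Patching the module attribute changes what the orchestrator sees.
with mock.patch("demo_getinfo.get_feature_count", return_value=50):
    patched = list(demo.get_offset_range())

unpatched = list(demo.get_offset_range())
print(patched, unpatched)
```

If the orchestrator lived in a different module and imported the helper by name, patching demo_getinfo.get_feature_count would not affect it, which is why the orchestration helpers stay defined in the shim.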
- class restgdf.utils.getinfo.ClientSession(base_url=None, *, connector=None, loop=None, cookies=None, headers=None, proxy=None, proxy_auth=None, skip_auto_headers=None, auth=None, json_serialize=<function dumps>, request_class=<class 'aiohttp.client_reqrep.ClientRequest'>, response_class=<class 'aiohttp.client_reqrep.ClientResponse'>, ws_response_class=<class 'aiohttp.client_ws.ClientWebSocketResponse'>, version=(1, 1), cookie_jar=None, connector_owner=True, raise_for_status=False, read_timeout=_SENTINEL.sentinel, conn_timeout=None, timeout=_SENTINEL.sentinel, auto_decompress=True, trust_env=False, requote_redirect_url=True, trace_configs=None, read_bufsize=65536, max_line_size=8190, max_field_size=8190, max_headers=128, fallback_charset_resolver=<function ClientSession.<lambda>>, middlewares=(), ssl_shutdown_timeout=_SENTINEL.sentinel)[source]¶
Bases: object
First-class interface for making HTTP requests.
- ATTRS = frozenset({'_auto_decompress', '_base_url', '_base_url_origin', '_connector', '_connector_owner', '_cookie_jar', '_default_auth', '_default_headers', '_default_proxy', '_default_proxy_auth', '_json_serialize', '_loop', '_max_field_size', '_max_headers', '_max_line_size', '_middlewares', '_raise_for_status', '_read_bufsize', '_request_class', '_requote_redirect_url', '_resolve_charset', '_response_class', '_retry_connection', '_skip_auto_headers', '_source_traceback', '_timeout', '_trace_configs', '_trust_env', '_version', '_ws_response_class', 'requote_redirect_url'})¶
- ws_connect(url, *, method='GET', protocols=(), timeout=_SENTINEL.sentinel, receive_timeout=None, autoclose=True, autoping=True, heartbeat=None, auth=None, origin=None, params=None, headers=None, proxy=None, proxy_auth=None, ssl=True, verify_ssl=None, fingerprint=None, ssl_context=None, server_hostname=None, proxy_headers=None, compress=0, max_msg_size=4194304)[source]¶
Initiate websocket connection.
- property cookie_jar: AbstractCookieJar¶
The session cookies.
- property loop: AbstractEventLoop¶
Session’s loop.
- property timeout: ClientTimeout¶
Timeout for the session.
- property raise_for_status: bool | Callable[[ClientResponse], Awaitable[None]]¶
Should ClientResponse.raise_for_status() be called for each response.
- restgdf.utils.getinfo.default_data(data=None, default_dict=None)[source]¶
Return a dict with default values for ArcGIS REST API requests.
- restgdf.utils.getinfo.default_headers(headers=None)[source]¶
Return request headers merged with ArcGIS-compatible defaults.
- async restgdf.utils.getinfo.get_feature_count(url, session, **kwargs)[source]¶
Get the feature count for a layer.
The JSON body is validated against CountResponse (strict tier). A missing/ill-typed count key raises RestgdfResponseError with the original payload and request URL attached for operator triage.
- restgdf.utils.getinfo.get_fields_frame(layer_metadata)[source]¶
Get the fields of a layer as a DataFrame.
- restgdf.utils.getinfo.get_max_record_count(metadata)[source]¶
Get the maximum record count for a layer.
- async restgdf.utils.getinfo.get_metadata(url, session, token=None)[source]¶
Get the parsed metadata model for a layer.
The JSON body is validated against LayerMetadata (permissive tier). Vendor-variance extras are preserved via extra="allow"; missing fields default to None rather than raise. Drift is logged through restgdf._models._drift rather than returned to the caller.
- restgdf.utils.getinfo.get_object_id_field(metadata)[source]¶
Get the object id field name for a layer.
- async restgdf.utils.getinfo.get_object_ids(url, session, **kwargs)[source]¶
Get the object id field name and matching object ids for a layer query.
The JSON body is validated against ObjectIdsResponse (strict tier) so missing field names or non-list id payloads raise RestgdfResponseError before the caller can misuse them. ArcGIS returns objectIds: null for zero-row matches; the model coerces that to [].
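The strict-tier rules described above can be approximated in plain Python. This sketch is not the library's model: parse_object_ids is a hypothetical name, and ValueError stands in for RestgdfResponseError. The objectIdFieldName and objectIds keys are those used by ArcGIS returnIdsOnly responses.

```python
def parse_object_ids(payload):
    field_name = payload.get("objectIdFieldName")
    if not isinstance(field_name, str) or not field_name:
        # Stand-in for RestgdfResponseError.
        raise ValueError(f"missing objectIdFieldName in {payload!r}")
    ids = payload.get("objectIds")
    if ids is None:
        # ArcGIS sends null when no rows match; treat it as an empty list.
        return field_name, []
    if not isinstance(ids, list):
        raise ValueError(f"objectIds is not a list in {payload!r}")
    return field_name, ids


print(parse_object_ids({"objectIdFieldName": "OBJECTID", "objectIds": None}))
```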
- async restgdf.utils.getinfo.get_offset_range(url, session, **kwargs)[source]¶
Get the offset range for a layer.
Orchestrator: resolves get_feature_count, get_metadata, and get_max_record_count through this module’s namespace so that unittest.mock.patch("restgdf.utils.getinfo.<helper>") intercepts.
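As a rough illustration of what an offset range looks like, the sketch below derives page offsets from a feature count and a maximum record count. It assumes the range is simply range(0, feature_count, max_record_count); the function name offset_range is hypothetical and the library's actual computation may differ.

```python
def offset_range(feature_count, max_record_count):
    if max_record_count < 1:
        raise ValueError("max_record_count must be at least 1")
    # One offset per page; the last page may hold fewer records.
    return range(0, feature_count, max_record_count)


offsets = list(offset_range(2500, 1000))
print(offsets)
```

Each offset would typically be sent as the ArcGIS resultOffset query parameter, paired with resultRecordCount set to the page size.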
- async restgdf.utils.getinfo.get_unique_values(url, fields, session, sortby=None, **kwargs)[source]¶
Get the unique values for a field.
- async restgdf.utils.getinfo.get_value_counts(url, field, session, **kwargs)[source]¶
Get the value counts for a field.
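Value counts are commonly obtained from an ArcGIS layer with a grouped statistics query. The sketch below builds such a request payload from standard ArcGIS REST query parameters; whether restgdf builds exactly this payload is an assumption, and value_counts_payload is a hypothetical name.

```python
import json


def value_counts_payload(field, where="1=1"):
    # One "count" statistic, grouped by the field being counted.
    out_statistics = [
        {
            "statisticType": "count",
            "onStatisticField": field,
            "outStatisticFieldName": f"{field}_count",
        }
    ]
    return {
        "where": where,
        "groupByFieldsForStatistics": field,
        "outStatistics": json.dumps(out_statistics),
        "returnGeometry": "false",
        "f": "json",
    }


payload = value_counts_payload("CITY")
print(payload["outStatistics"])
```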
- restgdf.utils.getinfo.getfields(layer_metadata, types=False)¶
Get the fields of a layer.
- restgdf.utils.getinfo.getfields_df(layer_metadata)¶
Get the fields of a layer as a DataFrame.
- async restgdf.utils.getinfo.getuniquevalues(url, fields, session, sortby=None, **kwargs)¶
Get the unique values for a field.
- async restgdf.utils.getinfo.getvaluecounts(url, field, session, **kwargs)¶
Get the value counts for a field.
- async restgdf.utils.getinfo.nested_count(url, fields, session, **kwargs)[source]¶
Get the nested value counts for a field.
- async restgdf.utils.getinfo.nestedcount(url, fields, session, **kwargs)¶
Get the nested value counts for a field.
- async restgdf.utils.getinfo.service_metadata(session, service_url, token=None, return_feature_count=False)[source]¶
Asynchronously retrieve layers for a single service.
Orchestrator: resolves get_metadata and get_feature_count through this module’s namespace so that unittest.mock.patch("restgdf.utils.getinfo.<helper>") intercepts. The aggregated payload is validated against LayerMetadata via the drift adapter before being returned, so vendor-variance extras are logged (not raised) and callers get a typed envelope.
restgdf.utils.utils¶
restgdf.utils.token¶
Token-session helpers for ArcGIS Online / Enterprise.
The AGOLUserPass and TokenSessionConfig models live in
restgdf._models.credentials. They are re-exported here for
backward compatibility with from restgdf.utils.token import
AGOLUserPass and with the public from restgdf import AGOLUserPass
surface documented in the README. The legacy frozen dataclass
AGOLUserPass was migrated to a pydantic StrictModel in v2.0.0;
the import path is unchanged but the constructor is keyword-only.
- class restgdf.utils.token.AGOLUserPass(*, username, password, referer=None, expiration=60)[source]¶
Bases: StrictModel
ArcGIS Online / Enterprise credentials used to mint tokens.
password is stored as pydantic.SecretStr. Call creds.password.get_secret_value() only at the HTTP-POST boundary; never store or log the unwrapped value.
- password: SecretStr¶
- model_config = {'extra': 'ignore', 'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class restgdf.utils.token.ArcGISTokenSession(session, credentials=None, token_url='https://www.arcgis.com/sharing/rest/generateToken', token_refresh_threshold=60, token=None, expires=None, verify_ssl=True, config=None)[source]¶
Bases: object
Wrap an aiohttp session with ArcGIS token refresh behavior.
Construction knobs (token_url, token_refresh_threshold, credentials) are validated via TokenSessionConfig in __post_init__() so a bogus scheme or zero-length username fails fast with RestgdfResponseError rather than surfacing as a 401 or an aiohttp error deep in the request path.
- session: ClientSession¶
- credentials: AGOLUserPass | None = None¶
- config: TokenSessionConfig | None = None¶
- update_dict(input_dict=None)[source]¶
Return a request payload/query dict merged with the active token.
- async update_token()[source]¶
Update the token by making a request to the token URL.
The /generateToken payload is validated against TokenResponse (strict tier) so malformed/error envelopes raise RestgdfResponseError instead of KeyError deep in caller code paths.
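The token lifecycle described for this class (validate the /generateToken response, refresh when close to expiry, merge the token into request dicts) can be sketched with three small functions. None of these names come from restgdf; ValueError stands in for RestgdfResponseError, and the sketch assumes the ArcGIS convention that expires is a Unix timestamp in milliseconds.

```python
import time


def parse_token_response(payload):
    # ArcGIS reports failures as an "error" envelope in the JSON body.
    if "error" in payload:
        raise ValueError(f"token request failed: {payload['error']!r}")
    token = payload.get("token")
    expires = payload.get("expires")
    if not isinstance(token, str) or not isinstance(expires, (int, float)):
        raise ValueError(f"malformed token response: {payload!r}")
    return token, int(expires)


def needs_refresh(expires_ms, threshold_seconds=60, now=None):
    # Refresh once the remaining lifetime drops to the threshold or below.
    now = time.time() if now is None else now
    return expires_ms / 1000 - now <= threshold_seconds


def with_token(input_dict, token):
    # Copy first so the caller's dict is never mutated.
    merged = dict(input_dict or {})
    if token:
        merged["token"] = token
    return merged


token, expires = parse_token_response({"token": "abc", "expires": 1_700_000_000_000})
print(needs_refresh(expires, 60, now=1_699_999_990))
print(with_token({"f": "json"}, token))
```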
- class restgdf.utils.token.TokenSessionConfig(*, token_url, credentials, refresh_threshold_seconds=60, verify_ssl=True)[source]¶
Bases: StrictModel
Validated configuration for ArcGISTokenSession.
token_url is intentionally a plain str with a custom validator rather than pydantic.AnyHttpUrl. ArcGIS Enterprise deployments commonly run plain HTTP on internal networks, and AnyHttpUrl normalizes/rejects real-world URLs (for example it appends trailing slashes and may reject edge cases). Accepting any http:// or https:// string matches the behavior ArcGIS clients need.
- credentials: AGOLUserPass¶
- model_config = {'extra': 'ignore', 'populate_by_name': True, 'validate_by_alias': True, 'validate_by_name': True}¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
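The token_url rule described for TokenSessionConfig (accept any http:// or https:// string and leave it untouched) might look roughly like the following. This is a standard-library sketch, not the library's validator: validate_token_url is a hypothetical name, the host check is an added assumption, and ValueError stands in for the error the real model raises.

```python
from urllib.parse import urlsplit


def validate_token_url(value):
    parts = urlsplit(value)
    # Only the scheme and the presence of a host are checked.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"token_url must be an http(s) URL, got {value!r}")
    # Return the original string unchanged: no trailing slash is added
    # and no other normalization is applied.
    return value


print(validate_token_url("http://gis.internal/portal/sharing/rest/generateToken"))
```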