pynenc.client_data_store.base_client_data_store
Base class and interface for the ClientDataStore system.
Manages serialization and external storage of client-provided data: task arguments, results, and exceptions. Small values pass through as inline serialized strings. Large values are stored externally with content-hash keys for automatic deduplication.
Key components:
BaseClientDataStore: Abstract base with serialize/deserialize and size-based routing
Module Contents
Classes
BaseClientDataStore | Manages serialization and storage of client-provided data.
Functions
Generate a content-hash reference key from serialized data.
API
- class pynenc.client_data_store.base_client_data_store.BaseClientDataStore(app: pynenc.app.Pynenc)
Bases: abc.ABC
Manages serialization and storage of client-provided data.
Handles task arguments, results, and exceptions. Small values are returned inline as serialized strings. Large values are stored externally and referenced by content-hash keys for automatic deduplication.
Deduplication is achieved through deterministic content-hashing: the same serialized value always produces the same SHA-256 key, so backends naturally deduplicate via INSERT OR REPLACE / upsert semantics.
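The following is a minimal sketch of how a deterministic content-hash key could be derived, not the library's actual implementation. The function name `make_reference_key` and the `__ref__:` prefix are assumptions made for illustration; the real key format is not shown on this page.

```python
import hashlib

# Hypothetical prefix: the real reference-key format is not documented here.
REFERENCE_PREFIX = "__ref__:"


def make_reference_key(serialized: str) -> str:
    """Derive a deterministic key from serialized content using SHA-256."""
    digest = hashlib.sha256(serialized.encode("utf-8")).hexdigest()
    return f"{REFERENCE_PREFIX}{digest}"


# Identical content yields an identical key, so storing it again is an overwrite.
key_a = make_reference_key('{"rows": [1, 2, 3]}')
key_b = make_reference_key('{"rows": [1, 2, 3]}')
key_c = make_reference_key('{"rows": [1, 2, 4]}')
```

Because the key depends only on the serialized content, two tasks that pass the same large argument would share a single stored entry.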
A small process-local LRU cache avoids repeated backend reads for recently deserialized objects.
Subclasses implement three abstract methods for backend storage: _store, _retrieve, and _purge.
Initialization
Initialize with app reference.
- Parameters:
app (Pynenc) – The Pynenc application instance
- conf() -> pynenc.conf.config_client_data_store.ConfigClientDataStore
Get the client data store configuration.
- serialize_arguments(kwargs: dict[str, Any], disable_cache_args: tuple[str, ...]) -> dict[str, str]
Serialize task arguments, externalizing large values.
The disable_cache_args config controls which arguments always stay inline (never externalized). Use ("*",) to disable external storage for all arguments.
Deduplication of identical values is handled by content-hash keys: the same serialized content always maps to the same storage key.
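The per-argument behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `serialize_arguments_sketch` and `fake_serialize` are hypothetical names, and the stand-in serializer simply returns the marker `<external>` where a real store would return a reference key.

```python
import json
from typing import Any, Callable


def serialize_arguments_sketch(
    kwargs: dict[str, Any],
    disable_cache_args: tuple[str, ...],
    serialize: Callable[[Any, bool], str],
) -> dict[str, str]:
    """Serialize each argument, forcing inline output for excluded names."""
    inline_all = "*" in disable_cache_args
    return {
        name: serialize(value, inline_all or name in disable_cache_args)
        for name, value in kwargs.items()
    }


def fake_serialize(obj: Any, disable_cache: bool = False) -> str:
    """Hypothetical serializer: externalizes payloads of 20+ characters."""
    payload = json.dumps(obj)
    if disable_cache or len(payload) < 20:
        return payload
    return "<external>"


big = list(range(50))
result = serialize_arguments_sketch(
    {"data": big, "token": big}, ("token",), fake_serialize
)
everything_inline = serialize_arguments_sketch({"data": big}, ("*",), fake_serialize)
```

In this sketch `data` is externalized while `token` stays inline, and the `("*",)` call keeps every argument inline.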
- deserialize_arguments(serialized_args: dict[str, str]) -> dict[str, Any]
Deserialize argument values, resolving any external references.
Each value is checked: if it is a ClientDataStore reference key, the value is loaded from external storage first, then deserialized. Inline values are deserialized directly.
- serialize(obj: Any, disable_cache: bool = False) -> str
Serialize an object, storing externally if it meets size thresholds.
Returns either an inline serialized string (small values) or a reference key pointing to externally stored data (large values).
- Parameters:
obj (Any) – Object to serialize
disable_cache (bool) – If True, always return inline serialized string
- Returns:
Serialized string or reference key
- resolve(data: str) -> Any
Resolve a serialized value to a Python object.
If the value is a reference key, loads the data from external storage first. Otherwise deserializes directly.
- Parameters:
data (str) – Serialized string or reference key
- Returns:
The deserialized Python object
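A rough sketch of the two resolution paths, assuming JSON as the serializer and a plain dict as the external store. The `__ref__:` prefix, the `external_store` dict, and both function names are hypothetical stand-ins rather than the library's internals.

```python
import json
from typing import Any

# Hypothetical reference format and backend, for illustration only.
REFERENCE_PREFIX = "__ref__:"
external_store: dict[str, str] = {
    "__ref__:abc": json.dumps({"rows": [1, 2, 3]}),
}


def is_reference_sketch(value: str) -> bool:
    """Treat any string carrying the assumed prefix as a reference key."""
    return value.startswith(REFERENCE_PREFIX)


def resolve_sketch(data: str) -> Any:
    """Load referenced data first if needed, then deserialize."""
    if is_reference_sketch(data):
        data = external_store[data]
    return json.loads(data)


from_reference = resolve_sketch("__ref__:abc")
from_inline = resolve_sketch("42")
```

Callers do not need to know which path was taken: both return the deserialized object.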
- deserialize(data: str) -> Any
Alias for resolve(). Resolves a serialized value to a Python object.
Deprecated: Use resolve() instead for clarity.
- Parameters:
data (str) – Serialized string or reference key
- Returns:
The deserialized Python object
- is_reference(value: str) -> bool
Check if a string is a reference key to externally stored data.
- Parameters:
value (str) – String to check
- Returns:
True if this is a reference key
- _maybe_store(serialized: str) -> str
Route serialized data to external storage or return inline.
Below min_size_to_cache: return inline. Above max_size_to_cache (if set): return inline with warning. Otherwise: store externally and return reference key.
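The three-way routing can be sketched like this. It is an approximation under assumptions: size is measured here in characters, values exactly at either threshold are stored externally, and the function name, parameters, and dict backend are all hypothetical. The real method reads its thresholds from configuration.

```python
import hashlib
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def maybe_store_sketch(
    serialized: str,
    min_size: int,
    max_size: Optional[int],
    backend: dict[str, str],
) -> str:
    """Return the value inline, or store it and return its content-hash key."""
    size = len(serialized)  # assumption: size counted in characters
    if size < min_size:
        return serialized
    if max_size is not None and size > max_size:
        logger.warning("Value of size %d exceeds max size; keeping inline", size)
        return serialized
    key = hashlib.sha256(serialized.encode("utf-8")).hexdigest()
    backend[key] = serialized
    return key


backend: dict[str, str] = {}
small = maybe_store_sketch("ab", 10, 100, backend)
medium_value = "x" * 50
medium = maybe_store_sketch(medium_value, 10, 100, backend)
huge_value = "x" * 500
huge = maybe_store_sketch(huge_value, 10, 100, backend)
```

Only the medium value ends up in the backend; the small and oversized values are returned unchanged.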
- _resolve_reference(ref_key: str) -> Any
Resolve a reference key to the deserialized object.
Uses a small process-local LRU cache to avoid repeated backend reads for the same key within a single process.
- _cache_deserialized(key: str, obj: Any) -> None
Add to LRU cache, evicting oldest if at capacity.
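One common way to build a small process-local LRU cache is with `collections.OrderedDict`, sketched below. The class name `SmallLRU` and its methods are hypothetical, and whether the library evicts by recency of use or by insertion order is not stated on this page; this sketch evicts the least recently used entry.

```python
from collections import OrderedDict
from typing import Any, Optional


class SmallLRU:
    """Tiny LRU cache: the least recently used entry is evicted first."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: OrderedDict = OrderedDict()

    def get(self, key: str) -> Optional[Any]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: str, obj: Any) -> None:
        self._items[key] = obj
        self._items.move_to_end(key)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry


cache = SmallLRU(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" becomes the most recently used entry
cache.put("c", 3)  # capacity exceeded, so "b" is evicted
```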
- abstractmethod _store(key: str, value: str) -> None
Store a serialized value by its content-hash key.
Backends should use upsert/INSERT OR REPLACE semantics so that storing the same key twice is a no-op (content-hash deduplication).
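As an illustration of the upsert requirement, here is a minimal sketch using the standard-library `sqlite3` module. The table name `client_data`, its schema, and the free function `store_sketch` are assumptions for this example and do not describe any backend shipped with the library.

```python
import sqlite3

# In-memory database with a hypothetical table keyed by content hash.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE client_data (key TEXT PRIMARY KEY, value TEXT NOT NULL)"
)


def store_sketch(key: str, value: str) -> None:
    """Insert the value, replacing any existing row with the same key."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO client_data (key, value) VALUES (?, ?)",
            (key, value),
        )


# Storing the same content-hash key twice leaves a single row.
store_sketch("abc123", "payload")
store_sketch("abc123", "payload")
row_count = conn.execute("SELECT COUNT(*) FROM client_data").fetchone()[0]
stored_value = conn.execute(
    "SELECT value FROM client_data WHERE key = ?", ("abc123",)
).fetchone()[0]
```

Since the key is derived from the content, a replace always writes the same value back, which is why the second call is effectively a no-op.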