pynenc.app

Module Contents

Classes

Pynenc

The main class of the Pynenc library that creates an application object.

API

class pynenc.app.Pynenc(app_id: str | None = None, config_values: Optional[dict[str, Any]] = None, config_filepath: Optional[str] = None)

The main class of the Pynenc library that creates an application object.

Parameters:
  • app_id (Optional[str]) – The id of the application.

  • config_values (Optional[dict[str, Any]]) – A dictionary of configuration values.

  • config_filepath (Optional[str]) – A path to a configuration file.

Note

All of these base classes are abstract and cannot be used directly. If none is specified, they will default to MemTaskBroker, MemStateBackend, etc. These default classes do not actually distribute the code but are helpers for tests or for running an application on your localhost. They may help to parallelize to some degree but cannot be used in a production system.

Initialization

property app_id: str
property tasks: dict[str, pynenc.task.Task]

Get the dictionary of registered tasks.

Returns:

A dictionary mapping task_id to Task instances.

get_task(task_id: str) → Optional[pynenc.task.Task]

Get a task by its ID.

Parameters:

task_id – The ID of the task to retrieve.

Returns:

The Task instance if found, None otherwise.

__getstate__() → dict
__setstate__(state: dict) → None
conf() → pynenc.conf.config_pynenc.ConfigPynenc
logger() → logging.Logger
orchestrator() → pynenc.orchestrator.base_orchestrator.BaseOrchestrator
broker() → pynenc.broker.base_broker.BaseBroker
state_backend() → pynenc.state_backend.base_state_backend.BaseStateBackend
serializer() → pynenc.serializer.base_serializer.BaseSerializer
arg_cache() → pynenc.arg_cache.base_arg_cache.BaseArgCache
property runner: pynenc.runner.base_runner.BaseRunner

Get the runner for this app, prioritizing thread/process-specific context.

First, it checks the thread-local context for a runner (via get_current_runner). This is crucial in the MultiThreadRunner, where each process runs a ThreadRunner and needs to use its own runner instance rather than the app’s default.

If no context runner exists, it falls back to the instance-level runner. This mechanism ensures correct runner isolation across threads and processes.

Returns:

The runner instance for the current context or the app instance.
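The lookup order described above can be sketched in plain Python. This is a minimal illustration, not pynenc's implementation: the `App` class, the `_context` object, and the string "runners" are hypothetical stand-ins, and only the order (thread-local context first, instance-level fallback second) comes from the description.

```python
import threading

# Hypothetical stand-in for the thread-local context; not pynenc's own object.
_context = threading.local()


def get_current_runner():
    """Return the runner bound to the current thread, or None if unset."""
    return getattr(_context, "runner", None)


class App:
    """Hypothetical app that mirrors only the lookup order of the property."""

    def __init__(self, default_runner):
        self._runner_instance = default_runner

    @property
    def runner(self):
        # 1. Prefer a runner bound to the current thread's context.
        context_runner = get_current_runner()
        if context_runner is not None:
            return context_runner
        # 2. Otherwise fall back to the instance-level runner.
        return self._runner_instance


app = App(default_runner="app-default-runner")
seen = {}


def worker():
    _context.runner = "thread-runner"  # visible in this thread only
    seen["worker"] = app.runner


thread = threading.Thread(target=worker)
thread.start()
thread.join()
seen["main"] = app.runner  # the main thread has no context runner
```

In this sketch the worker thread sees its own runner while the main thread still gets the app-level default, which is the isolation the property is meant to provide.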

purge() → None

Purge all data from the broker and state backend.

task(func: Optional[pynenc.types.Func] = None, *, parallel_batch_size: Optional[int] = None, retry_for: Optional[tuple[type[Exception], ...]] = None, max_retries: Optional[int] = None, running_concurrency: Optional[pynenc.conf.config_task.ConcurrencyControlType] = None, registration_concurrency: Optional[pynenc.conf.config_task.ConcurrencyControlType] = None, key_arguments: Optional[tuple[str, ...]] = None, on_diff_non_key_args_raise: Optional[bool] = None, call_result_cache: Optional[bool] = None, disable_cache_args: Optional[tuple[str, ...]] = None) → pynenc.task.Task | Callable[[pynenc.types.Func], pynenc.task.Task]

The task decorator converts the function into an instance of a BaseTask. It accepts a range of options, which are validated against the options class assigned to the task class.

Parameters:
  • func (Optional[Callable]) – The function to be converted into a Task instance.

  • parallel_batch_size (Optional[int]) – If set to 0, auto parallelization is disabled. If greater than 0, tasks with iterable arguments are automatically split into chunks.

  • retry_for (Optional[tuple[type[Exception], ...]]) – Exception types for which the task should be retried.

  • max_retries (Optional[int]) – The maximum number of retries for a task.

  • running_concurrency (Optional[ConcurrencyControlType]) – Controls the concurrency behavior of the task.

  • registration_concurrency (Optional[ConcurrencyControlType]) – Manages task registration concurrency.

  • key_arguments (Optional[tuple[str, ...]]) – Key arguments for concurrency control.

  • on_diff_non_key_args_raise (Optional[bool]) – If True, raises an exception for task invocations with matching key arguments but different non-key arguments.

  • call_result_cache (Optional[bool]) – If True, returns the latest result of a Task with the same arguments if available; otherwise it triggers a new invocation as usual.

  • disable_cache_args (Optional[tuple[str, ...]]) – Arguments to exclude from caching. Pass “*” to disable caching for all arguments.

Returns:

A Task instance or a callable that when called returns a Task instance.

Example:

@app.task(parallel_batch_size=10, max_retries=3)
def my_func(x, y):
    return x + y
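
The return type above (a Task, or a callable that returns a Task) follows from the decorator accepting both the bare form and the form with options. A minimal sketch of that pattern, assuming nothing about pynenc internals: `SketchTask` and this module-level `task` function are hypothetical stand-ins, not the library's classes.

```python
from typing import Any, Callable, Optional


class SketchTask:
    """Hypothetical stand-in for a task object; not pynenc's Task."""

    def __init__(self, func: Callable[..., Any], options: dict[str, Any]) -> None:
        self.func = func
        self.options = options


def task(func: Optional[Callable[..., Any]] = None, **options: Any):
    """Decorator usable both as @task and as @task(option=value)."""

    def wrap(f: Callable[..., Any]) -> SketchTask:
        return SketchTask(f, options)

    # Bare use (@task): func is the decorated function, so wrap it right away.
    if func is not None:
        return wrap(func)
    # Use with options (@task(...)): return the decorator to be applied next.
    return wrap


@task
def plain(x, y):
    return x + y


@task(max_retries=3)
def configured(x, y):
    return x + y
```

Both names end up bound to a task object; only the second carries options.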
direct_task(func: Optional[Func[Params, Result]] = None, *, parallel_func: Optional[pynenc.app.ParallelFunc] = None, aggregate_func: Optional[pynenc.app.AggregateFunc] = None, parallel_batch_size: Optional[int] = None, retry_for: Optional[tuple[type[Exception], ...]] = None, max_retries: Optional[int] = None, running_concurrency: Optional[pynenc.conf.config_task.ConcurrencyControlType] = None, registration_concurrency: Optional[pynenc.conf.config_task.ConcurrencyControlType] = None, key_arguments: Optional[tuple[str, ...]] = None, on_diff_non_key_args_raise: Optional[bool] = None, call_result_cache: Optional[bool] = None, disable_cache_args: Optional[tuple[str, ...]] = None) → Func[Params, Result] | Callable[[Func[Params, Result]], Func[Params, Result]]

Create a task that directly returns its result rather than returning an invocation.

This decorator maintains the original function’s behavior:

  • For synchronous functions, it waits for the result and returns it directly

  • For async functions, it returns an awaitable that resolves to the result

It also supports parallel execution via the parallel_func parameter, which takes a function that generates arguments for parallel processing, and aggregate_func, which combines the results.

Parameters:
  • func (Optional[Func]) – The function to be converted into a Task instance that returns results directly.

  • parallel_func (Optional[ParallelFunc]) –

    Function that takes a dict of key arguments and returns either:

    1. An iterable of parameters for parallel execution (can be tuples, dicts, or Arguments)

      # Example returning just parameters
      lambda args: [(i, i+1) for i in range(5)]  # Returns tuples
      lambda args: [{"x": i, "y": i+1} for i in range(5)]  # Returns dicts
      
    2. A tuple containing (common_args, param_iter) for efficient handling of large shared data:

      • common_args: Dictionary of arguments shared by all parallel tasks

      • param_iter: Iterable of dictionaries with task-specific arguments

      # Example with common arguments
      lambda args: {
          "common_args": {"large_data": args["large_data"]},  # Shared data (serialized once)
          "param_iter": [{"index": i} for i in range(10)]  # Task-specific args
      }
      

      This second approach provides major performance benefits when dealing with large shared arguments (20MB+) as they’re serialized only once instead of for each parallel task.

  • aggregate_func (Optional[AggregateFunc]) – Function that takes a list of results and aggregates them into a single result.

  • parallel_batch_size (Optional[int]) – If set to 0, auto parallelization is disabled. If greater than 0, tasks with iterable arguments are automatically split into chunks.

  • retry_for (Optional[tuple[type[Exception], ...]]) – Exception types for which the task should be retried.

  • max_retries (Optional[int]) – The maximum number of retries for a task.

  • running_concurrency (Optional[ConcurrencyControlType]) – Controls the concurrency behavior of the task.

  • registration_concurrency (Optional[ConcurrencyControlType]) – Manages task registration concurrency.

  • key_arguments (Optional[tuple[str, ...]]) – Key arguments for concurrency control.

  • on_diff_non_key_args_raise (Optional[bool]) – If True, raises an exception for task invocations with matching key arguments but different non-key arguments.

  • call_result_cache (Optional[bool]) – If True, returns the latest result of a Task with the same arguments if available; otherwise it triggers a new invocation as usual.

  • disable_cache_args (Optional[tuple[str, ...]]) – Arguments to exclude from caching. Pass “*” to disable caching for all arguments.

Returns:

A function that behaves like the original but is backed by a distributed task system.

Example:

@app.direct_task(max_retries=3)
def my_func(x, y):
    return x + y

# This will return the result directly
result = my_func(1, 2)  # Returns 3

# With parallel execution
@app.direct_task(
    parallel_func=lambda _: [(i, i+1) for i in range(5)],
    aggregate_func=sum
)
def add_parallel(x, y):
    return x + y

result = add_parallel(0, 0)  # Returns sum of all parallel results

# With optimized pre-serialization of large shared data
@app.direct_task(
    parallel_func=lambda args: {
        "common_args": {"large_data": args["large_data"]},
        "param_iter": [{"index": i} for i in range(100)]
    },
    aggregate_func=lambda results: sum(r[0] for r in results)
)
def process_data(large_data: str, index: int = 0) -> tuple[int, int]:
    # Process large data with multiple parallel tasks
    return (len(large_data) + index, index)

# Calling with 20MB of data
huge_data = "x" * (20 * 1024 * 1024)
result = process_data(huge_data)  # Pre-serializes huge_data only once
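
To make the fan-out and aggregation flow easier to follow, here is a sequential, in-process simulation of the shapes used in the examples above. It is only a sketch under stated assumptions: `run_fan_out` is a hypothetical helper, it passes every bound argument to `parallel_func` (the real decorator is described as passing key arguments), and it involves no broker, runner, or serialization.

```python
import inspect
from typing import Any, Callable


def run_fan_out(func: Callable[..., Any], parallel_func, aggregate_func, *args, **kwargs):
    """Hypothetical local simulation of parallel_func / aggregate_func."""
    # Bind the call so parallel_func receives the arguments by name.
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()
    generated = parallel_func(dict(bound.arguments))

    # Shape with shared data: common_args is merged into every call.
    common: dict[str, Any] = {}
    if isinstance(generated, dict):
        common = generated["common_args"]
        generated = generated["param_iter"]

    results = []
    for params in generated:
        if isinstance(params, dict):
            results.append(func(**common, **params))  # keyword parameters
        else:
            results.append(func(*params, **common))  # positional tuple
    return aggregate_func(results)


def add(x, y):
    return x + y


def process(large_data: str, index: int = 0) -> int:
    return len(large_data) + index


# Tuples: (0, 1), (1, 2), ..., (4, 5) are each passed to add, then summed.
total = run_fan_out(add, lambda _: [(i, i + 1) for i in range(5)], sum, 0, 0)

# Shared data plus per-call parameters.
shared = run_fan_out(
    process,
    lambda a: {
        "common_args": {"large_data": a["large_data"]},
        "param_iter": [{"index": i} for i in range(3)],
    },
    sum,
    "abcd",
)
```

In this simulation the arguments of the outer call only feed `parallel_func`; the values passed to the function itself come from the generated parameters.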