Parallel Process

class parallelformers.parallel.process.ForkingPickler(*args)[source]

Bases: _pickle.Pickler

Copy of the ForkingPickler class from the multiprocessing module

classmethod register(type, reduce) → None[source]

Register reduce methods for multiprocessing
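
As a sketch of how a reduce method can be registered (MyHandle and its rebuild/reduce helpers below are hypothetical, not part of the library):

    from parallelformers.parallel.process import ForkingPickler

    class MyHandle:
        """Hypothetical object that needs a custom pickling strategy."""
        def __init__(self, value):
            self.value = value

    def rebuild_handle(value):
        # runs on the receiving side to reconstruct the object
        return MyHandle(value)

    def reduce_handle(handle):
        # a reduce method returns (rebuild_callable, args_tuple)
        return (rebuild_handle, (handle.value,))

    ForkingPickler.register(MyHandle, reduce_handle)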

classmethod dumps(obj: Any, protocol=None) → memoryview[source]

Dump objects for multiprocessing

loads(data, /, *, fix_imports=True, encoding='ASCII', errors='strict')

Read and return an object from the given pickle data.

The protocol version of the pickle is detected automatically, so no protocol argument is needed. Bytes past the pickled object’s representation are ignored.

Optional keyword arguments are fix_imports, encoding and errors, which are used to control compatibility support for pickle streams generated by Python 2. If fix_imports is True, pickle will try to map the old Python 2 names to the new names used in Python 3. The encoding and errors arguments tell pickle how to decode 8-bit string instances pickled by Python 2; these default to ‘ASCII’ and ‘strict’, respectively. The encoding can be ‘bytes’ to read these 8-bit string instances as bytes objects.
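
A minimal round-trip sketch combining dumps and loads (the payload is arbitrary example data):

    from parallelformers.parallel.process import ForkingPickler

    payload = {"input_ids": [101, 2023, 102]}
    buf = ForkingPickler.dumps(payload)   # memoryview of the pickled bytes
    restored = ForkingPickler.loads(buf)  # plain pickle.loads under the hood
    assert restored == payload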

class parallelformers.parallel.process.ParallelProcess(model: torch.nn.modules.module.Module, fp16: bool, rank: int, num_gpus: int, inputs_queue: multiprocessing.context.BaseContext.Queue, outputs_queue: multiprocessing.context.BaseContext.Queue, parallel_mutex: multiprocessing.context.BaseContext.Event, inference_mutex: multiprocessing.context.BaseContext.Event, verbose: str, backend: str, custom_policies: Union[parallelformers.policies.base.policy.Policy, List[parallelformers.policies.base.policy.Policy]])[source]

Bases: multiprocessing.context.Process

Parallelization process class

Parameters

  • model (nn.Module) – model weights

  • fp16 (bool) – whether to use FP16 or not

  • rank (int) – current GPU rank

  • num_gpus (int) – number of GPUs for parallelization

  • inputs_queue (mp.Queue) – input data queue from user

  • outputs_queue (mp.Queue) – output data queue from model

  • parallel_mutex (mp.Event) – mutex object to notify parallelization state

  • inference_mutex (mp.Event) – mutex object to notify inference state

  • verbose (str) – option to turn on the GPU summary

  • backend (str) – distributed process backend

  • custom_policies (Union[Policy, List[Policy]]) – user customized policy objects


The ParallelProcess object handles the two tasks below (a construction sketch follows the list):

  1. Parallelize the model

  2. Handle the inference state
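
A minimal construction sketch, assuming two GPUs and a stand-in model; the verbose and custom_policies values are assumptions about acceptable defaults, not documented ones:

    import torch.nn as nn
    import torch.multiprocessing as mp
    from parallelformers.parallel.process import ParallelProcess

    model = nn.Linear(8, 8)  # stand-in for a real transformer model
    num_gpus = 2

    inputs_queue, outputs_queue = mp.Queue(), mp.Queue()
    parallel_mutex, inference_mutex = mp.Event(), mp.Event()

    processes = []
    for rank in range(num_gpus):
        proc = ParallelProcess(
            model=model,
            fp16=False,
            rank=rank,
            num_gpus=num_gpus,
            inputs_queue=inputs_queue,
            outputs_queue=outputs_queue,
            parallel_mutex=parallel_mutex,
            inference_mutex=inference_mutex,
            verbose=None,          # assumption: None disables the GPU summary
            backend="nccl",        # typical backend for CUDA tensors
            custom_policies=None,  # assumption: None uses the built-in policies
        )
        proc.daemon = True
        proc.start()
        processes.append(proc)

    parallel_mutex.wait()  # block until parallelization is reported as done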

set_environ(rank: int) → None[source]

Set the environment variables of the current process


Parameters

rank (int) – current GPU rank
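
As an illustration, a torch.distributed worker typically needs rank-specific environment variables such as the ones below; the exact names and values that parallelformers sets are assumptions here:

    import os

    def set_environ_sketch(rank: int, world_size: int = 2) -> None:
        # variables read by torch.distributed's env:// init method;
        # the address and port are placeholders
        os.environ["MASTER_ADDR"] = "127.0.0.1"
        os.environ["MASTER_PORT"] = "29500"
        os.environ["RANK"] = str(rank)
        os.environ["LOCAL_RANK"] = str(rank)
        os.environ["WORLD_SIZE"] = str(world_size)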

destroy() → None[source]

Callback executed when the process terminates.

inference(model: torch.nn.modules.module.Module) → None[source]

Handle the inference state. If an inference request arrives from the main process, run the model and pass the output back to the main process.


Parameters

model (nn.Module) – model weights
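
The loop below is a schematic of such a request/response cycle; the queue payload layout (method name, args, kwargs) is an assumption, not the library's documented protocol:

    def inference_loop_sketch(model, inputs_queue, outputs_queue, inference_mutex):
        while True:
            inference_mutex.wait()   # block until the main process signals a request
            inference_mutex.clear()
            fn_name, args, kwargs = inputs_queue.get()
            output = getattr(model, fn_name)(*args, **kwargs)  # e.g. "generate"
            outputs_queue.put(output)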

check_picklable(obj: Any) → Any[source]

Check whether an object is picklable. If it is not picklable but is a dataclass instance, this method converts it to a dictionary; if it is not a dataclass, an exception is raised.


Parameters

obj (Any) – object to check for picklability

Returns

picklable object

Return type

Any


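A sketch of the described fallback using dataclasses.asdict (whether the library uses this exact helper is an assumption):

    import pickle
    from dataclasses import asdict, is_dataclass
    from typing import Any

    def check_picklable_sketch(obj: Any) -> Any:
        try:
            pickle.dumps(obj)
            return obj              # already picklable: return unchanged
        except Exception:
            if is_dataclass(obj):
                return asdict(obj)  # dataclass: fall back to a plain dict
            raise                   # not a dataclass: propagate the error
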
run() → None[source]

Start the parallelization process