Parallel Engine¶
- class parallelformers.parallel.engine.ParallelEngine(num_gpus: int, backend: str, custom_policies: Union[parallelformers.policies.base.policy.Policy, List[parallelformers.policies.base.policy.Policy]])[source]¶
Bases:
object
Model parallelization processing engine
- Parameters
num_gpus (int) – number of GPUs to parallelize the model across
backend (str) – distributed process backend (e.g. nccl)
custom_policies (Union[Policy, List[Policy]]) – user-defined parallelization policy or list of policies
Notes
Parallelization is performed through the following steps:
1. slice parallelizable tensors and replace the original tensors on CPU
2. upload the parallelizable (replaced) tensors to multiple GPUs simultaneously
3. upload non-parallelizable tensors (e.g. embedding, lm_head, …) to every GPU
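The three steps above can be sketched with plain NumPy (a hypothetical illustration, not parallelformers code): a weight matrix standing in for a parallelizable tensor is sliced column-wise into one shard per GPU, while an embedding-like tensor is replicated whole for every device.

```python
import numpy as np

num_gpus = 2

# Step 1: slice a parallelizable tensor (e.g. an attention weight) on CPU,
# one column shard per GPU.
weight = np.arange(12, dtype=np.float32).reshape(3, 4)
shards = np.split(weight, num_gpus, axis=1)

# Step 2: each shard would be uploaded to its own GPU; here a dict keyed
# by device name stands in for the upload.
sliced = {f"cuda:{rank}": s for rank, s in enumerate(shards)}

# Step 3: non-parallelizable tensors (embedding, lm_head, ...) are copied
# whole to every device instead of being sliced.
embedding = np.ones((5, 4), dtype=np.float32)
replicated = {f"cuda:{rank}": embedding for rank in range(num_gpus)}

# Sanity check: concatenating the shards reconstructs the original tensor.
assert np.array_equal(np.concatenate(list(sliced.values()), axis=1), weight)
print([s.shape for s in sliced.values()])  # → [(3, 2), (3, 2)]
```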
- parallelize(model: torch.nn.modules.module.Module, fp16: bool) torch.nn.modules.module.Module [source]¶
Parallelize the model across multiple GPUs
- Parameters
model (nn.Module) – Huggingface pre-trained transformer model.
fp16 (bool) – Whether to use FP16 or not.
- Returns
parallelized model
- Return type
nn.Module
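As a hedged usage sketch: parallelformers normally drives ParallelEngine through its public helpers rather than direct construction, and the backend value and empty policy list below are assumptions, so everything is guarded and the function returns None when the required dependencies or GPUs are missing.

```python
def parallelize_gpt2(num_gpus: int = 2):
    """Hypothetical sketch of calling ParallelEngine.parallelize directly."""
    try:
        import torch
        from transformers import AutoModelForCausalLM
        from parallelformers.parallel.engine import ParallelEngine
    except ImportError:
        return None  # dependencies not installed
    if torch.cuda.device_count() < num_gpus:
        return None  # not enough GPUs for this degree of parallelism
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    # backend="nccl" and an empty custom-policy list are assumptions here;
    # in practice the library constructs the engine with its own policies.
    engine = ParallelEngine(num_gpus=num_gpus, backend="nccl", custom_policies=[])
    return engine.parallelize(model, fp16=True)  # parallelized nn.Module
```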