Tensor Replacer
- class parallelformers.parallel.replacing.TensorReplacer(model: torch.nn.modules.module.Module, mp_group: Any, fp16: bool, num_gpus: int, custom_policies: Union[parallelformers.policies.base.policy.Policy, List[parallelformers.policies.base.policy.Policy]])
Bases: object
Replaces the original Hugging Face layers with Megatron-style tensor-sliced layers.
- Parameters
- auto_policy() → Optional[List[parallelformers.policies.base.policy.Policy]]
Find the appropriate policy for the current model using AutoPolicy.
- replace_user_define_modules(model: torch.nn.modules.module.Module, policy_cls: Type[parallelformers.policies.base.policy.Policy]) → None
Replace modules in the model according to a user-defined policy.
- Parameters
model (nn.Module) – model weight
policy_cls (Type[Policy]) – class of policy
- replace_orig_to_megatron_modules(model: torch.nn.modules.module.Module, policy_cls: Type[parallelformers.policies.base.policy.Policy]) → torch.nn.modules.module.Module
Replace the original Hugging Face layers with Megatron tensor-sliced layers.
- Parameters
model (nn.Module) – model weight
policy_cls (Type[Policy]) – class of policy
- Returns
parallelized parameters
- Return type
nn.Module
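The replacement step described above can be pictured as a recursive walk over a module tree that swaps every matching child for a sliced counterpart. The following is only a minimal sketch of that general pattern in plain Python: `ToyModule`, `ToyAttention`, `SlicedAttention`, and `replace_modules` are hypothetical names invented for illustration, and nothing here reflects how parallelformers actually implements the method.

```python
from typing import Callable, Dict, Type


class ToyModule:
    """Hypothetical stand-in for nn.Module: a node with named children."""

    def __init__(self, **children: "ToyModule") -> None:
        self.children: Dict[str, "ToyModule"] = dict(children)


class ToyAttention(ToyModule):
    """Hypothetical original layer that a policy would target."""


class SlicedAttention(ToyModule):
    """Hypothetical tensor-sliced replacement layer."""


def replace_modules(
    root: ToyModule,
    target_cls: Type[ToyModule],
    make_replacement: Callable[[ToyModule], ToyModule],
) -> int:
    """Swap every descendant of type target_cls in place; return the count."""
    replaced = 0
    # Copy the items so the dict can be mutated while iterating.
    for name, child in list(root.children.items()):
        if isinstance(child, target_cls):
            root.children[name] = make_replacement(child)
            replaced += 1
        else:
            # Only descend into children that were not replaced.
            replaced += replace_modules(child, target_cls, make_replacement)
    return replaced


model = ToyModule(
    layer0=ToyModule(attn=ToyAttention(), mlp=ToyModule()),
    layer1=ToyModule(attn=ToyAttention(), mlp=ToyModule()),
)
count = replace_modules(model, ToyAttention, lambda old: SlicedAttention())
```

In this toy setup the class passed as `target_cls` plays the role that the policy class plays in the signature above: it tells the walk which layers to replace.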
- preprocess(function_output: List[parallelformers.policies.base.policy.Layer], policy: parallelformers.policies.base.policy.Policy) → Tuple[Dict, Dict, Dict, Dict]
Preprocess the user's policy object in preparation for replacing tensors.
- set_parameters(policy: parallelformers.policies.base.policy.Policy, weight_name: Dict[str, torch.Tensor], bias_name: Dict[str, torch.Tensor], weight_param: Dict[str, torch.Tensor], bias_param: Dict[str, torch.Tensor], suffix: str = 'data') → parallelformers.policies.base.policy.Policy
Set the sliced parameters on the original model.
- Parameters
- Returns
policy object
- Return type
- static set_layer_size(policy: parallelformers.policies.base.policy.Policy, name: str, size: torch.Size) → None
Apply the resized layer size to the original layer object.
- make_megatron_layer(policy: parallelformers.policies.base.policy.Policy) → torch.nn.modules.module.Module
Make Megatron tensor-sliced layers from the original Hugging Face layers by tensor slicing.
- Parameters
policy (Policy) – policy object
- Returns
sliced model layer
- Return type
nn.Module
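To make the idea of tensor slicing concrete, here is a small plain-Python sketch of one Megatron-style scheme: the weight of a linear layer is split along its output-feature dimension, each shard computes its part of the output, and the parts are concatenated. It uses nested lists instead of `torch.Tensor` so it runs without dependencies; `matvec` and `slice_output_rows` are hypothetical helpers written for this illustration and are not part of parallelformers.

```python
from typing import List

Matrix = List[List[float]]


def matvec(weight: Matrix, x: List[float]) -> List[float]:
    """Compute y = W x, with the weight stored as one row per output feature."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def slice_output_rows(weight: Matrix, num_shards: int) -> List[Matrix]:
    """Split the output-feature dimension evenly, one slice per shard."""
    if len(weight) % num_shards != 0:
        raise ValueError("output features must divide evenly across shards")
    step = len(weight) // num_shards
    return [weight[i * step:(i + 1) * step] for i in range(num_shards)]


# A 4-output, 2-input linear layer (bias omitted for brevity).
weight = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
x = [1.0, -1.0]

# Each shard would live on its own device; here they are just list slices.
shards = slice_output_rows(weight, num_shards=2)
partial_outputs = [matvec(shard, x) for shard in shards]

# Concatenating the per-shard outputs reproduces the unsliced result.
gathered = [value for part in partial_outputs for value in part]
full = matvec(weight, x)
```

Splitting along the output dimension needs no communication until the partial outputs are gathered. Splitting along the input dimension is the complementary option, where each shard produces a partial sum that must be added across shards.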