Commands Cheat Sheet

Installation

pip install deepspeed
Install DeepSpeed library

pip install deepspeed-mii
Install DeepSpeed Model Implementations for Inference (MII)

Configuration

deepspeed --help
Display DeepSpeed CLI help

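deepspeed train.py --deepspeed --deepspeed_config ds_config.json
Launch training with a DeepSpeed configuration file (the script must accept these arguments, e.g. via deepspeed.add_config_arguments(parser))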
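
The configuration file itself is plain JSON. As a minimal sketch, the Python below builds one with a few commonly used keys. The helper names (build_ds_config, write_ds_config) and all numeric values are illustrative assumptions, not recommended settings.

```python
import json
from pathlib import Path


def build_ds_config(micro_batch_per_gpu=4, grad_accum_steps=2, num_gpus=4, zero_stage=2):
    """Return a minimal DeepSpeed config dict (all values are illustrative)."""
    return {
        # DeepSpeed expects: train_batch_size ==
        #   micro batch per GPU * gradient accumulation steps * number of GPUs
        "train_batch_size": micro_batch_per_gpu * grad_accum_steps * num_gpus,
        "train_micro_batch_size_per_gpu": micro_batch_per_gpu,
        "gradient_accumulation_steps": grad_accum_steps,
        "optimizer": {"type": "AdamW", "params": {"lr": 1e-4}},
        "fp16": {"enabled": True},
        "zero_optimization": {"stage": zero_stage},
    }


def write_ds_config(path="ds_config.json", **kwargs):
    """Write the config as JSON and return the dict that was written."""
    config = build_ds_config(**kwargs)
    Path(path).write_text(json.dumps(config, indent=2))
    return config
```

Calling write_ds_config() produces a ds_config.json that can be passed to the training script's DeepSpeed config argument.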

Training

deepspeed --num_gpus=4 train.py
Launch training script with 4 GPUs

deepspeed --num_nodes=2 --num_gpus=8 train.py
Distributed training across 2 nodes with 8 GPUs each (nodes are read from a hostfile, /job/hostfile by default)

model_engine, optimizer, _, _ = deepspeed.initialize(args=args, model=model, model_parameters=params)
Initialize DeepSpeed engine in code
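
Once the engine exists, it takes over the backward pass and the optimizer step. The sketch below shows one epoch of that loop. The names train_one_epoch, data_loader, and loss_fn are hypothetical, and dtype casting for fp16 or bf16 runs is left out.

```python
def train_one_epoch(model_engine, data_loader, loss_fn):
    """Run one epoch with a DeepSpeed engine and return the mean loss.

    model_engine is the first value returned by deepspeed.initialize(...);
    data_loader yields (inputs, targets) pairs.
    """
    total_loss, steps = 0.0, 0
    for inputs, targets in data_loader:
        # Move the batch to the device the engine placed the model on.
        inputs = inputs.to(model_engine.device)
        targets = targets.to(model_engine.device)

        outputs = model_engine(inputs)   # forward pass through the wrapped model
        loss = loss_fn(outputs, targets)

        model_engine.backward(loss)      # used instead of loss.backward()
        model_engine.step()              # used instead of optimizer.step() / zero_grad()

        total_loss += loss.item()
        steps += 1
    return total_loss / max(steps, 1)
```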

Monitoring

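"tensorboard": {"enabled": true, "output_path": "./logs"}
Enable TensorBoard logging (set in ds_config.json; the deepspeed launcher has no TensorBoard flag)

"wandb": {"enabled": true}
Enable Weights & Biases integration (set in ds_config.json; the deepspeed launcher has no W&B flag)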
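
DeepSpeed reads its monitoring settings from the JSON configuration. The sketch below adds "tensorboard" and, optionally, "wandb" sections to an existing config dict. The helper name add_monitoring and its default values are assumptions made for illustration.

```python
def add_monitoring(config, log_dir="./logs", job_name="run1",
                   use_wandb=False, wandb_project=None):
    """Return a copy of a DeepSpeed config dict with monitor sections added."""
    updated = dict(config)  # shallow copy so the caller's dict is not modified
    updated["tensorboard"] = {
        "enabled": True,
        "output_path": log_dir,
        "job_name": job_name,
    }
    if use_wandb:
        wandb_section = {"enabled": True}
        if wandb_project is not None:
            wandb_section["project"] = wandb_project
        updated["wandb"] = wandb_section
    return updated
```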

Checkpointing

model_engine.save_checkpoint(save_dir)
Save model checkpoint

_, client_state = model_engine.load_checkpoint(load_dir)
Load model checkpoint

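model_engine.save_checkpoint(save_dir, tag=f"epoch_{epoch}")
Save a checkpoint at the end of each epoch (call it from the training loop on every rank; the launcher has no --zero_stage or --save_interval flag, and ZeRO-3 is selected in ds_config.json)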
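
Saving and resuming usually go together. The sketch below stores the epoch number in client_state when saving and reads it back on load. The function names and the "epoch" key are hypothetical choices for illustration.

```python
def save_epoch_checkpoint(model_engine, save_dir, epoch):
    """Save a checkpoint tagged by epoch. Every rank must make this call."""
    model_engine.save_checkpoint(
        save_dir,
        tag=f"epoch_{epoch}",
        client_state={"epoch": epoch},  # user-defined state stored with the checkpoint
    )


def resume_epoch(model_engine, load_dir):
    """Load the latest checkpoint and return the epoch to start from."""
    load_path, client_state = model_engine.load_checkpoint(load_dir)
    if load_path is None:
        # Nothing was loaded, so training starts from scratch.
        return 0
    return client_state.get("epoch", -1) + 1
```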

Inference

model = deepspeed.init_inference(model, mp_size=2, dtype=torch.half)
Initialize model for inference with tensor parallelism

import mii
Import DeepSpeed-MII (the package installed by pip install deepspeed-mii)

client = mii.serve("<model-name-or-path>")
Start a persistent inference server for a model supported by MII and get a client for it

Profiling

deepspeed --autotuning=tune train.py
Enable autotuning for optimal performance

"flops_profiler": {"enabled": true, "profile_step": 1}
Enable the FLOPS profiler (set in ds_config.json; the deepspeed launcher has no profiler flag)

model_engine.flops_profiler.start_profile()
Start FLOPS profiling

model_engine.flops_profiler.stop_profile()
Stop FLOPS profiling (call print_model_profile() afterwards to print the results)
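
The profiler calls above are normally wrapped around a single step. The sketch below shows that sequence for one forward pass. The helper profile_one_step and the run_forward callable are hypothetical, and the function works with any object exposing the FlopsProfiler methods it calls.

```python
def profile_one_step(profiler, run_forward):
    """Profile a single forward pass and return (total_flops, total_params)."""
    profiler.start_profile()        # begin counting
    run_forward()                   # the forward pass being measured
    profiler.stop_profile()         # stop counting; totals remain readable
    flops = profiler.get_total_flops()
    params = profiler.get_total_params()
    profiler.print_model_profile()  # print the per-module report
    profiler.end_profile()          # release the profiler's hooks and state
    return flops, params


# Typical wiring (requires DeepSpeed and a torch.nn.Module named `model`):
#   from deepspeed.profiling.flops_profiler import FlopsProfiler
#   flops, params = profile_one_step(FlopsProfiler(model), lambda: model(batch))
```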

ZeRO Optimization

"zero_optimization": {"stage": 1}
Enable ZeRO stage 1 (optimizer state partitioning); set in ds_config.json, not on the launcher command line

"zero_optimization": {"stage": 2}
Enable ZeRO stage 2 (optimizer + gradient partitioning)

"zero_optimization": {"stage": 3}
Enable ZeRO stage 3 (optimizer + gradient + parameter partitioning)

"zero_optimization": {"stage": 2, "offload_optimizer": {"device": "cpu"}}
Enable CPU offloading of optimizer states with ZeRO
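
To see what each stage's partitioning buys, the sketch below estimates model-state memory per GPU using the accounting from the ZeRO paper. It assumes mixed-precision Adam (2 bytes of fp16 weights, 2 bytes of fp16 gradients, and 12 bytes of fp32 optimizer state per parameter) and ignores activations and temporary buffers. The function name and the example model size are illustrative.

```python
def zero_memory_per_gpu(num_params, num_gpus, stage):
    """Approximate model-state bytes per GPU for mixed-precision Adam."""
    params = 2 * num_params   # fp16 weights
    grads = 2 * num_params    # fp16 gradients
    optim = 12 * num_params   # fp32 master weights + momentum + variance
    if stage >= 1:
        optim /= num_gpus     # stage 1 partitions optimizer states
    if stage >= 2:
        grads /= num_gpus     # stage 2 also partitions gradients
    if stage >= 3:
        params /= num_gpus    # stage 3 also partitions parameters
    return params + grads + optim


# Example: a 7.5B-parameter model on 64 GPUs.
for stage in range(4):
    gb = zero_memory_per_gpu(7.5e9, 64, stage) / 1e9
    print(f"stage {stage}: {gb:.1f} GB per GPU")
```

Under these assumptions the estimate falls from 120 GB with no partitioning to about 1.9 GB at stage 3.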