nvitop.select module
CUDA visible devices selection tool.
Command line usage:
# All devices but sorted
nvisel # or use `python3 -m nvitop.select`
# A simple example to select 4 devices
nvisel -n 4 # or use `python3 -m nvitop.select -n 4`
# Select available devices that satisfy the given constraints
nvisel --min-count 2 --max-count 3 --min-free-memory 5GiB --max-gpu-utilization 60
# Set `CUDA_VISIBLE_DEVICES` environment variable using `nvisel`
export CUDA_DEVICE_ORDER="PCI_BUS_ID" CUDA_VISIBLE_DEVICES="$(nvisel -c 1 -f 10GiB)"
# Use UUID strings in `CUDA_VISIBLE_DEVICES` environment variable
export CUDA_VISIBLE_DEVICES="$(nvisel -O uuid -c 2 -f 5000M)"
# Pipe output to other shell utilities
nvisel -0 -O uuid -c 2 -f 4GiB | xargs -0 -I {} nvidia-smi --id={} --query-gpu=index,memory.free --format=csv
# Normalize the `CUDA_VISIBLE_DEVICES` environment variable (e.g. convert UUIDs to indices or get full UUIDs for an abbreviated form)
nvisel -i -S
Python API:
# Put this at the top of the Python script
import os
from nvitop import select_devices
os.environ['CUDA_VISIBLE_DEVICES'] = ','.join(
    select_devices(format='uuid', min_count=4, min_free_memory='8GiB')
)
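With `format='index'` the returned identifiers are integers, so they must be converted to strings before joining. The following is a minimal sketch of a helper that does this and also sets `CUDA_DEVICE_ORDER`, mirroring the command line example above. The name `apply_selection` is hypothetical and not part of nvitop, and MIG index tuples are deliberately left out of scope.

```python
from typing import Iterable, MutableMapping, Union


def apply_selection(
    identifiers: Iterable[Union[int, str]],
    environ: MutableMapping[str, str],
) -> str:
    """Write the selection into ``environ`` and return the joined value.

    Hypothetical helper (not part of nvitop). Accepts integer indices or
    UUID strings; MIG ``(int, int)`` tuples are not handled here.
    """
    # str() makes the join work for both integer indices and UUID strings.
    value = ','.join(str(identifier) for identifier in identifiers)
    # PCI bus ordering keeps CUDA indices consistent with `nvidia-smi`.
    environ['CUDA_DEVICE_ORDER'] = 'PCI_BUS_ID'
    environ['CUDA_VISIBLE_DEVICES'] = value
    return value
```

A typical call would pass `os.environ` as the second argument, for example `apply_selection(select_devices(format='index', min_count=1), os.environ)`, before any CUDA library is imported.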
- nvitop.select.select_devices(devices: Iterable[Device] | None, *, format: Literal['index'], force_index: bool, min_count: int, max_count: int | None, min_free_memory: int | str | None, min_total_memory: int | str | None, max_gpu_utilization: int | None, max_memory_utilization: int | None, tolerance: int, free_accounts: list[str] | None, sort: bool, **kwargs: Any) -> list[int] | list[tuple[int, int]]
- nvitop.select.select_devices(devices: Iterable[Device] | None, *, format: Literal['uuid'], force_index: bool, min_count: int, max_count: int | None, min_free_memory: int | str | None, min_total_memory: int | str | None, max_gpu_utilization: int | None, max_memory_utilization: int | None, tolerance: int, free_accounts: list[str] | None, sort: bool, **kwargs: Any) -> list[str]
- nvitop.select.select_devices(devices: Iterable[Device] | None, *, format: Literal['device'], force_index: bool, min_count: int, max_count: int | None, min_free_memory: int | str | None, min_total_memory: int | str | None, max_gpu_utilization: int | None, max_memory_utilization: int | None, tolerance: int, free_accounts: list[str] | None, sort: bool, **kwargs: Any) -> list[Device]
Select a subset of devices satisfying the specified criteria.
Note
The min count constraint may not be satisfied if not enough devices are available. This constraint is only enforced when there are both MIG and non-MIG devices present.
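Because the result may contain fewer devices than `min_count`, a caller that strictly needs that many devices may want to check the length explicitly. A minimal sketch, assuming the caller prefers to fail early; `require_min_count` is a hypothetical helper, not part of nvitop:

```python
from typing import List, Sequence, TypeVar

T = TypeVar('T')


def require_min_count(selected: Sequence[T], min_count: int) -> List[T]:
    """Return ``selected`` as a list, or raise if it is shorter than ``min_count``.

    Hypothetical helper (not part of nvitop).
    """
    if len(selected) < min_count:
        raise RuntimeError(
            f'Requested at least {min_count} device(s), got {len(selected)}.'
        )
    return list(selected)
```

It can wrap the call directly, for example `require_min_count(select_devices(format='uuid', min_count=4), 4)`.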
Examples
Put the following lines at the top of your script:
import os
from nvitop import select_devices
os.environ['CUDA_VISIBLE_DEVICES'] = ','.join(
    select_devices(format='uuid', min_count=4, min_free_memory='8GiB')
)
- Parameters:
devices (Iterable[Device]) – The device superset to select from. If not specified, use all devices as the superset.
format (str) – The format of the output. One of 'index', 'uuid', or 'device'. If any MIG device is encountered while format is 'index', falls back to the 'uuid' format.
force_index (bool) – If True, always use the device index as the output format, even when a MIG device is encountered.
min_count (int) – The minimum number of devices to select.
max_count (Optional[int]) – The maximum number of devices to select.
min_free_memory (Optional[Union[int, str]]) – The minimum free memory (an int in bytes or a str in human-readable form) of the selected devices.
min_total_memory (Optional[Union[int, str]]) – The minimum total memory (an int in bytes or a str in human-readable form) of the selected devices.
max_gpu_utilization (Optional[int]) – The maximum GPU utilization rate (in percentage) of the selected devices.
max_memory_utilization (Optional[int]) – The maximum memory bandwidth utilization rate (in percentage) of the selected devices.
tolerance (int) – The tolerance rate (in percentage) by which to loosen the constraints.
free_accounts (List[str]) – A list of accounts whose used GPU memory needs to be considered as free memory.
sort (bool) – If True, sort the selected devices by memory usage and GPU utilization.
- Returns:
A list of the device identifiers.
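The memory parameters accept either an int in bytes or a human-readable str. The sketch below shows how an integer byte count can be built for a binary unit, assuming that a suffix such as `GiB` is interpreted as 2**30 bytes. The constant `GIB` and the helper `gib` are hypothetical names, not part of nvitop.

```python
GIB = 1024 ** 3  # number of bytes in one gibibyte (2**30)


def gib(amount: float) -> int:
    """Convert a size in GiB to a whole number of bytes.

    Hypothetical helper (not part of nvitop).
    """
    return int(amount * GIB)
```

Under that assumption, passing `min_free_memory=gib(8)` would express the same threshold as `min_free_memory='8GiB'`.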