src.utils.auto_gpu module
- class src.utils.auto_gpu.AutoGPU
Bases: object
Automatic GPU memory manager used to select a GPU with sufficient free memory.
- static allocate_gpu(device, memory_MB: int, block_MB: int | None = None)
[Internal method] Allocate placeholder memory on the target device.
It serves two purposes: verifying that the memory is truly available by actually allocating it, and proactively reserving GPU memory.
- Parameters:
device (str or torch.device) – Target device.
memory_MB (int) – Amount of memory to allocate in MB.
block_MB (int, optional) – Block size. If None, allocate in one shot.
- Returns:
References to the allocated tensors.
- Return type:
torch.Tensor or List[torch.Tensor]
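The following is a minimal sketch of how block-wise placeholder allocation could work. It is not the module's actual implementation. The names `plan_blocks` and `allocate_placeholder` are hypothetical, and the sketch assumes that placeholders are `uint8` tensors (one byte per element) and that 1 MB means 1024 * 1024 bytes.

```python
from typing import List, Optional


def plan_blocks(memory_MB: int, block_MB: Optional[int] = None) -> List[int]:
    """Split a request into block sizes in MB (hypothetical helper)."""
    if memory_MB <= 0:
        raise ValueError("memory_MB must be positive")
    if block_MB is None:
        # One-shot allocation: a single block covering the whole request.
        return [memory_MB]
    if block_MB <= 0:
        raise ValueError("block_MB must be positive")
    full, rest = divmod(memory_MB, block_MB)
    # Full-size blocks first, then one smaller block for any remainder.
    return [block_MB] * full + ([rest] if rest else [])


def allocate_placeholder(device, memory_MB: int, block_MB: Optional[int] = None):
    """Allocate placeholder tensors on `device` (assumed behaviour, not the real method)."""
    import torch  # imported lazily so plan_blocks works without PyTorch installed

    tensors = [
        # One uint8 element occupies one byte, so size * 1024 * 1024 elements is `size` MB.
        torch.empty(size * 1024 * 1024, dtype=torch.uint8, device=device)
        for size in plan_blocks(memory_MB, block_MB)
    ]
    # Mirror the documented return type: a single tensor for one-shot
    # allocation, a list of tensors when a block size is given.
    return tensors[0] if block_MB is None else tensors
```

The caller has to keep the returned references alive for as long as the memory should stay reserved. Once they are dropped, PyTorch can reuse the memory.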
- choice_gpu(memory_MB, interval=600, force=True)
Select a GPU with enough free memory.
This method queries nvidia-smi and then tries to allocate the requested memory to verify that it is actually available. If all GPUs are busy and force=True, it blocks and polls every interval seconds until one becomes available.
- Parameters:
memory_MB (int) – Minimum memory required by the task in MB.
interval (int, optional) – Polling interval in seconds. Default is 600.
force (bool, optional) – Whether to wait until a GPU becomes available. If False and no GPU is available, returns “cpu”. Default is True.
- Returns:
Selected device string, such as “cuda:0” or “cpu”.
- Return type:
str
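The select-verify-wait behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the module's actual code. `parse_free_memory`, `choose_device`, and the injected `free_memory`, `try_allocate`, and `sleep` callables are all hypothetical. The sketch assumes free memory is read with `nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits`, which prints one integer (in MiB) per GPU, and that GPUs with more free memory are tried first.

```python
import time
from typing import Callable, List


def parse_free_memory(smi_output: str) -> List[int]:
    """Parse one free-memory value (MiB) per line of nvidia-smi CSV output."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def choose_device(
    memory_MB: int,
    free_memory: Callable[[], List[int]],       # hypothetical: returns free MiB per GPU
    try_allocate: Callable[[str, int], bool],   # hypothetical: True if allocation succeeds
    interval: int = 600,
    force: bool = True,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return a device string such as "cuda:0", or "cpu" when force is False."""
    while True:
        free = free_memory()
        # Assumption: prefer the GPU that reports the most free memory.
        for index in sorted(range(len(free)), key=lambda i: free[i], reverse=True):
            device = f"cuda:{index}"
            # The reported number alone is not trusted; a real allocation confirms it.
            if free[index] >= memory_MB and try_allocate(device, memory_MB):
                return device
        if not force:
            return "cpu"
        # All GPUs are busy: wait one polling interval, then check again.
        sleep(interval)
```

Passing the probes and `sleep` in as arguments keeps the loop testable without a GPU. In real use, `free_memory` would run nvidia-smi through `subprocess` and `try_allocate` would attempt a placeholder allocation on the candidate device.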