
Accelerators

skyward.accelerators

Accelerator specifications for Skyward.

Provides type-safe accelerator configurations with full IDE autocomplete. Each accelerator factory returns an immutable Accelerator dataclass with defaults from the catalog.

Usage:

    import skyward as sky

    sky.accelerators.H100()              # Default: 80GB, count=1
    sky.accelerators.H100(count=4)       # 4x H100
    sky.accelerators.A100(memory="40GB") # A100 40GB variant

    # Custom accelerator
    my_gpu = sky.accelerators.Custom("My-GPU", memory="48GB")

Available accelerators:

- NVIDIA Datacenter: H100, H200, GH200, B100, B200, GB200, A100, A800, A40, A10, A10G, A2, L4, L40, L40S, T4, V100, P100, K80
- NVIDIA Consumer: RTX_5090-RTX_5060, RTX_4090-RTX_4060, RTX_3090-RTX_3050, RTX_2080_Ti-RTX_2060, GTX series
- NVIDIA Workstation: RTX_A6000-RTX_A2000, Quadro series
- AMD Instinct: MI300X, MI300A, MI250X, MI250, MI210, MI100, MI50
- AWS: Trainium1-3, Inferentia1-2
- Habana: Gaudi, Gaudi2, Gaudi3
- Google TPU: TPUv2-TPUv6, TPU slices

__all__ = ['Accelerator', 'Custom', 'H100', 'H200', 'GH200', 'B100', 'B200', 'GB200', 'A100', 'A800', 'A40', 'A10', 'A10G', 'A16', 'A2', 'L4', 'L40', 'L40S', 'T4', 'T4G', 'V100', 'P100', 'P40', 'P4', 'K80', 'RTX_5090', 'RTX_5080', 'RTX_5070_Ti', 'RTX_5070', 'RTX_5060_Ti', 'RTX_5060', 'RTX_4090', 'RTX_4090D', 'RTX_4080_Super', 'RTX_4080', 'RTX_4070_Ti_Super', 'RTX_4070_Ti', 'RTX_4070_Super', 'RTX_4070', 'RTX_4060_Ti', 'RTX_4060', 'RTX_3090_Ti', 'RTX_3090', 'RTX_3080_Ti', 'RTX_3080', 'RTX_3070_Ti', 'RTX_3070', 'RTX_3060_Ti', 'RTX_3060', 'RTX_3060_Laptop', 'RTX_3050', 'RTX_2080_Ti', 'RTX_2080_Super', 'RTX_2080', 'RTX_2070_Super', 'RTX_2070', 'RTX_2060_Super', 'RTX_2060', 'GTX_1660_Ti', 'GTX_1660_Super', 'GTX_1660', 'GTX_1080_Ti', 'GTX_1080', 'GTX_1070_Ti', 'GTX_1070', 'GTX_1060', 'Titan_Xp', 'Titan_V', 'RTX_6000_Ada', 'RTX_5880_Ada', 'RTX_5000_Ada', 'RTX_PRO_6000', 'RTX_PRO_4000', 'RTX_PRO_4500', 'RTX_A6000', 'RTX_A5000', 'RTX_A5000_Pro', 'RTX_A4000', 'RTX_A2000', 'Quadro_RTX_8000', 'Quadro_RTX_6000', 'Quadro_RTX_5000', 'Quadro_RTX_4000', 'Quadro_P4000', 'MI300X', 'MI300B', 'MI300A', 'MI250X', 'MI250', 'MI210', 'MI100', 'MI50', 'RadeonPro_V710', 'RadeonPro_V520', 'Instinct_MI25', 'Trainium1', 'Trainium2', 'Trainium3', 'Inferentia1', 'Inferentia2', 'Gaudi', 'Gaudi2', 'Gaudi3', 'TPUv2', 'TPUv3', 'TPUv4', 'TPUv5e', 'TPUv5p', 'TPUv6', 'TPUv2_8', 'TPUv3_8', 'TPUv3_32', 'TPUv4_64', 'TPUv5e_4', 'TPUv5p_8'] module-attribute

Accelerator dataclass

Immutable accelerator specification.

Represents a GPU/TPU/accelerator configuration with memory, count, and optional metadata (CUDA versions, etc.).

Args:
    name: Accelerator name (e.g., "H100", "A100", "T4").
    memory: VRAM size (e.g., "80GB", "40GB").
    count: Number of accelerators per node.
    metadata: Additional info (CUDA versions, form factor, etc.).

Examples:
    >>> Accelerator("H100")
    Accelerator(name='H100', memory='', count=1, metadata=None)

    >>> Accelerator("A100", memory="40GB", count=4)
    Accelerator(name='A100', memory='40GB', count=4, metadata=None)

    >>> Accelerator.from_name("H100", count=8)
    Accelerator(name='H100', memory='80GB', count=8, metadata={'cuda': ...})

name instance-attribute

memory = '' class-attribute instance-attribute

count = 1 class-attribute instance-attribute

metadata = None class-attribute instance-attribute

from_name(name, **overrides) classmethod

Create an Accelerator from the catalog with optional overrides.

Looks up the accelerator in the v1 catalog and applies any provided overrides.

Args:
    name: Accelerator name from the catalog (e.g., "H100", "T4").
    **overrides: Override memory, count, or metadata.

Returns: Accelerator instance with defaults from catalog.

Raises: ValueError: If the accelerator name is not in the catalog.

Examples:
    >>> Accelerator.from_name("T4")
    Accelerator(name='T4', memory='16GB', count=1, ...)

    >>> Accelerator.from_name("H100", count=4)
    Accelerator(name='H100', memory='80GB', count=4, ...)

with_count(count)

Return a new Accelerator with a different count.

Args: count: Number of accelerators per node.

Returns: New Accelerator instance with updated count.

Example:
    >>> h100 = Accelerator.from_name("H100")
    >>> h100_x4 = h100.with_count(4)
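Since the dataclass is frozen, `with_count` cannot mutate in place; a plausible implementation (an assumption, not the library's confirmed source) returns a copy via `dataclasses.replace`:

```python
from dataclasses import dataclass, replace
from typing import Optional

@dataclass(frozen=True)
class Accelerator:
    name: str
    memory: str = ""
    count: int = 1
    metadata: Optional[dict] = None

    def with_count(self, count: int) -> "Accelerator":
        # replace() builds a new frozen instance, copying every
        # field except the ones overridden here.
        return replace(self, count=count)

h100 = Accelerator("H100", memory="80GB")
h100_x4 = h100.with_count(4)
```

Note that the original instance is untouched: `h100.count` stays 1 while `h100_x4.count` is 4.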

__str__()

Human-readable representation.

__repr__()

Detailed representation for debugging.

__init__(name, memory='', count=1, metadata=None)

A2(*, count=1)

NVIDIA A2 - Entry-level Ampere for inference.

16GB GDDR6 memory, low power consumption.

Args: count: Number of GPUs per node.

Returns: Accelerator specification for A2.

A10(*, count=1)

NVIDIA A10 - Inference-optimized Ampere GPU.

24GB GDDR6 memory, popular for inference workloads.

Args: count: Number of GPUs per node.

Returns: Accelerator specification for A10.

A10G(*, count=1)

NVIDIA A10G - AWS-specific A10 variant.

24GB GDDR6 memory, available on g5 instances.

Args: count: Number of GPUs per node.

Returns: Accelerator specification for A10G.

A16(*, count=1)

NVIDIA A16 - Ampere multi-GPU inference card (2021).

16GB GDDR6 memory per GPU, designed for virtual desktop and inference workloads with up to 4 GPUs per card.

Args: count: Number of GPUs per node.

Returns: Accelerator specification for A16.

A40(*, count=1)

NVIDIA A40 - Professional visualization + compute (2020).

48GB GDDR6 memory, optimized for virtual workstations.

Args: count: Number of GPUs per node.

Returns: Accelerator specification for A40.

A100(*, memory='80GB', count=1)

NVIDIA A100 - Ampere architecture (2020).

The A100 was the first GPU with TF32 and structural sparsity. Available in 40GB PCIe and 80GB SXM variants.

Args:
    memory: VRAM size - 40GB or 80GB.
    count: Number of GPUs per node.

Returns: Accelerator specification for A100.

A800(*, count=1)

NVIDIA A800 - China-specific A100 variant.

Reduced NVLink bandwidth to comply with export restrictions. Same 80GB HBM2e memory as A100.

Args: count: Number of GPUs per node.

Returns: Accelerator specification for A800.

B100(*, count=1)

NVIDIA B100 - Blackwell architecture (2024).

Second-generation transformer engine with FP4 support. 192GB HBM3e memory.

Args: count: Number of GPUs per node.

Returns: Accelerator specification for B100.

B200(*, count=1)

NVIDIA B200 - Blackwell architecture (2024).

Flagship Blackwell GPU with 192GB HBM3e. Up to 2.5x inference performance vs H100.

Args: count: Number of GPUs per node.

Returns: Accelerator specification for B200.

GB200(*, count=1)

NVIDIA Grace Blackwell Superchip (2024).

Combines Grace CPU with two Blackwell GPUs. 384GB combined HBM3e memory.

Args: count: Number of superchips per node.

Returns: Accelerator specification for GB200.

GH200(*, count=1)

NVIDIA Grace Hopper Superchip (2023).

Combines Grace CPU with Hopper GPU via NVLink-C2C. 96GB HBM3 for GPU, 480GB LPDDR5X for CPU.

Args: count: Number of superchips per node.

Returns: Accelerator specification for GH200.

GTX_1060(*, count=1)

NVIDIA GTX 1060 - Pascal entry (2016).

GTX_1070(*, count=1)

NVIDIA GTX 1070 - Pascal (2016).

GTX_1080(*, count=1)

NVIDIA GTX 1080 - Pascal high-end (2016).

GTX_1660(*, count=1)

NVIDIA GTX 1660 - Turing without RT cores (2019).

H100(*, memory='80GB', form_factor=None, count=1)

NVIDIA H100 - Hopper architecture (2022).

The H100 is NVIDIA's flagship datacenter GPU for AI/ML workloads. Supports FP8, with massive improvements for transformer models.

Args:
    memory: VRAM size - 80GB (SXM/PCIe) or 94GB (NVL).
    form_factor: SXM (high bandwidth), PCIe, or NVL (dual-GPU module).
    count: Number of GPUs per node.

Returns: Accelerator specification for H100.
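The keyword-only signatures above suggest thin factory functions over the dataclass. A hedged sketch of how `H100` might be shaped; storing `form_factor` in metadata is an assumption for illustration, not the library's confirmed layout:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Accelerator:
    name: str
    memory: str = ""
    count: int = 1
    metadata: Optional[dict] = None

def H100(*, memory: str = "80GB", form_factor: Optional[str] = None,
         count: int = 1) -> Accelerator:
    """Factory with keyword-only arguments, mirroring the documented signature."""
    # Record the form factor only when given (assumed representation).
    metadata = {"form_factor": form_factor} if form_factor else None
    return Accelerator("H100", memory=memory, count=count, metadata=metadata)
```

The leading `*` in the signature forces `H100(count=4)` rather than `H100(4)`, which keeps call sites unambiguous across the dozens of similarly shaped factories.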

H200(*, memory='141GB', form_factor=None, count=1)

NVIDIA H200 - Hopper architecture with HBM3e (2024).

H200 features 141GB HBM3e memory with 4.8TB/s bandwidth. Drop-in replacement for H100 with 1.4-1.9x inference speedup.

Args:
    memory: VRAM size (141GB for standard, varies for NVL).
    form_factor: SXM or NVL variant.
    count: Number of GPUs per node.

Returns: Accelerator specification for H200.

K80(*, count=1)

NVIDIA K80 - Kepler architecture (2014).

12GB GDDR5 memory per GPU (24GB total, dual-GPU card). Legacy GPU, CUDA support ends at 10.2.

Args: count: Number of GPUs per node.

Returns: Accelerator specification for K80.

L4(*, count=1)

NVIDIA L4 - Ada Lovelace inference GPU (2023).

24GB GDDR6 memory, replaces T4 for inference. Excellent price/performance for video and LLM inference.

Args: count: Number of GPUs per node.

Returns: Accelerator specification for L4.

L40(*, count=1)

NVIDIA L40 - Ada Lovelace professional GPU (2023).

48GB GDDR6 memory, for visualization and compute.

Args: count: Number of GPUs per node.

Returns: Accelerator specification for L40.

L40S(*, count=1)

NVIDIA L40S - Ada Lovelace compute GPU (2023).

48GB GDDR6 memory, optimized for AI/ML workloads. Higher power limit than L40 for sustained compute.

Args: count: Number of GPUs per node.

Returns: Accelerator specification for L40S.

MI50(*, count=1)

AMD Instinct MI50 - Vega (2018).

MI100(*, count=1)

AMD Instinct MI100 - CDNA 1 (2020).

MI210(*, count=1)

AMD Instinct MI210 - CDNA 2 (2022).

MI250(*, count=1)

AMD Instinct MI250 - CDNA 2 (2021).

MI250X(*, count=1)

AMD Instinct MI250X - CDNA 2 flagship (2021).

MI300A(*, count=1)

AMD Instinct MI300A - CDNA 3 APU (2023).

Integrated CPU + GPU on single package.

MI300B(*, count=1)

AMD Instinct MI300B - CDNA 3 (2023).

MI300X(*, count=1)

AMD Instinct MI300X - CDNA 3 flagship (2023).

192GB HBM3 memory, designed for large language models.

P4(*, count=1)

NVIDIA P4 - Pascal inference GPU (2016).

8GB GDDR5 memory, low-power inference.

Args: count: Number of GPUs per node.

Returns: Accelerator specification for P4.

P40(*, count=1)

NVIDIA P40 - Pascal inference GPU (2016).

24GB GDDR5 memory, INT8 inference support.

Args: count: Number of GPUs per node.

Returns: Accelerator specification for P40.

P100(*, count=1)

NVIDIA P100 - Pascal architecture (2016).

16GB HBM2 memory, first HBM GPU for deep learning.

Args: count: Number of GPUs per node.

Returns: Accelerator specification for P100.

RTX_2060(*, count=1)

NVIDIA RTX 2060 - Turing entry (2019).

RTX_2070(*, count=1)

NVIDIA RTX 2070 - Turing (2018).

RTX_2080(*, count=1)

NVIDIA RTX 2080 - Turing high-end (2018).

RTX_3050(*, count=1)

NVIDIA RTX 3050 - Ampere entry (2022).

RTX_3060(*, count=1)

NVIDIA RTX 3060 - Ampere (2021).

RTX_3070(*, count=1)

NVIDIA RTX 3070 - Ampere (2020).

RTX_3080(*, count=1)

NVIDIA RTX 3080 - Ampere high-end (2020).

RTX_3090(*, count=1)

NVIDIA RTX 3090 - Ampere flagship (2020).

RTX_4060(*, count=1)

NVIDIA RTX 4060 - Ada Lovelace entry (2023).

RTX_4070(*, count=1)

NVIDIA RTX 4070 - Ada Lovelace (2023).

RTX_4080(*, count=1)

NVIDIA RTX 4080 - Ada Lovelace high-end (2022).

RTX_4090(*, count=1)

NVIDIA RTX 4090 - Ada Lovelace consumer flagship (2022).

RTX_4090D(*, count=1)

NVIDIA RTX 4090D - China-specific variant.

RTX_5060(*, count=1)

NVIDIA RTX 5060 - Blackwell consumer entry (2025).

RTX_5070(*, count=1)

NVIDIA RTX 5070 - Blackwell consumer (2025).

RTX_5080(*, count=1)

NVIDIA RTX 5080 - Blackwell consumer high-end (2025).

RTX_5090(*, count=1)

NVIDIA RTX 5090 - Blackwell consumer flagship (2025).

RTX_A2000(*, count=1)

NVIDIA RTX A2000 - Ampere workstation entry (2021).

RTX_A4000(*, count=1)

NVIDIA RTX A4000 - Ampere workstation (2021).

RTX_A5000(*, count=1)

NVIDIA RTX A5000 - Ampere workstation (2021).

RTX_A6000(*, count=1)

NVIDIA RTX A6000 - Ampere workstation flagship (2020).

RTX_PRO_4000(*, count=1)

NVIDIA RTX PRO 4000 - Blackwell workstation (2025).

RTX_PRO_4500(*, count=1)

NVIDIA RTX PRO 4500 - Blackwell workstation (2025).

RTX_PRO_6000(*, count=1)

NVIDIA RTX PRO 6000 - Blackwell workstation (2025).

T4(*, count=1)

NVIDIA T4 - Turing inference GPU (2018).

16GB GDDR6 memory, excellent cost/performance for inference. Widely available on all major clouds (x86_64).

Args: count: Number of GPUs per node.

Returns: Accelerator specification for T4.

T4G(*, count=1)

NVIDIA T4G - Turing inference GPU for ARM64 (2021).

16GB GDDR6 memory, same as T4 but for Graviton instances. Available on AWS g5g instances.

Args: count: Number of GPUs per node.

Returns: Accelerator specification for T4G.

V100(*, memory='32GB', count=1)

NVIDIA V100 - Volta architecture (2017).

First GPU with Tensor Cores. Available in 16GB and 32GB variants. Still widely used for training workloads.

Args:
    memory: VRAM size - 16GB or 32GB.
    count: Number of GPUs per node.

Returns: Accelerator specification for V100.

Custom(name, *, memory='', count=1, cuda_min=None, cuda_max=None)

Create a custom accelerator specification.

Use this for accelerators not in the catalog, or when you need to override the default specifications.

Args:
    name: Accelerator name (e.g., "Custom-GPU", "My-TPU").
    memory: VRAM size (e.g., "48GB").
    count: Number of accelerators per node.
    cuda_min: Minimum CUDA version (e.g., "11.8").
    cuda_max: Maximum CUDA version (e.g., "13.1").

Returns: Accelerator specification with custom parameters.

Examples:
    >>> Custom("My-GPU", memory="48GB")
    Accelerator(name='My-GPU', memory='48GB', ...)

    >>> Custom("Experimental-TPU", memory="128GB", count=8)
    Accelerator(name='Experimental-TPU', memory='128GB', count=8, ...)

    >>> Custom("H100-Custom", memory="80GB", cuda_min="12.0", cuda_max="13.1")
    Accelerator(name='H100-Custom', memory='80GB', ...)
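A sketch of how `Custom` could fold the CUDA bounds into metadata. Representing the bounds as `cuda_min`/`cuda_max` keys is an assumption for illustration; the real library may store them differently:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Accelerator:
    name: str
    memory: str = ""
    count: int = 1
    metadata: Optional[dict] = None

def Custom(name: str, *, memory: str = "", count: int = 1,
           cuda_min: Optional[str] = None,
           cuda_max: Optional[str] = None) -> Accelerator:
    """Build an off-catalog accelerator spec (illustrative sketch)."""
    # Keep only the bounds that were actually provided, so a Custom
    # spec without CUDA constraints carries no metadata at all.
    cuda = {k: v for k, v in
            (("cuda_min", cuda_min), ("cuda_max", cuda_max)) if v}
    return Accelerator(name, memory=memory, count=count,
                       metadata=cuda or None)
```

Under this sketch, `Custom("My-GPU", memory="48GB")` yields `metadata=None`, while passing `cuda_min`/`cuda_max` records them for downstream scheduling constraints.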

Gaudi(*, count=1)

Habana Gaudi - First-gen training accelerator (2019).

Gaudi2(*, count=1)

Habana Gaudi2 - Second-gen training accelerator (2022).

96GB HBM2e memory, 2x performance vs Gaudi.

Gaudi3(*, count=1)

Habana Gaudi3 - Third-gen training accelerator (2024).

128GB HBM2e memory.

GTX_1070_Ti(*, count=1)

NVIDIA GTX 1070 Ti - Pascal (2017).

GTX_1080_Ti(*, count=1)

NVIDIA GTX 1080 Ti - Pascal flagship (2017).

GTX_1660_Super(*, count=1)

NVIDIA GTX 1660 Super - Turing refresh (2019).

GTX_1660_Ti(*, count=1)

NVIDIA GTX 1660 Ti - Turing without RT cores (2019).

Inferentia1(*, count=1)

AWS Inferentia v1 - First-gen inference accelerator.

8GB memory, available on inf1 instances.

Inferentia2(*, count=1)

AWS Inferentia v2 - Second-gen inference accelerator.

32GB HBM memory, available on inf2 instances.

Instinct_MI25(*, count=1)

AMD Instinct MI25 - Vega (2017).

Quadro_P4000(*, count=1)

NVIDIA Quadro P4000 - Pascal workstation (2016).

Quadro_RTX_4000(*, count=1)

NVIDIA Quadro RTX 4000 - Turing workstation (2018).

Quadro_RTX_5000(*, count=1)

NVIDIA Quadro RTX 5000 - Turing workstation (2018).

Quadro_RTX_6000(*, count=1)

NVIDIA Quadro RTX 6000 - Turing workstation (2018).

Quadro_RTX_8000(*, count=1)

NVIDIA Quadro RTX 8000 - Turing workstation (2018).

RadeonPro_V520(*, count=1)

AMD Radeon Pro V520 - Streaming GPU.

RadeonPro_V710(*, count=1)

AMD Radeon Pro V710 - Streaming GPU.

RTX_2060_Super(*, count=1)

NVIDIA RTX 2060 Super - Turing refresh (2019).

RTX_2070_Super(*, count=1)

NVIDIA RTX 2070 Super - Turing refresh (2019).

RTX_2080_Super(*, count=1)

NVIDIA RTX 2080 Super - Turing refresh (2019).

RTX_2080_Ti(*, count=1)

NVIDIA RTX 2080 Ti - Turing flagship (2018).

RTX_3060_Laptop(*, count=1)

NVIDIA RTX 3060 Laptop - Mobile Ampere.

RTX_3060_Ti(*, count=1)

NVIDIA RTX 3060 Ti - Ampere (2020).

RTX_3070_Ti(*, count=1)

NVIDIA RTX 3070 Ti - Ampere (2021).

RTX_3080_Ti(*, count=1)

NVIDIA RTX 3080 Ti - Ampere high-end (2021).

RTX_3090_Ti(*, count=1)

NVIDIA RTX 3090 Ti - Ampere flagship (2022).

RTX_4060_Ti(*, count=1)

NVIDIA RTX 4060 Ti - Ada Lovelace (2023).

RTX_4070_Super(*, count=1)

NVIDIA RTX 4070 Super - Ada Lovelace refresh (2024).

RTX_4070_Ti(*, count=1)

NVIDIA RTX 4070 Ti - Ada Lovelace (2023).

RTX_4070_Ti_Super(*, count=1)

NVIDIA RTX 4070 Ti Super - Ada Lovelace refresh (2024).

RTX_4080_Super(*, count=1)

NVIDIA RTX 4080 Super - Ada Lovelace refresh (2024).

RTX_5000_Ada(*, count=1)

NVIDIA RTX 5000 Ada - Workstation (2023).

RTX_5060_Ti(*, count=1)

NVIDIA RTX 5060 Ti - Blackwell consumer (2025).

RTX_5070_Ti(*, count=1)

NVIDIA RTX 5070 Ti - Blackwell consumer (2025).

RTX_5880_Ada(*, count=1)

NVIDIA RTX 5880 Ada - Workstation (2023).

RTX_6000_Ada(*, count=1)

NVIDIA RTX 6000 Ada - Workstation flagship (2022).

RTX_A5000_Pro(*, count=1)

NVIDIA RTX A5000 Pro - Ampere workstation variant (2021).

Titan_V(*, count=1)

NVIDIA Titan V - Volta consumer (2017).

Titan_Xp(*, count=1)

NVIDIA Titan Xp - Pascal prosumer flagship (2017).

TPUv2(*, count=1)

Google TPU v2 - Second-gen tensor processing unit (2017).

TPUv2_8(*, count=1)

Google TPU v2-8 - 8-chip slice.

TPUv3(*, count=1)

Google TPU v3 - Third-gen TPU (2018).

TPUv3_8(*, count=1)

Google TPU v3-8 - 8-chip slice.

TPUv3_32(*, count=1)

Google TPU v3-32 - 32-chip slice.

TPUv4(*, count=1)

Google TPU v4 - Fourth-gen TPU (2021).

TPUv4_64(*, count=1)

Google TPU v4-64 - 64-chip slice.

TPUv5e(*, count=1)

Google TPU v5e - Efficiency-optimized TPU (2023).

TPUv5e_4(*, count=1)

Google TPU v5e-4 - 4-chip slice.

TPUv5p(*, count=1)

Google TPU v5p - Performance-optimized TPU (2023).

TPUv5p_8(*, count=1)

Google TPU v5p-8 - 8-chip slice.

TPUv6(*, count=1)

Google TPU v6 - Sixth-gen TPU (2024).

Trainium1(*, count=1)

AWS Trainium v1 - First-gen training accelerator.

32GB HBM memory, available on trn1 instances.

Trainium2(*, count=1)

AWS Trainium v2 - Second-gen training accelerator.

64GB HBM memory, 4x performance vs Trainium1.

Trainium3(*, count=1)

AWS Trainium v3 - Third-gen training accelerator.

128GB HBM memory.