Accelerators¶
skyward.accelerators
¶
Accelerator specifications for Skyward.
Provides type-safe accelerator configurations with full IDE autocomplete. Each accelerator factory returns an immutable Accelerator dataclass with defaults from the catalog.
Usage:

```python
import skyward as sky

sky.accelerators.H100()               # Default: 80GB, count=1
sky.accelerators.H100(count=4)        # 4x H100
sky.accelerators.A100(memory="40GB")  # A100 40GB variant

# Custom accelerator
my_gpu = sky.accelerators.Custom("My-GPU", memory="48GB")
```
Available accelerators:

- NVIDIA Datacenter: H100, H200, GH200, B100, B200, GB200, A100, A800, A40, A10, A10G, A2, L4, L40, L40S, T4, V100, P100, K80
- NVIDIA Consumer: RTX_5090-RTX_5060, RTX_4090-RTX_4060, RTX_3090-RTX_3050, RTX_2080_Ti-RTX_2060, GTX series
- NVIDIA Workstation: RTX_A6000-RTX_A2000, Quadro series
- AMD Instinct: MI300X, MI300A, MI250X, MI250, MI210, MI100, MI50
- AWS: Trainium1-3, Inferentia1-2
- Habana: Gaudi, Gaudi2, Gaudi3
- Google TPU: TPUv2-TPUv6, TPU slices
__all__ = ['Accelerator', 'Custom', 'H100', 'H200', 'GH200', 'B100', 'B200', 'GB200', 'A100', 'A800', 'A40', 'A10', 'A10G', 'A16', 'A2', 'L4', 'L40', 'L40S', 'T4', 'T4G', 'V100', 'P100', 'P40', 'P4', 'K80', 'RTX_5090', 'RTX_5080', 'RTX_5070_Ti', 'RTX_5070', 'RTX_5060_Ti', 'RTX_5060', 'RTX_4090', 'RTX_4090D', 'RTX_4080_Super', 'RTX_4080', 'RTX_4070_Ti_Super', 'RTX_4070_Ti', 'RTX_4070_Super', 'RTX_4070', 'RTX_4060_Ti', 'RTX_4060', 'RTX_3090_Ti', 'RTX_3090', 'RTX_3080_Ti', 'RTX_3080', 'RTX_3070_Ti', 'RTX_3070', 'RTX_3060_Ti', 'RTX_3060', 'RTX_3060_Laptop', 'RTX_3050', 'RTX_2080_Ti', 'RTX_2080_Super', 'RTX_2080', 'RTX_2070_Super', 'RTX_2070', 'RTX_2060_Super', 'RTX_2060', 'GTX_1660_Ti', 'GTX_1660_Super', 'GTX_1660', 'GTX_1080_Ti', 'GTX_1080', 'GTX_1070_Ti', 'GTX_1070', 'GTX_1060', 'Titan_Xp', 'Titan_V', 'RTX_6000_Ada', 'RTX_5880_Ada', 'RTX_5000_Ada', 'RTX_PRO_6000', 'RTX_PRO_4000', 'RTX_PRO_4500', 'RTX_A6000', 'RTX_A5000', 'RTX_A5000_Pro', 'RTX_A4000', 'RTX_A2000', 'Quadro_RTX_8000', 'Quadro_RTX_6000', 'Quadro_RTX_5000', 'Quadro_RTX_4000', 'Quadro_P4000', 'MI300X', 'MI300B', 'MI300A', 'MI250X', 'MI250', 'MI210', 'MI100', 'MI50', 'RadeonPro_V710', 'RadeonPro_V520', 'Instinct_MI25', 'Trainium1', 'Trainium2', 'Trainium3', 'Inferentia1', 'Inferentia2', 'Gaudi', 'Gaudi2', 'Gaudi3', 'TPUv2', 'TPUv3', 'TPUv4', 'TPUv5e', 'TPUv5p', 'TPUv6', 'TPUv2_8', 'TPUv3_8', 'TPUv3_32', 'TPUv4_64', 'TPUv5e_4', 'TPUv5p_8']
module-attribute
¶
Accelerator
dataclass
¶
Immutable accelerator specification.
Represents a GPU/TPU/accelerator configuration with memory, count, and optional metadata (CUDA versions, etc.).
Args:

- name: Accelerator name (e.g., "H100", "A100", "T4").
- memory: VRAM size (e.g., "80GB", "40GB").
- count: Number of accelerators per node.
- metadata: Additional info (CUDA versions, form factor, etc.).

Examples:
>>> Accelerator("H100")
Accelerator(name='H100', memory='', count=1, metadata=None)
>>> Accelerator("A100", memory="40GB", count=4)
Accelerator(name='A100', memory='40GB', count=4, metadata=None)
>>> Accelerator.from_name("H100", count=8)
Accelerator(name='H100', memory='80GB', count=8, metadata={'cuda': ...})
name
instance-attribute
¶
memory = ''
class-attribute
instance-attribute
¶
count = 1
class-attribute
instance-attribute
¶
metadata = None
class-attribute
instance-attribute
¶
from_name(name, **overrides)
classmethod
¶
Create an Accelerator from the catalog with optional overrides.
Looks up the accelerator in the v1 catalog and applies any provided overrides.
Args:

- name: Accelerator name from the catalog (e.g., "H100", "T4").
- **overrides: Override memory, count, or metadata.
Returns: Accelerator instance with defaults from catalog.
Raises: ValueError: If the accelerator name is not in the catalog.
Examples:
>>> Accelerator.from_name("T4")
Accelerator(name='T4', memory='16GB', count=1, ...)
>>> Accelerator.from_name("H100", count=4)
Accelerator(name='H100', memory='80GB', count=4, ...)
with_count(count)
¶
Return a new Accelerator with a different count.
Args: count: Number of accelerators per node.
Returns: New Accelerator instance with updated count.
Example:
>>> h100 = Accelerator.from_name("H100")
>>> h100_x4 = h100.with_count(4)
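On a frozen dataclass, a method like with_count is typically a thin wrapper over `dataclasses.replace`. A minimal sketch using a standalone stand-in for Accelerator (not skyward's actual code):

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Accelerator:
    name: str
    memory: str = ""
    count: int = 1

    def with_count(self, count: int) -> "Accelerator":
        # replace() builds a new frozen instance; the original is untouched.
        return replace(self, count=count)


h100 = Accelerator("H100", memory="80GB")
h100_x4 = h100.with_count(4)
```

The original `h100` keeps `count=1`, so a base spec can be safely reused across several derived configurations.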
__str__()
¶
Human-readable representation.
__repr__()
¶
Detailed representation for debugging.
__init__(name, memory='', count=1, metadata=None)
¶
A2(*, count=1)
¶
NVIDIA A2 - Entry-level Ampere for inference.
16GB GDDR6 memory, low power consumption.
Args: count: Number of GPUs per node.
Returns: Accelerator specification for A2.
A10(*, count=1)
¶
NVIDIA A10 - Inference-optimized Ampere GPU.
24GB GDDR6 memory, popular for inference workloads.
Args: count: Number of GPUs per node.
Returns: Accelerator specification for A10.
A10G(*, count=1)
¶
NVIDIA A10G - AWS-specific A10 variant.
24GB GDDR6 memory, available on g5 instances.
Args: count: Number of GPUs per node.
Returns: Accelerator specification for A10G.
A16(*, count=1)
¶
NVIDIA A16 - Ampere multi-GPU inference card (2021).
16GB GDDR6 memory per GPU, designed for virtual desktop and inference workloads with up to 4 GPUs per card.
Args: count: Number of GPUs per node.
Returns: Accelerator specification for A16.
A40(*, count=1)
¶
NVIDIA A40 - Professional visualization + compute (2020).
48GB GDDR6 memory, optimized for virtual workstations.
Args: count: Number of GPUs per node.
Returns: Accelerator specification for A40.
A100(*, memory='80GB', count=1)
¶
NVIDIA A100 - Ampere architecture (2020).
The A100 was the first GPU with TF32 and structural sparsity. Available in 40GB PCIe and 80GB SXM variants.
Args:

- memory: VRAM size - "40GB" or "80GB".
- count: Number of GPUs per node.
Returns: Accelerator specification for A100.
A800(*, count=1)
¶
NVIDIA A800 - China-specific A100 variant.
Reduced NVLink bandwidth to comply with export restrictions. Same 80GB HBM2e memory as A100.
Args: count: Number of GPUs per node.
Returns: Accelerator specification for A800.
B100(*, count=1)
¶
NVIDIA B100 - Blackwell architecture (2024).
Second-generation transformer engine with FP4 support. 192GB HBM3e memory.
Args: count: Number of GPUs per node.
Returns: Accelerator specification for B100.
B200(*, count=1)
¶
NVIDIA B200 - Blackwell architecture (2024).
Flagship Blackwell GPU with 192GB HBM3e. Up to 2.5x inference performance vs H100.
Args: count: Number of GPUs per node.
Returns: Accelerator specification for B200.
GB200(*, count=1)
¶
NVIDIA Grace Blackwell Superchip (2024).
Combines Grace CPU with two Blackwell GPUs. 384GB combined HBM3e memory.
Args: count: Number of superchips per node.
Returns: Accelerator specification for GB200.
GH200(*, count=1)
¶
NVIDIA Grace Hopper Superchip (2023).
Combines Grace CPU with Hopper GPU via NVLink-C2C. 96GB HBM3 for GPU, 480GB LPDDR5X for CPU.
Args: count: Number of superchips per node.
Returns: Accelerator specification for GH200.
GTX_1060(*, count=1)
¶
NVIDIA GTX 1060 - Pascal entry (2016).
GTX_1070(*, count=1)
¶
NVIDIA GTX 1070 - Pascal (2016).
GTX_1080(*, count=1)
¶
NVIDIA GTX 1080 - Pascal high-end (2016).
GTX_1660(*, count=1)
¶
NVIDIA GTX 1660 - Turing without RT cores (2019).
H100(*, memory='80GB', form_factor=None, count=1)
¶
NVIDIA H100 - Hopper architecture (2022).
The H100 is NVIDIA's flagship datacenter GPU for AI/ML workloads. Supports FP8, with massive improvements for transformer models.
Args:

- memory: VRAM size - 80GB (SXM/PCIe) or 94GB (NVL).
- form_factor: SXM (high bandwidth), PCIe, or NVL (2x GPU module).
- count: Number of GPUs per node.
Returns: Accelerator specification for H100.
H200(*, memory='141GB', form_factor=None, count=1)
¶
NVIDIA H200 - Hopper architecture with HBM3e (2024).
H200 features 141GB HBM3e memory with 4.8TB/s bandwidth. Drop-in replacement for H100 with 1.4-1.9x inference speedup.
Args:

- memory: VRAM size (141GB standard; varies for NVL).
- form_factor: SXM or NVL variant.
- count: Number of GPUs per node.
Returns: Accelerator specification for H200.
K80(*, count=1)
¶
NVIDIA K80 - Kepler architecture (2014).
12GB GDDR5 memory per GPU (24GB total, dual-GPU card). Legacy GPU, CUDA support ends at 10.2.
Args: count: Number of GPUs per node.
Returns: Accelerator specification for K80.
L4(*, count=1)
¶
NVIDIA L4 - Ada Lovelace inference GPU (2023).
24GB GDDR6 memory, replaces T4 for inference. Excellent price/performance for video and LLM inference.
Args: count: Number of GPUs per node.
Returns: Accelerator specification for L4.
L40(*, count=1)
¶
NVIDIA L40 - Ada Lovelace professional GPU (2023).
48GB GDDR6 memory, for visualization and compute.
Args: count: Number of GPUs per node.
Returns: Accelerator specification for L40.
L40S(*, count=1)
¶
NVIDIA L40S - Ada Lovelace compute GPU (2023).
48GB GDDR6 memory, optimized for AI/ML workloads. Higher power limit than L40 for sustained compute.
Args: count: Number of GPUs per node.
Returns: Accelerator specification for L40S.
MI50(*, count=1)
¶
AMD Instinct MI50 - Vega (2018).
MI100(*, count=1)
¶
AMD Instinct MI100 - CDNA 1 (2020).
MI210(*, count=1)
¶
AMD Instinct MI210 - CDNA 2 (2022).
MI250(*, count=1)
¶
AMD Instinct MI250 - CDNA 2 (2021).
MI250X(*, count=1)
¶
AMD Instinct MI250X - CDNA 2 flagship (2021).
MI300A(*, count=1)
¶
AMD Instinct MI300A - CDNA 3 APU (2023).
Integrated CPU + GPU on single package.
MI300B(*, count=1)
¶
AMD Instinct MI300B - CDNA 3 (2023).
MI300X(*, count=1)
¶
AMD Instinct MI300X - CDNA 3 flagship (2023).
192GB HBM3 memory, designed for large language models.
P4(*, count=1)
¶
NVIDIA P4 - Pascal inference GPU (2016).
8GB GDDR5 memory, low-power inference.
Args: count: Number of GPUs per node.
Returns: Accelerator specification for P4.
P40(*, count=1)
¶
NVIDIA P40 - Pascal inference GPU (2016).
24GB GDDR5 memory, INT8 inference support.
Args: count: Number of GPUs per node.
Returns: Accelerator specification for P40.
P100(*, count=1)
¶
NVIDIA P100 - Pascal architecture (2016).
16GB HBM2 memory, first HBM GPU for deep learning.
Args: count: Number of GPUs per node.
Returns: Accelerator specification for P100.
RTX_2060(*, count=1)
¶
NVIDIA RTX 2060 - Turing entry (2019).
RTX_2070(*, count=1)
¶
NVIDIA RTX 2070 - Turing (2018).
RTX_2080(*, count=1)
¶
NVIDIA RTX 2080 - Turing high-end (2018).
RTX_3050(*, count=1)
¶
NVIDIA RTX 3050 - Ampere entry (2022).
RTX_3060(*, count=1)
¶
NVIDIA RTX 3060 - Ampere (2021).
RTX_3070(*, count=1)
¶
NVIDIA RTX 3070 - Ampere (2020).
RTX_3080(*, count=1)
¶
NVIDIA RTX 3080 - Ampere high-end (2020).
RTX_3090(*, count=1)
¶
NVIDIA RTX 3090 - Ampere flagship (2020).
RTX_4060(*, count=1)
¶
NVIDIA RTX 4060 - Ada Lovelace entry (2023).
RTX_4070(*, count=1)
¶
NVIDIA RTX 4070 - Ada Lovelace (2023).
RTX_4080(*, count=1)
¶
NVIDIA RTX 4080 - Ada Lovelace high-end (2022).
RTX_4090(*, count=1)
¶
NVIDIA RTX 4090 - Ada Lovelace consumer flagship (2022).
RTX_4090D(*, count=1)
¶
NVIDIA RTX 4090D - China-specific variant.
RTX_5060(*, count=1)
¶
NVIDIA RTX 5060 - Blackwell consumer entry (2025).
RTX_5070(*, count=1)
¶
NVIDIA RTX 5070 - Blackwell consumer (2025).
RTX_5080(*, count=1)
¶
NVIDIA RTX 5080 - Blackwell consumer high-end (2025).
RTX_5090(*, count=1)
¶
NVIDIA RTX 5090 - Blackwell consumer flagship (2025).
RTX_A2000(*, count=1)
¶
NVIDIA RTX A2000 - Ampere workstation entry (2021).
RTX_A4000(*, count=1)
¶
NVIDIA RTX A4000 - Ampere workstation (2021).
RTX_A5000(*, count=1)
¶
NVIDIA RTX A5000 - Ampere workstation (2021).
RTX_A6000(*, count=1)
¶
NVIDIA RTX A6000 - Ampere workstation flagship (2020).
RTX_PRO_4000(*, count=1)
¶
NVIDIA RTX PRO 4000 - Blackwell workstation (2025).
RTX_PRO_4500(*, count=1)
¶
NVIDIA RTX PRO 4500 - Blackwell workstation (2025).
RTX_PRO_6000(*, count=1)
¶
NVIDIA RTX PRO 6000 - Blackwell workstation (2025).
T4(*, count=1)
¶
NVIDIA T4 - Turing inference GPU (2018).
16GB GDDR6 memory, excellent cost/performance for inference. Widely available on all major clouds (x86_64).
Args: count: Number of GPUs per node.
Returns: Accelerator specification for T4.
T4G(*, count=1)
¶
NVIDIA T4G - Turing inference GPU for ARM64 (2021).
16GB GDDR6 memory, same as T4 but for Graviton instances. Available on AWS g5g instances.
Args: count: Number of GPUs per node.
Returns: Accelerator specification for T4G.
V100(*, memory='32GB', count=1)
¶
NVIDIA V100 - Volta architecture (2017).
First GPU with Tensor Cores. Available in 16GB and 32GB variants. Still widely used for training workloads.
Args:

- memory: VRAM size - "16GB" or "32GB".
- count: Number of GPUs per node.
Returns: Accelerator specification for V100.
Custom(name, *, memory='', count=1, cuda_min=None, cuda_max=None)
¶
Create a custom accelerator specification.
Use this for accelerators not in the catalog, or when you need to override the default specifications.
Args:

- name: Accelerator name (e.g., "Custom-GPU", "My-TPU").
- memory: VRAM size (e.g., "48GB").
- count: Number of accelerators per node.
- cuda_min: Minimum CUDA version (e.g., "11.8").
- cuda_max: Maximum CUDA version (e.g., "13.1").
Returns: Accelerator specification with custom parameters.
Examples:
>>> Custom("My-GPU", memory="48GB")
Accelerator(name='My-GPU', memory='48GB')
>>> Custom("Experimental-TPU", memory="128GB", count=8)
Accelerator(name='Experimental-TPU', memory='128GB', count=8)
>>> Custom("H100-Custom", memory="80GB", cuda_min="12.0", cuda_max="13.1")
Accelerator(name='H100-Custom', memory='80GB', ...)
Gaudi(*, count=1)
¶
Habana Gaudi - First-gen training accelerator (2019).
Gaudi2(*, count=1)
¶
Habana Gaudi2 - Second-gen training accelerator (2022).
96GB HBM2e memory, 2x performance vs Gaudi.
Gaudi3(*, count=1)
¶
Habana Gaudi3 - Third-gen training accelerator (2024).
128GB HBM2e memory.
GTX_1070_Ti(*, count=1)
¶
NVIDIA GTX 1070 Ti - Pascal (2017).
GTX_1080_Ti(*, count=1)
¶
NVIDIA GTX 1080 Ti - Pascal flagship (2017).
GTX_1660_Super(*, count=1)
¶
NVIDIA GTX 1660 Super - Turing refresh (2019).
GTX_1660_Ti(*, count=1)
¶
NVIDIA GTX 1660 Ti - Turing without RT cores (2019).
Inferentia1(*, count=1)
¶
AWS Inferentia v1 - First-gen inference accelerator.
8GB memory, available on inf1 instances.
Inferentia2(*, count=1)
¶
AWS Inferentia v2 - Second-gen inference accelerator.
32GB HBM memory, available on inf2 instances.
Instinct_MI25(*, count=1)
¶
AMD Instinct MI25 - Vega (2017).
Quadro_P4000(*, count=1)
¶
NVIDIA Quadro P4000 - Pascal workstation (2016).
Quadro_RTX_4000(*, count=1)
¶
NVIDIA Quadro RTX 4000 - Turing workstation (2018).
Quadro_RTX_5000(*, count=1)
¶
NVIDIA Quadro RTX 5000 - Turing workstation (2018).
Quadro_RTX_6000(*, count=1)
¶
NVIDIA Quadro RTX 6000 - Turing workstation (2018).
Quadro_RTX_8000(*, count=1)
¶
NVIDIA Quadro RTX 8000 - Turing workstation (2018).
RadeonPro_V520(*, count=1)
¶
AMD Radeon Pro V520 - Streaming GPU.
RadeonPro_V710(*, count=1)
¶
AMD Radeon Pro V710 - Streaming GPU.
RTX_2060_Super(*, count=1)
¶
NVIDIA RTX 2060 Super - Turing refresh (2019).
RTX_2070_Super(*, count=1)
¶
NVIDIA RTX 2070 Super - Turing refresh (2019).
RTX_2080_Super(*, count=1)
¶
NVIDIA RTX 2080 Super - Turing refresh (2019).
RTX_2080_Ti(*, count=1)
¶
NVIDIA RTX 2080 Ti - Turing flagship (2018).
RTX_3060_Laptop(*, count=1)
¶
NVIDIA RTX 3060 Laptop - Mobile Ampere.
RTX_3060_Ti(*, count=1)
¶
NVIDIA RTX 3060 Ti - Ampere (2020).
RTX_3070_Ti(*, count=1)
¶
NVIDIA RTX 3070 Ti - Ampere (2021).
RTX_3080_Ti(*, count=1)
¶
NVIDIA RTX 3080 Ti - Ampere high-end (2021).
RTX_3090_Ti(*, count=1)
¶
NVIDIA RTX 3090 Ti - Ampere flagship (2022).
RTX_4060_Ti(*, count=1)
¶
NVIDIA RTX 4060 Ti - Ada Lovelace (2023).
RTX_4070_Super(*, count=1)
¶
NVIDIA RTX 4070 Super - Ada Lovelace refresh (2024).
RTX_4070_Ti(*, count=1)
¶
NVIDIA RTX 4070 Ti - Ada Lovelace (2023).
RTX_4070_Ti_Super(*, count=1)
¶
NVIDIA RTX 4070 Ti Super - Ada Lovelace refresh (2024).
RTX_4080_Super(*, count=1)
¶
NVIDIA RTX 4080 Super - Ada Lovelace refresh (2024).
RTX_5000_Ada(*, count=1)
¶
NVIDIA RTX 5000 Ada - Workstation (2023).
RTX_5060_Ti(*, count=1)
¶
NVIDIA RTX 5060 Ti - Blackwell consumer (2025).
RTX_5070_Ti(*, count=1)
¶
NVIDIA RTX 5070 Ti - Blackwell consumer (2025).
RTX_5880_Ada(*, count=1)
¶
NVIDIA RTX 5880 Ada - Workstation (2023).
RTX_6000_Ada(*, count=1)
¶
NVIDIA RTX 6000 Ada - Workstation flagship (2022).
RTX_A5000_Pro(*, count=1)
¶
NVIDIA RTX A5000 Pro - Ampere workstation variant (2021).
Titan_V(*, count=1)
¶
NVIDIA Titan V - Volta consumer (2017).
Titan_Xp(*, count=1)
¶
NVIDIA Titan Xp - Pascal prosumer flagship (2017).
TPUv2(*, count=1)
¶
Google TPU v2 - Second-gen tensor processing unit (2017).
TPUv2_8(*, count=1)
¶
Google TPU v2-8 - 8-chip slice.
TPUv3(*, count=1)
¶
Google TPU v3 - Third-gen TPU (2018).
TPUv3_8(*, count=1)
¶
Google TPU v3-8 - 8-chip slice.
TPUv3_32(*, count=1)
¶
Google TPU v3-32 - 32-chip slice.
TPUv4(*, count=1)
¶
Google TPU v4 - Fourth-gen TPU (2021).
TPUv4_64(*, count=1)
¶
Google TPU v4-64 - 64-chip slice.
TPUv5e(*, count=1)
¶
Google TPU v5e - Efficiency-optimized TPU (2023).
TPUv5e_4(*, count=1)
¶
Google TPU v5e-4 - 4-chip slice.
TPUv5p(*, count=1)
¶
Google TPU v5p - Performance-optimized TPU (2023).
TPUv5p_8(*, count=1)
¶
Google TPU v5p-8 - 8-chip slice.
TPUv6(*, count=1)
¶
Google TPU v6 - Sixth-gen TPU (2024).
Trainium1(*, count=1)
¶
AWS Trainium v1 - First-gen training accelerator.
32GB HBM memory, available on trn1 instances.
Trainium2(*, count=1)
¶
AWS Trainium v2 - Second-gen training accelerator.
64GB HBM memory, 4x performance vs Trainium1.
Trainium3(*, count=1)
¶
AWS Trainium v3 - Third-gen training accelerator.
128GB HBM memory.