Cloud providers¶
Skyward supports thirteen providers. Twelve are cloud services — AWS, GCP, Hyperstack, JarvisLabs, Massed Compute, Novita, RunPod, Scaleway, TensorDock, Verda, VastAI, Vultr — and one is local containers for development and CI. All implement the same Provider protocol, so the orchestration layer (actor system, SSH tunnels, bootstrap, task dispatch) works identically regardless of which provider you choose. The difference is in how instances are provisioned, what hardware is available, and how authentication works.
Provider configs are lightweight frozen dataclasses. They hold configuration — region, API keys, disk sizes — but don't import any cloud SDK at module level. The SDK is loaded lazily when the pool starts, so `import skyward` stays fast regardless of which providers are installed.
Disk size¶
You can set disk size uniformly across providers using `disk_gb` on `Spec` or directly on `Compute`.
When set, `disk_gb` overrides the provider's own default. When omitted (`None`), each provider uses its built-in default (e.g., 100 GB for AWS, 200 GB for GCP, 50 GB for RunPod). Providers where disk size is determined by the instance plan (Vultr, Hyperstack) ignore this parameter.
Provider comparison¶
| Feature | AWS | GCP | Hyperstack | JarvisLabs | Massed Compute | Novita | RunPod | Scaleway | TensorDock | Verda | VastAI | Vultr | Container |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GPUs | H100, A100, T4, L4, Trainium, Inferentia | H100, A100, T4, L4, V100, H200 | A100, H100, RTX series | H200, H100, A100, A100-80GB, A6000, RTX6000Ada, L4 | H200 NVL, H100, A100, RTX PRO 6000 Blackwell, RTX A6000, L40S, L40, RTX 6000 Ada, A30 | H100, A100, RTX series (dynamic catalog) | H100, A100, A40, RTX series | L4, L40S, H100, H100 SXM, B300 | H100, A100, L40, RTX series, V100 | H100, A100, H200, GB200 | Marketplace (varies) | A16, A40, A100, L40S (cloud); H100, B200, MI300X (bare metal) | None (CPU) |
| Spot Instances | Yes (60-90% savings) | Yes (preemptible/spot) | No (on-demand only) | No (on-demand only) | Yes (17-20% savings) | Yes | Yes | No (on-demand only) | No (on-demand only) | Yes | Yes (bid-based) | No (on-demand only) | N/A |
| Regions | 20+ | 40+ zones | Canada, Norway, US | IN1, IN2 (India), EU1 (Finland) | US (Kansas City, Des Moines, Wichita, Beltsville, Omaha) | Cluster-based (dynamic) | Global (Secure + Community) | fr-par-1, fr-par-2, fr-par-3, nl-ams, pl-waw | 100+ locations, 20+ countries | FIN, ICL, ISR | Global marketplace | EWR, ORD, DFW, LAX + more | Local |
| Auth | AWS credentials | Application Default Credentials | API key | API token | API key | API key | API key | Secret key | API key + token | Client ID + Secret | API key | API key | None |
| Billing | Per-second | Per-second | Per-second | Per-minute | Per-minute | Per-hour | Per-second | Per-hour | Per-second | Per-second | Per-minute | Hourly | Free |
AWS¶
AWS uses EC2 Fleet for provisioning, with automatic spot-to-on-demand fallback. Instances are launched in a VPC with security groups managed by Skyward (or you can provide your own). SSH keys are created per cluster and cleaned up on teardown.
AMI resolution happens automatically via SSM Parameter Store — Skyward looks up the latest Ubuntu AMI for your chosen version and architecture. You can override this with a custom AMI.
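Under the hood this is a plain lookup against Canonical's public SSM parameters. A hedged sketch — the parameter path follows Canonical's published convention, but the helper name is illustrative and not part of Skyward's API:

```python
def ubuntu_ami_parameter(version: str = "24.04", arch: str = "amd64") -> str:
    """Build the public SSM parameter path for the latest Ubuntu AMI."""
    return (
        f"/aws/service/canonical/ubuntu/server/{version}"
        f"/stable/current/{arch}/hvm/ebs-gp3/ami-id"
    )


# With boto3 (not run here), the lookup would be roughly:
#   ssm = boto3.client("ssm", region_name="us-east-1")
#   ami_id = ssm.get_parameter(Name=ubuntu_ami_parameter())["Parameter"]["Value"]
print(ubuntu_ami_parameter())
```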
Setup¶
```shell
export AWS_ACCESS_KEY_ID=your_access_key
export AWS_SECRET_ACCESS_KEY=your_secret_key
export AWS_DEFAULT_REGION=us-east-1
```
Or configure credentials with the AWS CLI (`aws configure`).
Usage¶
```python
import skyward as sky

with sky.Compute(
    provider=sky.AWS(region="us-east-1"),
    accelerator=sky.accelerators.A100(),
    nodes=2,
) as compute:
    result = train(data) >> compute
```
Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| `region` | `str` | `"us-east-1"` | AWS region |
| `ami` | `str` or `None` | `None` | Custom AMI ID. Auto-resolved via SSM if not set. |
| `ubuntu_version` | `str` | `"24.04"` | Ubuntu LTS version for auto-resolved AMIs |
| `subnet_id` | `str` or `None` | `None` | VPC subnet. Uses default VPC if not set. |
| `security_group_id` | `str` or `None` | `None` | Security group. Auto-created if not set. |
| `instance_profile_arn` | `str` or `None` | `None` | IAM instance profile. Auto-created if not set. |
| `username` | `str` or `None` | `None` | SSH user. Auto-detected from AMI if not set. |
| `instance_timeout` | `int` | `300` | Safety timeout in seconds (auto-shutdown timer) |
| `request_timeout` | `int` | `30` | HTTP request timeout in seconds |
| `allocation_strategy` | `str` | `"price-capacity-optimized"` | EC2 Fleet spot allocation strategy |
| `exclude_burstable` | `bool` | `False` | Exclude burstable instances (t3, t4g) |
Required IAM permissions¶
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:RunInstances",
        "ec2:TerminateInstances",
        "ec2:DescribeInstances",
        "ec2:DescribeInstanceTypes",
        "ec2:DescribeImages",
        "ec2:CreateSecurityGroup",
        "ec2:AuthorizeSecurityGroupIngress",
        "ec2:DescribeSecurityGroups",
        "ec2:CreateKeyPair",
        "ec2:DescribeKeyPairs",
        "ec2:CreateFleet",
        "ec2:DescribeFleets",
        "ssm:GetParameter"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": "arn:aws:iam::*:role/*"
    }
  ]
}
```
GCP¶
GCP uses Compute Engine with instance templates and bulk_insert for fleet-style provisioning. Skyward resolves the best machine type dynamically — for GPUs like T4 and V100, it uses N1 machines with guest accelerators; for A100 and H100, it picks the matching A2/A3 machine family with built-in GPUs. Spot instances use the SPOT provisioning model with automatic deletion on preemption.
SSH keys are injected via instance metadata. The project is auto-detected from Application Default Credentials or GOOGLE_CLOUD_PROJECT. GCP API calls use sync clients dispatched to a dedicated thread pool (configurable via thread_pool_size).
Skyward creates an instance template and a firewall rule per cluster, both cleaned up on teardown. Instances use Google's Deep Learning VM images (CUDA 12.x, NVIDIA 570 drivers) for GPU workloads.
Setup¶
Authenticate with Application Default Credentials (`gcloud auth application-default login`), or set the project explicitly via the `GOOGLE_CLOUD_PROJECT` environment variable.
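For example (placeholder project ID; authentication itself still goes through Application Default Credentials):

```shell
# gcloud auth application-default login
export GOOGLE_CLOUD_PROJECT=my-project-id
```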
GPU Quotas
Listing available accelerator types does not mean you have quota. Check your quotas with:
```shell
gcloud compute regions describe <region> --format="table(quotas.metric,quotas.limit,quotas.usage)" | grep GPU
```
Usage¶
```python
import skyward as sky

with sky.Compute(
    provider=sky.GCP(zone="us-central1-a"),
    accelerator=sky.accelerators.T4(),
    nodes=2,
) as compute:
    result = train(data) >> compute
```
Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| `project` | `str` or `None` | `None` | GCP project ID. Auto-detected from ADC or `GOOGLE_CLOUD_PROJECT`. |
| `zone` | `str` | `"us-central1-a"` | Compute Engine zone |
| `network` | `str` | `"default"` | VPC network name |
| `subnet` | `str` or `None` | `None` | Specific subnet. Uses auto-mode subnet if not set. |
| `disk_size_gb` | `int` | `200` | Boot disk size in GB |
| `disk_type` | `str` | `"pd-balanced"` | Boot disk type |
| `instance_timeout` | `int` | `300` | Safety timeout in seconds (self-destruction timer) |
| `service_account` | `str` or `None` | `None` | GCE service account email |
| `thread_pool_size` | `int` | `8` | Thread pool size for blocking GCP API calls |
Required permissions¶
The authenticated principal needs the following roles (or equivalent permissions):
- `compute.instances.create`, `compute.instances.delete`, `compute.instances.list`, `compute.instances.get`
- `compute.instanceTemplates.create`, `compute.instanceTemplates.delete`
- `compute.firewalls.create`, `compute.firewalls.delete`, `compute.firewalls.get`
- `compute.machineTypes.list`, `compute.acceleratorTypes.list`
- `compute.images.getFromFamily`
The simplest approach is the Compute Admin role (roles/compute.admin).
Install¶
Novita¶
Novita.ai is a GPU cloud where instances are Docker containers with configurable GPU count and root filesystem size. SSH access is provided through Novita's proxy — no openssh-server or key injection is needed inside the container. Skyward reads the SSH connection details from the instance metadata and connects through the proxy automatically.
Novita resolves CUDA compatibility dynamically. When provisioning, Skyward queries the instance's maximum supported CUDA version and tries descending versions until it finds a host with availability. If you provide a custom Docker image, it's used as-is.
Setup¶
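If `api_key` isn't passed to `sky.Novita()`, the provider falls back to the `NOVITA_API_KEY` environment variable (placeholder value shown):

```shell
export NOVITA_API_KEY=your_api_key
```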
Usage¶
```python
import skyward as sky

with sky.Compute(
    provider=sky.Novita(),
    accelerator=sky.accelerators.A100(),
    nodes=2,
) as compute:
    result = train(data) >> compute
```
Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| `api_key` | `str` or `None` | `None` | API key. Falls back to `NOVITA_API_KEY` env var. |
| `cluster_id` | `str` or `None` | `None` | Target cluster/region ID. `None` for auto-selection. |
| `rootfs_size` | `int` | `50` | Root filesystem size in GB. |
| `docker_image` | `str` or `None` | `None` | Base Docker image. Defaults to `nvcr.io/nvidia/cuda:12.9.1-runtime-ubuntu24.04`. |
| `min_cuda_version` | `str` or `None` | `None` | Minimum CUDA version requirement (e.g., `"12.4"`). |
| `request_timeout` | `int` | `30` | HTTP request timeout in seconds. |
Novita also provides a helper for building NVIDIA CUDA base images:
```python
image_name = sky.Novita.ubuntu(version="24.04", cuda="12.9.1")
# → "nvcr.io/nvidia/cuda:12.9.1-runtime-ubuntu24.04"
```
RunPod¶
RunPod offers GPU pods in two tiers: Secure Cloud (enterprise-grade, dedicated hardware) and Community Cloud (lower-cost, peer-hosted). Skyward provisions pods via RunPod's GraphQL API, configures SSH access, and manages the full lifecycle.
SSH keys are auto-detected from ~/.ssh/id_ed25519.pub or ~/.ssh/id_rsa.pub and registered on your RunPod account.
Setup¶
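Set your API key via `RUNPOD_API_KEY`, the env var the provider falls back to (placeholder value):

```shell
export RUNPOD_API_KEY=your_api_key
```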
Usage¶
```python
import skyward as sky

with sky.Compute(
    provider=sky.RunPod(),
    accelerator=sky.accelerators.A100(),
    nodes=2,
) as compute:
    result = train(data) >> compute
```
Standalone mode¶
RunPod's individual pods don't share a private network — each pod gets its own IP, but pods can't reach each other directly. For multi-node workloads that don't require inter-node communication (hyperparameter sweeps, batch inference), use standalone mode:
```python
with sky.Compute(
    provider=sky.RunPod(),
    accelerator=sky.accelerators.A100(),
    nodes=4,
    options=sky.Options(cluster=False),
) as compute:
    results = sky.gather(*tasks) >> compute
```
This disables Casty cluster formation and connects to each worker independently via SSH. Distributed collections and distributed training are not available in this mode. See the Standalone Workers guide for details.
Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| `cluster_mode` | `ClusterMode` | `"individual"` | Cluster mode (`"instant"` or `"individual"`) |
| `global_networking` | `bool` or `None` | `None` | Enable global networking |
| `api_key` | `str` or `None` | `None` | API key (falls back to `RUNPOD_API_KEY` env var) |
| `cloud_type` | `Literal["community", "secure"]` | `"secure"` | Cloud type: `"secure"` or `"community"` |
| `ubuntu` | `str` | `"newest"` | Ubuntu version (`"20.04"`, `"22.04"`, `"24.04"`, `"newest"`) |
| `container_disk_gb` | `int` | `50` | Container disk size in GB |
| `volume_gb` | `int` | `20` | Persistent volume size in GB |
| `volume_mount_path` | `str` | `"/workspace"` | Volume mount path |
| `data_center_ids` | `tuple` or `"global"` | `"global"` | Preferred data centers or `"global"` for auto-selection |
| `ports` | `tuple[str, ...]` | `("22/tcp",)` | Port mappings |
| `provision_timeout` | `float` | `300.0` | Instance provision timeout in seconds |
| `bootstrap_timeout` | `float` | `600.0` | Bootstrap timeout in seconds |
| `instance_timeout` | `int` | `300` | Auto-shutdown safety timeout in seconds |
| `request_timeout` | `int` | `30` | HTTP request timeout in seconds |
| `cpu_clock` | `str` | `"3c"` | CPU clock tier (`"3c"` or `"5c"`) |
| `bid_multiplier` | `float` | `1` | Multiplier for spot bid price |
| `registry_auth` | `str` or `None` | `"docker hub"` | Container registry credential name. `None` to skip. |
Verda¶
Verda is a GPU cloud with data centers in Europe and the Middle East. It uses OAuth2 authentication with a client ID and secret — not a single API key.
SSH keys are auto-detected and registered on Verda if needed. The default region is "FIN-01". The provider also supports auto-region discovery: if the requested GPU isn't available in the configured region, Skyward finds another region with availability.
Setup¶
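Both OAuth2 credentials can come from the environment, via the `VERDA_CLIENT_ID` and `VERDA_CLIENT_SECRET` fallbacks (placeholder values):

```shell
export VERDA_CLIENT_ID=your_client_id
export VERDA_CLIENT_SECRET=your_client_secret
```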
Usage¶
```python
import skyward as sky

with sky.Compute(
    provider=sky.Verda(),
    accelerator=sky.accelerators.H100(),
    nodes=4,
) as compute:
    results = train() @ compute
```
Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| `region` | `str` | `"FIN-01"` | Preferred region |
| `client_id` | `str` or `None` | `None` | OAuth2 client ID (falls back to `VERDA_CLIENT_ID`) |
| `client_secret` | `str` or `None` | `None` | OAuth2 client secret (falls back to `VERDA_CLIENT_SECRET`) |
| `ssh_key_id` | `str` or `None` | `None` | Specific SSH key ID to use |
| `instance_timeout` | `int` | `300` | Safety timeout in seconds |
| `request_timeout` | `int` | `30` | HTTP request timeout in seconds |
Available regions¶
| Region | Location | GPUs |
|---|---|---|
| `FIN-01` | Finland | H100, A100, H200, GB200 |
| `ICL-01` | Iceland | H100, A100 |
| `ISR-01` | Israel | H100, A100 |
VastAI¶
VastAI is a GPU marketplace — instances are Docker containers running on hosts from independent providers worldwide. Pricing is dynamic, and reliability varies by host. Skyward filters offers by reliability score, CUDA version, and optional geolocation, then provisions containers via the VastAI API.
SSH keys are auto-detected from ~/.ssh/id_ed25519.pub or ~/.ssh/id_rsa.pub and registered on VastAI if needed. For multi-node clusters, VastAI supports overlay networks for NCCL communication between instances.
Setup¶
Get your API key at: https://cloud.vast.ai/account/
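Then export it as `VAST_API_KEY`, the env var the provider falls back to (placeholder value):

```shell
export VAST_API_KEY=your_api_key
```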
Usage¶
```python
import skyward as sky

with sky.Compute(
    provider=sky.VastAI(geolocation="US"),
    accelerator=sky.accelerators.RTX_4090(),
    nodes=2,
) as compute:
    result = train(data) >> compute
```
Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| `api_key` | `str` or `None` | `None` | API key (falls back to `VAST_API_KEY`) |
| `min_reliability` | `float` | `0.95` | Minimum host reliability score (0.0-1.0) |
| `verified_only` | `bool` | `True` | Only select offers from verified hosts |
| `min_cuda` | `float` | `12.0` | Minimum CUDA version |
| `geolocation` | `str` or `None` | `None` | Filter by region/country (e.g., `"US"`, `"EU"`) |
| `bid_multiplier` | `float` | `1.2` | Multiplier for spot bid price |
| `instance_timeout` | `int` | `300` | Auto-shutdown safety timeout in seconds |
| `request_timeout` | `int` | `30` | HTTP request timeout in seconds |
| `docker_image` | `str` or `None` | `None` | Base Docker image for containers |
| `disk_gb` | `int` | `100` | Disk space in GB |
| `use_overlay` | `bool` | `True` | Enable overlay networking for multi-node clusters |
| `overlay_timeout` | `int` | `120` | Timeout for overlay operations in seconds |
| `require_direct_port` | `bool` | `False` | Only select offers with direct port access |
VastAI also provides a helper for building NVIDIA CUDA base images:
```python
image_name = sky.VastAI.ubuntu(version="24.04", cuda="12.9.1")
# → "nvcr.io/nvidia/cuda:12.9.1-runtime-ubuntu24.04"
```
Hyperstack¶
Hyperstack provides bare-metal GPU instances via their InfraHub API. Resources are organized into environments that group VMs, keypairs, and volumes within a region. Environments are created per cluster and cascade-deleted on teardown. All instances are on-demand — no spot pricing.
Setup¶
Get your API key at the Hyperstack Console.
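Then export it as `HYPERSTACK_API_KEY`, the env var the provider falls back to (placeholder value):

```shell
export HYPERSTACK_API_KEY=your_api_key
```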
Usage¶
```python
import skyward as sky

with sky.Compute(
    provider=sky.Hyperstack(region="CANADA-1"),
    accelerator=sky.accelerators.A100(),
    nodes=2,
) as compute:
    result = train(data) >> compute
```
Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| `api_key` | `str` or `None` | `None` | API key (falls back to `HYPERSTACK_API_KEY` env var) |
| `region` | `str` or `tuple` or `None` | `None` | Deployment region(s). A single string (e.g. `"CANADA-1"`), a tuple (e.g. `("CANADA-1", "NORWAY-1")`), or `None` to search all regions. |
| `image` | `str` or `None` | `None` | OS image name override. Auto-selects newest Ubuntu + CUDA image if not set. |
| `network_optimised` | `bool` | `False` | Require network-optimised environments with SR-IOV support |
| `network_optimised_regions` | `tuple[str, ...]` | `("CANADA-1", "US-1")` | Regions known to support network-optimised environments |
| `object_storage_region` | `str` | `"CANADA-1"` | Region for S3-compatible object storage (volume mounts) |
| `object_storage_endpoint` | `str` | `"https://ca1.obj.nexgencloud.io"` | Endpoint URL for S3-compatible object storage |
| `instance_timeout` | `int` | `300` | Auto-shutdown safety timeout in seconds |
| `request_timeout` | `int` | `30` | HTTP request timeout in seconds |
| `teardown_timeout` | `int` | `120` | Timeout for teardown operations in seconds |
| `teardown_poll_interval` | `float` | `2.0` | Poll interval during teardown in seconds |
Available regions¶
| Region | Location |
|---|---|
| `CANADA-1` | Canada |
| `NORWAY-1` | Norway |
| `US-1` | United States |
TensorDock¶
TensorDock is a GPU marketplace with bare-metal VMs across 100+ locations in 20+ countries. Per-second billing, on-demand only (no spot). Skyward queries available hostnodes, selects the cheapest matching your GPU requirements, and deploys VMs with cloud-init for SSH key injection.
SSH keys are injected per-instance via cloud-init (TensorDock has no SSH key registration API). The SSH user is user (not root). Port forwarding maps internal ports to random external ports — Skyward handles this automatically.
Setup¶
Get your credentials at: https://console.tensordock.com/api
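Then export both credentials via the env vars the provider reads (placeholder values):

```shell
export TENSORDOCK_API_KEY=your_api_key
export TENSORDOCK_API_TOKEN=your_api_token
```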
Usage¶
```python
import skyward as sky

with sky.Compute(
    provider=sky.TensorDock(location="us"),
    accelerator=sky.accelerators.RTX_4090(),
    nodes=2,
) as compute:
    result = train(data) >> compute
```
Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| `api_key` | `str` or `None` | `None` | API key (falls back to `TENSORDOCK_API_KEY`) |
| `api_token` | `str` or `None` | `None` | API token (falls back to `TENSORDOCK_API_TOKEN`) |
| `location` | `str` or `None` | `None` | Country filter (e.g., `"United States"`, `"Germany"`). Global if not set. |
| `tier` | `int` or `None` | `None` | Hostnode tier (0-4). `None` for any tier. |
| `storage_gb` | `int` | `100` | Disk storage per VM in GB |
| `operating_system` | `str` | `"ubuntu2404"` | OS image ID (e.g., `"ubuntu2404"`, `"ubuntu2204"`) |
| `instance_timeout` | `int` | `300` | Auto-shutdown safety timeout in seconds |
| `request_timeout` | `int` | `120` | HTTP request timeout in seconds |
| `min_ram_gb` | `int` or `None` | `None` | Minimum RAM per VM in GB |
| `min_vcpus` | `int` or `None` | `None` | Minimum vCPUs per VM |
Port forwarding
TensorDock maps internal ports to random external ports. SSH is never on port 22 externally. Skyward reads the port mapping from the deploy response and configures SSH tunnels accordingly — no manual port configuration needed.
Vultr¶
Vultr offers GPU instances in two modes: Cloud GPU (virtual instances with vGPU/passthrough, faster provisioning, fractional GPU support) and Bare Metal (dedicated physical servers with no virtualization overhead). Cloud GPU is the default.
Cloud GPU supports NVIDIA A16, A40, A100, and L40S. Bare Metal adds H100, GH200, HGX B200, and AMD MI300X/MI355X. All instances are billed hourly.
Setup¶
Generate an API key from the Vultr customer portal.
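Then export it as `VULTR_API_KEY`, the env var the provider falls back to (placeholder value):

```shell
export VULTR_API_KEY=your_api_key
```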
Usage¶
```python
import skyward as sky

# Cloud GPU (default)
with sky.Compute(
    provider=sky.Vultr(region="ewr"),
    accelerator=sky.accelerators.A100(),
    nodes=2,
) as compute:
    result = train(data) >> compute

# Bare Metal
with sky.Compute(
    provider=sky.Vultr(mode="bare-metal", region="ewr"),
    accelerator=sky.accelerators.H100(),
    nodes=2,
) as compute:
    result = train(data) >> compute
```
Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| `api_key` | `str` or `None` | `None` | API key. Falls back to `VULTR_API_KEY` env var. |
| `mode` | `"cloud"` or `"bare-metal"` | `"cloud"` | Cloud GPU (virtual) or bare metal (dedicated). |
| `region` | `str` | `"ewr"` | Vultr region ID (e.g., `"ewr"`, `"ord"`, `"dfw"`). |
| `os_id` | `int` | `2284` | OS image ID. Default is Ubuntu 24.04. |
| `instance_timeout` | `int` | `300` | Safety timeout in seconds. |
| `request_timeout` | `int` | `30` | HTTP request timeout in seconds. |
Scaleway¶
Scaleway provides GPU instances in European data centers (Paris, Amsterdam, Warsaw). GPU instances range from L4 (24 GB) to H100 SXM (80 GB) and B300 (288 GB). Pricing is per-hour, on-demand only (no spot).
Setup¶
```shell
export SCW_SECRET_KEY=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
export SCW_DEFAULT_PROJECT_ID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
```
Usage¶
```python
import skyward as sky

with sky.Compute(
    provider=sky.Scaleway(zone="fr-par-2"),
    accelerator=sky.accelerators.H100(),
    nodes=2,
) as compute:
    result = train(data) >> compute
```
Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| `secret_key` | `str` or `None` | `None` | API secret key. Falls back to `SCW_SECRET_KEY` env var. |
| `project_id` | `str` or `None` | `None` | Project ID. Falls back to `SCW_DEFAULT_PROJECT_ID` env var. |
| `zone` | `str` or `None` | `None` | Availability zone. `None` searches all GPU zones automatically. |
| `image` | `str` or `None` | `None` | OS image UUID override. Auto-selects Ubuntu GPU image when `None`. |
| `instance_timeout` | `int` | `300` | Auto-shutdown safety timeout in seconds. |
| `request_timeout` | `int` | `30` | HTTP request timeout in seconds. |
Jarvis Labs¶
Jarvis Labs is a GPU cloud platform offering instances in India (IN1, IN2) and Europe/Finland (EU1). Per-minute billing with a prepaid wallet model. SSH keys are auto-registered by Skyward via the SDK. The provider uses the jarvislabs Python SDK (sync calls dispatched to a thread pool).
The EU1 region supports only H100 and H200 GPUs, in 1x or 8x configurations, and requires a minimum of 100 GB storage.
Setup¶
Get your token from jarvislabs.ai/settings/api-keys.
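Then export it as `JL_API_KEY`, the env var the provider falls back to (placeholder value):

```shell
export JL_API_KEY=your_api_token
```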
Install the `jarvislabs` Python SDK.
Usage¶
```python
import skyward as sky

with sky.Compute(
    provider=sky.JarvisLabs(region="IN2"),
    accelerator=sky.accelerators.L4(),
    nodes=2,
) as compute:
    result = train(data) >> compute
```
Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| `api_key` | `str` or `None` | `None` | API token. Falls back to `JL_API_KEY` env var. |
| `region` | `str` or `None` | `None` | Region: IN1, IN2, EU1. Auto-selects if not set. |
| `template` | `str` | `"pytorch"` | Framework template: pytorch, tensorflow, jax, vm. |
| `storage_gb` | `int` | `50` | Disk storage in GB. Minimum 100 for EU1/VM. |
| `instance_timeout` | `int` | `300` | Auto-shutdown safety timer in seconds. |
| `thread_pool_size` | `int` | `8` | Max threads for SDK calls. |
GPU availability¶
| GPU | VRAM | Price/hr | Regions |
|---|---|---|---|
| H200 SXM | 141 GB | $3.80 | EU1 |
| H100 SXM | 80 GB | $2.99 | EU1 |
| A100-80GB | 80 GB | $1.49 | IN2 |
| A100 | 40 GB | $1.29 | IN1, IN2 |
| RTX 6000 Ada | 48 GB | $0.99 | IN1 |
| A6000 | 48 GB | $0.79 | IN1 |
| L4 | 24 GB | $0.44 | IN2 |
Massed Compute¶
Massed Compute is a bare-metal GPU cloud with data centers across the US. Instances run Ubuntu with NVIDIA drivers pre-installed, SSH access via key or password, and all ports open by default (no firewall configuration needed). SSH keys are auto-registered and cleaned up by Skyward.
Spot instances are available on select GPU types (H100, A6000, L40, H200 NVL) at 17-20% discount. Region is auto-placed — Massed Compute assigns the best available data center.
Setup¶
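Set your API key via `MASSED_API_KEY`, the env var the provider falls back to (placeholder value):

```shell
export MASSED_API_KEY=your_api_key
```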
Usage¶
```python
import skyward as sky

with sky.Compute(
    provider=sky.MassedCompute(),
    accelerator=sky.accelerators.RTX_A6000(),
    nodes=2,
) as compute:
    result = train(data) >> compute
```
Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| `api_key` | `str` or `None` | `None` | API key. Falls back to `MASSED_API_KEY` env var. |
| `image_id` | `int` | `184` | OS image ID. 184 = Ubuntu 24.04, 84 = Ubuntu 22.04 w/ drivers. |
| `request_timeout` | `int` | `30` | HTTP request timeout in seconds. |
GPU availability¶
| GPU | VRAM | Price/hr | Spot Price/hr |
|---|---|---|---|
| RTX PRO 6000 Blackwell | 96 GB | $1.74 | — |
| H200 NVL | 141 GB | $2.83 | — |
| H100 | 80 GB | $2.40 | $1.98 |
| H100 NVL | 94 GB | $3.11 | — |
| A100 SXM4 | 80 GB | $1.28 | — |
| DGX A100 | 80 GB | $1.28 | — |
| L40S | 48 GB | $0.88 | — |
| L40 | 48 GB | $0.84 | $0.67 |
| RTX 6000 Ada | 48 GB | $0.79 | — |
| RTX A6000 | 48 GB | $0.57 | $0.45 |
| RTX A5000 | 24 GB | $0.44 | — |
| A30 | 24 GB | $0.35 | — |
Prices are per GPU. Multi-GPU configurations (2x, 4x, 8x) scale linearly.
Container¶
The Container provider runs compute nodes as local containers — Docker, podman, nerdctl, or Apple's container CLI. No cloud credentials, no costs. Useful for development, CI testing, and validating your code before deploying to real hardware.
Containers are launched with SSH access, joined to a shared network, and bootstrapped the same way cloud instances are. From the pool's perspective, they look like any other nodes.
Usage¶
```python
import skyward as sky

with sky.Compute(
    provider=sky.Container(),
    nodes=2,
    image=sky.Image(pip=["numpy"]),
) as compute:
    result = train(data) >> compute
```
Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| `image` | `str` | `"ghcr.io/gabfssilva/skyward:py{python_version}"` | Docker image (Python version auto-detected) |
| `ssh_user` | `str` | `"root"` | SSH user inside the container |
| `binary` | `str` | `"docker"` | Container runtime (`"docker"`, `"podman"`, `"nerdctl"`) |
| `container_prefix` | `str` or `None` | `None` | Prefix for container names |
| `network` | `str` or `None` | `None` | Docker network name. Auto-created if not set. |
Choosing a provider¶
AWS — When you need specific hardware (H100, Trainium, Inferentia), spot instance savings, or enterprise reliability. Best if you're already in the AWS ecosystem.
GCP — Deep integration with Google Cloud. Deep Learning VM images with pre-installed CUDA drivers, dynamic machine type resolution, fleet-style provisioning via bulk_insert. Supports T4, L4, V100, A100, H100, H200.
Novita — Docker-based GPU cloud with automatic CUDA version resolution. SSH through Novita's proxy — no key injection or openssh-server setup. Spot instances available.
RunPod — Fast provisioning, competitive pricing, minimal setup. Both Secure Cloud (dedicated) and Community Cloud (cheaper) tiers. Good for A100/H100/RTX workloads.
Hyperstack — Bare-metal GPU cloud with environment-scoped resource management. On-demand only, regions in Canada, Norway, and US.
JarvisLabs — GPU cloud with data centers in India and Finland. Per-minute billing with a prepaid wallet model. Good for A100, H100, H200 workloads. On-demand only.
Massed Compute — Bare-metal GPU cloud across US data centers. Wide GPU range from A30 to H200 NVL and RTX PRO 6000 Blackwell. Spot instances on H100, A6000, L40. Per-minute billing, auto-placed regions, all ports open by default.
Verda — European data residency (Finland, Iceland, Israel). H100/A100/H200/GB200 availability with automatic region selection.
TensorDock — Bare-metal VMs across 100+ locations with per-second billing. Good for RTX 4090, A100, H100 workloads without spot complexity. On-demand only.
VastAI — Maximum cost savings through marketplace pricing. Consumer GPUs (RTX 4090, 3090) available alongside datacenter hardware. Overlay networks for multi-node training.
Scaleway — European GPU cloud with instances in Paris, Amsterdam, and Warsaw. L4 through H100 SXM and B300. On-demand only, per-hour billing. Good for EU data residency requirements.
Vultr — Two modes in one provider: Cloud GPU for fast virtual instances with fractional GPU support, and Bare Metal for dedicated servers with H100, B200, and AMD MI300X/MI355X. Hourly billing, simple API key auth.
Container — Local development and CI. Zero cost, instant provisioning. Validates your code end-to-end before deploying to a real provider.
Common issues¶
GCP: "No GCP accelerator matches"¶
- Check available accelerators in your zone: `gcloud compute accelerator-types list --filter="zone:us-central1-a"`
- Try a different zone — GPU availability varies by zone
- Request GPU quota increases in the Cloud Console
GCP: "Quota exceeded"¶
- Check current quotas: `gcloud compute regions describe <region> | grep -A2 GPU`
- Request increases for the specific GPU type (e.g., `NVIDIA_T4_GPUS`, `NVIDIA_L4_GPUS`)
- Both on-demand and preemptible quotas are separate — check both
AWS: "No instances available"¶
- Try a different region
- Use `allocation="spot-if-available"` (the default) to fall back to on-demand
- Request a service quota increase in the AWS console
Verda: "Region not available"¶
- The default region is `"FIN-01"` — try a different one or let auto-discovery find capacity
- Check your account's region access
TensorDock: "No hostnodes available"¶
- Try a different location or remove the `location` filter
- Try a different GPU type — hostnode availability is dynamic
- Check marketplace availability at https://marketplace.tensordock.com
VastAI: "No offers available"¶
- Lower `min_reliability` (e.g., 0.8)
- Expand or remove the `geolocation` filter
- Try a different accelerator type
- Check marketplace availability at https://cloud.vast.ai/
Related topics¶
- Getting Started — Installation and credentials setup
- Accelerators — Accelerator selection guide
- API Reference — Complete API documentation