Data Science Container

The data science container is a ready-to-use environment for machine learning, data analysis, and scientific computing. It includes a comprehensive Python stack, R, and Julia, plus GPU-accelerated libraries when the container is provisioned with a GPU.

Python data science: polars, pandas, numpy, scipy, scikit-learn, statsmodels, xgboost, lightgbm

Deep learning: PyTorch, transformers, tokenizers, accelerate

Visualization: matplotlib, seaborn, plotly, altair

GPU acceleration (when GPU attached): CUDA toolkit, RAPIDS (cuDF, cuML), PyTorch with CUDA support

Languages: Python 3.12, R, Julia

Tools: Jupyter (available but not the primary interface), pip, conda, git

  • Model training on large datasets with dedicated GPU resources
  • Feature engineering and data wrangling with polars/pandas on high-memory instances
  • Persistent development environment you can SSH into from your IDE
  • Batch processing jobs that need significant compute
  • GPU-accelerated workloads (RAPIDS for dataframe operations, PyTorch for training)

Add a GPU at creation time for hardware-accelerated workloads:

curl -X POST https://console.carolinacloud.io/api/instance/ \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"resource_type": "container",
"flavor": "datascience",
"n_vcpus": 16,
"mem_gib": 64,
"disk_size_gib": 200,
"gpu_model": "RTX 5090"
}'
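For scripted provisioning, the same request can be assembled in Python. The following is a minimal sketch using only the standard library: the endpoint, headers, and field names are copied from the curl example above, while the helper name `build_create_request` is hypothetical, and treating `gpu_model` as an optional field (omitted for CPU-only containers) is an assumption. The request is built but not sent, since the response format is not described here.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://console.carolinacloud.io/api/instance/"


def build_create_request(api_key, gpu_model=None):
    """Build (but do not send) the POST request shown in the curl example.

    Hypothetical helper. Omitting gpu_model for a CPU-only container
    is an assumption, not documented behavior.
    """
    payload = {
        "resource_type": "container",
        "flavor": "datascience",
        "n_vcpus": 16,
        "mem_gib": 64,
        "disk_size_gib": 200,
    }
    if gpu_model is not None:
        payload["gpu_model"] = gpu_model

    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_create_request("YOUR_API_KEY", gpu_model="RTX 5090")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually submit the request:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.status, resp.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it easy to inspect the payload before any billable resource is created.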

When a GPU is attached, CUDA, cuDNN, and GPU-accelerated Python libraries are available immediately. Verify with:

python3 -c "import torch; print(torch.cuda.is_available())"
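For a slightly more informative check than the one-liner, the sketch below also lists the devices PyTorch can see. It assumes nothing beyond the standard `torch.cuda` calls; the function name `describe_gpu` is hypothetical, and the `ImportError` fallback is only there so the script also runs in environments without PyTorch.

```python
def describe_gpu():
    """Return a small dict describing what PyTorch can see on this machine."""
    try:
        import torch
    except ImportError:
        # PyTorch is missing; report that instead of crashing.
        return {"torch_installed": False, "cuda_available": False, "devices": []}

    available = torch.cuda.is_available()
    devices = []
    if available:
        # One name per visible CUDA device.
        devices = [
            torch.cuda.get_device_name(i)
            for i in range(torch.cuda.device_count())
        ]
    return {"torch_installed": True, "cuda_available": available, "devices": devices}


if __name__ == "__main__":
    print(describe_gpu())
```

On a container created without a GPU, `cuda_available` should be `False` and `devices` empty.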
To create a container with the ccloud CLI:
# CPU-only
ccloud new container --cpus 8 --ram 32 --disk 100 --flavor datascience
# With GPU
ccloud new container --cpus 16 --ram 64 --disk 200 \
--flavor datascience --name ml-training

SSH only. No web interface. Connect your IDE via Remote-SSH for a full development experience.

ssh -p <port> ccloud@login.carolinacloud.io

If you want a browser-based notebook interface with the same data science stack, use the Data Science Notebook flavor instead.

The Claude Code CLI is pre-installed. Pass an Anthropic API key at creation time to pre-authenticate it for the ccloud user. See AI Integration for details.

Training and analysis containers spend most of their time idle between jobs. Enable auto-stop to have the container shut itself off once its CPU has been idle for a set timeout: compute billing halts, data persists, and you can restart on demand.