Data Science Container
The data science container is a ready-to-use environment for machine learning, data analysis, and scientific computing. It includes a comprehensive Python stack, R, Julia, and GPU-accelerated libraries when provisioned with a GPU.
Pre-installed packages
Python data science: polars, pandas, numpy, scipy, scikit-learn, statsmodels, xgboost, lightgbm
Deep learning: PyTorch, transformers, tokenizers, accelerate
Visualization: matplotlib, seaborn, plotly, altair
GPU acceleration (when GPU attached): CUDA toolkit, RAPIDS (cuDF, cuML), PyTorch with CUDA support
Languages: Python 3.12, R, Julia
Tools: Jupyter (available but not the primary interface), pip, conda, git
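To confirm that the listed Python packages are present in a given container, a short check along these lines can help. This is a minimal sketch: the module list uses the standard import names for the packages above (scikit-learn imports as `sklearn`, PyTorch as `torch`), and the helper name `find_missing` is illustrative, not part of any platform tooling.

```python
import importlib.util

# Import names for the Python packages listed above.
EXPECTED_MODULES = [
    "polars", "pandas", "numpy", "scipy", "sklearn", "statsmodels",
    "xgboost", "lightgbm", "torch", "transformers", "tokenizers",
    "accelerate", "matplotlib", "seaborn", "plotly", "altair",
]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(EXPECTED_MODULES)
    if missing:
        print("missing: " + ", ".join(missing))
    else:
        print("all listed packages found")
```

The check only locates each module without importing it, so it stays fast even for heavy libraries such as PyTorch.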
When to use it
- Model training on large datasets with dedicated GPU resources
- Feature engineering and data wrangling with polars/pandas on high-memory instances
- Persistent development environment you can SSH into from your IDE
- Batch processing jobs that need significant compute
- GPU-accelerated workloads (RAPIDS for dataframe operations, PyTorch for training)
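Because the same flavor runs with or without a GPU, training scripts may want to choose a device at run time. A minimal sketch, assuming PyTorch is the framework in use; the helper name `pick_device` is illustrative:

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment; fall back to CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```

The returned string can be passed to `torch.device(...)` or to a tensor's `.to(...)` call, so the same script works on CPU-only and GPU-backed containers.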
GPU support
Add a GPU at creation time for hardware-accelerated workloads:
curl -X POST https://console.carolinacloud.io/api/instance/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "resource_type": "container",
    "flavor": "datascience",
    "n_vcpus": 16,
    "mem_gib": 64,
    "disk_size_gib": 200,
    "gpu_model": "RTX 5090"
  }'
When a GPU is attached, CUDA, cuDNN, and GPU-accelerated Python libraries are available immediately. Verify with:
python3 -c "import torch; print(torch.cuda.is_available())"
Creating a data science container
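The creation request shown under GPU support can also be built from Python. The following is a minimal sketch using only the standard library. The URL, headers, and payload fields mirror the curl example; the function name `build_create_request` is illustrative, and treating an omitted `gpu_model` as a CPU-only request is an assumption. The response format is not documented here, so the sketch stops at building the request.

```python
import json
import urllib.request

API_URL = "https://console.carolinacloud.io/api/instance/"


def build_create_request(api_key, gpu_model=None):
    """Build (but do not send) the POST request from the curl example."""
    payload = {
        "resource_type": "container",
        "flavor": "datascience",
        "n_vcpus": 16,
        "mem_gib": 64,
        "disk_size_gib": 200,
    }
    if gpu_model is not None:
        # Assumption: leaving this field out requests a CPU-only container.
        payload["gpu_model"] = gpu_model
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_create_request("YOUR_API_KEY", gpu_model="RTX 5090")
# To send it, pass `req` to urllib.request.urlopen(req) with a real API key.
```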
# CPU-only
ccloud new container --cpus 8 --ram 32 --disk 100 --flavor datascience
# With GPU
ccloud new container --cpus 16 --ram 64 --disk 200 \
  --flavor datascience --name ml-training
Access
SSH only; there is no web interface. Connect your IDE via Remote-SSH for a full development experience.
ssh -p <port> ccloud@login.carolinacloud.io
If you want a browser-based notebook interface with the same data science stack, use the Data Science Notebook flavor instead.
AI integration
The Claude Code CLI is pre-installed. Pass an Anthropic API key at creation time to pre-authenticate it for the ccloud user. See AI Integration for details.
Auto-stop on idle
Training and analysis containers spend most of their time idle between jobs. Enable auto-stop to have the container shut itself off once its CPU has been idle for a set timeout. Compute billing halts, data persists, and you can restart on demand.
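The idea behind an idle-CPU timeout can be illustrated with a small sketch. This is not the platform's actual mechanism: the function name `is_idle`, the 5% threshold, and the sample window are all hypothetical values chosen for illustration.

```python
def is_idle(cpu_samples, threshold_pct=5.0, window=3):
    """Return True when the last `window` CPU samples are all below the threshold.

    cpu_samples: CPU utilization percentages, oldest first (hypothetical input).
    """
    if len(cpu_samples) < window:
        # Not enough history yet to call the container idle.
        return False
    return all(sample < threshold_pct for sample in cpu_samples[-window:])


# A job finishes, then utilization stays low for three consecutive samples.
print(is_idle([92.0, 88.5, 3.1, 1.2, 0.8]))
```

Requiring several consecutive low samples, rather than a single one, keeps a brief pause between training steps from being mistaken for idleness.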