Status: Living Document | Last Updated: 2025-11-30
Born ML Framework follows a production-first philosophy where models are "born" ready for deployment, not as an afterthought.
```go
// ✅ Born: No CGO, no external dependencies
import "github.com/born-ml/born/tensor"

// ❌ Others: CGO dependencies
// (OpenXLA, CUDA libraries, Python runtime)
```

Why it matters:
- Trivial cross-compilation: GOOS/GOARCH just works
- Single binary deployment: No containers required
- Fast cold start: < 100ms startup time
- Small memory footprint: Ideal for edge devices
```go
// Compile-time guarantees via generics
type Tensor[T DType, B Backend] struct {
	// ...
}

// Invalid operations are caught at compile time, not at runtime!
```

Advantages:
- Entire class of bugs eliminated before runtime
- Better IDE autocomplete and refactoring
- Self-documenting APIs
- Modern Go 1.25+ idioms
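The compile-time guarantee can be sketched with a minimal generic tensor. The `DType` constraint and `Add` function below are illustrative stand-ins, not Born's actual API:

```go
package main

import "fmt"

// DType constrains tensor element types (illustrative, not Born's real constraint).
type DType interface {
	~float32 | ~float64 | ~int32
}

// Tensor carries its element type in the type parameter.
type Tensor[T DType] struct {
	data  []T
	shape []int
}

// Add only compiles when both operands share the same element type T.
func Add[T DType](a, b Tensor[T]) Tensor[T] {
	out := Tensor[T]{data: make([]T, len(a.data)), shape: a.shape}
	for i := range a.data {
		out.data[i] = a.data[i] + b.data[i]
	}
	return out
}

func main() {
	a := Tensor[float32]{data: []float32{1, 2}, shape: []int{2}}
	b := Tensor[float32]{data: []float32{3, 4}, shape: []int{2}}
	// Mixing Tensor[float32] with Tensor[int32] here would be a compile error.
	fmt.Println(Add(a, b).data) // → [4 6]
}
```

A dtype mismatch that PyTorch would surface as a runtime exception simply does not type-check here.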
Inspired by Burn (Rust), Born uses decorator composition:
```go
base := cpu.New()                     // Base backend
withAutodiff := autodiff.New(base)    // Add autodiff capability
optimized := fusion.New(withAutodiff) // Add kernel fusion
```

Benefits:
- Swappable backends (CPU, CUDA, Vulkan, WebGPU)
- Layered functionality (autodiff, fusion, quantization)
- Testable components
- Flexible architecture
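The decorator idea reduces to wrapping one `Backend` interface value in another. This is a minimal sketch with made-up names (`cpuBackend`, `autodiffBackend`), not Born's real backend API:

```go
package main

import "fmt"

// Backend is the minimal interface every layer implements.
type Backend interface {
	MatMul(a, b [][]float64) [][]float64
	Name() string
}

// cpuBackend is the base implementation.
type cpuBackend struct{}

func (cpuBackend) Name() string { return "cpu" }

func (cpuBackend) MatMul(a, b [][]float64) [][]float64 {
	n, k, m := len(a), len(b), len(b[0])
	out := make([][]float64, n)
	for i := range out {
		out[i] = make([]float64, m)
		for p := 0; p < k; p++ {
			for j := 0; j < m; j++ {
				out[i][j] += a[i][p] * b[p][j]
			}
		}
	}
	return out
}

// autodiffBackend decorates another Backend, recording ops for a backward pass.
type autodiffBackend struct {
	inner Backend
	tape  []string
}

func (d *autodiffBackend) Name() string { return "autodiff(" + d.inner.Name() + ")" }

func (d *autodiffBackend) MatMul(a, b [][]float64) [][]float64 {
	d.tape = append(d.tape, "matmul") // record the op, then delegate
	return d.inner.MatMul(a, b)
}

func main() {
	be := &autodiffBackend{inner: cpuBackend{}}
	out := be.MatMul([][]float64{{1, 2}}, [][]float64{{3}, {4}})
	fmt.Println(be.Name(), out, be.tape) // → autodiff(cpu) [[11]] [matmul]
}
```

Because each layer only depends on the `Backend` interface, swapping `cpuBackend` for a GPU backend, or stacking a fusion layer on top, requires no changes to the layers above.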
Traditional ML workflow:

```
Research (Python) → Rewrite (Go/C++) → Production
                  ↑ Lost details, bugs introduced
```

Born workflow:

```
Research (Go) → Production (Go)
              ↑ Same codebase, same behavior!
```
Use cases:
- ✅ Go microservices + ML inference
- ✅ Edge deployment (IoT, embedded)
- ✅ Cloud-native ML serving (Kubernetes)
- ✅ ML Systems research (distributed learning, federated ML)
- ✅ Integration with Go ecosystem
Python problems for production:
- 🐌 Slow startup (import torch takes seconds)
- 📦 Dependency hell (pip, conda, virtualenv)
- 🐳 Large Docker images (GB sizes)
- 🔧 Integration friction with Go backends
- 🧵 GIL limitations for concurrency
Go advantages:
- ⚡ Fast startup (< 100ms)
- 📦 Single binary deployment
- 🐳 Minimal Docker images (from scratch)
- 🔧 Native integration with Go services
- 🧵 Excellent concurrency primitives
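The concurrency point is concrete: with no GIL, inference requests can fan out across goroutines directly. A minimal sketch, where `predict` is a hypothetical stand-in for a model forward pass:

```go
package main

import (
	"fmt"
	"sync"
)

// predict stands in for a model forward pass (hypothetical, not Born's API).
func predict(x float64) float64 { return 2*x + 1 }

// predictBatch runs one goroutine per input -- no GIL, just Go's scheduler.
func predictBatch(inputs []float64) []float64 {
	out := make([]float64, len(inputs))
	var wg sync.WaitGroup
	for i, x := range inputs {
		wg.Add(1)
		go func(i int, x float64) {
			defer wg.Done()
			out[i] = predict(x) // each goroutine writes its own slot
		}(i, x)
	}
	wg.Wait()
	return out
}

func main() {
	fmt.Println(predictBatch([]float64{1, 2, 3})) // → [3 5 7]
}
```

In a real service you would bound the fan-out with a worker pool, but the point stands: parallel inference is a language primitive, not a workaround.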
Burn (Rust ML framework) proved that:
- Backend abstraction works well
- Decorator pattern enables flexibility
- Type safety doesn't hurt expressiveness
- Production-focused design is viable
Born adapts these concepts for the Go ecosystem.
PyTorch remains excellent for:
- Research prototyping (if you're Python-first)
- Large-scale distributed training (with existing Python infrastructure)
- Access to a massive pre-trained model zoo
Born is better for:
- ✅ Production deployment (single binary)
- ✅ Go-native integration (no FFI overhead)
- ✅ Edge inference (low resource usage)
- ✅ Reproducible research (deterministic builds)
- ✅ Type-safe ML (compile-time checks)
| Feature | Born | GoMLX |
|---|---|---|
| Dependencies | Pure Go ✅ | OpenXLA/PJRT (C++) ❌ |
| Cross-compilation | Trivial ✅ | Complex |
| Startup time | < 100ms ✅ | Slower |
| Generics | Go 1.25+ ✅ | Go 1.18+ ✅ |
| Maturity | Early development | More mature ✅ |
| Feature | Born | Gorgonia |
|---|---|---|
| Generics | Type-safe ✅ | Pre-generics ❌ |
| API Design | Modern ✅ | Legacy |
| Backend Abstraction | Decorator pattern ✅ | Limited |
| Active Development | Active ✅ | Slower |
Hybrid approach:
PyTorch/TF (training) → ONNX export → Born (deployment)
Advantages:
- Use Python ecosystem for training (if preferred)
- Deploy as Go binary (production benefits)
- Best of both worlds
1. Go Microservices + ML
```go
// Microservice with an embedded ML model -- one binary, no Python sidecar
func handler(w http.ResponseWriter, r *http.Request) {
	prediction := model.Predict(parseRequest(r))
	json.NewEncoder(w).Encode(prediction)
}
```

2. Edge Deployment
- Raspberry Pi, IoT devices
- Limited resources (RAM, CPU)
- No internet connectivity
- Fast inference required
3. Kubernetes Operators
- ML model serving in K8s
- Native Go integration
- Cloud-native observability
- HPA integration
4. ML Systems Research
- Distributed learning algorithms
- Federated learning
- Systems + ML intersection
- Production-critical research
1. Large-Scale Training
- Distributed training not implemented (Phase 4)
- No multi-GPU support yet (Phase 2-3)
2. Complex Pre-Trained Models
- Model zoo not ready (Phase 4)
- ONNX import planned (Phase 3)
3. Pure Algorithm Research
- If you work in a Python-first ecosystem
- If you need the latest transformer/diffusion models
- If ecosystem size matters more than anything else
- Pure Go tensor operations
- CPU backend
- Autodiff engine
- Basic NN modules (Linear, Conv2D, Activations)
- SGD/Adam optimizers
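The Phase 1 optimizer support boils down to parameter updates like the one below. This is a sketch of a vanilla SGD step on raw slices; Born's actual optimizer API operates on tensors and will differ:

```go
package main

import "fmt"

// sgdStep applies one vanilla SGD update in place: w -= lr * grad.
// (Illustrative only; Born's optimizer works on Tensor values.)
func sgdStep(weights, grads []float64, lr float64) {
	for i := range weights {
		weights[i] -= lr * grads[i]
	}
}

func main() {
	w := []float64{1, 2}
	g := []float64{0.5, 1}
	sgdStep(w, g, 0.5)
	fmt.Println(w) // → [0.75 1.5]
}
```

Adam follows the same in-place-update shape, with per-parameter first/second moment estimates added.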
- WebGPU backend (zero-CGO via go-webgpu)
- WGSL compute shaders
- GPU buffer pooling & memory management
- 123x MatMul speedup, 10.9x inference speedup
- Math operations (Exp, Sqrt, Rsqrt, Cos, Sin)
- Reductions (SumDim, MeanDim)
- Manipulation (Cat, Chunk, Unsqueeze, Squeeze)
- Modern layers (SiLU, RMSNorm, Embedding)
- LLaMA/GPT/Mistral architecture support
- Multi-head attention (MHA)
- Scaled dot-product attention
- KV-cache for efficient inference
- Layer normalization variants
- Linux/macOS WebGPU support
- ONNX import (PyTorch/TF models)
- Model quantization (INT8, FP16)
- Pre-trained model loading
- Training utilities (BatchNorm, Dropout)
- Distributed training
- Advanced optimizations
- Model zoo
See ROADMAP.md for detailed timeline and milestones.
- Go generics are available (1.18+, mature in 1.25+)
- Cloud-native deployment is critical
- Python dependency hell is a real problem
- goffi and go-webgpu are enabling technologies
Production ML deployment is painful:
- Complex dependencies
- Large container images
- Slow startup times
- Integration friction
Born solves these problems.
Burn (Rust) proved the concept works. Born adapts those proven patterns to the Go ecosystem.
- Go dominates cloud-native (Kubernetes, Docker, etc.)
- Microservices architecture (Go's strength)
- Edge computing growth (IoT, embedded)
- ML inference > training in production
Goal: Born becomes the default choice for:

1. ML deployment in the Go ecosystem
   - Every Go service that needs ML uses Born
   - "Train anywhere, deploy Born"
2. Edge ML inference
   - Low-resource devices
   - Fast startup required
   - Offline inference
3. ML Systems research
   - Distributed learning
   - Federated ML
   - Production-critical experiments
Not replacing PyTorch for everything - but becoming the standard for production ML in Go.
When contributing to Born, prioritize:
- Production-readiness > Feature count
- Type safety > Dynamic flexibility
- Zero dependencies > Convenience
- Performance > Ease of implementation
- Composability > Monolithic design
Every feature must answer: "Does this help production deployment?"
If yes → implement. If no → reconsider.
"Born Production-Ready" is not a slogan, it's an architectural principle! 🚀