I built an Ollama alternative with TurboQuant, model groups, and multi-GPU support
📰 Dev.to · deharoalexandre-cyber
EIE is a policy-driven, multi-model GGUF inference server featuring a TurboQuant-native KV cache, CUDA and ROCm backends, parallel and sequential model groups, and an Apache 2.0 license.