GPUs are useful not only for model inference but also for ANN indexing. Qdrant OSS has supported GPU-accelerated HNSW index construction since v1.13. Now clusters with GPUs are also available on cloud.qdrant.io. The difference? A 5-10x speedup in index building, depending on the chosen hardware setup. For heavy-indexing use cases, that is significant. How to run Qdrant with GPU support ⤵️ https://lnkd.in/dgrX-4zg
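For a self-hosted instance, the short version looks roughly like this (a sketch based on my reading of the Qdrant GPU docs; the image tag and the QDRANT__GPU__INDEXING variable are worth verifying against the linked guide):

    # Run the NVIDIA GPU build of Qdrant with GPU indexing enabled
    # (host needs NVIDIA drivers + the NVIDIA Container Toolkit;
    #  image tag is an assumption - check Docker Hub for GPU tags)
    docker run --rm --gpus=all \
      -p 6333:6333 -p 6334:6334 \
      -e QDRANT__GPU__INDEXING=1 \
      qdrant/qdrant:gpu-nvidia-latest

Note that the GPU is used only for building the HNSW index; search still runs on CPU, so query behavior is unchanged.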
A formidable leap in enterprise AI infrastructure, Andre. Across 4.5 decades of architecting global technology ecosystems, I have come to one firm realization: a 10x acceleration in indexing is never merely a technical upgrade; it is a profound catalyst for operational agility and scalable market dominance. Championing this caliber of resilient, frontier-level innovation is precisely the legacy we revere at Manfriday.
This is a good reminder that infrastructure decisions in retrieval systems matter just as much as model choice.
I was thinking about this a few days ago, wondering if we could accelerate index construction like this. Glad to see it 😁
Feels like this is less about “agent infra” and more about cloud vendors realizing inference isn’t the bottleneck anymore; it’s the runtime layer around it. (Mojar AI) But I’m not sure a new data plane solves the harder problem: once workloads become non-deterministic, scheduling and resource semantics (GPU, context, latency) start leaking into that layer anyway.