Published in OpenVINO-toolkit

Unlock Multi-Vendor AI: Running Workloads Across NVIDIA and Intel Hardware with Triton Inference…
Discover how to use Triton Inference Server with conventional AI models, on NVIDIA GPUs as well as Intel built-in GPUs and NPUs.
Dec 8, 2025

OpenVINO™ 2025.4: Faster Models, Smarter Agents
Accelerate AI with GenAI Pipelines, MoE Models, and Expanded NPU Support in OpenVINO™ 2025.4
Dec 1, 2025

Never Parse Again: How OpenVINO™ GenAI Delivers Fast, Valid Structured Output Everywhere
OpenVINO™ GenAI has built-in Structured Output (SO) support, so you can run constrained decoding efficiently everywhere.
Nov 14, 2025

Load, Run, and Accelerate: OpenVINO™ GenAI Introduces Direct GGUF Preview Support for LLMs
See how easy it is to run GGUF model inference using OpenVINO™ GenAI!
Oct 14, 2025

Early Look: Exploring Qwen3-VL and Qwen3-NEXT Day-0 Model Integration for Enhanced AI PC Experiences
Exploring Qwen3-VL & Qwen3-NEXT Models on Intel AI PCs: Boosting Performance & Multimodal Understanding for Developers
Oct 6, 2025

Running your GenAI App Locally on Intel GPU and NPU with OpenVINO™ Model Server
Get the best performance from GenAI models on different Intel hardware accelerators using OpenVINO™ Model Server.
Sep 30, 2025

Deploying the Flux.1 Kontext Model on Intel® Arc™ Pro B60 Graphics GPU
How to use Optimum-Intel, which leverages the OpenVINO™ Runtime, to deploy the Flux.1 Kontext dev model on the Intel® Arc™ Pro B60 Graphics GPU.
Sep 10, 2025

Deploying the Qwen3-Embedding Model Series with Optimum-Intel
This article shows how to use Optimum-Intel to quickly deploy the Qwen3-Embedding series models on Intel platforms.
Sep 10, 2025

OpenVINO™ 2025.3: More GenAI, More Possibilities
Discover OpenVINO™ 2025.3: new models, GenAI pipelines, and model server updates for faster, easier AI deployment on Intel hardware.
Sep 4, 2025

Accelerate LLMs on Intel® GPUs: A Practical Guide to Dynamic Quantization
Optimize Transformer Inference on Intel® GPUs with Dynamic Quantization in OpenVINO™ 2025.2
Aug 4, 2025