Published in OpenVINO-toolkit

Unlock Multi-Vendor AI: Running Workloads Across NVIDIA and Intel Hardware with Triton Inference…
Discover how to use Triton Inference Server with conventional AI models, on NVIDIA GPUs as well as Intel built-in GPUs and NPUs.
Dec 8, 2025

OpenVINO™ 2025.4: Faster Models, Smarter Agents
Accelerate AI with GenAI Pipelines, MoE Models, and Expanded NPU Support in OpenVINO™ 2025.4
Dec 1, 2025

Never Parse Again: How OpenVINO™ GenAI Delivers Fast, Valid Structured Output Everywhere
OpenVINO™ GenAI has built-in Structured Output (SO) support, so you can run constrained decoding efficiently everywhere.
Nov 14, 2025

Load, Run, and Accelerate: OpenVINO™ GenAI Introduces Direct GGUF Preview Support for LLMs
See how easy it is to run GGUF model inference using OpenVINO™ GenAI!
Oct 14, 2025

Early Look: Exploring Qwen3-VL and Qwen3-NEXT Day-0 Model Integration for Enhanced AI PC Experiences
Exploring Qwen3-VL & Qwen3-NEXT Models on Intel AI PCs: Boosting Performance & Multimodal Understanding for Developers
Oct 6, 2025

Running your GenAI App Locally on Intel GPU and NPU with OpenVINO™ Model Server
Get the best performance from GenAI models on different Intel hardware accelerators using OpenVINO™ Model Server.
Sep 30, 2025

Deploying the Flux.1 Kontext Model on Intel® Arc™ Pro B60 Graphics GPU
How to use Optimum-Intel, which leverages the OpenVINO™ Runtime, to deploy the Flux.1 Kontext dev model on the Intel® Arc™ Pro B60 Graphics GPU.
Sep 10, 2025

Deploying the Qwen3-Embedding Model Series with Optimum-Intel
This article shows how to use Optimum-Intel to quickly deploy the Qwen3-Embedding series models on Intel platforms.
Sep 10, 2025

OpenVINO™ 2025.3: More GenAI, More Possibilities
Discover OpenVINO™ 2025.3: new models, GenAI pipelines, and model server updates for faster, easier AI deployment on Intel hardware.
Sep 4, 2025

Accelerate LLMs on Intel® GPUs: A Practical Guide to Dynamic Quantization
Optimize Transformer Inference on Intel® GPUs with Dynamic Quantization in OpenVINO™ 2025.2
Aug 4, 2025