Olive-ai 0.12.0

@xiaoyu-work xiaoyu-work released this 17 Apr 17:47

Olive 0.12.0

New Features

  • olive init interactive wizard (#2346, by @xiaoyu-work): Added a guided CLI experience to help users
    configure and generate Olive optimization commands more easily.
  • Olive MCP server (#2353, by @xiaoyu-work): Added an MCP server for tool and agent integrations around
    Olive workflows.
  • QAIRT ORT to Genie workflow (#2358, by @qti-kromero): Added an end-to-end Qualcomm workflow with new
    preparation, GenAI builder, and encapsulation passes.
  • Qwen3-VL and multi-image Qwen VL support (#2345, by @hanbitmyths): Added export and optimization
    support for Qwen3-VL and Qwen2.5-VL, plus new ONNX graph surgeries and 8-bit Gather quantization improvements.
  • AutoClip quantization pass (#2324, by @jambayk): Added automatic clipping search for linear layers
    before quantization.
  • Layer annotation support (#2361, by @yuslepukhin): Added CaptureLayerAnnotations and ONNX
    propagation so layer metadata can be preserved through conversion.
  • NVModelOptGraphSurgery pass (#2377, by @hthadicherla): Added NVIDIA ModelOpt graph surgery
    integration for ONNX models.
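
To make the AutoClip item above more concrete, here is a minimal sketch of what a clipping search before quantization can look like. It is an illustration only, not Olive's implementation: the function names `quantize_dequantize` and `search_clip`, the 4-bit symmetric round-to-nearest scheme, and the plain weight-MSE objective are all assumptions made for this example.

```python
def quantize_dequantize(weights, clip_value, bits=4):
    """Symmetric round-to-nearest quantization after clipping to [-clip_value, clip_value]."""
    qmax = 2 ** (bits - 1) - 1
    if clip_value == 0:
        return [0.0 for _ in weights]
    scale = clip_value / qmax
    result = []
    for w in weights:
        clipped = max(-clip_value, min(clip_value, w))
        result.append(round(clipped / scale) * scale)
    return result


def mean_squared_error(original, approximated):
    return sum((a - b) ** 2 for a, b in zip(original, approximated)) / len(original)


def search_clip(weights, bits=4, num_steps=20, min_ratio=0.5):
    """Try clip ratios from 1.0 down to min_ratio and keep the one with the lowest error.

    Hypothetical helper: the search grid and the error metric are assumptions.
    """
    max_abs = max(abs(w) for w in weights)
    best_ratio, best_error = 1.0, float("inf")
    for step in range(num_steps + 1):
        ratio = 1.0 - (1.0 - min_ratio) * step / num_steps
        dequantized = quantize_dequantize(weights, ratio * max_abs, bits)
        error = mean_squared_error(weights, dequantized)
        if error < best_error:
            best_ratio, best_error = ratio, error
    return best_ratio, best_error


row = [0.31, -0.27, 0.12, -0.05, 0.44, -0.39, 0.08, 0.9]
ratio, error = search_clip(row)
print(f"best clip ratio: {ratio:.3f}, mse: {error:.6f}")
```

Because the grid includes a ratio of 1.0 (no clipping), the selected setting is never worse than unclipped quantization under the chosen metric.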

Improvements

  • AMD Quark quantization updates (#2364, by @poganesh): Updated the Quark pass for Quark 0.11, VitisAI
    LLM fusion, token fusion, and GPT-OSS pre-quantized models.
  • HQQ and RTN external data handling (#2380, by @Lidang-Jiang): Fixed ONNX quantization output
    correctness when input models store weights as external data.
  • Transformers 5.0+ compatibility (#2328, by @xiaoyu-work): Updated export and training flows for the
    new DynamicCache format and related argument handling.
  • Telemetry robustness (#2405, by @bmehta001): Fixed Linux and macOS device ID handling, auto-disabled
    telemetry in CI, and improved cache and exporter reliability.
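
As a rough illustration of the "auto-disabled telemetry in CI" behavior above, the sketch below shows one common way to detect a CI environment. It is not taken from Olive's code: the helper names `running_in_ci` and `telemetry_enabled` and the specific list of environment variables are assumptions. `CI` is set by many CI providers, `GITHUB_ACTIONS` by GitHub Actions, and `TF_BUILD` by Azure Pipelines.

```python
import os

# Assumed set of variables to check; a real implementation may look at others.
CI_ENV_VARS = ("CI", "GITHUB_ACTIONS", "TF_BUILD")
_FALSY = {"", "0", "false", "no"}


def running_in_ci(environ=None):
    """Return True if any known CI variable is set to a truthy value."""
    env = os.environ if environ is None else environ
    return any(env.get(name, "").strip().lower() not in _FALSY for name in CI_ENV_VARS)


def telemetry_enabled(user_opt_in=True, environ=None):
    """Telemetry stays on only if the user allows it and no CI environment is detected."""
    return user_opt_in and not running_in_ci(environ)


print(telemetry_enabled(environ={"GITHUB_ACTIONS": "true"}))
print(telemetry_enabled(environ={}))
```

Passing `environ` explicitly keeps the check easy to test without modifying the real process environment.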

Security

  • PyTorch model loading hardening (#2389, by @jambayk): Removed the unsafe legacy
    torch.load(..., weights_only=False) loading paths, removed PYTORCH_ENTIRE_MODEL, and now requires
    model_loader for PyTorch models.
  • Pydantic v2 migration (#2330, by @shaahji): Migrated Olive to Pydantic v2 across the codebase and
    updated validators, config patterns, and model serialization.