Skip to content

Product

ISIRO Runtime™

AI inference efficiency layer, powered by proprietary TIC™ (Tensor Inference Core) technology. Reduces memory traffic during inference for associated cost and energy savings while preserving exact model output. TIC Shield™ add-on provides security and control for protected deployments.

Performance

Lower memory traffic. Exact model output.

Representative results from scoped evaluations.

30%

Lower memory traffic on BF16 LLM workloads

Exact

Model output preserved bit for bit (no quantization)

Up to 2×

Lower latency vs cuBLAS baseline (evaluated workloads)

Overview

Efficiency layer for your existing inference stack

ISIRO Runtime sits between models and the existing inference stack. Compile your model once into a compact, execution-native .tic representation, then deploy through ISIRO Runtime, which integrates the inference frameworks you already use as targets.

TIC Shield™ adds security and control for protected deployments.

A dashboard and an OpenAI-compatible API with ISIRO observability built in provide visibility into performance, resource utilization, and TIC Shield status.

Technical Overview

Original model.onnx · .safetensors · .pt · etcISIRO Runtime compilerone-time.tic artifactISIRO RuntimeInference frameworksvLLM · TensorRT · ROCm · OpenVINO · etcHardware platformsNVIDIA · AMD · Intel · etcISIRO Runtime efficiency layerOriginal inference stack

Supported today

NVIDIA GPUs · vLLM

ISIRO Runtime supports BF16 vLLM workloads on NVIDIA GPUs in on-prem and cloud deployments today. Support for additional inference frameworks and hardware platforms is on the roadmap.

Add-on

TIC Shield™

Security and control of .tic artifacts. KMS-gated cryptographic protection and execution-time control.

At rest & in transit

Envelope encryption

Software

KMS-backed envelope encryption protects .tic artifacts at rest and in transit. Content is encrypted under envelope keys so artifacts stay protected outside your trust boundary until policy allows release.

Signature

Software

Models are signed with KMS-backed keys so ISIRO Runtime can verify integrity and provenance before load.

In use

TIC Lock™

Software

TIC Lock™ binds the model's in-memory representation to each deployment, with a dedicated KMS lock key ID on record, so GPU memory is not a portable checkpoint across environments. Optional hardware-backed confidential computing stacks on top where you require it.

Hardware-backed confidential computing

When hardware isolation is required, TIC Shield supports attestation-gated key release from your KMS and integration with confidential-computing environments. Supported today: NVIDIA Confidential Computing and KMS attestation-gated key release. Additional TEE platforms on our roadmap.

SoftwareSoftware: Built-in software-based protections. No inference speed penalty when activated.

Ready to evaluate ISIRO Runtime?

Run in cloud or on-prem environment without sharing your model. Compare exact output, performance, and cost indicators against your baseline.