Nvidia has introduced a new software platform, Nvidia NIM, at its GTC conference. The platform is designed to expedite the integration of custom and pre-trained AI models into production environments: by pairing optimized inference engines with specific models and packaging them into accessible microservices, NIM aims to make AI deployment markedly faster and simpler.
Traditionally, the process of shipping such containers could consume weeks, if not months, of developers’ time — a challenge exacerbated by the scarcity of in-house AI expertise. Nvidia NIM addresses this pain point directly by fostering an ecosystem of AI-ready containers. These containers leverage Nvidia’s hardware as the foundational layer, with meticulously curated microservices serving as the core software layer. The objective? Accelerating companies’ AI roadmaps with minimal hassle.
The versatility of NIM is underscored by its support for models from a spectrum of renowned providers, including NVIDIA, AI21, Adept, Cohere, Getty Images, and Shutterstock, alongside open models from tech giants like Google, Hugging Face, Meta, Microsoft, Mistral AI, and Stability AI.
![NIM](https://i0.wp.com/nosisnews.com/wp-content/uploads/2024/03/image-92.png?resize=1024%2C718&ssl=1)
Nvidia’s strategic collaborations with industry heavyweights such as Amazon, Google, and Microsoft ensure seamless integration of NIM microservices into platforms like SageMaker, Kubernetes Engine, and Azure AI, respectively. Moreover, frameworks like Deepset, LangChain, and LlamaIndex are set to embrace these microservices, further amplifying their utility.
**Core benefits of NIM**
- Deploy anywhere
- Develop with industry-standard APIs
- Leverage domain-specific models
- Run on optimized inference engines
- Support for enterprise-grade AI
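To illustrate the "industry-standard APIs" point above: NIM microservices expose an OpenAI-compatible HTTP interface, so existing client code can be pointed at a locally deployed container. The sketch below is a minimal, hypothetical example; the endpoint URL and model name (`meta/llama3-8b-instruct`) are assumptions for illustration, not guaranteed identifiers.

```python
import json

# Assumption: a NIM container is running locally and serving an
# OpenAI-compatible chat-completions endpoint on port 8000.
NIM_URL = "http://localhost:8000/v1/chat/completions"


def build_chat_request(model: str, prompt: str, max_tokens: int = 128) -> dict:
    """Build an OpenAI-style chat-completions payload for a NIM endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


# Illustrative model name; substitute whichever model the container serves.
payload = build_chat_request("meta/llama3-8b-instruct",
                             "Summarize Nvidia NIM in one sentence.")
body = json.dumps(payload).encode("utf-8")
# POSTing `body` to NIM_URL (e.g. with urllib.request or requests) would
# return a standard chat-completions JSON response.
```

Because the request shape matches the OpenAI API, swapping a hosted model for a self-hosted NIM container is largely a matter of changing the base URL.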
According to Manuvir Das, Nvidia’s head of enterprise computing, the Nvidia GPU stands as the optimal platform for running inference on these models.
He underscores the efficiency and enterprise-grade nature of NVIDIA NIM, positioning it as the preferred software package for developers. With NIM, developers can redirect their focus towards enterprise applications, confident that Nvidia’s infrastructure will handle the heavy lifting of model production efficiently.
![Nvidia launches NIM: Streamlining AI model deployment image 91](https://i0.wp.com/nosisnews.com/wp-content/uploads/2024/03/image-91.png?resize=1024%2C559&ssl=1)
Underpinning Nvidia NIM is a robust suite of inference engines, including the Triton Inference Server, TensorRT, and TensorRT-LLM. Among the array of microservices available through NIM, notable offerings include Riva for speech and translation model customization, cuOpt for routing optimizations, and the Earth-2 model for weather and climate simulations.
The roadmap for NIM promises further enhancements, including the integration of the Nvidia RAG LLM operator, facilitating the development of generative AI chatbots.
The real-world impact of NIM is exemplified by its adoption among prominent enterprises such as Box, Cloudera, Cohesity, Datastax, Dropbox, and NetApp. Jensen Huang, Nvidia’s founder and CEO, envisions these containerized AI microservices as catalysts for enterprises across all industries to transition into AI-driven entities. The collaborative efforts of Nvidia and its partner ecosystem are poised to empower enterprises, enabling them to harness the latent potential of their data through generative AI capabilities.