Nvidia, never shy about giving people reasons to buy its latest GPUs, is unveiling a new tool called “Chat with RTX” for owners of GeForce RTX 30 Series and 40 Series cards. The tool runs an AI-powered chatbot offline on a Windows PC, letting users customize a GenAI model along the lines of OpenAI’s ChatGPT. By connecting it to their documents and notes, users can ask questions directly instead of sifting through saved content.
By default, Chat with RTX uses the open source model from AI startup Mistral, though it supports other text-based models, including Meta’s Llama 2. Users should be prepared for substantial storage requirements: 50GB to 100GB, depending on the selected model(s).
Chat with RTX currently supports text, PDF, .doc, .docx, and .xml formats. Users can point the app at a folder containing supported files, and it loads them into the dataset the model draws on when answering. The tool can also take the URL of a YouTube playlist and load the videos’ transcriptions, letting the selected model query their contents.
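Nvidia hasn’t published the app’s internals, but the first step it describes — scanning a folder for supported file types — can be sketched in a few lines of Python. The extension list comes from the formats named above; the function name is hypothetical, not part of Chat with RTX.

```python
from pathlib import Path

# Extensions the article lists as supported by Chat with RTX.
SUPPORTED_EXTS = {".txt", ".pdf", ".doc", ".docx", ".xml"}

def collect_dataset_files(folder: str) -> list[Path]:
    """Recursively gather all supported files under `folder`,
    skipping anything (images, executables, etc.) the app can't ingest."""
    root = Path(folder)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTS
    )
```

Pointing this at a notes directory would return only the loadable documents, which is presumably the filtering the app performs before building its dataset.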
Despite its capabilities, Chat with RTX has some limitations, which Nvidia transparently outlines in a how-to guide. The tool doesn’t retain context, meaning it won’t consider previous questions when answering follow-ups. Ask about a common bird, for instance, and then ask “what are its colors?”, and the two questions won’t be treated as connected.
Users can also include information from YouTube videos and playlists. Adding a video URL to Chat with RTX folds that knowledge into the chatbot for contextual queries: ask for travel recommendations based on a favorite influencer’s videos, for example, or get quick tutorials and how-tos drawn from top educational channels.
Develop LLM-based applications with Chat with RTX
Chat with RTX shows the potential of accelerating LLMs with RTX GPUs. The app is built from the TensorRT-LLM RAG developer reference project, available on GitHub. Developers can use the reference project to develop and deploy their own RAG-based applications for RTX, accelerated by TensorRT-LLM. Learn more about building LLM-based applications.
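The RAG pattern the reference project is built on — retrieve the document chunks most relevant to a question, then prepend them to the model’s prompt — can be illustrated in plain Python. This is a toy sketch, not the TensorRT-LLM API: it scores relevance by word overlap instead of real embeddings, and all function names are illustrative.

```python
def score(chunk: str, query: str) -> float:
    """Crude relevance score: fraction of query words that appear in the chunk."""
    q = set(query.lower().split())
    c = set(chunk.lower().split())
    return len(q & c) / len(q) if q else 0.0

def retrieve(chunks: list[str], query: str, k: int = 2) -> list[str]:
    """Return the k chunks most relevant to the query."""
    return sorted(chunks, key=lambda ch: score(ch, query), reverse=True)[:k]

def build_prompt(chunks: list[str], query: str) -> str:
    """Prepend the retrieved context to the user's question before
    handing the prompt to the LLM."""
    context = "\n".join(retrieve(chunks, query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

In a real RTX-accelerated application, `score` and `retrieve` would be replaced by an embedding model and a vector index, and the prompt would go to a TensorRT-LLM-served model; the flow of data is the same.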
Enter a generative AI-powered Windows app or plug-in in the Generative AI on NVIDIA RTX developer contest, running through Friday, Feb. 23, for a chance to win prizes such as a GeForce RTX 4090 GPU, a full in-person conference pass to NVIDIA GTC, and more.
While there is a risk of models being fine-tuned on toxic content, proponents argue that the advantages outweigh the drawbacks. The emergence of applications like Chat with RTX signifies a shift towards more accessible and private local AI models, with the long-term impact yet to unfold.