X.ai, Elon Musk’s AI startup, has unveiled its latest generative AI model, Grok-1.5. Set to power the social network X’s Grok chatbot in the near future, Grok-1.5 promises notable improvements over its predecessor, Grok-1, as indicated by benchmark results and specifications published by X.
Grok-1.5 boasts “improved reasoning,” particularly in coding and math-related tasks, surpassing Grok-1’s performance on various benchmarks. For instance, it more than doubles Grok-1’s score on the MATH benchmark and significantly outperforms it on HumanEval, a test of code generation and problem-solving ability.
However, the real test lies in practical usage, as commonly-used AI benchmarks may not fully capture how individuals interact with models in everyday scenarios.
Features of Grok-1.5
Capabilities and Reasoning
One of the most notable improvements in Grok-1.5 is its performance in coding and math-related tasks. In X.ai’s published tests, Grok-1.5 achieved a 50.6% score on the MATH benchmark and a 90% score on the GSM8K benchmark, two math benchmarks covering a range of grade-school to high-school competition problems. It also scored 74.1% on the HumanEval benchmark, which evaluates code generation and problem-solving abilities.
Long Context Understanding
A new feature in Grok-1.5 is a context window of up to 128K tokens, a 16-fold increase over the previous context length. This expanded capacity lets the model draw on information from substantially longer documents.
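To give a rough sense of what a 128K-token window buys in practice, the sketch below estimates token counts with the common ~4-characters-per-token heuristic for English prose. This heuristic is an assumption for illustration, not Grok’s actual tokenizer, and the 8K figure for the previous window is simply 128K divided by the stated 16x increase.

```python
# Order-of-magnitude illustration of a 128K-token context window.
# CHARS_PER_TOKEN is a rough heuristic for English text, not Grok's
# real tokenizer; treat all numbers here as approximations.

CHARS_PER_TOKEN = 4  # common heuristic for English prose

def approx_tokens(text: str) -> int:
    """Estimate the token count of a string."""
    return len(text) // CHARS_PER_TOKEN

def fits_in_context(text: str, context_tokens: int = 128_000) -> bool:
    """Check whether a document plausibly fits in the context window."""
    return approx_tokens(text) <= context_tokens

# A 16x jump from an 8K to a 128K window moves the budget from roughly
# a long article (~32,000 characters) to a short book (~512,000).
old_budget_chars = 8_000 * CHARS_PER_TOKEN
new_budget_chars = 128_000 * CHARS_PER_TOKEN
```

By this estimate, a document that overflowed the old window by an order of magnitude can now be passed to the model whole, rather than chunked and summarized.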
According to X.ai, Grok-1.5 can now handle longer and more complex prompts while still maintaining its instruction-following capability, thanks to its expanded context window.
In the Needle In A Haystack (NIAH) evaluation, Grok-1.5 demonstrated powerful retrieval capabilities for embedded text within contexts of up to 128K tokens in length, achieving perfect retrieval results.
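The NIAH evaluation idea itself is simple: plant a small fact (the “needle”) at varying depths inside filler text of varying lengths, then check whether the model can retrieve it. The sketch below shows that harness in miniature; `query_model` is a hypothetical stand-in for an API call, and the scoring is a simplification of published NIAH setups.

```python
import random

def build_haystack(needle: str, filler_sentences: list,
                   n_sentences: int, depth: float) -> str:
    """Embed `needle` at a relative position `depth` (0.0-1.0) in filler."""
    body = [random.choice(filler_sentences) for _ in range(n_sentences)]
    body.insert(int(depth * n_sentences), needle)
    return " ".join(body)

def niah_score(query_model, needle: str, answer: str,
               filler: list, lengths: list, depths: list) -> float:
    """Fraction of (length, depth) cells where the model finds the answer.

    `query_model(prompt) -> str` is a hypothetical stand-in for a real
    model call; perfect retrieval corresponds to a score of 1.0.
    """
    hits = 0
    for n in lengths:
        for d in depths:
            prompt = (build_haystack(needle, filler, n, d)
                      + "\nWhat is the magic number mentioned above?")
            if answer in query_model(prompt):
                hits += 1
    return hits / (len(lengths) * len(depths))
```

A “perfect retrieval” result like the one X.ai reports means every cell of the length-by-depth grid scores a hit, up to the full 128K-token context.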
Grok-1.5 Infra
Cutting-edge large language model (LLM) research on massive GPU clusters demands robust and flexible infrastructure. Grok-1.5 is built on a custom distributed training framework based on JAX, Rust, and Kubernetes.
According to X.ai, this training stack lets its team prototype ideas and train new architectures at scale with minimal effort. A major challenge of training LLMs on large compute clusters is maximizing the reliability and uptime of the training job.
Historically, X.ai’s Grok models have stood out for their willingness to engage with topics typically avoided by other models, such as conspiracy theories and controversial political ideas. They also exhibit a “rebellious streak,” responding in blunt language when asked.
It remains unclear whether Grok-1.5 changes anything in these respects, as X.ai’s blog post does not address them. The announcement comes after X.ai open-sourced Grok-1, albeit without the code needed to fine-tune or further train it.
X.ai says Grok-1.5 will soon be available to early testers, whose feedback will help improve the model, and that several new features will be introduced over the coming days as Grok-1.5 gradually rolls out to a wider audience.