
Shelly Palmer - More LLMs Every Day

Each model has unique attributes that make it useful for different use cases.

While I was heading to SXSW last week, I saw an announcement from Inflection, an AI startup founded by DeepMind co-founder Mustafa Suleyman and LinkedIn co-founder Reid Hoffman, about a new foundational model called Inflection-2.5.

Did you hear about it? Should you care about it? Is it as good as OpenAI's GPT-4? What do I mean by good? Is it something you could use?

Like almost every week in 2024, last week was filled with announcements about improvements in AI models and their associated apps. According to Inflection AI, Inflection-2.5 achieves more than 94% of the average performance of GPT-4, despite using only 40% of the training FLOPs (floating point operations, a measure of the total compute used to train a model). The company believes the efficiency of Inflection-2.5 demonstrates that powerful performance does not necessarily require massive amounts of compute.

On the MMLU benchmark, which tests performance across a wide range of tasks from high school to professional level, Inflection-2.5 scored 85.5, just behind GPT-4's 87.3. Inflection-2.5 also performed competitively against GPT-4 in STEM exams, scoring 63 in the Hungarian Math exam (vs. GPT-4's 73), 68 in the Romanian Science exam (vs. GPT-4's 72), and 65 in the U.S. Biology exam (vs. GPT-4's 67). I guess Hungarian and Romanian math and science students should still consider GPT-4 the gold standard…
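If you want to see how close those scores really are, here is a quick Python sketch that expresses each Inflection-2.5 score as a percentage of the corresponding GPT-4 score. The numbers are the ones quoted above; the function name `relative_scores` and the unweighted mean are my own choices for illustration, and this is not necessarily how Inflection AI arrived at its own "94%" figure.

```python
# Scores as quoted in this article: (Inflection-2.5, GPT-4).
scores = {
    "MMLU": (85.5, 87.3),
    "Hungarian Math": (63, 73),
    "Romanian Science": (68, 72),
    "U.S. Biology": (65, 67),
}


def relative_scores(scores):
    """Return each Inflection-2.5 score as a percentage of the GPT-4 score."""
    return {
        name: 100 * inflection / gpt4
        for name, (inflection, gpt4) in scores.items()
    }


ratios = relative_scores(scores)
for name, pct in ratios.items():
    print(f"{name}: {pct:.1f}% of GPT-4")

# Simple unweighted mean across the four results (an assumption for
# illustration, not the company's published methodology).
mean_pct = sum(ratios.values()) / len(ratios)
print(f"Unweighted mean: {mean_pct:.1f}% of GPT-4")
```

Run it and you'll see the gap is widest on the Hungarian Math exam and narrowest on MMLU.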

... but that's not the point. You can try out Inflection-2.5 at pi.ai. There, you will find a very friendly interface that can speak to you in various voices and in various languages. According to Inflection AI, Pi is designed to have a more conversational and friendly tone than its competitors. (Of course, you'll have to judge that for yourself.)

The key takeaway from this announcement is that you should spend a few minutes (every week or so) experimenting with AI models and apps that you don't use regularly. If you're a GPT-4 user, try Claude 3, or Pi (Inflection-2.5), or Mistral Large, or Llama 2, or any other model or app you've heard of but don't use every day. You'll quickly learn that each model has unique attributes that make it useful for different use cases.

As always, your thoughts and comments are both welcome and encouraged. Just reply to this email. -s

sp@shellypalmer.com

ABOUT SHELLY PALMER

Shelly Palmer is the Professor of Advanced Media in Residence at Syracuse University’s S.I. Newhouse School of Public Communications and CEO of The Palmer Group, a consulting practice that helps Fortune 500 companies with technology, media and marketing. Named LinkedIn’s “Top Voice in Technology,” he covers tech and business for Good Day New York, is a regular commentator on CNN and writes a popular daily business blog. He's a bestselling author, and the creator of the popular, free online course, Generative AI for Execs. Follow @shellypalmer or visit shellypalmer.com