Macaw presents a view into the promising future of AI problem solving.

Question-answering AI Macaw outperforms GPT-3 by more than 10%


Macaw may be an order of magnitude smaller, but this is no parrot. Even better, Macaw is publicly available for free.

OpenAI’s GPT-3 system is good at answering your random questions, but it’s only available to approved users, and for a fee. Now it may have met its match, and this one doesn’t cost even a cracker. AI2 just released Macaw (short for multi-angle question-answering), a versatile, generative question-answering (QA) system that hits the bullseye on a wide range of question types. In fact, on a suite of 300 challenge questions, Macaw outperformed GPT-3 by more than 10 percentage points, even though Macaw is an order of magnitude smaller (11 billion vs. 175 billion parameters). It may not have the range of capabilities of GPT-3, but its QA ability is killer. (And did we mention, free?)

The team at AI2 fed this well-known riddle to Macaw: A young boy was rushed to the hospital emergency room, but the ER doctor saw the boy and refused to operate. “This boy is my son,” the doctor said. But the doctor wasn’t the boy’s father. How could this be?

The conventional answer to this is, of course, the doctor was the boy’s mother. Macaw, however, answered: He mistook the boy for his own son.

“Not what we were expecting, but a completely reasonable alternative!” says AI2 CEO Oren Etzioni. “And one which, at least on the surface, suggests remarkable ability to process questions.”

Here are some other amusing riddles Macaw was asked, along with its answers.

Q: How would you make a house conduct electricity?

A: Paint it with a metal paint

Q: What is an implication of a cat being an animal?

A: It has feelings

Q: What is an incorrect implication of a cat being an animal?

A: It is a human

Q: Could a newborn look over the top of a fully grown horseradish plant?

A: No, because the plant is so tall

Q: Why couldn’t Aristotle have used a cellphone?

A: He was born before cellphones

Q: Why can’t an amoeba design software?

A: It lacks the necessary neurons

Like all NLP systems, Macaw has its flaws. For one, it struggles with entity tracking.

Q: Imagine an empty cup. Now put a coin in the cup. Now take the coin out of the cup. How many coins are in the cup now?

A: 1

And it’s still working on its common sense.

Q: Sally’s favorite cow died yesterday. When will the cow be alive again?

A: In a few days

But it’s free! You can download the model (plus the challenge dataset), or read the paper, General-Purpose Question-Answering with Macaw.
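
For readers who want to try it, here is a minimal sketch of asking Macaw a question from Python. It rests on assumptions this article does not state: that the released checkpoints are published on the Hugging Face hub under names like `allenai/macaw-large` (a smaller sibling of the 11-billion-parameter model), and that the model expects its input in a slot format such as `$answer$ ; $question$ = ...`. The helper names `build_macaw_input` and `ask_macaw` are hypothetical, made up for this illustration.

```python
def build_macaw_input(question, output_slots=("answer",)):
    """Format a question in the slot syntax Macaw is assumed to expect.

    The requested output slots come first, then the question slot, e.g.
    "$answer$ ; $question$ = Why can't an amoeba design software?"
    """
    requested = " ; ".join(f"${slot}$" for slot in output_slots)
    return f"{requested} ; $question$ = {question}"


def ask_macaw(question, model_name="allenai/macaw-large", max_length=200):
    """Download a Macaw checkpoint (assumed hub name) and generate an answer."""
    # Imported here so the prompt helper above works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    # Encode the slot-formatted prompt and let the model generate its reply.
    input_ids = tokenizer.encode(build_macaw_input(question), return_tensors="pt")
    output = model.generate(input_ids, max_length=max_length)
    return tokenizer.batch_decode(output, skip_special_tokens=True)[0]
```

Calling `ask_macaw("Why couldn’t Aristotle have used a cellphone?")` would download the checkpoint on first use and return the model's raw output string. The full 11-billion-parameter model needs far more memory than the smaller variants, so starting small is the safer experiment.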