Trust large language models at your own peril

According to Meta, Galactica can “summarize academic papers, solve math problems, generate Wiki articles, write scientific code, annotate molecules and proteins, and more.” But soon after its launch, it was pretty easy for outsiders to prompt the model to provide “scientific research” on the benefits of homophobia, anti-Semitism, suicide, eating glass, being white, or being a man. Meanwhile, prompts asking for papers on AIDS or racism were blocked. Charming!

As my colleague Will Douglas Heaven writes in his story about the debacle: “Meta’s misstep—and its hubris—show once again that Big Tech has a blind spot about the severe limitations of large language models.” 

Not only was Galactica’s launch premature, but it shows how insufficient AI researchers’ efforts to make large language models safer have been.

Meta might have been confident that Galactica outperformed competitors in generating scientific-sounding content. But its own testing of the model for bias and truthfulness should have deterred the company from releasing it into the wild. 

One common way researchers aim to make large language models less likely to spit out toxic content is to filter out certain keywords. But it’s hard to create a filter that can capture all the nuanced ways humans can be unpleasant. The company would have saved itself a world of trouble if it had conducted more adversarial testing of Galactica, in which the researchers would have tried to get it to regurgitate as many different biased outcomes as possible. 
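
To make the filtering problem concrete, here is a minimal sketch of a keyword blocklist. It is not Meta's actual filter, whose details are not described here; the blocklist, the `is_blocked` helper, and the example prompts are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical blocklist, invented for this illustration only.
BLOCKED_KEYWORDS = {"racism", "aids", "suicide"}


def is_blocked(prompt: str) -> bool:
    """Return True if any word in the prompt appears on the blocklist."""
    words = re.findall(r"[a-z]+", prompt.lower())
    return any(word in BLOCKED_KEYWORDS for word in words)


# Hypothetical prompts: two legitimate requests and one harmful one.
prompts = [
    "A literature review of racism in medical research",
    "Hearing aids for children with hearing loss",
    "Research paper on the benefits of eating crushed glass",
]

for prompt in prompts:
    status = "blocked" if is_blocked(prompt) else "allowed"
    print(f"{status}: {prompt}")
```

In this toy setup the two legitimate prompts are blocked because they happen to contain a listed word, while the harmful one passes because none of its words are listed. Adversarial testing amounts to deliberately hunting for prompts of that last kind before release.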

Meta’s researchers measured the model for biases and truthfulness, and while it performed slightly better than competitors such as GPT-3 and Meta’s own OPT model, it did provide a lot of biased or incorrect answers. There are other limitations too. The model is trained on scientific resources that are open access, but many scientific papers and textbooks are locked behind paywalls. This inevitably leads Galactica to rely on sketchier secondary sources.

Galactica also seems to be an example of something we don’t really need AI to do. It doesn’t seem as though it would even achieve Meta’s stated goal of helping scientists work more quickly. In fact, it would require them to put in a lot of extra effort to verify whether the information from the model was accurate.

It’s really disappointing (yet totally unsurprising) to see big AI labs, which should know better, hype up such flawed technologies. We know that language models have a tendency to reproduce prejudice and assert falsehoods as facts. We know they can “hallucinate” or make up content, such as wiki articles about the history of bears in space. But the debacle was useful for one thing, at least. It reminded us that the only thing large language models “know” for certain is how words and sentences are formed. Everything else is guesswork.