Despite their huge success, the inner workings of large language models such as OpenAI’s GPT model family and Google Bard remain a mystery, even to their developers. Researchers at ETH and Google have uncovered a potential key mechanism behind their ability to learn on the fly and fine-tune their answers based on interactions with their users. Johannes von Oswald is a doctoral student in the group headed by Angelika Steger, ETH Professor for Theoretical Computer Science, and researches learning algorithms for neural networks. His new paper will be presented at the International Conference on Machine Learning (ICML) in late July.
The T in GPT stands for transformers. What are transformers and why did they become so prevalent in modern AI?
Johannes von Oswald: Transformers are a particular type of artificial neural network architecture. It is used, for example, by large language models such as ChatGPT, and was put on the map in 2017 by researchers at Google, where it led to state-of-the-art performance in language translation. Intriguingly, a slightly modified version of this architecture was already developed by the AI pioneer Jürgen Schmidhuber back in 1991.
And what distinguishes this architecture?
Before the recent breakthrough of transformers, different tasks, such as image classification and language translation, used different model architectures, each specialised for its own domain. A crucial aspect that sets transformers apart from these previous AI models is that they seem to work extremely well on any kind of task. Because of their widespread use, it is important to understand how they work.
What did your research reveal?
While neural networks are generally regarded as black boxes that spit out output when provided with input, we showed that transformers can learn on their own to implement algorithms within their architecture. We were able to show that they can implement a classic and powerful machine learning algorithm that learns from the recent information it receives.
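As a rough illustration of what "implementing an algorithm within the architecture" can mean, the sketch below assumes the algorithm is gradient descent on a linear regression loss; the interview does not name it at this point, so treat that choice as an assumption. It compares the prediction after a single gradient step, starting from zero weights, with the output of a simplified attention layer without softmax. The function names, the learning rate and the toy data are all hypothetical.

```python
# Illustrative sketch only. Gradient descent is ASSUMED here as the
# "classic algorithm"; names and data are made up for the example.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gd_one_step_prediction(xs, ys, x_query, lr):
    """Predict for x_query after one gradient-descent step from w = 0
    on the loss (1 / 2N) * sum((w . x_i - y_i)^2)."""
    n = len(xs)
    dim = len(x_query)
    # The gradient at w = 0 is -(1/N) * sum(y_i * x_i),
    # so the updated weights are (lr / N) * sum(y_i * x_i).
    w = [lr / n * sum(y * x[d] for x, y in zip(xs, ys)) for d in range(dim)]
    return dot(w, x_query)

def linear_attention_prediction(xs, ys, x_query, lr):
    """Softmax-free attention: keys = x_i, values = y_i, query = x_query."""
    n = len(xs)
    scores = [dot(x, x_query) for x in xs]  # key-query similarities
    return lr / n * sum(y * s for y, s in zip(ys, scores))

# Toy in-context examples (inputs and targets) and a new query input.
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ys = [2.0, -1.0, 1.0]
x_query = [0.5, 2.0]

gd = gd_one_step_prediction(xs, ys, x_query, lr=0.1)
attn = linear_attention_prediction(xs, ys, x_query, lr=0.1)
print(gd, attn)  # the two predictions agree up to floating-point rounding
```

In this simplified setting the two computations are algebraically identical, which gives a feel for how an attention layer could carry out a learning step on the examples it is shown. Real transformers use learned weights, softmax and many layers, so this is a toy analogy rather than a description of any production model.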
Can you give an example of when this type of learning can occur?
You might, for instance, provide the language model with several texts and the sentiment – either positive or negative – associated with each of them. You can go on to present the model with a text it hasn’t seen before, and it will predict whether it is positive or negative based on the examples you have provided.
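The sentiment example above can be sketched as a prompt that places labelled texts before an unlabelled one. This is a minimal sketch: the helper `build_few_shot_prompt`, the "Text:" and "Sentiment:" labels and the sample reviews are hypothetical, and no particular model or API is assumed.

```python
def build_few_shot_prompt(examples, new_text):
    """Format labelled examples followed by an unlabelled text.

    examples: list of (text, sentiment) pairs shown to the model.
    new_text: the text whose sentiment the model should predict.
    """
    lines = []
    for text, sentiment in examples:
        lines.append(f"Text: {text}")
        lines.append(f"Sentiment: {sentiment}")
        lines.append("")  # blank line between examples
    # The final entry has no label; the model is expected to continue it.
    lines.append(f"Text: {new_text}")
    lines.append("Sentiment:")
    return "\n".join(lines)

# Hypothetical labelled examples provided in the prompt.
examples = [
    ("The film was a delight from start to finish.", "positive"),
    ("I want my two hours back.", "negative"),
]
prompt = build_few_shot_prompt(examples, "A surprisingly moving story.")
print(prompt)
```

The resulting string would be sent to a language model as input. The model's weights are not updated; any adaptation happens only through the examples contained in the prompt.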