Abstract
An extension of Transformers is proposed that enables explicit relational reasoning through a novel module called the Abstractor. At the core of the Abstractor is a variant of attention called relational cross-attention. The approach is motivated by an architectural inductive bias for relational learning that disentangles relational information from object-level features. This enables explicit relational reasoning, supporting abstraction and generalization from limited data. The Abstractor is first evaluated on simple discriminative relational tasks and compared to existing relational architectures. Next, the Abstractor is evaluated on purely relational sequence-to-sequence tasks, where dramatic improvements are seen in sample efficiency compared to standard Transformers. Finally, Abstractors are evaluated on a collection of tasks based on mathematical problem solving, where consistent improvements in performance and sample efficiency are observed.
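The abstract names relational cross-attention but does not spell out the mechanism. The sketch below is one reading of it, stated as an assumption rather than the authors' implementation: queries and keys are computed from the input objects, so the attention matrix encodes pairwise relations, while the values are input-independent symbols instead of object features. The function and variable names (`relational_cross_attention`, `symbols`, `w_q`, `w_k`), the softmax normalization, and the random weights standing in for learned parameters are all illustrative choices.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Numerically stable softmax applied to each row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def relational_cross_attention(objects, symbols, w_q, w_k):
    """Hypothetical single-head sketch.

    objects: n x d input object features
    symbols: n x d_s input-independent symbol vectors (assumed learned)
    w_q, w_k: d x d_k projection matrices (assumed learned)
    """
    queries = matmul(objects, w_q)
    keys = matmul(objects, w_k)
    scale = 1.0 / math.sqrt(len(w_q[0]))
    scores = [[v * scale for v in row] for row in matmul(queries, transpose(keys))]
    relations = softmax_rows(scores)  # n x n, computed from objects only
    # Values are the symbols, so object features reach the output
    # only through the relation weights.
    return matmul(relations, symbols), relations


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n, d, d_k, d_s = 4, 6, 3, 5
objects = random_matrix(n, d, rng)
symbols = random_matrix(n, d_s, rng)  # stand-in for learned symbols
w_q = random_matrix(d, d_k, rng)
w_k = random_matrix(d, d_k, rng)

abstract_states, relations = relational_cross_attention(objects, symbols, w_q, w_k)
```

In this sketch each output row is a relation-weighted mixture of the symbols, which is one way to picture the separation of relational information from object-level features that the abstract describes. In standard cross-attention the values would instead be projections of the attended inputs.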
| Original language | American English |
|---|---|
| State | Published - 2024 |
| Event | 12th International Conference on Learning Representations, ICLR 2024 - Hybrid, Vienna, Austria; Duration: May 7 2024 → May 11 2024 |
Conference
| Conference | 12th International Conference on Learning Representations, ICLR 2024 |
|---|---|
| Country/Territory | Austria |
| City | Hybrid, Vienna |
| Period | 5/7/24 → 5/11/24 |
ASJC Scopus subject areas
- Language and Linguistics
- Computer Science Applications
- Education
- Linguistics and Language
Paper title: Abstractors and Relational Cross-Attention: An Inductive Bias for Explicit Relational Reasoning in Transformers