RMAAT - Bio-Inspired Transformers
Efficient long-context sequence processing using astrocyte-inspired memory and attention
RMAAT: Astrocyte-Inspired Memory Compression and Replay for Efficient Long-Context Transformers
Accepted at ICLR 2026
This research project introduces computational principles derived from astrocytes—glial cells critical for biological memory and synaptic modulation—to address the quadratic complexity bottleneck in Transformers.
Research Motivation
Standard Transformers scale poorly to long sequences because the cost of self-attention grows quadratically with sequence length in both time and memory. This project addresses that bottleneck by incorporating biological principles of attention and memory processing.
Core Innovations
🧠 Segment-Based Recurrent Processing: RMAAT processes input sequences in segments. Persistent memory tokens propagate contextual information across segments, maintaining a recurrent state.
🔄 Astrocyte-Inspired Retention Factor (LTP): Memory tokens are governed by an adaptive compression mechanism: a retention factor derived from simulated astrocyte long-term potentiation (LTP) decides which information to keep and which to discard.
⚡ Linear-Complexity Attention (STP): Within each segment, attention is computed using an efficient, linear-complexity mechanism inspired by astrocyte Short-Term Plasticity (STP), avoiding the O(N^2) cost of standard attention.
📚 Astrocytic Memory Replay Backpropagation (AMRB): A novel training algorithm designed for memory efficiency in recurrent networks.
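The segment-recurrent loop with a retention-gated memory bank can be sketched as a minimal NumPy toy. All names, dimensions, the mean-pooled "summary", and the fixed retention value below are illustrative stand-ins, not the paper's actual update rules:

```python
import numpy as np

def process_segments(segments, num_mem=4, d=16, retention=0.9, seed=0):
    """Toy segment-recurrent processing: a persistent bank of memory
    tokens is carried across segments, and a retention factor in [0, 1]
    (standing in for the LTP-derived gate) scales how much of the old
    memory survives each update."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((d, d)) / np.sqrt(d)  # toy projection weights
    mem = np.zeros((num_mem, d))                  # persistent memory tokens
    for seg in segments:                          # seg: (seg_len, d)
        # placeholder for within-segment attention: a mean-pooled summary
        summary = seg.mean(axis=0) @ W            # (d,)
        # blend old memory with new context; in this toy every memory
        # token receives the same update
        mem = retention * mem + (1.0 - retention) * summary
    return mem
```

A retention value near 1 keeps long-range context almost intact, while a smaller value lets recent segments dominate; the paper's contribution is making this trade-off adaptive rather than fixed.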
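A generic kernelized linear-attention routine illustrates how within-segment attention can avoid materializing the O(N^2) pairwise score matrix. The feature map and normalization below follow common linear-attention formulations and are an assumption here, not necessarily the STP-inspired mechanism used in RMAAT:

```python
import numpy as np

def linear_attention(Q, K, V, eps=1e-6):
    """Linear-complexity attention: with a positive feature map phi,
    softmax scores are replaced so that each output is
        phi(q_i) @ (sum_j phi(k_j) v_j^T) / (phi(q_i) @ sum_j phi(k_j)),
    costing O(N * d * d_v) instead of O(N^2 * d)."""
    def phi(x):                       # elu(x) + 1: a common positive feature map
        return np.where(x > 0, x + 1.0, np.exp(x))
    Qp, Kp = phi(Q), phi(K)
    KV = Kp.T @ V                     # (d, d_v): accumulated key-value products
    Z = Qp @ Kp.sum(axis=0) + eps     # (N,): per-query normalizers
    return (Qp @ KV) / Z[:, None]
```

Because keys and values are summed into a single (d, d_v) matrix before the queries touch them, the per-token cost no longer depends on sequence length.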
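AMRB's details are in the paper; the general idea of trading recomputation for memory in recurrent training can be sketched with a scalar toy in which the forward pass keeps only segment-boundary states and the backward pass replays each segment from its saved entry state. This is a checkpointing-style illustration, not the actual AMRB algorithm:

```python
import numpy as np

def train_step_replay(segments, w, lr=0.1):
    """Replay-based backprop sketch: store only the carried state at
    each segment boundary during the forward pass; during the backward
    pass, re-run ("replay") each segment from its saved entry state to
    recompute what the gradient needs. Scalar toy recurrence
    h' = tanh(w * mean(seg) + h); AMRB itself is not reproduced here."""
    # forward: keep only boundary states, not per-step activations
    states = [0.0]
    for seg in segments:
        states.append(np.tanh(w * seg.mean() + states[-1]))
    loss = 0.5 * states[-1] ** 2                 # toy loss on the final state
    # backward: replay segments in reverse from their stored entry states
    grad_h, grad_w = states[-1], 0.0
    for seg, h_in in zip(reversed(segments), reversed(states[:-1])):
        pre = w * seg.mean() + h_in              # recomputed pre-activation
        d_pre = grad_h * (1.0 - np.tanh(pre) ** 2)
        grad_w += d_pre * seg.mean()
        grad_h = d_pre                           # gradient flows to earlier state
    return w - lr * grad_w, loss
```

The memory saving comes from discarding within-segment activations after the forward pass: storage scales with the number of segments rather than the full sequence length, at the price of one extra forward pass per segment during training.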
Results
Evaluations on the Long Range Arena (LRA) benchmark show that RMAAT achieves:
- Competitive accuracy compared to standard Transformers.
- Substantial improvements in computational efficiency.
- Significant reduction in memory usage.
Research Team
Principal Investigator: Md Zesun Ahmed Mia
Collaborators:
- Malyaban Bal
- Abhronil Sengupta
Institution: Pennsylvania State University
Publication
Conference: International Conference on Learning Representations (ICLR) 2026
Future Directions
This research opens new avenues for:
- Further bio-inspired AI architectures
- Enhanced efficiency in large language models
- Applications in real-time sequence processing
- Integration with neuromorphic computing systems
The work contributes to the broader goal of developing more efficient and biologically plausible artificial intelligence systems.