Summary of Key Papers from Carnegie Mellon University at ICLR 2023
1. Mamba-3: Improved Sequence Modeling using State Space Principles
- Authors: Aakash Sunil Lahoti, Kevin Li, Berlin Chen, Caitlin Wang, Aviv Bick, Zico Kolter, Tri Dao, Albert Gu
- Abstract: Mamba-3 is a new state-space model that improves inference efficiency and quality for sequence modeling. It tackles the challenge of tracking long-range information by refining the recurrent state update and adopting a multi-input, multi-output (MIMO) design that boosts accuracy without slowing generation.
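To make the "multi-input, multi-output" idea concrete, here is a minimal sketch of a generic linear state-space recurrence with matrix-valued input and output maps, so several channels share one recurrent state. The dimensions and matrices are illustrative assumptions, not the paper's actual architecture or parameterization:

```python
import numpy as np

# Hypothetical toy dimensions; not Mamba-3's actual configuration.
d_state, d_in, d_out, seq_len = 4, 2, 2, 6
rng = np.random.default_rng(0)

# MIMO state-space layer: matrix-valued B and C let multiple input/output
# channels drive and read a single shared recurrent state, unlike a
# single-input single-output scan per channel.
A = 0.9 * np.eye(d_state)            # state transition (stable: eigenvalues < 1)
B = rng.normal(size=(d_state, d_in)) # input map
C = rng.normal(size=(d_out, d_state))# output map

x = rng.normal(size=(seq_len, d_in))
h = np.zeros(d_state)
ys = []
for t in range(seq_len):
    h = A @ h + B @ x[t]             # recurrent state update
    ys.append(C @ h)                 # readout
y = np.stack(ys)
print(y.shape)  # (6, 2)
```

The recurrence carries information forward through `h`, which is what lets such models track long-range dependencies at constant cost per generated token.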
2. Overcoming Joint Intractability with Lossless Hierarchical Speculative Decoding
- Authors: Yuxuan Zhou, Fei Huang, Heng Li, Fengyi Wu, Tianyu Wang, Jianwei Zhang, Junyang Lin, Zhi-Qi Cheng
- Abstract: Hierarchical Speculative Decoding (HSD) is introduced to speed up large language model inference by improving the verification step in speculative decoding while preserving exact output distributions. It redistributes probability mass across branches, enabling more tokens to be accepted at once and achieving state-of-the-art efficiency.
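For context, here is a minimal sketch of the standard lossless speculative-sampling verification rule that HSD builds on: accept a drafted token with probability min(1, p/q), and on rejection resample from the renormalized residual max(0, p − q), so the output is distributed exactly according to the target model. The distributions below are illustrative; HSD's hierarchical redistribution across branches is not shown:

```python
import numpy as np

rng = np.random.default_rng(0)

def verify_token(p, q, token, rng):
    """Lossless verification of one drafted token.

    p: target-model distribution, q: draft-model distribution (both over the
    vocabulary). Accepts with probability min(1, p[token]/q[token]); on
    rejection, samples from the residual max(0, p - q), renormalized.
    The returned token is distributed exactly according to p.
    """
    if rng.random() < min(1.0, p[token] / q[token]):
        return token
    residual = np.maximum(p - q, 0.0)
    residual /= residual.sum()
    return int(rng.choice(len(p), p=residual))

# Tiny 3-token vocabulary (made-up numbers, not from the paper).
p = np.array([0.5, 0.3, 0.2])   # target distribution
q = np.array([0.7, 0.2, 0.1])   # draft distribution
draft = int(rng.choice(3, p=q)) # draft model proposes a token
out = verify_token(p, q, draft, rng)
```

The "lossless" guarantee is that, averaged over the draft and the accept/reject randomness, the emitted token follows p exactly; speedup comes from verifying several drafted tokens per target-model forward pass.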
3. Distributional Equ
Read the full article at Machine Learning Blog | ML@CMU | Carnegie Mellon University