• arxiv preprint - Learning to (Learn at Test Time): RNNs with Expressive Hidden States

  • Jul 12 2024
  • Duration: 5 min
  • Podcast


  • Summary

  • In this episode, we discuss "Learning to (Learn at Test Time): RNNs with Expressive Hidden States" by Yu Sun, Xinhao Li, Karan Dalal, Jiarui Xu, Arjun Vikram, Genghan Zhang, Yann Dubois, Xinlei Chen, Xiaolong Wang, Sanmi Koyejo, Tatsunori Hashimoto, and Carlos Guestrin. The paper introduces Test-Time Training (TTT) layers, a new type of sequence modeling layer that combines the efficiency of RNNs with the long-context performance of self-attention. In a TTT layer, the hidden state is itself a machine learning model, and it is updated by steps of self-supervised learning, so it keeps training even on test sequences. The two proposed instantiations, TTT-Linear and TTT-MLP, match or outperform both a strong Transformer and Mamba, a modern RNN, and TTT-Linear is also more efficient in certain long-context scenarios.
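
The idea of a hidden state that is itself a model can be illustrated with a small sketch. This is a toy illustration under stated assumptions, not the authors' implementation: the hidden state is the weight matrix `W` of a linear model, and each incoming token triggers one gradient step on a self-supervised loss before the output is produced. The reconstruction loss `0.5 * ||W x - x||^2`, the learning rate `lr`, and the names `matvec` and `ttt_linear_sketch` are all hypothetical choices made for this example.

```python
def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def ttt_linear_sketch(tokens, lr=0.5):
    """Toy TTT-style layer: the hidden state is a linear model's weights.

    Assumption for illustration: the self-supervised task is to
    reconstruct each token from itself, with loss 0.5 * ||W x - x||^2.
    """
    dim = len(tokens[0])
    # Hidden state: a dim x dim weight matrix, fixed size for any sequence length.
    W = [[0.0] * dim for _ in range(dim)]
    outputs = []
    for x in tokens:
        # Forward pass of the inner model on the current token.
        pred = matvec(W, x)
        err = [p - t for p, t in zip(pred, x)]
        # One gradient step: the gradient of the loss w.r.t. W is outer(err, x).
        for i in range(dim):
            for j in range(dim):
                W[i][j] -= lr * err[i] * x[j]
        # The layer's output uses the freshly updated hidden state.
        outputs.append(matvec(W, x))
    return outputs, W


if __name__ == "__main__":
    outs, W = ttt_linear_sketch([[1.0, 0.0]] * 3)
    print(outs)  # reconstruction of the repeated token improves step by step
```

In this sketch the state `W` stays the same size however long the sequence is, which is the RNN-like property, while the update rule is a learning step rather than a fixed recurrence. Replacing the linear model with a small MLP would give the analogue of the TTT-MLP variant mentioned in the summary.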
