Transformers without Normalization
Proposes a novel approach to training Transformers without normalization layers, using a technique called Dynamic Tanh (DyT) which effectively replace...
see more
Find all the TopLanguage papers. Links to pdf, code repos and demos are provided.
Proposes a novel approach to training Transformers without normalization layers, using a technique called Dynamic Tanh (DyT) which effectively replace...
see more
Investigates the use of Sparse Autoencoders (SAEs) to enhance interpretability in Artificial Text Detection (ATD), focusing on feature extraction from...
see more
Introduces CoSTA*, a novel cost-sensitive toolpath agent for multi-turn image editing, combining large language models (LLMs) and A* search to optimiz...
see more
Investigates the understanding capabilities of large language models (LLMs) through a task called PHYSICO, designed to assess their comprehension of p...
see more
Introduces LLMVoX, a lightweight, autoregressive streaming text-to-speech (TTS) system designed to enhance speech-enabled dialogue systems without com...
see more
Presents STORM, a novel architecture for processing long video sequences in multimodal large language models (LLMs) by incorporating a temporal encode...
see more
MLGym is a novel framework and benchmark designed to evaluate and develop large language model (LLM) agents on diverse AI research tasks, providing a ...
see more
Introduces START, a novel long chain-of-thought (CoT) reasoning model that integrates external tools, specifically a Python interpreter, to enhance re...
see more
Introduces RuCCoD, a new dataset for automating ICD coding in Russian, and evaluates state-of-the-art models for this task. Using a combination of BER...
see more
Introduces EgoLife, an innovative project aimed at developing an egocentric AI life assistant using wearable AI-powered glasses. A comprehensive datas...
see more
EuroBERT is a new family of multilingual encoders designed to enhance general-purpose vector representations across European and widely spoken global ...
see more
Since the debut of Evolution Strategies (ES) as a tool for Reinforcement Learning by Salimans et al. 2017, there has been interest in determining the ...
see more