Agent S: An Open Agentic Framework that Uses Computers Like a Human
By Naomi Wilson
Posted on: October 14, 2024
We present Agent S, an open agentic framework that enables autonomous interaction with computers through a Graphical User Interface (GUI), aimed at transforming human-computer interaction by automating complex, multi-step tasks. Agent S aims to address three key challenges in automating computer tas...
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration
By Javier Vásquez
Posted on: October 07, 2024
The transformer architecture predominates across various models. As the heart of the transformer, attention has a computational complexity of O(N^2), compared to O(N) for linear transformations. When handling large sequence lengths, attention becomes the primary time-consuming component. Although qu...
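The abstract breaks off as it turns to quantization, the paper's focus (8-bit attention). As a generic illustration of the underlying idea only — not SageAttention's actual scheme, and with function names that are assumptions — here is a symmetric per-tensor INT8 quantize/dequantize round trip of the kind low-bit attention methods build on:

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor quantization: map floats onto [-127, 127]."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float tensor from the INT8 codes."""
    return q.astype(np.float32) * scale

# Round-trip a random attention-score-like matrix; with symmetric
# rounding, the per-entry error is bounded by half a quantization step.
rng = np.random.default_rng(0)
scores = rng.standard_normal((8, 8)).astype(np.float32)
q, scale = quantize_int8(scores)
approx = dequantize_int8(q, scale)
max_err = np.abs(scores - approx).max()
```

Storing and multiplying INT8 codes instead of floats is what buys the speedup; the accuracy question the paper addresses is how to keep that rounding error from degrading attention outputs.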
The last decade of deep learning has brought increasingly capable systems that are deployed in a wide variety of applications. In natural language processing, the field has been transformed by a number of breakthroughs, including large language models, which are used in increasingly many user-faci...
Lightning UQ Box: A Comprehensive Framework for Uncertainty Quantification in Deep Learning
By Kate Martin
Posted on: October 07, 2024
Uncertainty quantification (UQ) is an essential tool for applying deep neural networks (DNNs) to real world tasks, as it attaches a degree of confidence to DNN outputs. However, despite its benefits, UQ is often left out of the standard DNN workflow due to the additional technical knowledge required...
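Deep ensembles are one of the standard UQ methods a framework like this typically covers: the spread across independently trained models serves as the confidence attached to a prediction. A minimal sketch, using toy linear "models" as stand-ins for trained networks (an assumption for illustration; the library supports many UQ methods beyond ensembles):

```python
import numpy as np

def ensemble_predict(models, x):
    """Predictive mean and standard deviation across ensemble members;
    the standard deviation acts as the uncertainty estimate."""
    preds = np.stack([m(x) for m in models])
    return preds.mean(axis=0), preds.std(axis=0)

# Toy ensemble: linear predictors with perturbed weights, standing in
# for networks trained from different random initializations.
rng = np.random.default_rng(0)
weights = [1.0 + 0.1 * rng.standard_normal() for _ in range(5)]
models = [lambda x, w=w: w * x for w in weights]

x = np.array([1.0, 2.0, 3.0])
mean, std = ensemble_predict(models, x)
```

Because each member here is `w * x`, the predictive spread grows with the magnitude of the input — the ensemble is least certain where the members disagree most.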
Exploring the Benefit of Activation Sparsity in Pre-training
By Kate Martin
Posted on: October 07, 2024
Pre-trained Transformers inherently possess the characteristic of sparse activation, where only a small fraction of the neurons are activated for each token. While sparse activation has been explored through post-training methods, its potential in pre-training remains untapped. In this work, we firs...
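The sparsity the abstract refers to is easy to quantify: after a ReLU-style nonlinearity, count the fraction of neurons whose output is exactly zero. A small sketch (the negative shift on the pre-activations is illustrative, not a property the paper claims):

```python
import numpy as np

def activation_sparsity(pre_acts: np.ndarray) -> float:
    """Fraction of entries zeroed by ReLU — a simple proxy for
    'only a small fraction of neurons activate per token'."""
    acts = np.maximum(pre_acts, 0.0)
    return float((acts == 0.0).mean())

rng = np.random.default_rng(0)
# Pre-activations with a negative shift: most units fall below zero,
# so the ReLU output is highly sparse.
pre = rng.standard_normal((32, 4096)) - 1.5
sparsity = activation_sparsity(pre)
print(sparsity)  # roughly 0.93
```

Post-training methods exploit this by skipping the zeroed neurons at inference time; the paper asks whether the same structure can be encouraged and used during pre-training itself.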
Choices are More Important than Efforts: LLM Enables Efficient Multi-Agent Exploration
By Naomi Wilson
Posted on: October 04, 2024
With expansive state-action spaces, efficient multi-agent exploration remains a longstanding challenge in reinforcement learning. Although pursuing novelty, diversity, or uncertainty attracts increasing attention, the redundant effort brought by exploration without properly guided choices poses a practica...
CoT-ST: Enhancing LLM-based Speech Translation with Multimodal Chain-of-Thought
By Javier Vásquez
Posted on: October 02, 2024
Speech Language Models (SLMs) have demonstrated impressive performance on speech translation tasks. However, existing research primarily focuses on direct instruction fine-tuning and often overlooks the inherent reasoning capabilities of SLMs. In this paper, we introduce a three-stage training frame...
Diffusion-based generative models have demonstrated powerful performance across various tasks, but this comes at the cost of slow sampling speed. To achieve both efficient and high-quality synthesis, various distillation-based accelerated sampling methods have been developed recently. Howeve...