SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration
By Kate Martin
Posted on: November 20, 2024
Although quantization for linear layers has been widely used, its application to accelerating the attention process remains limited. SageAttention uses 8-bit matrix multiplication, 16-bit matrix multiplication with a 16-bit accumulator, and precision-enhancing methods, implementing an accurate and 2...
Contourlet Refinement Gate Framework for Thermal Spectrum Distribution Regularized Infrared Image Super-Resolution
By Kate Martin
Posted on: November 20, 2024
Image super-resolution (SR) is a classical yet still active low-level vision problem that aims to reconstruct high-resolution (HR) images from their low-resolution (LR) counterparts, serving as a key technique for image enhancement. Current approaches to address SR tasks, such as transformer-based a...
Motif Channel Opened in a White-Box: Stereo Matching via Motif Correlation Graph
By Kate Martin
Posted on: November 20, 2024
Real-world applications of stereo matching, such as autonomous driving, place stringent demands on both safety and accuracy. However, learning-based stereo matching methods inherently suffer from the loss of geometric structures in certain feature channels, creating a bottleneck in achieving precise...
The Dawn of GUI Agent: A Preliminary Case Study with Claude 3.5 Computer Use
By Javier Vásquez
Posted on: November 18, 2024
The recently released model, Claude 3.5 Computer Use, stands out as the first frontier AI model to offer computer use in public beta as a graphical user interface (GUI) agent. As an early beta, its capability in complex real-world environments remains unknown. In this case study to explore Claude...
FinRobot: AI Agent for Equity Research and Valuation with Large Language Models
By Javier Vásquez
Posted on: November 15, 2024
As financial markets grow increasingly complex, there is a rising need for automated tools that can effectively assist human analysts in equity research, particularly within sell-side research. While Generative AI (GenAI) has attracted significant attention in this field, existing AI solutions often...
A Short Note on Evaluating RepNet for Temporal Repetition Counting in Videos
By Naomi Wilson
Posted on: November 15, 2024
We discuss some consistent issues in how RepNet has been evaluated in various papers. To mitigate these issues, we report RepNet performance results on different datasets and release the evaluation code and the RepNet checkpoint used to obtain these results. Code URL: https://github.com/google-rese...
LHRS-Bot-Nova: Improved Multimodal Large Language Model for Remote Sensing Vision-Language Interpretation
By Kate Martin
Posted on: November 15, 2024
Automatically and rapidly understanding Earth's surface is fundamental to our grasp of the living environment and informed decision-making. This underscores the need for a unified system with comprehensive capabilities in analyzing Earth's surface to address a wide range of human needs. The emergenc...
Caravan MultiMet: Extending Caravan with Multiple Weather Nowcasts and Forecasts
By Kate Martin
Posted on: November 15, 2024
The Caravan large-sample hydrology dataset (Kratzert et al., 2023) was created to standardize and harmonize streamflow data from various regional datasets, combined with globally available meteorological forcing and catchment attributes. This community-driven project also allows researchers to conve...