Research Posts

SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration

Papers with Code
By Kate Martin

Posted on: November 20, 2024

Although quantization for linear layers has been widely used, its application to accelerate the attention process remains limited. SageAttention utilizes 8-bit matrix multiplication, 16-bit matrix multiplication with 16-bit accumulator, and precision-enhancing methods, implementing an accurate and 2...

Contourlet Refinement Gate Framework for Thermal Spectrum Distribution Regularized Infrared Image Super-Resolution

Papers with Code
By Kate Martin

Posted on: November 20, 2024

Image super-resolution (SR) is a classical yet still active low-level vision problem that aims to reconstruct high-resolution (HR) images from their low-resolution (LR) counterparts, serving as a key technique for image enhancement. Current approaches to address SR tasks, such as transformer-based a...

Motif Channel Opened in a White-Box: Stereo Matching via Motif Correlation Graph

Papers with Code
By Kate Martin

Posted on: November 20, 2024

Real-world applications of stereo matching, such as autonomous driving, place stringent demands on both safety and accuracy. However, learning-based stereo matching methods inherently suffer from the loss of geometric structures in certain feature channels, creating a bottleneck in achieving precise...

The Dawn of GUI Agent: A Preliminary Case Study with Claude 3.5 Computer Use

Papers with Code
By Javier Vásquez

Posted on: November 18, 2024

The recently released model, Claude 3.5 Computer Use, stands out as the first frontier AI model to offer computer use in public beta as a graphical user interface (GUI) agent. As an early beta, its capability in the real-world complex environment remains unknown. In this case study to explore Claude...

FinRobot: AI Agent for Equity Research and Valuation with Large Language Models

Papers with Code
By Javier Vásquez

Posted on: November 15, 2024

As financial markets grow increasingly complex, there is a rising need for automated tools that can effectively assist human analysts in equity research, particularly within sell-side research. While Generative AI (GenAI) has attracted significant attention in this field, existing AI solutions often...

A Short Note on Evaluating RepNet for Temporal Repetition Counting in Videos

Papers with Code
By Naomi Wilson

Posted on: November 15, 2024

We discuss some consistent issues on how RepNet has been evaluated in various papers. As a way to mitigate these issues, we report RepNet performance results on different datasets, and release evaluation code and the RepNet checkpoint to obtain these results. Code URL: https://github.com/google-rese...

LHRS-Bot-Nova: Improved Multimodal Large Language Model for Remote Sensing Vision-Language Interpretation

Papers with Code
By Kate Martin

Posted on: November 15, 2024

Automatically and rapidly understanding Earth's surface is fundamental to our grasp of the living environment and informed decision-making. This underscores the need for a unified system with comprehensive capabilities in analyzing Earth's surface to address a wide range of human needs. The emergenc...

Caravan MultiMet: Extending Caravan with Multiple Weather Nowcasts and Forecasts

Papers with Code
By Kate Martin

Posted on: November 15, 2024

The Caravan large-sample hydrology dataset (Kratzert et al., 2023) was created to standardize and harmonize streamflow data from various regional datasets, combined with globally available meteorological forcing and catchment attributes. This community-driven project also allows researchers to conve...