FunctionChat-Bench: Comprehensive Evaluation of Language Models' Generative Capabilities in Korean Tool-use Dialogs

By Javier Vásquez

Posted on: November 22, 2024

This study investigates language models' generative capabilities in tool-use dialogs. We categorize the models' outputs in tool-use dialogs into four distinct types: Tool Call, Answer Completion, Slot Question, and Relevance Detection, which serve as aspects for evaluation. We introduce FunctionChat...

SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration

By Kate Martin

Posted on: November 20, 2024

Although quantization for linear layers has been widely used, its application to accelerate the attention process remains limited. SageAttention utilizes 8-bit matrix multiplication, 16-bit matrix multiplication with 16-bit accumulator, and precision-enhancing methods, implementing an accurate and 2...

Contourlet Refinement Gate Framework for Thermal Spectrum Distribution Regularized Infrared Image Super-Resolution

By Kate Martin

Posted on: November 20, 2024

Image super-resolution (SR) is a classical yet still active low-level vision problem that aims to reconstruct high-resolution (HR) images from their low-resolution (LR) counterparts, serving as a key technique for image enhancement. Current approaches to address SR tasks, such as transformer-based a...

Motif Channel Opened in a White-Box: Stereo Matching via Motif Correlation Graph

By Kate Martin

Posted on: November 20, 2024

Real-world applications of stereo matching, such as autonomous driving, place stringent demands on both safety and accuracy. However, learning-based stereo matching methods inherently suffer from the loss of geometric structures in certain feature channels, creating a bottleneck in achieving precise...

The Dawn of GUI Agent: A Preliminary Case Study with Claude 3.5 Computer Use

By Javier Vásquez

Posted on: November 18, 2024

The recently released model, Claude 3.5 Computer Use, stands out as the first frontier AI model to offer computer use in public beta as a graphical user interface (GUI) agent. As an early beta, its capability in the real-world complex environment remains unknown. In this case study to explore Claude...

FinRobot: AI Agent for Equity Research and Valuation with Large Language Models

By Javier Vásquez

Posted on: November 15, 2024

As financial markets grow increasingly complex, there is a rising need for automated tools that can effectively assist human analysts in equity research, particularly within sell-side research. While Generative AI (GenAI) has attracted significant attention in this field, existing AI solutions often...

A Short Note on Evaluating RepNet for Temporal Repetition Counting in Videos

By Naomi Wilson

Posted on: November 15, 2024

We discuss some consistent issues with how RepNet has been evaluated in various papers. As a way to mitigate these issues, we report RepNet performance results on different datasets, and release evaluation code and the RepNet checkpoint to obtain these results. Code URL: https://github.com/google-rese...

LHRS-Bot-Nova: Improved Multimodal Large Language Model for Remote Sensing Vision-Language Interpretation

By Kate Martin

Posted on: November 15, 2024

Automatically and rapidly understanding Earth's surface is fundamental to our grasp of the living environment and informed decision-making. This underscores the need for a unified system with comprehensive capabilities in analyzing Earth's surface to address a wide range of human needs. The emergenc...
