The Dawn of GUI Agent: A Preliminary Case Study with Claude 3.5 Computer Use
By Javier Vásquez
Posted on: November 18, 2024
The recently released Claude 3.5 Computer Use stands out as the first frontier AI model to offer computer use in public beta as a graphical user interface (GUI) agent. Because it is an early beta, its capabilities in complex real-world environments remain unknown. In this case study to explore Claude...
FinRobot: AI Agent for Equity Research and Valuation with Large Language Models
By Javier Vásquez
Posted on: November 15, 2024
As financial markets grow increasingly complex, there is a rising need for automated tools that can effectively assist human analysts in equity research, particularly within sell-side research. While Generative AI (GenAI) has attracted significant attention in this field, existing AI solutions often...
A Short Note on Evaluating RepNet for Temporal Repetition Counting in Videos
By Naomi Wilson
Posted on: November 15, 2024
We discuss some consistent issues in how RepNet has been evaluated in various papers. To mitigate these issues, we report RepNet performance results on different datasets, and release evaluation code and the RepNet checkpoint to obtain these results. Code URL: https://github.com/google-rese...
LHRS-Bot-Nova: Improved Multimodal Large Language Model for Remote Sensing Vision-Language Interpretation
By Kate Martin
Posted on: November 15, 2024
Automatically and rapidly understanding Earth's surface is fundamental to our grasp of the living environment and informed decision-making. This underscores the need for a unified system with comprehensive capabilities in analyzing Earth's surface to address a wide range of human needs. The emergenc...
Caravan MultiMet: Extending Caravan with Multiple Weather Nowcasts and Forecasts
By Kate Martin
Posted on: November 15, 2024
The Caravan large-sample hydrology dataset (Kratzert et al., 2023) was created to standardize and harmonize streamflow data from various regional datasets, combined with globally available meteorological forcing and catchment attributes. This community-driven project also allows researchers to conve...
Scaling Mesh Generation via Compressive Tokenization
By Kate Martin
Posted on: November 13, 2024
We propose a compressive yet effective mesh representation, Blocked and Patchified Tokenization (BPT), facilitating the generation of meshes exceeding 8k faces. BPT compresses mesh sequences by employing block-wise indexing and patch aggregation, reducing their length by approximately 75% compared ...
The Surprising Effectiveness of Test-Time Training for Abstract Reasoning
By Naomi Wilson
Posted on: November 13, 2024
Language models have shown impressive performance on tasks within their training distribution, but often struggle with novel problems requiring complex reasoning. We investigate the effectiveness of test-time training (TTT) -- updating model parameters temporarily during inference using a loss deriv...
CDXFormer: Boosting Remote Sensing Change Detection with Extended Long Short-Term Memory
By Kate Martin
Posted on: November 13, 2024
In complex scenes and varied conditions, effectively integrating spatial-temporal context is crucial for accurately identifying changes. However, current remote sensing change detection (RS-CD) methods lack a balanced consideration of performance and efficiency. CNNs lack global context, Transformers have quadratic computational com...