Research Posts

SG-Reg: Generalizable and Efficient Scene Graph Registration

Papers with Code

By Kate Martin

Posted on: April 23, 2025

This paper addresses the challenges of registering two rigid semantic scene graphs, an essential capability when an autonomous agent needs to register its map against a remote agent, or against a prior map. The hand-crafted descriptors in classical semantic-aided registration, or the ground-truth an...

UFO2: The Desktop AgentOS

Papers with Code

By Naomi Wilson

Posted on: April 23, 2025

Recent Computer-Using Agents (CUAs), powered by multimodal large language models (LLMs), offer a promising direction for automating complex desktop workflows through natural language. However, most existing CUAs remain conceptual prototypes, hindered by shallow OS integration, fragile screenshot-bas...

Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation

Papers with Code

By Javier Vásquez

Posted on: April 23, 2025

Camera and human motion controls have been extensively studied for video generation, but existing approaches typically address them separately, suffering from limited data with high-quality annotations for both aspects. To overcome this, we present Uni3C, a unified 3D-enhanced framework for precise ...

Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning

Papers with Code

By Naomi Wilson

Posted on: April 23, 2025

Process reward models (PRMs) have proven effective for test-time scaling of Large Language Models (LLMs) on challenging reasoning tasks. However, reward hacking issues with PRMs limit their successful application in reinforcement fine-tuning. In this paper, we identify the main cause of PRM-induced ...

EasyEdit2: An Easy-to-use Steering Framework for Editing Large Language Models

Papers with Code

By Kate Martin

Posted on: April 23, 2025

In this paper, we introduce EasyEdit2, a framework designed to enable plug-and-play adjustability for controlling Large Language Model (LLM) behaviors. EasyEdit2 supports a wide range of test-time interventions, including safety, sentiment, personality, reasoning patterns, factuality, and language f...

FlowReasoner: Reinforcing Query-Level Meta-Agents

Papers with Code

By Naomi Wilson

Posted on: April 23, 2025

This paper proposes a query-level meta-agent named FlowReasoner to automate the design of query-level multi-agent systems, i.e., one system per user query. Our core idea is to incentivize a reasoning-based meta-agent via external execution feedback. Concretely, by distilling DeepSeek R1, we first en...

CaPa: Carve-n-Paint Synthesis for Efficient 4K Textured Mesh Generation

Papers with Code

By Kate Martin

Posted on: January 20, 2025

The synthesis of high-quality 3D assets from textual or visual inputs has become a central objective in modern generative modeling. Despite the proliferation of 3D generation algorithms, they frequently grapple with challenges such as multi-view inconsistency, slow generation times, low fidelity, an...

FiLo++: Zero-/Few-Shot Anomaly Detection by Fused Fine-Grained Descriptions and Deformable Localization

Papers with Code

By Javier Vásquez

Posted on: January 20, 2025

Anomaly detection methods typically require extensive normal samples from the target class for training, limiting their applicability in scenarios that require rapid adaptation, such as cold start. Zero-shot and few-shot anomaly detection do not require labeled samples from the target class in advan...
