Research Posts

Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

Papers with Code

By Naomi Wilson

Posted on: November 06, 2024

In this paper, we introduce Hunyuan-Large, which is currently the largest open-source Transformer-based mixture-of-experts model, with a total of 389 billion parameters and 52 billion activated parameters, capable of handling up to 256K tokens. We conduct a thorough evaluation of Hunyuan-Large's su...

xDiT: an Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

Papers with Code

By Kate Martin

Posted on: November 06, 2024

Diffusion models are pivotal for generating high-quality images and videos. Inspired by the success of OpenAI's Sora, the backbone of diffusion models is evolving from U-Net to Transformer, known as Diffusion Transformers (DiTs). However, generating high-quality content necessitates longer sequence ...

No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images

Papers with Code

By Javier Vásquez

Posted on: November 04, 2024

We introduce NoPoSplat, a feed-forward model capable of reconstructing 3D scenes parameterized by 3D Gaussians from unposed sparse multi-view images. Our model, trained exclusively with photometric loss, achieves real-time 3D Gaussian reconstruction during inference. To eliminate the need f...

Randomized Autoregressive Visual Generation

Papers with Code

By Javier Vásquez

Posted on: November 04, 2024

This paper presents Randomized AutoRegressive modeling (RAR) for visual generation, which sets a new state-of-the-art performance on the image generation task while maintaining full compatibility with language modeling frameworks. The proposed RAR is simple: during a standard autoregressive training...

CALE: Continuous Arcade Learning Environment

Papers with Code

By Naomi Wilson

Posted on: November 01, 2024

We introduce the Continuous Arcade Learning Environment (CALE), an extension of the well-known Arcade Learning Environment (ALE) [Bellemare et al., 2013]. The CALE uses the same underlying emulator of the Atari 2600 gaming system (Stella), but adds support for continuous actions. This enables the be...

A Joint Representation Using Continuous and Discrete Features for Cardiovascular Diseases Risk Prediction on Chest CT Scans

Papers with Code

By Naomi Wilson

Posted on: October 28, 2024

Cardiovascular diseases (CVD) remain a leading health concern and contribute significantly to global mortality rates. While clinical advancements have led to a decline in CVD mortality, accurately identifying individuals who could benefit from preventive interventions remains an unsolved challenge i...

SMITE: Segment Me In TimE

Papers with Code

By Javier Vásquez

Posted on: October 28, 2024

Segmenting an object in a video presents significant challenges. Each pixel must be accurately labelled, and these labels must remain consistent across frames. The difficulty increases when the segmentation is with arbitrary granularity, meaning the number of segments can vary arbitrarily, and masks...

CoqPilot, a plugin for LLM-based generation of proofs

Papers with Code

By Naomi Wilson

Posted on: October 28, 2024

We present CoqPilot, a VS Code extension designed to help automate writing of Coq proofs. The plugin collects the parts of proofs marked with the admit tactic in a Coq file, i.e., proof holes, and combines LLMs along with non-machine-learning methods to generate proof candidates for the holes. Then,...
