2025-12-19

Title: A Unified Generative-Predictive Framework for Deterministic Inverse Design

Title: D3G: Diverse Demographic Data Generation Increases Zero-Shot Image Classification Accuracy within Multimodal Models

Title: GLOW: Graph-Language Co-Reasoning for Agentic Workflow Performance Prediction

Title: TAO-Net: Two-stage Adaptive OOD Classification Network for Fine-grained Encrypted Traffic Classification

Title: ReactorFold: Generative discovery of nuclear reactor cores via emergent physical reasoning

Title: Two-Step Data Augmentation for Masked Face Detection and Recognition: Turning Fake Masks to Real

Title: A Unification of Discrete, Gaussian, and Simplicial Diffusion

Title: DSO: Direct Steering Optimization for Bias Mitigation

Title: R4: Retrieval-Augmented Reasoning for Vision-Language Models in 4D Spatio-Temporal Space

Title: AIE4ML: An End-to-End Framework for Compiling Neural Networks for the Next Generation of AMD AI Engines

Title: CoVAR: Co-generation of Video and Action for Robotic Manipulation via Multi-Modal Diffusion

Title: Auto-Vocabulary 3D Object Detection

Title: TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times

Title: C-DGPA: Class-Centric Dual-Alignment Generative Prompt Adaptation

Title: Visual Alignment of Medical Vision-Language Models for Grounded Radiology Report Generation

Title: Learning High-Quality Initial Noise for Single-View Synthesis with Diffusion Models

Title: ARMFlow: AutoRegressive MeanFlow for Online 3D Human Reaction Generation

Title: Coarse-to-Fine Open-Set Graph Node Classification with Large Language Models

Title: Pixel Super-Resolved Fluorescence Lifetime Imaging Using Deep Learning

Title: TextEditBench: Evaluating Reasoning-aware Text Editing Beyond Rendering

Title: GFLAN: Generative Functional Layouts

Title: PixelArena: A benchmark for Pixel-Precision Visual Intelligence

Title: LaverNet: Lightweight All-in-one Video Restoration via Selective Propagation

Title: GMODiff: One-Step Gain Map Refinement with Diffusion Priors for HDR Reconstruction

Title: Factorized Video Generation: Decoupling Scene Construction and Temporal Synthesis in Text-to-Video Diffusion Models

Title: Geometric Disentanglement of Text Embeddings for Subject-Consistent Text-to-Image Generation using A Single Prompt

Title: Prime and Reach: Synthesising Body Motion for Gaze-Primed Object Reach

Title: StageVAR: Stage-Aware Acceleration for Visual Autoregressive Models

Title: Guiding Perception-Reasoning Closer to Human in Blind Image Quality Assessment

Title: Yuan-TecSwin: A text conditioned Diffusion model with Swin-transformer blocks

Title: Trainable Log-linear Sparse Attention for Efficient Diffusion Transformers

Title: DeContext as Defense: Safe Image Editing in Diffusion Transformers

Title: FrameDiffuser: G-Buffer-Conditioned Diffusion for Neural Forward Frame Rendering

Title: DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI

Title: Detecting Localized Deepfakes: How Well Do Synthetic Image Detectors Handle Inpainting?

Title: Task-Oriented Data Synthesis and Control-Rectify Sampling for Remote Sensing Semantic Segmentation

Title: NRGPT: An Energy-based Alternative for GPT

Title: Make-It-Poseable: Feed-forward Latent Posing Model for 3D Humanoid Character Animation

Title: FlowDet: Unifying Object Detection and Generative Transport Flows

Title: Kling-Omni Technical Report

Title: DenseBEV: Transforming BEV Grid Cells into 3D Objects

Title: MEPIC: Memory Efficient Position Independent Caching for LLM Serving

Title: Next-Generation License Plate Detection and Recognition System using YOLOv8

Title: Radiology Report Generation with Layer-Wise Anatomical Attention

Title: LinkedOut: Linking World Knowledge Representation Out of Video LLM for Next-Generation Video Recommendation

Title: Alchemist: Unlocking Efficiency in Text-to-Image Model Training via Meta-Gradient Data Selection

Title: Flowing from Reasoning to Motion: Learning 3D Hand Trajectory Prediction from Egocentric Human Interaction Videos

Title: SFTok: Bridging the Performance Gap in Discrete Tokenizers

Title: Posterior Behavioral Cloning: Pretraining BC Policies for Efficient RL Finetuning

Title: StereoPilot: Learning Unified and Efficient Stereo Conversion via Generative Priors

Title: Next-Embedding Prediction Makes Strong Vision Learners

Title: Generative Refocusing: Flexible Defocus Control from a Single Image

Title: The World is Your Canvas: Painting Promptable Events with Reference Images, Trajectories, and Text