2025-12-19

Title: LLaDA2.0: Scaling Up Diffusion Language Models to 100B

Title: A Unified Generative-Predictive Framework for Deterministic Inverse Design

Title: D3G: Diverse Demographic Data Generation Increases Zero-Shot Image Classification Accuracy within Multimodal Models

Title: ReactorFold: Generative discovery of nuclear reactor cores via emergent physical reasoning

Title: Cross-Sample Augmented Test-Time Adaptation for Personalized Intraoperative Hypotension Prediction

Title: Data-Chain Backdoor: Do You Trust Diffusion Models as Generative Data Supplier?

Title: TS-DP: Reinforcement Speculative Decoding For Temporal Adaptive Diffusion Policy Acceleration

Title: Two-Step Data Augmentation for Masked Face Detection and Recognition: Turning Fake Masks to Real

Title: Cybercrime and Computer Forensics in Epoch of Artificial Intelligence in India

Title: Seeing Beyond Words: Self-Supervised Visual Learning for Multimodal Large Language Models

Title: A Unification of Discrete, Gaussian, and Simplicial Diffusion

Title: DSO: Direct Steering Optimization for Bias Mitigation

Title: BarcodeMamba+: Advancing State-Space Models for Fungal Biodiversity Research

Title: In-Context Semi-Supervised Learning

Title: The Perceptual Observatory Characterizing Robustness and Grounding in MLLMs

Title: Are vision-language models ready to zero-shot replace supervised classification models in agriculture?

Title: CoVAR: Co-generation of Video and Action for Robotic Manipulation via Multi-Modal Diffusion

Title: Explainable AI in Big Data Fraud Detection

Title: ContextLeak: Auditing Leakage in Private In-Context Learning Methods

Title: In-Context Multi-Operator Learning with DeepOSets

Title: FOD-Diff: 3D Multi-Channel Patch Diffusion Model for Fiber Orientation Distribution

Title: TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times

Title: SegGraph: Leveraging Graphs of SAM Segments for Few-Shot 3D Part Segmentation

Title: C-DGPA: Class-Centric Dual-Alignment Generative Prompt Adaptation

Title: Learning High-Quality Initial Noise for Single-View Synthesis with Diffusion Models

Title: LoPA: Scaling dLLM Inference via Lookahead Parallel Decoding

Title: Sigma-MoE-Tiny Technical Report

Title: Pixel Super-Resolved Fluorescence Lifetime Imaging Using Deep Learning

Title: TextEditBench: Evaluating Reasoning-aware Text Editing Beyond Rendering

Title: GFLAN: Generative Functional Layouts

Title: In-Context Probing for Membership Inference in Fine-Tuned Language Models

Title: PixelArena: A benchmark for Pixel-Precision Visual Intelligence

Title: Pretrained Battery Transformer (PBT): A battery life prediction foundation model

Title: GMODiff: One-Step Gain Map Refinement with Diffusion Priors for HDR Reconstruction

Title: Factorized Video Generation: Decoupling Scene Construction and Temporal Synthesis in Text-to-Video Diffusion Models

Title: Hearing to Translate: The Effectiveness of Speech Modality Integration into LLMs

Title: CountZES: Counting via Zero-Shot Exemplar Selection

Title: Multi-Fidelity Delayed Acceptance: hierarchical MCMC sampling for Bayesian inverse problems combining multiple solvers through deep neural networks

Title: Geometric Disentanglement of Text Embeddings for Subject-Consistent Text-to-Image Generation using A Single Prompt

Title: Prime and Reach: Synthesising Body Motion for Gaze-Primed Object Reach

Title: Skeleton-Snippet Contrastive Learning with Multiscale Feature Fusion for Action Localization

Title: Causal-Tune: Mining Causal Factors from Vision Foundation Models for Domain Generalized Semantic Segmentation

Title: Abacus: Self-Supervised Event Counting-Aligned Distributional Pretraining for Sequential User Modeling

Title: Yuan-TecSwin: A text conditioned Diffusion model with Swin-transformer blocks

Title: Trainable Log-linear Sparse Attention for Efficient Diffusion Transformers

Title: DeContext as Defense: Safe Image Editing in Diffusion Transformers

Title: SARMAE: Masked Autoencoder for SAR Representation Learning

Title: REGLUE Your Latents with Global and Local Semantics for Entangled Diffusion

Title: FrameDiffuser: G-Buffer-Conditioned Diffusion for Neural Forward Frame Rendering

Title: Detecting Localized Deepfakes: How Well Do Synthetic Image Detectors Handle Inpainting?

Title: OMG-Bench: A New Challenging Benchmark for Skeleton-based Online Micro Hand Gesture Recognition

Title: Task-Oriented Data Synthesis and Control-Rectify Sampling for Remote Sensing Semantic Segmentation

Title: NRGPT: An Energy-based Alternative for GPT

Title: FlowDet: Unifying Object Detection and Generative Transport Flows

Title: Kling-Omni Technical Report

Title: Radiology Report Generation with Layer-Wise Anatomical Attention

Title: Meta-RL Induces Exploration in Language Agents

Title: RePlan: Reasoning-guided Region Planning for Complex Instruction-based Image Editing

Title: Instant Expressive Gaussian Head Avatar via 3D-Aware Expression Distillation

Title: FlashPortrait: 6x Faster Infinite Portrait Animation with Adaptive Latent Prediction

Title: In-Context Algebra

Title: Alchemist: Unlocking Efficiency in Text-to-Image Model Training via Meta-Gradient Data Selection

Title: VIVA: VLM-Guided Instruction-Based Video Editing with Reward Optimization

Title: SFTok: Bridging the Performance Gap in Discrete Tokenizers

Title: Posterior Behavioral Cloning: Pretraining BC Policies for Efficient RL Finetuning

Title: Depth Any Panoramas: A Foundation Model for Panoramic Depth Estimation

Title: StereoPilot: Learning Unified and Efficient Stereo Conversion via Generative Priors

Title: Next-Embedding Prediction Makes Strong Vision Learners

Title: Generative Refocusing: Flexible Defocus Control from a Single Image