2025-06-06

Title: DrSR: LLM based Scientific Equation Discovery with Dual Reasoning from Data and Experience

Title: Backbone Augmented Training for Adaptations

Title: Softlog-Softmax Layers and Divergences Contribute to a Computationally Dependable Ensemble Learning

Title: HuGeDiff: 3D Human Generation via Diffusion with Gaussian Splatting

Title: ReXVQA: A Large-scale Visual Question Answering Benchmark for Generalist Chest X-ray Understanding

Title: WorldPrediction: A Benchmark for High-level World Modeling and Long-horizon Procedural Planning

Title: Visualizing and Controlling Cortical Responses Using Voxel-Weighted Activation Maximization

Title: Is Perturbation-Based Image Protection Disruptive to Image Editing?

Title: HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation

Title: RETRO SYNFLOW: Discrete Flow Matching for Accurate and Diverse Single-Step Retrosynthesis

Title: AuthGuard: Generalizable Deepfake Detection via Language Guidance

Title: EECD-Net: Energy-Efficient Crack Detection with Spiking Neural Networks and Gated Attention

Title: NOBLE -- Neural Operator with Biologically-informed Latent Embeddings to Capture Experimental Variability in Biological Neuron Models

Title: Enhancing Frequency for Single Image Super-Resolution with Learnable Separable Kernels

Title: Follow-Your-Creation: Empowering 4D Creation through Video Inpainting

Title: Scaling Laws for Robust Comparison of Open Foundation Language-Vision Models and Datasets

Title: SmartAvatar: Text- and Image-Guided Human Avatar Generation with VLM AI Agents

Title: Exploring bidirectional bounds for minimax-training of Energy-based models

Title: Text-Aware Real-World Image Super-Resolution via Diffusion Model with Joint Segmentation Decoders

Title: Inference economics of language models

Title: FPSAttention: Training-Aware FP8 and Sparsity Co-Design for Fast Video Diffusion

Title: Gen-n-Val: Agentic Image Data Generation and Validation

Title: MARS: Radio Map Super-resolution and Reconstruction Method under Sparse Channel Measurements

Title: Explicit Density Approximation for Neural Implicit Samplers Using a Bernstein-Based Convex Divergence

Title: UNO: Unlearning via Orthogonalization in Generative models

Title: Towards Holistic Visual Quality Assessment of AI-Generated Videos: A LLM-Based Multi-Dimensional Evaluation Model

Title: SRD: Reinforcement-Learned Semantic Perturbation for Backdoor Defense in VLMs

Title: DualX-VSR: Dual Axial Spatial×Temporal Transformer for Real-World Video Super-Resolution without Motion Compensation

Title: OpenMaskDINO3D: Reasoning 3D Segmentation via Large Language Model

Title: Geological Field Restoration through the Lens of Image Inpainting

Title: Invisible Backdoor Triggers in Image Editing Model via Deep Watermarking

Title: Generating Synthetic Stereo Datasets using 3D Gaussian Splatting and Expert Knowledge Transfer

Title: Time-Lapse Video-Based Embryo Grading via Complementary Spatial-Temporal Pattern Mining

Title: Robustness as Architecture: Designing IQA Models to Withstand Adversarial Perturbations

Title: FEAT: Full-Dimensional Efficient Attention Transformer for Medical Video Generation

Title: Physical Annotation for Automated Optical Inspection: A Concept for In-Situ, Pointer-Based Training Data Generation

Title: SeedEdit 3.0: Fast and High-Quality Generative Image Editing

Title: Astraea: A GPU-Oriented Token-wise Acceleration Framework for Video Diffusion Transformers

Title: Privacy Amplification Through Synthetic Data: Insights from Linear Regression

Title: DIMCIM: A Quantitative Evaluation Framework for Default-mode Diversity and Generalization in Text-to-Image Generative Models

Title: Practical Manipulation Model for Robust Deepfake Detection

Title: Associative Memory and Generative Diffusion in the Zero-noise Limit

Title: OGGSplat: Open Gaussian Growing for Generalizable Reconstruction with Expanded Field-of-View

Title: Follow-Your-Motion: Video Motion Transfer via Efficient Spatial-Temporal Decoupled Finetuning

Title: Towards Vision-Language-Garment Models For Web Knowledge Garment Understanding and Generation

Title: DSG-World: Learning a 3D Gaussian World Model from Dual State Videos

Title: Evaluating Sparse Autoencoders: From Shallow Design to Matching Pursuit

Title: Aligning Latent Spaces with Flow Priors

Title: Conservative classifiers do consistently well with improving agents: characterizing statistical and online learning

Title: From Play to Replay: Composed Video Retrieval for Temporally Fine-Grained Videos

Title: How to Unlock Time Series Editing? Diffusion-Driven Approach with Multi-Grained Control

Title: Rectified Point Flow: Generic Point Cloud Pose Estimation

Title: Video World Models with Long-term Spatial Memory

Title: AliTok: Towards Sequence Modeling Alignment between Tokenizer and Autoregressive Model

Title: Power Law Guided Dynamic Sifting for Efficient Attention

Title: SeedVR2: One-Step Video Restoration via Diffusion Adversarial Post-Training

Title: Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos

Title: Learning normalized image densities via dual score matching

Title: LSM-2: Learning from Incomplete Wearable Sensor Data

Title: MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning

Title: Kinetics: Rethinking Test-Time Scaling Laws

Title: Direct Numerical Layout Generation for 3D Indoor Scene Synthesis via Spatial Reasoning

Title: ContentV: Efficient Training of Video Generation Models with Limited Compute

Title: SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs

Title: Inference-Time Hyper-Scaling with KV Cache Compression

Title: Contrastive Flow Matching