2025-07-08

Title: Advancing Talking Head Generation: A Comprehensive Survey of Multi-Modal Methodologies, Datasets, Evaluation Metrics, and Loss Functions

Title: Controllable diffusion-based generation for multi-channel biological data

Title: Efficient Certified Reasoning for Binarized Neural Networks

Title: Large Language Model Agent for Modular Task Execution in Drug Discovery

Title: GameTileNet: A Semantic Dataset for Low-Resolution Game Art in Procedural Content Generation

Title: Iterative Zoom-In: Temporal Interval Exploration for Long Video Understanding

Title: CS-VLM: Compressed Sensing Attention for Efficient Vision-Language Representation Learning

Title: Concept-based Adversarial Attack: a Probabilistic Perspective

Title: Mimesis, Poiesis, and Imagination: Exploring Text-to-Image Generation of Biblical Narratives

Title: InvisibleInk: High-Utility and Low-Cost Text Generation with Differential Privacy

Title: Introducing Answered with Evidence -- a framework for evaluating whether LLM responses to biomedical questions are founded in evidence

Title: FreqCross: A Multi-Modal Frequency-Spatial Fusion Network for Robust Detection of Stable Diffusion 3.5 Generated Images

Title: Rethinking Data Protection in the (Generative) Artificial Intelligence Era

Title: Cycle-Consistent Helmholtz Machine: Goal-Seeded Simulation via Inverted Inference

Title: SymMatika: Structure-Aware Symbolic Discovery

Title: BLaST: High Performance Inference and Pretraining using BLock Sparse Transformers

Title: HGCA: Hybrid GPU-CPU Attention for Long Context LLM Inference

Title: Latent Thermodynamic Flows: Unified Representation Learning and Generative Modeling of Temperature-Dependent Behaviors from Limited Data

Title: LACONIC: A 3D Layout Adapter for Controllable Image Creation

Title: Dual-frequency Selected Knowledge Distillation with Statistical-based Sample Rectification for PolSAR Image Classification

Title: ConceptMix++: Leveling the Playing Field in Text-to-Image Benchmarking via Iterative Prompt Optimization

Title: Global Variational Inference Enhanced Robust Domain Adaptation

Title: CPKD: Clinical Prior Knowledge-Constrained Diffusion Models for Surgical Phase Recognition in Endoscopic Submucosal Dissection

Title: Personalized Image Generation from an Author Writing Style

Title: Mirror in the Model: Ad Banner Image Generation via Reflective Multi-LLM and Multi-modal Agents

Title: Task-Specific Generative Dataset Distillation with Difficulty-Guided Sampling

Title: Masked Temporal Interpolation Diffusion for Procedure Planning in Instructional Videos

Title: Pose-Star: Anatomy-Aware Editing for Open-World Fashion Images

Title: Reinforcement Learning-based Feature Generation Algorithm for Scientific Data

Title: Generating Synthetic Relational Tabular Data via Structural Causal Models

Title: Beyond Accuracy: Metrics that Uncover What Makes a 'Good' Visual Descriptor

Title: Kinetic Langevin Diffusion for Crystalline Materials Generation

Title: Scientific Machine Learning of Chaotic Systems Discovers Governing Equations for Neural Populations

Title: Plugging Attention into Power Grids: Towards Transparent Forecasting

Title: FAROS: Fair Graph Generation via Attribute Switching Mechanisms

Title: Flow-Anchored Consistency Models

Title: ChestGPT: Integrating Large Language Models and Vision Transformers for Disease Detection and Localization in Chest X-Rays

Title: StreamDiT: Real-Time Streaming Text-to-Video Generation

Title: GenAI-Powered Inference

Title: Transformer Model for Alzheimer's Disease Progression Prediction Using Longitudinal Visit Sequences

Title: Taming Anomalies with Down-Up Sampling Networks: Group Center Preserving Reconstruction for 3D Anomaly Detection

Title: EchoMimicV3: 1.3B Parameters are All You Need for Unified Multi-Modal and Multi-Task Human Animation

Title: Bridging Vision and Language: Optimal Transport-Driven Radiology Report Generation via LLMs

Title: Return of the Latent Space COWBOYS: Re-thinking the use of VAEs for Bayesian Optimisation of Structured Spaces

Title: DNF-Intrinsic: Deterministic Noise-Free Diffusion for Indoor Inverse Rendering

Title: Evaluating Adversarial Protections for Diffusion Personalization: A Comprehensive Study

Title: Robust Low-light Scene Restoration via Illumination Transition

Title: LEHA-CVQAD: Dataset To Enable Generalized Video Quality Assessment of Compression Artifacts

Title: NRSeg: Noise-Resilient Learning for BEV Semantic Segmentation via Driving World Models

Title: PresentAgent: Multimodal Agent for Presentation Video Generation

Title: Breaking Imitation Bottlenecks: Reinforced Diffusion Powers Diverse Trajectory Generation

Title: Generate, Refine, and Encode: Leveraging Synthesized Novel Samples for On-the-Fly Fine-Grained Category Discovery

Title: PromptSR: Cascade Prompting for Lightweight Image Super-Resolution

Title: Unlocking Compositional Control: Self-Supervision for LVLM-Based Image Generation

Title: LVLM-Composer's Explicit Planning for Image Generation

Title: Voyaging into Unbounded Dynamic Scenes from a Single View

Title: An explicit formulation of the learned noise predictor $ε_θ({\bf x}_t, t)$ via the forward-process noise $ε_{t}$ in denoising diffusion probabilistic models (DDPMs)

Title: Quick Bypass Mechanism of Zero-Shot Diffusion-Based Image Restoration

Title: DreamPoster: A Unified Framework for Image-Conditioned Generative Poster Design

Title: Model Collapse Is Not a Bug but a Feature in Machine Unlearning for LLMs

Title: Zero-Shot Cyclic Peptide Design with Composable Geometric Conditions

Title: MoReMouse: Monocular Reconstruction of Laboratory Mouse

Title: An Explainable Transformer Model for Alzheimer's Disease Detection Using Retinal Imaging

Title: Towards Lightest Low-Light Image Enhancement Architecture for Mobile Devices

Title: SeqTex: Generate Mesh Textures in Video Sequence

Title: MPQ-DMv2: Flexible Residual Mixed Precision Quantization for Low-Bit Diffusion Models with Temporal Distillation

Title: Multi-Modal Semantic Parsing for the Interpretation of Tombstone Inscriptions

Title: Sat2City: 3D City Generation from A Single Satellite Image with Cascaded Latent Diffusion

Title: Multimedia Verification Through Multi-Agent Deep Research Multimodal Large Language Models

Title: Tail-aware Adversarial Attacks: A Distributional Approach to Efficient LLM Jailbreaking

Title: DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge

Title: CoT-lized Diffusion: Let's Reinforce T2I Generation Step-by-step

Title: Source Attribution in Retrieval-Augmented Generation

Title: A Training-Free Style-Personalization via Scale-wise Autoregressive Model

Title: Grounded Gesture Generation: Language, Motion, and Space

Title: MambaVideo for Discrete Video Tokenization with Channel-Split Quantization

Title: S$^2$Edit: Text-Guided Image Editing with Precise Semantic and Spatial Control

Title: VLM2Vec-V2: Advancing Multimodal Embedding for Videos, Images, and Visual Documents

Title: QR-LoRA: Efficient and Disentangled Fine-tuning via QR Decomposition for Customized Generation

Title: any4: Learned 4-bit Numeric Representation for LLMs

Title: Multimodal LLM Integrated Semantic Communications for 6G Immersive Experiences

Title: Learn 3D VQA Better with Active Selection and Reannotation

Title: A Cycle-Consistency Constrained Framework for Dynamic Solution Space Reduction in Noninjective Regression

Title: Hybrid Adversarial Spectral Loss Conditional Generative Adversarial Networks for Signal Data Augmentation in Ultra-precision Machining Surface Roughness Prediction

Title: ChangeBridge: Spatiotemporal Image Generation with Multimodal Controls for Remote Sensing

Title: TeethGenerator: A two-stage framework for paired pre- and post-orthodontic 3D dental data generation

Title: Structure-Guided Diffusion Models for High-Fidelity Portrait Shadow Removal

Title: Performance Evaluation of General Purpose Large Language Models for Basic Linear Algebra Subprograms Code Generation

Title: A Visual Leap in CLIP Compositionality Reasoning through Generation of Counterfactual Sets

Title: Identity-Preserving Text-to-Video Generation Guided by Simple yet Effective Spatial-Temporal Decoupled Representations

Title: UrbanMind: Towards Urban General Intelligence via Tool-Enhanced Retrieval-Augmented Generation and Multilevel Optimization

Title: Spooky Action at a Distance: Normalization Layers Enable Side-Channel Spatial Communication

Title: GraphBrep: Learning B-Rep in Graph Structure for Efficient CAD Generation

Title: From Vision To Language through Graph of Events in Space and Time: An Explainable Self-supervised Approach

Title: Semantically Consistent Discrete Diffusion for 3D Biological Graph Modeling

Title: HV-MMBench: Benchmarking MLLMs for Human-Centric Video Understanding

Title: RainShift: A Benchmark for Precipitation Downscaling Across Geographies

Title: Taming the Tri-Space Tension: ARC-Guided Hallucination Modeling and Control for Text-to-Image Generation

Title: DC-AR: Efficient Masked Autoregressive Image Generation with Deep Compression Hybrid Tokenizer

Title: Hear-Your-Click: Interactive Video-to-Audio Generation via Object-aware Contrastive Audio-Visual Fine-tuning

Title: Estimating Object Physical Properties from RGB-D Vision and Depth Robot Sensors Using Deep Learning

Title: ICAS: Detecting Training Data from Autoregressive Image Generative Models

Title: MoDiT: Learning Highly Consistent 3D Motion Coefficients with Diffusion Transformer for Talking Head Generation

Title: DICE: Discrete inverse continuity equation for learning population dynamics

Title: Reviving Cultural Heritage: A Novel Approach for Comprehensive Historical Document Restoration

Title: VERITAS: Verification and Explanation of Realness in Images for Transparency in AI Systems

Title: 4DSloMo: 4D Reconstruction for High Speed Scene with Asynchronous Capture

Title: Critiques of World Models

Title: Semantic Frame Interpolation

Title: $φ$-Adapt: A Physics-Informed Adaptation Learning Approach to 2D Quantum Material Discovery

Title: Logit Reweighting for Topic-Focused Summarization

Title: From Marginal to Joint Predictions: Evaluating Scene-Consistent Trajectory Prediction Approaches for Automated Driving

Title: SegmentDreamer: Towards High-fidelity Text-to-3D Synthesis with Segmented Consistency Trajectory Distillation