2025-07-08

Title: Advancing Talking Head Generation: A Comprehensive Survey of Multi-Modal Methodologies, Datasets, Evaluation Metrics, and Loss Functions

Title: Controllable diffusion-based generation for multi-channel biological data

Title: Hyperbolic Kernel Graph Neural Networks for Neurocognitive Decline Analysis from Multimodal Brain Imaging

Title: DiceHuBERT: Distilling HuBERT with a Self-Supervised Learning Objective

Title: PlaceFM: A Training-free Geospatial Foundation Model of Places

Title: OBSER: Object-Based Sub-Environment Recognition for Zero-Shot Environmental Inference

Title: GameTileNet: A Semantic Dataset for Low-Resolution Game Art in Procedural Content Generation

Title: Concept-based Adversarial Attack: a Probabilistic Perspective

Title: Leveraging the Structure of Medical Data for Improved Representation Learning

Title: FreqCross: A Multi-Modal Frequency-Spatial Fusion Network for Robust Detection of Stable Diffusion 3.5 Generated Images

Title: PDFMathTranslate: Scientific Document Translation Preserving Layouts

Title: Beyond Overcorrection: Evaluating Diversity in T2I Models with DIVBENCH

Title: Rethinking Data Protection in the (Generative) Artificial Intelligence Era

Title: Intelligent Histology for Tumor Neurosurgery

Title: LATTE: Latent Trajectory Embedding for Diffusion-Generated Image Detection

Title: Cycle-Consistent Helmholtz Machine: Goal-Seeded Simulation via Inverted Inference

Title: Expert-level validation of AI-generated medical text with scalable language models

Title: Adopting a human developmental visual diet yields robust, shape-based AI vision

Title: Latent Thermodynamic Flows: Unified Representation Learning and Generative Modeling of Temperature-Dependent Behaviors from Limited Data

Title: Subject Invariant Contrastive Learning for Human Activity Recognition

Title: LACONIC: A 3D Layout Adapter for Controllable Image Creation

Title: ConceptMix++: Leveling the Playing Field in Text-to-Image Benchmarking via Iterative Prompt Optimization

Title: Global Variational Inference Enhanced Robust Domain Adaptation

Title: Zero-shot Inexact CAD Model Alignment from a Single Image

Title: CPKD: Clinical Prior Knowledge-Constrained Diffusion Models for Surgical Phase Recognition in Endoscopic Submucosal Dissection

Title: Personalized Image Generation from an Author Writing Style

Title: Mirror in the Model: Ad Banner Image Generation via Reflective Multi-LLM and Multi-modal Agents

Title: Task-Specific Generative Dataset Distillation with Difficulty-Guided Sampling

Title: De-Fake: Style based Anomaly Deepfake Detection

Title: Masked Temporal Interpolation Diffusion for Procedure Planning in Instructional Videos

Title: Pose-Star: Anatomy-Aware Editing for Open-World Fashion Images

Title: Generating Synthetic Relational Tabular Data via Structural Causal Models

Title: Foundation versus Domain-specific Models: Performance Comparison, Fusion, and Explainability in Face Recognition

Title: Beyond Accuracy: Metrics that Uncover What Makes a 'Good' Visual Descriptor

Title: SciVid: Cross-Domain Evaluation of Video Models in Scientific Applications

Title: Kinetic Langevin Diffusion for Crystalline Materials Generation

Title: From Video to EEG: Adapting Joint Embedding Predictive Architecture to Uncover Visual Concepts in Brain Signal Analysis

Title: SecureT2I: No More Unauthorized Manipulation on AI Generated Images from Prompts

Title: When There Is No Decoder: Removing Watermarks from Stable Diffusion Models in a No-box Setting

Title: When Network Architecture Meets Physics: Deep Operator Learning for Coupled Multiphysics

Title: SAMed-2: Selective Memory Enhanced Medical Segment Anything Model

Title: FAROS: Fair Graph Generation via Attribute Switching Mechanisms

Title: Flow-Anchored Consistency Models

Title: ChestGPT: Integrating Large Language Models and Vision Transformers for Disease Detection and Localization in Chest X-Rays

Title: StreamDiT: Real-Time Streaming Text-to-Video Generation

Title: Interpretable Diffusion Models with B-cos Networks

Title: Enhanced accuracy through ensembling of randomly initialized auto-regressive models for time-dependent PDEs

Title: GenAI-Powered Inference

Title: Transformer Model for Alzheimer's Disease Progression Prediction Using Longitudinal Visit Sequences

Title: Taming Anomalies with Down-Up Sampling Networks: Group Center Preserving Reconstruction for 3D Anomaly Detection

Title: Return of the Latent Space COWBOYS: Re-thinking the use of VAEs for Bayesian Optimisation of Structured Spaces

Title: DNF-Intrinsic: Deterministic Noise-Free Diffusion for Indoor Inverse Rendering

Title: Evaluating Adversarial Protections for Diffusion Personalization: A Comprehensive Study

Title: Real-TabPFN: Improving Tabular Foundation Models via Continued Pre-training With Real-World Data

Title: CoT-Segmenter: Enhancing OOD Detection in Dense Road Scenes via Chain-of-Thought Reasoning

Title: Fast Re-Trainable Attention Autoencoder for Liquid Sensor Anomaly Detection at the Edge

Title: Breaking Imitation Bottlenecks: Reinforced Diffusion Powers Diverse Trajectory Generation

Title: Generate, Refine, and Encode: Leveraging Synthesized Novel Samples for On-the-Fly Fine-Grained Category Discovery

Title: Consistent and Invariant Generalization Learning for Short-video Misinformation Detection

Title: Token Level Hallucination Detection via Variance in Language Models

Title: Pedestrian Intention Prediction via Vision-Language Foundation Models

Title: Unlocking Compositional Control: Self-Supervision for LVLM-Based Image Generation

Title: LVLM-Composer's Explicit Planning for Image Generation

Title: ML-Enhanced AES Anomaly Detection for Real-Time Embedded Security

Title: An explicit formulation of the learned noise predictor $\epsilon_\theta(\mathbf{x}_t, t)$ via the forward-process noise $\epsilon_t$ in denoising diffusion probabilistic models (DDPMs)

Title: Quick Bypass Mechanism of Zero-Shot Diffusion-Based Image Restoration

Title: DreamPoster: A Unified Framework for Image-Conditioned Generative Poster Design

Title: Model Collapse Is Not a Bug but a Feature in Machine Unlearning for LLMs

Title: Context Tuning for In-Context Optimization

Title: Zero-Shot Cyclic Peptide Design with Composable Geometric Conditions

Title: Scaling Context Requires Rethinking Attention

Title: Domain Generalizable Portrait Style Transfer

Title: An Explainable Transformer Model for Alzheimer's Disease Detection Using Retinal Imaging

Title: ZERO: Multi-modal Prompt-based Visual Grounding

Title: SeqTex: Generate Mesh Textures in Video Sequence

Title: MPQ-DMv2: Flexible Residual Mixed Precision Quantization for Low-Bit Diffusion Models with Temporal Distillation

Title: Attention Slipping: A Mechanistic Understanding of Jailbreak Attacks and Defenses in LLMs

Title: Time2Agri: Temporal Pretext Tasks for Agricultural Monitoring

Title: Sat2City: 3D City Generation from A Single Satellite Image with Cascaded Latent Diffusion

Title: A View-consistent Sampling Method for Regularized Training of Neural Radiance Fields

Title: DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge

Title: CoT-lized Diffusion: Let's Reinforce T2I Generation Step-by-step

Title: Dealing with Uncertainty in Contextual Anomaly Detection

Title: Unveiling the Potential of Diffusion Large Language Model in Controllable Generation

Title: MambaVideo for Discrete Video Tokenization with Channel-Split Quantization

Title: Nile-Chat: Egyptian Language Models for Arabic and Latin Scripts

Title: S$^2$Edit: Text-Guided Image Editing with Precise Semantic and Spatial Control

Title: QR-LoRA: Efficient and Disentangled Fine-tuning via QR Decomposition for Customized Generation

Title: Information-Guided Diffusion Sampling for Dataset Distillation

Title: Multimodal LLM Integrated Semantic Communications for 6G Immersive Experiences

Title: Learning Robust Stereo Matching in the Wild with Selective Mixture-of-Experts

Title: Hybrid Adversarial Spectral Loss Conditional Generative Adversarial Networks for Signal Data Augmentation in Ultra-precision Machining Surface Roughness Prediction

Title: ChangeBridge: Spatiotemporal Image Generation with Multimodal Controls for Remote Sensing

Title: TeethGenerator: A two-stage framework for paired pre- and post-orthodontic 3D dental data generation

Title: Structure-Guided Diffusion Models for High-Fidelity Portrait Shadow Removal

Title: Performance Evaluation of General Purpose Large Language Models for Basic Linear Algebra Subprograms Code Generation

Title: A Visual Leap in CLIP Compositionality Reasoning through Generation of Counterfactual Sets

Title: Spooky Action at a Distance: Normalization Layers Enable Side-Channel Spatial Communication

Title: Geometric-Guided Few-Shot Dental Landmark Detection with Human-Centric Foundation Model

Title: Losing Control: Data Poisoning Attack on Guided Diffusion via ControlNet

Title: Word stress in self-supervised speech models: A cross-linguistic comparison

Title: GraphBrep: Learning B-Rep in Graph Structure for Efficient CAD Generation

Title: From Vision To Language through Graph of Events in Space and Time: An Explainable Self-supervised Approach

Title: Discrete Diffusion Trajectory Alignment via Stepwise Decomposition

Title: Semantically Consistent Discrete Diffusion for 3D Biological Graph Modeling

Title: Fine-tuning on simulated data outperforms prompting for agent tone of voice

Title: Leveraging Self-Supervised Features for Efficient Flooded Region Identification in UAV Aerial Images

Title: Object-centric Denoising Diffusion Models for Physical Reasoning

Title: RainShift: A Benchmark for Precipitation Downscaling Across Geographies

Title: Taming the Tri-Space Tension: ARC-Guided Hallucination Modeling and Control for Text-to-Image Generation

Title: DC-AR: Efficient Masked Autoregressive Image Generation with Deep Compression Hybrid Tokenizer

Title: ArtifactsBench: Bridging the Visual-Interactive Gap in LLM Code Generation Evaluation

Title: InterGSEdit: Interactive 3D Gaussian Splatting Editing with 3D Geometry-Consistent Attention Prior

Title: Parameterized Diffusion Optimization enabled Autoregressive Ordinal Regression for Diabetic Retinopathy Grading

Title: TLB-VFI: Temporal-Aware Latent Brownian Bridge Diffusion for Video Frame Interpolation

Title: Multi-modal Representations for Fine-grained Multi-label Critical View of Safety Recognition

Title: Verified Language Processing with Hybrid Explainability: A Technical Report

Title: Meta-Learning Transformers to Improve In-Context Generalization

Title: AI-Driven Cytomorphology Image Synthesis for Medical Diagnostics

Title: ICAS: Detecting Training Data from Autoregressive Image Generative Models

Title: Exploring Semantic Clustering and Similarity Search for Heterogeneous Traffic Scenario Graph

Title: MoDiT: Learning Highly Consistent 3D Motion Coefficients with Diffusion Transformer for Talking Head Generation

Title: DICE: Discrete inverse continuity equation for learning population dynamics

Title: VOTE: Vision-Language-Action Optimization with Trajectory Ensemble Voting

Title: An Evaluation of Large Language Models on Text Summarization Tasks Using Prompt Engineering Techniques

Title: Interpretable Mnemonic Generation for Kanji Learning via Expectation-Maximization

Title: VERITAS: Verification and Explanation of Realness in Images for Transparency in AI Systems

Title: AI Generated Text Detection Using Instruction Fine-tuned Large Language and Transformer-Based Models

Title: 4DSloMo: 4D Reconstruction for High Speed Scene with Asynchronous Capture

Title: Critiques of World Models

Title: CTA: Cross-Task Alignment for Better Test Time Training

Title: Self-Supervised Real-Time Tracking of Military Vehicles in Low-FPS UAV Footage

Title: Physics-Guided Dual Implicit Neural Representations for Source Separation

Title: From Marginal to Joint Predictions: Evaluating Scene-Consistent Trajectory Prediction Approaches for Automated Driving

Title: Beyond Simple Edits: X-Planner for Complex Instruction-Based Image Editing