2025-03-12

Title: Is Pre-training Applicable to the Decoder for Dense Prediction?

Title: BrainNet-MoE: Brain-Inspired Mixture-of-Experts Learning for Neurological Disease Identification

Title: The day-ahead scenario generation method for new energy based on an improved conditional generative diffusion model

Title: TS-RAG: Retrieval-Augmented Generation based Time Series Foundation Models are Stronger Zero-Shot Forecaster

Title: Data Foundations for Large Scale Multimodal Clinical Foundation Models

Title: PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity

Title: A Time Series Multitask Framework Integrating a Large Language Model, Pre-Trained Time Series Model, and Knowledge Graph

Title: RayFlow: Instance-Aware Diffusion Acceleration via Adaptive Flow Trajectories

Title: Seedream 2.0: A Native Chinese-English Bilingual Image Generation Foundation Model

Title: SIRE: SE(3) Intrinsic Rigidity Embeddings

Title: Self-supervised Normality Learning and Divergence Vector-guided Model Merging for Zero-shot Congenital Heart Disease Detection in Fetal Ultrasound Videos

Title: TwinTURBO: Semi-Supervised Fine-Tuning of Foundation Models via Mutual Information Decompositions for Downstream Task and Latent Spaces

Title: CIMAGE: Exploiting the Conditional Independence in Masked Graph Auto-encoders

Title: Video Action Differencing

Title: Can Generative Geospatial Diffusion Models Excel as Discriminative Geospatial Foundation Models?

Title: Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia

Title: CAD-VAE: Leveraging Correlation-Aware Latents for Comprehensive Fair Disentanglement

Title: STRMs: Spatial Temporal Reasoning Models for Vision-Based Localization Rivaling GPS Precision

Title: STEAD: Spatio-Temporal Efficient Anomaly Detection for Time and Compute Sensitive Applications

Title: Pre-trained Models Succeed in Medical Imaging with Representation Similarity Degradation

Title: Recent Advances in Hypergraph Neural Networks

Title: Regulatory DNA sequence Design with Reinforcement Learning

Title: DiffEGG: Diffusion-Driven Edge Generation as a Pixel-Annotation-Free Alternative for Instance Annotation

Title: CDI3D: Cross-guided Dense-view Interpolation for 3D Reconstruction

Title: Exploring Bias in over 100 Text-to-Image Generative Models

Title: GPT-PPG: A GPT-based Foundation Model for Photoplethysmography Signals

Title: Partial differential equation system for binarization of degraded document images

Title: Learning to Search Effective Example Sequences for In-Context Learning

Title: A General Framework to Evaluate Methods for Assessing Dimensions of Lexical Semantic Change Using LLM-Generated Synthetic Data

Title: Adapting Large Language Models for Parameter-Efficient Log Anomaly Detection

Title: SphOR: A Representation Learning Perspective on Open-set Recognition for Identifying Unknown Classes in Deep Learning Models

Title: Unmasking the Unknown: Facial Deepfake Detection in the Open-Set Paradigm

Title: Seeing Beyond Haze: Generative Nighttime Image Dehazing

Title: Degradation Self-Supervised Learning for Lithium-ion Battery Health Diagnostics

Title: PRISM: Privacy-Preserving Improved Stochastic Masking for Federated Generative Models

Title: MegaSR: Mining Customized Semantics and Expressive Guidance for Image Super-Resolution

Title: ACE: Concept Editing in Diffusion Models without Performance Degradation

Title: Convergence Dynamics and Stabilization Strategies of Co-Evolving Generative Models

Title: Uni$\textbf{F}^2$ace: Fine-grained Face Understanding and Generation with Unified Multimodal Models

Title: Toward Stable World Models: Measuring and Addressing World Instability in Generative Environments

Title: MGHanD: Multi-modal Guidance for authentic Hand Diffusion

Title: ArticulatedGS: Self-supervised Digital Twin Modeling of Articulated Objects using 3D Gaussian Splatting

Title: FlowDPS: Flow-Driven Posterior Sampling for Inverse Problems

Title: FilmComposer: LLM-Driven Music Production for Silent Film Clips

Title: Few-Shot Class-Incremental Model Attribution Using Learnable Representation From CLIP-ViT Features

Title: U-StyDiT: Ultra-high Quality Artistic Style Transfer Using Diffusion Transformers

Title: Concept-Driven Deep Learning for Enhanced Protein-Specific Molecular Generation

Title: Multimodal Generation of Animatable 3D Human Models with AvatarForge

Title: TSCnet: A Text-driven Semantic-level Controllable Framework for Customized Low-Light Image Enhancement

Title: Towards All-in-One Medical Image Re-Identification

Title: Scale-Aware Pre-Training for Human-Centric Visual Perception: Enabling Lightweight and Generalizable Models

Title: A Theoretical Framework for Preventing Class Collapse in Supervised Contrastive Learning

Title: S3R-GS: Streamlining the Pipeline for Large-Scale Street Scene Reconstruction

Title: MVD-HuGaS: Human Gaussians from a Single Image via 3D Human Multi-view Diffusion Prior

Title: Aligning Text to Image in Diffusion Models is Easier Than You Think

Title: SARA: Structural and Adversarial Representation Alignment for Training-efficient Diffusion Models

Title: DexGrasp Anything: Towards Universal Robotic Dexterous Grasping with Physics Awareness

Title: PromptLNet: Region-Adaptive Aesthetic Enhancement via Prompt Guidance in Low-Light Enhancement Net

Title: OminiControl2: Efficient Conditioning for Diffusion Transformers

Title: A systematic literature review of unsupervised learning algorithms for anomalous traffic detection based on flows

Title: D3PO: Preference-Based Alignment of Discrete Diffusion Models

Title: $^R$FLAV: Rolling Flow matching for infinite Audio Video generation

Title: Diffusion Transformer Meets Random Masks: An Advanced PET Reconstruction Framework

Title: Pathology-Aware Adaptive Watermarking for Text-Driven Medical Image Synthesis

Title: Robust Latent Matters: Boosting Image Generation with Sampling Error

Title: nnInteractive: Redefining 3D Promptable Segmentation

Title: Layton: Latent Consistency Tokenizer for 1024-pixel Image Reconstruction and Generation by 256 Tokens

Title: Recognition-Synergistic Scene Text Editing

Title: DyArtbank: Diverse Artistic Style Transfer via Pre-trained Stable Diffusion and Dynamic Style Prompt Artbank

Title: OpenRAG: Optimizing RAG End-to-End via In-Context Retrieval Learning

Title: Fact-checking with Generative AI: A Systematic Cross-Topic Examination of LLMs Capacity to Detect Veracity of Political Information

Title: Using Powerful Prior Knowledge of Diffusion Model in Deep Unfolding Networks for Image Compressive Sensing

Title: Controlling Latent Diffusion Using Latent CLIP

Title: NullFace: Training-Free Localized Face Anonymization

Title: Generalizable AI-Generated Image Detection Based on Fractal Self-Similarity in the Spectrum

Title: TT-GaussOcc: Test-Time Compute for Self-Supervised Occupancy Prediction via Spatio-Temporal Gaussian Splatting

Title: Learning to Match Unpaired Data with Minimum Entropy Coupling

Title: DISTINGUISH Workflow: A New Paradigm of Dynamic Well Placement Using Generative Machine Learning

Title: SAS: Segment Any 3D Scene with Integrated 2D Priors

Title: High-Quality 3D Head Reconstruction from Any Single Portrait Image

Title: SignRep: Enhancing Self-Supervised Sign Representations

Title: ESPnet-SDS: Unified Toolkit and Demo for Spoken Dialogue Systems

Title: Modular Customization of Diffusion Models via Blockwise-Parameterized Low-Rank Adaptation

Title: 3D Point Cloud Generation via Autoregressive Up-sampling

Title: Tuning-Free Multi-Event Long Video Generation via Synchronized Coupled Sampling

Title: Efficient Many-Shot In-Context Learning with Dynamic Block-Sparse Attention

Title: Exploiting Instruction-Following Retrievers for Malicious Information Retrieval

Title: MF-VITON: High-Fidelity Mask-Free Virtual Try-On with Minimal Input

Title: MEAT: Multiview Diffusion Model for Human Generation on Megapixels with Mesh Attention

Title: REGEN: Learning Compact Video Embedding with (Re-)Generative Decoder

Title: Understanding and Mitigating Distribution Shifts For Machine Learning Force Fields

Title: Language-Depth Navigated Thermal and Visible Image Fusion

Title: OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting

Title: "Principal Components" Enable A New Language of Images