2025-03-04

Title: Streaming Looking Ahead with Token-level Self-reward

Title: Invariant Tokenization of Crystalline Materials for Language Model Enabled Generation

Title: PaliGemma-CXR: A Multi-task Multimodal Model for TB Chest X-ray Interpretation

Title: PRISM: High-Resolution & Precise Counterfactual Medical Image Generation using Language-guided Stable Diffusion

Title: AnalogGenie: A Generative Engine for Automatic Discovery of Analog Circuit Topologies

Title: Flow Matching for Medical Image Synthesis: Bridging the Gap Between Speed and Quality

Title: Learning to Animate Images from A Few Videos to Portray Delicate Human Actions

Title: Remasking Discrete Diffusion Models with Inference-Time Scaling

Title: Octopus: Alleviating Hallucination via Dynamic Contrastive Decoding

Title: Jointly Understand Your Command and Intention: Reciprocal Co-Evolution between Scene-Aware 3D Human Motion Synthesis and Analysis

Title: EigenActor: Variant Body-Object Interaction Generation Evolved from Invariant Action Basis Reasoning

Title: CL-MoE: Enhancing Multimodal Large Language Model with Dual Momentum Mixture-of-Experts for Continual Visual Question Answering

Title: Auto-encoding Molecules: Graph-Matching Capabilities Matter

Title: DashCop: Automated E-ticket Generation for Two-Wheeler Traffic Violations Using Dashcam Videos

Title: Using Machine Learning for move sequence visualization and generation in climbing

Title: Periodic Materials Generation using Text-Guided Joint Diffusion Model

Title: GaussianSeal: Rooting Adaptive Watermarks for 3D Gaussian Generation Model

Title: What Makes a Good Diffusion Planner for Decision Making?

Title: Unbiased Video Scene Graph Generation via Visual and Semantic Dual Debiasing

Title: AesthetiQ: Enhancing Graphic Layout Design via Aesthetic-Aware Preference Alignment of Multi-modal Large Language Models

Title: SolidMark: Evaluating Image Memorization in Generative Models

Title: Synergy Between Sufficient Changes and Sparse Mixing Procedure for Disentangled Representation Learning

Title: Development of an Unpaired Deep Neural Network for Synthesizing X-ray Fluoroscopic Images from Digitally Reconstructed Tomography in Image Guided Radiotherapy

Title: Dur360BEV: A Real-world Single 360-degree Camera Dataset and Benchmark for Bird-Eye View Mapping in Autonomous Driving

Title: Proteina: Scaling Flow-based Protein Structure Generative Models

Title: OpenECG: Benchmarking ECG Foundation Models with Public 1.2 Million Records

Title: Quality-Driven Curation of Remote Sensing Vision-Language Data via Learned Scoring Models

Title: MR-EIT: Multi-Resolution Reconstruction for Electrical Impedance Tomography via Data-Driven and Unsupervised Dual-Mode Neural Networks

Title: Evaluating and Predicting Distorted Human Body Parts for Generated Images

Title: Foundation Models Secretly Understand Neural Network Weights: Enhancing Hypernetwork Architectures with Foundation Models

Title: PSRGS: Progressive Spectral Residual of 3D Gaussian for High-Frequency Recovery

Title: Zero-Shot Head Swapping in Real-World Scenarios

Title: From Poses to Identity: Training-Free Person Re-Identification via Feature Centralization

Title: Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think

Title: Dynamical Diffusion: Learning Temporal Dynamics with Diffusion Models

Title: Using Synthetic Images to Augment Small Medical Image Datasets

Title: Molecule Generation for Target Protein Binding with Hierarchical Consistency Diffusion Model

Title: Underdamped Diffusion Bridges with Applications to Sampling

Title: MedUnifier: Unifying Vision-and-Language Pre-training on Medical Data with Vision Generation Task using Discrete Visual Representations

Title: All Roads Lead to Likelihood: The Value of Reinforcement Learning in Fine-Tuning

Title: Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator

Title: VideoHandles: Editing 3D Object Compositions in Videos Using Video Generative Priors

Title: WeGen: A Unified Model for Interactive Multimodal Generation as We Chat

Title: ACCORD: Alleviating Concept Coupling through Dependence Regularization for Text-to-Image Diffusion Personalization

Title: CoInD: Enabling Logical Compositions in Diffusion Models

Title: Split Gibbs Discrete Diffusion Posterior Sampling

Title: Enhancing Vision-Language Compositional Understanding with Multimodal Synthetic Data

Title: HOP: Heterogeneous Topology-based Multimodal Entanglement for Co-Speech Gesture Generation

Title: DifIISR: A Diffusion Model with Gradient Guidance for Infrared Image Super-Resolution

Title: Enhancing Retinal Vessel Segmentation Generalization via Layout-Aware Generative Modelling

Title: A Multi-Sensor Fusion Approach for Rapid Orthoimage Generation in Large-Scale UAV Mapping

Title: Architectural and Inferential Inductive Biases For Exchangeable Sequence Modeling

Title: Tera-MIND: Tera-scale mouse brain simulation via spatial mRNA-guided diffusion

Title: Retrieval-Augmented Perception: High-Resolution Image Perception Meets Visual RAG

Title: Towards Improved Text-Aligned Codebook Learning: Multi-Hierarchical Codebook-Text Alignment with Long Text

Title: Reconciling Stochastic and Deterministic Strategies for Zero-shot Image Restoration using Diffusion Model in Dual

Title: SemGeoMo: Dynamic Contextual Human Motion Generation with Semantic and Geometric Guidance

Title: Fine-Grained Controllable Apparel Showcase Image Generation via Garment-Centric Outpainting

Title: MINT: Multi-modal Chain of Thought in Unified Generative Models for Enhanced Image Generation

Title: CacheQuant: Comprehensively Accelerated Diffusion Models

Title: Group Relative Policy Optimization for Image Captioning

Title: Wavelet-Enhanced Desnowing: A Novel Single Image Restoration Approach for Traffic Surveillance under Adverse Weather Conditions

Title: Learning to Generate Long-term Future Narrations Describing Activities of Daily Living

Title: DLF: Extreme Image Compression with Dual-generative Latent Fusion

Title: Generative Human Geometry Distribution

Title: InversionGNN: A Dual Path Network for Multi-Property Molecular Optimization

Title: AutoLUT: LUT-Based Image Super-Resolution with Automatic Sampling and Adaptive Residual Learning

Title: MRI super-resolution reconstruction using efficient diffusion probabilistic model with residual shifting

Title: Advancing vision-language models in front-end development via data synthesis

Title: DesignDiffusion: High-Quality Text-to-Design Image Generation with Diffusion Models

Title: ToLo: A Two-Stage, Training-Free Layout-To-Image Generation Framework For High-Overlap Layouts

Title: Using (Not so) Large Language Models for Generating Simulation Models in a Formal DSL -- A Study on Reaction Networks

Title: SAGE: A Framework of Precise Retrieval for RAG

Title: KeyFace: Expressive Audio-Driven Facial Animation for Long Sequences via KeyFrame Interpolation

Title: Quality Measures for Dynamic Graph Generative Models

Title: VideoUFO: A Million-Scale User-Focused Dataset for Text-to-Video Generation

Title: Enhancing Multi-hop Reasoning in Vision-Language Models via Self-Distillation with Multi-Prompt Ensembling

Title: Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation