2025-03-17

Title: Text-to-3D Generation using Jensen-Shannon Score Distillation

Title: VRMDiff: Text-Guided Video Referring Matting Generation of Diffusion

Title: Context-guided Responsible Data Augmentation with Diffusion Models

Title: Neighboring Autoregressive Modeling for Efficient Visual Generation

Title: Zero-Shot Subject-Centric Generation for Creative Application Using Entropy Fusion

Title: TA-V2A: Textually Assisted Video-to-Audio Generation

Title: Team NYCU at Defactify4: Robust Detection and Source Identification of AI-Generated Images Using CNN and CLIP-Based Models

Title: Long-Video Audio Synthesis with Multi-Agent Collaboration

Title: Numerical and statistical analysis of NeuralODE with Runge-Kutta time integration

Title: Leveraging Vision-Language Embeddings for Zero-Shot Learning in Histopathology Images

Title: Visual Polarization Measurement Using Counterfactual Image Generation

Title: FlowTok: Flowing Seamlessly Across Text and Image Tokens

Title: Large-scale Pre-training for Grounded Video Caption Generation

Title: Taxonomic Reasoning for Rare Arthropods: Combining Dense Image Captioning and RAG for Interpretable Classification

Title: Memory-Efficient 3D High-Resolution Medical Image Synthesis Using CRF-Guided GANs

Title: OuroMamba: A Data-Free Quantization Framework for Vision Mamba Models

Title: Comparative Analysis of Advanced AI-based Object Detection Models for Pavement Marking Quality Assessment during Daytime

Title: Weakly Supervised Contrastive Adversarial Training for Learning Robust Features from Semi-supervised Data

Title: ACMo: Attribute Controllable Motion Generation

Title: InverseBench: Benchmarking Plug-and-Play Diffusion Priors for Inverse Problems in Physical Sciences

Title: PSF-4D: A Progressive Sampling Framework for View Consistent 4D Editing

Title: Measuring Similarity in Causal Graphs: A Framework for Semantic and Structural Analysis

Title: Flow to the Mode: Mode-Seeking Diffusion Autoencoders for State-of-the-Art Image Tokenization

Title: Generative Modelling for Mathematical Discovery

Title: Harnessing Frequency Spectrum Insights for Image Copyright Protection Against Diffusion Models

Title: Perceive, Understand and Restore: Real-World Image Super-Resolution with Autoregressive Multimodal Generative Models

Title: Understanding Flatness in Generative Models: Its Role and Benefits

Title: Augmenting Image Annotation: A Human-LMM Collaborative Framework for Efficient Object Selection and Label Generation

Title: DriveGEN: Generalized and Robust 3D Detection in Driving via Controllable Text-to-Image Diffusion Generation

Title: MUSS: Multilevel Subset Selection for Relevance and Diversity

Title: Direction-Aware Diagonal Autoregressive Image Generation

Title: SpaceSeg: A High-Precision Intelligent Perception Segmentation Method for Multi-Spacecraft On-Orbit Targets

Title: GaussianIP: Identity-Preserving Realistic 3D Human Generation via Human-Centric Diffusion Prior

Title: Multi-Stage Generative Upscaler: Reconstructing Football Broadcast Images via Diffusion Models

Title: Toward Generalized Image Quality Assessment: Relaxing the Perfect Reference Quality Assumption

Title: Towards Better Alignment: Training Diffusion Models with Reinforcement Learning Against Sparse Rewards

Title: Step-Video-TI2V Technical Report: A State-of-the-Art Text-Driven Image-to-Video Generation Model

Title: Federated Koopman-Reservoir Learning for Large-Scale Multivariate Time-Series Anomaly Detection

Title: Noise Synthesis for Low-Light Image Denoising with Diffusion Models

Title: CyclePose -- Leveraging Cycle-Consistency for Annotation-Free Nuclei Segmentation in Fluorescence Microscopy

Title: OPTIMUS: Predicting Multivariate Outcomes in Alzheimer's Disease Using Multi-modal Data amidst Missing Values

Title: Leveraging Diffusion Knowledge for Generative Image Compression with Fractal Frequency-Aware Band Learning

Title: PBR3DGen: A VLM-guided Mesh Generation with High-quality PBR Texture

Title: Watch and Learn: Leveraging Expert Knowledge and Language for Surgical Video Understanding

Title: A Neural Network Architecture Based on Attention Gate Mechanism for 3D Magnetotelluric Forward Modeling

Title: Empowering Time Series Analysis with Synthetic Data: A Survey and Outlook in the Era of Foundation Models

Title: Classifying Long-tailed and Label-noise Data via Disentangling and Unlearning

Title: From Generative AI to Innovative AI: An Evolutionary Roadmap

Title: TASTE-Rob: Advancing Video Generation of Task-Oriented Hand-Object Interaction for Generalizable Robotic Manipulation

Title: D3: Diversity, Difficulty, and Dependability-Aware Data Selection for Sample-Efficient LLM Instruction Tuning

Title: T2I-FineEval: Fine-Grained Compositional Metric for Text-to-Image Evaluation

Title: HiTVideo: Hierarchical Tokenizers for Enhancing Text-to-Video Generation with Autoregressive Large Language Models

Title: Exploring Typographic Visual Prompts Injection Threats in Cross-Modality Generation Models

Title: AugGen: Synthetic Augmentation Can Improve Discriminative Models

Title: From Denoising Score Matching to Langevin Sampling: A Fine-Grained Error Analysis in the Gaussian Setting

Title: ReCamMaster: Camera-Controlled Generative Rendering from A Single Video