2025-04-30

Title: Marmot: Multi-Agent Reasoning for Multi-Object Self-Correcting in Improving Image-Text Alignment

Title: VideoMultiAgents: A Multi-Agent Framework for Video Question Answering

Title: Integration Flow Models

Title: Physics-Informed Diffusion Models for SAR Ship Wake Generation from Text Prompts

Title: Generative Diffusion Models for Resource Allocation in Wireless Networks

Title: Image Interpolation with Score-based Riemannian Metrics of Diffusion Models

Title: A Cryptographic Perspective on Mitigation vs. Detection in Machine Learning

Title: Perturbation-efficient Zeroth-order Optimization for Hardware-friendly On-device Training

Title: MicarVLMoE: A Modern Gated Cross-Aligned Vision-Language Mixture of Experts Model for Medical Image Captioning and Report Generation

Title: Generative Learning for Slow Manifolds and Bifurcation Diagrams

Title: Inception: Jailbreak the Memory Mechanism of Text-to-Image Generation Systems

Title: FiLA-Video: Spatio-Temporal Compression for Fine-Grained Long Video Understanding

Title: FourierSpecNet: Neural Collision Operator Approximation Inspired by the Fourier Spectral Method for Solving the Boltzmann Equation

Title: GarmentX: Autoregressive Parametric Representations for High-Fidelity 3D Garment Generation

Title: ADiff4TPP: Asynchronous Diffusion Models for Temporal Point Processes

Title: GaLore 2: Large-Scale LLM Pre-Training by Gradient Low-Rank Projection

Title: PixelHacker: Image Inpainting with Structural and Semantic Consistency

Title: Reviving Any-Subset Autoregressive Models with Principled Parallel Sampling and Speculative Decoding

Title: LMM4Gen3DHF: Benchmarking and Evaluating Multimodal 3D Human Face Generation with LMMs

Title: Antidote: A Unified Framework for Mitigating LVLM Hallucinations in Counterfactual Presupposition and Object Perception

Title: Geometry-aware Temporal Aggregation Network for Monocular 3D Lane Detection

Title: Autoencoder Models for Point Cloud Environmental Synthesis from WiFi Channel State Information: A Preliminary Study

Title: Digital Shielding for Cross-Domain Wi-Fi Signal Adaptation using Relativistic Average Generative Adversarial Network

Title: AlignDiT: Multimodal Aligned Diffusion Transformer for Synchronized Speech Generation

Title: Bridging the Generalisation Gap: Synthetic Data Generation for Multi-Site Clinical Model Validation

Title: Advance Fake Video Detection via Vision Transformers

Title: Efficient Listener: Dyadic Facial Motion Synthesis via Action Diffusion

Title: What's Wrong with Your Synthetic Tabular Data? Using Explainable AI to Evaluate Generative Models

Title: In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer

Title: DDPS: Discrete Diffusion Posterior Sampling for Paths in Layered Graphs

Title: JTreeformer: Graph-Transformer via Latent-Diffusion Model for Molecular Generation

Title: CMT: A Cascade MAR with Topology Predictor for Multimodal Conditional CAD Generation

Title: Tabular Data Adapters: Improving Outlier Detection for Unlabeled Private Data

Title: AI-GenBench: A New Ongoing Benchmark for AI-Generated Image Detection

Title: Evaluating Generative Models for Tabular Data: Novel Metrics and Benchmarking

Title: Deep Learning Characterizes Depression and Suicidal Ideation from Eye Movements

Title: TesserAct: Learning 4D Embodied World Models

Title: X-Fusion: Introducing New Modality to Frozen Large Language Models

Title: YoChameleon: Personalized Vision and Language Generation