2025-01-24

Title: Graph Representation Learning with Diffusion Generative Models

Title: Scaling for Fairness? Analyzing Model Size, Data Composition, and Multilinguality in Vision-Language Bias

Title: AgentRec: Agent Recommendation Using Sentence Embeddings Aligned to Human Feedback

Title: Gradient-Free Adversarial Purification with Diffusion Models

Title: Retrievals Can Be Detrimental: A Contrastive Backdoor Attack Paradigm on Retrieval-Augmented Diffusion Models

Title: One Fits All: General Mobility Trajectory Modeling via Masked Conditional Diffusion

Title: MSF: Efficient Diffusion Model Via Multi-Scale Latent Factorize

Title: Contrast: A Hybrid Architecture of Transformers and State Space Models for Low-Level Vision

Title: From Images to Point Clouds: An Efficient Solution for Cross-media Blind Quality Assessment without Annotated Training

Title: Towards Intelligent Design: A Self-driven Framework for Collocated Clothing Synthesis Leveraging Fashion Styles and Textures

Title: Auto-Prompting SAM for Weakly Supervised Landslide Extraction

Title: GC-ConsFlow: Leveraging Optical Flow Residuals and Global Context for Robust Deepfake Detection

Title: EchoVideo: Identity-Preserving Human Video Generation by Multimodal Feature Fusion

Title: LDR-Net: A Novel Framework for AI-generated Image Detection via Localized Discrepancy Representation

Title: ReasVQA: Advancing VideoQA with Imperfect Reasoning Process

Title: One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt

Title: EventVL: Understand Event Streams via Multimodal Large Language Model

Title: A Mutual Information Perspective on Multiple Latent Variable Generative Models for Positive View Generation

Title: Not Every AI Problem is a Data Problem: We Should Be Intentional About Data Scaling

Title: Large Vision-Language Models for Knowledge-Grounded Data Annotation of Memes

Title: Generating Realistic Forehead-Creases for User Verification via Conditioned Piecewise Polynomial Curves

Title: Pix2Cap-COCO: Advancing Visual Comprehension via Pixel-Level Captioning

Title: PointOBB-v3: Expanding Performance Boundaries of Single Point-Supervised Oriented Object Detection

Title: Binary Diffusion Probabilistic Model

Title: Improving Video Generation with Human Feedback

Title: IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models

Title: GeoPixel: Pixel Grounding Large Multimodal Model in Remote Sensing

Title: Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step