2024-12-03

Title: DiffGuard: Text-Based Safety Checker for Diffusion Models

Title: Addressing Vulnerabilities in AI-Image Detection: Challenges and Proposed Solutions

Title: Unpacking the Individual Components of Diffusion Policy

Title: Graph Canvas for Controllable 3D Scene Generation

Title: A Novel Approach to Image Steganography Using Generative Adversarial Networks

Title: Steering Rectified Flow Models in the Vector Field for Controlled Image Generation

Title: Differential learning kinetics govern the transition from memorization to generalization during in-context learning

Title: Demographic Predictability in 3D CT Foundation Embeddings

Title: SceneTAP: Scene-Coherent Typographic Adversarial Planner against Vision-Language Models in Real-World Environments

Title: OpenHumanVid: A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation

Title: Bridging the Gap: Aligning Text-to-Image Diffusion Models with Specific Feedback

Title: Orthus: Autoregressive Interleaved Image-Text Generation with Modality-Specific Heads

Title: PP-SSL: Priority-Perception Self-Supervised Learning for Fine-Grained Recognition

Title: FonTS: Text Rendering with Typography and Style Controls

Title: Sparse Attention Vectors: Generative Multimodal Model Features Are Discriminative Vision-Language Classifiers

Title: MPQ-Diff: Mixed Precision Quantization for Diffusion Models

Title: Knowledge-Augmented Explainable and Interpretable Learning for Anomaly Detection and Diagnosis

Title: Curriculum Fine-tuning of Vision Foundation Model for Medical Image Classification Under Label Noise

Title: VISION-XL: High Definition Video Inverse Problem Solver using Latent Image Diffusion Models

Title: AerialGo: Walking-through City View Generation from Aerial Perspectives

Title: Circumventing shortcuts in audio-visual deepfake detection datasets with unsupervised learning

Title: Art-Free Generative Models: Art Creation Without Graphic Art Knowledge

Title: LumiNet: Latent Intrinsics Meets Diffusion Models for Indoor Scene Relighting

Title: Diffusion Model Guided Sampling with Pixel-Wise Aleatoric Uncertainty Estimation

Title: Refine-by-Align: Reference-Guided Artifacts Refinement through Semantic Alignment

Title: Towards Pixel-Level Prediction for Gaze Following: Benchmark and Approach

Title: Vision Technologies with Applications in Traffic Surveillance Systems: A Holistic Survey

Title: DogLayout: Denoising Diffusion GAN for Discrete and Continuous Layout Generation

Title: On Foundation Models for Dynamical Systems from Purely Synthetic Data

Title: DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses

Title: FreeCond: Free Lunch in the Input Conditions of Text-Guided Inpainting

Title: A conditional Generative Adversarial network model for the Weather4Cast 2024 Challenge

Title: Good, Cheap, and Fast: Overfitted Image Compression with Wasserstein Distortion

Title: Graph-to-SFILES: Control structure prediction from process topologies using generative artificial intelligence

Title: Instant3dit: Multiview Inpainting for Fast Editing of 3D Objects

Title: Blind Inverse Problem Solving Made Easy by Text-to-Image Latent Diffusion

Title: Friend or Foe? Harnessing Controllable Overfitting for Anomaly Detection

Title: Continuous Concepts Removal in Text-to-image Diffusion Models

Title: Generative LiDAR Editing with Controllable Novel Object Layouts

Title: PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation

Title: A Lesson in Splats: Teacher-Guided Diffusion for 3D Gaussian Splats Generation with 2D Supervision

Title: Sketch-Guided Motion Diffusion for Stylized Cinemagraph Synthesis

Title: Improving Decoupled Posterior Sampling for Inverse Problems using Data Consistency Constraint

Title: Learning on Less: Constraining Pre-trained Model Learning for Generalizable Diffusion-Generated Image Detection

Title: FiffDepth: Feed-forward Transformation of Diffusion-Based Generators for Detailed Depth Estimation

Title: Paint Outside the Box: Synthesizing and Selecting Training Data for Visual Grounding

Title: Enhancing the Generalization Capability of Skin Lesion Classification Models with Active Domain Adaptation Methods

Title: Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Diffusion Transformer Networks

Title: CtrlNeRF: The Generative Neural Radiation Fields for the Controllable Synthesis of High-fidelity 3D-Aware Images

Title: DyMO: Training-Free Diffusion Model Alignment with Dynamic Multi-Objective Scheduling

Title: Learning to Forget using Hypernetworks

Title: PGSO: Prompt-based Generative Sequence Optimization Network for Aspect-based Sentiment Analysis

Title: Explorations in Self-Supervised Learning: Dataset Composition Testing for Object Classification

Title: DIVD: Deblurring with Improved Video Diffusion Model

Title: Memories of Forgotten Concepts

Title: EDTformer: An Efficient Decoder Transformer for Visual Place Recognition

Title: Generative Model for Synthesizing Ionizable Lipids: A Monte Carlo Tree Search Approach

Title: Categorical Keypoint Positional Embedding for Robust Animal Re-Identification

Title: Particle-based 6D Object Pose Estimation from Point Clouds using Diffusion Models

Title: AniMer: Animal Pose and Shape Estimation Using Family Aware Transformer

Title: Advanced Video Inpainting Using Optical Flow-Guided Efficient Diffusion

Title: Deep evolving semi-supervised anomaly detection

Title: Beyond Pixels: Text Enhances Generalization in Real-World Image Restoration

Title: Exploring Large Vision-Language Models for Robust and Efficient Industrial Anomaly Detection

Title: A Deep Generative Model for the Design of Synthesizable Ionizable Lipids

Title: STEVE-Audio: Expanding the Goal Conditioning Modalities of Embodied Agents in Minecraft

Title: WAFFLE: Multimodal Floorplan Understanding in the Wild

Title: Competition Dynamics Shape Algorithmic Phases of In-Context Learning

Title: Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation

Title: Evaluating Automated Radiology Report Quality through Fine-Grained Phrasal Grounding of Clinical Findings

Title: CRISP: Object Pose and Shape Estimation with Test-Time Adaptation

Title: FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait

Title: DuoCast: Duo-Probabilistic Meteorology-Aware Model for Extended Precipitation Nowcasting

Title: One Shot, One Talk: Whole-body Talking Avatar from a Single Image

Title: Multi-Scale Representation Learning for Protein Fitness Prediction

Title: DIR: Retrieval-Augmented Image Captioning with Comprehensive Understanding

Title: Look Ma, No Ground Truth! Ground-Truth-Free Tuning of Structure from Motion and Visual SLAM

Title: LoyalDiffusion: A Diffusion Model Guarding Against Data Replication

Title: Referring Video Object Segmentation via Language-aligned Track Selection

Title: TextSSR: Diffusion-based Data Synthesis for Scene Text Recognition

Title: R.I.P.: A Simple Black-box Attack on Continual Test-time Adaptation

Title: Graph Community Augmentation with GMM-based Modeling in Latent Space

Title: Rectified Flow For Structure Based Drug Design

Title: OBI-Bench: Can LMMs Aid in Study of Ancient Script on Oracle Bones?

Title: MeasureNet: Measurement Based Celiac Disease Identification

Title: MiningGPT -- A Domain-Specific Large Language Model for the Mining Industry

Title: TinyFusion: Diffusion Transformers Learned Shallow

Title: Domain Adaptive Diabetic Retinopathy Grading with Model Absence and Flowing Data

Title: PainterNet: Adaptive Image Inpainting with Actual-Token Attention and Diverse Mask Control

Title: Inspiring the Next Generation of Segment Anything Models: Comprehensively Evaluate SAM and SAM 2 with Diverse Prompts Towards Context-Dependent Concepts under Different Scenes

Title: Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation

Title: Concept Replacer: Replacing Sensitive Concepts in Diffusion Models via Precision Localization

Title: Revisiting Generative Policies: A Simpler Reinforcement Learning Algorithmic Perspective

Title: EmojiDiff: Advanced Facial Expression Control with High Identity Preservation in Portrait Generation

Title: NLPrompt: Noise-Label Prompt Learning for Vision-Language Models

Title: Indexing Economic Fluctuation Narratives from Keiki Watchers Survey

Title: MuLan: Adapting Multilingual Diffusion Models for Hundreds of Languages with Negligible Cost

Title: MFTF: Mask-free Training-free Object Level Layout Control Diffusion Model

Title: Long Video Diffusion Generation with Segmented Cross-Attention and Content-Rich Video Data Curation

Title: Negative Token Merging: Image-based Adversarial Feature Guidance

Title: MoTrans: Customized Motion Transfer with Text-driven Video Diffusion Models

Title: An overview of diffusion models for generative artificial intelligence

Title: Hierarchical VAE with a Diffusion-based VampPrior

Title: Second FRCSyn-onGoing: Winning Solutions and Post-Challenge Analysis to Improve Face Recognition with Synthetic Data

Title: Machine Learning Analysis of Anomalous Diffusion

Title: HoloDrive: Holistic 2D-3D Multi-Modal Street Scene Generation for Autonomous Driving

Title: FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration

Title: CPA: Camera-pose-awareness Diffusion Transformer for Video Generation

Title: DiffPatch: Generating Customizable Adversarial Patches using Diffusion Model

Title: RaD: A Metric for Medical Image Distribution Comparison in Out-of-Domain Detection and Other Applications

Title: Structured 3D Latents for Scalable and Versatile 3D Generation

Title: HaGRIDv2: 1M Images for Static and Dynamic Hand Gesture Recognition

Title: GFreeDet: Exploiting Gaussian Splatting and Foundation Models for Model-free Unseen Object Detection in the BOP Challenge 2024

Title: Multi-objective Deep Learning: Taxonomy and Survey of the State of the Art

Title: 3DSceneEditor: Controllable 3D Scene Editing with Gaussian Splatting

Title: OmniGuard: Hybrid Manipulation Localization via Augmented Versatile Deep Image Watermarking

Title: AVS-Net: Audio-Visual Scale Net for Self-supervised Monocular Metric Depth Estimation

Title: Gen-SIS: Generative Self-augmentation Improves Self-supervised Learning

Title: Diffusion Models with Anisotropic Gaussian Splatting for Image Inpainting

Title: Driving Scene Synthesis on Free-form Trajectories with Generative Prior

Title: LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant

Title: XQ-GAN: An Open-source Image Tokenization Framework for Autoregressive Generation

Title: Hard Constraint Guided Flow Matching for Gradient-Free Generation of PDE Solutions

Title: Pretrained Reversible Generation as Unsupervised Visual Representation Learning

Title: CTRL-D: Controllable Dynamic 3D Scene Editing with Personalized 2D Diffusion

Title: IQA-Adapter: Exploring Knowledge Transfer from Image Quality Assessment to Diffusion-based Generative Models

Title: SceneFactor: Factored Latent 3D Diffusion for Controllable 3D Scene Generation

Title: COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training

Title: Switti: Designing Scale-Wise Transformers for Text-to-Image Synthesis

Title: Towards Universal Soccer Video Understanding

Title: World-consistent Video Diffusion with Explicit 3D Modeling

Title: X-Prompt: Towards Universal In-Context Image Generation in Auto-Regressive Vision Language Foundation Models