2024-12-03

Title: LeMoLE: LLM-Enhanced Mixture of Linear Experts for Time Series Forecasting

Title: Deep Learning-Based Electricity Price Forecast for Virtual Bidding in Wholesale Electricity Market

Title: DiffGuard: Text-Based Safety Checker for Diffusion Models

Title: Addressing Vulnerabilities in AI-Image Detection: Challenges and Proposed Solutions

Title: Graph Canvas for Controllable 3D Scene Generation

Title: Mixture of Cache-Conditional Experts for Efficient Mobile Device Inference

Title: Steering Rectified Flow Models in the Vector Field for Controlled Image Generation

Title: BiPO: Bidirectional Partial Occlusion Network for Text-to-Motion Synthesis

Title: OpenHumanVid: A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation

Title: Bridging the Gap: Aligning Text-to-Image Diffusion Models with Specific Feedback

Title: Auto-Encoded Supervision for Perceptual Image Super-Resolution

Title: Orthus: Autoregressive Interleaved Image-Text Generation with Modality-Specific Heads

Title: Open-Sora Plan: Open-Source Large Video Generation Model

Title: Event-based Tracking of Any Point with Motion-Robust Correlation Features

Title: Differentiable Topology Estimating from Curvatures for 3D Shapes

Title: Sparse Attention Vectors: Generative Multimodal Model Features Are Discriminative Vision-Language Classifiers

Title: Motion Modes: What Could Happen Next?

Title: ROSE: Revolutionizing Open-Set Dense Segmentation with Patch-Wise Perceptual Large Multimodal Model

Title: VISION-XL: High Definition Video Inverse Problem Solver using Latent Image Diffusion Models

Title: AerialGo: Walking-through City View Generation from Aerial Perspectives

Title: Origin-Destination Demand Prediction: An Urban Radiation and Attraction Perspective

Title: Art-Free Generative Models: Art Creation Without Graphic Art Knowledge

Title: LumiNet: Latent Intrinsics Meets Diffusion Models for Indoor Scene Relighting

Title: Diffusion Model Guided Sampling with Pixel-Wise Aleatoric Uncertainty Estimation

Title: Refine-by-Align: Reference-Guided Artifacts Refinement through Semantic Alignment

Title: Towards Pixel-Level Prediction for Gaze Following: Benchmark and Approach

Title: Fusing Physics-Driven Strategies and Cross-Modal Adversarial Learning: Toward Multi-Domain Applications

Title: Vision Technologies with Applications in Traffic Surveillance Systems: A Holistic Survey

Title: Approximate Fiber Product: A Preliminary Algebraic-Geometric Perspective on Multimodal Embedding Alignment

Title: DogLayout: Denoising Diffusion GAN for Discrete and Continuous Layout Generation

Title: On autoregressive deep learning models for day-ahead wind power forecasting with irregular shutdowns due to redispatching

Title: FreeCond: Free Lunch in the Input Conditions of Text-Guided Inpainting

Title: A conditional Generative Adversarial network model for the Weather4Cast 2024 Challenge

Title: Homeostasis and Sparsity in Transformer

Title: Good, Cheap, and Fast: Overfitted Image Compression with Wasserstein Distortion

Title: Graph-to-SFILES: Control structure prediction from process topologies using generative artificial intelligence

Title: Instant3dit: Multiview Inpainting for Fast Editing of 3D Objects

Title: Human Action CLIPS: Detecting AI-generated Human Motion

Title: Motion Dreamer: Realizing Physically Coherent Video Generation through Scene-Aware Motion Reasoning

Title: Blind Inverse Problem Solving Made Easy by Text-to-Image Latent Diffusion

Title: Contextual Bandits in Payment Processing: Non-uniform Exploration and Supervised Learning at Adyen

Title: Continuous Concepts Removal in Text-to-image Diffusion Models

Title: Generative LiDAR Editing with Controllable Novel Object Layouts

Title: PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation

Title: A Lesson in Splats: Teacher-Guided Diffusion for 3D Gaussian Splats Generation with 2D Supervision

Title: Sketch-Guided Motion Diffusion for Stylized Cinemagraph Synthesis

Title: Learning on Less: Constraining Pre-trained Model Learning for Generalizable Diffusion-Generated Image Detection

Title: Explaining Object Detectors via Collective Contribution of Pixels

Title: FiffDepth: Feed-forward Transformation of Diffusion-Based Generators for Detailed Depth Estimation

Title: Paint Outside the Box: Synthesizing and Selecting Training Data for Visual Grounding

Title: Synergizing Motion and Appearance: Multi-Scale Compensatory Codebooks for Talking Head Video Generation

Title: Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Diffusion Transformer Networks

Title: CtrlNeRF: The Generative Neural Radiation Fields for the Controllable Synthesis of High-fidelity 3D-Aware Images

Title: Prompt as Free Lunch: Enhancing Diversity in Source-Free Cross-domain Few-shot Learning through Semantic-Guided Prompting

Title: DIVD: Deblurring with Improved Video Diffusion Model

Title: Memories of Forgotten Concepts

Title: Generative Model for Synthesizing Ionizable Lipids: A Monte Carlo Tree Search Approach

Title: EventGPT: Event Stream Understanding with Multimodal Large Language Models

Title: Particle-based 6D Object Pose Estimation from Point Clouds using Diffusion Models

Title: AniMer: Animal Pose and Shape Estimation Using Family Aware Transformer

Title: Advanced Video Inpainting Using Optical Flow-Guided Efficient Diffusion

Title: Deep evolving semi-supervised anomaly detection

Title: Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Context Sparsification

Title: Beyond Pixels: Text Enhances Generalization in Real-World Image Restoration

Title: A Deep Generative Model for the Design of Synthesizable Ionizable Lipids

Title: STEVE-Audio: Expanding the Goal Conditioning Modalities of Embodied Agents in Minecraft

Title: WAFFLE: Multimodal Floorplan Understanding in the Wild

Title: Hierarchical Prompt Decision Transformer: Improving Few-Shot Policy Generalization with Global and Adaptive

Title: Detecting Memorization in Large Language Models

Title: FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait

Title: Hiding Faces in Plain Sight: Defending DeepFakes by Disrupting Face Detection

Title: DIR: Retrieval-Augmented Image Captioning with Comprehensive Understanding

Title: Look Ma, No Ground Truth! Ground-Truth-Free Tuning of Structure from Motion and Visual SLAM

Title: LoyalDiffusion: A Diffusion Model Guarding Against Data Replication

Title: Object Tracking in a $360^\circ$ View: A Novel Perspective on Bridging the Gap to Biomedical Advancements

Title: SUICA: Learning Super-high Dimensional Sparse Implicit Neural Representations for Spatial Transcriptomics

Title: Referring Video Object Segmentation via Language-aligned Track Selection

Title: TextSSR: Diffusion-based Data Synthesis for Scene Text Recognition

Title: ControlFace: Harnessing Facial Parametric Control for Face Rigging

Title: Graph Community Augmentation with GMM-based Modeling in Latent Space

Title: Rectified Flow For Structure Based Drug Design

Title: MeasureNet: Measurement Based Celiac Disease Identification

Title: TinyFusion: Diffusion Transformers Learned Shallow

Title: Domain Adaptive Diabetic Retinopathy Grading with Model Absence and Flowing Data

Title: PainterNet: Adaptive Image Inpainting with Actual-Token Attention and Diverse Mask Control

Title: Inspiring the Next Generation of Segment Anything Models: Comprehensively Evaluate SAM and SAM 2 with Diverse Prompts Towards Context-Dependent Concepts under Different Scenes

Title: Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation

Title: Concept Replacer: Replacing Sensitive Concepts in Diffusion Models via Precision Localization

Title: Revisiting Generative Policies: A Simpler Reinforcement Learning Algorithmic Perspective

Title: EmojiDiff: Advanced Facial Expression Control with High Identity Preservation in Portrait Generation

Title: MFTF: Mask-free Training-free Object Level Layout Control Diffusion Model

Title: Long Video Diffusion Generation with Segmented Cross-Attention and Content-Rich Video Data Curation

Title: Negative Token Merging: Image-based Adversarial Feature Guidance

Title: MoTrans: Customized Motion Transfer with Text-driven Video Diffusion Models

Title: An overview of diffusion models for generative artificial intelligence

Title: Hierarchical VAE with a Diffusion-based VampPrior

Title: Efficient LLM Inference using Dynamic Input Pruning and Cache-Aware Masking

Title: Second FRCSyn-onGoing: Winning Solutions and Post-Challenge Analysis to Improve Face Recognition with Synthetic Data

Title: HoloDrive: Holistic 2D-3D Multi-Modal Street Scene Generation for Autonomous Driving

Title: FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration

Title: CPA: Camera-pose-awareness Diffusion Transformer for Video Generation

Title: DiffPatch: Generating Customizable Adversarial Patches using Diffusion Model

Title: Phaseformer: Phase-based Attention Mechanism for Underwater Image Restoration and Beyond

Title: SerialGen: Personalized Image Generation by First Standardization Then Personalization

Title: RaD: A Metric for Medical Image Distribution Comparison in Out-of-Domain Detection and Other Applications

Title: Structured 3D Latents for Scalable and Versatile 3D Generation

Title: InfinityDrive: Breaking Time Limits in Driving World Models

Title: Tokenizing 3D Molecule Structure with Quantized Spherical Coordinates

Title: Multi-objective Deep Learning: Taxonomy and Survey of the State of the Art

Title: 3DSceneEditor: Controllable 3D Scene Editing with Gaussian Splatting

Title: OmniGuard: Hybrid Manipulation Localization via Augmented Versatile Deep Image Watermarking

Title: Gen-SIS: Generative Self-augmentation Improves Self-supervised Learning

Title: Driving Scene Synthesis on Free-form Trajectories with Generative Prior

Title: LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant

Title: XQ-GAN: An Open-source Image Tokenization Framework for Autoregressive Generation

Title: Hard Constraint Guided Flow Matching for Gradient-Free Generation of PDE Solutions

Title: Pretrained Reversible Generation as Unsupervised Visual Representation Learning

Title: IQA-Adapter: Exploring Knowledge Transfer from Image Quality Assessment to Diffusion-based Generative Models

Title: SceneFactor: Factored Latent 3D Diffusion for Controllable 3D Scene Generation

Title: Switti: Designing Scale-Wise Transformers for Text-to-Image Synthesis

Title: Towards Universal Soccer Video Understanding

Title: World-consistent Video Diffusion with Explicit 3D Modeling

Title: X-Prompt: Towards Universal In-Context Image Generation in Auto-Regressive Vision Language Foundation Models

Title: RandAR: Decoder-only Autoregressive Visual Generation in Random Orders