2024-11-27

Title: Leveraging Conversational Generative AI for Anomaly Detection in Digital Substations

Title: Conditional Text-to-Image Generation with Reference Guidance

Title: TPIE: Topology-Preserved Image Editing With Text Instructions

Title: Neuro-Symbolic Evaluation of Text-to-Video Models using Formal Verification

Title: Importance-based Token Merging for Diffusion Models

Title: EmotiveTalk: Expressive Talking Head Generation through Audio Information Decoupling and Emotional Video Diffusion

Title: Classifier-Free Guidance inside the Attraction Basin May Cause Memorization

Title: Gradient-Guided Parameter Mask for Multi-Scenario Image Restoration Under Adverse Weather

Title: Document Haystacks: Vision-Language Reasoning Over Piles of 1000+ Documents

Title: FollowGen: A Scaled Noise Conditional Diffusion Model for Car-Following Trajectory Prediction

Title: LetsTalk: Latent Diffusion Transformer for Talking Video Synthesis

Title: AnySynth: Harnessing the Power of Image Synthetic Data Generation for Generalized Vision-Language Tasks

Title: Imagine and Seek: Improving Composed Image Retrieval with an Imagined Proxy

Title: Visual Counter Turing Test (VCT^2): Discovering the Challenges for AI-Generated Image Detection and Introducing Visual AI Index (V_AI)

Title: Revisiting DDIM Inversion for Controlling Defect Generation by Disentangling the Background

Title: VidHal: Benchmarking Temporal Hallucinations in Vision LLMs

Title: SynDiff-AD: Improving Semantic Segmentation and End-to-End Autonomous Driving with Synthetic Data from Latent Diffusion Models

Title: NovelGS: Consistent Novel-view Denoising via Large Gaussian Reconstruction Model

Title: UniPose: A Unified Multimodal Framework for Human Pose Comprehension, Generation and Editing

Title: From Diffusion to Resolution: Leveraging 2D Diffusion Models for 3D Super-Resolution Task

Title: Phys4DGen: A Physics-Driven Framework for Controllable and Efficient 4D Content Generation from a Single Image

Title: Controllable Human Image Generation with Personalized Multi-Garments

Title: InTraGen: Trajectory-controlled Video Generation for Object Interactions

Title: Discrete to Continuous: Generating Smooth Transition Poses from Sign Language Observation

Title: Pathways on the Image Manifold: Image Editing via Video Generation

Title: DetailGen3D: Generative 3D Geometry Enhancement via Data-Dependent Flow

Title: Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing

Title: SAR3D: Autoregressive 3D Object Generation and Understanding via Multi-scale 3D VQVAE

Title: Explainable AI Approach using Near Misses Analysis

Title: ZoomLDM: Latent Diffusion Model for multi-scale image generation

Title: Words Matter: Leveraging Individual Text Embeddings for Code Generation in CLIP Test-Time Adaptation

Title: TED-VITON: Transformer-Empowered Diffusion Models for Virtual Try-On

Title: g3D-LF: Generalizable 3D-Language Feature Fields for Embodied Tasks

Title: Free$^2$Guide: Gradient-Free Path Integral Control for Enhancing Text-to-Video Generation with Large Vision-Language Models

Title: Large-Scale Data-Free Knowledge Distillation for ImageNet via Multi-Resolution Data Generation

Title: PersonalVideo: High ID-Fidelity Video Customization without Dynamic and Semantic Degradation

Title: A generalised novel loss function for computational fluid dynamics

Title: Contrastive Graph Condensation: Advancing Data Versatility through Self-Supervised Learning

Title: Relations, Negations, and Numbers: Looking for Logic in Generative Text-to-Image Models

Title: ΩSFormer: Dual-Modal Ω-like Super-Resolution Transformer Network for Cross-scale and High-accuracy Terraced Field Vectorization Extraction

Title: Efficient LLM Inference with I/O-Aware Partial KV Cache Recomputation

Title: PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion based Image Super-Resolution

Title: OSDFace: One-Step Diffusion Model for Face Restoration

Title: X-MeshGraphNet: Scalable Multi-Scale Graph Neural Networks for Physics Simulation

Title: ChatGen: Automatic Text-to-Image Generation From FreeStyle Chatting

Title: LiteVAR: Compressing Visual Autoregressive Modelling with Efficient Attention and Quantization

Title: Interleaved Scene Graph for Interleaved Text-and-Image Generation Assessment

Title: PhysMotion: Physics-Grounded Dynamics From a Single Image

Title: MAT: Multi-Range Attention Transformer for Efficient Image Super-Resolution

Title: Promptable Anomaly Segmentation with SAM Through Self-Perception Tuning

Title: AIGV-Assessor: Benchmarking and Evaluating the Perceptual Quality of Text-to-Video Generation with LMM

Title: DreamMix: Decoupling Object Attributes for Enhanced Editability in Customized Image Inpainting

Title: MWFormer: Multi-Weather Image Restoration Using Degradation-Aware Transformers

Title: From Graph Diffusion to Graph Classification

Title: Grounding-IQA: Multimodal Language Grounding Model for Image Quality Assessment

Title: Boost 3D Reconstruction using Diffusion-based Monocular Camera Calibration

Title: HEIE: MLLM-Based Hierarchical Explainable AIGC Image Implausibility Evaluator

Title: Reward Incremental Learning in Text-to-Image Generation

Title: DWCL: Dual-Weighted Contrastive Learning for Multi-View Clustering

Title: AnchorCrafter: Animate CyberAnchors Selling Your Products via Human-Object Interacting Video Generation

Title: CoA: Chain-of-Action for Generative Semantic Labels

Title: DRiVE: Diffusion-based Rigging Empowers Generation of Versatile and Expressive Characters

Title: Identity-Preserving Text-to-Video Generation by Frequency Decomposition

Title: VLRewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models

Title: FLEX-CLIP: Feature-Level GEneration Network Enhanced CLIP for X-shot Cross-modal Retrieval

Title: Adversarial Bounding Boxes Generation (ABBG) Attack against Visual Object Trackers

Title: Towards Precise Scaling Laws for Video Diffusion Transformers

Title: Unlocking the Potential of Text-to-Image Diffusion with PAC-Bayesian Theory

Title: Puzzle Similarity: A Perceptually-guided No-Reference Metric for Artifact Detection in 3D Scene Reconstructions

Title: Perceptually Optimized Super Resolution

Title: FTMoMamba: Motion Generation with Frequency and Text State Space Models

Title: IMPROVE: Improving Medical Plausibility without Reliance on Human Validation - An Enhanced Prototype-Guided Diffusion Framework

Title: Pre-training for Action Recognition with Automatically Generated Fractal Datasets

Title: VideoDirector: Precise Video Editing via Text-to-Video Models

Title: Accelerating Vision Diffusion Transformers with Skip Branches

Title: Synthetic Data Generation with LLM for Improved Depression Prediction

Title: SketchAgent: Language-Driven Sequential Sketch Generation

Title: GenDeg: Diffusion-Based Degradation Synthesis for Generalizable All-in-One Image Restoration

Title: ScribbleLight: Single Image Indoor Relighting with Scribbles

Title: Video-Guided Foley Sound Generation with Multimodal Controls