2024-12-17

Title: Personalized and Sequential Text-to-Image Generation

Title: CAP: Evaluation of Persuasive and Creative Image Generation

Title: GPTDrawer: Enhancing Visual Synthesis through ChatGPT

Title: Benchmarking Federated Learning for Semantic Datasets: Federated Scene Graph Generation

Title: SVGFusion: Scalable Text-to-SVG Generation via Vector Space Diffusion

Title: SweetTokenizer: Semantic-Aware Spatial-Temporal Tokenizer for Compact Visual Discretization

Title: Boundary Exploration of Next Best View Policy in 3D Robotic Scanning

Title: Geo-LLaVA: A Large Multi-Modal Model for Solving Geometry Math Problems with Meta In-Context Learning

Title: Motion Generation Review: Exploring Deep Learning for Lifelike Animation with Manifold

Title: SVGBuilder: Component-Based Colored SVG Generation with Text-Guided Autoregressive Transformers

Title: CognitionCapturer: Decoding Visual Stimuli From Human EEG Signal With Multimodal Information

Title: SafetyDPO: Scalable Safety Alignment for Text-to-Image Generation

Title: SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device

Title: The Language of Motion: Unifying Verbal and Non-verbal Language of 3D Human Motion

Title: Solving the Inverse Alignment Problem for Efficient RLHF

Title: Towards Using Machine Learning to Generatively Simulate EV Charging in Urban Areas

Title: SUGAR: Subject-Driven Video Customization in a Zero-Shot Manner

Title: RAGServe: Fast Quality-Aware RAG Systems with Configuration Adaptation

Title: Adaptive Sampling to Reduce Epistemic Uncertainty Using Prediction Interval-Generation Neural Networks

Title: PanSR: An Object-Centric Mask Transformer for Panoptic Segmentation

Title: Towards Unified Benchmark and Models for Multi-Modal Perceptual Metrics

Title: EvalGIM: A Library for Evaluating Generative Image Models

Title: UCDR-Adapter: Exploring Adaptation of Pre-Trained Vision-Language Models for Universal Cross-Domain Retrieval

Title: Control of Overfitting with Physics

Title: GRID: Visual Layout Generation

Title: OmniHD-Scenes: A Next-Generation Multimodal Dataset for Autonomous Driving

Title: NeuralPLexer3: Physio-Realistic Biomolecular Complex Structure Prediction with Flow Models

Title: VinTAGe: Joint Video and Text Conditioning for Holistic Audio Generation

Title: Video Diffusion Transformers are In-Context Learners

Title: StyleDiT: A Unified Framework for Diverse Child and Partner Faces Synthesis with Style Latent Diffusion Transformer

Title: Optimizing Few-Step Sampler for Diffusion Probabilistic Model

Title: Reliable and superior elliptic Fourier descriptor normalization and its application software ElliShape with efficient image processing

Title: Medical Manifestation-Aware De-Identification

Title: Diffusion Model from Scratch

Title: Unbiased General Annotated Dataset Generation

Title: RWKV-edge: Deeply Compressed RWKV for Resource-Constrained Devices

Title: Zigzag Diffusion Sampling: The Path to Success Is Zigzag

Title: Multi-Class and Multi-Task Strategies for Neural Directed Link Prediction

Title: Video Representation Learning with Joint-Embedding Predictive Architectures

Title: Progressive Compression with Universally Quantized Diffusion Models

Title: A Staged Deep Learning Approach to Spatial Refinement in 3D Temporal Atmospheric Transport

Title: SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer

Title: FlowDock: Geometric Flow Matching for Generative Protein-Ligand Docking and Affinity Prediction

Title: Towards Context-aware Convolutional Network for Image Restoration

Title: PromptV: Leveraging LLM-powered Multi-Agent Prompting for High-quality Verilog Generation

Title: Exploring Diffusion and Flow Matching Under Generator Matching

Title: From Simple to Professional: A Combinatorial Controllable Image Captioning Agent

Title: SceneLLM: Implicit Language Reasoning in LLM for Dynamic Scene Graph Generation

Title: AURORA: Automated Unleash of 3D Room Outlines for VR Applications

Title: Understanding and Mitigating Memorization in Diffusion Models for Tabular Data

Title: RAC3: Retrieval-Augmented Corner Case Comprehension for Autonomous Driving with Vision-Language Models

Title: DisCo-DSO: Coupling Discrete and Continuous Optimization for Efficient Generative Design in Hybrid Spaces

Title: Overview of TREC 2024 Medical Video Question Answering (MedVidQA) Track

Title: HC-LLM: Historical-Constrained Large Language Models for Radiology Report Generation

Title: Edge Contrastive Learning: An Augmentation-Free Graph Contrastive Learning Model

Title: DynamicScaler: Seamless and Scalable Video Generation for Panoramic Scenes

Title: Empowering LLMs to Understand and Generate Complex Vector Graphics

Title: A Comprehensive Survey of Action Quality Assessment: Method and Benchmark

Title: OTLRM: Orthogonal Learning-based Low-Rank Metric for Multi-Dimensional Inverse Problems

Title: Benchmarking and Learning Multi-Dimensional Quality Evaluator for Text-to-3D Generation

Title: OccScene: Semantic Occupancy-based Cross-task Mutual Learning for 3D Scene Generation

Title: Light-T2M: A Lightweight and Fast Model for Text-to-motion Generation

Title: GEM: A Generalizable Ego-Vision Multimodal World Model for Fine-Grained Ego-Motion, Object Dynamics, and Scene Composition Control

Title: GenLit: Reformulating Single-Image Relighting as Video Generation

Title: On the Generalizability of Iterative Patch Selection for Memory-Efficient High-Resolution Image Classification

Title: Wasserstein Bounds for generative diffusion models with Gaussian tail targets

Title: Detecting Daily Living Gait Amid Huntington's Disease Chorea using a Foundation Deep Learning Model

Title: Grassmannian Geometry Meets Dynamic Mode Decomposition in DMD-GEN: A New Metric for Mode Collapse in Time Series Generative Models

Title: One-Shot Multilingual Font Generation Via ViT

Title: Adapting Segment Anything Model (SAM) to Experimental Datasets via Fine-Tuning on GAN-based Simulation: A Case Study in Additive Manufacturing

Title: Leveraging Retrieval-Augmented Tags for Large Vision-Language Understanding in Complex Scenes

Title: Quantization of Climate Change Impacts on Renewable Energy Generation Capacity: A Super-Resolution Recurrent Diffusion Model

Title: Scaled Conjugate Gradient Method for Nonconvex Optimization in Deep Neural Networks

Title: An Enhanced Classification Method Based on Adaptive Multi-Scale Fusion for Long-tailed Multispectral Point Clouds

Title: Multi-modal and Multi-scale Spatial Environment Understanding for Immersive Visual Text-to-Speech

Title: Nearly Zero-Cost Protection Against Mimicry by Personalized Diffusion Models

Title: Towards Scientific Discovery with Generative AI: Progress, Opportunities, and Challenges

Title: Learning Implicit Features with Flow Infused Attention for Realistic Virtual Try-On

Title: Bayesian Flow Is All You Need to Sample Out-of-Distribution Chemical Spaces

Title: FedCAR: Cross-client Adaptive Re-weighting for Generative Models in Federated Learning

Title: HGSFusion: Radar-Camera Fusion with Hybrid Generation and Synchronization for 3D Object Detection

Title: IGR: Improving Diffusion Model for Garment Restoration from Person Image

Title: LineArt: A Knowledge-guided Training-free High-quality Appearance Transfer for Design Drawing with Diffusion Model

Title: Sequence Matters: Harnessing Video Models in Super-Resolution

Title: MPQ-DM: Mixed Precision Quantization for Extremely Low Bit Diffusion Models

Title: StrandHead: Text to Strand-Disentangled 3D Head Avatars Using Hair Geometric Priors

Title: VersaGen: Unleashing Versatile Visual Control for Text-to-Image Synthesis

Title: MeshArt: Generating Articulated Meshes with Structure-guided Transformers

Title: 3D$^2$-Actor: Learning Pose-Conditioned 3D-Aware Denoiser for Realistic Gaussian Avatar Modeling

Title: CLIP-SR: Collaborative Linguistic and Image Processing for Super-Resolution

Title: VG-TVP: Multimodal Procedural Planning via Visually Grounded Text-Video Prompting

Title: Predicting the Original Appearance of Damaged Historical Documents

Title: IDProtector: An Adversarial Noise Encoder to Protect Against ID-Preserving Image Generation

Title: EGP3D: Edge-guided Geometric Preserving 3D Point Cloud Super-resolution for RGB-D camera

Title: AsymRnR: Video Diffusion Transformers Acceleration with Asymmetric Reduction and Restoration

Title: Transferable Adversarial Face Attack with Text Controlled Attribute

Title: Generative Inbetweening through Frame-wise Conditions-Driven Video Generation

Title: IDEA-Bench: How Far are Generative Models from Professional Designing?

Title: Fast and Slow Gradient Approximation for Binary Neural Network Optimization

Title: Impact of Face Alignment on Face Image Quality

Title: InterDyn: Controllable Interactive Dynamics with Video Diffusion Models

Title: AMI-Net: Adaptive Mask Inpainting Network for Industrial Anomaly Detection and Localization

Title: ColorFlow: Retrieval-Augmented Image Sequence Colorization

Title: UnMA-CapSumT: Unified and Multi-Head Attention-driven Caption Summarization Transformer

Title: Controllable Shadow Generation with Single-Step Diffusion Models from Synthetic Data

Title: Industrial-scale Prediction of Cement Clinker Phases using Machine Learning

Title: A LoRA is Worth a Thousand Pictures

Title: Wonderland: Navigating 3D Scenes from a Single Image

Title: CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models

Title: Causal Diffusion Transformers for Generative Modeling