diffusion

Title: StableDreamer: Taming Noisy Score Distillation Sampling for Text-to-3D. (arXiv:2312.02189v1 [cs.CV])

Title: Diffusion Handles: Enabling 3D Edits for Diffusion Models by Lifting Activations to 3D. (arXiv:2312.02190v1 [cs.CV])

Title: Exploiting Diffusion Priors for All-in-One Image Restoration. (arXiv:2312.02197v1 [cs.CV])

Title: ImageDream: Image-Prompt Multi-view Diffusion for 3D Generation. (arXiv:2312.02201v1 [cs.CV])

Title: Portrait Diffusion: Training-free Face Stylization with Chain-of-Painting. (arXiv:2312.02212v1 [cs.CV])

Title: Slice3D: Multi-Slice, Occlusion-Revealing, Single View 3D Reconstruction. (arXiv:2312.02221v1 [cs.CV])

Title: MedXChat: Bridging CXR Modalities with a Unified Multimodal Large Model. (arXiv:2312.02233v1 [cs.CV])

Title: X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model. (arXiv:2312.02238v1 [cs.CV])

Title: Conditional Variational Diffusion Models. (arXiv:2312.02246v1 [cs.CV])

Title: Large Language Models as Consistent Story Visualizers. (arXiv:2312.02252v1 [cs.CV])

Title: Diversify, Don't Fine-Tune: Scaling Up Visual Recognition Training with Synthetic Images. (arXiv:2312.02253v1 [cs.CV])

Title: EMDM: Efficient Motion Diffusion Model for Fast, High-Quality Motion Generation. (arXiv:2312.02256v1 [cs.CV])

Title: Towards Granularity-adjusted Pixel-level Semantic Annotation. (arXiv:2312.02420v1 [cs.CV])

Title: Orthogonal Adaptation for Modular Customization of Diffusion Models. (arXiv:2312.02432v1 [cs.CV])

Title: Retrieving Conditions from Reference Images for Diffusion Models. (arXiv:2312.02521v1 [cs.CV])

Title: GeNIe: Generative Hard Negative Images Through Diffusion. (arXiv:2312.02548v1 [cs.CV])

Title: Prompt2NeRF-PIL: Fast NeRF Generation via Pretrained Implicit Latent. (arXiv:2312.02568v1 [cs.CV])

Title: Projection Regret: Reducing Background Bias for Novelty Detection via Diffusion Models. (arXiv:2312.02615v1 [cs.LG])

Title: DreaMo: Articulated 3D Reconstruction From A Single Casual Video. (arXiv:2312.02617v1 [cs.CV])

Title: Diffusion Noise Feature: Accurate and Fast Generated Image Detection. (arXiv:2312.02625v1 [cs.CV])

Title: TPA3D: Triplane Attention for Fast Text-to-3D Generation. (arXiv:2312.02647v1 [cs.CV])

Title: Analyzing and Improving the Training Dynamics of Diffusion Models. (arXiv:2312.02696v1 [cs.CV])

Title: Neural Sign Actors: A diffusion model for 3D sign language production from text. (arXiv:2312.02702v1 [cs.CV])

Title: A Conditional Denoising Diffusion Probabilistic Model for Point Cloud Upsampling. (arXiv:2312.02719v1 [cs.CV])

Title: Generating Fine-Grained Human Motions Using ChatGPT-Refined Descriptions. (arXiv:2312.02772v1 [cs.CV])

Title: BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models. (arXiv:2312.02813v1 [cs.CV])

Title: Deterministic Guidance Diffusion Model for Probabilistic Weather Forecasting. (arXiv:2312.02819v1 [cs.CV])

self-supervised

Title: TailorMe: Self-Supervised Learning of an Anatomically Constrained Volumetric Human Shape Model. (arXiv:2312.02173v1 [cs.CV])

Title: Local Masking Meets Progressive Freezing: Crafting Efficient Vision Transformers for Self-Supervised Learning. (arXiv:2312.02194v1 [cs.CV])

Title: USat: A Unified Self-Supervised Encoder for Multi-Sensor Satellite Imagery. (arXiv:2312.02199v1 [cs.CV])

Title: Disentangling the Effects of Data Augmentation and Format Transform in Self-Supervised Learning of Image Representations. (arXiv:2312.02205v1 [cs.CV])

Title: A Data-efficient Framework for Robotics Large-scale LiDAR Scene Parsing. (arXiv:2312.02208v1 [cs.CV])

Title: Class-Discriminative Attention Maps for Vision Transformers. (arXiv:2312.02364v1 [cs.CV])

Title: AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation. (arXiv:2312.02512v1 [cs.CV])

Title: Rethinking and Simplifying Bootstrapped Graph Latents. (arXiv:2312.02619v1 [cs.LG])

foundation model

Title: Towards General Purpose Vision Foundation Models for Medical Image Analysis: An Experimental Study of DINOv2 on Radiology Benchmarks. (arXiv:2312.02366v1 [cs.CV])

generative

Title: The SVHN Dataset Is Deceptive for Probabilistic Generative Models Due to a Distribution Mismatch. (arXiv:2312.02168v1 [cs.CV])

Title: InvertAvatar: Incremental GAN Inversion for Generalized Head Avatars. (arXiv:2312.02222v1 [cs.CV])

Title: Tracing Hyperparameter Dependencies for Model Parsing via Learnable Graph Pooling Network. (arXiv:2312.02224v1 [cs.CV])

Title: GenEM: Physics-Informed Generative Cryo-Electron Microscopy. (arXiv:2312.02235v1 [cs.CV])

Title: PatchFusion: An End-to-End Tile-Based Framework for High-Resolution Monocular Metric Depth Estimation. (arXiv:2312.02284v1 [cs.CV])

Title: How Generative-AI can be Effectively used in Government Chatbots. (arXiv:2312.02181v1 [cs.CL])

Title: An Evaluation Framework for Mapping News Headlines to Event Classes in a Knowledge Graph. (arXiv:2312.02334v1 [cs.CL])

Title: Visually Grounded Language Learning: a review of language games, datasets, tasks, and models. (arXiv:2312.02431v1 [cs.CL])

Title: MKA: A Scalable Medical Knowledge Assisted Mechanism for Generative Models on Medical Conversation Tasks. (arXiv:2312.02496v1 [cs.CL])

Title: H-GAP: Humanoid Control with a Generalist Planner. (arXiv:2312.02682v1 [cs.LG])

Title: Toward autocorrection of chemical process flowsheets using large language models. (arXiv:2312.02873v1 [cs.LG])

anomaly

Title: A Unified Simulation Framework for Visual and Behavioral Fidelity in Crowd Analysis. (arXiv:2312.02613v1 [cs.CV])

Title: Pseudo Replay-based Class Continual Learning for Online New Category Anomaly Detection in Additive Manufacturing. (arXiv:2312.02491v1 [cs.LG])

Title: MEMTO: Memory-guided Transformer for Multivariate Time Series Anomaly Detection. (arXiv:2312.02530v1 [cs.LG])

Title: A Self-Commissioning Edge Computing Method for Data-Driven Anomaly Detection in Power Electronic Systems. (arXiv:2312.02661v1 [cs.LG])

Title: Semi-Supervised Health Index Monitoring with Feature Generation and Fusion. (arXiv:2312.02867v1 [cs.LG])

in-context

Title: Towards More Unified In-context Visual Understanding. (arXiv:2312.02520v1 [cs.CV])

Title: Machine Vision Therapy: Multimodal Large Language Models Can Enhance Visual Robustness via Denoising In-Context Learning. (arXiv:2312.02546v1 [cs.CV])

Title: Prompt Optimization via Adversarial In-Context Learning. (arXiv:2312.02614v1 [cs.LG])