diffusion

Title: Unlocking Spatial Comprehension in Text-to-Image Diffusion Models. (arXiv:2311.17937v1 [cs.CV])

Title: DreamSync: Aligning Text-to-Image Generation with Image Understanding Feedback. (arXiv:2311.17946v1 [cs.CV])

Title: PEAN: A Diffusion-based Prior-Enhanced Attention Network for Scene Text Image Super-Resolution. (arXiv:2311.17955v1 [cs.CV])

Title: HandRefiner: Refining Malformed Hands in Generated Images by Diffusion-based Conditional Inpainting. (arXiv:2311.17957v1 [cs.CV])

Title: ChatIllusion: Efficient-Aligning Interleaved Generation ability with Visual Instruction Model. (arXiv:2311.17963v1 [cs.CV])

Title: GeoDream: Disentangling 2D and Geometric Priors for High-Fidelity and Consistent 3D Generation. (arXiv:2311.17971v1 [cs.CV])

Title: Improving Faithfulness for Vision Transformers. (arXiv:2311.17983v1 [cs.CV])

Title: 4D-fy: Text-to-4D Generation Using Hybrid Score Distillation Sampling. (arXiv:2311.17984v1 [cs.CV])

Title: Turn Down the Noise: Leveraging Diffusion Models for Test-time Adaptation via Pseudo-label Ensembling. (arXiv:2311.18071v1 [cs.CV])

Title: Zooming Out on Zooming In: Advancing Super-Resolution for Remote Sensing. (arXiv:2311.18082v1 [cs.CV])

Title: HiPA: Enabling One-Step Text-to-Image Diffusion Models via High-Frequency-Promoting Adaptation. (arXiv:2311.18158v1 [cs.CV])

Title: SMaRt: Improving GANs with Score Matching Regularity. (arXiv:2311.18208v1 [cs.LG])

Title: Diffusion Models Without Attention. (arXiv:2311.18257v1 [cs.CV])

Title: Prompt-Based Exemplar Super-Compression and Regeneration for Class-Incremental Learning. (arXiv:2311.18266v1 [cs.CV])

Title: On Exact Inversion of DPM-Solvers. (arXiv:2311.18387v1 [cs.CV])

Title: CAT-DM: Controllable Accelerated Virtual Try-on with Diffusion Model. (arXiv:2311.18405v1 [cs.CV])

Title: Layered Rendering Diffusion Model for Zero-Shot Guided Image Synthesis. (arXiv:2311.18435v1 [cs.CV])

Title: Contrastive Denoising Score for Text-guided Latent Diffusion Image Editing. (arXiv:2311.18608v1 [cs.CV])

Title: DiffCAD: Weakly-Supervised Probabilistic CAD Model Retrieval and Alignment from an RGB Image. (arXiv:2311.18610v1 [cs.CV])

Title: DiffusionAvatars: Deferred Diffusion for High-fidelity 3D Head Avatars. (arXiv:2311.18635v1 [cs.CV])

Title: Detailed Human-Centric Text Description-Driven Large Scene Synthesis. (arXiv:2311.18654v1 [cs.CV])

Title: C3Net: Compound Conditioned ControlNet for Multimodal Content Generation. (arXiv:2311.17951v1 [cs.LG])

self-supervised

Title: Scene Summarization: Clustering Scene Videos into Spatially Diverse Frames. (arXiv:2311.17940v1 [cs.CV])

Title: Object-based (yet Class-agnostic) Video Domain Adaptation. (arXiv:2311.17942v1 [cs.CV])

Title: Perceptual Group Tokenizer: Building Perception with Iterative Grouping. (arXiv:2311.18296v1 [cs.CV])

Title: Multilevel Saliency-Guided Self-Supervised Learning for Image Anomaly Detection. (arXiv:2311.18332v1 [cs.CV])

Title: A Lightweight Clustering Framework for Unsupervised Semantic Segmentation. (arXiv:2311.18628v1 [cs.CV])

Title: Stochastic Vision Transformers with Wasserstein Distance-Aware Attention. (arXiv:2311.18645v1 [cs.CV])

Title: Self-Supervised Learning for Large-Scale Preventive Security Constrained DC Optimal Power Flow. (arXiv:2311.18072v1 [cs.LG])

foundation model

Title: Guided Prompting in SAM for Weakly Supervised Cell Segmentation in Histopathological Images. (arXiv:2311.17960v1 [cs.CV])

Title: Back to 3D: Few-Shot 3D Keypoint Detection with Back-Projected 2D Features. (arXiv:2311.18113v1 [cs.CV])

Title: Label-efficient Training of Small Task-specific Models by Leveraging Vision Foundation Models. (arXiv:2311.18237v1 [cs.CV])

generative

Title: Contrastive Vision-Language Alignment Makes Efficient Instruction Learner. (arXiv:2311.17945v1 [cs.CV])

Title: Rethinking Image Editing Detection in the Era of Generative AI Revolution. (arXiv:2311.17953v1 [cs.CV])

Title: VBench: Comprehensive Benchmark Suite for Video Generative Models. (arXiv:2311.17982v1 [cs.CV])

Title: GELDA: A generative language annotation framework to reveal visual biases in datasets. (arXiv:2311.18064v1 [cs.CV])

Title: Few-shot Image Generation via Style Adaptation and Content Preservation. (arXiv:2311.18169v1 [cs.CV])

Title: TrustMark: Universal Watermarking for Arbitrary Resolution Images. (arXiv:2311.18297v1 [cs.CV])

Title: OmniMotionGPT: Animal Motion Generation with Limited Data. (arXiv:2311.18303v1 [cs.CV])

Title: ROBBIE: Robust Bias Evaluation of Large Generative Language Models. (arXiv:2311.18140v1 [cs.CL])

Title: FFT: Towards Harmlessness Evaluation and Analysis for LLMs with Factuality, Fairness, Toxicity. (arXiv:2311.18580v1 [cs.CL])

Title: Combining deep generative models with extreme value theory for synthetic hazard simulation: a multivariate and spatially coherent approach. (arXiv:2311.18521v1 [cs.LG])

anomaly

Title: Detecting Anomalous Network Communication Patterns Using Graph Convolutional Networks. (arXiv:2311.18525v1 [cs.CR])

Title: TransNAS-TSAD: Harnessing Transformers for Multi-Objective Neural Architecture Search in Time Series Anomaly Detection. (arXiv:2311.18061v1 [cs.LG])

in-context

Title: LALM: Long-Term Action Anticipation with Language Models. (arXiv:2311.17944v1 [cs.CV])

Title: Understanding and Improving In-Context Learning on Vision-language Models. (arXiv:2311.18021v1 [cs.CV])

Title: Positional Information Matters for Invariant In-Context Learning: A Case Study of Simple Function Classes. (arXiv:2311.18194v1 [cs.LG])