2024-03-15

Title: Procedural terrain generation with style transfer

Title: NoiseDiffusion: Correcting Noise for Image Interpolation with Diffusion Models beyond Spherical Linear Interpolation

Title: ARtVista: Gateway To Empower Anyone Into Artist

Title: Federated Data Model

Title: Envision3D: One Image to 3D with Anchor Views Interpolation

Title: Unveiling the Truth: Exploring Human Gaze Patterns in Fake Images

Title: Representing Anatomical Trees by Denoising Diffusion of Implicit Neural Fields

Title: VisionGPT: Vision-Language Understanding Agent Using Generalized Multimodal Framework

Title: Spatial-temporal Memories Enhanced Graph Autoencoder for Anomaly Detection in Dynamic Graphs

Title: Keyformer: KV Cache Reduction through Key Tokens Selection for Efficient Generative Inference

Title: StreamMultiDiffusion: Real-Time Interactive Generation with Region-Based Semantic Control

Title: Leveraging Foundation Model Automatic Data Augmentation Strategies and Skeletal Points for Hands Action Recognition in Industrial Assembly Lines

Title: UniCode: Learning a Unified Codebook for Multimodal Large Language Models

Title: Large Language Models are Parallel Multilingual Learners

Title: Desigen: A Pipeline for Controllable Design Template Generation

Title: Rethinking Referring Object Removal

Title: Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior

Title: Basque and Spanish Counter Narrative Generation: Data Creation and Evaluation

Title: Unveiling the Generalization Power of Fine-Tuned Large Language Models

Title: Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts

Title: Intention-aware Denoising Diffusion Model for Trajectory Prediction

Title: PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation

Title: Intention-driven Ego-to-Exo Video Generation

Title: Noise Dimension of GAN: An Image Compression Perspective

Title: Customizing Segmentation Foundation Model via Prompt Learning for Instance Segmentation

Title: LAN: Learning Adaptive Neighbors for Real-Time Insider Threat Detection

Title: Rethinking Autoencoders for Medical Anomaly Detection from A Theoretical Perspective

Title: Annotation Free Semantic Segmentation with Vision Foundation Models

Title: Privacy Preserving Anomaly Detection on Homomorphic Encrypted Data from IoT Sensors

Title: Video Editing via Factorized Diffusion Distillation

Title: GiT: Towards Generalist Vision Transformer through Universal Language Interface

Title: XCoOp: Explainable Prompt Learning for Computer-Aided Diagnosis via Concept-guided Context Optimization

Title: OpenGraph: Open-Vocabulary Hierarchical 3D Graph Representation in Large-Scale Outdoor Environments

Title: Mitigating attribute amplification in counterfactual image generation

Title: Borrowing Treasures from Neighbors: In-Context Learning for Multimodal Learning with Missing Modalities and Data Scarcity

Title: 3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation

Title: Shake to Leak: Fine-tuning Diffusion Models Can Amplify the Generative Privacy Risk

Title: Eta Inversion: Designing an Optimal Eta Function for Diffusion-based Real Image Editing

Title: MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models

Title: SpikeReveal: Unlocking Temporal Sequences from Real Blurry Inputs with Spike Streams

Title: Rectifying Demonstration Shortcut in In-Context Learning

Title: Anomaly Detection by Adapting a pre-trained Vision Language Model

Title: EquiAV: Leveraging Equivariance for Audio-Visual Contrastive Learning

Title: VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding

Title: Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey

Title: MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

Title: Explore In-Context Segmentation via Latent Diffusion Models

Title: Score-Guided Diffusion for 3D Human Recovery

Title: Make-Your-3D: Fast and Consistent Subject-Driven 3D Content Generation

Title: Generalized Predictive Model for Autonomous Driving

Title: 3D-VLA: A 3D Vision-Language-Action Generative World Model

Title: OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning

Title: SCP-Diff: Photo-Realistic Semantic Image Synthesis with Spatial-Categorical Joint Prior

Title: GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding