2024-06-13

Title: An Effective Approach to Scramble Multiple Diagnostic Imageries Using Chaos-Based Cryptography

Title: BrainChat: Decoding Semantic Information from fMRI using Vision-language Pretrained Models

Title: When is an Embedding Model More Promising than Another?

Title: Treeffuser: Probabilistic Predictions via Conditional Diffusions with Gradient-Boosted Trees

Title: ROADWork Dataset: Learning to Recognize, Observe, Analyze and Drive Through Work Zones

Title: AV-DiT: Efficient Audio-Visual Diffusion Transformer for Joint Audio and Video Generation

Title: Sustainable self-supervised learning for speech representations

Title: CUPID: Contextual Understanding of Prompt-conditioned Image Distributions

Title: Object-level Scene Deocclusion

Title: HOI-Swap: Swapping Objects in Videos with Hand-Object Interaction Awareness

Title: Hierarchical Patch Diffusion Models for High-Resolution Video Generation

Title: Are Large Language Models Good Statisticians?

Title: Sense Less, Generate More: Pre-training LiDAR Perception with Masked Autoencoders for Ultra-Efficient 3D Sensing

Title: SynthForge: Synthesizing High-Quality Face Dataset with Controllable 3D Generative Models

Title: Understanding and Mitigating Compositional Issues in Text-to-Image Generative Models

Title: DiffPop: Plausibility-Guided Object Placement Diffusion for Image Composition

Title: FaithFill: Faithful Inpainting for Object Completion Using a Single Reference Image

Title: Flexible Music-Conditioned Dance Generation with Style Description Prompts

Title: Small Scale Data-Free Knowledge Distillation

Title: A Comprehensive Survey on Machine Learning Driven Material Defect Detection: Challenges, Solutions, and Future Prospects

Title: GENIU: A Restricted Data Access Unlearning for Imbalanced Data

Title: An Empirical Study of Mamba-based Language Models

Title: Emotional Conversation: Empowering Talking Faces with Cohesive Expression, Gaze and Pose Generation

Title: Exploring Self-Supervised Multi-view Contrastive Learning for Speech Emotion Recognition with Limited Annotations

Title: Ablation Based Counterfactuals

Title: DeTriever: Decoder-representation-based Retriever for Improving NL2SQL In-Context Learning

Title: Accurate Explanation Model for Image Classifiers using Class Association Embedding

Title: Guiding In-Context Learning of LLMs through Quality Estimation for Machine Translation

Title: SimSAM: Simple Siamese Representations Based Semantic Affinity Matrix for Unsupervised Image Segmentation

Title: Generalizable Disaster Damage Assessment via Change Detection with Vision Foundation Model

Title: CFG++: Manifold-constrained Classifier Free Guidance for Diffusion Models

Title: Make Your Actor Talk: Generalizable and High-Fidelity Lip Sync with Motion and Appearance Disentanglement

Title: Continuous fake media detection: adapting deepfake detectors to new generative techniques

Title: Diffusion-Promoted HDR Video Reconstruction

Title: A Sociotechnical Lens for Evaluating Computer Vision Models: A Case Study on Detecting and Reasoning about Gender and Emotion

Title: MaIL: Improving Imitation Learning with Mamba

Title: Dataset Enhancement with Instance-Level Augmentations

Title: A deep cut into Split Federated Self-supervised Learning

Title: Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata

Title: GraphFM: A Comprehensive Benchmark for Graph Foundation Model

Title: WMAdapter: Adding WaterMark Control to Latent Diffusion Models

Title: APSeg: Auto-Prompt Network for Cross-Domain Few-Shot Semantic Segmentation

Title: 2.5D Multi-view Averaging Diffusion Model for 3D Medical Image Translation: Application to Low-count PET Reconstruction with CT-less Attenuation Correction

Title: FontStudio: Shape-Adaptive Diffusion Model for Coherent and Consistent Font Effect Generation

Title: Tailoring Generative AI Chatbots for Multiethnic Communities in Disaster Preparedness Communication: Extending the CASA Paradigm

Title: OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

Title: State Soup: In-Context Skill Learning, Retrieval and Mixing

Title: Diffusion Soup: Model Merging for Text-to-Image Diffusion Models

Title: OLMES: A Standard for Language Model Evaluations

Title: Self-supervised Learning of Neural Implicit Feature Fields for Camera Pose Refinement

Title: PAL: Pluralistic Alignment Framework for Learning from Heterogeneous Preferences

Title: Human 3Diffusion: Realistic Avatar Creation via Explicit 3D Consistent Diffusion Models

Title: What If We Recaption Billions of Web Images with LLaMA-3?

Title: Enhancing End-to-End Autonomous Driving with Latent World Model

Title: Words Worth a Thousand Pictures: Measuring and Understanding Perceptual Variability in Text-to-Image Generation