2024-06-14

Title: Enhanced Anomaly Detection in Automotive Systems Using SAAD: Statistical Aggregated Anomaly Detection

Title: DiTFastAttn: Attention Compression for Diffusion Transformer Models

Title: Language Model Council: Benchmarking Foundation Models on Highly Subjective Tasks by Consensus

Title: FakeInversion: Learning to Detect Images from Unseen Text-to-Image Models by Inverting Stable Diffusion

Title: End-to-End Argument Mining as Augmented Natural Language Generation

Title: Self-Supervised Speech Representations are More Phonetic than Semantic

Title: TC-Bench: Benchmarking Temporal Compositionality in Text-to-Video and Image-to-Video Generation

Title: Vivid-ZOO: Multi-View Video Generation with Diffusion Model

Title: Fine-Tuned 'Small' LLMs (Still) Significantly Outperform Zero-Shot Generative AI Models in Text Classification

Title: mOSCAR: A Large-scale Multilingual and Multimodal Document-level Corpus

Title: Standard Language Ideology in AI-Generated Language

Title: Comparative Analysis of Deep Convolutional Neural Networks for Detecting Medical Image Deepfakes

Title: FouRA: Fourier Low Rank Adaptation

Title: Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation

Title: Few-Shot Anomaly Detection via Category-Agnostic Registration Learning

Title: COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing

Title: An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios

Title: Multiple Prior Representation Learning for Self-Supervised Monocular Depth Estimation via Hybrid Transformer

Title: Step-by-Step Diffusion: An Elementary Tutorial

Title: Preserving Identity with Variational Score for General-purpose 3D Editing

Title: XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning

Title: Cross-Modal Learning for Anomaly Detection in Fused Magnesium Smelting Process: Methodology and Benchmark

Title: Steganalysis on Digital Watermarking: Is Your Defense Truly Impervious?

Title: FacEnhance: Facial Expression Enhancing with Recurrent DDPMs

Title: ME-Switch: A Memory-Efficient Expert Switching Framework for Large Language Models

Title: Data-Free Generative Replay for Class-Incremental Learning on Imbalanced Data

Title: EquiPrompt: Debiasing Diffusion Models via Iterative Bootstrapping in Chain of Thoughts

Title: DiffPoGAN: Diffusion Policies with Generative Adversarial Networks for Offline Reinforcement Learning

Title: Chain-of-Thought (CoT) prompting strategies for medical error detection and correction

Title: Weakly-supervised anomaly detection for multimodal data distributions

Title: LASER: Learning by Aligning Self-supervised Representations of Speech for Improving Content-related Tasks

Title: DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation

Title: Towards Multilingual Audio-Visual Question Answering

Title: EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts

Title: GuardAgent: Safeguard LLM Agents by a Guard Agent via Knowledge-Enabled Reasoning

Title: Orthogonality and isotropy of speaker and phonetic information in self-supervised speech representations

Title: Language Complexity and Speech Recognition Accuracy: Orthographic Complexity Hurts, Phonological Complexity Doesn't

Title: On the Effects of Heterogeneous Data Sources on Speech-to-Text Foundation Models

Title: Neural Assets: 3D-Aware Multi-Object Scene Synthesis with Image Diffusion Models

Title: StableMaterials: Enhancing Diversity in Material Generation via Semi-Supervised Learning

Title: You Don't Need Data-Augmentation in Self-Supervised Learning

Title: Parameter-Efficient Active Learning for Foundational Models

Title: Toffee: Efficient Million-Scale Dataset Construction for Subject-Driven Text-to-Image Generation

Title: Learning from Natural Language Explanations for Generalizable Entity Matching

Title: DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding

Title: Separations in the Representational Capabilities of Transformers and Recurrent Architectures

Title: Advancing Graph Generation through Beta Diffusion

Title: Understanding Hallucinations in Diffusion Models through Mode Interpolation

Title: Towards an Improved Understanding and Utilization of Maximum Manifold Capacity Representations

Title: CLIPAway: Harmonizing Focused Embeddings for Removing Objects via Diffusion Models

Title: GGHead: Fast and Generalizable 3D Gaussian Heads

Title: Reflecting on the State of Rehearsal-free Continual Learning with Pretrained Models

Title: Towards Vision-Language Geo-Foundation Model: A Survey

Title: SimGen: Simulator-conditioned Driving Scene Generation

Title: WonderWorld: Interactive 3D Scene Generation from a Single Image

Title: Real-Time Deepfake Detection in the Real-World

Title: OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation

Title: Instruct 4D-to-4D: Editing 4D Scenes as Pseudo-3D Scenes Using 2D Diffusion

Title: ConsistDreamer: 3D-Consistent 2D Diffusion for High-Fidelity Scene Editing

Title: 4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities

Title: Interpreting the Weight Space of Customized Diffusion Models

Title: Depth Anything V2

Title: An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels

Title: Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models

Title: Rethinking Score Distillation as a Bridge Between Image Distributions