2024-04-03

Title: Holo-VQVAE: VQ-VAE for phase-only holograms

Title: LLaVA-Gemma: Accelerating Multimodal Foundation Models with a Compact Language Model

Title: Generative AI for Architectural Design: A Literature Review

Title: DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model

Title: Bigger is not Always Better: Scaling Properties of Latent Diffusion Models

Title: Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data

Title: DPMesh: Exploiting Diffusion Prior for Occluded Human Mesh Recovery

Title: Predicting the Performance of Foundation Models via Agreement-on-the-Line

Title: Diffusion Deepfake

Title: Entity Disambiguation via Fusion Entity Decoding

Title: Enhancing Functional Safety in Automotive AMS Circuits through Unsupervised Machine Learning

Title: AI WALKUP: A Computer-Vision Approach to Quantifying MDS-UPDRS in Parkinson's Disease

Title: FashionEngine: Interactive Generation and Editing of 3D Clothed Humans

Title: Release of Pre-Trained Models for the Japanese Language

Title: MotionChain: Conversational Motion Controllers via Multimodal Prompts

Title: Upsample Guidance: Scale Up Diffusion Models without Training

Title: Generative AI for Immersive Communication: The Next Frontier in Internet-of-Senses Through 6G

Title: AddSR: Accelerating Diffusion-based Blind Super-Resolution with Adversarial Diffusion Distillation

Title: Self-Improvement Programming for Temporal Knowledge Graph Question Answering

Title: Asymptotics of Language Model Alignment

Title: T-VSL: Text-Guided Visual Sound Source Localization in Mixtures

Title: Generative AI-Based Text Generation Methods Using Pre-Trained GPT-2 Model

Title: Co-Speech Gesture Video Generation via Motion-Decoupled Diffusion Model

Title: Real, fake and synthetic faces - does the coin have three sides?

Title: Bi-LORA: A Vision-Language Approach for Synthetic Image Detection

Title: Team UTSA-NLP at SemEval 2024 Task 5: Prompt Ensembling for Argument Reasoning in Civil Procedures with GPT4

Title: Fashion Style Editing with Generative Human Prior

Title: DELAN: Dual-Level Alignment for Vision-and-Language Navigation by Cross-Modal Contrastive Learning

Title: Africa-Centric Self-Supervised Pre-Training for Multilingual Speech Representation in a Sub-Saharan Context

Title: AUTODIFF: Autoregressive Diffusion Modeling for Structure-based Drug Design

Title: SelfPose3d: Self-Supervised Multi-Person Multi-View 3d Pose Estimation

Title: Universal representations for financial transactional data: embracing local, global, and external contexts

Title: Deconstructing In-Context Learning: Understanding Prompts via Corruption

Title: Long-context LLMs Struggle with Long In-context Learning

Title: Red-Teaming Segment Anything Model

Title: WcDT: World-centric Diffusion Transformer for Traffic Scene Generation

Title: Adaptive Feature Fusion Neural Network for Glaucoma Segmentation on Unseen Fundus Images

Title: BRAVEn: Improving Self-Supervised Pre-training for Visual and Auditory Speech Recognition

Title: Exploring Automated Distractor Generation for Math Multiple-choice Questions via Large Language Models

Title: 3D Congealing: 3D-Aware Image Alignment in the Wild

Title: Diffusion$^2$: Dynamic 3D Content Generation via Score Composition of Orthogonal Diffusion Models

Title: Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks

Title: GeneAvatar: Generic Expression-Aware Volumetric Head Avatar Editing from a Single Image