2024-08-19

Title: Segment Anything for Videos: A Systematic Survey

Title: TurboEdit: Instant text-based image editing

Title: METR: Image Watermarking with Large Number of Unique Messages

Title: Penny-Wise and Pound-Foolish in Deepfake Detection

Title: SpectralEarth: Training Hyperspectral Foundation Models at Scale

Title: Achieving Complex Image Edits via Function Aggregation with Diffusion Models

Title: Efficient Image-to-Image Diffusion Classifier for Adversarial Robustness

Title: Visual-Friendly Concept Protection via Selective Adversarial Perturbations

Title: GS-ID: Illumination Decomposition on Gaussian Splatting via Diffusion Prior and Parametric Light Source Optimization

Title: Inverse design with conditional cascaded diffusion models

Title: Integrating Multi-view Analysis: Multi-view Mixture-of-Expert for Textual Personality Detection

Title: A New Chinese Landscape Paintings Generation Model based on Stable Diffusion using DreamBooth

Title: RadioDiff: An Effective Generative Diffusion Model for Sampling-Free Dynamic Radio Map Construction

Title: Generative Dataset Distillation Based on Diffusion Model

Title: An End-to-End Model for Photo-Sharing Multi-modal Dialogue Generation

Title: Adaptive Layer Selection for Efficient Vision Transformer Fine-Tuning

Title: A Novel Buffered Federated Learning Framework for Privacy-Driven Anomaly Detection in IIoT

Title: Comparative Analysis of Generative Models: Enhancing Image Synthesis with VAEs, GANs, and Stable Diffusion

Title: PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders

Title: Large Language Models Might Not Care What You Are Saying: Prompt Format Beats Descriptions

Title: Representation Learning of Geometric Trees

Title: Retrieval-augmented Few-shot Medical Image Segmentation with Foundation Models

Title: PFDiff: Training-free Acceleration of Diffusion Models through the Gradient Guidance of Past and Future

Title: SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation

Title: xGen-MM (BLIP-3): A Family of Open Large Multimodal Models