2024-12-18

Title: SceneDiffuser: Efficient and Controllable Driving Simulation Initialization and Rollout

Title: Climate Aware Deep Neural Networks (CADNN) for Wind Power Simulation

Title: Towards LLM-based optimization compilers. Can LLMs learn how to apply a single peephole optimization? Reasoning is all LLMs need!

Title: Multimodal Approaches to Fair Image Classification: An Ethical Perspective

Title: Explore Theory of Mind: Program-guided adversarial data generation for theory of mind reasoning

Title: Adopting Explainable-AI to investigate the impact of urban morphology design on energy and environmental performance in dry-arid climates

Title: Multi-Surrogate-Teacher Assistance for Representation Alignment in Fingerprint-based Indoor Localization

Title: Provably Secure Robust Image Steganography via Cross-Modal Error Correction

Title: Can video generation replace cinematographers? Research on the cinematic language of generated video

Title: You Only Submit One Image to Find the Most Suitable Generative Model

Title: Deep Learning for Hydroelectric Optimization: Generating Long-Term River Discharge Scenarios with Ensemble Forecasts from Global Circulation Models

Title: OmniPrism: Learning Disentangled Visual Concept for Image Generation

Title: Towards a Universal Synthetic Video Detector: From Face or Background Manipulations to Fully AI-Generated Content

Title: RAG Playground: A Framework for Systematic Evaluation of Retrieval Strategies and Prompt Engineering in RAG Systems

Title: Visual Instruction Tuning with 500x Fewer Parameters through Modality Linear Representation-Steering

Title: Efficient Scaling of Diffusion Transformers for Text-to-Image Generation

Title: Causally Consistent Normalizing Flow

Title: Numerical Pruning for Efficient Autoregressive Models

Title: LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers

Title: Pattern Analogies: Learning to Perform Programmatic Image Edits by Analogy

Title: Track the Answer: Extending TextVQA from Image to Video with Spatio-Temporal Clues

Title: Invisible Watermarks: Attacks and Robustness

Title: Addressing Small and Imbalanced Medical Image Datasets Using Generative Models: A Comparative Study of DDPM and PGGANs with Random and Greedy K Sampling

Title: Stiefel Flow Matching for Moment-Constrained Structure Elucidation

Title: Consistent Diffusion: Denoising Diffusion Model with Data-Consistent Training for Image Restoration

Title: SAModified: A Foundation Model-Based Zero-Shot Approach for Refining Noisy Land-Use Land-Cover Maps

Title: ChatDiT: A Training-Free Baseline for Task-Agnostic Free-Form Chatting with Diffusion Transformers

Title: A Simple and Efficient Baseline for Zero-Shot Generative Classification

Title: OpenViewer: Openness-Aware Multi-View Learning

Title: RDPI: A Refine Diffusion Probability Generation Method for Spatiotemporal Data Imputation

Title: A Two-Fold Patch Selection Approach for Improved 360-Degree Image Quality Assessment

Title: ShiftedBronzes: Benchmarking and Analysis of Domain Fine-Grained Classification in Open-World Settings

Title: Uncertainty-Aware Hybrid Inference with On-Device Small and Remote Large Language Models

Title: PolSAM: Polarimetric Scattering Mechanism Informed Segment Anything Model

Title: Progressive Monitoring of Generative Model Training Evolution

Title: Guided and Variance-Corrected Fusion with One-shot Style Alignment for Large-Content Image Generation

Title: Rethinking Diffusion-Based Image Generators for Fundus Fluorescein Angiography Synthesis on Limited Data

Title: RA-SGG: Retrieval-Augmented Scene Graph Generation Framework via Multi-Prototype Learning

Title: Implicit Location-Caption Alignment via Complementary Masking for Weakly-Supervised Dense Video Captioning

Title: Dyn-HaMR: Recovering 4D Interacting Hand Motion from a Dynamic Camera

Title: ArtAug: Enhancing Text-to-Image Generation through Synthesis-Understanding Interaction

Title: An Agentic Approach to Automatic Creation of P&ID Diagrams from Natural Language Descriptions

Title: Unsupervised Region-Based Image Editing of Denoising Diffusion Models

Title: Graph Spring Neural ODEs for Link Sign Prediction

Title: CoMT: A Novel Benchmark for Chain of Multi-modal Thought on Large Vision-Language Models

Title: Synthetic Data Generation for Anomaly Detection on Table Grapes

Title: ArchesWeather & ArchesWeatherGen: a deterministic and generative model for efficient ML weather forecasting

Title: Attentive Eraser: Unleashing Diffusion Model's Object Removal Potential via Self-Attention Redirection Guidance

Title: Future Aspects in Human Action Recognition: Exploring Emerging Techniques and Ethical Influences

Title: A New Adversarial Perspective for LiDAR-based 3D Object Detection

Title: Modality-Inconsistent Continual Learning of Multimodal Large Language Models

Title: VidTok: A Versatile and Open-Source Video Tokenizer

Title: Prompt Augmentation for Self-supervised Text-guided Image Manipulation

Title: Motion-2-to-3: Leveraging 2D Motion Data to Boost 3D Motion Generation

Title: F-Bench: Rethinking Human Preference Evaluation Metrics for Benchmarking Face Generation, Customization, and Restoration

Title: Move-in-2D: 2D-Conditioned Human Motion Generation

Title: StreetCrafter: Street View Synthesis with Controllable Video Diffusion Models

Title: MotionBridge: Dynamic Video Inbetweening with Flexible Controls

Title: CoMPaSS: Enhancing Spatial Understanding in Text-to-Image Diffusion Models