2024-12-02

Title: Scene Co-pilot: Procedural Text to Video Generation with Human in the Loop

Title: Bi-ICE: An Inner Interpretable Framework for Image Classification via Bi-directional Interactions between Concept and Input Embeddings

Title: MADE: Graph Backdoor Defense with Masked Unlearning

Title: Dynamic Logistic Ensembles with Recursive Probability and Automatic Subset Splitting for Enhanced Binary Classification

Title: RoMo: Robust Motion Segmentation Improves Structure from Motion

Title: PRSI: Privacy-Preserving Recommendation Model Based on Vector Splitting and Interactive Protocols

Title: HDI-Former: Hybrid Dynamic Interaction ANN-SNN Transformer for Object Detection Using Frames and Events

Title: OOD-HOI: Text-Driven 3D Whole-Body Human-Object Interactions Generation Beyond Training Domains

Title: HoliSDiP: Image Super-Resolution via Holistic Semantics and Diffusion Prior

Title: Spatiotemporal Skip Guidance for Enhanced Video Diffusion Sampling

Title: SpotLight: Shadow-Guided Object Relighting via Diffusion

Title: Point Cloud Unsupervised Pre-training via 3D Gaussian Splatting

Title: Towards Chunk-Wise Generation for Long Videos

Title: SimCMF: A Simple Cross-modal Fine-tuning Strategy from Vision Foundation Models to Any Imaging Modality

Title: TAPTRv3: Spatial and Temporal Context Foster Robust Tracking of Any Point in Long Video

Title: FactCheXcker: Mitigating Measurement Hallucinations in Chest X-ray Report Generation Models

Title: AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers

Title: Active Data Curation Effectively Distills Large-Scale Multimodal Models

Title: GaussianSpeech: Audio-Driven Gaussian Avatars

Title: MatchDiffusion: Training-free Generation of Match-cuts

Title: Immune: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment

Title: An indicator for effectiveness of text-to-image guardrails utilizing the Single-Turn Crescendo Attack (STCA)

Title: On the Effectiveness of Incremental Training of Large Language Models

Title: Random Walks with Tweedie: A Unified Framework for Diffusion Models

Title: Exponential Moving Average of Weights in Deep Learning: Dynamics and Benefits

Title: Evaluating Vision-Language Models as Evaluators in Path Planning

Title: Generative Visual Communication in the Era of Vision-Language Models

Title: The Last Mile to Supervised Performance: Semi-Supervised Domain Adaptation for Semantic Segmentation

Title: DiffMVR: Diffusion-based Automated Multi-Guidance Video Restoration

Title: Inference Privacy: Properties and Mechanisms

Title: Locally Differentially Private Online Federated Learning With Correlated Noise

Title: Cyber-Attack Technique Classification Using Two-Stage Trained Large Language Models

Title: CoVis: A Collaborative Framework for Fine-grained Graphic Visual Understanding

Title: Fall Leaf Adversarial Attack on Traffic Sign Classification

Title: MRI Breast tissue segmentation using nnU-Net for biomechanical modeling

Title: UOE: Unlearning One Expert Is Enough For Mixture-of-experts LLMs

Title: Formal Verification of Digital Twins with TLA and Information Leakage Control

Title: Stratified Non-Negative Tensor Factorization

Title: Reconstructing Animals and the Wild

Title: Lifting Motion to the 3D World via 2D Diffusion

Title: Enhancing Compositional Text-to-Image Generation with Reliable Random Seeds

Title: FaithDiff: Unleashing Diffusion Priors for Faithful Image Super-resolution

Title: Measuring Risk of Bias in Biomedical Reports: The RoBBR Benchmark

Title: Sharing the Path: A Threshold Scheme from Isogenies and Error Correcting Codes

Title: An Integrated Artificial Intelligence Operating System for Advanced Low-Altitude Aviation Applications

Title: CrossTracker: Robust Multi-modal 3D Multi-Object Tracking via Cross Correction

Title: COMPrompter: reconceptualized segment anything model with multiprompt network for camouflaged object detection

Title: Improving Batch Normalization with TTA for Robust Object Detection in Self-Driving

Title: Swarm Intelligence-Driven Client Selection for Federated Learning in Cybersecurity applications

Title: Sneaking Syntax into Transformer Language Models with Tree Regularization

Title: T2SG: Traffic Topology Scene Graph for Topology Reasoning in Autonomous Driving

Title: Evaluating Sparse Autoencoders on Targeted Concept Erasure Tasks

Title: Textured As-Is BIM via GIS-informed Point Cloud Segmentation

Title: FedRGL: Robust Federated Graph Learning for Label Noise

Title: MATATA: a weak-supervised MAthematical Tool-Assisted reasoning for Tabular Applications

Title: Federated Continual Graph Learning

Title: Devising a Set of Compact and Explainable Spoken Language Feature for Screening Alzheimer's Disease

Title: EzSQL: An SQL intermediate representation for improving SQL-to-text Generation

Title: Data Augmentation with Diffusion Models for Colon Polyp Localization on the Low Data Regime: How much real data is enough?

Title: VIPaint: Image Inpainting with Pre-Trained Diffusion Models via Variational Inference

Title: Efficient Track Anything

Title: Self-Cross Diffusion Guidance for Text-to-Image Synthesis of Similar Subjects

Title: Rephrasing Electronic Health Records for Pretraining Clinical Language Models

Title: Waterfall Transformer for Multi-person Pose Estimation

Title: ICLERB: In-Context Learning Embedding and Reranker Benchmark

Title: Knowledge Database or Poison Base? Detecting RAG Poisoning Attack through LLM Activations

Title: Random Sampling for Diffusion-based Adversarial Purification

Title: Perception of Visual Content: Differences Between Humans and Foundation Models

Title: Det-SAM2: Technical Report on the Self-Prompting Segmentation Framework Based on Segment Anything Model 2

Title: Zero-shot Slot Filling in the Age of LLMs for Dialogue Systems

Title: SPAgent: Adaptive Task Decomposition and Model Selection for General Video Generation and Editing

Title: Harden Deep Neural Networks Against Fault Injections Through Weight Scaling

Title: MVFormer: Diversifying Feature Normalization and Token Mixing for Efficient Vision Transformers

Title: Presenting a new approach in security in inter-vehicle networks (VANET)

Title: Locally-Focused Face Representation for Sketch-to-Image Generation Using Noise-Induced Refinement

Title: Pilot Contamination Aware Transformer for Downlink Power Control in Cell-Free Massive MIMO Networks

Title: Enhancing Neural Network Robustness Against Fault Injection Through Non-linear Weight Transformations

Title: PCDreamer: Point Cloud Completion Through Multi-view Diffusion Priors

Title: 3D-WAG: Hierarchical Wavelet-Guided Autoregressive Generation for High-Fidelity 3D Shapes

Title: DIESEL -- Dynamic Inference-Guidance via Evasion of Semantic Embeddings in LLMs

Title: I Dream My Painting: Connecting MLLMs and Diffusion Models via Prompt Generation for Text-Guided Multi-Mask Inpainting

Title: Way to Specialist: Closing Loop Between Specialized LLM and Evolving Domain Knowledge Graph

Title: MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation

Title: LADDER: Multi-objective Backdoor Attack via Evolutionary Algorithm

Title: 360Recon: An Accurate Reconstruction Method Based on Depth Fusion from 360 Images

Title: Detailed Object Description with Controllable Dimensions

Title: Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model

Title: Integration of Contextual Descriptors in Ontology Alignment for Enrichment of Semantic Correspondence

Title: Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models

Title: MSG score: A Comprehensive Evaluation for Multi-Scene Video Generation

Title: A Comparative Analysis of Vulnerability Management Tools: Evaluating Nessus, Acunetix, and Nikto for Risk Based Security Solutions

Title: Personalized Federated Fine-Tuning for LLMs via Data-Driven Heterogeneous Model Architectures

Title: TEA: Trajectory Encoding Augmentation for Robust and Transferable Policies in Offline Reinforcement Learning

Title: On Moving Object Segmentation from Monocular Video with Transformers

Title: Puzzle: Distillation-Based NAS for Inference-Optimized LLMs

Title: LoRA of Change: Learning to Generate LoRA for the Editing Instruction from A Single Before-After Image Pair

Title: A Game-Theoretic Approach to the Study of Blockchain's Robustness

Title: SOWing Information: Cultivating Contextual Coherence with MLLMs in Image Generation

Title: Beyond Logit Lens: Contextual Embeddings for Robust Hallucination Detection & Grounding in VLMs

Title: Video Depth without Video Models

Title: Convex Regularization and Convergence of Policy Gradient Flows under Safety Constraints

Title: An Extensive Evaluation of Factual Consistency in Large Language Models for Data-to-Text Generation

Title: Track Anything Behind Everything: Zero-Shot Amodal Video Object Segmentation

Title: Cross-Spectral Attention for Unsupervised RGB-IR Face Verification and Person Re-identification

Title: Automatic Prompt Generation and Grounding Object Detection for Zero-Shot Image Anomaly Detection

Title: Pre-Training Graph Contrastive Masked Autoencoders are Strong Distillers for EEG

Title: Z-STAR+: A Zero-shot Style Transfer Method via Adjusting Style Distribution

Title: Gaussians-to-Life: Text-Driven Animation of 3D Gaussian Splatting Scenes

Title: SmartLLMSentry: A Comprehensive LLM Based Smart Contract Vulnerability Detection Framework

Title: InstanceGaussian: Appearance-Semantic Joint Gaussian Representation for 3D Instance-Level Perception

Title: Controlling Participation in Federated Learning with Feedback

Title: Face2QR: A Unified Framework for Aesthetic, Face-Preserving, and Scannable QR Code Generation

Title: Improving Multi-Subject Consistency in Open-Domain Image Generation with Isolation and Reposition Attention

Title: AGS-Mesh: Adaptive Gaussian Splatting and Meshing with Geometric Priors for Indoor Room Reconstruction Using Smartphones

Title: On-chip Hyperspectral Image Segmentation with Fully Convolutional Networks for Scene Understanding in Autonomous Driving

Title: OMNI-DC: Highly Robust Depth Completion with Multiresolution Depth Integration

Title: GMS-VINS: Multi-category Dynamic Objects Semantic Segmentation for Enhanced Visual-Inertial Odometry Using a Promptable Foundation Model

Title: SADG: Segment Any Dynamic Gaussian Without Object Trackers

Title: Extracting Information in a Low-resource Setting: Case Study on Bioinformatics Workflows

Title: Enhancing Parameter-Efficient Fine-Tuning of Vision Transformers through Frequency-Based Adaptation

Title: SAMa: Material-aware 3D Selection and Segmentation

Title: Trajectory Attention for Fine-grained Video Motion Control

Title: Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation

Title: PEFT-as-an-Attack! Jailbreaking Language Models during Federated Parameter-Efficient Fine-Tuning

Title: Towards a Mechanistic Explanation of Diffusion Model Generalization

Title: CLIP meets DINO for Tuning Zero-Shot Classifier using Unlabeled Image Collections

Title: Characterizing JavaScript Security Code Smells

Title: Libra: Leveraging Temporal Images for Biomedical Radiology Analysis

Title: Enhancing Sketch Animation: Text-to-Video Diffusion Models with Temporal Consistency and Rigidity Constraints

Title: DreamBlend: Advancing Personalized Fine-tuning of Text-to-Image Diffusion Models

Title: On the effectiveness of discrete representations in sparse mixture of experts

Title: AMO Sampler: Enhancing Text Rendering with Overshooting

Title: Any-Resolution AI-Generated Image Detection by Spectral Learning

Title: Gradient Inversion Attack on Graph Neural Networks

Title: Auto-RAG: Autonomous Retrieval-Augmented Generation for Large Language Models

Title: Adaptive Interactive Segmentation for Multimodal Medical Imaging via Selection Engine

Title: Learning Visual Abstract Reasoning through Dual-Stream Networks

Title: Beyond Surface Structure: A Causal Assessment of LLMs' Comprehension Ability

Title: Multi-task CNN Behavioral Embedding Model For Transaction Fraud Detection

Title: Fleximo: Towards Flexible Text-to-Human Motion Video Generation

Title: Look Every Frame All at Once: Video-Ma$^2$mba for Efficient Long-form Video Understanding with Multi-Axis Gradient Checkpointing

Title: Robust Bayesian Scene Reconstruction by Leveraging Retrieval-Augmented Priors

Title: ForgerySleuth: Empowering Multimodal Large Language Models for Image Manipulation Detection

Title: Random Feature Models with Learnable Activation Functions

Title: A Simple and Provable Scaling Law for the Test-Time Compute of Large Language Models

Title: FLARE: Towards Universal Dataset Purification against Backdoor Attacks

Title: V2SFlow: Video-to-Speech Generation with Speech Decomposition and Rectified Flow

Title: Interleaved-Modal Chain-of-Thought

Title: Diorama: Unleashing Zero-shot Single-view 3D Scene Modeling

Title: COLD: Causal reasOning in cLosed Daily activities

Title: Knowledge-Data Fusion Based Source-Free Semi-Supervised Domain Adaptation for Seizure Subtype Classification

Title: Graph-Enhanced EEG Foundation Model

Title: Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis

Title: Retrieval-guided Cross-view Image Synthesis

Title: RL-MILP Solver: A Reinforcement Learning Approach for Solving Mixed-Integer Linear Programs with Graph Neural Networks

Title: DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding

Title: RAGDiffusion: Faithful Cloth Generation via External Knowledge Assimilation

Title: Quantized Delta Weight Is Safety Keeper

Title: QUOTA: Quantifying Objects with Text-to-Image Models for Any Domain

Title: Deepfake Media Generation and Detection in the Generative AI Era: A Survey and Outlook

Title: SkelMamba: A State Space Model for Efficient Skeleton Action Recognition of Neurological Disorders

Title: Training Agents with Weakly Supervised Feedback from Large Language Models

Title: Bootstrapping Clustering of Gaussians for View-consistent 3D Scene Understanding

Title: Initialization using Update Approximation is a Silver Bullet for Extremely Efficient Low-Rank Fine-Tuning

Title: Ensemble Watermarks for Large Language Models

Title: KV Shifting Attention Enhances Language Modeling

Title: In-Context Learning with Noisy Labels

Title: Enhancing Sentiment Analysis in Bengali Texts: A Hybrid Approach Using Lexicon-Based Algorithm and Pretrained Language Model Bangla-BERT

Title: LDA-AQU: Adaptive Query-guided Upsampling via Local Deformable Attention

Title: Gaussian Splashing: Direct Volumetric Rendering Underwater

Title: Can Large Language Models Reason about the Region Connection Calculus?

Title: Tortho-Gaussian: Splatting True Digital Orthophoto Maps

Title: FairDD: Fair Dataset Distillation via Synchronized Matching

Title: Accelerating Multimodal Large Language Models via Dynamic Visual-Token Exit and the Empirical Findings

Title: LLM Teacher-Student Framework for Text Classification With No Manually Annotated Data: A Case Study in IPTC News Topic Classification

Title: Learned Random Label Predictions as a Neural Network Complexity Metric

Title: CAdam: Confidence-Based Optimization for Online Learning

Title: Uniform Attention Maps: Boosting Image Fidelity in Reconstruction and Editing

Title: TexGaussian: Generating High-quality PBR Material via Octree-based 3D Gaussian Splatting

Title: Truth or Mirage? Towards End-to-End Factuality Evaluation with LLM-OASIS

Title: ChineseWebText 2.0: Large-Scale High-quality Chinese Web Text with Multi-dimensional and fine-grained information

Title: Privacy-Preserving Orthogonal Aggregation for Guaranteeing Gender Fairness in Federated Recommendation

Title: SURE-VQA: Systematic Understanding of Robustness Evaluation in Medical VQA Tasks

Title: MIMDE: Exploring the Use of Synthetic vs Human Data for Evaluating Multi-Insight Multi-Document Extraction Tasks

Title: Explaining the Impact of Training on Vision Models via Activation Clustering

Title: JetFormer: An Autoregressive Generative Model of Raw Images and Text

Title: Towards Santali Linguistic Inclusion: Building the First Santali-to-English Translation Model using mT5 Transformer and Data Augmentation

Title: Risk-Averse Certification of Bayesian Neural Networks

Title: Real-Time Anomaly Detection in Video Streams

Title: A Note on Small Percolating Sets on Hypercubes via Generative AI

Title: Graph Neural Networks for Heart Failure Prediction on an EHR-Based Patient Similarity Graph

Title: HVAC-DPT: A Decision Pretrained Transformer for HVAC Control

Title: A Multi-Loss Strategy for Vehicle Trajectory Prediction: Combining Off-Road, Diversity, and Directional Consistency Losses

Title: A Comprehensive Content Verification System for ensuring Digital Integrity in the Age of Deep Fakes

Title: Dual Risk Minimization: Towards Next-Level Robustness in Fine-tuning Zero-Shot Models

Title: Evidence-Based Threat Modeling for ICS

Title: Riemannian Denoising Score Matching for Molecular Structure Optimization with Accurate Energy

Title: LongVALE: Vision-Audio-Language-Event Benchmark Towards Time-Aware Omni-Modal Perception of Long Videos

Title: PerLA: Perceptive 3D Language Assistant

Title: MoTe: Learning Motion-Text Diffusion Model for Multiple Generation Tasks

Title: Rethinking the initialization of Momentum in Federated Learning with Heterogeneous Data

Title: INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge

Title: SDR-GNN: Spectral Domain Reconstruction Graph Neural Network for Incomplete Multimodal Learning in Conversational Emotion Recognition

Title: Sensitive Content Classification in Social Media: A Holistic Resource and Evaluation

Title: Towards Class-wise Robustness Analysis

Title: What fifty-one years of Linguistics and Artificial Intelligence research tell us about their correlation: A scientometric review

Title: SpaRC: Sparse Radar-Camera Fusion for 3D Object Detection

Title: Reverse Thinking Makes LLMs Stronger Reasoners

Title: AIDetx: a compression-based method for identification of machine-learning generated text

Title: LUMIA: Linear probing for Unimodal and MultiModal Membership Inference Attacks leveraging internal LLM states

Title: Open source Differentiable ODE Solving Infrastructure

Title: FlowCLAS: Enhancing Normalizing Flow Via Contrastive Learning For Anomaly Segmentation

Title: GuardSplat: Robust and Efficient Watermarking for 3D Gaussian Splatting

Title: $C^{3}$-NeRF: Modeling Multiple Scenes via Conditional-cum-Continual Neural Radiance Fields

Title: Quantifying the synthetic and real domain gap in aerial scene understanding

Title: SIMS: Simulating Human-Scene Interactions with Real World Script Planning

Title: Scalable Out-of-distribution Robustness in the Presence of Unobserved Confounders

Title: On Domain-Specific Post-Training for Multimodal Large Language Models

Title: VLSBench: Unveiling Visual Leakage in Multimodal Safety

Title: Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability

Title: T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs