2024-12-03

Title: A Supercomputing Based Distributed Cloud Marketplace

Title: TransFair: Transferring Fairness from Ocular Disease Classification to Progression Prediction

Title: LeMoLE: LLM-Enhanced Mixture of Linear Experts for Time Series Forecasting

Title: MOSABench: Multi-Object Sentiment Analysis Benchmark for Evaluating Multimodal Large Language Models Understanding of Complex Image

Title: Speculative Decoding with CTC-based Draft Model for LLM Inference Acceleration

Title: Deep Learning-Based Electricity Price Forecast for Virtual Bidding in Wholesale Electricity Market

Title: DiffGuard: Text-Based Safety Checker for Diffusion Models

Title: Targeted Therapy in Data Removal: Object Unlearning Based on Scene Graphs

Title: Condense, Don't Just Prune: Enhancing Efficiency and Performance in MoE Layer Pruning

Title: Addressing Vulnerabilities in AI-Image Detection: Challenges and Proposed Solutions

Title: Safe to Serve: Aligning Instruction-Tuned Models for Safety and Helpfulness

Title: Dual Prototyping with Domain and Class Prototypes for Affective Brain-Computer Interface in Unseen Target Conditions

Title: Visual Error Patterns in Multi-Modal AI: A Statistical Approach

Title: Unpacking the Individual Components of Diffusion Policy

Title: Residual Attention Single-Head Vision Transformer Network for Rolling Bearing Fault Diagnosis in Noisy Environments

Title: Energy-Efficient Split Learning for Fine-Tuning Large Language Models in Edge Networks

Title: A Novel Approach to Image Steganography Using Generative Adversarial Networks

Title: Fine-Tuning Large Language Models for Scientific Text Classification: A Comparative Study

Title: Steering Rectified Flow Models in the Vector Field for Controlled Image Generation

Title: Multi-Label Contrastive Learning: A Comprehensive Study

Title: ElectroVizQA: How well do Multi-modal LLMs perform in Electronics Visual Question Answering?

Title: Differential learning kinetics govern the transition from memorization to generalization during in-context learning

Title: Predicting Extubation Failure in Intensive Care: The Development of a Novel, End-to-End Actionable and Interpretable Prediction System

Title: Demographic Predictability in 3D CT Foundation Embeddings

Title: SceneTAP: Scene-Coherent Typographic Adversarial Planner against Vision-Language Models in Real-World Environments

Title: OpenHumanVid: A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation

Title: Bridging the Gap: Aligning Text-to-Image Diffusion Models with Specific Feedback

Title: Streamlined Federated Unlearning: Unite as One to Be Highly Efficient

Title: Orthus: Autoregressive Interleaved Image-Text Generation with Modality-Specific Heads

Title: Scaling Particle Collision Data Analysis

Title: Event-based Tracking of Any Point with Motion-Robust Correlation Features

Title: FonTS: Text Rendering with Typography and Style Controls

Title: EFSA: Episodic Few-Shot Adaptation for Text-to-Image Retrieval

Title: Differentiable Topology Estimating from Curvatures for 3D Shapes

Title: Sparse Attention Vectors: Generative Multimodal Model Features Are Discriminative Vision-Language Classifiers

Title: MPQ-Diff: Mixed Precision Quantization for Diffusion Models

Title: Knowledge-Augmented Explainable and Interpretable Learning for Anomaly Detection and Diagnosis

Title: Curriculum Fine-tuning of Vision Foundation Model for Medical Image Classification Under Label Noise

Title: DLaVA: Document Language and Vision Assistant for Answer Localization with Enhanced Interpretability and Trustworthiness

Title: ROSE: Revolutionizing Open-Set Dense Segmentation with Patch-Wise Perceptual Large Multimodal Model

Title: T-3DGS: Removing Transient Objects for 3D Scene Reconstruction

Title: VISION-XL: High Definition Video Inverse Problem Solver using Latent Image Diffusion Models

Title: AerialGo: Walking-through City View Generation from Aerial Perspectives

Title: STEP: Enhancing Video-LLMs' Compositional Reasoning by Spatio-Temporal Graph-guided Self-Training

Title: To Ensemble or Not: Assessing Majority Voting Strategies for Phishing Detection with Large Language Models

Title: Origin-Destination Demand Prediction: An Urban Radiation and Attraction Perspective

Title: Spatial Clustering of Molecular Localizations with Graph Neural Networks

Title: Circumventing shortcuts in audio-visual deepfake detection datasets with unsupervised learning

Title: Art-Free Generative Models: Art Creation Without Graphic Art Knowledge

Title: LumiNet: Latent Intrinsics Meets Diffusion Models for Indoor Scene Relighting

Title: Diffusion Model Guided Sampling with Pixel-Wise Aleatoric Uncertainty Estimation

Title: Train Once for All: A Transitional Approach for Efficient Aspect Sentiment Triplet Extraction

Title: NüshuRescue: Revitalization of the endangered Nüshu Language with AI

Title: MATTER: Multi-stage Adaptive Thermal Trojan for Efficiency & Resilience degradation

Title: Clinical Document Corpora and Assorted Domain Proxies: A Survey of Diversity in Corpus Design, with Focus on German Text Data

Title: Hybrid Spiking Neural Network -- Transformer Video Classification Model

Title: Twisted Convolutional Networks (TCNs): Enhancing Feature Interactions for Non-Spatial Data Classification

Title: Robust Testing for Deep Learning using Human Label Noise

Title: Excretion Detection in Pigsties Using Convolutional and Transformer-based Deep Neural Networks

Title: Facial Expression Recognition with Controlled Privacy Preservation and Feature Compensation

Title: SS Linear Fusion Model: Hyperspectral Imaging Efficient Spatial and Spectral Linear Model with Bidirectional Feature Learning

Title: HSLiNets: Hyperspectral Image and LiDAR Data Fusion Using Efficient Dual Linear Feature Learning Networks

Title: Refine-by-Align: Reference-Guided Artifacts Refinement through Semantic Alignment

Title: Towards Pixel-Level Prediction for Gaze Following: Benchmark and Approach

Title: HiMoE: Heterogeneity-Informed Mixture-of-Experts for Fair Spatial-Temporal Forecasting

Title: Cognitive Biases in Large Language Models: A Survey and Mitigation Experiments

Title: EFTViT: Efficient Federated Training of Vision Transformers with Masked Images on Resource-Constrained Edge Devices

Title: Fusing Physics-Driven Strategies and Cross-Modal Adversarial Learning: Toward Multi-Domain Applications

Title: Enhancing Zero-shot Chain of Thought Prompting via Uncertainty-Guided Strategy Selection

Title: Does Self-Attention Need Separate Weights in Transformers?

Title: LMSeg: Unleashing the Power of Large-Scale Models for Open-Vocabulary Semantic Segmentation

Title: Approximate Fiber Product: A Preliminary Algebraic-Geometric Perspective on Multimodal Embedding Alignment

Title: DogLayout: Denoising Diffusion GAN for Discrete and Continuous Layout Generation

Title: Toward Fair Graph Neural Networks Via Dual-Teacher Knowledge Distillation

Title: A generalization of Burmester-Desmedt GKE based on a non-abelian finite group action

Title: GradiSeg: Gradient-Guided Gaussian Segmentation with Enhanced 3D Boundary Precision

Title: On Foundation Models for Dynamical Systems from Purely Synthetic Data

Title: DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses

Title: Hard-Label Black-Box Attacks on 3D Point Clouds

Title: QuAKE: Speeding up Model Inference Using Quick and Approximate Kernels for Exponential Non-Linearities

Title: ACTISM: Threat-informed Dynamic Security Modelling for Automotive Systems

Title: TAROT: Targeted Data Selection via Optimal Transport

Title: FreeCond: Free Lunch in the Input Conditions of Text-Guided Inpainting

Title: Dynamic Token Selection for Aerial-Ground Person Re-Identification

Title: Advancing Myopia To Holism: Fully Contrastive Language-Image Pre-training

Title: Two Models for Surface Segmentation using the Total Variation of the Normal Vector

Title: ATP-LLaVA: Adaptive Token Pruning for Large Vision Language Models

Title: A conditional Generative Adversarial network model for the Weather4Cast 2024 Challenge

Title: Learning Locally, Revising Globally: Global Reviser for Federated Learning with Noisy Labels

Title: Non-native speakers of English or ChatGPT: Who thinks better?

Title: BGM: Background Mixup for X-ray Prohibited Items Detection

Title: AgriBench: A Hierarchical Agriculture Benchmark for Multimodal Large Language Models

Title: Enhancing Skin Cancer Diagnosis (SCD) Using Late Discrete Wavelet Transform (DWT) and New Swarm-Based Optimizers

Title: Jailbreak Large Visual Language Models Through Multi-Modal Linkage

Title: Automatic Differentiation-based Full Waveform Inversion with Flexible Workflows

Title: Density-aware Global-Local Attention Network for Point Cloud Segmentation

Title: Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding

Title: Distributed Differentially Private Data Analytics via Secure Sketching

Title: Homeostasis and Sparsity in Transformer

Title: Good, Cheap, and Fast: Overfitted Image Compression with Wasserstein Distortion

Title: Graph-to-SFILES: Control structure prediction from process topologies using generative artificial intelligence

Title: Instant3dit: Multiview Inpainting for Fast Editing of 3D Objects

Title: Human Action CLIPS: Detecting AI-generated Human Motion

Title: Exact Certification of (Graph) Neural Networks Against Label Poisoning

Title: TextClass Benchmark: A Continuous Elo Rating of LLMs in Social Sciences

Title: Evaluating the Consistency of LLM Evaluators

Title: RoBo6: Standardized MMT Light Curve Dataset for Rocket Body Classification

Title: Rank It, Then Ask It: Input Reranking for Maximizing the Performance of LLMs on Symmetric Tasks

Title: Motion Dreamer: Realizing Physically Coherent Video Generation through Scene-Aware Motion Reasoning

Title: SeQwen at the Financial Misinformation Detection Challenge Task: Sequential Learning for Claim Verification and Explanation Generation in Financial Domains

Title: Unveiling Performance Challenges of Large Language Models in Low-Resource Healthcare: A Demographic Fairness Perspective

Title: Accelerating Multimodal Large Language Models by Searching Optimal Vision Token Reduction

Title: Blind Inverse Problem Solving Made Easy by Text-to-Image Latent Diffusion

Title: Polish Medical Exams: A new dataset for cross-lingual medical knowledge transfer assessment

Title: Friend or Foe? Harnessing Controllable Overfitting for Anomaly Detection

Title: Continuous Concepts Removal in Text-to-image Diffusion Models

Title: Evaluating Large Language Models' Capability to Launch Fully Automated Spear Phishing Campaigns: Validated on Human Subjects

Title: Generative LiDAR Editing with Controllable Novel Object Layouts

Title: PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation

Title: TraCS: Trajectory Collection in Continuous Space under Local Differential Privacy

Title: Exposing LLM Vulnerabilities: Adversarial Scam Detection and Performance

Title: Visual Modality Prompt for Adapting Vision-Language Object Detectors

Title: A Lesson in Splats: Teacher-Guided Diffusion for 3D Gaussian Splats Generation with 2D Supervision

Title: VideoSAVi: Self-Aligned Video Language Models without Human Supervision

Title: ROSE: A Reward-Oriented Data Selection Framework for LLM Task-Specific Instruction Tuning

Title: Sketch-Guided Motion Diffusion for Stylized Cinemagraph Synthesis

Title: DFRot: Achieving Outlier-Free and Massive Activation-Free for Rotated LLMs with Refined Rotation

Title: Towards Unified Molecule-Enhanced Pathology Image Representation Learning via Integrating Spatial Transcriptomics

Title: Multi-Agent Collaboration in Incident Response with Large Language Models

Title: Improving Decoupled Posterior Sampling for Inverse Problems using Data Consistency Constraint

Title: Learning on Less: Constraining Pre-trained Model Learning for Generalizable Diffusion-Generated Image Detection

Title: FiffDepth: Feed-forward Transformation of Diffusion-Based Generators for Detailed Depth Estimation

Title: ChainGuard: A Blockchain-based Authentication and Access Control Scheme for Distributed Networks

Title: 2DMamba: Efficient State Space Model for Image Representation with Applications on Giga-Pixel Whole Slide Image Classification

Title: SEAM: A Secure Automated and Maintainable Smart Contract Upgrade Framework

Title: MIMIC: Multimodal Islamophobic Meme Identification and Classification

Title: FlashSLAM: Accelerated RGB-D SLAM for Real-Time 3D Scene Reconstruction with Gaussian Splatting

Title: DMFourLLIE: Dual-Stage and Multi-Branch Fourier Network for Low-Light Image Enhancement

Title: Paint Outside the Box: Synthesizing and Selecting Training Data for Visual Grounding

Title: LVLM-COUNT: Enhancing the Counting Ability of Large Vision-Language Models

Title: Towards Privacy-Preserving Medical Imaging: Federated Learning with Differential Privacy and Secure Aggregation Using a Modified ResNet Architecture

Title: Collaborative Proof-of-Work: A Secure Dynamic Approach to Fair and Efficient Blockchain Mining

Title: Intermediate Outputs Are More Sensitive Than You Think

Title: The Forking Way: When TEEs Meet Consensus

Title: Protect Your Secrets: Understanding and Measuring Data Exposure in VSCode Extensions

Title: Synergizing Motion and Appearance: Multi-Scale Compensatory Codebooks for Talking Head Video Generation

Title: Bridging Fairness Gaps: A (Conditional) Distance Covariance Perspective in Fairness Learning

Title: Decision Transformer vs. Decision Mamba: Analysing the Complexity of Sequential Decision Making in Atari Games

Title: Perturb and Recover: Fine-tuning for Effective Backdoor Removal from CLIP

Title: Refine3DNet: Scaling Precision in 3D Object Reconstruction from Multi-View RGB Images using Attention

Title: Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Diffusion Transformer Networks

Title: ChatSplat: 3D Conversational Gaussian Splatting

Title: Precise Facial Landmark Detection by Dynamic Semantic Aggregation Transformer

Title: CtrlNeRF: The Generative Neural Radiation Fields for the Controllable Synthesis of High-fidelity 3D-Aware Images

Title: DyMO: Training-Free Diffusion Model Alignment with Dynamic Multi-Objective Scheduling

Title: Learning to Forget using Hypernetworks

Title: PGSO: Prompt-based Generative Sequence Optimization Network for Aspect-based Sentiment Analysis

Title: SelfPrompt: Autonomously Evaluating LLM Robustness via Domain-Constrained Knowledge Guidelines and Refined Adversarial Prompts

Title: Prompt as Free Lunch: Enhancing Diversity in Source-Free Cross-domain Few-shot Learning through Semantic-Guided Prompting

Title: A Wave is Worth 100 Words: Investigating Cross-Domain Transferability in Time Series

Title: DIVD: Deblurring with Improved Video Diffusion Model

Title: Post-Vaccination COVID-19 Data Analysis: Privacy and Ethics

Title: Learning Mamba as a Continual Learner

Title: Local vs. Global: Local Land-Use and Land-Cover Models Deliver Higher Quality Maps

Title: Memories of Forgotten Concepts

Title: EDTformer: An Efficient Decoder Transformer for Visual Place Recognition

Title: Online Poisoning Attack Against Reinforcement Learning under Black-box Environments

Title: A Comprehensive Guide to Explainable AI: From Classical Models to LLMs

Title: Generative Model for Synthesizing Ionizable Lipids: A Monte Carlo Tree Search Approach

Title: Categorical Keypoint Positional Embedding for Robust Animal Re-Identification

Title: EventGPT: Event Stream Understanding with Multimodal Large Language Models

Title: AlignMamba: Enhancing Multimodal Mamba with Local and Global Cross-modal Alignment

Title: Particle-based 6D Object Pose Estimation from Point Clouds using Diffusion Models

Title: AniMer: Animal Pose and Shape Estimation Using Family Aware Transformer

Title: Advanced Video Inpainting Using Optical Flow-Guided Efficient Diffusion

Title: Deep evolving semi-supervised anomaly detection

Title: Thermal Vision: Pioneering Non-Invasive Temperature Tracking in Congested Spaces

Title: Quantifying perturbation impacts for large language models

Title: Exploring the Abilities of Large Language Models to Solve Proportional Analogies via Knowledge-Enhanced Prompting

Title: Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Context Sparsification

Title: Beyond Pixels: Text Enhances Generalization in Real-World Image Restoration

Title: SyncVIS: Synchronized Video Instance Segmentation

Title: Leveraging Intermediate Neural Collapse with Simplex ETFs for Efficient Deep Neural Networks

Title: Exploring Large Vision-Language Models for Robust and Efficient Industrial Anomaly Detection

Title: Symbolic Quantitative Information Flow for Probabilistic Programs

Title: SOUL: A Semi-supervised Open-world continUal Learning method for Network Intrusion Detection

Title: A Deep Generative Model for the Design of Synthesizable Ionizable Lipids

Title: Calibration through the Lens of Interpretability

Title: Bilinear Convolution Decomposition for Causal RL Interpretability

Title: Uhura: A Benchmark for Evaluating Scientific Question Answering and Truthfulness in Low-Resource African Languages

Title: STEVE-Audio: Expanding the Goal Conditioning Modalities of Embodied Agents in Minecraft

Title: ESCAPE: Equivariant Shape Completion via Anchor Point Encoding

Title: WAFFLE: Multimodal Floorplan Understanding in the Wild

Title: Token Cropr: Faster ViTs for Quite a Few Tasks

Title: Optimal Algorithms for Augmented Testing of Discrete Distributions

Title: Hierarchical Prompt Decision Transformer: Improving Few-Shot Policy Generalization with Global and Adaptive

Title: Incentivizing Truthful Collaboration in Heterogeneous Federated Learning

Title: TGTOD: A Global Temporal Graph Transformer for Outlier Detection at Scale

Title: Seldom: An Anonymity Network with Selective Deanonymization

Title: DSSRNN: Decomposition-Enhanced State-Space Recurrent Neural Network for Time-Series Analysis

Title: Competition Dynamics Shape Algorithmic Phases of In-Context Learning

Title: e-Fold Cross-Validation for Recommender-System Evaluation

Title: Detecting Memorization in Large Language Models

Title: Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation

Title: Evaluating Automated Radiology Report Quality through Fine-Grained Phrasal Grounding of Clinical Findings

Title: SAUP: Situation Awareness Uncertainty Propagation on LLM Agent

Title: TruncFormer: Private LLM Inference Using Only Truncations

Title: Classifying Simulated Gait Impairments using Privacy-preserving Explainable Artificial Intelligence and Mobile Phone Videos

Title: Blindfold: Confidential Memory Management by Untrusted Operating System

Title: Research on Optimizing Real-Time Data Processing in High-Frequency Trading Algorithms using Machine Learning

Title: FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait

Title: Lookahead Counterfactual Fairness

Title: TRUST: A Toolkit for TEE-Assisted Secure Outsourced Computation over Integers

Title: Multi-Agent Deep Reinforcement Learning for Distributed and Autonomous Platoon Coordination via Speed-regulation over Large-scale Transportation Networks

Title: Advancing Speech Language Models by Scaling Supervised Fine-Tuning with Over 60,000 Hours of Synthetic Speech Dialogue Data

Title: Federated Motor Imagery Classification for Privacy-Preserving Brain-Computer Interfaces

Title: STATIC: Surface Temporal Affine for TIme Consistency in Video Monocular Depth Estimation

Title: DuoCast: Duo-Probabilistic Meteorology-Aware Model for Extended Precipitation Nowcasting

Title: Automated Extraction of Acronym-Expansion Pairs from Scientific Papers

Title: Hiding Faces in Plain Sight: Defending DeepFakes by Disrupting Face Detection

Title: One Shot, One Talk: Whole-body Talking Avatar from a Single Image

Title: DIR: Retrieval-Augmented Image Captioning with Comprehensive Understanding

Title: Look Ma, No Ground Truth! Ground-Truth-Free Tuning of Structure from Motion and Visual SLAM

Title: LoyalDiffusion: A Diffusion Model Guarding Against Data Replication

Title: Object Tracking in a 360° View: A Novel Perspective on Bridging the Gap to Biomedical Advancements

Title: RILQ: Rank-Insensitive LoRA-based Quantization Error Compensation for Boosting 2-bit Large Language Model Accuracy

Title: Enhancing Function-Calling Capabilities in LLMs: Strategies for Prompt Formats, Data Integration, and Multilingual Translation

Title: A Comprehensive Evaluation of Semantic Relation Knowledge of Pretrained Language Models and Humans

Title: Referring Video Object Segmentation via Language-aligned Track Selection

Title: TextSSR: Diffusion-based Data Synthesis for Scene Text Recognition

Title: A2VIS: Amodal-Aware Approach to Video Instance Segmentation

Title: R.I.P.: A Simple Black-box Attack on Continual Test-time Adaptation

Title: Graph Community Augmentation with GMM-based Modeling in Latent Space

Title: HumekaFL: Automated Detection of Neonatal Asphyxia Using Federated Learning

Title: Rectified Flow For Structure Based Drug Design

Title: Dual-Branch Graph Transformer Network for 3D Human Mesh Reconstruction from Video

Title: MeasureNet: Measurement Based Celiac Disease Identification

Title: SailCompass: Towards Reproducible and Robust Evaluation for Southeast Asian Languages

Title: MiningGPT -- A Domain-Specific Large Language Model for the Mining Industry

Title: TinyFusion: Diffusion Transformers Learned Shallow

Title: Domain Adaptive Diabetic Retinopathy Grading with Model Absence and Flowing Data

Title: Siamese Machine Unlearning with Knowledge Vaporization and Concentration

Title: PainterNet: Adaptive Image Inpainting with Actual-Token Attention and Diverse Mask Control

Title: Inspiring the Next Generation of Segment Anything Models: Comprehensively Evaluate SAM and SAM 2 with Diverse Prompts Towards Context-Dependent Concepts under Different Scenes

Title: Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation

Title: Concept Replacer: Replacing Sensitive Concepts in Diffusion Models via Precision Localization

Title: Revisiting Generative Policies: A Simpler Reinforcement Learning Algorithmic Perspective

Title: Multimodal Fusion Learning with Dual Attention for Medical Imaging

Title: Yi-Lightning Technical Report

Title: EmojiDiff: Advanced Facial Expression Control with High Identity Preservation in Portrait Generation

Title: NLPrompt: Noise-Label Prompt Learning for Vision-Language Models

Title: Do Large Language Models with Reasoning and Acting Meet the Needs of Task-Oriented Dialogue?

Title: Towards Robust Interpretable Surrogates for Optimization

Title: Indexing Economic Fluctuation Narratives from Keiki Watchers Survey

Title: EdgeOAR: Real-time Online Action Recognition On Edge Devices

Title: Ponder & Press: Advancing Visual GUI Agent towards General Computer Control

Title: MuLan: Adapting Multilingual Diffusion Models for Hundreds of Languages with Negligible Cost

Title: PASTA-4-PHT: A Pipeline for Automated Security and Technical Audits for the Personal Health Train

Title: MFTF: Mask-free Training-free Object Level Layout Control Diffusion Model

Title: FedAH: Aggregated Head for Personalized Federated Learning

Title: Cross-Modal Visual Relocalization in Prior LiDAR Maps Utilizing Intensity Textures

Title: Long Video Diffusion Generation with Segmented Cross-Attention and Content-Rich Video Data Curation

Title: Explainable fault and severity classification for rolling element bearings using Kolmogorov-Arnold networks

Title: The "LLM World of Words" English free association norms generated by large language models

Title: A Versatile Influence Function for Data Attribution with Non-Decomposable Loss

Title: Negative Token Merging: Image-based Adversarial Feature Guidance

Title: MoTrans: Customized Motion Transfer with Text-driven Video Diffusion Models

Title: Integrative CAM: Adaptive Layer Fusion for Comprehensive Interpretation of CNNs

Title: Exploring the Robustness of AI-Driven Tools in Digital Forensics: A Preliminary Study

Title: Explaining the Unexplained: Revealing Hidden Correlations for Better Interpretability

Title: Behavior Backdoor for Deep Learning Models

Title: Understanding the World's Museums through Vision-Language Reasoning

Title: An overview of diffusion models for generative artificial intelligence

Title: Hierarchical VAE with a Diffusion-based VampPrior

Title: Adapting Large Language Models to Log Analysis with Interpretable Domain Knowledge

Title: Efficient LLM Inference using Dynamic Input Pruning and Cache-Aware Masking

Title: Second FRCSyn-onGoing: Winning Solutions and Post-Challenge Analysis to Improve Face Recognition with Synthetic Data

Title: Machine Learning Analysis of Anomalous Diffusion

Title: Holistic Understanding of 3D Scenes as Universal Scene Description

Title: ULSR-GS: Ultra Large-scale Surface Reconstruction Gaussian Splatting with Multi-View Geometric Consistency

Title: MambaU-Lite: A Lightweight Model based on Mamba and Integrated Channel-Spatial Attention for Skin Lesion Segmentation

Title: HoloDrive: Holistic 2D-3D Multi-Modal Street Scene Generation for Autonomous Driving

Title: CellSeg1: Robust Cell Segmentation with One Training Image

Title: Impromptu Cybercrime Euphemism Detection

Title: Network Simulation with Complex Cyber-attack Scenarios

Title: MamKPD: A Simple Mamba Baseline for Real-Time 2D Keypoint Detection

Title: FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration

Title: CPA: Camera-pose-awareness Diffusion Transformer for Video Generation

Title: MVImgNet2.0: A Larger-scale Dataset of Multi-view Images

Title: DiffPatch: Generating Customizable Adversarial Patches using Diffusion Model

Title: Early Exit Is a Natural Capability in Transformer-based Models: An Empirical Study on Early Exit without Joint Optimization

Title: Phaseformer: Phase-based Attention Mechanism for Underwater Image Restoration and Beyond

Title: A comprehensive review of datasets and deep learning techniques for vision in Unmanned Surface Vehicles

Title: Multi-Granularity Video Object Segmentation

Title: Improving Object Detection by Modifying Synthetic Data with Explainable AI

Title: Adversarial Attacks on Hyperbolic Networks

Title: RaD: A Metric for Medical Image Distribution Comparison in Out-of-Domain Detection and Other Applications

Title: Scaling Law for Language Models Training Considering Batch Size

Title: Structured 3D Latents for Scalable and Versatile 3D Generation

Title: HaGRIDv2: 1M Images for Static and Dynamic Hand Gesture Recognition

Title: ReHub: Linear Complexity Graph Transformers with Adaptive Hub-Spoke Reassignment

Title: Traversing the Subspace of Adversarial Patches

Title: The Future of Document Verification: Leveraging Blockchain and Self-Sovereign Identity for Enhanced Security and Transparency

Title: The Bare Necessities: Designing Simple, Effective Open-Vocabulary Scene Graphs

Title: Effectiveness of L2 Regularization in Privacy-Preserving Machine Learning

Title: Towards Type Agnostic Cyber Defense Agents

Title: Improved Large Language Model Jailbreak Detection via Pretrained Embeddings

Title: SeqAfford: Sequential 3D Affordance Reasoning via Multimodal Large Language Model

Title: Optimizing Domain-Specific Image Retrieval: A Benchmark of FAISS and Annoy with Fine-Tuned Features

Title: Divide-and-Conquer: Confluent Triple-Flow Network for RGB-T Salient Object Detection

Title: VideoLights: Feature Refinement and Cross-Task Alignment Transformer for Joint Video Highlight Detection and Moment Retrieval

Title: Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle

Title: Tokenizing 3D Molecule Structure with Quantized Spherical Coordinates

Title: Multi-objective Deep Learning: Taxonomy and Survey of the State of the Art

Title: 3DSceneEditor: Controllable 3D Scene Editing with Gaussian Splatting

Title: FairML: A Julia Package for Fair Classification

Title: Epipolar Attention Field Transformers for Bird's Eye View Semantic Segmentation

Title: FEVER-OOD: Free Energy Vulnerability Elimination for Robust Out-of-Distribution Detection

Title: Arabic Handwritten Document OCR Solution with Binarization and Adaptive Scale Fusion Detection

Title: Medchain: Bridging the Gap Between LLM Agents and Clinical Practice through Interactive Sequential Benchmarking

Title: OmniGuard: Hybrid Manipulation Localization via Augmented Versatile Deep Image Watermarking

Title: If Eleanor Rigby Had Met ChatGPT: A Study on Loneliness in a Post-LLM World

Title: NYT-Connections: A Deceptively Simple Text Classification Task that Stumps System-1 Thinkers

Title: Image Forgery Localization via Guided Noise and Multi-Scale Feature Aggregation

Title: Using Large Language Models in Automatic Hint Ranking and Generation Tasks

Title: Review of Mathematical Optimization in Federated Learning

Title: Linearly Homomorphic Signature with Tight Security on Lattice

Title: Robust and Transferable Backdoor Attacks Against Deep Image Compression With Selective Frequency Prior

Title: Privacy-Preserving Federated Learning via Homomorphic Adversarial Networks

Title: Verified Foundations for Differential Privacy

Title: Gen-SIS: Generative Self-augmentation Improves Self-supervised Learning

Title: Causal Discovery by Interventions via Integer Programming

Title: Diffusion Models with Anisotropic Gaussian Splatting for Image Inpainting

Title: Unlocking Video-LLM via Agent-of-Thoughts Distillation

Title: Uncertainty-Aware Regularization for Image-to-Image Translation

Title: Are We There Yet? Revealing the Risks of Utilizing Large Language Models in Scholarly Peer Review

Title: Towards Resource Efficient and Interpretable Bias Mitigation in Large Language Models

Title: Driving Scene Synthesis on Free-form Trajectories with Generative Prior

Title: HUGSIM: A Real-Time, Photo-Realistic and Closed-Loop Simulator for Autonomous Driving

Title: LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant

Title: BroadTrack: Broadcast Camera Tracking for Soccer

Title: Attacks on multimodal models

Title: Adversarial Sample-Based Approach for Tighter Privacy Auditing in Final Model-Only Scenarios

Title: XQ-GAN: An Open-source Image Tokenization Framework for Autoregressive Generation

Title: HackSynth: LLM Agent and Evaluation Framework for Autonomous Penetration Testing

Title: Identifying Reliable Predictions in Detection Transformers

Title: Hard Constraint Guided Flow Matching for Gradient-Free Generation of PDE Solutions

Title: Pretrained Reversible Generation as Unsupervised Visual Representation Learning

Title: CTRL-D: Controllable Dynamic 3D Scene Editing with Personalized 2D Diffusion

Title: IQA-Adapter: Exploring Knowledge Transfer from Image Quality Assessment to Diffusion-based Generative Models

Title: PhysGame: Uncovering Physical Commonsense Violations in Gameplay Videos

Title: SceneFactor: Factored Latent 3D Diffusion for Controllable 3D Scene Generation

Title: V2XPnP: Vehicle-to-Everything Spatio-Temporal Fusion for Multi-Agent Perception and Prediction

Title: COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training

Title: Efficient Semantic Communication Through Transformer-Aided Compression

Title: [CLS] Attention is All You Need for Training-Free Visual Token Pruning: Make VLM Inference Faster

Title: Switti: Designing Scale-Wise Transformers for Text-to-Image Synthesis

Title: World-consistent Video Diffusion with Explicit 3D Modeling

Title: X-Prompt: Towards Universal In-Context Image Generation in Auto-Regressive Vision Language Foundation Models

Title: RandAR: Decoder-only Autoregressive Visual Generation in Random Orders