2025-03-18

Title: Explainable Sentiment Analysis with DeepSeek-R1: Performance, Efficiency, and Few-Shot Learning

Title: TRUTH DECAY: Quantifying Multi-Turn Sycophancy in Language Models

Title: Automating Mathematical Proof Generation Using Large Language Model Agents and Knowledge Graphs

Title: Zero Trust Architecture: A Systematic Literature Review

Title: LogitLens4LLMs: Extending Logit Lens Analysis to Modern Large Language Models

Title: MELON: Multimodal Mixture-of-Experts with Spectral-Temporal Fusion for Long-Term Mobility Estimation in Critical Care

Title: Generalization of Video-Based Heart Rate Estimation Methods To Low Illumination and Elevated Heart Rates

Title: A Survey of Direct Preference Optimization

Title: Refining Filter Global Feature Weighting for Fully-Unsupervised Clustering

Title: Privacy-Preserved Automated Scoring using Federated Learning for Educational Research

Title: Fine-Tuning Diffusion Generative Models via Rich Preference Optimization

Title: BACE-RUL: A Bi-directional Adversarial Network with Covariate Encoding for Machine Remaining Useful Life Prediction

Title: Industrial-Grade Sensor Simulation via Gaussian Splatting: A Modular Framework for Scalable Editing and Full-Stack Validation

Title: CoLLMLight: Cooperative Large Language Model Agents for Network-Wide Traffic Signal Control

Title: BioMamba: Leveraging Spectro-Temporal Embedding in Bidirectional Mamba for Enhanced Biosignal Classification

Title: Making Every Step Effective: Jailbreaking Large Vision-Language Models Through Hierarchical KV Equalization

Title: reWordBench: Benchmarking and Improving the Robustness of Reward Models with Transformed Inputs

Title: UBMF: Uncertainty-Aware Bayesian Meta-Learning Framework for Fault Diagnosis with Imbalanced Industrial Data

Title: Enhancing Resiliency of Sketch-based Security via LSB Sharing-based Dynamic Late Merging

Title: Rethinking Multi-modal Object Detection from the Perspective of Mono-Modality Feature Learning

Title: StyleMorpheus: A Style-Based 3D-Aware Morphable Face Model

Title: Semantic-Clipping: Efficient Vision-Language Modeling with Semantic-Guided Visual Selection

Title: Key, Value, Compress: A Systematic Exploration of KV Cache Compression Techniques

Title: Bridging the LLM Accessibility Divide? Performance, Fairness, and Cost of Closed versus Open LLMs for Automated Essay Scoring

Title: Performance Analysis of Decentralized Federated Learning Deployments

Title: A Transformer and Prototype-based Interpretable Model for Contextual Sarcasm Detection

Title: Trust Under Siege: Label Spoofing Attacks against Machine Learning for Android Malware Detection

Title: Test-Time Training Provably Improves Transformers as In-context Learners

Title: Local Pan-Privacy for Federated Analytics

Title: OpeNLGauge: An Explainable Metric for NLG Evaluation with Open-Weights LLMs

Title: FedALT: Federated Fine-Tuning through Adaptive Local Training with Rest-of-the-World LoRA

Title: GPT's Devastated and LLaMA's Content: Emotion Representation Alignment in LLMs for Keyword-based Generation

Title: DecAlign: Hierarchical Cross-Modal Alignment for Decoupled Multimodal Representation Learning

Title: UStyle: Waterbody Style Transfer of Underwater Scenes by Depth-Guided Feature Synthesis

Title: Resolving UnderEdit & OverEdit with Iterative & Neighbor-Assisted Model Editing

Title: PREAMBLE: Private and Efficient Aggregation of Block Sparse Vectors and Applications

Title: LLMs for Translation: Historical, Low-Resourced Languages and Contemporary AI Models

Title: Spatio-temporal Fourier Transformer (StFT) for Long-term Dynamics Prediction

Title: Upcycling Text-to-Image Diffusion Models for Multi-Task Capabilities

Title: A Survey on SAR ship classification using Deep Learning

Title: LAG-MMLU: Benchmarking Frontier LLM Understanding in Latvian and Giriama

Title: Implementation of classical client universal blind quantum computation with 8-state RSP in current architecture

Title: A Framework for Evaluating Emerging Cyberattack Capabilities of AI

Title: Practical Implications of Implementing Local Differential Privacy for Smart grids

Title: RePanda: Pandas-powered Tabular Verification and Reasoning

Title: REGEN: A Dataset and Benchmarks with Natural Language Critiques and Narratives

Title: Generating a Biometrically Unique and Realistic Iris Database

Title: Att-Adapter: A Robust and Precise Domain-Specific Multi-Attributes T2I Diffusion Adapter via Conditional Variational Autoencoder

Title: Your Text Encoder Can Be An Object-Level Watermarking Controller

Title: Integration of Explainable AI Techniques with Large Language Models for Enhanced Interpretability for Sentiment Analysis

Title: SPOC: Spatially-Progressing Object State Change Segmentation in Video

Title: CHOrD: Generation of Collision-Free, House-Scale, and Organized Digital Twins for 3D Indoor Scenes with Controllable Floor Plans and Optimal Layouts

Title: HInter: Exposing Hidden Intersectional Bias in Large Language Models

Title: Effective and Efficient Cross-City Traffic Knowledge Transfer: A Privacy-Preserving Perspective

Title: Entropy-regularized Gradient Estimators for Approximate Bayesian Inference

Title: Revisiting Gradient Descent: A Dual-Weight Method for Improved Learning

Title: Evaluation of Intra-operative Patient-specific Methods for Point Cloud Completion for Minimally Invasive Liver Interventions

Title: No LLM is Free From Bias: A Comprehensive Study of Bias Evaluation in Large Language Models

Title: Applications of Large Language Model Reasoning in Feature Generation

Title: Fraesormer: Learning Adaptive Sparse Transformer for Efficient Food Recognition

Title: 3D Gaussian Splatting against Moving Objects for High-Fidelity Street Scene Reconstruction

Title: ROS-SAM: High-Quality Interactive Segmentation for Remote Sensing Moving Object

Title: Winning the MIDST Challenge: New Membership Inference Attacks on Diffusion Models for Tabular Data Synthesis

Title: UniMamba: Unified Spatial-Channel Representation Learning with Group-Efficient Mamba for LiDAR-based 3D Object Detection

Title: Mixed-feature Logistic Regression Robust to Distribution Shifts

Title: QDM: Quadtree-Based Region-Adaptive Sparse Diffusion Models for Efficient Image Super-Resolution

Title: A Survey on Federated Fine-tuning of Large Language Models

Title: Compose Your Aesthetics: Empowering Text-to-Image Models with the Principles of Art

Title: Leveraging Motion Information for Better Self-Supervised Video Correspondence Learning

Title: Real-Time Manipulation Action Recognition with a Factorized Graph Sequence Encoder

Title: PSGait: Multimodal Gait Recognition using Parsing Skeleton

Title: TACO: Taming Diffusion for in-the-wild Video Amodal Completion

Title: TLUE: A Tibetan Language Understanding Evaluation Benchmark

Title: Tailor: An Integrated Text-Driven CG-Ready Human and Garment Generation System

Title: Revisiting Training-Inference Trigger Intensity in Backdoor Attacks

Title: A Comprehensive Survey on Knowledge Distillation

Title: Prototype-Based Image Prompting for Weakly Supervised Histopathological Image Segmentation

Title: Robust Dataset Distillation by Matching Adversarial Trajectories

Title: Information-Guided Identification of Training Data Imprint in (Proprietary) Large Language Models

Title: V-Stylist: Video Stylization via Collaboration and Reflection of MLLM Agents

Title: SFMNet: Sparse Focal Modulation for 3D Object Detection

Title: E-SAM: Training-Free Segment Every Entity Model

Title: Towards Vision Zero: The Accid3nD Dataset

Title: Large Language Models in Legislative Content Analysis: A Dataset from the Polish Parliament

Title: A Speech-to-Video Synthesis Approach Using Spatio-Temporal Diffusion for Vocal Tract MRI

Title: ChronosX: Adapting Pretrained Time Series Models with Exogenous Variables

Title: RECSIP: REpeated Clustering of Scores Improving the Precision

Title: MT-RewardTree: A Comprehensive Framework for Advancing LLM-Based Machine Translation via Reward Modeling

Title: Robust Isolation Forest using Soft Sparse Random Projection and Valley Emphasis Method

Title: DiffGAP: A Lightweight Diffusion Module in Contrastive Space for Bridging Cross-Model Gap

Title: A State Alignment-Centric Approach to Federated System Identification: The FedAlign Framework

Title: Point-Cache: Test-time Dynamic and Hierarchical Cache for Robust and Generalizable Point Cloud Analysis

Title: Improving LLM-based Document-level Machine Translation with Multi-Knowledge Fusion

Title: Efficient and Privacy-Preserved Link Prediction via Condensed Graphs

Title: Probabilistic Graph Circuits: Deep Generative Models for Tractable Probabilistic Inference over Graphs

Title: PLM: Efficient Peripheral Language Models Hardware-Co-Designed for Ubiquitous Computing

Title: Learning Extremely High Density Crowds as Active Matters

Title: SEAL: Semantic Aware Image Watermarking

Title: Multi-Agent Systems Execute Arbitrary Malicious Code

Title: Breaking the Box: Enhancing Remote Sensing Image Segmentation with Freehand Sketches

Title: STAY Diffusion: Styled Layout Diffusion Model for Diverse Layout-to-Image Generation

Title: Cross-Modal Diffusion for Biomechanical Dynamical Systems Through Local Manifold Alignment

Title: Gun Detection Using Combined Human Pose and Weapon Appearance

Title: TFHE-Coder: Evaluating LLM-agentic Fully Homomorphic Encryption Code Generation

Title: Adaptive Label Correction for Robust Medical Image Segmentation with Noisy Labels

Title: A Bubble-Cluster Federated Learning Framework for Privacy-Preserving Demand Forecasting on Heterogeneous Retail Data

Title: Interpretation Gaps in LLM-Assisted Comprehension of Privacy Documents

Title: Research on Large Language Model Cross-Cloud Privacy Protection and Collaborative Training based on Federated Learning

Title: LIAM: Multimodal Transformer for Language Instructions, Images, Actions and Semantic Maps

Title: From Laboratory to Real World: A New Benchmark Towards Privacy-Preserved Visible-Infrared Person Re-Identification

Title: A Novel Double Pruning method for Imbalanced Data using Information Entropy and Roulette Wheel Selection for Breast Cancer Diagnosis

Title: RePerformer: Immersive Human-centric Volumetric Videos from Playback to Photoreal Reperformance

Title: Electromagnetic Side-Channel Analysis of PRESENT Lightweight Cipher

Title: Cracking the PUMA Challenge in 24 Hours with CellViT++ and nnU-Net

Title: Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection

Title: Exploration of VLMs for Driver Monitoring Systems Applications

Title: Toward Foundation Models for Online Complex Event Detection in CPS-IoT: A Case Study

Title: Bi-Criteria Optimization for Combinatorial Bandits: Sublinear Regret and Constraint Violation under Bandit Feedback

Title: Integrating Chain-of-Thought and Retrieval Augmented Generation Enhances Rare Disease Diagnosis from Clinical Notes

Title: The Lucie-7B LLM and the Lucie Training Dataset: Open resources for multilingual language generation

Title: Towards Learning High-Precision Least Squares Algorithms with Sequence Models

Title: One Goal, Many Challenges: Robust Preference Optimization Amid Content-Aware and Multi-Source Noise

Title: Towards Self-Improving Systematic Cognition for Next-Generation Foundation MLLMs

Title: Empirical Privacy Variance

Title: Leveraging Vision Capabilities of Multimodal LLMs for Automated Data Extraction from Plots

Title: CapArena: Benchmarking and Analyzing Detailed Image Captioning in the LLM Era

Title: VideoMAP: Toward Scalable Mamba-based Video Autoregressive Pretraining

Title: GS-3I: Gaussian Splatting for Surface Reconstruction from Illumination-Inconsistent Images

Title: Augmented Adversarial Trigger Learning

Title: SVD-LLM V2: Optimizing Singular Value Truncation for Large Language Model Compression

Title: EXPRESS: An LLM-Generated Explainable Property Valuation System with Neighbor Imputation

Title: General Table Question Answering via Answer-Formula Joint Generation

Title: Synthesizing Privacy-Preserving Text Data via Finetuning without Finetuning Billion-Scale LLMs

Title: ProbDiffFlow: An Efficient Learning-Free Framework for Probabilistic Single-Image Optical Flow Estimation

Title: ResLPR: A LiDAR Data Restoration Network and Benchmark for Robust Place Recognition Against Weather Corruptions

Title: Probabilistic Neural Networks (PNNs) with t-Distributed Outputs: Adaptive Prediction Intervals Beyond Gaussian Assumptions

Title: Localized Concept Erasure for Text-to-Image Diffusion Models Using Training-Free Gated Low-Rank Adaptation

Title: ASD Classification on Dynamic Brain Connectome using Temporal Random Walk with Transformer-based Dynamic Network Embedding

Title: L2COcc: Lightweight Camera-Centric Semantic Scene Completion via Distillation of LiDAR Model

Title: Deepfake Detection with Optimized Hybrid Model: EAR Biometric Descriptor via Improved RCNN

Title: Pathology Image Restoration via Mixture of Prompts

Title: MExD: An Expert-Infused Diffusion Model for Whole-Slide Image Classification

Title: SAM2-ELNet: Label Enhancement and Automatic Annotation for Remote Sensing Segmentation

Title: HKCanto-Eval: A Benchmark for Evaluating Cantonese Language Understanding and Cultural Comprehension in LLMs

Title: Consistent-Point: Consistent Pseudo-Points for Semi-Supervised Crowd Counting and Localization

Title: BREEN: Bridge Data-Efficient Encoder-Free Multimodal Learning with Learnable Queries

Title: Causality Model for Semantic Understanding on Videos

Title: ISLR101: an Iranian Word-Level Sign Language Recognition Dataset

Title: Shape Bias and Robustness Evaluation via Cue Decomposition for Image Classification and Segmentation

Title: Learning Privacy from Visual Entities

Title: DPF-Net: Physical Imaging Model Embedded Data-Driven Underwater Image Enhancement

Title: Diffusion-based Synthetic Data Generation for Visible-Infrared Person Re-Identification

Title: GeoRSMLLM: A Multimodal Large Language Model for Vision-Language Tasks in Geoscience and Remote Sensing

Title: CAKE: Cascading and Adaptive KV Cache Eviction with Layer Preferences

Title: Geometry-Aware Face Reconstruction Under Occluded Scenes

Title: Learning Contour-Guided 3D Face Reconstruction with Occlusions

Title: Defense Against Model Stealing Based on Account-Aware Distribution Discrepancy

Title: Segment Any-Quality Images with Generative Latent Space Enhancement

Title: AI-Powered Automated Model Construction for Patient-Specific CFD Simulations of Aortic Flows

Title: HyConEx: Hypernetwork classifier with counterfactual explanations

Title: EditID: Training-Free Editable ID Customization for Text-to-Image Generation

Title: A Plug-and-Play Learning-based IMU Bias Factor for Robust Visual-Inertial Odometry

Title: Investigating Human-Aligned Large Language Model Uncertainty

Title: Towards Suturing World Models: Learning Predictive Models for Robotic Surgical Tasks

Title: Time-EAPCR-T: A Universal Deep Learning Approach for Anomaly Detection in Industrial Equipment

Title: SPC-GS: Gaussian Splatting with Semantic-Prompt Consistency for Indoor Open-World Free-view Synthesis from Sparse Inputs

Title: Debiasing Diffusion Model: Enhancing Fairness through Latent Representation Learning in Stable Diffusion Model

Title: BFANet: Revisiting 3D Semantic Segmentation with Boundary Feature Analysis

Title: ST-Think: How Multimodal Large Language Models Reason About 4D Worlds from Ego-Centric Videos

Title: PEBench: A Fictitious Dataset to Benchmark Machine Unlearning for Multimodal Large Language Models

Title: From Guessing to Asking: An Approach to Resolving the Persona Knowledge Gap in LLMs during Multi-Turn Conversations

Title: AdaReTaKe: Adaptive Redundancy Reduction to Perceive Longer for Video-language Understanding

Title: Multi-Granular Multimodal Clue Fusion for Meme Understanding

Title: Diffusion on Graph: Augmentation of Graph Structure for Node Classification

Title: GAN-Based Single-Stage Defense for Traffic Sign Classification Under Adversarial Patch Attack

Title: Deblur Gaussian Splatting SLAM

Title: BalancedDPO: Adaptive Multi-Metric Alignment

Title: RaSA: Rank-Sharing Low-Rank Adaptation

Title: Personalize Anything for Free with Diffusion Transformer

Title: MoECollab: Democratizing LLM Development Through Collaborative Mixture of Experts

Title: Point Cloud Based Scene Segmentation: A Survey

Title: GraphEval: A Lightweight Graph-Based LLM Framework for Idea Evaluation

Title: SynLlama: Generating Synthesizable Molecules and Their Analogs with Large Language Models

Title: Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

Title: UniBERTs: Adversarial Training for Language-Universal Representations

Title: LATINO-PRO: LAtent consisTency INverse sOlver with PRompt Optimization

Title: Scaling Semantic Categories: Investigating the Impact on Vision Transformer Labeling Performance

Title: SCOOP: CoSt-effective COngestiOn Attacks in Payment Channel Networks

Title: FW-Merging: Scaling Model Merging with Frank-Wolfe Optimization

Title: UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing

Title: TuneNSearch: a hybrid transfer learning and local search approach for solving vehicle routing problems

Title: Logic-RAG: Augmenting Large Multimodal Models with Visual-Spatial Knowledge for Road Scene Understanding

Title: Plausibility Vaccine: Injecting LLM Knowledge for Event Plausibility

Title: ZO2: Scalable Zeroth-Order Fine-Tuning for Extremely Large Language Models with Limited GPU Memory

Title: Domain Generalization for Improved Human Activity Recognition in Office Space Videos Using Adaptive Pre-processing

Title: Algebraic Adversarial Attacks on Explainability Models

Title: GenStereo: Towards Open-World Generation of Stereo Images and Unsupervised Matching

Title: TinySQL: A Progressive Text-to-SQL Dataset for Mechanistic Interpretability Research

Title: A Linearized Alternating Direction Multiplier Method for Federated Matrix Completion Problems

Title: In-Context Linear Regression Demystified: Training Dynamics and Mechanistic Interpretability of Multi-Head Softmax Attention

Title: Cohort-attention Evaluation Metric against Tied Data: Studying Performance of Classification Models in Cancer Detection

Title: VasTSD: Learning 3D Vascular Tree-state Space Diffusion Model for Angiography Synthesis

Title: RAG-RL: Advancing Retrieval-Augmented Generation via RL and Curriculum Learning

Title: A Survey on Human Interaction Motion Generation

Title: Asynchronous Predictive Counterfactual Regret Minimization$^+$ Algorithm in Solving Extensive-Form Games

Title: NuPlanQA: A Large-Scale Dataset and Benchmark for Multi-View Driving Scene Understanding in Multi-Modal Large Language Models

Title: TransDiff: Diffusion-Based Method for Manipulating Transparent Objects Using a Single RGB-D Image

Title: LangDA: Building Context-Awareness via Language for Domain Adaptive Semantic Segmentation

Title: SAM2 for Image and Video Segmentation: A Comprehensive Survey

Title: Mixed-granularity Implicit Representation for Continuous Hyperspectral Compressive Reconstruction

Title: Privacy-Preserving Biometric Verification with Handwritten Random Digit String

Title: Improving Generalization of Universal Adversarial Perturbation via Dynamic Maximin Optimization

Title: A Reinforcement Learning-Driven Transformer GAN for Molecular Generation

Title: DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding

Title: Grounded Chain-of-Thought for Multimodal Large Language Models

Title: Pairwise Similarity Regularization for Semi-supervised Graph Medical Image Segmentation

Title: BLIA: Detect model memorization in binary classification model through passive Label Inference attack

Title: Leveraging Deep Neural Networks for Aspect-Based Sentiment Classification

Title: A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules

Title: From Head to Tail: Towards Balanced Representation in Large Vision-Language Models through Adaptive Data Calibration

Title: An Optimization Framework for Differentially Private Sparse Fine-Tuning

Title: GSBAK$^K$: $top$-$K$ Geometric Score-based Black-box Attack

Title: CompMarkGS: Robust Watermarking for Compression 3D Gaussian Splatting

Title: DreamLayer: Simultaneous Multi-Layer Generation via Diffusion Model

Title: Towards Scalable Foundation Model for Multi-modal and Hyperspectral Geospatial Data

Title: GuideDog: A Real-World Egocentric Multimodal Dataset for Blind and Low-Vision Accessibility-Aware Guidance

Title: ACT360: An Efficient 360-Degree Action Detection and Summarization Framework for Mission-Critical Training and Debriefing

Title: Adaptive Transformer Attention and Multi-Scale Fusion for Spine 3D Segmentation

Title: Enhancing LLM Reasoning with Iterative DPO: A Comprehensive Empirical Investigation

Title: Evolution-based Region Adversarial Prompt Learning for Robustness Enhancement in Vision-Language Models

Title: An interpretable approach to automating the assessment of biofouling in video footage

Title: nvBench 2.0: A Benchmark for Natural Language to Visualization under Ambiguity

Title: Early Detection of Forest Calamities in Homogeneous Stands -- Deep Learning Applied to Bark-Beetle Outbreaks

Title: UncTrack: Reliable Visual Object Tracking with Uncertainty-Aware Prototype Memory Network

Title: Safeguarding LLM Embeddings in End-Cloud Collaboration via Entropy-Driven Perturbation

Title: Federated Continual Instruction Tuning

Title: Experiments with Optimal Model Trees

Title: HICD: Hallucination-Inducing via Attention Dispersion for Contrastive Decoding to Mitigate Hallucinations in Large Language Models

Title: ThinkPatterns-21k: A Systematic Study on the Impact of Thinking Patterns in LLMs

Title: MMLNB: Multi-Modal Learning for Neuroblastoma Subtyping Classification Assisted with Textual Description Generation

Title: AR-1-to-3: Single Image to Consistent 3D Object Generation via Next-View Prediction

Title: MirrorGuard: Adaptive Defense Against Jailbreaks via Entropy-Guided Mirror Crafting

Title: Efficient Action-Constrained Reinforcement Learning via Acceptance-Rejection Method and Augmented MDPs

Title: HiDe-LLaVA: Hierarchical Decoupling for Continual Instruction Tuning of Multimodal Large Language Model

Title: GIFT: Generated Indoor video frames for Texture-less point tracking

Title: Performance Analysis and Industry Deployment of Post-Quantum Cryptography Algorithms

Title: Frame-wise Conditioning Adaptation for Fine-Tuning Diffusion Models in Text-to-Video Prediction

Title: HIS-GPT: Towards 3D Human-In-Scene Multimodal Understanding

Title: FedSDP: Explainable Differential Privacy in Federated Learning via Shapley Values

Title: Unlock Pose Diversity: Accurate and Efficient Implicit Keypoint-based Spatiotemporal Diffusion for Audio-driven Talking Portrait

Title: Training Video Foundation Models with NVIDIA NeMo

Title: Optimal Denoising in Score-Based Generative Models: The Role of Data Regularity

Title: OptiPMB: Enhancing 3D Multi-Object Tracking with Optimized Poisson Multi-Bernoulli Filtering

Title: Aligning Vision to Language: Text-Free Multimodal Knowledge Graph Construction for Enhanced LLMs Reasoning

Title: Prospects for Mitigating Spectral Variability in Tropical Species Classification Using Self-Supervised Learning

Title: Exploring 3D Activity Reasoning and Planning: From Implicit Human Intentions to Route-Aware Planning

Title: Enhancing Job Salary Prediction with Disentangled Composition Effect Modeling: A Neural Prototyping Approach

Title: SparseAlign: A Fully Sparse Framework for Cooperative Object Detection

Title: A Multi-Stage Framework with Taxonomy-Guided Reasoning for Occupation Classification Using Large Language Models

Title: TFDM: Time-Variant Frequency-Based Point Cloud Diffusion with Mamba

Title: Knowledge Distillation: Enhancing Neural Network Compression with Integrated Gradients

Title: Test-Time Domain Generalization via Universe Learning: A Multi-Graph Matching Approach for Medical Image Segmentation

Title: PoseSyn: Synthesizing Diverse 3D Pose Data from In-the-Wild 2D Data

Title: HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model

Title: Overview of the NTCIR-18 Automatic Evaluation of LLMs (AEOLLM) Task

Title: InsightDrive: Insight Scene Representation for End-to-End Autonomous Driving

Title: Uncertainty-Aware Knowledge Distillation for Compact and Efficient 6DoF Pose Estimation

Title: MaskSDM with Shapley values to improve flexibility, robustness, and explainability in species distribution modeling

Title: Do Vision Models Develop Human-Like Progressive Difficulty Understanding?

Title: Federated Learning with Domain Shift Eraser

Title: Rewards Are Enough for Fast Photo-Realistic Text-to-image Generation

Title: DehazeMamba: SAR-guided Optical Remote Sensing Image Dehazing with Adaptive State Space Model

Title: Rethinking Image Evaluation in Super-Resolution

Title: A Framework to Assess Multilingual Vulnerabilities of LLMs

Title: Gaussian On-the-Fly Splatting: A Progressive Framework for Robust Near Real-Time 3DGS Optimization

Title: ClusComp: A Simple Paradigm for Model Compression and Efficient Finetuning

Title: Who Wrote This? Identifying Machine vs Human-Generated Text in Hausa

Title: REPA: Russian Error Types Annotation for Evaluating Text Generation and Judgment Capabilities

Title: ClearSight: Visual Signal Enhancement for Object Hallucination Mitigation in Multimodal Large Language Models

Title: Lifting the Veil on Visual Information Flow in MLLMs: Unlocking Pathways to Faster Inference

Title: Code-Driven Inductive Synthesis: Enhancing Reasoning Abilities of Large Language Models with Sequences

Title: DTGBrepGen: A Novel B-rep Generative Model through Decoupling Topology and Geometry

Title: MM-Spatial: Exploring 3D Spatial Understanding in Multimodal LLMs

Title: VeriLeaky: Navigating IP Protection vs Utility in Fine-Tuning for LLM-Driven Verilog Coding

Title: 3D Human Interaction Generation: A Survey

Title: ChainHOI: Joint-based Kinematic Chain Modeling for Human-Object Interaction Generation

Title: Patient-specific radiomic feature selection with reconstructed healthy persona of knee MR images

Title: DynSTG-Mamba: Dynamic Spatio-Temporal Graph Mamba with Cross-Graph Knowledge Distillation for Gait Disorders Recognition

Title: Laplace-Net: Learning Dynamical Systems with External Forcing

Title: Language-guided Open-world Video Anomaly Detection

Title: PAUSE: Low-Latency and Privacy-Aware Active User Selection for Federated Learning

Title: DeGauss: Dynamic-Static Decomposition with Gaussian Splatting for Distractor-free 3D Reconstruction

Title: GC-Fed: Gradient Centralized Federated Learning with Partial Client Participation

Title: 3DAxisPrompt: Promoting the 3D Grounding and Reasoning in GPT-4o

Title: 3D Hierarchical Panoptic Segmentation in Real Orchard Environments Across Different Sensors

Title: Deep Learning Advancements in Anomaly Detection: A Comprehensive Survey

Title: Clustering is back: Reaching state-of-the-art LiDAR instance segmentation without training

Title: Improving Complex Reasoning with Dynamic Prompt Corruption: A soft prompt Optimization Approach

Title: MedLoRD: A Medical Low-Resource Diffusion Model for High-Resolution 3D CT Image Synthesis

Title: Can Language Models Follow Multiple Turns of Entangled Instructions?

Title: ProDiF: Protecting Domain-Invariant Features to Secure Pre-Trained Models Against Extraction

Title: Mind the Gap: Confidence Discrepancy Can Guide Federated Semi-Supervised Learning Across Pseudo-Mismatch

Title: HoloGest: Decoupled Diffusion and Motion Priors for Generating Holistically Expressive Co-speech Gestures

Title: Sampling Innovation-Based Adaptive Compressive Sensing

Title: TablePilot: Recommending Human-Preferred Tabular Data Analysis with Large Language Models

Title: FlexWorld: Progressively Expanding 3D Scenes for Flexible-View Synthesis

Title: Graph Generative Models Evaluation with Masked Autoencoder

Title: Generative Gaussian Splatting: Generating 3D Scenes with Video Diffusion Priors

Title: LLM-Match: An Open-Sourced Patient Matching Model Based on Large Language Models and Retrieval-Augmented Generation

Title: A Survey on Transformer Context Extension: Approaches and Evaluation

Title: UniHOPE: A Unified Approach for Hand-Only and Hand-Object Pose Estimation

Title: GFSNetwork: Differentiable Feature Selection via Gumbel-Sigmoid Relaxation

Title: Computation Mechanism Behind LLM Position Generalization

Title: MagicDistillation: Weak-to-Strong Video Distillation for Large-Scale Portrait Few-Step Synthesis

Title: Edit Transfer: Learning Image Editing via Vision In-Context Relations

Title: Valid Text-to-SQL Generation with Unification-based DeepStochLog

Title: STEP: Simultaneous Tracking and Estimation of Pose for Animals and Humans

Title: Agents Play Thousands of 3D Video Games

Title: One-Step Residual Shifting Diffusion for Image Super-Resolution via Distillation

Title: Mitigating Visual Forgetting via Take-along Visual Conditioning for Multi-modal Long CoT Reasoning

Title: SyncDiff: Diffusion-based Talking Head Synthesis with Bottlenecked Temporal Visual Prior for Improved Synchronization

Title: Cream of the Crop: Harvesting Rich, Scalable and Transferable Multi-Modal Data for Instruction Fine-Tuning

Title: Scale Efficient Training for Large Datasets

Title: Aligned Probing: Relating Toxic Behavior and Model Internals

Title: MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research

Title: Using the Tools of Cognitive Science to Understand Large Language Models at Different Levels of Analysis

Title: DLPO: Towards a Robust, Efficient, and Generalizable Prompt Optimization Framework from a Deep-Learning Perspective

Title: Securing Virtual Reality Experiences: Unveiling and Tackling Cybersickness Attacks with Explainable AI

Title: SuperBPE: Space Travel for Language Models

Title: Infinite Mobility: Scalable High-Fidelity Synthesis of Articulated Objects via Procedural Generation

Title: xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference

Title: Escaping Plato's Cave: Robust Conceptual Reasoning through Interpretable 3D Neural Object Volumes

Title: Measuring In-Context Computation Complexity via Hidden State Prediction

Title: Uncovering Utility Functions from Observed Outcomes

Title: Less Biased Noise Scale Estimation for Threshold-Robust RANSAC

Title: BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing

Title: Amodal3R: Amodal 3D Reconstruction from Occluded 2D Images

Title: MaTVLM: Hybrid Mamba-Transformer for Efficient Vision-Language Modeling

Title: DPC: Dual-Prompt Collaboration for Tuning Vision-Language Models

Title: VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning

Title: Faithfulness of LLM Self-Explanations for Commonsense Tasks: Larger Is Better, and Instruction-Tuning Allows Trade-Offs but Not Pareto Dominance

Title: MetaScale: Test-Time Scaling with Evolving Meta-Thoughts