2025-03-19

Title: SciHorizon: Benchmarking AI-for-Science Readiness from Scientific Data to Large Language Models

Title: CoCMT: Communication-Efficient Cross-Modal Transformer for Collaborative Perception

Title: Ensemble Learning for Large Language Models in Text and Code Generation: A Survey

Title: The Role of Hyperparameters in Predictive Multiplicity

Title: NeurIPS 2023 LLM Efficiency Fine-tuning Competition

Title: It is Too Many Options: Pitfalls of Multiple-Choice Questions in Generative AI and Medical Education

Title: MentalChat16K: A Benchmark Dataset for Conversational Mental Health Assistance

Title: Prompt Sentiment: The Catalyst for LLM Change

Title: RAG-KG-IL: A Multi-Agent Hybrid Framework for Reducing Hallucinations and Enhancing LLM Reasoning through RAG and Incremental Knowledge Graph Learning Integration

Title: CURIE: Evaluating LLMs On Multitask Scientific Long Context Understanding and Reasoning

Title: Evaluating the Process Modeling Abilities of Large Language Models -- Preliminary Foundations and Results

Title: Agent-Enhanced Large Language Models for Researching Political Institutions

Title: Cognitive Activation and Chaotic Dynamics in Large Language Models: A Quasi-Lyapunov Analysis of Reasoning Mechanisms

Title: Context-aware Multimodal AI Reveals Hidden Pathways in Five Centuries of Art Evolution

Title: FedTilt: Towards Multi-Level Fairness-Preserving and Robust Federated Learning

Title: MSCMHMST: A traffic flow prediction model based on Transformer

Title: HAR-DoReMi: Optimizing Data Mixture for Self-Supervised Human Activity Recognition Across Heterogeneous IMU Datasets

Title: Enhancing Visual Representation with Textual Semantics: Textual Semantics-Powered Prototypes for Heterogeneous Federated Learning

Title: Semi-Decision-Focused Learning with Deep Ensembles: A Practical Framework for Robust Portfolio Optimization

Title: CNCast: Leveraging 3D Swin Transformer and DiT for Enhanced Regional Weather Forecasting

Title: Fuzzy Rule-based Differentiable Representation Learning

Title: Towards Privacy-Preserving Data-Driven Education: The Potential of Federated Learning

Title: Towards Hierarchical Multi-Step Reward Models for Enhanced Reasoning in Large Language Models

Title: MES-RAG: Bringing Multi-modal, Entity-Storage, and Secure Enhancements to RAG

Title: ExChanGeAI: An End-to-End Platform and Efficient Foundation Model for Electrocardiogram Analysis and Fine-tuning

Title: Analytic Subspace Routing: How Recursive Least Squares Works in Continual Learning of Large Language Model

Title: A Comprehensive Survey on Visual Concept Mining in Text-to-image Diffusion Models

Title: Let Synthetic Data Shine: Domain Reassembly and Soft-Fusion for Single Domain Generalization

Title: XChainDataGen: A Cross-Chain Dataset Generation Framework

Title: Omnia de EgoTempo: Benchmarking Temporal Understanding of Multi-Modal LLMs in Egocentric Videos

Title: Web Artifact Attacks Disrupt Vision Language Models

Title: Pensez: Less Data, Better Reasoning -- Rethinking French LLM

Title: FiVE: A Fine-grained Video Editing Benchmark for Evaluating Emerging Diffusion and Rectified Flow Models

Title: Feature Extraction and Analysis for GPT-Generated Text

Title: Mitigating Spectral Bias in Neural Operators via High-Frequency Scaling for Physical Systems

Title: Improving Geometric Consistency for 360-Degree Neural Radiance Fields in Indoor Scenarios

Title: SED-MVS: Segmentation-Driven and Edge-Aligned Deformation Multi-View Stereo with Depth Restoration and Occlusion Constraint

Title: Towards Scalable Modeling of Compressed Videos for Efficient Action Recognition

Title: TextInVision: Text and Prompt Complexity Driven Visual Text Generation Benchmark

Title: CoDet-M4: Detecting Machine-Generated Code in Multi-Lingual, Multi-Generator and Multi-Domain Settings

Title: AccelGen: Heterogeneous SLO-Guaranteed High-Throughput LLM Inference Serving for Diverse Applications

Title: Learning from Synchronization: Self-Supervised Uncalibrated Multi-View Person Association in Challenging Scenes

Title: FedVSR: Towards Model-Agnostic Federated Learning in Video Super-Resolution

Title: Fast alignment of heterogeneous images in sliced Wasserstein distance

Title: Effective Dimension Aware Fractional-Order Stochastic Gradient Descent for Convex Optimization Problems

Title: Continual Unlearning for Foundational Text-to-Image Models without Generalization Erosion

Title: Mitigating KV Cache Competition to Enhance User Experience in LLM Inference

Title: 8-Calves Image dataset

Title: Using 3D reconstruction from image motion to predict total leaf area in dwarf tomato plants

Title: Identifying and Mitigating Position Bias of Multi-image Vision-Language Models

Title: LED: LLM Enhanced Open-Vocabulary Object Detection without Human Curated Data Generation

Title: AI-Powered Prediction of Nanoparticle Pharmacokinetics: A Multi-View Learning Approach

Title: SMILE: a Scale-aware Multiple Instance Learning Method for Multicenter STAS Lung Cancer Histopathology Diagnosis

Title: Text-Guided Image Invariant Feature Learning for Robust Image Watermarking

Title: Organ-aware Multi-scale Medical Image Segmentation Using Text Prompt Engineering

Title: FusDreamer: Label-efficient Remote Sensing World Model for Multimodal Data Classification

Title: MOSAIC: Generating Consistent, Privacy-Preserving Scenes from Multiple Depth Views in Multi-Room Environments

Title: Scale-Aware Contrastive Reverse Distillation for Unsupervised Medical Anomaly Detection

Title: SALAD: Skeleton-aware Latent Diffusion for Text-driven Motion Generation and Editing

Title: Enabling Inclusive Systematic Reviews: Incorporating Preprint Articles with Large Language Model-Driven Evaluations

Title: Less is More: Improving Motion Diffusion Models with Sparse Keyframes

Title: Robust3D-CIL: Robust Class-Incremental Learning for 3D Perception

Title: Empirical Calibration and Metric Differential Privacy in Language Models

Title: Multi-label feature selection based on binary hashing learning and dynamic graph constraints

Title: MMR: A Large-scale Benchmark Dataset for Multi-target and Multi-granularity Reasoning Segmentation

Title: MoK-RAG: Mixture of Knowledge Paths Enhanced Retrieval-Augmented Generation for Embodied AI Environments

Title: YOLO-LLTS: Real-Time Low-Light Traffic Sign Detection via Prior-Guided Enhancement and Multi-Branch Feature Interaction

Title: COMM: Concentrated Margin Maximization for Robust Document-Level Relation Extraction

Title: Exploiting Inherent Class Label: Towards Robust Scribble Supervised Semantic Segmentation

Title: TGBFormer: Transformer-GraphFormer Blender Network for Video Object Detection

Title: Quantification of Uncertainties in Probabilistic Deep Neural Network by Implementing Boosting of Variational Inference

Title: PSA-SSL: Pose and Size-aware Self-Supervised Learning on LiDAR Point Clouds

Title: Robust Machine Unlearning for Quantized Neural Networks via Adaptive Gradient Reweighting with Similar Labels

Title: ConSCompF: Consistency-focused Similarity Comparison Framework for Generative Large Language Models

Title: Reconstructing Cell Lineage Trees from Phenotypic Features with Metric Learning

Title: Learning Shape-Independent Transformation via Spherical Representations for Category-Level Object Pose Estimation

Title: Med-R1: Reinforcement Learning for Generalizable Medical Reasoning in Vision-Language Models

Title: Multi-Modal Self-Supervised Semantic Communication

Title: Make the Most of Everything: Further Considerations on Disrupting Diffusion-based Customization

Title: FrustumFusionNets: A Three-Dimensional Object Detection Network Based on Tractor Road Scene

Title: SimWorld: A Unified Benchmark for Simulator-Conditioned Scene Generation via World Model

Title: Improving LLM Video Understanding with 16 Frames Per Second

Title: DIFFVSGG: Diffusion-Driven Online Video Scene Graph Generation

Title: Survey of Adversarial Robustness in Multimodal Large Language Models

Title: MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding

Title: FlexVLN: Flexible Adaptation for Diverse Vision-and-Language Navigation Tasks

Title: SoccerSynth Field: enhancing field detection with synthetic data from virtual soccer simulator

Title: Empowering LLMs in Decision Games through Algorithmic Data Synthesis

Title: A-SCoRe: Attention-based Scene Coordinate Regression for wide-ranging scenarios

Title: SpaceVLLM: Endowing Multimodal Large Language Model with Spatio-Temporal Video Grounding Capability

Title: DefectFill: Realistic Defect Generation with Inpainting Diffusion Model for Visual Inspection

Title: Empowering Smaller Models: Tuning LLaMA and Gemma with Chain-of-Thought for Ukrainian Exam Tasks

Title: TarPro: Targeted Protection against Malicious Image Editing

Title: Multimodal Feature-Driven Deep Learning for the Prediction of Duck Body Dimensions and Weight

Title: MeshFleet: Filtered and Annotated 3D Vehicle Dataset for Domain Specific Generative Modeling

Title: Predicting Human Choice Between Textually Described Lotteries

Title: Securing Automated Insulin Delivery Systems: A Review of Security Threats and Protectives Strategies

Title: LEGNet: Lightweight Edge-Gaussian Driven Network for Low-Quality Remote Sensing Image Object Detection

Title: Boosting Semi-Supervised Medical Image Segmentation via Masked Image Consistency and Discrepancy Learning

Title: MP-GUI: Modality Perception with MLLMs for GUI Understanding

Title: Synthetic Data Generation Using Large Language Models: Advances in Text and Code

Title: Uncertainty-Aware Global-View Reconstruction for Multi-View Multi-Label Feature Selection

Title: Rethinking End-to-End 2D to 3D Scene Segmentation in Gaussian Splatting

Title: A Revisit to the Decoder for Camouflaged Object Detection

Title: Intra and Inter Parser-Prompted Transformers for Effective Image Restoration

Title: Learning on LLM Output Signatures for gray-box LLM Behavior Analysis

Title: ON-Traffic: An Operator Learning Framework for Online Traffic Flow Estimation and Uncertainty Quantification from Lagrangian Sensors

Title: Fast Autoregressive Video Generation with Diagonal Decoding

Title: Theoretical Foundation of Flow-Based Time Series Generation: Provable Approximation, Generalization, and Efficiency

Title: Wiki-Quantities and Wiki-Measurements: Datasets of Quantities and their Measurement Context from Wikipedia

Title: Condensing Action Segmentation Datasets via Generative Network Inversion

Title: SketchFusion: Learning Universal Sketch Features through Fusing Foundation Models

Title: CARE: A QLoRA-Fine Tuned Multi-Domain Chatbot With Fast Learning On Minimal Hardware

Title: Marten: Visual Question Answering with Mask Generation for Multi-modal Document Understanding

Title: Comparative and Interpretative Analysis of CNN and Transformer Models in Predicting Wildfire Spread Using Remote Sensing Data

Title: Speculative Decoding for Verilog: Speed and Quality, All in One

Title: RBFIM: Perceptual Quality Assessment for Compressed Point Clouds Using Radial Basis Function Interpolation

Title: Towards Harmless Multimodal Assistants with Blind Preference Optimization

Title: RoGSplat: Learning Robust Generalizable Human Gaussian Splatting from Sparse Multi-View Images

Title: AI-Driven Diabetic Retinopathy Diagnosis Enhancement through Image Processing and Salp Swarm Algorithm-Optimized Ensemble Network

Title: Decision Tree Induction Through LLMs via Semantically-Aware Evolution

Title: Segmentation-Guided Neural Radiance Fields for Novel Street View Synthesis

Title: Panoramic Distortion-Aware Tokenization for Person Detection and Localization Using Transformers in Overhead Fisheye Images

Title: Multi-task Learning for Identification of Porcelain in Song and Yuan Dynasties

Title: CRCE: Coreference-Retention Concept Erasure in Text-to-Image Diffusion Models

Title: Make Your Training Flexible: Towards Deployment-Efficient Video Models

Title: Predicting Cardiopulmonary Exercise Testing Outcomes in Congenital Heart Disease Through Multi-modal Data Integration and Geometric Learning

Title: Deep Unsupervised Segmentation of Log Point Clouds

Title: Trading-off Accuracy and Communication Cost in Federated Learning

Title: Quantization-Free Autoregressive Action Transformer

Title: DARS: Dynamic Action Re-Sampling to Enhance Coding Agent Performance by Adaptive Tree Traversal

Title: CTSR: Controllable Fidelity-Realness Trade-off Distillation for Real-World Image Super Resolution

Title: Manual Labelling Artificially Inflates Deep Learning-Based Segmentation Performance on Closed Canopy: Validation Using TLS

Title: Free-Lunch Color-Texture Disentanglement for Stylized Image Generation

Title: Anti-Tamper Radio meets Reconfigurable Intelligent Surface for System-Level Tamper Detection

Title: XOXO: Stealthy Cross-Origin Context Poisoning Attacks against AI Coding Assistants

Title: Entente: Cross-silo Intrusion Detection on Network Log Graphs with Federated Learning

Title: Tapered Off-Policy REINFORCE: Stable and efficient reinforcement learning for LLMs

Title: Improved Scalable Lipschitz Bounds for Deep Neural Networks

Title: Unveiling the Role of Randomization in Multiclass Adversarial Classification: Insights from Graph Theory

Title: COPA: Comparing the Incomparable to Explore the Pareto Front

Title: DualToken: Towards Unifying Visual Understanding and Generation with Dual Visual Vocabularies

Title: LeanVAE: An Ultra-Efficient Reconstruction VAE for Video Diffusion Models

Title: EvolvingGrasp: Evolutionary Grasp Generation via Efficient Preference Alignment

Title: Revealing higher-order neural representations with generative artificial intelligence

Title: PENCIL: Long Thoughts with Short Memory

Title: 3D Densification for Multi-Map Monocular VSLAM in Endoscopy

Title: VEGGIE: Instructional Editing and Reasoning Video Concepts with Grounded Generation

Title: MAST-Pro: Dynamic Mixture-of-Experts for Adaptive Segmentation of Pan-Tumors with Knowledge-Driven Prompts

Title: Benchmarking community drug response prediction models: datasets, models, tools, and metrics for cross-dataset generalization analysis

Title: RFMI: Estimating Mutual Information on Rectified Flow for Text-to-Image Alignment

Title: Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels

Title: Good/Evil Reputation Judgment of Celebrities by LLMs via Retrieval Augmented Generation

Title: Vexed by VEX tools: Consistency evaluation of container vulnerability scanners

Title: How much do LLMs learn from negative examples?

Title: From "Hallucination" to "Suture": Insights from Language Philosophy to Enhance Large Language Models

Title: Technical Report: Aggregation on Learnable Manifolds for Asynchronous Federated Optimization

Title: Diffusion-based Facial Aesthetics Enhancement with 3D Structure Guidance

Title: DUNE: Distilling a Universal Encoder from Heterogeneous 2D and 3D Teachers

Title: Unifying Text Semantics and Graph Structures for Temporal Text-attributed Graphs with Large Language Models

Title: ExDDV: A New Dataset for Explainable Deepfake Detection in Video

Title: MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation

Title: Joint Image-Instance Spatial-Temporal Attention for Few-shot Action Recognition

Title: PLAY2PROMPT: Zero-shot Tool Instruction Optimization for LLM Agents via Tool Play

Title: LLM-FE: Automated Feature Engineering for Tabular Data with LLMs as Evolutionary Optimizers

Title: EnvBench: A Benchmark for Automated Environment Setup

Title: Bolt3D: Generating 3D Scenes in Seconds

Title: RWKV-7 "Goose" with Expressive Dynamic State Evolution

Title: SIR-DIFF: Sparse Image Sets Restoration with Multi-View Diffusion Model

Title: Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM

Title: DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers

Title: Stable Virtual Camera: Generative View Synthesis with Diffusion Models

Title: Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control

Title: State Space Model Meets Transformer: A New Paradigm for 3D Object Detection

Title: Deeply Supervised Flow-Based Generative Models

Title: Advances in 4D Generation: A Survey

Title: The Power of Context: How Multimodality Improves Image Super-Resolution

Title: Aligning Multimodal LLM with Human Preference: A Survey

Title: MusicInfuser: Making Video Diffusion Listen and Dance