2025-03-25

Title: How Effective Is Constitutional AI in Small LLMs? A Study on DeepSeek-R1 and Its Peers

Title: State Fourier Diffusion Language Model (SFDLM): A Scalable, Novel Iterative Approach to Language Modeling

Title: ChatGPT or A Silent Everywhere Helper: A Survey of Large Language Models

Title: IRef-VLA: A Benchmark for Interactive Referential Grounding with Imperfect Language in 3D Scenes

Title: A Comprehensive Survey on Long Context Language Modeling

Title: Enhancing Subsequent Video Retrieval via Vision-Language Models (VLMs)

Title: Generative Modeling of Class Probability for Multi-Modal Representation Learning

Title: V-Seek: Accelerating LLM Reasoning on Open-hardware Server-class RISC-V Platforms

Title: Beyond Negation Detection: Comprehensive Assertion Detection Models for Clinical NLP

Title: Enhanced Smart Contract Reputability Analysis using Multimodal Data Fusion on Ethereum

Title: On-Device Federated Continual Learning on RISC-V-based Ultra-Low-Power SoC for Intelligent Nano-Drone Swarms

Title: LEMMA: Learning from Errors for MatheMatical Advancement in LLMs

Title: CausalRivers -- Scaling up benchmarking of causal discovery for real-world time-series

Title: Feature-Based Dual Visual Feature Extraction Model for Compound Multimodal Emotion Recognition

Title: Collaborative Value Function Estimation Under Model Mismatch: A Federated Temporal Difference Analysis

Title: Language-specific Neurons Do Not Facilitate Cross-Lingual Transfer

Title: Bayesian generative models can flag performance loss, bias, and out-of-distribution image content

Title: What's Producible May Not Be Reachable: Measuring the Steerability of Generative Models

Title: SaudiCulture: A Benchmark for Evaluating Large Language Models Cultural Competence within Saudi Arabia

Title: ProDehaze: Prompting Diffusion Models Toward Faithful Image Dehazing

Title: Judge Anything: MLLM as a Judge Across Any Modality

Title: Meme Similarity and Emotion Detection using Multimodal Analysis

Title: Efficient Knowledge Distillation via Curriculum Extraction

Title: Variance Control via Weight Rescaling in LLM Pre-training

Title: Towards Understanding the Benefits of Neural Network Parameterizations in Geophysical Inversions: A Study With Neural Fields

Title: Improving Quantization with Post-Training Model Expansion

Title: Language Models May Verbatim Complete Text They Were Not Explicitly Trained On

Title: Bayesian Teaching Enables Probabilistic Reasoning in Large Language Models

Title: Should we pre-train a decoder in contrastive learning for dense prediction tasks?

Title: FMDConv: Fast Multi-Attention Dynamic Convolution via Speed-Accuracy Trade-off

Title: MetaSel: A Test Selection Approach for Fine-tuned DNN Models

Title: DermDiff: Generative Diffusion Model for Mitigating Racial Biases in Dermatology Diagnosis

Title: Generating, Fast and Slow: Scalable Parallel Video Generation with Video Interface Networks

Title: PRIMAL: Physically Reactive and Interactive Motor Model for Avatar Learning

Title: Fairness-Driven LLM-based Causal Discovery with Active Learning and Dynamic Scoring

Title: Is there anything left? Measuring semantic residuals of objects removed from 3D Gaussian Splatting

Title: Measuring the Robustness of Audio Deepfake Detectors

Title: Large Language Models Can Verbatim Reproduce Long Malicious Sequences

Title: Leveraging Human Production-Interpretation Asymmetries to Test LLM Cognitive Plausibility

Title: ConSol: Sequential Probability Ratio Testing to Find Consistent LLM Reasoning Paths Efficiently

Title: LEMIX: Enabling Testing of Embedded Applications as Linux Applications

Title: Guidance Free Image Editing via Explicit Conditioning

Title: GPBench: A Comprehensive and Fine-Grained Benchmark for Evaluating Large Language Models as General Practitioners

Title: Unraveling Pedestrian Fatality Patterns: A Comparative Study with Explainable AI

Title: FairFlow: Mitigating Dataset Biases through Undecided Learning

Title: InstructVEdit: A Holistic Approach for Instructional Video Editing

Title: On The Sample Complexity Bounds In Bilevel Reinforcement Learning

Title: Visual Variational Autoencoder Prompt Tuning

Title: Efficient Diffusion Training through Parallelization with Truncated Karhunen-Loève Expansion

Title: Sentinel: Multi-Patch Transformer with Temporal and Channel Attention for Time Series Forecasting

Title: OMR-Diffusion: Optimizing Multi-Round Enhanced Training in Diffusion Models for Improved Intent Understanding

Title: Enhancing Persona Consistency for LLMs' Role-Playing using Persona-Aware Contrastive Learning

Title: CardioTabNet: A Novel Hybrid Transformer Model for Heart Disease Prediction using Tabular Medical Data

Title: 3D Modeling: Camera Movement Estimation and path Correction for SFM Model using the Combination of Modified A-SIFT and Stereo System

Title: A Temporal Modeling Framework for Video Pre-Training on Video Instance Segmentation

Title: DCEvo: Discriminative Cross-Dimensional Evolutionary Learning for Infrared and Visible Image Fusion

Title: Towards Transformer-Based Aligned Generation with Self-Coherence Guidance

Title: Safe RLHF-V: Safe Reinforcement Learning from Human Feedback in Multimodal Large Language Models

Title: Decentralized Federated Dataset Dictionary Learning for Multi-Source Domain Adaptation

Title: CountLLM: Towards Generalizable Repetitive Action Counting via Large Language Model

Title: MotionDiff: Training-free Zero-shot Interactive Motion Editing via Flow-assisted Multi-view Diffusion

Title: MUST: The First Dataset and Unified Framework for Multispectral UAV Single Object Tracking

Title: Multi-modality Anomaly Segmentation on the Road

Title: Normalized Matching Transformer

Title: EMPLACE: Self-Supervised Urban Scene Change Detection

Title: Towards Invisible Backdoor Attack on Text-to-Image Diffusion Model

Title: DynASyn: Multi-Subject Personalization Enabling Dynamic Action Synthesis

Title: Co-op: Correspondence-based Novel Object Pose Estimation

Title: Enhancing Arabic Automated Essay Scoring with Synthetic Data and Error Injection

Title: Serial Low-rank Adaptation of Vision Transformer

Title: HiLoTs: High-Low Temporal Sensitive Representation Learning for Semi-Supervised LiDAR Segmentation in Autonomous Driving

Title: Building Resource-Constrained Language Agents: A Korean Case Study on Chemical Toxicity Information

Title: Improving Preference Extraction In LLMs By Identifying Latent Knowledge Through Classifying Probes

Title: Bandwidth Reservation for Time-Critical Vehicular Applications: A Multi-Operator Environment

Title: Design and implementation of a novel cryptographically secure pseudorandom number generator

Title: Aligning Foundation Model Priors and Diffusion-Based Hand Interactions for Occlusion-Resistant Two-Hand Reconstruction

Title: Topology preserving Image segmentation using the iterative convolution-thresholding method

Title: Every Sample Matters: Leveraging Mixture-of-Experts and High-Quality Data for Efficient and Accurate Code LLM

Title: Progressive Prompt Detailing for Improved Alignment in Text-to-Image Generative Models

Title: Relation Extraction with Instance-Adapted Predicate Descriptions

Title: A Roadmap Towards Improving Multi-Agent Reinforcement Learning With Causal Discovery And Inference

Title: Neural Network Approach to Stochastic Dynamics for Smooth Multimodal Density Estimation

Title: Feather-SQL: A Lightweight NL2SQL Framework with Dual-Model Collaboration Paradigm for Small Language Models

Title: Connectedness: a dimension of security bug severity assessment for measuring uncertainty

Title: RefCut: Interactive Segmentation with Reference Guidance

Title: Fractal-IR: A Unified Framework for Efficient and Scalable Image Restoration

Title: 4D-Bench: Benchmarking Multi-modal Large Language Models for 4D Object Understanding

Title: Fingerprinting Implementations of Cryptographic Primitives and Protocols that Use Post-Quantum Algorithms

Title: Adapt, Agree, Aggregate: Semi-Supervised Ensemble Labeling for Graph Convolutional Networks

Title: NVBleed: Covert and Side-Channel Attacks on NVIDIA Multi-GPU Interconnect

Title: ClaraVid: A Holistic Scene Reconstruction Benchmark From Aerial Perspective With Delentropy-Based Complexity Profiling

Title: Detecting and Mitigating DDoS Attacks with AI: A Survey

Title: A Distributed Blockchain-based Access Control for the Internet of Things

Title: Satisfactory Medical Consultation based on Terminology-Enhanced Information Retrieval and Emotional In-Context Learning

Title: Think Before Refusal: Triggering Safety Reflection in LLMs to Mitigate False Refusal Behavior

Title: Understanding and Mitigating Side and Covert Channel Vulnerabilities Introduced by RowHammer Defenses

Title: MedPlan: A Two-Stage RAG-Based System for Personalized Medical Plan Generation

Title: GLADMamba: Unsupervised Graph-Level Anomaly Detection Powered by Selective State Space Model

Title: Guided Diffusion for the Extension of Machine Vision to Human Visual Perception

Title: WindowKV: Task-Adaptive Group-Wise KV Cache Window Selection for Efficient LLM Inference

Title: Debiasing Multimodal Large Language Models via Noise-Aware Preference Optimization

Title: STShield: Single-Token Sentinel for Real-Time Jailbreak Detection in Large Language Models

Title: Experience Retrieval-Augmentation with Electronic Health Records Enables Accurate Discharge QA

Title: TransAnimate: Taming Layer Diffusion to Generate RGBA Video

Title: An Empirical Study of the Role of Incompleteness and Ambiguity in Interactions with Large Language Models

Title: FisherTune: Fisher-Guided Robust Tuning of Vision Foundation Models for Domain Generalized Segmentation

Title: SLIDE: Sliding Localized Information for Document Extraction

Title: On the Origins of Sampling Bias: Implications on Fairness Measurement and Mitigation

Title: Won: Establishing Best Practices for Korean Financial NLP

Title: Understanding the Effects of RLHF on the Quality and Detectability of LLM-Generated Texts

Title: Real-World Remote Sensing Image Dehazing: Benchmark and Baseline

Title: PhysTwin: Physics-Informed Reconstruction and Simulation of Deformable Objects from Videos

Title: Co-SemDepth: Fast Joint Semantic Segmentation and Depth Estimation on Aerial Images

Title: Metaphor-based Jailbreaking Attacks on Text-to-Image Models

Title: Geometric Constrained Non-Line-of-Sight Imaging

Title: Instructing the Architecture Search for Spatial-temporal Sequence Forecasting with LLM

Title: SymmCompletion: High-Fidelity and High-Consistency Point Cloud Completion with Symmetry Guidance

Title: Personalized Language Models via Privacy-Preserving Evolutionary Model Merging

Title: Vision-R1: Evolving Human-Free Alignment in Large Vision-Language Models via Vision-Guided Reinforcement Learning

Title: Retrieval Augmented Generation and Understanding in Vision: A Survey and New Outlook

Title: OmnimatteZero: Training-free Real-time Omnimatte with Pre-trained Video Diffusion Models

Title: Expanding the Boundaries of Vision Prior Knowledge in Multi-modal Large Language Models

Title: DualCP: Rehearsal-Free Domain-Incremental Learning via Dual-Level Concept Prototype

Title: BERTDetect: A Neural Topic Modelling Approach for Android Malware Detection

Title: Interpretable Feature Interaction via Statistical Self-supervised Learning on Tabular Data

Title: PolarFree: Polarization-based Reflection-free Imaging

Title: Investigating Recent Large Language Models for Vietnamese Machine Reading Comprehension

Title: Dynamic Allocation Hypernetwork with Adaptive Model Recalibration for FCL

Title: Unseen from Seen: Rewriting Observation-Instruction Using Foundation Models for Augmenting Vision-Language Navigation

Title: Self-Explaining Neural Networks for Business Process Monitoring

Title: PanopticSplatting: End-to-End Panoptic Gaussian Splatting

Title: A Multi-Model Adaptation of Speculative Decoding for Classification

Title: Model-Guardian: Protecting against Data-Free Model Stealing Using Gradient Representations and Deceptive Predictions

Title: Vehicular Road Crack Detection with Deep Learning: A New Online Benchmark for Comprehensive Evaluation of Existing Algorithms

Title: Unified Geometry and Color Compression Framework for Point Clouds via Generative Diffusion Priors

Title: Temporal Relation Extraction in Clinical Texts: A Span-based Graph Transformer Approach

Title: $D^2LoRA$: Data-Driven LoRA Initialization for Low Resource Tasks

Title: M3Net: Multimodal Multi-task Learning for 3D Detection, Segmentation, and Occupancy Prediction in Autonomous Driving

Title: PanoGS: Gaussian-based Panoptic Segmentation for 3D Open Vocabulary Scene Understanding

Title: Detection of Somali-written Fake News and Toxic Messages on the Social Media Using Transformer-based Language Models

Title: End-to-End Implicit Neural Representations for Classification

Title: GeoBenchX: Benchmarking LLMs for Multistep Geospatial Tasks

Title: Mitigating Reward Over-Optimization in RLHF via Behavior-Supported Regularization

Title: MathAgent: Leveraging a Mixture-of-Math-Agent Framework for Real-World Multimodal Mathematical Error Detection

Title: An Image-like Diffusion Method for Human-Object Interaction Detection

Title: MLLM-For3D: Adapting Multimodal Large Language Model for 3D Reasoning Segmentation

Title: TCFG: Tangential Damping Classifier-free Guidance

Title: AGIR: Assessing 3D Gait Impairment with Reasoning based on LLMs

Title: LocDiffusion: Identifying Locations on Earth by Diffusing in the Hilbert Space

Title: LongDiff: Training-Free Long Video Generation in One Go

Title: Decorum: A Language-Based Approach For Style-Conditioned Synthesis of Indoor 3D Scenes

Title: DiffusionTalker: Efficient and Compact Speech-Driven 3D Talking Head via Personalizer-Guided Distillation

Title: Evaluating Negative Sampling Approaches for Neural Topic Models

Title: Self-Attention Diffusion Models for Zero-Shot Biomedical Image Segmentation: Unlocking New Frontiers in Medical Imaging

Title: Unmasking Deceptive Visuals: Benchmarking Multimodal Large Language Models on Misleading Chart Question Answering

Title: Literature Review: Cyber Security Monitoring in Maritime

Title: Training A Neural Network For Partially Occluded Road Sign Identification In The Context Of Autonomous Vehicles

Title: Causality-Aware Next Location Prediction Framework based on Human Mobility Stratification

Title: Exploring Topic Trends in COVID-19 Research Literature using Non-Negative Matrix Factorization

Title: FROG: Fair Removal on Graphs

Title: SimMotionEdit: Text-Based Human Motion Editing with Motion Similarity Prediction

Title: LakotaBERT: A Transformer-based Model for Low Resource Lakota Language

Title: Adaptive Rank Allocation: Speeding Up Modern Transformers with RaNA Adapters

Title: MammAlps: A multi-view video behavior monitoring dataset of wild mammals in the Swiss Alps

Title: A Framework for Finding Local Saddle Points in Two-Player Zero-Sum Black-Box Games

Title: Decoupling Angles and Strength in Low-rank Adaptation

Title: PG-SAM: Prior-Guided SAM with Medical for Multi-organ Segmentation

Title: KEA: Keeping Exploration Alive by Proactively Coordinating Exploration Strategies

Title: ShED-HD: A Shannon Entropy Distribution Framework for Lightweight Hallucination Detection on Edge Devices

Title: DiffGED: Computing Graph Edit Distance via Diffusion-based Graph Matching

Title: Enhancing Multi-Label Emotion Analysis and Corresponding Intensities for Ethiopian Languages

Title: Surface-Aware Distilled 3D Semantic Features

Title: The Human-Machine Identity Blur: A Unified Framework for Cybersecurity Risk Management in 2025

Title: Analyzing Islamophobic Discourse Using Semi-Coded Terms and LLMs

Title: TrackID3x3: A Dataset and Algorithm for Multi-Player Tracking with Identification and Pose Estimation in 3x3 Basketball Full-court Videos

Title: CO-SPY: Combining Semantic and Pixel Features to Detect Synthetic Images by AI

Title: Sun-Shine: A Large Language Model for Tibetan Culture

Title: When is dataset cartography ineffective? Using training dynamics does not improve robustness against Adversarial SQuAD

Title: Fact-checking AI-generated news reports: Can LLMs catch their own lies?

Title: LGPS: A Lightweight GAN-Based Approach for Polyp Segmentation in Colonoscopy Images

Title: Surgical Action Planning with Large Language Models

Title: Image-to-Text for Medical Reports Using Adaptive Co-Attention and Triple-LSTM Module

Title: Diff-Palm: Realistic Palmprint Generation with Polynomial Creases and Intra-Class Variation Controllable Diffusion Models

Title: LoTUS: Large-Scale Machine Unlearning with a Taste of Uncertainty

Title: Knowledge Transfer from LLMs to Provenance Analysis: A Semantic-Augmented Method for APT Detection

Title: Improved Rates of Differentially Private Nonconvex-Strongly-Concave Minimax Optimization

Title: Plug-and-Play Interpretable Responsible Text-to-Image Generation via Dual-Space Multi-facet Concept Control

Title: Towards Training-free Anomaly Detection with Vision and Language Foundation Models

Title: Mitigating Cache Noise in Test-Time Adaptation for Large Vision-Language Models

Title: Coeff-Tuning: A Graph Filter Subspace View for Tuning Attention-Based Large Models

Title: SPMTrack: Spatio-Temporal Parameter-Efficient Fine-Tuning with Mixture of Experts for Scalable Visual Tracking

Title: PS-EIP: Robust Photometric Stereo Based on Event Interval Profile

Title: Attacking and Improving the Tor Directory Protocol

Title: Latent Embedding Adaptation for Human Preference Alignment in Diffusion Planners

Title: Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models

Title: Cost-Sensitive Learning for Long-Tailed Temporal Action Segmentation

Title: Context-Enhanced Memory-Refined Transformer for Online Action Detection

Title: J&H: Evaluating the Robustness of Large Language Models Under Knowledge-Injection Attacks in Legal Domain

Title: MaSS13K: A Matting-level Semantic Segmentation Benchmark

Title: DiffusedWrinkles: A Diffusion-Based Model for Data-Driven Garment Animation

Title: Maximum Redundancy Pruning: A Principle-Driven Layerwise Sparsity Allocation for LLMs

Title: Exploring State Space Model in Wavelet Domain: An Infrared and Visible Image Fusion Network via Wavelet Transform and State Space Model

Title: PP-FormulaNet: Bridging Accuracy and Efficiency in Advanced Formula Recognition

Title: RoCA: Robust Contrastive One-class Time Series Anomaly Detection with Contaminated Data

Title: Resource-Efficient Motion Control for Video Generation via Dynamic Mask Guidance

Title: PDDM: Pseudo Depth Diffusion Model for RGB-PD Semantic Segmentation Based in Complex Indoor Scenes

Title: Solving Situation Puzzles with Large Language Model and External Reformulation

Title: Knowledge Graph Enhanced Generative Multi-modal Models for Class-Incremental Learning

Title: Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning

Title: VTD-CLIP: Video-to-Text Discretization via Prompting CLIP

Title: U-REPA: Aligning Diffusion U-Nets to ViTs

Title: Panorama Generation From NFoV Image Done Right

Title: Breaking the Encoder Barrier for Seamless Video-Language Understanding

Title: Teller: Real-Time Streaming Audio-Driven Portrait Animation with Autoregressive Motion Generation

Title: Teaching LLMs for Step-Level Automatic Math Correction via Reinforcement Learning

Title: A Simple yet Effective Layout Token in Large Language Models for Document Understanding

Title: On the Perception Bottleneck of VLMs for Chart Understanding

Title: Distributionally Robust Federated Learning: An ADMM Algorithm

Title: ReconDreamer++: Harmonizing Generative and Reconstructive Models for Driving Scene Representation

Title: Benchmarking Multi-modal Semantic Segmentation under Sensor Failures: Missing and Noisy Modality Robustness

Title: Latent Space Super-Resolution for Higher-Resolution Image Generation with Diffusion Models

Title: InPO: Inversion Preference Optimization with Reparametrized DDIM for Efficient Diffusion Model Alignment

Title: Hiding Images in Diffusion Models by Editing Learned Score Functions

Title: MuMA: 3D PBR Texturing via Multi-Channel Multi-View Generation and Agentic Post-Processing

Title: PALATE: Peculiar Application of the Law of Total Expectation to Enhance the Evaluation of Deep Generative Models

Title: CFReID: Continual Few-shot Person Re-Identification

Title: Video-XL-Pro: Reconstructive Token Compression for Extremely Long Video Understanding

Title: Explaining Domain Shifts in Language: Concept erasing for Interpretable Image Classification

Title: PM4Bench: A Parallel Multilingual Multi-Modal Multi-task Benchmark for Large Vision Language Model

Title: MAGIC-VQA: Multimodal And Grounded Inference with Commonsense Knowledge for Visual Question Answering

Title: Statistically Testing Training Data for Unwanted Error Patterns using Rule-Oriented Regression

Title: Autoregressive Language Models for Knowledge Base Population: A case study in the space mission domain

Title: Deterministic Certification of Graph Neural Networks against Graph Poisoning Attacks with Arbitrary Perturbations

Title: Can Text-to-Video Generation help Video-Language Alignment?

Title: Uncertainty-guided Perturbation for Image Super-Resolution Diffusion Model

Title: SciClaims: An End-to-End Generative System for Biomedical Claim Analysis

Title: AIM2PC: Aerial Image to 3D Building Point Cloud Reconstruction

Title: DiN: Diffusion Model for Robust Medical VQA with Semantic Noisy Labels

Title: Natural Language Processing for Electronic Health Records in Scandinavian Languages: Norwegian, Swedish, and Danish

Title: HiRes-FusedMIM: A High-Resolution RGB-DSM Pre-trained Model for Building-Level Remote Sensing Applications

Title: Distilling Stereo Networks for Performant and Efficient Leaner Networks

Title: Benchmarking Post-Hoc Unknown-Category Detection in Food Recognition

Title: The (Un)suitability of Passwords and Password Managers in Virtual Reality

Title: Discriminative protein sequence modelling with Latent Space Diffusion

Title: EvAnimate: Event-conditioned Image-to-Video Generation for Human Animation

Title: ATARS: An Aerial Traffic Atomic Activity Recognition and Temporal Segmentation Dataset

Title: AMD-Hummingbird: Towards an Efficient Text-to-Video Model

Title: Self-Reported Confidence of Large Language Models in Gastroenterology: Analysis of Commercial, Open-Source, and Quantized Models

Title: Distil-xLSTM: Learning Attention Mechanisms through Recurrent Structures

Title: Anchor-based oversampling for imbalanced tabular data via contrastive and adversarial learning

Title: Adapting Video Diffusion Models for Time-Lapse Microscopy

Title: Unified Uncertainty-Aware Diffusion for Multi-Agent Trajectory Modeling

Title: ClinText-SP and RigoBERTa Clinical: a new set of open resources for Spanish Clinical NLP

Title: LinkAlign: Scalable Schema Linking for Real-World Large-Scale Multi-Database Text-to-SQL

Title: LANGALIGN: Enhancing Non-English Language Models via Cross-Lingual Embedding Alignment

Title: Adventurer: Exploration with BiGAN for Deep Reinforcement Learning

Title: Generative Dataset Distillation using Min-Max Diffusion Model

Title: Dig2DIG: Dig into Diffusion Information Gains for Image Fusion

Title: Robust Lane Detection with Wavelet-Enhanced Context Modeling and Adaptive Sampling

Title: Unbiasing through Textual Descriptions: Mitigating Representation Bias in Video Benchmarks

Title: ZeroLM: Data-Free Transformer Architecture Search for Language Models

Title: Robust face recognition based on the wing loss and the $\ell_1$ regularization

Title: Boosting Virtual Agent Learning and Reasoning: A Step-wise, Multi-dimensional, and Generalist Reward Model with Benchmark

Title: Structure-Aware Correspondence Learning for Relative Pose Estimation

Title: Any6D: Model-free 6D Pose Estimation of Novel Objects

Title: Human Motion Unlearning

Title: NullSwap: Proactive Identity Cloaking Against Deepfake Face Swapping

Title: Commander-GPT: Fully Unleashing the Sarcasm Detection Capability of Multi-Modal Large Language Models

Title: OCRT: Boosting Foundation Models in the Open World with Object-Concept-Relation Triad

Title: LLaVAction: evaluating and training multi-modal large language models for action recognition

Title: GS-Marker: Generalizable and Robust Watermarking for 3D Gaussian Splatting

Title: Boosting Resolution Generalization of Diffusion Transformers with Randomized Positional Encodings

Title: Thermalizer: Stable autoregressive neural emulation of spatiotemporal chaos

Title: SFDLA: Source-Free Document Layout Analysis

Title: Linguistics-aware Masked Image Modeling for Self-supervised Scene Text Recognition

Title: Simulation-Driven Balancing of Competitive Game Levels with Reinforcement Learning

Title: Construction Identification and Disambiguation Using BERT: A Case Study of NPN

Title: EgoSurgery-HTS: A Dataset for Egocentric Hand-Tool Segmentation in Open Surgery Videos

Title: Mechanistic Interpretability of Fine-Tuned Vision Transformers on Distorted Images: Decoding Attention Head Behavior for Transparent and Trustworthy AI

Title: Good Keypoints for the Two-View Geometry Estimation Problem

Title: AlphaSpace: Enabling Robotic Actions through Semantic Tokenization and Symbolic Reasoning

Title: Frequency Dynamic Convolution for Dense Image Prediction

Title: Leveraging Perturbation Robustness to Enhance Out-of-Distribution Detection

Title: Streaming Federated Learning with Markovian Data

Title: CRCL: Causal Representation Consistency Learning for Anomaly Detection in Surveillance Videos

Title: SKDU at De-Factify 4.0: Vision Transformer with Data Augmentation for AI-Generated Image Detection

Title: Defeating Prompt Injections by Design

Title: Interpretable and Fair Mechanisms for Abstaining Classifiers

Title: Unsupervised Detection of Fraudulent Transactions in E-commerce Using Contrastive Learning

Title: Secure Edge Computing Reference Architecture for Data-driven Structural Health Monitoring: Lessons Learned from Implementation and Benchmarking

Title: An End-to-End GSM/SMS Encrypted Approach for Smartphone Employing Advanced Encryption Standard (AES)

Title: HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation

Title: Exploring the Integration of Key-Value Attention Into Pure and Hybrid Transformers for Semantic Segmentation

Title: A semantic communication-based workload-adjustable transceiver for wireless AI-generated content (AIGC) delivery

Title: I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders

Title: Seeing Speech and Sound: Distinguishing and Locating Audios in Visual Scenes

Title: Efficient and Accurate Scene Text Recognition with Cascaded-Transformers

Title: CFG-Zero*: Improved Classifier-Free Guidance for Flow Matching Models

Title: AgentDropout: Dynamic Agent Elimination for Token-Efficient and High-Performance LLM-Based Multi-Agent Collaboration

Title: xKV: Cross-Layer SVD for KV-Cache Compression

Title: Building Blocks for Robust and Effective Semi-Supervised Real-World Object Detection

Title: FFN Fusion: Rethinking Sequential Computation in Large Language Models

Title: Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training

Title: CoMP: Continual Multimodal Pre-training for Vision Foundation Models

Title: SyncVP: Joint Diffusion for Synchronous Multi-Modal Video Prediction

Title: Training-free Diffusion Acceleration with Bottleneck Sampling

Title: Video-T1: Test-Time Scaling for Video Generation

Title: SlowFast-LLaVA-1.5: A Family of Token-Efficient Video Large Language Models for Long-Form Video Understanding

Title: DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation

Title: Aether: Geometric-Aware Unified World Modeling

Title: Tuning-Free Amodal Segmentation via the Occlusion-Free Bias of Inpainting Models

Title: Equivariant Image Modeling

Title: Target-Aware Video Diffusion Models