2025-06-04

Title: Research on Medical Named Entity Identification Based On Prompt-Biomrc Model and Its Application in Intelligent Consultation System

Title: Graph-Based Adversarial Domain Generalization with Anatomical Correlation Knowledge for Cross-User Human Activity Recognition

Title: Breaking Quadratic Barriers: A Non-Attention LLM for Ultra-Long Context Horizons

Title: TaskVAE: Task-Specific Variational Autoencoders for Exemplar Generation in Continual Learning for Human Activity Recognition

Title: Matrix Is All You Need

Title: Turning LLM Activations Quantization-Friendly

Title: Johnny: Structuring Representation Space to Enhance Machine Abstract Reasoning Ability

Title: Towards Unsupervised Training of Matching-based Graph Edit Distance Solver via Preference-aware GAN

Title: Improvement of AMPs Identification with Generative Adversarial Network and Ensemble Classification

Title: SpecMemo: Speculative Decoding is in Your Pocket

Title: Surrogate Interpretable Graph for Random Decision Forests

Title: Coded Robust Aggregation for Distributed Learning under Byzantine Attacks

Title: No Free Lunch in Active Learning: LLM Embedding Quality Dictates Query Strategy Success

Title: NovelHopQA: Diagnosing Multi-Hop Reasoning Failures in Long Narrative Contexts

Title: CNVSRC 2024: The Second Chinese Continuous Visual Speech Recognition Challenge

Title: Leveraging Large Language Models in Visual Speech Recognition: Model Scaling, Context-Aware Decoding, and Iterative Polishing

Title: Object-centric Self-improving Preference Optimization for Text-to-Image Generation

Title: Are classical deep neural networks weakly adversarially robust?

Title: Fairness through Feedback: Addressing Algorithmic Misgendering in Automatic Gender Recognition

Title: Enhancing Paraphrase Type Generation: The Impact of DPO and RLHF Evaluated with Human-Ranked Data

Title: ChatCFD: an End-to-End CFD Agent with Domain-specific Structured Thinking

Title: Improve Multi-Modal Embedding Learning via Explicit Hard Negative Gradient Amplifying

Title: Do You See Me: A Multidimensional Benchmark for Evaluating Visual Perception in Multimodal LLMs

Title: The End Of Universal Lifelong Identifiers: Identity Systems For The AI Era

Title: A tertiary review on quantum cryptography

Title: Adaptive Privacy-Preserving SSD

Title: Towards Secure MLOps: Surveying Attacks, Mitigation Strategies, and Research Challenges

Title: Asymmetry by Design: Boosting Cyber Defenders with Differential Access to AI

Title: FinS-Pilot: A Benchmark for Online Financial System

Title: Blockchain Powered Edge Intelligence for U-Healthcare in Privacy Critical and Time Sensitive Environment

Title: Beyond the Protocol: Unveiling Attack Vectors in the Model Context Protocol Ecosystem

Title: Enhancing Multimodal Continual Instruction Tuning with BranchLoRA

Title: Docker under Siege: Securing Containers in the Modern Era

Title: Improving LLM Agents with Reinforcement Learning on Cryptographic CTF Challenges

Title: Generalization Performance of Ensemble Clustering: From Theory to Algorithm

Title: Evaluating the Unseen Capabilities: How Many Theorems Do LLMs Know?

Title: Predicting Blood Type: Assessing Model Performance with ROC Analysis

Title: Privacy-Aware, Public-Aligned: Embedding Risk Detection and Public Values into Scalable Clinical Text De-Identification for Trusted Research Environments

Title: EWGN: Elastic Weight Generation and Context Switching in Deep Learning

Title: An Introduction to Flow Matching and Diffusion Models

Title: Assigning Distinct Roles to Quantized and Low-Rank Matrices Toward Optimal Weight Decomposition

Title: Robust Federated Learning against Noisy Clients via Masked Optimization

Title: RATFM: Retrieval-augmented Time Series Foundation Model for Anomaly Detection

Title: Temporal Causal-based Simulation for Realistic Time-series Generation

Title: SALAD: Systematic Assessment of Machine Unlearning on LLM-Aided Hardware Design

Title: Towards Better Generalization and Interpretability in Unsupervised Concept-Based Models

Title: Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences

Title: SAB3R: Semantic-Augmented Backbone in 3D Reconstruction

Title: Knowledge or Reasoning? A Close Look at How LLMs Think Across Domains

Title: Model Internal Sleuthing: Finding Lexical Identity and Inflectional Morphology in Modern Language Models

Title: ReconXF: Graph Reconstruction Attack via Public Feature Explanations on Privatized Node Features and Labels

Title: Revisiting LRP: Positional Attribution as the Missing Ingredient for Transformer Explainability

Title: Z-Error Loss for Training Neural Networks

Title: Mitigating Data Poisoning Attacks to Local Differential Privacy

Title: HENT-SRT: Hierarchical Efficient Neural Transducer with Self-Distillation for Joint Speech Recognition and Translation

Title: TIIF-Bench: How Does Your T2I Model Follow Your Instructions?

Title: Quantifying task-relevant representational similarity using decision variable correlation

Title: Fire360: A Benchmark for Robust Perception and Episodic Memory in Degraded 360-Degree Firefighting Videos

Title: An Approximation Theory Perspective on Machine Learning

Title: Different Speech Translation Models Encode and Translate Speaker Gender Differently

Title: Echoes of Phonetics: Unveiling Relevant Acoustic Cues for ASR via Feature Attribution

Title: KDRL: Post-Training Reasoning LLMs via Unified Knowledge Distillation and Reinforcement Learning

Title: Leveraging Natural Language Processing to Unravel the Mystery of Life: A Review of NLP Approaches in Genomics, Transcriptomics, and Proteomics

Title: Diff2Flow: Training Flow Matching Models via Diffusion Model Alignment

Title: VLCD: Vision-Language Contrastive Distillation for Accurate and Efficient Automatic Placenta Analysis

Title: From Street Views to Urban Science: Discovering Road Safety Factors with Multimodal Large Language Models

Title: Motion aware video generative model

Title: PAIR-Net: Enhancing Egocentric Speaker Detection via Pretrained Audio-Visual Fusion and Alignment Loss

Title: Rig3R: Rig-Aware Conditioning for Learned 3D Reconstruction

Title: Latent Stochastic Interpolants

Title: Angles Don't Lie: Unlocking Training-Efficient RL Through the Model's Own Signals

Title: Why Gradients Rapidly Increase Near the End of Training

Title: Improving Knowledge Distillation Under Unknown Covariate Shift Through Confidence-Guided Data Augmentation

Title: QARI-OCR: High-Fidelity Arabic Text Recognition through Multimodal Large Language Model Adaptation

Title: LAM SIMULATOR: Advancing Data Generation for Large Action Model Training via Online Exploration and Trajectory Feedback

Title: Through a Steerable Lens: Magnifying Neural Network Interpretability via Phase-Based Extrapolation

Title: Explain-then-Process: Using Grammar Prompting to Enhance Grammatical Acceptability Judgments

Title: Absorb and Converge: Provable Convergence Guarantee for Absorbing Discrete Diffusion Models

Title: Quantifying Misattribution Unfairness in Authorship Attribution

Title: Are Crypto Ecosystems (De)centralizing? A Framework for Longitudinal Analysis

Title: Something Just Like TRuST: Toxicity Recognition of Span and Target

Title: Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning

Title: Truth over Tricks: Measuring and Mitigating Shortcut Learning in Misinformation Detection

Title: RATE-Nav: Region-Aware Termination Enhancement for Zero-shot Object Navigation with Vision-Language Models

Title: Rewarding the Unlikely: Lifting GRPO Beyond Distribution Sharpening

Title: InterRVOS: Interaction-aware Referring Video Object Segmentation

Title: RoadFormer: Local-Global Feature Fusion for Road Surface Classification in Autonomous Driving

Title: MISLEADER: Defending against Model Extraction with Ensembles of Distilled Models

Title: A TRPCA-Inspired Deep Unfolding Network for Hyperspectral Image Denoising via Thresholded t-SVD and Top-K Sparse Transformer

Title: Approximate Borderline Sampling using Granular-Ball for Classification Tasks

Title: ViTNF: Leveraging Neural Fields to Boost Vision Transformers in Generalized Category Discovery

Title: Reconciling Hessian-Informed Acceleration and Scalar-Only Communication for Efficient Federated Zeroth-Order Fine-Tuning

Title: SFBD Flow: A Continuous-Optimization Framework for Training Diffusion Models with Noisy Samples

Title: Exploring Explanations Improves the Robustness of In-Context Learning

Title: Univariate to Multivariate: LLMs as Zero-Shot Predictors for Time-Series Forecasting

Title: GAdaBoost: An Efficient and Robust AdaBoost Algorithm Based on Granular-Ball Structure

Title: Consultant Decoding: Yet Another Synergistic Mechanism

Title: Improving Generalization of Neural Combinatorial Optimization for Vehicle Routing Problems via Test-Time Projection Learning

Title: RRCANet: Recurrent Reusable-Convolution Attention Network for Infrared Small Target Detection

Title: The Devil is in the Darkness: Diffusion-Based Nighttime Dehazing Anchored in Brightness Perception

Title: Towards Explicit Geometry-Reflectance Collaboration for Generalized LiDAR Segmentation in Adverse Weather

Title: GraphRAG-Bench: Challenging Domain-Specific Reasoning for Evaluating Graph Retrieval-Augmented Generation

Title: Modelship Attribution: Tracing Multi-Stage Manipulations Across Generative Models

Title: Revisiting End-to-End Learning with Slide-level Supervision in Computational Pathology

Title: SingaKids: A Multilingual Multimodal Dialogic Tutor for Language Learning

Title: AERO: A Redirection-Based Optimization Framework Inspired by Judo for Robust Probabilistic Forecasting

Title: Guiding Registration with Emergent Similarity from Pre-Trained Diffusion Models

Title: Gender Inequality in English Textbooks Around the World: an NLP Approach

Title: Comparative Analysis of AI Agent Architectures for Entity Relationship Classification

Title: From Anger to Joy: How Nationality Personas Shape Emotion Attribution in Large Language Models

Title: Empowering Functional Neuroimaging: A Pre-trained Generative Framework for Unified Representation of Neural Signals

Title: A Review of Various Datasets for Machine Learning Algorithm-Based Intrusion Detection System: Advances and Challenges

Title: Video-Level Language-Driven Video-Based Visible-Infrared Person Re-Identification

Title: Should LLM Safety Be More Than Refusing Harmful Instructions?

Title: SViMo: Synchronized Diffusion for Video and Motion Generation in Hand-object Interaction Scenarios

Title: IP-Dialog: Evaluating Implicit Personalization in Dialogue Systems with Synthetic Data

Title: Weak Supervision for Real World Graphs

Title: ANT: Adaptive Neural Temporal-Aware Text-to-Motion Model

Title: Multimodal DeepResearcher: Generating Text-Chart Interleaved Reports From Scratch with Agentic Framework

Title: ReSpace: Text-Driven 3D Scene Synthesis and Editing with Preference Alignment

Title: MidPO: Dual Preference Optimization for Safety and Helpfulness in Large Language Models via a Mixture of Experts Framework

Title: XToM: Exploring the Multilingual Theory of Mind for Large Language Models

Title: HRTR: A Single-stage Transformer for Fine-grained Sub-second Action Segmentation in Stroke Rehabilitation

Title: Generative Perception of Shape and Material from Differential Motion

Title: Towards Better De-raining Generalization via Rainy Characteristics Memorization and Replay

Title: FroM: Frobenius Norm-Based Data-Free Adaptive Model Merging

Title: BitBypass: A New Direction in Jailbreaking Aligned Large Language Models with Bitstream Camouflage

Title: ORPP: Self-Optimizing Role-playing Prompts to Enhance Language Model Capabilities

Title: Do Language Models Think Consistently? A Study of Value Preferences Across Varying Response Lengths

Title: Enhancing Large Language Models with Neurosymbolic Reasoning for Multilingual Tasks

Title: Flexiffusion: Training-Free Segment-Wise Neural Architecture Search for Efficient Diffusion Models

Title: Co-Evidential Fusion with Information Volume for Medical Image Segmentation

Title: Towards In-the-wild 3D Plane Reconstruction from a Single Image

Title: LumosFlow: Motion-Guided Long Video Generation

Title: KARE-RAG: Knowledge-Aware Refinement and Enhancement for RAG

Title: M$^3$FinMeeting: A Multilingual, Multi-Sector, and Multi-Task Financial Meeting Understanding Evaluation Dataset

Title: RelationAdapter: Learning and Transferring Visual Relation with Diffusion Transformers

Title: Answer Convergence as a Signal for Early Stopping in Reasoning

Title: VisuRiddles: Fine-grained Perception is a Primary Bottleneck for Multimodal Large Language Models in Abstract Visual Reasoning

Title: VerificAgent: Integrating Expert Knowledge and Fact-Checked Memory for Robust Domain-Specific Task Planning

Title: Rethinking Post-Unlearning Behavior of Large Vision-Language Models

Title: CoRe-MMRAG: Cross-Source Knowledge Reconciliation for Multimodal RAG

Title: Attention Knows Whom to Trust: Attention-based Trust Management for LLM Multi-Agent Systems

Title: CyberGym: Evaluating AI Agents' Cybersecurity Capabilities with Real-World Vulnerabilities at Scale

Title: Technical Report for Ego4D Long-Term Action Anticipation Challenge 2025

Title: Response-Level Rewards Are All You Need for Online Reinforcement Learning in LLMs: A Mathematical Perspective

Title: Kernel-based Unsupervised Embedding Alignment for Enhanced Visual Representation in Vision-language Models

Title: DCI: Dual-Conditional Inversion for Boosting Diffusion-Based Image Editing

Title: Pruning General Large Language Models into Customized Expert Models

Title: Privacy-Preserving Federated Convex Optimization: Balancing Partial-Participation and Efficiency via Noise Cancellation

Title: Contrast & Compress: Learning Lightweight Embeddings for Short Trajectories

Title: HATA: Trainable and Hardware-Efficient Hash-Aware Top-k Attention for Scalable Large Model Inference

Title: IndoSafety: Culturally Grounded Safety for LLMs in Indonesian Languages

Title: Evaluating Named Entity Recognition Models for Russian Cultural News Texts: From BERT to LLM

Title: On Generalization across Measurement Systems: LLMs Entail More Test-Time Compute for Underrepresented Cultures

Title: Beyond the Surface: Measuring Self-Preference in LLM Judgments

Title: EssayBench: Evaluating Large Language Models in Multi-Genre Chinese Essay Writing

Title: Hyperspectral Image Generation with Unmixing Guided Diffusion Model

Title: One-Step Diffusion-based Real-World Image Super-Resolution with Visual Perception Distillation

Title: Simple, Good, Fast: Self-Supervised World Models Free of Baggage

Title: Synthetic Iris Image Databases and Identity Leakage: Risks and Mitigation Strategies

Title: HAM: A Hyperbolic Step to Regulate Implicit Bias

Title: ControlMambaIR: Conditional Controls with State-Space Model for Image Restoration

Title: A Pretrained Probabilistic Transformer for City-Scale Traffic Volume Prediction

Title: Are Economists Always More Introverted? Analyzing Consistency in Persona-Assigned LLMs

Title: Tarallo: Evading Behavioral Malware Detectors in the Problem Space

Title: Beyond Invisibility: Learning Robust Visible Watermarks for Stronger Copyright Protection

Title: Small Aid, Big Leap: Efficient Test-Time Adaptation for Vision-Language Models with AdaptNet

Title: EvaLearn: Quantifying the Learning Capability and Efficiency of LLMs via Sequential Problem Solving

Title: Decentralized COVID-19 Health System Leveraging Blockchain

Title: Self-Disentanglement and Re-Composition for Cross-Domain Few-Shot Segmentation

Title: TL;DR: Too Long, Do Re-weighting for Efficient LLM Reasoning Compression

Title: Poster: FedBlockParadox -- A Framework for Simulating and Securing Decentralized Federated Learning

Title: Solving Inverse Problems with FLAIR

Title: Decompose, Plan in Parallel, and Merge: A Novel Paradigm for Large Language Models based Planning with Multiple Constraints

Title: MASTER: Enhancing Large Language Model via Multi-Agent Simulated Teaching

Title: XicorAttention: Time Series Transformer Using Attention with Nonlinear Correlation

Title: LayoutRAG: Retrieval-Augmented Model for Content-agnostic Conditional Layout Generation

Title: Smoothed Preference Optimization via ReNoise Inversion for Aligning Diffusion Models with Varied Human Preferences

Title: On Entity Identification in Language Models

Title: Privacy Leaks by Adversaries: Adversarial Iterations for Membership Inference Attack

Title: Heterogeneous Group-Based Reinforcement Learning for LLM-based Multi-Agent Systems

Title: RACE-Align: Retrieval-Augmented and Chain-of-Thought Enhanced Preference Alignment for Large Language Models

Title: GeneA-SLAM2: Dynamic SLAM with AutoEncoder-Preprocessed Genetic Keypoints Resampling and Depth Variance-Guided Dynamic Region Removal

Title: Open-PMC-18M: A High-Fidelity Large Scale Medical Dataset for Multimodal Representation Learning

Title: RobustSplat: Decoupling Densification and Dynamics for Transient-Free 3DGS

Title: Multi-task Learning with Active Learning for Arabic Offensive Speech Detection

Title: Investigating Mask-aware Prototype Learning for Tabular Anomaly Detection

Title: Exploiting the English Vocabulary Profile for L2 word-level vocabulary assessment with LLMs

Title: Unified Attention Modeling for Efficient Free-Viewing and Visual Search via Shared Representations

Title: A Dynamic Transformer Network for Vehicle Detection

Title: FreeScene: Mixed Graph Diffusion for 3D Scene Synthesis from Free Prompts

Title: Automated Measurement of Optic Nerve Sheath Diameter Using Ocular Ultrasound Video

Title: SemVink: Advancing VLMs' Semantic Understanding of Optical Illusions via Visual Global Thinking

Title: CART-based Synthetic Tabular Data Generation for Imbalanced Regression

Title: ProcrustesGPT: Compressing LLMs with Structured Matrices and Orthogonal Transformations

Title: TO-GATE: Clarifying Questions and Summarizing Responses with Trajectory Optimization for Eliciting Human Preference

Title: Random Registers for Cross-Domain Few-Shot Learning

Title: Go Beyond Earth: Understanding Human Actions and Scenes in Microgravity Environments

Title: METok: Multi-Stage Event-based Token Compression for Efficient Long Video Understanding

Title: Learning Pyramid-structured Long-range Dependencies for 3D Human Pose Estimation

Title: Hierarchical Self-Prompting SAM: A Prompt-Free Medical Image Segmentation Framework

Title: Enhancing Abnormality Identification: Robust Out-of-Distribution Strategies for Deepfake Detection

Title: ATAG: AI-Agent Application Threat Assessment with Attack Graphs

Title: BNPO: Beta Normalization Policy Optimization

Title: Pan-Arctic Permafrost Landform and Human-built Infrastructure Feature Detection with Vision Transformers and Location Embeddings

Title: Token and Span Classification for Entity Recognition in French Historical Encyclopedias

Title: CoT is Not True Reasoning, It Is Just a Tight Constraint to Imitate: A Theory Perspective

Title: GaRA-SAM: Robustifying Segment Anything Model with Gated-Rank Adaptation

Title: Overcoming Challenges of Partial Client Participation in Federated Learning: A Comprehensive Review

Title: Scaling Fine-Grained MoE Beyond 50B Parameters: Empirical Evaluation and Practical Insights

Title: Dense Match Summarization for Faster Two-view Estimation

Title: A Multi-Dialectal Dataset for German Dialect ASR and Dialect-to-Standard Speech Translation

Title: Sociodynamics-inspired Adaptive Coalition and Client Selection in Federated Learning

Title: Cell-o1: Training LLMs to Solve Single-Cell Reasoning Puzzles with Reinforcement Learning

Title: Towards Auto-Annotation from Annotation Guidelines: A Benchmark through 3D LiDAR Detection

Title: INESC-ID @ eRisk 2025: Exploring Fine-Tuned, Similarity-Based, and Prompt-Based Approaches to Depression Symptom Identification

Title: From Theory to Practice with RAVEN-UCB: Addressing Non-Stationarity in Multi-Armed Bandits through Variance Adaptation

Title: MTL-KD: Multi-Task Learning Via Knowledge Distillation for Generalizable Neural Vehicle Routing Solver

Title: MIND: Material Interface Generation from UDFs for Non-Manifold Surface Reconstruction

Title: An Algorithmic Pipeline for GDPR-Compliant Healthcare Data Anonymisation: Moving Toward Standardisation

Title: Quantitative LLM Judges

Title: Adaptive Graph Pruning for Multi-Agent Communication

Title: HACo-Det: A Study Towards Fine-Grained Machine-Generated Text Detection under Human-AI Coauthoring

Title: FlowerTune: A Cross-Domain Benchmark for Federated Fine-Tuning of Large Language Models

Title: FORLA: Federated Object-centric Representation Learning with Slot Attention

Title: Memory-Efficient and Privacy-Preserving Collaborative Training for Mixture-of-Experts LLMs

Title: Computation- and Communication-Efficient Online FL for Resource-Constrained Aerial Vehicles

Title: Expanding before Inferring: Enhancing Factuality in Large Language Models through Premature Layers Interpolation

Title: HaploOmni: Unified Single Transformer for Multimodal Video Understanding and Generation

Title: On the Robustness of Tabular Foundation Models: Test-Time Attacks and In-Context Defenses

Title: Astrophotography turbulence mitigation via generative models

Title: Performance of leading large language models in May 2025 in Membership of the Royal College of General Practitioners-style examination questions: a cross-sectional analysis

Title: It's Not a Walk in the Park! Challenges of Idiom Translation in Speech-to-text Systems

Title: A Multi-Agent Framework for Mitigating Dialect Biases in Privacy Policy Question-Answering Systems

Title: DFBench: Benchmarking Deepfake Image Detection Capability of Large Multimodal Models

Title: Conditioning Large Language Models on Legal Systems? Detecting Punishable Hate Speech

Title: Leveraging Information Retrieval to Enhance Spoken Language Understanding Prompts in Few-Shot Learning

Title: Towards Analyzing and Understanding the Limitations of VAPO: A Theoretical Perspective

Title: Sample complexity of Schrödinger potential estimation

Title: Facts Do Care About Your Language: Assessing Answer Quality of Multilingual LLMs

Title: Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers

Title: EDITOR: Effective and Interpretable Prompt Inversion for Text-to-Image Diffusion Models

Title: LEG-SLAM: Real-Time Language-Enhanced Gaussian Splatting for SLAM

Title: Agnostic Learning under Targeted Poisoning: Optimal Rates and the Role of Randomness

Title: StreamBP: Memory-Efficient Exact Backpropagation for Long Sequence Training of LLMs

Title: ORV: 4D Occupancy-centric Robot Video Generation

Title: SG2VID: Scene Graphs Enable Fine-Grained Control for Video Synthesis

Title: InterMamba: Efficient Human-Human Interaction Generation with Adaptive Spatio-Temporal Mamba

Title: Non-Asymptotic Length Generalization

Title: How Explanations Leak the Decision Logic: Stealing Graph Neural Networks via Explanation Alignment

Title: Explicitly Modeling Subcortical Vision with a Neuro-Inspired Front-End Improves CNN Robustness

Title: From Flat to Hierarchical: Extracting Sparse Representations with Matching Pursuit

Title: FuseLIP: Multimodal Embeddings via Early Fusion of Discrete Tokens

Title: EgoVLM: Policy Optimization for Egocentric Video Understanding

Title: Critique-GRPO: Advancing LLM Reasoning with Natural Language and Numerical Feedback

Title: ByteMorph: Benchmarking Instruction-Guided Image Editing with Non-Rigid Motions

Title: Revisiting Continuity of Image Tokens for Cross-domain Few-shot Learning

Title: Rectified Flows for Fast Multiscale Fluid Flow Modeling

Title: Zero-Shot Tree Detection and Segmentation from Aerial Forest Imagery

Title: Controllable Human-centric Keyframe Interpolation with Generative Prior

Title: AUTOCIRCUIT-RL: Reinforcement Learning-Driven LLM for Automated Circuit Topology Generation

Title: DCM: Dual-Expert Consistency Model for Efficient and High-Quality Video Generation

Title: AnimeShooter: A Multi-Shot Animation Dataset for Reference-Guided Video Generation

Title: Native-Resolution Image Synthesis

Title: SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation

Title: Not All Tokens Are Meant to Be Forgotten

Title: GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents

Title: Entity-Augmented Neuroscience Knowledge Retrieval Using Ontology and Semantic Understanding Capability of LLM

Title: UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation

Title: IllumiCraft: Unified Geometry and Illumination Diffusion for Controllable Video Generation