2025-06-23

Title: Veracity: An Open-Source AI Fact-Checking System

Title: Rethinking LLM Training through Information Geometry and Quantum Metrics

Title: MEM1: Learning to Synergize Memory and Reasoning for Efficient Long-Horizon Agents

Title: Finance Language Model Evaluation (FLaME)

Title: Entropy-Driven Pre-Tokenization for Byte-Pair Encoding

Title: Language Models can perform Single-Utterance Self-Correction of Perturbed Reasoning

Title: From RAG to Agentic: Validating Islamic-Medicine Responses with LLM Agents

Title: Reranking-based Generation for Unbiased Perspective Summarization

Title: From General to Targeted Rewards: Surpassing GPT-4 in Open-Ended Long-Context Generation

Title: EvoLM: In Search of Lost Language Model Training Dynamics

Title: Enhancing Document-Level Question Answering via Multi-Hop Retrieval-Augmented Generation with LLaMA 3

Title: DynScaling: Efficient Verifier-free Inference Scaling via Dynamic and Integrated Sampling

Title: Self-Critique-Guided Curiosity Refinement: Enhancing Honesty and Helpfulness in Large Language Models via In-Context Learning

Title: FinCoT: Grounding Chain-of-Thought in Expert Financial Reasoning

Title: Under the Shadow of Babel: How Language Shapes Reasoning in LLMs

Title: SGIC: A Self-Guided Iterative Calibration Framework for RAG

Title: JETHICS: Japanese Ethics Understanding Evaluation Dataset

Title: Comparative Analysis of Abstractive Summarization Models for Clinical Radiology Reports

Title: PL-Guard: Benchmarking Language Model Safety for Polish

Title: Analyzing the Influence of Knowledge Graph Information on Relation Extraction

Title: Can structural correspondences ground real world representational content in Large Language Models?

Title: InstructTTSEval: Benchmarking Complex Natural-Language Instruction Following in Text-to-Speech Systems

Title: Large Language Models in Argument Mining: A Survey

Title: RiOT: Efficient Prompt Refinement with Residual Optimization Tree

Title: From LLM-anation to LLM-orchestrator: Coordinating Small Models for Data Labeling

Title: OJBench: A Competition Level Code Benchmark For Large Language Models

Title: NepaliGPT: A Generative Language Model for the Nepali Language

Title: When Does Divide and Conquer Work for Long Context LLM? A Noise Decomposition Framework

Title: REIS: A High-Performance and Energy-Efficient Retrieval System with In-Storage Processing

Title: StoryWriter: A Multi-Agent Framework for Long Story Generation

Title: Towards Generalizable Generic Harmful Speech Datasets for Implicit Hate Speech Detection

Title: Relic: Enhancing Reward Model Generalization for Low-Resource Indic Languages with Few-Shot Examples

Title: Measuring (a Sufficient) World Model in LLMs: A Variance Decomposition Framework

Title: A Scoping Review of Synthetic Data Generation for Biomedical Research and Applications

Title: Initial Investigation of LLM-Assisted Development of Rule-Based Clinical NLP System

Title: Arch-Router: Aligning LLM Routing with Human Preferences

Title: Mechanisms vs. Outcomes: Probing for Syntax Fails to Explain Performance on Targeted Syntactic Evaluations

Title: LegiGPT: Party Politics and Transport Policy with Large Language Model

Title: ReasonGRM: Enhancing Generative Reward Models through Large Reasoning Models

Title: The Role of Model Confidence on Bias Effects in Measured Uncertainties

Title: LM-SPT: LM-Aligned Semantic Distillation for Speech Tokenization

Title: Language-Informed Synthesis of Rational Agent Models for Grounded Theory-of-Mind Reasoning On-The-Fly

Title: SocialSim: Towards Socialized Simulation of Emotional Support Conversation

Title: Cross-Modal Obfuscation for Jailbreak Attacks on Large Vision-Language Models

Title: DistillNote: LLM-based clinical note summaries improve heart failure diagnosis

Title: MIST: Jailbreaking Black-box Large Language Models via Iterative Semantic Tuning

Title: From Data to Knowledge: Evaluating How Efficiently Language Models Learn Facts

Title: Language Bottleneck Models: A Framework for Interpretable Knowledge Tracing and Beyond

Title: TeXpert: A Multi-Level Benchmark for Evaluating LaTeX Code Generation by LLMs

Title: PersonalAI: Towards digital twins in the graph form

Title: LLM-Generated Feedback Supports Learning If Learners Choose to Use It

Title: Instituto de Telecomunicações at IWSLT 2025: Aligning Small-Scale Speech and Language Models for Speech-to-Text Learning

Title: MUCAR: Benchmarking Multilingual Cross-Modal Ambiguity Resolution for Multimodal Large Language Models

Title: Simultaneous Translation with Offline Speech and LLM Models in CUNI Submission to IWSLT 2025

Title: Tower+: Bridging Generality and Translation Specialization in Multilingual LLMs

Title: Chain-of-Thought Prompting Obscures Hallucination Cues in Large Language Models: An Empirical Evaluation

Title: Better Language Model Inversion by Compactly Representing Next-Token Distributions

Title: Cache Me If You Can: How Many KVs Do You Need for Effective Long-Context LMs?

Title: CLEAR-3K: Assessing Causal Explanatory Capabilities in Language Models

Title: Towards AI Search Paradigm

Title: Fine-Tuning Lowers Safety and Disrupts Evaluation Consistency