2025-08-04

Title: PhysicsEval: Inference-Time Techniques to Improve the Reasoning Proficiency of Large Language Models on Physics Problems

Title: Do LLMs produce texts with "human-like" lexical diversity?

Title: FACTORY: A Challenging Human-Verified Prompt Set for Long-Form Factuality

Title: Comparison of Large Language Models for Deployment Requirements

Title: Tabular Data Understanding with LLMs: A Survey of Recent Advances and Challenges

Title: Semantic Compression for Word and Sentence Embeddings using Discrete Wavelet Transform

Title: Model Misalignment and Language Change: Traces of AI-Associated Language in Unscripted Spoken English

Title: Integrating clinical reasoning into large language model-based diagnosis through etiology-aware attention steering

Title: Systematic Evaluation of Optimization Techniques for Long-Context Language Models

Title: PilotRL: Training Language Model Agents via Global Planning-Guided Progressive Reinforcement Learning

Title: Lucy: edgerunning agentic web search on mobile with machine generated task vectors

Title: EdgeInfinite-Instruct: Bridging SFT-Based Optimization and NPU-Level Efficiency for Edge Devices

Title: Multi-Layer Attention is the Amplifier of Demonstration Effectiveness

Title: SA-GCS: Semantic-Aware Gaussian Curriculum Scheduling for UAV Vision-Language Navigation

Title: ReaGAN: Node-as-Agent-Reasoning Graph Agentic Network

Title: Learning an Efficient Multi-Turn Dialogue Evaluator from Multiple Judges

Title: GETALP@AutoMin 2025: Leveraging RAG to Answer Questions based on Meeting Transcripts

Title: EFlat-LoRA: Efficiently Seeking Flat Minima for Better Generalization in Fine-Tuning Large Language Models and Beyond

Title: PaPaformer: Language Model from Pre-trained Paraller Paths

Title: SynAdapt: Learning Adaptive Reasoning in Large Language Models via Synthetic Continuous Chain-of-Thought

Title: A Context-Aware Dual-Metric Framework for Confidence Estimation in Large Language Models

Title: Prompting Science Report 3: I'll pay you or I'll kill you -- but will you care?

Title: DACTYL: Diverse Adversarial Corpus of Texts Yielded from Large Language Models

Title: Medical Reasoning in the Era of LLMs: A Systematic Review of Enhancement Techniques and Applications

Title: MELAC: Massive Evaluation of Large Language Models with Alignment of Culture in Persian Language

Title: Team "better_call_claude": Style Change Detection using a Sequential Sentence Pair Classifier

Title: Better Call Claude: Can LLMs Detect Changes of Writing Style?

Title: NyayaRAG: Realistic Legal Judgment Prediction with RAG under the Indian Common Law System

Title: Dynamically Adaptive Reasoning via LLM-Guided MCTS for Efficient and Context-Aware KGQA

Title: Out-of-Context Abduction: LLMs Make Inferences About Procedural Data Leveraging Declarative Facts in Earlier Training Data

Title: Applying Psychometrics to Large Language Model Simulated Populations: Recreating the HEXACO Personality Inventory Experiment with Generative Agents

Title: Agentic large language models improve retrieval-based radiology question answering

Title: GLiDRE: Generalist Lightweight model for Document-level Relation Extraction

Title: MMBERT: Scaled Mixture-of-Experts Multimodal BERT for Robust Chinese Hate Speech Detection under Cloaking Perturbations

Title: ITUNLP at SemEval-2025 Task 8: Question-Answering over Tabular Data: A Zero-Shot Approach using LLM-Driven Code Generation

Title: Do They Understand Them? An Updated Evaluation on Nonbinary Pronoun Handling in Large Language Models

Title: Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models