2025-02-18

Title: Hallucinations and Truth: A Comprehensive Accuracy Evaluation of RAG, LoRA and DoRA

Title: Man Made Language Models? Evaluating LLMs' Perpetuation of Masculine Generics Bias

Title: Named entity recognition for Serbian legal documents: Design, methodology and dataset development

Title: Post-training an LLM for RAG? Train on Self-Generated Demonstrations

Title: Retrieval-augmented Encoders for Extreme Multi-label Text Classification

Title: Lost in the Passage: Passage-level In-context Learning Does Not Necessarily Need a "Passage"

Title: BabyLM Turns 3: Call for papers for the 2025 BabyLM workshop

Title: User Profile with Large Language Models: Construction, Updating, and Benchmarking

Title: Exploring Synaptic Resonance in Large Language Models: A Novel Approach to Contextual Memory Integration

Title: Injecting Domain-Specific Knowledge into Large Language Models: A Comprehensive Survey

Title: An Empirical Analysis of Uncertainty in Large Language Model Evaluations

Title: OPTISHEAR: Towards Efficient and Adaptive Pruning of Large Language Models via Evolutionary Optimization

Title: BASE-SQL: A powerful open source Text-To-SQL baseline approach

Title: 1bit-Merging: Dynamic Quantized Merging for Large Language Models

Title: LoRE-Merging: Exploring Low-Rank Estimation For Large Language Model Merging

Title: Why is prompting hard? Understanding prompts on binary sequence predictors

Title: Back Attention: Understanding and Enhancing Multi-Hop Reasoning in Large Language Models

Title: Multilingual Encoder Knows more than You Realize: Shared Weights Pretraining for Extremely Low-Resource Languages

Title: Towards Effective Extraction and Evaluation of Factual Claims

Title: Divergent Thoughts toward One Goal: LLM-based Multi-Agent Collaboration System for Electronic Design Automation

Title: NitiBench: A Comprehensive Studies of LLM Frameworks Capabilities for Thai Legal Question Answering

Title: The Representation and Recall of Interwoven Structured Knowledge in LLMs: A Geometric and Layered Analysis

Title: CiteCheck: Towards Accurate Citation Faithfulness Detection

Title: MET-Bench: Multimodal Entity Tracking for Evaluating the Limitations of Vision-Language and Reasoning Models

Title: Developing Conversational Speech Systems for Robots to Detect Speech Biomarkers of Cognition in People Living with Dementia

Title: Enhancing Conversational Agents from Open-Source Large Language Models with Illocutionary Force and Document-Based Knowledge Retrieval

Title: Fundamental Principles of Linguistic Structure are Not Represented by o3

Title: Exploring Contextual Flux in Large Language Models: A Novel Approach to Self-Modulating Semantic Networks

Title: Neural Networks Remember More: The Power of Parameter Isolation and Combination

Title: FinMTEB: Finance Massive Text Embedding Benchmark

Title: RoseRAG: Robust Retrieval-augmented Generation with Small-scale LLMs via Margin-aware Preference Optimization

Title: Evaluating Large language models on Understanding Korean indirect Speech acts

Title: RAS: Retrieval-And-Structuring for Knowledge-Intensive LLM Generation

Title: CounterBench: A Benchmark for Counterfactuals Reasoning in Large Language Models

Title: GRIFFIN: Effective Token Alignment for Faster Speculative Decoding

Title: TUMLU: A Unified and Native Language Understanding Benchmark for Turkic Languages

Title: MultiTEND: A Multilingual Benchmark for Natural Language to NoSQL Query Translation

Title: Mind the Confidence Gap: Overconfidence, Calibration, and Distractor Effects in Large Language Models

Title: MMUNLEARNER: Reformulating Multimodal Machine Unlearning in the Era of Multimodal Large Language Models

Title: Reasoning-Augmented Conversation for Multi-Turn Jailbreak Attacks on Large Language Models

Title: Beyond Similarity: A Gradient-based Graph Method for Instruction Tuning Data Selection

Title: CARMA: Enhanced Compositionality in LLMs via Advanced Regularisation and Mutual Information Alignment

Title: Demystifying Hateful Content: Leveraging Large Multimodal Models for Hateful Meme Detection with Explainable Decisions

Title: Exposing Numeracy Gaps: A Benchmark to Evaluate Fundamental Numerical Abilities in Large Language Models

Title: DEEPER Insight into Your User: Directed Persona Refinement for Dynamic Persona Modeling

Title: Streamlining the Collaborative Chain of Models into A Single Forward Pass in Generation-Based Tasks

Title: Rewrite to Jailbreak: Discover Learnable and Transferable Implicit Harmfulness Instruction

Title: Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Title: SafeDialBench: A Fine-Grained Safety Benchmark for Large Language Models in Multi-Turn Dialogues with Diverse Jailbreak Attacks

Title: A Survey of Large Language Models in Psychotherapy: Current Landscape and Future Directions

Title: Towards Achieving Concept Completeness for Unsupervised Textual Concept Bottleneck Models

Title: CacheFocus: Dynamic Cache Re-Positioning for Efficient Retrieval-Augmented Generation

Title: Knowledge Graph-Driven Retrieval-Augmented Generation: Integrating Deepseek-R1 with Weaviate for Advanced Chatbot Applications

Title: Valuable Hallucinations: Realizable Non-realistic Propositions

Title: Beyond Pairwise: Global Zero-shot Temporal Graph Generation

Title: DuplexMamba: Enhancing Real-time Speech Conversations with Duplex and Streaming Capabilities

Title: FELLE: Autoregressive Speech Synthesis with Token-Wise Coarse-to-Fine Flow Matching

Title: Improving Similar Case Retrieval Ranking Performance By Revisiting RankSVM

Title: Safety Evaluation of DeepSeek Models in Chinese Contexts

Title: Leveraging Constrained Monte Carlo Tree Search to Generate Reliable Long Chain-of-Thought for Mathematical Reasoning

Title: Investigating Language Preference of Multilingual RAG Systems

Title: LogiDynamics: Unraveling the Dynamics of Logical Inference in Large Language Model Reasoning

Title: The Mirage of Model Editing: Revisiting Evaluation in the Wild

Title: Don't Get Lost in the Trees: Streamlining LLM Reasoning by Overcoming Tree Search Exploration Pitfalls

Title: Can't See the Forest for the Trees: Benchmarking Multimodal Safety Awareness for Multimodal LLMs

Title: TituLLMs: A Family of Bangla LLMs with Comprehensive Benchmarking

Title: ReLearn: Unlearning via Learning for Large Language Models

Title: Large Language Models Penetration in Scholarly Writing and Peer Review

Title: A Survey of LLM-based Agents in Medicine: How far are we from Baymax?

Title: Asymmetric Conflict and Synergy in Post-training for LLM-based Multilingual Machine Translation

Title: Vendi-RAG: Adaptively Trading-Off Diversity And Quality Significantly Improves Retrieval Augmented Generation With LLMs

Title: Soteria: Language-Specific Functional Parameter Steering for Multilingual Safety Alignment

Title: Uncertainty-Aware Step-wise Verification with Generative Reward Models

Title: Leveraging Conditional Mutual Information to Improve Large Language Model Fine-Tuning For Classification

Title: The Shrinking Landscape of Linguistic Diversity in the Age of Large Language Models

Title: Improved Unbiased Watermark for Large Language Models

Title: Cuckoo: An IE Free Rider Hatched by Massive Nutrition in LLM's Nest

Title: The Rotary Position Embedding May Cause Dimension Inefficiency in Attention Heads for Long-Distance Retrieval

Title: CORDIAL: Can Multimodal Large Language Models Effectively Understand Coherence Relationships?

Title: Smoothing Out Hallucinations: Mitigating LLM Hallucination with Smoothed Knowledge Distillation

Title: System Message Generation for User Preferences using Open-Source Models

Title: ExaGPT: Example-Based Machine-Generated Text Detection for Human Interpretability

Title: "Nuclear Deployed!": Analyzing Catastrophic Risks in Decision-making of Autonomous LLM Agents

Title: VLDBench: Vision Language Models Disinformation Detection Benchmark

Title: Blessing of Multilinguality: A Systematic Analysis of Multilingual In-Context Learning

Title: LLMs can Perform Multi-Dimensional Analytic Writing Assessments: A Case Study of L2 Graduate-Level Academic English Writing

Title: Exploring the Small World of Word Embeddings: A Comparative Study on Conceptual Spaces from LLMs of Different Scales

Title: RoleMRC: A Fine-Grained Composite Benchmark for Role-Playing and Instruction-Following

Title: HellaSwag-Pro: A Large-Scale Bilingual Benchmark for Evaluating the Robustness of LLMs in Commonsense Reasoning

Title: Revisiting Robust RAG: Do We Still Need Complex Robust Training in the Era of Powerful LLMs?

Title: Following the Autoregressive Nature of LLM Embeddings via Compression and Alignment

Title: ToolCoder: A Systematic Code-Empowered Tool Learning Framework for Large Language Models

Title: LayAlign: Enhancing Multilingual Reasoning in Large Language Models via Layer-Wise Adaptive Fusion and Alignment Strategy

Title: InsBank: Evolving Instruction Subset for Ongoing Alignment

Title: Exploring Persona Sentiment Sensitivity in Personalized Dialogue Generation

Title: Counterfactual-Consistency Prompting for Relative Temporal Understanding in Large Language Models

Title: Do we Really Need Visual Instructions? Towards Visual Instruction-Free Fine-tuning for Large Vision-Language Models

Title: SAFE-SQL: Self-Augmented In-Context Learning with Fine-grained Example Selection for Text-to-SQL

Title: An Efficient Row-Based Sparse Fine-Tuning

Title: Which Retain Set Matters for LLM Unlearning? A Case Study on Entity Unlearning

Title: Does RAG Really Perform Bad For Long-Context Processing?

Title: From Personas to Talks: Revisiting the Impact of Personas on LLM-Synthesized Emotional Support Conversations

Title: UniCBE: An Uniformity-driven Comparing Based Evaluation Framework with Unified Multi-Objective Optimization

Title: Aligning Sentence Simplification with ESL Learner's Proficiency for Language Acquisition

Title: UnitCoder: Scalable Iterative Code Synthesis with Unit Test Guidance

Title: GLTW: Joint Improved Graph Transformer and LLM via Three-Word Language for Knowledge Graph Completion

Title: FastMCTS: A Simple Sampling Strategy for Data Synthesis

Title: Ontology-Guided Reverse Thinking Makes Large Language Models Stronger on Knowledge Graph Question Answering

Title: DAST: Context-Aware Compression in LLMs via Dynamic Allocation of Soft Tokens

Title: Stop Looking for Important Tokens in Multimodal Language Models: Duplication Matters More

Title: Balanced Multi-Factor In-Context Learning for Multilingual Large Language Models

Title: Token Pruning in Multimodal Large Language Models: Are We Solving the Right Problem?

Title: Chinese Spelling Correction: A Comprehensive Survey of Progress, Challenges, and Opportunities

Title: Investigating Inference-time Scaling for Chain of Multi-modal Thought: A Preliminary Study

Title: Learning to Keep a Promise: Scaling Language Model Decoding Parallelism with Learned Asynchronous Decoding

Title: AURORA: Automated Training Framework of Universal Process Reward Models via Ensemble Prompting and Reverse Verification

Title: Training Large Language Models to be Better Rule Followers

Title: Be Cautious When Merging Unfamiliar LLMs: A Phishing Model Capable of Stealing Privacy

Title: MuSC: Improving Complex Instruction Following with Multi-granularity Self-Contrastive Training

Title: Evaluating o1-Like LLMs: Unlocking Reasoning for Translation through Comprehensive Analysis

Title: DCAD-2000: A Multilingual Dataset across 2000+ Languages with Data Cleaning as Anomaly Detection

Title: Auto-Search and Refinement: An Automated Framework for Gender Bias Mitigation in Large Language Models

Title: Reinforced Information Retrieval

Title: Towards Reasoning Ability of Small Language Models

Title: FaMTEB: Massive Text Embedding Benchmark in Persian Language

Title: InfiR: Crafting Effective Small Language Models and Multimodal Small Language Models in Reasoning

Title: Language Complexity Measurement as a Noisy Zero-Shot Proxy for Evaluating LLM Performance

Title: Can LLM Watermarks Robustly Prevent Unauthorized Knowledge Distillation?

Title: DR.GAP: Mitigating Bias in Large Language Models using Gender-Aware Prompting with Demonstration and Reasoning

Title: Is Human-Like Text Liked by Humans? Multilingual Human Detection and Preference Against AI

Title: Uncovering the Impact of Chain-of-Thought Reasoning for Direct Preference Optimization: Lessons from Text-to-SQL

Title: Diversity-Oriented Data Augmentation with Large Language Models

Title: Towards Fully Exploiting LLM Internal States to Enhance Knowledge Boundary Perception

Title: RIDE: Enhancing Large Language Model Alignment through Restyled In-Context Learning Demonstration Exemplars

Title: MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task

Title: Improve LLM-as-a-Judge Ability as a General Ability

Title: CMQCIC-Bench: A Chinese Benchmark for Evaluating Large Language Models in Medical Quality Control Indicator Calculation

Title: LLM Agents Making Agent Tools

Title: Ad-hoc Concept Forming in the Game Codenames as a Means for Evaluating Large Language Models

Title: "See the World, Discover Knowledge": A Chinese Factuality Evaluation for Large Vision Language Models

Title: Plant in Cupboard, Orange on Table, Book on Shelf. Benchmarking Practical Reasoning and Situation Modelling in a Text-Simulated Situated Environment

Title: MT-RAIG: Novel Benchmark and Evaluation Framework for Retrieval-Augmented Insight Generation over Multiple Tables

Title: ReviewEval: An Evaluation Framework for AI-Generated Reviews

Title: Warmup-Distill: Bridge the Distribution Mismatch between Teacher and Student before Knowledge Distillation

Title: The Validation Gap: A Mechanistic Analysis of How Language Models Compute Arithmetic but Fail to Validate It

Title: Efficient Response Generation Method Selection for Fine-Tuning Large Language Models

Title: Personality Editing for Language Models through Relevant Knowledge Editing

Title: Exploring Translation Mechanism of Large Language Models

Title: FineFilter: A Fine-grained Noise Filtering Mechanism for Retrieval-Augmented Large Language Models

Title: Towards Understanding Fine-Tuning Mechanisms of LLMs via Circuit Analysis

Title: M-ABSA: A Multilingual Dataset for Aspect-Based Sentiment Analysis

Title: Code-Vision: Evaluating Multimodal LLMs Logic Understanding and Code Generation Capabilities

Title: Text Classification in the LLM Era - Where do we stand?

Title: Can LLM Agents Maintain a Persona in Discourse?

Title: LLMs as a synthesis between symbolic and continuous approaches to language

Title: Exploring Large Language Models in Healthcare: Insights into Corpora Sources, Customization Strategies, and Evaluation Metrics

Title: Understanding In-Context Machine Translation for Low-Resource Languages: A Case Study on Manchu

Title: Southern Newswire Corpus: A Large-Scale Dataset of Mid-Century Wire Articles Beyond the Front Page

Title: VAQUUM: Are Vague Quantifiers Grounded in Visual Data?

Title: Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o Under Data Scarsity

Title: MMRC: A Large-Scale Benchmark for Understanding Multimodal Large Language Model in Real-World Conversation

Title: EssayJudge: A Multi-Granular Benchmark for Assessing Automated Essay Scoring Capabilities of Multimodal Large Language Models

Title: BRIGHTER: BRIdging the Gap in Human-Annotated Textual Emotion Recognition Datasets for 28 Languages

Title: On Representational Dissociation of Language and Arithmetic in Large Language Models

Title: Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction

Title: Can Your Uncertainty Scores Detect Hallucinated Entity?

Title: Navigating the Helpfulness-Truthfulness Trade-Off with Uncertainty-Aware Instruction Fine-Tuning

Title: Generating Text from Uniform Meaning Representation

Title: Presumed Cultural Identity: How Names Shape LLM Responses

Title: Merging Language and Domain Specific Models: The Impact on Technical Vocabulary Acquisition

Title: Atom of Thoughts for Markov LLM Test-Time Scaling

Title: Teaching LLMs According to Their Aptitude: Adaptive Reasoning for Mathematical Problem Solving

Title: A Dual-Perspective NLG Meta-Evaluation Framework with Automatic Benchmark and Better Interpretability

Title: Designing Role Vectors to Improve LLM Inference Behaviour

Title: AI-generated Text Detection with a GLTR-based Approach

Title: Formalizing Complex Mathematical Statements with LLMs: A Study on Mathematical Definitions

Title: TokenSkip: Controllable Chain-of-Thought Compression in LLMs

Title: Can LLMs Simulate Social Media Engagement? A Study on Action-Guided Response Generation

Title: AdaSplash: Adaptive Sparse Flash Attention

Title: VLM$^2$-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues

Title: Personality Structured Interview for Large Language Model Simulation in Personality Research

Title: A-MEM: Agentic Memory for LLM Agents

Title: On the Query Complexity of Verifier-Assisted Language Generation

Title: SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs

Title: REVERSUM: A Multi-staged Retrieval-Augmented Generation Method to Enhance Wikipedia Tail Biographies through Personal Narratives

Title: Idiosyncrasies in Large Language Models