language model

Title: Foundation Models for Weather and Climate Data Understanding: A Comprehensive Survey. (arXiv:2312.03014v1 [cs.LG])

Title: Beyond Isolation: Multi-Agent Synergy for Improving Knowledge Graph Construction. (arXiv:2312.03022v1 [cs.AI])

Title: Evaluating Agents using Social Choice Theory. (arXiv:2312.03121v1 [cs.AI])

Title: FlexModel: A Framework for Interpretability of Distributed Large Language Models. (arXiv:2312.03140v1 [cs.LG])

Title: Teaching Specific Scientific Knowledge into Large Language Models through Additional Training. (arXiv:2312.03360v1 [cs.CL])

Title: Not All Large Language Models (LLMs) Succumb to the "Reversal Curse": A Comparative Study of Deductive Logical Reasoning in BERT and GPT Models. (arXiv:2312.03633v1 [cs.CL])

Title: Generative agent-based modeling with actions grounded in physical, social, or digital space using Concordia. (arXiv:2312.03664v1 [cs.AI])

Title: Assertion Enhanced Few-Shot Learning: Instructive Technique for Large Language Models to Generate Educational Explanations. (arXiv:2312.03122v1 [cs.CL])

Title: Corporate Bankruptcy Prediction with Domain-Adapted BERT. (arXiv:2312.03194v1 [cs.CL])

Title: Measuring Misogyny in Natural Language Generation: Preliminary Results from a Case Study on two Reddit Communities. (arXiv:2312.03330v1 [cs.CL])

Title: Compressed Context Memory For Online Language Model Interaction. (arXiv:2312.03414v1 [cs.LG])

Title: Think from Words(TFW): Initiating Human-Like Cognition in Large Language Models Through Think from Words for Japanese Text-level Classification. (arXiv:2312.03458v1 [cs.CL])

Title: DBCopilot: Scaling Natural Language Querying to Massive Databases. (arXiv:2312.03463v1 [cs.CL])

Title: Sig-Networks Toolkit: Signature Networks for Longitudinal Language Modelling. (arXiv:2312.03523v1 [cs.CL])

Title: Holmes: Towards Distributed Training Across Clusters with Heterogeneous NIC Environment. (arXiv:2312.03549v1 [cs.CL])

Title: Evaluating and Mitigating Discrimination in Language Model Decisions. (arXiv:2312.03689v1 [cs.CL])

gpt

Title: XAIQA: Explainer-Based Data Augmentation for Extractive Question Answering. (arXiv:2312.03567v1 [cs.CL])

llm

Title: Inherent limitations of LLMs regarding spatial information. (arXiv:2312.03042v1 [cs.CL])

Title: Clinical Notes Reveal Physician Fatigue. (arXiv:2312.03077v1 [cs.CL])

Title: LLMs for Multi-Modal Knowledge Extraction and Analysis in Intelligence/Safety-Critical Applications. (arXiv:2312.03088v1 [cs.CL])

long context

lora

Title: Deep Reinforcement Learning for Community Battery Scheduling under Uncertainties of Load, PV Generation, and Energy Prices. (arXiv:2312.03008v1 [cs.LG])

Title: I-PHYRE: Interactive Physical Reasoning. (arXiv:2312.03009v1 [cs.AI])

Title: Speculative Exploration on the Concept of Artificial Agents Conducting Autonomous Research. (arXiv:2312.03497v1 [cs.AI])

Title: Active Learning for Abrupt Shifts Change-point Detection via Derivative-Aware Gaussian Processes. (arXiv:2312.03176v1 [cs.LG])

Title: Constrained Bayesian Optimization Under Partial Observations: Balanced Improvements and Provable Convergence. (arXiv:2312.03212v1 [cs.LG])

Title: Run LoRA Run: Faster and Lighter LoRA Implementations. (arXiv:2312.03415v1 [cs.LG])

Title: Search Strategies for Self-driving Laboratories with Pending Experiments. (arXiv:2312.03466v1 [cs.LG])

hallucination

prompt

Title: Exploring Answer Information Methods for Question Generation with Transformers. (arXiv:2312.03483v1 [cs.CL])

Title: PROMISE: A Framework for Model-Driven Stateful Prompt Orchestration. (arXiv:2312.03699v1 [cs.CL])

code

Title: Learning Multi-graph Structure for Temporal Knowledge Graph Reasoning. (arXiv:2312.03004v1 [cs.AI])

Title: Training on Synthetic Data Beats Real Data in Multimodal Relation Extraction. (arXiv:2312.03025v1 [cs.AI])

Title: Generating Interpretable Networks using Hypernetworks. (arXiv:2312.03051v1 [cs.LG])

Title: Can language agents be alternatives to PPO? A Preliminary Empirical Study On OpenAI Gym. (arXiv:2312.03290v1 [cs.AI])

Title: Dyport: Dynamic Importance-based Hypothesis Generation Benchmarking Technique. (arXiv:2312.03303v1 [cs.AI])

Title: Lazy-k: Decoding for Constrained Token Classification. (arXiv:2312.03367v1 [cs.CL])

Title: A Text-to-Text Model for Multilingual Offensive Language Identification. (arXiv:2312.03379v1 [cs.CL])

Title: Interpretability Illusions in the Generalization of Simplified Models. (arXiv:2312.03656v1 [cs.LG])

Title: Transformer-Based Deep Learning Model for Bored Pile Load-Deformation Prediction in Bangkok Subsoil. (arXiv:2312.03041v1 [cs.LG])

Title: REST: Enhancing Group Robustness in DNNs through Reweighted Sparse Training. (arXiv:2312.03044v1 [cs.LG])

Title: Multitask Learning Can Improve Worst-Group Outcomes. (arXiv:2312.03151v1 [cs.LG])

Title: CAFE: Towards Compact, Adaptive, and Fast Embedding for Large-scale Recommendation Models. (arXiv:2312.03256v1 [cs.LG])

Title: Enhancing Molecular Property Prediction via Mixture of Collaborative Experts. (arXiv:2312.03292v1 [cs.LG])

Title: Interpretable Mechanistic Representations for Meal-level Glycemic Control in the Wild. (arXiv:2312.03344v1 [cs.LG])

chat

retrieval augmented generation

rag

Title: Data-Driven Traffic Reconstruction and Kernel Methods for Identifying Stop-and-Go Congestion. (arXiv:2312.03186v1 [cs.LG])

Title: Deep Multimodal Fusion for Surgical Feedback Classification. (arXiv:2312.03231v1 [cs.LG])

Title: Approximating Solutions to the Knapsack Problem using the Lagrangian Dual Framework. (arXiv:2312.03413v1 [cs.LG])

Title: Optimizing Two-Pass Cross-Lingual Transfer Learning: Phoneme Recognition and Phoneme to Grapheme Translation. (arXiv:2312.03312v1 [cs.CL])

Title: Domain Invariant Representation Learning and Sleep Dynamics Modeling for Automatic Sleep Staging. (arXiv:2312.03196v1 [cs.LG])

Title: Physical Symbolic Optimization. (arXiv:2312.03612v1 [cs.LG])

Title: On the Role of Edge Dependency in Graph Generative Models. (arXiv:2312.03691v1 [cs.LG])

multi-run

chain-of-thought

tree-of-thought