2024-01-05

language model

Title: Generalist embedding models are better at short-context clinical semantic search than specialized embedding models. (arXiv:2401.01943v1 [cs.CL])

Title: A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity. (arXiv:2401.01967v1 [cs.CL])

Title: Revisiting Zero-Shot Abstractive Summarization in the Era of Large Language Models from the Perspective of Position Bias. (arXiv:2401.01989v1 [cs.CL])

Title: Self-Contrast: Better Reflection Through Inconsistent Solving Perspectives. (arXiv:2401.02009v1 [cs.CL])

Title: DCR-Consistency: Divide-Conquer-Reasoning for Consistency Evaluation and Improvement of Large Language Models. (arXiv:2401.02132v1 [cs.CL])

Title: TinyLlama: An Open-Source Small Language Model. (arXiv:2401.02385v1 [cs.CL])

Title: ICE-GRT: Instruction Context Enhancement by Generative Reinforcement based Transformers. (arXiv:2401.02072v1 [cs.CL])

Title: DIALIGHT: Lightweight Multilingual Development and Evaluation of Task-Oriented Dialogue Systems with Large Language Models. (arXiv:2401.02208v1 [cs.CL])

Title: Beyond Extraction: Contextualising Tabular Data for Efficient Summarisation by Language Models. (arXiv:2401.02333v1 [cs.LG])

Title: LLaMA Pro: Progressive LLaMA with Block Expansion. (arXiv:2401.02415v1 [cs.CL])

gpt

Title: Text2MDT: Extracting Medical Decision Trees from Medical Texts. (arXiv:2401.02034v1 [cs.CL])

Title: Re-evaluating the Memory-balanced Pipeline Parallelism: BPipe. (arXiv:2401.02088v1 [cs.LG])

Title: Exploring Boundary of GPT-4V on Marine Analysis: A Preliminary Case Study. (arXiv:2401.02147v1 [cs.CL])

llm

Title: LLM Augmented LLMs: Expanding Capabilities through Composition. (arXiv:2401.02412v1 [cs.LG])

Title: Understanding LLMs: A Comprehensive Overview from Training to Inference. (arXiv:2401.02038v1 [cs.CL])

Title: Using LLM to select the right SQL Query from candidates. (arXiv:2401.02115v1 [cs.CL])

Title: Are LLMs Robust for Spoken Dialogues?. (arXiv:2401.02297v1 [cs.CL])

Title: SPEER: Sentence-Level Planning of Long Clinical Summaries via Embedded Entity Retrieval. (arXiv:2401.02369v1 [cs.CL])

Title: A Robust Quantile Huber Loss With Interpretable Parameter Adjustment In Distributional Reinforcement Learning. (arXiv:2401.02325v1 [cs.LG])

long context

lora

Title: Trajectory-Oriented Policy Optimization with Sparse Rewards. (arXiv:2401.02225v1 [cs.LG])

hallucination

prompt

code

Title: Location Aware Modular Biencoder for Tourism Question Answering. (arXiv:2401.02187v1 [cs.CL])

Title: Representation Learning of Multivariate Time Series using Attention and Adversarial Training. (arXiv:2401.01987v1 [cs.LG])

Title: SwitchTab: Switched Autoencoders Are Effective Tabular Learners. (arXiv:2401.02013v1 [cs.LG])

Title: Energy based diffusion generator for efficient sampling of Boltzmann distributions. (arXiv:2401.02080v1 [cs.LG])

chat

retrieval augmented generation

retrieval-augmented generation

rag

Title: Shayona@SMM4H23: COVID-19 Self diagnosis classification using BERT and LightGBM models. (arXiv:2401.02158v1 [cs.CL])

Title: Uncertainty-Aware Deep Attention Recurrent Neural Network for Heterogeneous Time Series Imputation. (arXiv:2401.02258v1 [cs.LG])

Title: PEFT for Speech: Unveiling Optimal Placement, Merging Strategies, and Ensemble Techniques. (arXiv:2401.02122v1 [cs.CL])

Title: L3Cube-IndicNews: News-based Short Text and Long Document Classification Datasets in Indic Languages. (arXiv:2401.02254v1 [cs.CL])

Title: Beyond Regrets: Geometric Metrics for Bayesian Optimization. (arXiv:2401.01981v1 [cs.LG])

Title: Two-Stage Surrogate Modeling for Data-Driven Design Optimization with Application to Composite Microstructure Generation. (arXiv:2401.02008v1 [cs.LG])

Title: Fast & Fair: Efficient Second-Order Robust Optimization for Fairness in Machine Learning. (arXiv:2401.02012v1 [cs.LG])

Title: Balancing Continual Learning and Fine-tuning for Human Activity Recognition. (arXiv:2401.02255v1 [cs.LG])

Title: Training Single-Layer Morphological Perceptron Using Convex-Concave Programming. (arXiv:2401.02296v1 [cs.LG])

Title: Not all Minorities are Equal: Empty-Class-Aware Distillation for Heterogeneous Federated Learning. (arXiv:2401.02329v1 [cs.LG])

Title: Multi-Source Domain Adaptation with Transformer-based Feature Generation for Subject-Independent EEG-based Emotion Recognition. (arXiv:2401.02344v1 [cs.LG])

multi-run

chain-of-thought

tree-of-thought

agent

Title: Decentralized Multi-Task Online Convex Optimization Under Random Link Failures. (arXiv:2401.02011v1 [cs.LG])