2024-12-20

Title: Improving Generalization Performance of YOLOv8 for Camera Trap Object Detection

Title: Heterogeneous Multi-Agent Reinforcement Learning for Distributed Channel Access in WLANs

Title: Distilled Pooling Transformer Encoder for Efficient Realistic Image Dehazing

Title: FedSTaS: Client Stratification and Client Level Sampling for Efficient Federated Learning

Title: ViTmiX: Vision Transformer Explainability Augmented by Mixed Visualization Methods

Title: Split Learning in Computer Vision for Semantic Segmentation Delay Minimization

Title: Fake News Detection: Comparative Evaluation of BERT-like Models and Large Language Models with Generative AI-Annotated Data

Title: PixelMan: Consistent Object Editing with Diffusion Models via Pixel Manipulation and Generation

Title: TRecViT: A Recurrent Video Transformer

Title: Distributionally Robust Policy Learning under Concept Drifts

Title: What Has Been Overlooked in Contrastive Source-Free Domain Adaptation: Leveraging Source-Informed Latent Augmentation within Neighborhood Context

Title: Multi-OphthaLingua: A Multilingual Benchmark for Assessing and Debiasing LLM Ophthalmological QA in LMICs

Title: Stealing That Free Lunch: Exposing the Limits of Dyna-Style Reinforcement Learning

Title: Covariances for Free: Exploiting Mean Distributions for Federated Learning with Pre-Trained Models

Title: Personalized Generative Low-light Image Denoising and Enhancement

Title: Semantic Role Labeling of NomBank Partitives

Title: Joint Co-Speech Gesture and Expressive Talking Face Generation using Diffusion with Adapters

Title: A Unifying Information-theoretic Perspective on Evaluating Generative Models

Title: State Space Models are Strong Text Rerankers

Title: Dynamic semantic VSLAM with known and unknown objects

Title: ResQ: Mixed-Precision Quantization of Large Language Models with Low-Rank Residuals

Title: Surrealistic-like Image Generation with Vision-Language Models

Title: Memorization Over Reasoning? Exposing and Mitigating Verbatim Memorization in Large Language Models' Character Understanding Evaluation

Title: SEREP: Semantic Facial Expression Representation for Robust In-the-Wild Capture and Retargeting

Title: ECG-Byte: A Tokenizer for End-to-End Generative Electrocardiogram Language Modeling

Title: Differentially Private Multi-objective Selection: Pareto and Aggregation Approaches

Title: Nemesis: Noise-randomized Encryption with Modular Efficiency and Secure Integration in Machine Learning Systems

Title: Enhancing Fingerprint Recognition Systems: Comparative Analysis of Biometric Authentication Algorithms and Techniques for Improved Accuracy and Reliability

Title: DriveGPT: Scaling Autoregressive Behavior Models for Driving

Title: Enhancing Diffusion Models for High-Quality Image Generation

Title: FedPIA -- Permuting and Integrating Adapters leveraging Wasserstein Barycenters for Finetuning Foundation Models in Multi-Modal Federated Learning

Title: All-in-One Tuning and Structural Pruning for Domain-Specific LLMs

Title: IntroStyle: Training-Free Introspective Style Attribution using Diffusion Features

Title: Cherry-Picking in Time Series Forecasting: How to Select Datasets to Make Your Model Shine

Title: ORBIT: Cost-Effective Dataset Curation for Large Language Model Domain Adaptation with an Astronomy Case Study

Title: GenHMR: Generative Human Mesh Recovery

Title: Multimodal Latent Diffusion Model for Complex Sewing Pattern Generation

Title: LEDiff: Latent Exposure Diffusion for HDR Generation

Title: From Human Annotation to LLMs: SILICON Annotation Workflow for Management Research

Title: Affordance-Aware Object Insertion via Mask-Aware Dual Diffusion

Title: LiftRefine: Progressively Refined View Synthesis from 3D Lifting with Volume-Triplane Representations

Title: DiffusionTrend: A Minimalist Approach to Virtual Fashion Try-On

Title: Towards Provable Security in Industrial Control Systems Via Dynamic Protocol Attestation

Title: Agent-SafetyBench: Evaluating the Safety of LLM Agents

Title: Why We Build Local Large Language Models: An Observational Analysis from 35 Japanese and Multilingual LLMs

Title: Promptable Representation Distribution Learning and Data Augmentation for Gigapixel Histopathology WSI Analysis

Title: DirectorLLM for Human-Centric Video Generation

Title: MAIDS: Malicious Agent Identification-based Data Security Model for Cloud Environments

Title: Drive-1-to-3: Enriching Diffusion Priors for Novel View Synthesis of Real Vehicles

Title: FedMUP: Federated Learning driven Malicious User Prediction Model for Secure Data Distribution in Cloud Environments

Title: Content-style disentangled representation for controllable artistic image stylization and generation

Title: Guided Diffusion Model for Sensor Data Obfuscation

Title: Do Large Language Models Defend Inferentialist Semantics?: On the Logical Expressivism and Anti-Representationalism of LLMs

Title: A hybrid framework for effective and efficient machine unlearning

Title: PA-RAG: RAG Alignment via Multi-Perspective Preference Optimization

Title: Cal-DPO: Calibrated Direct Preference Optimization for Language Model Alignment

Title: Efficient Self-Supervised Video Hashing with Selective State Spaces

Title: CAE-T: A Channelwise AutoEncoder with Transformer for EEG Abnormality Detection

Title: Multi-Level Optimal Transport for Universal Cross-Tokenizer Knowledge Distillation on Language Models

Title: Leveraging Time Series Categorization and Temporal Fusion Transformers to Improve Cryptocurrency Price Forecasting

Title: Consistent Human Image and Video Generation with Spatially Conditioned Diffusion

Title: Downscaling Precipitation with Bias-informed Conditional Diffusion Model

Title: Transformer models are gauge invariant: A mathematical connection between AI and particle physics

Title: Summary of Point Transformer with Federated Learning for Predicting Breast Cancer HER2 Status from Hematoxylin and Eosin-Stained Whole Slide Images

Title: S$^3$-Mamba: Small-Size-Sensitive Mamba for Lesion Segmentation

Title: Single-Loop Federated Actor-Critic across Heterogeneous Environments

Title: ScaMo: Exploring the Scaling Law in Autoregressive Motion Generation Model

Title: GBRIP: Granular Ball Representation for Imbalanced Partial Label Learning

Title: AIArena: A Blockchain-Based Decentralized AI Training Platform

Title: Global Spatio-Temporal Fusion-based Traffic Prediction Algorithm with Anomaly Aware

Title: SCKD: Semi-Supervised Cross-Modality Knowledge Distillation for 4D Radar Object Detection

Title: Alignment-Free RGB-T Salient Object Detection: A Large-scale Dataset and Progressive Correlation Network

Title: DiffSim: Taming Diffusion Models for Evaluating Visual Similarity

Title: CORD: Balancing COnsistency and Rank Distillation for Robust Retrieval-Augmented Generation

Title: Simulation-Free Hierarchical Latent Policy Planning for Proactive Dialogues

Title: HiCM$^2$: Hierarchical Compact Memory Modeling for Dense Video Captioning

Title: Spike2Former: Efficient Spiking Transformer for High-performance Image Segmentation

Title: Beyond Guilt: Legal Judgment Prediction with Trichotomous Reasoning

Title: Multi-Sensor Object Anomaly Detection: Unifying Appearance, Geometry, and Internal Properties

Title: LDP: Generalizing to Multilingual Visual Information Extraction by Language Decoupled Pretraining

Title: Can We Get Rid of Handcrafted Feature Extractors? SparseViT: Nonsemantics-Centered, Parameter-Efficient Image Manipulation Localization Through Sparse-Coding Transformer

Title: KARRIEREWEGE: A Large Scale Career Path Prediction Dataset

Title: How good is GPT at writing political speeches for the White House?

Title: Pitfalls of topology-aware image segmentation

Title: Learning to Generate Research Idea with Dynamic Control

Title: Qua$^2$SeDiMo: Quantifiable Quantization Sensitivity of Diffusion Models

Title: Robust PCA Based on Adaptive Weighted Least Squares and Low-Rank Matrix Factorization

Title: Unified Image Restoration and Enhancement: Degradation Calibrated Cycle Reconstruction Diffusion Model

Title: Review of Fruit Tree Image Segmentation

Title: Progressive Fine-to-Coarse Reconstruction for Accurate Low-Bit Post-Training Quantization in Vision Transformers

Title: Adaptive Prompt Tuning: Vision Guided Prompt Tuning with Cross-Attention for Fine-Grained Few-Shot Learning

Title: RefHCM: A Unified Model for Referring Perceptions in Human-Centric Scenarios

Title: Length Controlled Generation for Black-box LLMs

Title: Unveiling Uncertainty: A Deep Dive into Calibration and Performance of Multimodal Large Language Models

Title: LoLaFL: Low-Latency Federated Learning via Forward-only Propagation

Title: Simplicity over Complexity: An ARN-Based Intrusion Detection Method for Industrial Control Network

Title: Analysis and Visualization of Linguistic Structures in Large Language Models: Neural Representations of Verb-Particle Constructions in BERT

Title: MUSTER: Longitudinal Deformable Registration by Composition of Consecutive Deformations

Title: FiVL: A Framework for Improved Vision-Language Alignment

Title: LLMs as mediators: Can they diagnose conflicts accurately?

Title: Lorentzian Residual Neural Networks

Title: Event-assisted 12-stop HDR Imaging of Dynamic Scene

Title: EnergyMoGen: Compositional Human Motion Generation with Energy-Based Diffusion Model in Latent Space

Title: Holistic Adversarially Robust Pruning

Title: FROC: Building Fair ROC from a Trained Classifier

Title: Generative AI for Banks: Benchmarks and Algorithms for Synthetic Financial Transaction Data

Title: On Verbalized Confidence Scores for LLMs

Title: Boosting GNN Performance via Training Sample Selection Based on Adversarial Robustness Evaluation

Title: Query pipeline optimization for cancer patient question answering systems

Title: FLAMe: Federated Learning with Attention Mechanism using Spatio-Temporal Keypoint Transformers for Pedestrian Fall Detection in Smart Cities

Title: PsyDraw: A Multi-Agent Multimodal System for Mental Health Screening in Left-Behind Children

Title: ALKAFI-LLAMA3: Fine-Tuning LLMs for Precise Legal Understanding in Palestine

Title: Disentangling Reasoning Tokens and Boilerplate Tokens For Language Model Fine-tuning

Title: Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations

Title: ResoFilter: Fine-grained Synthetic Data Filtering for Large Language Models through Data-Parameter Resonance Analysis

Title: MARIA: a Multimodal Transformer Model for Incomplete Healthcare Data

Title: Non-intrusive and Unconstrained Keystroke Inference in VR Platforms via Infrared Side Channel

Title: Explainable Tampered Text Detection via Multimodal Large Models

Title: Multi-Level Embedding and Alignment Network with Consistency and Invariance Learning for Cross-View Geo-Localization

Title: PC-BEV: An Efficient Polar-Cartesian BEV Fusion Framework for LiDAR Semantic Segmentation

Title: Mention Attention for Pronoun Translation

Title: Federated Heavy Hitter Analytics with Local Differential Privacy

Title: Synchronized and Fine-Grained Head for Skeleton-Based Ambiguous Action Recognition

Title: Progressive Multimodal Reasoning via Active Retrieval

Title: Mapping and Influencing the Political Ideology of Large Language Models using Synthetic Personas

Title: A Survey of RWKV

Title: DS$^2$-ABSA: Dual-Stream Data Synthesis with Label Refinement for Few-Shot Aspect-Based Sentiment Analysis

Title: Position: A taxonomy for reporting and describing AI security incidents

Title: Think&Cite: Improving Attributed Text Generation with Self-Guided Tree Search and Progress Reward Modeling

Title: Graph-Convolutional Networks: Named Entity Recognition and Large Language Model Embedding in Document Clustering

Title: Large-scale School Mapping using Weakly Supervised Deep Learning for Universal School Connectivity

Title: Multimodal Hypothetical Summary for Retrieval-based Multi-image Question Answering

Title: Diffusion priors for Bayesian 3D reconstruction from incomplete measurements

Title: MagicNaming: Consistent Identity Generation by Finding a "Name Space" in T2I Diffusion Models

Title: Dehallucinating Parallel Context Extension for Retrieval-Augmented Generation

Title: RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response

Title: Automatic Spectral Calibration of Hyperspectral Images: Method, Dataset and Benchmark

Title: TDCNet: Transparent Objects Depth Completion with CNN-Transformer Dual-Branch Parallel Network

Title: IDOL: Instant Photorealistic 3D Human Creation from a Single Image

Title: Knowledge Injection via Prompt Distillation

Title: Movie2Story: A framework for understanding videos and telling stories in the form of novel text

Title: Chain-of-MetaWriting: Linguistic and Textual Analysis of How Small Language Models Write Young Students Texts

Title: Stitch Contrast and Segment: Learning a Human Action Segmentation Model Using Trimmed Skeleton Videos

Title: Large Language Models and Code Security: A Systematic Literature Review

Title: Robust Federated Learning in the Face of Covariate Shift: A Magnitude Pruning with Hybrid Regularization Framework for Enhanced Model Aggregation

Title: DCTdiff: Intriguing Properties of Image Generative Modeling in the DCT Space

Title: LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Gaps

Title: Uni-Renderer: Unifying Rendering and Inverse Rendering Via Dual Stream Diffusion

Title: GIRAFE: Glottal Imaging Dataset for Advanced Segmentation, Analysis, and Facilitative Playbacks Evaluation

Title: MultiverSeg: Scalable Interactive Segmentation of Biomedical Imaging Datasets with In-Context Guidance

Title: ConfliBERT: A Language Model for Political Conflict

Title: ScamChatBot: An End-to-End Analysis of Fake Account Recovery on Social Media via Chatbots

Title: AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling

Title: Learning Disentangled Equivariant Representation for Explicitly Controllable 3D Molecule Generation

Title: A Full Transformer-based Framework for Automatic Pain Estimation using Videos

Title: Review-Then-Refine: A Dynamic Framework for Multi-Hop Question Answering with Temporal Adaptability

Title: Qwen2.5 Technical Report

Title: Outcome-Refining Process Supervision for Code Generation

Title: Efficient Ranking, Order Statistics, and Sorting under CKKS

Title: Adaptive Pruning for Large Language Models with Structural Importance Awareness

Title: Jet: A Modern Transformer-Based Normalizing Flow

Title: Leveraging Color Channel Independence for Improved Unsupervised Object Detection

Title: Language Models as Continuous Self-Evolving Data Engineers

Title: Prompt-A-Video: Prompt Your Video Diffusion Model via Preference-Aligned LLM

Title: OnlineVPO: Align Video Diffusion Model with Online Video-Centric Preference Optimization

Title: Rethinking Uncertainty Estimation in Natural Language Generation

Title: Data for Mathematical Copilots: Better Ways of Presenting Proofs for Machine Learning

Title: Tiled Diffusion

Title: LlamaFusion: Adapting Pretrained Language Models for Multimodal Generation

Title: AV-Link: Temporally-Aligned Diffusion Features for Cross-Modal Audio-Video Generation

Title: MMLU-CF: A Contamination-free Multi-task Language Understanding Benchmark

Title: DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation

Title: AutoTrust: Benchmarking Trustworthiness in Large Vision Language Models for Autonomous Driving

Title: OpenEMMA: Open-Source Multimodal Model for End-to-End Autonomous Driving

Title: PRIMA: Multi-Image Vision-Language Models for Reasoning Segmentation

Title: Generative Multiview Relighting for 3D Reconstruction under Extreme Illumination Variation

Title: Scaling 4D Representations

Title: Flowing from Words to Pixels: A Framework for Cross-Modality Evolution

Title: LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis