2025-01-03

Title: A Breadth-First Catalog of Text Processing, Speech Processing and Multimodal Research in South Asian Languages

Title: Distilling Large Language Models for Efficient Clinical Information Extraction

Title: Highly Optimized Kernels and Fine-Grained Codebooks for LLM Inference on Arm CPUs

Title: Resource-Efficient Transformer Architecture: Optimizing Memory and Execution Time for Real-Time Applications

Title: Learning in Multiple Spaces: Few-Shot Network Attack Detection with Metric-Fused Prototypical Networks

Title: DDD-GenDT: Dynamic Data-driven Generative Digital Twin Framework

Title: AdvAnchor: Enhancing Diffusion Model Unlearning with Adversarial Anchors

Title: LLM-Virus: Evolutionary Jailbreak Attack on Large Language Models

Title: VisTabNet: Adapting Vision Transformers for Tabular Data

Title: Large Language Models for Mathematical Analysis

Title: ELECTRA and GPT-4o: Cost-Effective Partners for Sentiment Analysis

Title: Generative Models for Financial Time Series Data: Enhancing Signal-to-Noise Ratio and Addressing Data Scarcity in A-Share Market

Title: On Adversarial Robustness of Language Models in Transfer Learning

Title: Adversarial Negotiation Dynamics in Generative Language Models

Title: ICLR: In-Context Learning of Representations

Title: Open-Book Neural Algorithmic Reasoning

Title: Position Information Emerges in Causal Transformers Without Positional Encodings via Similarity of Nearby Embeddings

Title: A Novel Framework for Learning Stochastic Representations for Sequence Generation and Recognition

Title: Machine Learning-Based Security Policy Analysis

Title: LTX-Video: Realtime Video Latent Diffusion

Title: Text-to-Image GAN with Pretrained Representations

Title: PQD: Post-training Quantization for Efficient Diffusion Models

Title: Detection-Fusion for Knowledge Graph Extraction from Videos

Title: Minimalist Vision with Freeform Pixels

Title: Temporal reasoning for timeline summarisation in social media

Title: Measuring Large Language Models Capacity to Annotate Journalistic Sourcing

Title: Federated Learning with Workload Reduction through Partial Training of Client Models and Entropy-Based Data Selection

Title: The Text Classification Pipeline: Starting Shallow going Deeper

Title: TrajLearn: Trajectory Prediction Learning using Deep Generative Models

Title: Interactive cybersecurity training system based on simulation environments

Title: MLLM-as-a-Judge for Image Safety without Human Labeling

Title: A Pseudo-random Number Generator for Multi-Sequence Generation with Programmable Statistics

Title: Towards Unraveling and Improving Generalization in World Models

Title: GPT-4 on Clinic Depression Assessment: An LLM-Based Pilot Study

Title: An Empirical Evaluation of Large Language Models on Consumer Health Questions

Title: OciorMVBA: Near-Optimal Error-Free Asynchronous MVBA

Title: DecoratingFusion: A LiDAR-Camera Fusion Network with the Combination of Point-level and Feature-level Fusion

Title: Extracting effective solutions hidden in large language models via generated comprehensive specialists: case studies in developing electronic devices

Title: Federated Deep Subspace Clustering

Title: Zero-Shot Strategies for Length-Controllable Summarization

Title: Make Domain Shift a Catastrophic Forgetting Alleviator in Class-Incremental Learning

Title: Exploring Variability in Fine-Tuned Models for Text Classification with DistilBERT

Title: Cross-Layer Cache Aggregation for Token Reduction in Ultra-Fine-Grained Image Recognition

Title: Have We Designed Generalizable Structural Knowledge Promptings? Systematic Evaluation and Rethinking

Title: EQUATOR: A Deterministic Framework for Evaluating LLM Reasoning with Open-Ended Questions. # v1.0.0-beta

Title: Detection and Prevention of Smishing Attacks

Title: Collaborative Approaches to Enhancing Smart Vehicle Cybersecurity by AI-Driven Threat Detection

Title: Enhancing Wireless Sensor Network Security through Integration with the ServiceNow Cloud Platform

Title: Outlier-Robust Training of Machine Learning Models

Title: A review of faithfulness metrics for hallucination assessment in Large Language Models

Title: Echoes in AI: Quantifying Lack of Plot Diversity in LLM Outputs

Title: LLM-Rubric: A Multidimensional, Calibrated Approach to Automated Evaluation of Natural Language Texts

Title: ReFormer: Generating Radio Fakes for Data Augmentation

Title: Dual Diffusion for Unified Image Generation and Understanding

Title: Research on vehicle detection based on improved YOLOv8 network

Title: SAM-Aware Graph Prompt Reasoning Network for Cross-Domain Few-Shot Segmentation

Title: diffIRM: A Diffusion-Augmented Invariant Risk Minimization Framework for Spatiotemporal Prediction over Graphs

Title: OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning

Title: OVGaussian: Generalizable 3D Gaussian Segmentation with Open Vocabularies

Title: Exploring the Implicit Semantic Ability of Multimodal Large Language Models: A Pilot Study on Entity Set Expansion

Title: MAIN-RAG: Multi-Agent Filtering Retrieval-Augmented Generation

Title: Rethinking Layer Removal: Preserving Critical Components with Task-Aware Singular Value Decomposition

Title: Chunk-Distilled Language Modeling

Title: PanoSLAM: Panoptic 3D Scene Reconstruction via Gaussian SLAM

Title: RAG-Instruct: Boosting LLMs with Diverse Retrieval-Augmented Instructions

Title: A New Dataset and Methodology for Malicious URL Classification

Title: A Novel Shape Guided Transformer Network for Instance Segmentation in Remote Sensing Images

Title: SPDZCoder: Teaching LLMs to Synthesize Privacy Computing Code without Massive Training Data

Title: Low-Rank Adaptation for Foundation Models: A Comprehensive Review

Title: Token Pruning for Caching Better: 9 Times Acceleration on Stable Diffusion for Free

Title: Federated Dropout: Convergence Analysis and Resource Allocation

Title: Efficient Relational Context Perception for Knowledge Graph Completion

Title: Generalizing Trust: Weak-to-Strong Trustworthiness in Language Models

Title: KAE: Kolmogorov-Arnold Auto-Encoder for Representation Learning

Title: Whisper Turns Stronger: Augmenting Wav2Vec 2.0 for Superior ASR in Low-Resource Languages

Title: Enhancing LLM Reasoning with Multi-Path Collaborative Reactive and Reflection agents

Title: OV-HHIR: Open Vocabulary Human Interaction Recognition Using Cross-modal Integration of Large Language Models

Title: Unleashing Text-to-Image Diffusion Prior for Zero-Shot Image Captioning

Title: METANOIA: A Lifelong Intrusion Detection and Investigation System for Mitigating Concept Drift

Title: DEHYDRATOR: Enhancing Provenance Graph Storage via Hierarchical Encoding and Sequence Generation

Title: Differentiable Prompt Learning for Vision Language Models

Title: SAT-LDM: Provably Generalizable Image Watermarking for Latent Diffusion Models with Self-Augmented Training

Title: Addressing Challenges in Data Quality and Model Generalization for Malaria Detection

Title: Dementia Detection using Multi-modal Methods on Audio Data

Title: Score-Based Metropolis-Hastings Algorithms

Title: Exploring Physics-Informed Neural Networks for Crop Yield Loss Forecasting

Title: Fine-grained Video-Text Retrieval: A New Benchmark and Method

Title: A Method for Enhancing the Safety of Large Model Generation Based on Multi-dimensional Attack and Defense

Title: Innovative Silicosis and Pneumonia Classification: Leveraging Graph Transformer Post-hoc Modeling and Ensemble Techniques

Title: Is Segment Anything Model 2 All You Need for Surgery Video Segmentation? A Systematic Evaluation

Title: Exploiting Boundary Loss for the Hierarchical Panoptic Segmentation of Plants and Leaves

Title: Sinhala Transliteration: A Comparative Analysis Between Rule-based and Seq2Seq Approaches

Title: Superposition in Transformers: A Novel Way of Building Mixture of Experts

Title: Monty Hall and Optimized Conformal Prediction to Improve Decision-Making with LLMs

Title: AraSTEM: A Native Arabic Multiple Choice Question Benchmark for Evaluating LLMs Knowledge In STEM Subjects

Title: An Overview and Discussion on Using Large Language Models for Implementation Generation of Solutions to Open-Ended Problems

Title: Probing Visual Language Priors in VLMs

Title: KnowRA: Knowledge Retrieval Augmented Method for Document-level Relation Extraction with Comprehensive Reasoning Abilities

Title: VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling

Title: Causal Graph Guided Steering of LLM Values via Prompts and Sparse Autoencoders

Title: Online Video Understanding: A Comprehensive Benchmark and Memory-Augmented Method

Title: Sidewalk Hazard Detection Using Variational Autoencoder and One-Class SVM

Title: Setting Standards in Turkish NLP: TR-MMLU for Large Language Model Evaluation

Title: Unbiased GNN Learning via Fairness-Aware Subgraph Diffusion

Title: VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

Title: DreamDrive: Generative 4D Scene Modeling from Street View Images

Title: STORM: Spatio-Temporal Reconstruction Model for Large-Scale Outdoor Scenes

Title: DiC: Rethinking Conv3x3 Designs in Diffusion Models

Title: Time-Varying Graph Learning for Data with Heavy-Tailed Distribution

Title: A Study on Context Length and Efficient Transformers for Biomedical Image Analysis

Title: Gaussian Building Mesh (GBM): Extract a Building's 3D Mesh with Google Earth and Gaussian Splatting

Title: Applying Graph Explanation to Operator Fusion

Title: Flash-Split: 2D Reflection Removal with Flash Cues and Latent Diffusion Separation

Title: Efficient Standardization of Clinical Notes using Large Language Models

Title: SoundBrush: Sound as a Brush for Visual Scene Editing

Title: Taming Feed-forward Reconstruction Models as Latent Encoders for 3D Generative Models

Title: Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing

Title: Why Are Positional Encodings Nonessential for Deep Autoregressive Transformers? Revisiting a Petroglyph

Title: Titans: Learning to Memorize at Test Time

Title: Deeply Learned Robust Matrix Completion for Large-scale Low-rank Data Recovery

Title: IGC: Integrating a Gated Calculator into an LLM to Solve Arithmetic Tasks Reliably and Efficiently

Title: Labels Generated by Large Language Model Helps Measuring People's Empathy in Vitro

Title: Adjoint sharding for very long context training of state space models

Title: Knowledge-Guided Prompt Learning for Deepfake Facial Image Detection

Title: NN-ResDMD: Learning Koopman Representations for Complex Dynamics with Spectral Residuals

Title: Kolmogorov GAM Networks are all you need!

Title: Everywhere Attack: Attacking Locally and Globally to Boost Targeted Transferability

Title: KAN KAN Buff Signed Graph Neural Networks?

Title: Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding

Title: CODEOFCONDUCT at Multilingual Counterspeech Generation: A Context-Aware Model for Robust Counterspeech Generation in Low-Resource Languages

Title: DDD: Discriminative Difficulty Distance for plant disease diagnosis

Title: RORem: Training a Robust Object Remover with Human-in-the-Loop

Title: Dynamics of Adversarial Attacks on Large Language Model-Based Search Engines

Title: DIVE: Diversified Iterative Self-Improvement

Title: Foreground-Covering Prototype Generation and Matching for SAM-Aided Few-Shot Segmentation

Title: Beyond Static Datasets: A Behavior-Driven Entity-Specific Simulation to Overcome Data Scarcity and Train Effective Crypto Anti-Money Laundering Models

Title: Less is More: Token Context-aware Learning for Object Tracking

Title: Enhancing Transformers for Generalizable First-Order Logical Entailment

Title: Beyond Words: AuralLLM and SignMST-C for Precise Sign Language Production and Bidirectional Accessibility

Title: Revisiting Graph Neural Networks on Graph-level Tasks: Comprehensive Experiments, Analysis, and Improvements

Title: FitCF: A Framework for Automatic Feature Importance-guided Counterfactual Example Generation

Title: Navigating Nuance: In Quest for Political Truth

Title: Shifting-Merging: Secure, High-Capacity and Efficient Steganography via Large Language Models

Title: LENS-XAI: Redefining Lightweight and Explainable Network Security through Knowledge Distillation and Variational Autoencoders for Scalable Intrusion Detection in Cybersecurity

Title: Multimodal Large Models Are Effective Action Anticipators

Title: Make Shuffling Great Again: A Side-Channel Resistant Fisher-Yates Algorithm for Protecting Neural Networks

Title: Reasoning-Oriented and Analogy-Based Methods for Locating and Editing in Zero-Shot Event-Relational Reasoning

Title: MixSA: Training-free Reference-based Sketch Extraction via Mixture-of-Self-Attention

Title: Decoupling Knowledge and Reasoning in Transformers: A Modular Architecture with Generalized Cross-Attention

Title: Information Sifting Funnel: Privacy-preserving Collaborative Inference Against Model Inversion Attacks

Title: Embedding Style Beyond Topics: Analyzing Dispersion Effects Across Different Language Models

Title: LLM+AL: Bridging Large Language Models and Action Languages for Complex Reasoning about Actions

Title: Spatially-guided Temporal Aggregation for Robust Event-RGB Optical Flow Estimation

Title: A Survey of Secure Semantic Communications

Title: Scale-wise Bidirectional Alignment Network for Referring Remote Sensing Image Segmentation

Title: DiffETM: Diffusion Process Enhanced Embedded Topic Model

Title: Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation

Title: Exploring Structured Semantic Priors Underlying Diffusion Score for Test-time Adaptation

Title: LUSIFER: Language Universal Space Integration for Enhanced Multilingual Embeddings with Large Language Models

Title: FGAseg: Fine-Grained Pixel-Text Alignment for Open-Vocabulary Semantic Segmentation

Title: TrustRAG: Enhancing Robustness and Trustworthiness in RAG

Title: Improving Autoregressive Visual Generation with Cluster-Oriented Token Prediction

Title: FullTransNet: Full Transformer with Local-Global Attention for Video Summarization

Title: Diversity Optimization for Travelling Salesman Problem via Deep Reinforcement Learning

Title: Representation in large language models

Title: Unfolding the Headline: Iterative Self-Questioning for News Retrieval and Timeline Summarization

Title: Text2Earth: Unlocking Text-driven Remote Sensing Image Generation with a Global-Scale Dataset and a Foundation Model

Title: Population Aware Diffusion for Time Series Generation

Title: Aligning LLMs with Domain Invariant Reward Models

Title: Hierarchical Vision-Language Alignment for Text-to-Image Generation via Diffusion Models

Title: On the Low-Complexity of Fair Learning for Combinatorial Multi-Armed Bandit

Title: Multiscaled Multi-Head Attention-based Video Transformer Network for Hand Gesture Recognition

Title: SPADE: Enhancing Adaptive Cyber Deception Strategies with Generative AI and Structured Prompt Engineering

Title: A Novel Diffusion Model for Pairwise Geoscience Data Generation with Unbalanced Training Dataset

Title: Efficient Unsupervised Shortcut Learning Detection and Mitigation in Transformers

Title: Diffusion Prism: Enhancing Diversity and Morphology Consistency in Mask-to-Image Diffusion

Title: Cached Adaptive Token Merging: Dynamic Token Reduction and Redundant Computation Elimination in Diffusion Model

Title: Incremental Dialogue Management: Survey, Discussion, and Implications for HRI

Title: The Silent Majority: Demystifying Memorization Effect in the Presence of Spurious Correlations

Title: CoordFlow: Coordinate Flow for Pixel-wise Neural Video Representation

Title: Are LLMs effective psychological assessors? Leveraging adaptive RAG for interpretable mental health screening through psychometric practice

Title: Optimizing Noise Schedules of Generative Models in High Dimensions

Title: Is It Still Fair? Investigating Gender Fairness in Cross-Corpus Speech Emotion Recognition

Title: Exploring Information Processing in Large Language Models: Insights from Information Bottleneck Theory

Title: Physics-informed Gaussian Processes for Safe Envelope Expansion

Title: Multi-Objective Optimization-Based Anonymization of Structured Data for Machine Learning

Title: EasySplat: View-Adaptive Learning makes 3D Gaussian Splatting Easy

Title: Prediction of Geoeffective CMEs Using SOHO Images and Deep Learning

Title: MDSF: Context-Aware Multi-Dimensional Data Storytelling Framework based on Large Language Model

Title: Boosting Adversarial Transferability with Spatial Adversarial Alignment

Title: Efficient Connectivity-Preserving Instance Segmentation with Supervoxel-Based Loss Function

Title: Hadamard Attention Recurrent Transformer: A Strong Baseline for Stereo Matching Transformer

Title: Towards Adversarially Robust Deep Metric Learning

Title: KaLM-Embedding: Superior Training Data Brings A Stronger Embedding Model

Title: State-of-the-art AI-based Learning Approaches for Deepfake Generation and Detection, Analyzing Opportunities, Threading through Pros, Cons, and Future Prospects

Title: ValuesRAG: Enhancing Cultural Alignment Through Retrieval-Augmented Contextual Learning

Title: DynamicLip: Shape-Independent Continuous Authentication via Lip Articulator Dynamics

Title: Advancing Singlish Understanding: Bridging the Gap with Datasets and Multimodal Models

Title: MSWA: Refining Local Attention with Multi-Scale Window Attention

Title: Event Masked Autoencoder: Point-wise Action Recognition with Event-Based Cameras

Title: Image-based Multimodal Models as Intruders: Transferable Multimodal Attacks on Video-based MLLMs

Title: Dynamic Scaling of Unit Tests for Code Reward Modeling

Title: Risks of Cultural Erasure in Large Language Models

Title: Dynamic Attention-Guided Context Decoding for Mitigating Context Faithfulness Hallucinations in Large Language Models

Title: FAPL-DM-BC: A Secure and Scalable FL Framework with Adaptive Privacy and Dynamic Masking, Blockchain, and XAI for the IoVs

Title: BeliN: A Novel Corpus for Bengali Religious News Headline Generation using Contextual Feature Fusion

Title: Evidential Calibrated Uncertainty-Guided Interactive Segmentation paradigm for Ultrasound Images

Title: Graph Generative Pre-trained Transformer

Title: iCNN-LSTM: A batch-based incremental ransomware detection system using Sysmon

Title: Noise-Resilient Symbolic Regression with Dynamic Gating Reinforcement Learning

Title: Bridging Simplicity and Sophistication using GLinear: A Novel Architecture for Enhanced Time Series Prediction

Title: A Sysmon Incremental Learning System for Ransomware Analysis and Detection

Title: HoneypotNet: Backdoor Attacks Against Model Extraction

Title: EliGen: Entity-Level Controlled Image Generation with Regional Attention

Title: Long-range Brain Graph Transformer

Title: AIM: Additional Image Guided Generation of Transferable Adversarial Attacks

Title: MalCL: Leveraging GAN-Based Generative Replay to Combat Catastrophic Forgetting in Malware Classification

Title: Leverage Cross-Attention for End-to-End Open-Vocabulary Panoptic Reconstruction

Title: Retrieval-Augmented Dynamic Prompt Tuning for Incomplete Multimodal Learning

Title: Graph2text or Graph2token: A Perspective of Large Language Models for Graph Learning

Title: DuMo: Dual Encoder Modulation Network for Precise Concept Erasure

Title: Source-free Semantic Regularization Learning for Semi-supervised Domain Adaptation

Title: InDeed: Interpretable image deep decomposition with guaranteed generalizability

Title: An Inclusive Theoretical Framework of Robust Supervised Contrastive Loss against Label Noise

Title: Privacy Bills of Materials: A Transparent Privacy Information Inventory for Collaborative Privacy Notice Generation in Mobile App Development

Title: Missing Data as Augmentation in the Earth Observation Domain: A Multi-View Learning Approach

Title: Adaptive Hardness-driven Augmentation and Alignment Strategies for Multi-Source Domain Adaptations

Title: BlockDialect: Block-wise Fine-grained Mixed Format for Energy-Efficient LLM Inference

Title: PoVF: Empowering Decentralized Blockchain Systems with Verifiable Function Consensus

Title: TexAVi: Generating Stereoscopic VR Video Clips from Text Descriptions

Title: Leveraging Full Dependency Parsing Graph Information For Biomedical Event Extraction

Title: 3D-LLaVA: Towards Generalist 3D LMMs with Omni Superpoint Transformer

Title: Towards Interactive Deepfake Analysis

Title: Deep Learning in Palmprint Recognition-A Comprehensive Survey

Title: Machine Learning-Based Prediction of ICU Readmissions in Intracerebral Hemorrhage Patients: Insights from the MIMIC Databases

Title: Vulnerability-Aware Spatio-Temporal Learning for Generalizable and Interpretable Deepfake Video Detection

Title: NET-SA: An Efficient Secure Aggregation Architecture Based on In-Network Computing

Title: A Game Between the Defender and the Attacker for Trigger-based Black-box Model Watermarking

Title: LayeringDiff: Layered Image Synthesis via Generation, then Disassembly with Generative Knowledge

Title: Empirical Analysis of Nature-Inspired Algorithms for Autism Spectrum Disorder Detection Using 3D Video Dataset

Title: Real-time Cross-modal Cybersickness Prediction in Virtual Reality

Title: TabTreeFormer: Tree Augmented Tabular Data Generation using Transformers

Title: Classification of Operational Records in Aviation Using Deep Learning Approaches

Title: Conditional Consistency Guided Image Translation and Enhancement

Title: Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent

Title: SVFR: A Unified Framework for Generalized Video Face Restoration

Title: Automated Self-Refinement and Self-Correction for LLM-based Product Attribute Value Extraction

Title: EHCTNet: Enhanced Hybrid of CNN and Transformer Network for Remote Sensing Image Change Detection

Title: Face-Human-Bench: A Comprehensive Benchmark of Face and Human Understanding for Multi-modal Assistants

Title: SeFAR: Semi-supervised Fine-grained Action Recognition with Temporal Perturbation and Learning Stabilization

Title: Large Language Model-Enhanced Symbolic Reasoning for Knowledge Base Completion

Title: Digital Guardians: Can GPT-4, Perspective API, and Moderation API reliably detect hate speech in reader comments of German online newspapers?

Title: CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings

Title: Detail Matters: Mamba-Inspired Joint Unfolding Network for Snapshot Spectral Compressive Imaging

Title: Stealthy Backdoor Attack to Real-world Models in Android Apps

Title: ProgCo: Program Helps Self-Correction of Large Language Models

Title: Does a Large Language Model Really Speak in Human-Like Language?

Title: HybridTrack: A Hybrid Approach for Robust Multi-Object Tracking

Title: ToolComp: A Multi-Tool Reasoning & Process Supervision Benchmark

Title: Large Language Models for Mental Health Diagnostic Assessments: Exploring The Potential of Large Language Models for Assisting with Mental Health Diagnostic Assessments -- The Depression and Anxiety Case

Title: Think More, Hallucinate Less: Mitigating Hallucinations via Dual Process of Fast and Slow Thinking

Title: Multi-Head Explainer: A General Framework to Improve Explainability in CNNs and Transformers

Title: SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration

Title: Decoding Knowledge in Large Language Models: A Framework for Categorization and Comprehension

Title: Analysis of Security in OS-Level Virtualization

Title: CySecBench: Generative AI-based CyberSecurity-focused Prompt Dataset for Benchmarking Large Language Models

Title: Aligning Large Language Models for Faithful Integrity Against Opposing Argument

Title: Machine Learning for Modeling Wireless Radio Metrics with Crowdsourced Data and Local Environment Features

Title: Large Vision-Language Model Alignment and Misalignment: A Survey Through the Lens of Explainability

Title: Test-time Controllable Image Generation by Explicit Spatial Constraint Enforcement

Title: Iris Recognition for Infants

Title: OmniChat: Enhancing Spoken Dialogue Systems with Scalable Synthetic Data for Diverse Scenarios

Title: A Unified Hyperparameter Optimization Pipeline for Transformer-Based Time Series Forecasting Models

Title: Best Transition Matrix Estimation or Best Label Noise Robustness Classifier? Two Possible Methods to Enhance the Performance of T-revision

Title: nnY-Net: Swin-NeXt with Cross-Attention for 3D Medical Images Segmentation

Title: Hierarchical Alignment-enhanced Adaptive Grounding Network for Generalized Referring Expression Comprehension

Title: A Multi-task Supervised Compression Model for Split Computing

Title: R-SCoRe: Revisiting Scene Coordinate Regression for Robust Large-Scale Visual Localization

Title: Multi-Modal Video Feature Extraction for Popularity Prediction

Title: Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

Title: Object-level Visual Prompts for Compositional Image Generation

Title: Unifying Specialized Visual Encoders for Video Language Models

Title: VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control