2024-01-05

diffusion

Title: Can We Generate Realistic Hands Only Using Convolution? (arXiv:2401.01951v1 [cs.CV])

Title: Instruct-Imagen: Image Generation with Multi-modal Instruction. (arXiv:2401.01952v1 [cs.CV])

Title: Improving Diffusion-Based Image Synthesis with Context Prediction. (arXiv:2401.02015v1 [cs.CV])

Title: DiffusionEdge: Diffusion Probabilistic Model for Crisp Edge Detection. (arXiv:2401.02032v1 [cs.CV])

Title: Preserving Image Properties Through Initializations in Diffusion Models. (arXiv:2401.02097v1 [cs.CV])

Title: Unified Diffusion-Based Rigid and Non-Rigid Editing with Text and Image Guidance. (arXiv:2401.02126v1 [cs.CV])

Title: GUESS: GradUally Enriching SyntheSis for Text-Driven Human Motion Generation. (arXiv:2401.02142v1 [cs.CV])

Title: Bring Metric Functions into Diffusion Models. (arXiv:2401.02414v1 [cs.CV])

Title: Energy based diffusion generator for efficient sampling of Boltzmann distributions. (arXiv:2401.02080v1 [cs.LG])

Title: Robust Physics Informed Neural Networks. (arXiv:2401.02300v1 [cs.LG])

Title: Integration of physics-informed operator learning and finite element method for parametric learning of partial differential equations. (arXiv:2401.02363v1 [cs.LG])

self-supervised

Title: GPS-SSL: Guided Positive Sampling to Inject Prior Into Self-Supervised Learning. (arXiv:2401.01990v1 [cs.CV])

Title: SuperEdge: Towards a Generalization Model for Self-Supervised Edge Detection. (arXiv:2401.02313v1 [cs.CV])

Title: Learning the 3D Fauna of the Web. (arXiv:2401.02400v1 [cs.CV])

Title: PEFT for Speech: Unveiling Optimal Placement, Merging Strategies, and Ensemble Techniques. (arXiv:2401.02122v1 [cs.CL])

Title: SwitchTab: Switched Autoencoders Are Effective Tabular Learners. (arXiv:2401.02013v1 [cs.LG])

Title: Balancing Continual Learning and Fine-tuning for Human Activity Recognition. (arXiv:2401.02255v1 [cs.LG])

Title: Uncertainty-Aware Deep Attention Recurrent Neural Network for Heterogeneous Time Series Imputation. (arXiv:2401.02258v1 [cs.LG])

foundation model

Title: Backdoor Attack on Unpaired Medical Image-Text Foundation Models: A Pilot Study on MedCLIP. (arXiv:2401.01911v1 [cs.CV])

Title: FMGS: Foundation Model Embedded 3D Gaussian Splatting for Holistic 3D Scene Understanding. (arXiv:2401.01970v1 [cs.CV])

Title: ClassWise-SAM-Adapter: Parameter Efficient Fine-tuning Adapts Segment Anything to SAR Domain for Semantic Segmentation. (arXiv:2401.02326v1 [cs.CV])

Title: LLM Augmented LLMs: Expanding Capabilities through Composition. (arXiv:2401.02412v1 [cs.LG])

Title: LLaMA Pro: Progressive LLaMA with Block Expansion. (arXiv:2401.02415v1 [cs.CL])

generative

Title: Unsupervised Object-Centric Learning from Multiple Unspecified Viewpoints. (arXiv:2401.01922v1 [cs.CV])

Title: Bayesian Intrinsic Groupwise Image Registration: Unsupervised Disentanglement of Anatomy and Geometry. (arXiv:2401.02141v1 [cs.CV])

Title: Exploring Boundary of GPT-4V on Marine Analysis: A Preliminary Case Study. (arXiv:2401.02147v1 [cs.CL])

Title: Linguistic Profiling of Deepfakes: An Open Database for Next-Generation Deepfake Detection. (arXiv:2401.02335v1 [cs.CV])

Title: What You See is What You GAN: Rendering Every Pixel for High-Fidelity Geometry in 3D GANs. (arXiv:2401.02411v1 [cs.CV])

Title: ICE-GRT: Instruction Context Enhancement by Generative Reinforcement based Transformers. (arXiv:2401.02072v1 [cs.CL])

Title: A Robust Adversary Detection-Deactivation Method for Metaverse-oriented Collaborative Deep Learning. (arXiv:2401.01895v1 [cs.CR])

Title: Representation Learning of Multivariate Time Series using Attention and Adversarial Training. (arXiv:2401.01987v1 [cs.LG])

Title: From Function to Distribution Modeling: A PAC-Generative Approach to Offline Optimization. (arXiv:2401.02019v1 [cs.LG])

anomaly

Title: AUPIMO: Redefining Visual Anomaly Detection Benchmarks with High Speed and Low Tolerance. (arXiv:2401.01984v1 [cs.CV])

Title: Distillation-based fabric anomaly detection. (arXiv:2401.02287v1 [cs.CV])

in-context

Title: Towards Truly Zero-shot Compositional Visual Reasoning with LLMs as Programmers. (arXiv:2401.01974v1 [cs.CV])

Title: DIALIGHT: Lightweight Multilingual Development and Evaluation of Task-Oriented Dialogue Systems with Large Language Models. (arXiv:2401.02208v1 [cs.CL])