2024-11-26

Title: Can Open-source LLMs Enhance Data Augmentation for Toxic Detection?: An Experimental Study

Title: Hybrid Gaussian Process Regression with Temporal Feature Extraction for Partially Interpretable Remaining Useful Life Interval Prediction in Aeroengine Prognostics

Title: Gradient-Weighted Feature Back-Projection: A Fast Alternative to Feature Distillation in 3D Gaussian Splatting

Title: Graph Neural Network-Based Entity Extraction and Relationship Reasoning in Complex Knowledge Graphs

Title: Adaptively Controllable Diffusion Model for Efficient Conditional Image Generation

Title: Deep Learning-Based Classification of Hyperkinetic Movement Disorders in Children

Title: Beyond Visual Understanding: Introducing PARROT-360V for Vision Language Model Benchmarking

Title: Multimodal large language model for wheat breeding: a new exploration of smart breeding

Title: Uni-Mlip: Unified Self-supervision for Medical Vision Language Pre-training

Title: Quantized symbolic time series approximation

Title: Towards Million-Scale Adversarial Robustness Evaluation With Stronger Individual Attacks

Title: LightLLM: A Versatile Large Language Model for Predictive Light Sensing

Title: Image Harmonization using Robust Restricted CDF Matching

Title: Urban Region Embeddings from Service-Specific Mobile Traffic Data

Title: S$^2$ALM: Sequence-Structure Pre-trained Large Language Model for Comprehensive Antibody Representation Learning

Title: Sampling with Adaptive Variance for Multimodal Distributions

Title: Reflections from the 2024 Large Language Model (LLM) Hackathon for Applications in Materials Science and Chemistry

Title: Rethinking the Intermediate Features in Adversarial Attacks: Misleading Robotic Models via Adversarial Distillation

Title: Parameter Efficient Mamba Tuning via Projector-targeted Diagonal-centric Linear Transformation

Title: IterIS: Iterative Inference-Solving Alignment for LoRA Merging

Title: BiomedCoOp: Learning to Prompt for Biomedical Vision-Language Models

Title: Text Embedding is Not All You Need: Attention Control for Text-to-Image Semantic Alignment with Text Self-Attention Maps

Title: Stain-Invariant Representation for Tissue Classification in Histology Images

Title: Faithful Label-free Knowledge Distillation

Title: Is Attention All You Need For Actigraphy? Foundation Models of Wearable Accelerometer Data for Mental Health Research

Title: The Zamba2 Suite: Technical Report

Title: Adversarial Prompt Distillation for Vision-Language Models

Title: Exploring the Robustness and Transferability of Patch-Based Adversarial Attacks in Quantized Neural Networks

Title: Reward Fine-Tuning Two-Step Diffusion Models via Learning Differentiable Latent-Space Surrogate Reward

Title: LocRef-Diffusion: Tuning-Free Layout and Appearance-Guided Generation

Title: The Explabox: Model-Agnostic Machine Learning Transparency & Analysis

Title: VIVID-10M: A Dataset and Baseline for Versatile and Interactive Video Local Editing

Title: MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation

Title: Derivative-Free Diffusion Manifold-Constrained Gradient for Unified XAI

Title: BanglaEmbed: Efficient Sentence Embedding Models for a Low-Resource Language Using Cross-Lingual Distillation Techniques

Title: EADReg: Probabilistic Correspondence Generation with Efficient Autoregressive Diffusion Model for Outdoor Point Cloud Registration

Title: Curriculum-enhanced GroupDRO: Challenging the Norm of Avoiding Curriculum Learning in Subpopulation Shift Setups

Title: ElastiFormer: Learned Redundancy Reduction in Transformer via Self-Distillation

Title: When Spatial meets Temporal in Action Recognition

Title: Forecasting Unseen Points of Interest Visits Using Context and Proximity Priors

Title: Sycophancy in Large Language Models: Causes and Mitigations

Title: MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs

Title: PPLqa: An Unsupervised Information-Theoretic Quality Metric for Comparing Generative Large Language Models

Title: Exploiting Watermark-Based Defense Mechanisms in Text-to-Image Diffusion Models for Unauthorized Data Usage

Title: Transforming NLU with Babylon: A Case Study in Development of Real-time, Edge-Efficient, Multi-Intent Translation System for Automated Drive-Thru Ordering

Title: On the Impact of Fine-Tuning on Chain-of-Thought Reasoning

Title: From Jack of All Trades to Master of One: Specializing LLM-based Autoraters to a Test Set

Title: A Contrast-Agnostic Method for Ultra-High Resolution Claustrum Segmentation

Title: Gradient-Free Classifier Guidance for Diffusion Model Sampling

Title: Efficient Online Inference of Vision Transformers by Training-Free Tokenization

Title: Partial Knowledge Distillation for Alleviating the Inherent Inter-Class Discrepancy in Federated Learning

Title: A Comparative Analysis of Transformer and LSTM Models for Detecting Suicidal Ideation on Reddit

Title: Exploring Large Language Models for Multimodal Sentiment Analysis: Challenges, Benchmarks, and Future Directions

Title: FINECAPTION: Compositional Image Captioning Focusing on Wherever You Want at Any Granularity

Title: FG-CXR: A Radiologist-Aligned Gaze Dataset for Enhancing Interpretability in Chest X-Ray Report Generation

Title: Least Privilege Access for Persistent Storage Mechanisms in Web Browsers

Title: OphCLIP: Hierarchical Retrieval-Augmented Learning for Ophthalmic Surgical Video-Language Pretraining

Title: LDM-Morph: Latent diffusion model guided deformable image registration

Title: Lifelong Knowledge Editing for Vision Language Models with Low-Rank Mixture-of-Experts

Title: What Makes a Scene? Scene Graph-based Evaluation and Feedback for Controllable Generation

Title: ConsistentAvatar: Learning to Diffuse Fully Consistent Talking Head Avatar with Temporal Guidance

Title: Twin Trigger Generative Networks for Backdoor Attacks against Object Detection

Title: Unveiling the Achilles' Heel: Backdoor Watermarking Forgery Attack in Public Dataset Protection

Title: Enhancing Instruction-Following Capability of Visual-Language Models by Reducing Image Redundancy

Title: TANGNN: a Concise, Scalable and Effective Graph Neural Networks with Top-m Attention Mechanism for Graph Representation Learning

Title: MambaVLT: Time-Evolving Multimodal State Space Model for Vision-Language Tracking

Title: Towards Robust Evaluation of Unlearning in LLMs via Data Transformations

Title: Seed-Free Synthetic Data Generation Framework for Instruction-Tuning LLMs: A Case Study in Thai

Title: Automatic Evaluation for Text-to-image Generation: Task-decomposed Framework, Distilled Training, and Meta-evaluation Benchmark

Title: Improving Factuality of 3D Brain MRI Report Generation with Paired Image-domain Retrieval and Text-domain Augmentation

Title: SilentWood: Private Inference Over Gradient-Boosting Decision Forests

Title: AeroGen: Enhancing Remote Sensing Object Detection with Diffusion-Driven Data Generation

Title: Interactive Visual Assessment for Text-to-Image Generation Models

Title: CellPilot

Title: Enhancing Grammatical Error Detection using BERT with Cleaned Lang-8 Dataset

Title: Haar-Laplacian for directed graphs

Title: MUNBa: Machine Unlearning via Nash Bargaining

Title: Large Language Model with Region-guided Referring and Grounding for CT Report Generation

Title: Optical-Flow Guided Prompt Optimization for Coherent Video Generation

Title: Hierarchical Cross-Attention Network for Virtual Try-On

Title: NeRF Inpainting with Geometric Diffusion Prior and Balanced Score Distillation

Title: Improving Transferable Targeted Attacks with Feature Tuning Mixup

Title: Enhancing the Transferability of Adversarial Attacks on Face Recognition with Diverse Parameters Augmentation

Title: ReWind: Understanding Long Videos with Instructed Learnable Memory

Title: Reassessing Layer Pruning in LLMs: New Insights and Methods

Title: TKG-DM: Training-free Chroma Key Content Generation Diffusion Model

Title: FLD+: Data-efficient Evaluation Metric for Generative Models

Title: Transparent but Powerful: Explainability, Accuracy, and Generalizability in ADHD Detection from Social Media Data

Title: A Survey on LLM-as-a-Judge

Title: An adversarial feature learning based semantic communication method for Human 3D Reconstruction

Title: Knowledge Transfer Across Modalities with Natural Language Supervision

Title: A Scalable Approach to Covariate and Concept Drift Management via Adaptive Data Segmentation

Title: Fine-Grained Open-Vocabulary Object Recognition via User-Guided Segmentation

Title: Multi-label Sequential Sentence Classification via Large Language Model

Title: ACE: Action Concept Enhancement of Video-Language Models in Procedural Videos

Title: "All that Glitters": Approaches to Evaluations with Unreliable Model and Human Annotations

Title: Learning state and proposal dynamics in state-space models using differentiable particle filters and neural networks

Title: AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset

Title: MC-NEST -- Enhancing Mathematical Reasoning in Large Language Models with a Monte Carlo Nash Equilibrium Self-Refine Tree

Title: OCDet: Object Center Detection via Bounding Box-Aware Heatmap Prediction on Edge Devices with NPUs

Title: Machine Learning-based sEMG Signal Classification for Hand Gesture Recognition

Title: Training an Open-Vocabulary Monocular 3D Object Detection Model without 3D Data

Title: Ontology-Constrained Generation of Domain-Specific Clinical Summaries

Title: Best of Both Worlds: Advantages of Hybrid Graph Sequence Models

Title: IRSKG: Unified Intrusion Response System Knowledge Graph Ontology for Cyber Defense

Title: Semantic Shield: Defending Vision-Language Models Against Backdooring and Poisoning via Fine-grained Knowledge Alignment

Title: Can a Large Language Model Learn Matrix Functions In Context?

Title: DrugAgent: Automating AI-aided Drug Discovery Programming through LLM Multi-Agent Collaboration

Title: Deep Sparse Latent Feature Models for Knowledge Graph Completion

Title: RAMIE: Retrieval-Augmented Multi-task Information Extraction with Large Language Models on Dietary Supplements

Title: Fixing the Perspective: A Critical Examination of Zero-1-to-3

Title: Nimbus: Secure and Efficient Two-Party Inference for Transformers

Title: LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training

Title: Tackling Data Heterogeneity in Federated Time Series Forecasting

Title: Chain of Attack: On the Robustness of Vision-Language Models Against Transfer-Based Adversarial Attacks

Title: OccludeNet: A Causal Journey into Mixed-View Actor-Centric Video Action Recognition under Occlusions

Title: Development of Pre-Trained Transformer-based Models for the Nepali Language

Title: AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea

Title: LTCF-Net: A Transformer-Enhanced Dual-Channel Fourier Framework for Low-Light Image Restoration

Title: Beyond Data Scarcity: A Frequency-Driven Framework for Zero-Shot Forecasting

Title: Integrating Deep Metric Learning with Coreset for Active Learning in 3D Segmentation

Title: LLM Online Spatial-temporal Signal Reconstruction Under Noise

Title: Text-Guided Coarse-to-Fine Fusion Network for Robust Remote Sensing Visual Question Answering

Title: A Method for Building Large Language Models with Predefined KV Cache Capacity

Title: Multi-Token Enhancing for Vision Representation Learning

Title: Beyond adaptive gradient: Fast-Controlled Minibatch Algorithm for large-scale optimization

Title: Data Lineage Inference: Uncovering Privacy Vulnerabilities of Dataset Pruning

Title: LoRA-Mini: Adaptation Matrices Decomposition and Selective Training

Title: LRSAA: Large-scale Remote Sensing Image Target Recognition and Automatic Annotation

Title: FastTrackTr: Towards Fast Multi-Object Tracking with Transformers

Title: Efficient and Private: Memorisation under differentially private parameter-efficient fine-tuning in language models

Title: Modality Alignment Meets Federated Broadcasting

Title: Unveil Inversion and Invariance in Flow Transformer for Versatile Image Editing

Title: Unveiling the Superior Paradigm: A Comparative Study of Source-Free Domain Adaptation and Unsupervised Domain Adaptation

Title: FedQP: Towards Accurate Federated Learning using Quadratic Programming Guided Mutation

Title: ResCLIP: Residual Attention for Training-free Dense Vision-language Inference

Title: SVTRv2: CTC Beats Encoder-Decoder Models in Scene Text Recognition

Title: Generalizable Single-view Object Pose Estimation by Two-side Generating and Matching

Title: PanoLlama: Generating Endless and Coherent Panoramas with Next-Token-Prediction LLMs

Title: Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation

Title: An Extensive Study on D2C: Overfitting Remediation in Deep Learning Using a Decentralized Approach

Title: ExAL: An Exploration Enhanced Adversarial Learning Algorithm

Title: Evaluating Large Language Models for Causal Modeling

Title: From Laws to Motivation: Guiding Exploration through Law-Based Reasoning and Rewards

Title: Enhancing Symbolic Regression and Universal Physics-Informed Neural Networks with Dimensional Analysis

Title: An AutoML-based approach for Network Intrusion Detection

Title: A Tunable Despeckling Neural Network Stabilized via Diffusion Equation

Title: Deep Learning for automated multi-scale functional field boundaries extraction using multi-date Sentinel-2 and PlanetScope imagery: Case Study of Netherlands and Pakistan

Title: Making Images from Images: Interleaving Denoising and Transformation

Title: Generative Context Distillation

Title: Segment to Recognize Robustly -- Enhancing Recognition by Image Decomposition

Title: MobileMamba: Lightweight Multi-Receptive Visual Mamba Network

Title: Understanding Machine Learning Paradigms through the Lens of Statistical Thermodynamics: A tutorial

Title: Partial Identifiability and Misspecification in Inverse Reinforcement Learning

Title: Adaptive Methods through the Lens of SDEs: Theoretical Insights on the Role of Noise

Title: Gaussian Scenes: Pose-Free Sparse-View Scene Reconstruction using Depth-Enhanced Diffusion Priors

Title: DRIVE: Dual-Robustness via Information Variability and Entropic Consistency in Source-Free Unsupervised Domain Adaptation

Title: Investigating Factuality in Long-Form Text Generation: The Roles of Self-Known and Self-Unknown

Title: Ensuring Fair LLM Serving Amid Diverse Applications

Title: Multi-ToM: Evaluating Multilingual Theory of Mind Capabilities in Large Language Models

Title: eFedLLM: Efficient LLM Inference Based on Federated Learning

Title: M3: Mamba-assisted Multi-Circuit Optimization via MBRL with Effective Scheduling

Title: TransCompressor: LLM-Powered Multimodal Data Compression for Smart Transportation

Title: Binary Search with Distributional Predictions

Title: ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration

Title: ROADS: Robust Prompt-driven Multi-Class Anomaly Detection under Domain Shift

Title: UnitedVLN: Generalizable Gaussian Splatting for Continuous Vision-Language Navigation

Title: Scaling Spike-driven Transformer with Efficient Spike Firing Approximation Training

Title: VICON: Vision In-Context Operator Networks for Multi-Physics Fluid Dynamics Prediction

Title: Soft-TransFormers for Continual Learning

Title: Geometry Distributions

Title: SAGEval: The frontiers of Satisfactory Agent based NLG Evaluation for reference-free open-ended text

Title: Debiasing Classifiers by Amplifying Bias with Latent Diffusion and Large Language Models

Title: Boosting 3D Object Generation through PBR Materials

Title: Cautious Optimizers: Improving Training with One Line of Code

Title: AI-Generated Image Quality Assessment Based on Task-Specific Prompt and Multi-Granularity Similarity

Title: Adaptive Circuit Behavior and Generalization in Mechanistic Interpretability

Title: LLMPirate: LLMs for Black-box Hardware IP Piracy

Title: LLM Augmentations to support Analytical Reasoning over Multiple Documents

Title: Med-PerSAM: One-Shot Visual Prompt Tuning for Personalized Segment Anything Model in Medical Domain

Title: DF-GNN: Dynamic Fusion Framework for Attention Graph Neural Networks on GPUs

Title: CIA: Controllable Image Augmentation Framework Based on Stable Diffusion

Title: Context Awareness Gate For Retrieval Augmented Generation

Title: Beyond Task Vectors: Selective Task Arithmetic Based on Importance Metrics

Title: DeDe: Detecting Backdoor Samples for SSL Encoders via Decoders

Title: Graph Adapter of EEG Foundation Models for Parameter Efficient Fine Tuning

Title: VideoOrion: Tokenizing Object Dynamics in Videos

Title: MVGenMaster: Scaling Multi-View Generation from Any Image via 3D Priors Enhanced Diffusion Model

Title: MixPE: Quantization and Hardware Co-design for Efficient LLM Inference

Title: Sparse patches adversarial attacks via extrapolating point-wise information

Title: Text-to-Image Synthesis: A Decade Survey

Title: BadSFL: Backdoor Attack against Scaffold Federated Learning

Title: Local and Global Feature Attention Fusion Network for Face Recognition

Title: CARE Transformer: Mobile-Friendly Linear Visual Transformer via Decoupled Dual Interaction

Title: Image Generation Diversity Issues and How to Tame Them

Title: U2NeRF: Unsupervised Underwater Image Restoration and Neural Radiance Fields

Title: SALOVA: Segment-Augmented Long Video Assistant for Targeted Retrieval and Routing in Long-Form Video Analysis

Title: Any3DIS: Class-Agnostic 3D Instance Segmentation by 2D Mask Tracking

Title: Fancy123: One Image to High-Quality 3D Mesh Generation via Plug-and-Play Deformation

Title: Learn from Foundation Model: Fruit Detection Model without Manual Annotation

Title: Interpreting Object-level Foundation Models via Visual Precision Search

Title: VIRES: Video Instance Repainting with Sketch and Text Guidance

Title: Video-Text Dataset Construction from Multi-AI Feedback: Promoting Weak-to-Strong Preference Learning for Video Large Language Models

Title: MH-MoE: Multi-Head Mixture-of-Experts

Title: Can Encrypted Images Still Train Neural Networks? Investigating Image Information and Random Vortex Transformation

Title: SAVEn-Vid: Synergistic Audio-Visual Integration for Enhanced Understanding in Long Video Context

Title: SMGDiff: Soccer Motion Generation using diffusion probabilistic models

Title: Weakly supervised image segmentation for defect-based grading of fresh produce

Title: DoubleCCA: Improving Foundation Model Group Robustness with Random Sentence Embeddings

Title: CS-Eval: A Comprehensive Large Language Model Benchmark for CyberSecurity

Title: Transparent Neighborhood Approximation for Text Classifier Explanation

Title: NormXLogit: The Head-on-Top Never Lies

Title: Open-Vocabulary Octree-Graph for 3D Scene Understanding

Title: Unraveling Arithmetic in Large Language Models: The Role of Algebraic Structures

Title: Even Sparser Graph Transformers

Title: Utilizing Uncertainty in 2D Pose Detectors for Probabilistic 3D Human Mesh Recovery

Title: A Performance Increment Strategy for Semantic Segmentation of Low-Resolution Images from Damaged Roads

Title: Evaluating Rank-N-Contrast: Continuous and Robust Representations for Regression

Title: BayLing 2: A Multilingual Large Language Model with Efficient Language Alignment

Title: DiffDesign: Controllable Diffusion with Meta Prior for Efficient Interior Design Generation

Title: Understanding Generalization of Federated Learning: the Trade-off between Model Stability and Optimization

Title: An End-to-End Robust Point Cloud Semantic Segmentation Network with Single-Step Conditional Diffusion Models

Title: Functionality understanding and segmentation in 3D scenes

Title: Monocular Lane Detection Based on Deep Learning: A Survey

Title: One Diffusion to Generate Them All

Title: CutS3D: Cutting Semantics in 3D for 2D Unsupervised Instance Segmentation

Title: Can AI grade your essays? A comparative analysis of large language models and teacher ratings in multidimensional essay scoring

Title: Preference Optimization for Reasoning with Pseudo Feedback

Title: Towards Foundation Models for Critical Care Time Series

Title: Machine learning for cerebral blood vessels' malformations

Title: A Review of Bayesian Uncertainty Quantification in Deep Probabilistic Image Segmentation

Title: Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing

Title: FineWeb-zhtw: Scalable Curation of Traditional Chinese Text Data from the Web

Title: Human-Calibrated Automated Testing and Validation of Generative Language Models

Title: A Survey of Blockchain-Based Privacy Applications: An Analysis of Consent Management and Self-Sovereign Identity Approaches

Title: Synthesising Handwritten Music with GANs: A Comprehensive Evaluation of CycleWGAN, ProGAN, and DCGAN

Title: A Study on Unsupervised Domain Adaptation for Semantic Segmentation in the Era of Vision-Language Models

Title: Unsupervised Event Outlier Detection in Continuous Time

Title: Finding Structure in Language Models

Title: Privacy Protection in Personalized Diffusion Models via Targeted Cross-Attention Adversarial Attack

Title: AnonyNoise: Anonymizing Event Data with Smart Noise to Outsmart Re-Identification and Preserve Privacy

Title: TIFeD: a Tiny Integer-based Federated learning algorithm with Direct feedback alignment

Title: VQ-SGen: A Vector Quantized Stroke Representation for Sketch Generation

Title: Learning by Analogy: Enhancing Few-Shot Prompting for Math Word Problem Solving with Computational Graph-Based Retrieval

Title: On the Reconstruction of Training Data from Group Invariant Networks

Title: No Identity, no problem: Motion through detection for people tracking

Title: Efficient Video Face Enhancement with Enhanced Spatial-Temporal Consistency

Title: Distributed Online Optimization with Stochastic Agent Availability

Title: Distributed, communication-efficient, and differentially private estimation of KL divergence

Title: Deformable Mamba for Wide Field of View Segmentation

Title: AtomR: Atomic Operator-Empowered Large Language Models for Heterogeneous Knowledge Reasoning

Title: Multi-Resolution Generative Modeling of Human Motion from Limited Data

Title: Interpreting Language Reward Models via Contrastive Explanations

Title: Noise Diffusion for Enhancing Semantic Faithfulness in Text-to-Image Synthesis

Title: All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages

Title: Guarding the Gate: ConceptGuard Battles Concept-Level Backdoors in Concept Bottleneck Models

Title: Curator Attack: When Blackbox Differential Privacy Auditing Loses Its Power

Title: Poster: From Fort to Foe: The Threat of RCE in RPKI

Title: LaB-RAG: Label Boosted Retrieval Augmented Generation for Radiology Report Generation

Title: Fundamental Limits of Prompt Tuning Transformers: Universality, Capacity and Efficiency

Title: Profiling Bias in LLMs: Stereotype Dimensions in Contextual Word Embeddings

Title: Transformers are Deep Optimizers: Provable In-Context Learning for Deep Model Training

Title: Representation Collapsing Problems in Vector Quantization

Title: Generating Out-Of-Distribution Scenarios Using Language Models

Title: Enhancing Few-Shot Learning with Integrated Data and GAN Model Approaches

Title: J-CaPA: Joint Channel and Pyramid Attention Improves Medical Image Segmentation

Title: Rethinking Diffusion for Text-Driven Human Motion Generation

Title: Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision

Title: Adversarial Attacks for Drift Detection

Title: Unlocking The Potential of Adaptive Attacks on Diffusion-Based Purification

Title: Chat2SVG: Vector Graphics Generation with Large Language Models and Image Diffusion Models

Title: Recent Trends in Linear Text Segmentation: a Survey

Title: GeoFormer: A Multi-Polygon Segmentation Transformer

Title: StructFormer: Document Structure-based Masked Attention and its Impact on Language Model Pre-Training

Title: Preventing Jailbreak Prompts as Malicious Tools for Cybercriminals: A Cyber Defense Perspective

Title: Exploring Discrete Flow Matching for 3D De Novo Molecule Generation

Title: Self-Generated Critiques Boost Reward Modeling for Language Models

Title: DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation

Title: Diffusion Features for Zero-Shot 6DoF Object Pose Estimation

Title: Do Large Language Models Perform Latent Multi-Hop Reasoning without Exploiting Shortcuts?

Title: Quark: Real-time, High-resolution, and General Neural View Synthesis

Title: Factorized Visual Tokenization and Generation

Title: Generative Omnimatte: Learning to Decompose Video into Layers