Usage instructions: here
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-06-10 | IllumiNeRF: 3D Relighting without Inverse Rendering | Xiaoming Zhao et.al. | 2406.06527 | null |
2024-06-10 | Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation | Peize Sun et.al. | 2406.06525 | link |
2024-06-10 | NaRCan: Natural Refined Canonical Image with Integration of Diffusion Prior for Video Editing | Ting-Hsuan Chen et.al. | 2406.06523 | null |
2024-06-10 | Monkey See, Monkey Do: Harnessing Self-attention in Motion Diffusion for Zero-shot Motion Transfer | Sigal Raab et.al. | 2406.06508 | link |
2024-06-10 | Rephasing spectral diffusion in time-bin spin-spin entanglement protocols | Mehmet T. Uysal et.al. | 2406.06497 | null |
2024-06-10 | Probing the Heights and Depths of Y Dwarf Atmospheres: A Retrieval Analysis of the JWST Spectral Energy Distribution of WISE J035934.06 | Harshil Kothari et.al. | 2406.06493 | null |
2024-06-10 | AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction | Zhen Xing et.al. | 2406.06465 | null |
2024-06-10 | Cometh: A continuous-time discrete-state graph diffusion model | Antoine Siraudin et.al. | 2406.06449 | null |
2024-06-10 | QSSEP describes the fluctuations of quantum coherences in the Anderson model | Ludwig Hruza et.al. | 2406.06444 | null |
2024-06-10 | Margin-aware Preference Optimization for Aligning Diffusion Models without Reference | Jiwoo Hong et.al. | 2406.06424 | null |
2024-06-07 | DVOS: Self-Supervised Dense-Pattern Video Object Segmentation | Keyhan Najafian et.al. | 2406.05131 | null |
2024-06-07 | Ohms law lost and regained: observation and impact of zeros and poles | Krishna Joshi et.al. | 2406.05112 | null |
2024-06-07 | Large Generative Graph Models | Yu Wang et.al. | 2406.05109 | null |
2024-06-07 | CoNo: Consistency Noise Injection for Tuning-free Long Video Diffusion | Xingrui Wang et.al. | 2406.05082 | null |
2024-06-07 | GenHeld: Generating and Editing Handheld Objects | Chaerin Min et.al. | 2406.05059 | link |
2024-06-07 | Digital Twins of the EM Environment: Benchmark for Ray Launching Models | Michele Zhu et.al. | 2406.05042 | link |
2024-06-07 | Efficient 3D Shape Generation via Diffusion Mamba with Bidirectional SSMs | Shentong Mo et.al. | 2406.05038 | null |
2024-06-07 | Linear stability analysis for a system of singular amplitude equations arising in biomorphology | Aric Wheeler et.al. | 2406.05037 | null |
2024-06-07 | Generative diffusion models for synthetic trajectories of heavy and light particles in turbulence | Tianyi Li et.al. | 2406.05008 | null |
2024-06-07 | CityCraft: A Real Crafter for 3D City Generation | Jie Deng et.al. | 2406.04983 | null |
2024-06-06 | GLACE: Global Local Accelerated Coordinate Encoding | Fangjinhua Wang et.al. | 2406.04340 | link |
2024-06-07 | Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion | Fangfu Liu et.al. | 2406.04338 | null |
2024-06-06 | Coherent Zero-Shot Visual Instruction Generation | Quynh Phung et.al. | 2406.04337 | null |
2024-06-06 | BitsFusion: 1.99 bits Weight Quantization of Diffusion Model | Yang Sui et.al. | 2406.04333 | link |
2024-06-06 | Simplified and Generalized Masked Diffusion for Discrete Data | Jiaxin Shi et.al. | 2406.04329 | null |
2024-06-06 | SF-V: Single Forward Video Generation Model | Zhixing Zhang et.al. | 2406.04324 | null |
2024-06-06 | ATraDiff: Accelerating Online Reinforcement Learning with Imaginary Trajectories | Qianlan Yang et.al. | 2406.04323 | null |
2024-06-07 | DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data | Qihao Liu et.al. | 2406.04322 | link |
2024-06-06 | Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step | Zhanhao Liang et.al. | 2406.04314 | null |
2024-06-06 | ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization | Luca Eyring et.al. | 2406.04312 | link |
2024-06-05 | Text-to-Events: Synthetic Event Camera Streams from Conditional Text Input | Joachim Ott et.al. | 2406.03439 | null |
2024-06-05 | Non-stationary Spatio-Temporal Modeling Using the Stochastic Advection-Diffusion Equation | Martin Outzen Berild et.al. | 2406.03400 | link |
2024-06-05 | Reparameterization invariance in approximate Bayesian inference | Hrittik Roy et.al. | 2406.03334 | null |
2024-06-05 | UDQL: Bridging The Gap between MSE Loss and The Optimal Value Function in Offline Reinforcement Learning | Yu Zhang et.al. | 2406.03324 | null |
2024-06-05 | Text-to-Image Rectified Flow as Plug-and-Play Priors | Xiaofeng Yang et.al. | 2406.03293 | link |
2024-06-05 | Relative Entropy for the Numerical Diffusive Limit of the Linear Jin-Xin System | Marianne Bessemoulin-Chatard et.al. | 2406.03268 | null |
2024-06-05 | Generative Diffusion Models for Fast Simulations of Particle Collisions at CERN | Mikołaj Kita et.al. | 2406.03233 | null |
2024-06-05 | Holographic drag force with translational symmetry breaking | Sara Tahery et.al. | 2406.03220 | null |
2024-06-05 | Searching Priors Makes Text-to-Video Synthesis Better | Haoran Cheng et.al. | 2406.03215 | null |
2024-06-05 | Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion | Hao Wen et.al. | 2406.03184 | link |
2024-06-04 | Dreamguider: Improved Training free Diffusion-based Conditional Generation | Nithin Gopalakrishnan Nair et.al. | 2406.02549 | null |
2024-06-05 | Enhancing Temporal Consistency in Video Editing by Reconstructing Videos with 3D Gaussian Splatting | Inkyu Shin et.al. | 2406.02541 | null |
2024-06-04 | ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation | Tianchen Zhao et.al. | 2406.02540 | null |
2024-06-04 | CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation | Dejia Xu et.al. | 2406.02509 | null |
2024-06-04 | Guiding a Diffusion Model with a Bad Version of Itself | Tero Karras et.al. | 2406.02507 | null |
2024-06-04 | Tensor Network Space-Time Spectral Collocation Method for Solving the Nonlinear Convection Diffusion Equation | Dibyendu Adak et.al. | 2406.02505 | null |
2024-06-04 | Singular Subspace Perturbation Bounds via Rectangular Random Matrix Diffusions | Peiyao Lai et.al. | 2406.02502 | null |
2024-06-04 | Stable-Pose: Leveraging Transformers for Pose-Guided Text-to-Image Generation | Jiajun Wang et.al. | 2406.02485 | link |
2024-06-04 | Inpainting Pathology in Lumbar Spine MRI with Latent Diffusion | Colin Hansen et.al. | 2406.02477 | null |
2024-06-04 | Learning Image Priors through Patch-based Diffusion Models for Solving Inverse Problems | Jason Hu et.al. | 2406.02462 | null |
2024-05-31 | Mixed Diffusion for 3D Indoor Scene Synthesis | Siyi Hu et.al. | 2405.21066 | link |
2024-05-31 | Unified Directly Denoising for Both Variance Preserving and Variance Exploding Diffusion Models | Jingjing Wang et.al. | 2405.21059 | null |
2024-05-31 | Spectrum-Aware Parameter Efficient Fine-Tuning for Diffusion Models | Xinxi Zhang et.al. | 2405.21050 | null |
2024-05-31 | Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling | Jiatao Gu et.al. | 2405.21048 | null |
2024-05-31 | Beyond Conventional Parametric Modeling: Data-Driven Framework for Estimation and Prediction of Time Activity Curves in Dynamic PET Imaging | Niloufar Zakariaei et.al. | 2405.21021 | null |
2024-05-31 | Amortizing intractable inference in diffusion models for vision, language, and control | Siddarth Venkatraman et.al. | 2405.20971 | link |
2024-06-03 | Large Language Models are Zero-Shot Next Location Predictors | Ciro Beneduce et.al. | 2405.20962 | link |
2024-05-31 | Search of extended emission from HESS J1702-420 with eROSITA | Denys Malyshev et.al. | 2405.20927 | null |
2024-05-31 | Flow matching achieves minimax optimal convergence | Kenji Fukumizu et.al. | 2405.20879 | null |
2024-05-31 | MegActor: Harness the Power of Raw Video for Vivid Portrait Animation | Shurong Yang et.al. | 2405.20851 | link |
2024-05-30 | Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image | Kailu Wu et.al. | 2405.20343 | link |
2024-05-30 | OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving | Lening Wang et.al. | 2405.20337 | link |
2024-05-30 | VividDream: Generating 3D Scene with Ambient Dynamics | Yao-Chih Lee et.al. | 2405.20334 | null |
2024-05-30 | MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion | Shuyuan Tu et.al. | 2405.20325 | link |
2024-05-30 | Don't drop your samples! Coherence-aware training benefits Conditional diffusion | Nicolas Dufour et.al. | 2405.20324 | null |
2024-05-30 | Improving the Training of Rectified Flows | Sangyun Lee et.al. | 2405.20320 | link |
2024-05-30 | DITTO-2: Distilled Diffusion Inference-Time T-Optimization for Music Generation | Zachary Novack et.al. | 2405.20289 | null |
2024-05-30 | SemFlow: Binding Semantic Segmentation and Image Synthesis via Rectified Flow | Chaoyang Wang et.al. | 2405.20282 | link |
2024-05-30 | CV-VAE: A Compatible Video VAE for Latent Generative Video Models | Sijie Zhao et.al. | 2405.20279 | link |
2024-05-31 | KerasCV and KerasNLP: Vision and Language Power-Ups | Matthew Watson et.al. | 2405.20247 | null |
2024-05-29 | X-VILA: Cross-Modality Alignment for Large Language Model | Hanrong Ye et.al. | 2405.19335 | null |
2024-05-29 | Hilbert Space Diffusion in Systems with Approximate Symmetries | Rahel L. Baumgartner et.al. | 2405.19260 | null |
2024-05-29 | Weak Generative Sampler to Efficiently Sample Invariant Distribution of Stochastic Differential Equation | Zhiqiang Cai et.al. | 2405.19256 | null |
2024-05-29 | ConceptPrune: Concept Editing in Diffusion Models via Skilled Neuron Pruning | Ruchika Chavhan et.al. | 2405.19237 | link |
2024-05-29 | Pseudo-Gevrey Smoothing for the Passive Scalar Equations near Couette | Jacob Bedrossian et.al. | 2405.19233 | null |
2024-05-29 | DiPPeST: Diffusion-based Path Planner for Synthesizing Trajectories Applied on Quadruped Robots | Maria Stamatopoulou et.al. | 2405.19232 | null |
2024-05-29 | Contrastive-Adversarial and Diffusion: Exploring pre-training and fine-tuning strategies for sulcal identification | Michail Mamalakis et.al. | 2405.19204 | null |
2024-05-30 | | Weitian Zhang et.al. | 2405.19203 | null |
2024-05-29 | Going beyond compositional generalization, DDPMs can produce zero-shot interpolation | Justin Deschenaux et.al. | 2405.19201 | link |
2024-05-29 | Diffusion-based Dynamics Models for Long-Horizon Rollout in Offline Reinforcement Learning | Hanye Zhao et.al. | 2405.19189 | link |
2024-05-28 | On the Origin of Llamas: Model Tree Heritage Recovery | Eliahu Horwitz et.al. | 2405.18432 | link |
2024-05-28 | DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention | Lianghui Zhu et.al. | 2405.18428 | link |
2024-05-28 | Phased Consistency Model | Fu-Yun Wang et.al. | 2405.18407 | null |
2024-05-28 | RACCooN: Remove, Add, and Change Video Content with Auto-Generated Narratives | Jaehong Yoon et.al. | 2405.18406 | link |
2024-05-28 | Short-time Fokker-Planck propagator beyond the Gaussian approximation | Julian Kappler et.al. | 2405.18381 | null |
2024-05-28 | A Hessian-Aware Stochastic Differential Equation for Modelling SGD | Xiang Li et.al. | 2405.18373 | null |
2024-05-28 | Simulating infinite-dimensional nonlinear diffusion bridges | Gefan Yang et.al. | 2405.18353 | link |
2024-05-28 | VITON-DiT: Learning In-the-Wild Video Try-On from Human Dance Videos via Diffusion Transformers | Jun Zheng et.al. | 2405.18326 | null |
2024-05-28 | Multi-modal Generation via Cross-Modal In-Context Learning | Amandeep Kumar et.al. | 2405.18304 | link |
2024-05-28 | CT-based brain ventricle segmentation via diffusion Schrödinger Bridge without target domain ground truths | Reihaneh Teimouri et.al. | 2405.18267 | null |
2024-05-27 | Collaborative Video Diffusion: Consistent Multi-video Generation with Camera Control | Zhengfei Kuang et.al. | 2405.17414 | null |
2024-05-27 | Human4DiT: Free-view Human Video Generation with 4D Diffusion Transformer | Ruizhi Shao et.al. | 2405.17405 | null |
2024-05-27 | A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training | Kai Wang et.al. | 2405.17403 | link |
2024-05-27 | RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control | Litu Rout et.al. | 2405.17401 | null |
2024-05-27 | EASI-Tex: Edge-Aware Mesh Texturing from Single Image | Sai Raj Kishore Perla et.al. | 2405.17393 | null |
2024-05-27 | Global existence, fast signal diffusion limit, and | Cordula Reisch et.al. | 2405.17392 | null |
2024-05-27 | Supernova Remnants in Gamma Rays | Andrea Giuliani et.al. | 2405.17384 | null |
2024-05-27 | Muon spin relaxation in mixed perovskite (LaAlO$_3$)$_x$(SrAl$_{0.5}$Ta$_{0.5}$O$_3$)$_{1-x}$ with | Takashi U. Ito et.al. | 2405.17371 | null |
2024-05-27 | Finite Fractal Dimension of uniform attractors for non-autonomous dynamical systems with infinite dimensional symbol space | Rafael de Oliveira Moura et.al. | 2405.17367 | null |
2024-05-27 | Emergent time crystal from a fractional Langevin equation with white and colored noise | David Santiago Quevedo et.al. | 2405.17331 | null |
2024-05-24 | Self-consistent evaluation of proximity and inverse proximity effects with pair-breaking in diffusive SN junctions | Arpit Raj et.al. | 2405.15770 | null |
2024-05-24 | FastDrag: Manipulate Anything in One Step | Xuanjia Zhao et.al. | 2405.15769 | null |
2024-05-24 | InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation | Yuchi Wang et.al. | 2405.15758 | link |
2024-05-24 | Looking Backward: Streaming Video-to-Video Translation with Feature Banks | Feng Liang et.al. | 2405.15757 | link |
2024-05-24 | Score-based generative models are provably robust: an uncertainty quantification perspective | Nikiforos Mimikos-Stamatopoulos et.al. | 2405.15754 | null |
2024-05-24 | Murray-von Neumann dimension for strictly semifinite weights | Aldo Garcia Guinto et.al. | 2405.15725 | null |
2024-05-24 | Hierarchical Uncertainty Exploration via Feedforward Posterior Trees | Elias Nehme et.al. | 2405.15719 | null |
2024-05-24 | Simulation-based inference of radio millisecond pulsars in globular clusters | Joanna Berteaud et.al. | 2405.15691 | null |
2024-05-24 | Jet Quenching of the Heavy Quarks in the Quark-Gluon Plasma and the Nonadditive Statistics | Trambak Bhattacharyya et.al. | 2405.15679 | null |
2024-05-24 | Taming Score-Based Diffusion Priors for Infinite-Dimensional Nonlinear Inverse Problems | Lorenzo Baldassari et.al. | 2405.15676 | null |
2024-05-23 | Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis | Basile Van Hoorick et.al. | 2405.14868 | null |
2024-05-23 | Improved Distribution Matching Distillation for Fast Image Synthesis | Tianwei Yin et.al. | 2405.14867 | link |
2024-05-23 | Video Diffusion Models are Training-free Motion Interpreter and Controller | Zeqi Xiao et.al. | 2405.14864 | null |
2024-05-23 | Adapting to Unknown Low-Dimensional Structures in Score-Based Diffusion Models | Gen Li et.al. | 2405.14861 | null |
2024-05-23 | Semantica: An Adaptable Image-Conditioned Diffusion Model | Manoj Kumar et.al. | 2405.14857 | null |
2024-05-23 | TerDiT: Ternary Diffusion Models with Transformers | Xudong Lu et.al. | 2405.14854 | link |
2024-05-23 | Direct3D: Scalable Image-to-3D Generation via 3D Latent Diffusion Transformer | Shuang Wu et.al. | 2405.14832 | null |
2024-05-23 | Good Seed Makes a Good Crop: Discovering Secret Seeds in Text-to-Image Diffusion Models | Katherine Xu et.al. | 2405.14828 | null |
2024-05-23 | New limits on neutrino decay from high-energy astrophysical neutrinos | Victor B. Valera et.al. | 2405.14826 | null |
2024-05-23 | PaGoDA: Progressive Growing of a One-Step Generator from a Low-Resolution Diffusion Teacher | Dongjun Kim et.al. | 2405.14822 | null |
2024-05-21 | Personalized Residuals for Concept-Driven Text-to-Image Generation | Cusuh Ham et.al. | 2405.12978 | null |
2024-05-21 | Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control | Yue Han et.al. | 2405.12970 | null |
2024-05-21 | Differential Walk on Spheres | Bailey Miller et.al. | 2405.12964 | null |
2024-05-21 | Learning the Infinitesimal Generator of Stochastic Diffusion Processes | Vladimir R. Kostic et.al. | 2405.12940 | null |
2024-05-21 | Impact of inhomogeneous diffusion on secondary cosmic ray and antiproton local spectra | Álvaro Tovar-Pardo et.al. | 2405.12918 | null |
2024-05-21 | An Empirical Study and Analysis of Text-to-Image Generation Using Large Language Model-Powered Textual Representation | Zhiyu Tan et.al. | 2405.12914 | link |
2024-05-21 | Deep HST/UVIS imaging of the candidate dark galaxy CDG-1 | Pieter van Dokkum et.al. | 2405.12907 | null |
2024-05-21 | Diffusion of brightened dark excitons in a high-angle incommensurate Moiré homobilayer | Arnab Barman Ray et.al. | 2405.12901 | null |
2024-05-21 | Diffusion-RSCC: Diffusion Probabilistic Model for Change Captioning in Remote Sensing Images | Xiaofei Yu et.al. | 2405.12875 | link |
2024-05-21 | High-Field Microscale NMR Spectroscopy with NV Centers in Dipolarly-Coupled Samples | Carlos Munuera-Javaloy et.al. | 2405.12857 | null |
2024-05-20 | Images that Sound: Composing Images and Sounds on a Single Canvas | Ziyang Chen et.al. | 2405.12221 | null |
2024-05-20 | Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices | Nathaniel Cohen et.al. | 2405.12211 | link |
2024-05-20 | Cosmic Ray Diffusion in the Turbulent Interstellar Medium: Effects of Mirror Diffusion and Pitch Angle Scattering | Lucas Barreto-Mota et.al. | 2405.12146 | null |
2024-05-20 | Two-dimensional signal-dependent parabolic-elliptic Keller-Segel system and its means field derivation | Lukas Bol et.al. | 2405.12134 | null |
2024-05-20 | An effective advection induced by oscillating microstructure in a diffusion equation | David Wiedemann et.al. | 2405.12108 | null |
2024-05-20 | Sobolev regularity theory for stochastic reaction-diffusion-advection equations with spatially homogeneous colored noises and variable-order nonlocal operators | Jae-Hwan Choi et.al. | 2405.11969 | null |
2024-05-20 | Optimal balanced-norm error estimate of the LDG method for reaction-diffusion problems II: the two-dimensional case with layer-upwind flux | Yao Cheng et.al. | 2405.11939 | null |
2024-05-20 | Nonequilbrium physics of generative diffusion models | Zhendong Yu et.al. | 2405.11932 | null |
2024-05-20 | "Set It Up!": Functional Object Arrangement with Compositional Generative Models | Yiqing Xu et.al. | 2405.11928 | null |
2024-05-20 | Diff-BGM: A Diffusion Model for Video Background Music Generation | Sizhe Li et.al. | 2405.11913 | link |
2024-05-17 | Probabilistic transfer learning methodology to expedite high fidelity simulation of reactive flows | Bruno S. Soriano et.al. | 2405.10944 | null |
2024-05-17 | Reconstruction of Manipulated Garment with Guided Deformation Prior | Ren Li et.al. | 2405.10934 | null |
2024-05-17 | Limitations of the rate-distribution formalism in describing luminescence quenching in the presence of diffusion | Jakub Jędrak et.al. | 2405.10903 | null |
2024-05-17 | Improving face generation quality and prompt following with synthetic captions | Michail Tarasiou et.al. | 2405.10864 | null |
2024-05-17 | Diffusion Geometry | Iolo Jones et.al. | 2405.10858 | null |
2024-05-17 | Some remarks on a mathematical model for water flow in porous media with competition between transport and diffusion | Judita Runcziková et.al. | 2405.10751 | null |
2024-05-17 | Deep Data Consistency: a Fast and Robust Diffusion Model-based Solver for Inverse Problems | Hanyu Chen et.al. | 2405.10748 | link |
2024-05-17 | Eddeep: Fast eddy-current distortion correction for diffusion MRI with deep learning | Antoine Legouhy et.al. | 2405.10723 | null |
2024-05-17 | Numerical Recovery of the Diffusion Coefficient in Diffusion Equations from Terminal Measurement | Bangti Jin et.al. | 2405.10708 | null |
2024-05-17 | Ratchet-mediated resetting: Current, efficiency, and exact solution | Connor Roberts et.al. | 2405.10698 | null |
2024-05-16 | Text-to-Vector Generation with Neural Path Representation | Peiying Zhang et.al. | 2405.10317 | null |
2024-05-16 | Analogist: Out-of-the-box Visual In-Context Learning with Image Diffusion Model | Zheng Gu et.al. | 2405.10316 | null |
2024-05-16 | CAT3D: Create Anything in 3D with Multi-View Diffusion Models | Ruiqi Gao et.al. | 2405.10314 | null |
2024-05-16 | Societal Adaptation to Advanced AI | Jamie Bernardi et.al. | 2405.10295 | null |
2024-05-16 | Power-law relaxation of a confined diffusing particle subject to resetting with memory | Denis Boyer et.al. | 2405.10283 | null |
2024-05-16 | Interplay between Domain Walls in Type-II Superconductors and Gradients of Temperature/Spin Density | Takuma Kanakubo et.al. | 2405.10200 | null |
2024-05-16 | Fixed points of maps and nontrivial weak solutions to a class of nonlinear strongly coupled elliptic systems | Dung Le et.al. | 2405.10171 | null |
2024-05-16 | Generating Coherent Sequences of Visual Illustrations for Real-World Manual Tasks | João Bordalo et.al. | 2405.10122 | null |
2024-05-16 | Advancing Set-Conditional Set Generation: Graph Diffusion for Fast Simulation of Reconstructed Particles | Dmitrii Kobylianskii et.al. | 2405.10106 | null |
2024-05-16 | Spurious reconstruction from brain activity | Ken Shirakawa et.al. | 2405.10078 | link |
2024-05-15 | MMFusion: Multi-modality Diffusion Model for Lymph Node Metastasis Diagnosis in Esophageal Cancer | Chengyu Wu et.al. | 2405.09539 | link |
2024-05-15 | A velocity-based moving mesh Discontinuous Galerkin method for the advection-diffusion equation | Ezra Rozier et.al. | 2405.09408 | null |
2024-05-15 | Probing particle acceleration in Abell 2256: from to 16 MHz to gamma rays | E. Osinga et.al. | 2405.09384 | null |
2024-05-15 | Diffusion-based Contrastive Learning for Sequential Recommendation | Ziqiang Cui et.al. | 2405.09369 | null |
2024-05-15 | DeCoDEx: Confounder Detector Guidance for Improved Diffusion-based Counterfactual Explanations | Nima Fathi et.al. | 2405.09288 | link |
2024-05-15 | Searches for Galactic Neutrinos with the IceCube Neutrino observatory | A. Sandrock et.al. | 2405.09267 | null |
2024-05-15 | Dance Any Beat: Blending Beats with Visuals in Dance Video Generation | Xuanchen Wang et.al. | 2405.09266 | null |
2024-05-15 | Exact analysis of the two-dimensional asymmetric simple exclusion process with attachment and detachment of particles | Yuki Ishiguro et.al. | 2405.09261 | null |
2024-05-15 | Propagation of chaos for moderately interacting particle systems related to singular kinetic Mckean-Vlasov SDEs | Zimo Hao et.al. | 2405.09195 | null |
2024-05-15 | QMedShield: A Novel Quantum Chaos-based Image Encryption Scheme for Secure Medical Image Storage in the Cloud | Arun Amaithi Rajan et.al. | 2405.09191 | null |
2024-05-14 | The Flux Hypothesis for Odd Transport Phenomena | Cory Hargus et.al. | 2405.08798 | null |
2024-05-14 | A Generalized Curvilinear Coordinate system-based Patch Dynamics Scheme in Equation-free Multiscale Modelling | Tanay Kumar Karmakar et.al. | 2405.08764 | null |
2024-05-14 | Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding | Zhimin Li et.al. | 2405.08748 | link |
2024-05-14 | Dimensionality reduction in bulk-boundary reaction-diffusion systems | Tom Burkart et.al. | 2405.08728 | null |
2024-05-14 | Design and Analysis of Resilient Vehicular Platoon Systems over Wireless Networks | Tingyu Shui et.al. | 2405.08706 | null |
2024-05-14 | Expensive Multi-Objective Bayesian Optimization Based on Diffusion Models | Bingdong Li et.al. | 2405.08674 | null |
2024-05-14 | Quantum Circuit Model for Lattice Boltzmann Fluid Flow Simulations | Dinesh Kumar E et.al. | 2405.08669 | null |
2024-05-14 | Anomalous Landau damping and algebraic thermalization in two-dimensional superfluids far from equilibrium | Clément Duval et.al. | 2405.08606 | null |
2024-05-14 | PTPI-DL-ROMs: pre-trained physics-informed deep learning-based reduced order models for nonlinear parametrized PDEs | Simone Brivio et.al. | 2405.08558 | null |
2024-05-14 | | Pedro De la Torre Luque et.al. | 2405.08482 | null |
2024-05-13 | Cloaking for random walks using a discrete potential theory | Trent DeGiovanni et.al. | 2405.07961 | link |
2024-05-13 | Stable Diffusion-based Data Augmentation for Federated Learning with Non-IID Data | Mahdi Morafah et.al. | 2405.07925 | null |
2024-05-13 | CTRLorALTer: Conditional LoRAdapter for Efficient 0-Shot Control & Altering of T2I Models | Nick Stracke et.al. | 2405.07913 | null |
2024-05-13 | Latest results from Super-Kamiokande | Andrew D. Santos et.al. | 2405.07900 | null |
2024-05-13 | Improving Breast Cancer Grade Prediction with Multiparametric MRI Created Using Optimized Synthetic Correlated Diffusion Imaging | Chi-en Amy Tai et.al. | 2405.07861 | null |
2024-05-13 | Radiogenomic biomarkers for immunotherapy in glioblastoma: A systematic review of magnetic resonance imaging studies | Prajwal Ghimire et.al. | 2405.07858 | null |
2024-05-13 | Using Multiparametric MRI with Optimized Synthetic Correlated Diffusion Imaging to Enhance Breast Cancer Pathologic Complete Response Prediction | Chi-en Amy Tai et.al. | 2405.07854 | null |
2024-05-13 | SAR Image Synthesis with Diffusion Models | Denisa Qosja et.al. | 2405.07776 | null |
2024-05-13 | LGDE: Local Graph-based Dictionary Expansion | Dominik J. Schindler et.al. | 2405.07764 | link |
2024-05-13 | FastSAG: Towards Fast Non-Autoregressive Singing Accompaniment Generation | Jianyi Chen et.al. | 2405.07682 | null |
2024-05-10 | OneTo3D: One Image to Re-editable Dynamic 3D Model and Video Generation | Jinwei Lin et.al. | 2405.06547 | link |
2024-05-10 | Controllable Image Generation With Composed Parallel Token Prediction | Jamie Stirling et.al. | 2405.06535 | null |
2024-05-10 | SketchDream: Sketch-based Text-to-3D Generation and Editing | Feng-Lin Liu et.al. | 2405.06461 | null |
2024-05-10 | A universal phenomenology of charge-spin interconversion and dynamics in diffusive systems with spin-orbit coupling | Tim Kokkeler et.al. | 2405.06334 | null |
2024-05-10 | PUMA: margin-based data pruning | Javier Maroto et.al. | 2405.06298 | null |
2024-05-10 | Green's Function and Pointwise Space-time Behaviors of the Three-Dimensional Relativistic Boltzmann Equation | Yanchao Li et.al. | 2405.06280 | null |
2024-05-10 | Prior-guided Diffusion Model for Cell Segmentation in Quantitative Phase Imaging | Zhuchen Shao et.al. | 2405.06175 | null |
2024-05-10 | Integrability-preserving regularizations of Laplacian Growth | Razvan Teodorescu et.al. | 2405.06167 | null |
2024-05-10 | Dispersivity calculation in digital twins of multiscale porous materials using the micro-continuum approach | Julien Maes et.al. | 2405.06155 | null |
2024-05-09 | Modelling the random spreading of fake news through a two-dimensional time-inhomogeneous birth-death process | Antonio Di Crescenzo et.al. | 2405.06123 | null |
2024-05-09 | Distilling Diffusion Models into Conditional GANs | Minguk Kang et.al. | 2405.05967 | null |
2024-05-09 | Towards comprehensive coverage of chemical space: Quantum mechanical properties of 836k constitutional and conformational closed shell neutral isomers consisting of HCNOFSiPSClBr | Danish Khan et.al. | 2405.05961 | null |
2024-05-09 | Self-Supervised Learning of Time Series Representation via Diffusion Process and Imputation-Interpolation-Forecasting Mask | Zineb Senane et.al. | 2405.05959 | link |
2024-05-09 | Frame Interpolation with Consecutive Brownian Bridge Diffusion | Zonglin Lyu et.al. | 2405.05953 | null |
2024-05-09 | Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers | Peng Gao et.al. | 2405.05945 | link |
2024-05-09 | Composable Part-Based Manipulation | Weiyu Liu et.al. | 2405.05876 | null |
2024-05-09 | Parameter identification for an uncertain reaction-diffusion equation via setpoint regulation | Gildas Besançon et.al. | 2405.05866 | null |
2024-05-09 | Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control | Gunshi Gupta et.al. | 2405.05852 | link |
2024-05-09 | Could It Be Generated? Towards Practical Analysis of Memorization in Text-To-Image Diffusion Models | Zhe Ma et.al. | 2405.05846 | null |
2024-05-09 | MSDiff: Multi-Scale Diffusion Model for Ultra-Sparse View CT Reconstruction | Pinhuang Tan et.al. | 2405.05814 | null |
2024-05-08 | Diffusion-HMC: Parameter Inference with Diffusion Model driven Hamiltonian Monte Carlo | Nayantara Mudur et.al. | 2405.05255 | link |
2024-05-08 | Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models | Hongjie Wang et.al. | 2405.05252 | null |
2024-05-08 | Imagine Flash: Accelerating Emu Diffusion Models with Backward Distillation | Jonas Kohler et.al. | 2405.05224 | null |
2024-05-08 | FinePOSE: Fine-Grained Prompt-Driven 3D Human Pose Estimation via Diffusion Models | Jinglin Xu et.al. | 2405.05216 | link |
2024-05-08 | An adaptive finite element multigrid solver using GPU acceleration | Manuel Liebchen et.al. | 2405.05047 | null |
2024-05-08 | Reviewing Intelligent Cinematography: AI research for camera-based video production | Adrian Azzarelli et.al. | 2405.05039 | null |
2024-05-08 | Monitoring of neoadjuvant chemotherapy through time domain diffuse optics: Breast tissue composition changes and collagen discriminative potential | Nikhitha Mule et.al. | 2405.05035 | null |
2024-05-08 | An anti-noise seismic inversion method based on diffusion model | Yingtian Liu et.al. | 2405.05026 | null |
2024-05-08 | Stochastic spatial Lotka-Volterra predator-prey models | Uwe C. Täuber et.al. | 2405.05006 | null |
2024-05-08 | A unified theory of the self-similar supersonic Marshak wave problem | Menahem Krief et.al. | 2405.04981 | null |
2024-05-07 | Tactile-Augmented Radiance Fields | Yiming Dou et.al. | 2405.04534 | link |
2024-05-07 | Edit-Your-Motion: Space-Time Diffusion Decoupling Learning for Video Motion Editing | Yi Zuo et.al. | 2405.04496 | null |
2024-05-07 | CloudDiff: Super-resolution ensemble retrieval of cloud properties for all day using the generative diffusion model | Haixia Xiao et.al. | 2405.04483 | null |
2024-05-07 | Derivation of kinetic and diffusion equations from a hard-sphere Rayleigh gas using collision trees and semigroups | Karsten Matthies et.al. | 2405.04449 | null |
2024-05-07 | Brownian Motion on The Spider Like Quantum Graphs | Madhumita Paul et.al. | 2405.04439 | null |
2024-05-07 | Learning local Dirichlet-to-Neumann maps of nonlinear elliptic PDEs with rough coefficients | Miranda Boutilier et.al. | 2405.04433 | null |
2024-05-07 | Josephson threshold detector in the phase diffusion regime | Dmitry A. Ladeynov et.al. | 2405.04426 | null |
2024-05-07 | Mathematical Modeling of $^{18}$F-Fluoromisonidazole ( | Mohammad Amin Abazari et.al. | 2405.04418 | null |
2024-05-07 | Community Detection for Heterogeneous Multiple Social Networks | Ziqing Zhu et.al. | 2405.04371 | null |
2024-05-07 | Diff-IP2D: Diffusion-Based Hand-Object Interaction Prediction on Egocentric Videos | Junyi Ma et.al. | 2405.04370 | link |
2024-05-06 | An Empty Room is All We Want: Automatic Defurnishing of Indoor Panoramas | Mira Slavcheva et.al. | 2405.03682 | null |
2024-05-06 | Field-of-View Extension for Diffusion MRI via Deep Generative Models | Chenyu Gao et.al. | 2405.03652 | null |
2024-05-06 | Cosine Annealing Optimized Denoising Diffusion Error Correction Codes | Congyang Ou et.al. | 2405.03638 | null |
2024-05-06 | Strang Splitting for Parametric Inference in Second-order Stochastic Differential Equations | Predrag Pilipovic et.al. | 2405.03606 | null |
2024-05-06 | Dissipative gradient nonlinearities prevent | Tongxing Li et.al. | 2405.03586 | null |
2024-05-06 | Bridging discrete and continuous state spaces: Exploring the Ehrenfest process in time-continuous diffusion models | Ludwig Winkler et.al. | 2405.03549 | null |
2024-05-06 | CCDM: Continuous Conditional Diffusion Models for Image Generation | Xin Ding et.al. | 2405.03546 | link |
2024-05-06 | Asymptotic-preserving hybridizable discontinuous Galerkin method for the Westervelt quasilinear wave equation | Sergio Gómez et.al. | 2405.03535 | null |
2024-05-06 | Quasi-Monte Carlo for Bayesian design of experiment problems governed by parametric PDEs | Vesa Kaarnioja et.al. | 2405.03529 | null |
2024-05-06 | On anomalous dissipation induced by transport noise | Antonio Agresti et.al. | 2405.03525 | null |
2024-05-03 | DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos | Wen-Hsuan Chu et.al. | 2405.02280 | link |
2024-05-03 | Relic gravitons and non-stationary processes | Massimo Giovannini et.al. | 2405.02193 | null |
2024-05-03 | Tangentially Active Polymers in Cylindrical Channels | José Martín-Roca et.al. | 2405.02192 | null |
2024-05-03 | Characterized Diffusion and Spatial-Temporal Interaction Network for Trajectory Prediction in Autonomous Driving | Haicheng Liao et.al. | 2405.02145 | null |
2024-05-03 | Global regularity and infinite Prandtl number limit of temperature patches for the 2D Boussinesq system | Omar Lazar et.al. | 2405.02137 | null |
2024-05-03 | Multi-grid reaction-diffusion master equation: applications to morphogen gradient modelling | Radek Erban et.al. | 2405.02117 | null |
2024-05-03 | On variable annuities with surrender charges | Tiziano De Angelis et.al. | 2405.02115 | null |
2024-05-03 | Anomalous transport in the quantum East-West kinetically constrained model | Pietro Brighi et.al. | 2405.02102 | null |
2024-05-03 | Radiative and mechanical energies in galaxies I. Contributions of molecular shocks and PDRs in 3C 326 N | J. A. Villa-Vélez et.al. | 2405.02058 | null |
2024-05-03 | The CO-dark molecular gas in the cold HI arc | Gan Luo et.al. | 2405.02055 | null |
2024-05-02 | Customizing Text-to-Image Models with a Single Image Pair | Maxwell Jones et.al. | 2405.01536 | null |
2024-05-02 | The heat equation with time-correlated random potential in d=2: Edwards-Wilkinson fluctuations | Sotirios Kotitsas et.al. | 2405.01519 | null |
2024-05-02 | Effective Lifshitz black holes, hydrodynamics, and transport coefficients in fluid/gravity correspondence | D. C. Moreira et.al. | 2405.01505 | null |
2024-05-02 | LocInv: Localization-aware Inversion for Text-Guided Image Editing | Chuanming Tang et.al. | 2405.01496 | link |
2024-05-02 | Navigating Heterogeneity and Privacy in One-Shot Federated Learning with Diffusion Models | Matias Mendieta et.al. | 2405.01494 | null |
2024-05-02 | StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation | Yupeng Zhou et.al. | 2405.01434 | link |
2024-05-02 | In-and-Out: Algorithmic Diffusion for Sampling Convex Bodies | Yunbum Kook et.al. | 2405.01425 | null |
2024-05-02 | Statistical algorithms for low-frequency diffusion data: A PDE approach | Matteo Giordano et.al. | 2405.01372 | link |
2024-05-02 | On Nanowire Morphological Instability and Pinch-Off by Surface Electromigration | Mikhail Khenner et.al. | 2405.01331 | null |
2024-05-02 | DiffusionPipe: Training Large Diffusion Models with Efficient Pipelines | Ye Tian et.al. | 2405.01248 | null |
2024-05-01 | TexSliders: Diffusion-Based Texture Editing in CLIP Space | Julia Guerrero-Viu et.al. | 2405.00672 | null |
2024-05-01 | RGB | Zheng Zeng et.al. | 2405.00666 | null |
2024-05-01 | Large deviations of current for the symmetric simple exclusion process on a semi-infinite line and on an infinite line with slow bonds | Kapil Sharma et.al. | 2405.00654 | null |
2024-05-01 | Stochastic fluids with transport noise: Approximating diffusion from data using SVD and ensemble forecast back-propagation | James Woodfield et.al. | 2405.00640 | null |
2024-05-01 | Vacancy-mediated transport and segregation tendencies of solutes in FCC nickel under diffusional creep: A density functional theory study | Shehab Shousha et.al. | 2405.00639 | null |
2024-05-01 | Engine-fed Kilonovae (Mergernovae) -- II. Radiation | Shunke Ai et.al. | 2405.00638 | null |
2024-05-01 | Deep Metric Learning-Based Out-of-Distribution Detection with Synthetic Outlier Exposure | Assefa Seyoum Wahd et.al. | 2405.00631 | null |
2024-05-01 | Hysteresis and Self-Oscillations in an Artificial Memristive Quantum Neuron | Finlay Potter et.al. | 2405.00624 | null |
2024-05-01 | Lane Segmentation Refinement with Diffusion Models | Antonio Ruiz et.al. | 2405.00620 | null |
2024-05-01 | Anomalous diffusion and factor ordering in (1+1)-dimensional Lorentzian quantum gravity | Elijah Sanderson et.al. | 2405.00594 | null |
2024-04-30 | MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model | Wenxun Dai et.al. | 2404.19759 | link |
2024-04-30 | Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting | Paul Engstler et.al. | 2404.19758 | null |
2024-04-30 | Mixed Continuous and Categorical Flow Matching for 3D De Novo Molecule Generation | Ian Dunn et.al. | 2404.19739 | link |
2024-04-30 | Investigating the correlations between IceCube high-energy neutrinos and Fermi-LAT | Ming-Xuan Lu et.al. | 2404.19730 | null |
2024-04-30 | X-Diffusion: Generating Detailed 3D MRI Volumes From a Single Image Using Cross-Sectional Diffusion Models | Emmanuelle Bourigault et.al. | 2404.19604 | null |
2024-04-30 | Cool-core, X-ray cavities and cold front revealed in RXCJ0352.9+1941 cluster by Chandra and GMRT observations | Satish S. Sonkamble et.al. | 2404.19549 | null |
2024-04-30 | Shocks in the Warm Neutral Medium I -- Theoretical model | Benjamin Godard et.al. | 2404.19533 | null |
2024-04-30 | MicroDreamer: Zero-shot 3D Generation in | Luxi Chen et.al. | 2404.19525 | link |
2024-04-30 | Well-posedness of McKean-Vlasov SDEs with density-dependent drift | Anh-Dung Le et.al. | 2404.19499 | null |
2024-04-30 | TwinDiffusion: Enhancing Coherence and Efficiency in Panoramic Image Generation with Diffusion Models | Teng Zhou et.al. | 2404.19475 | null |
2024-04-29 | Stylus: Automatic Adapter Selection for Diffusion Models | Michael Luo et.al. | 2404.18928 | null |
2024-04-29 | TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation | Junhao Cheng et.al. | 2404.18919 | link |
2024-04-29 | Learning general Gaussian mixtures with efficient score matching | Sitan Chen et.al. | 2404.18893 | null |
2024-04-29 | A Survey on Diffusion Models for Time Series and Spatio-Temporal Data | Yiyuan Yang et.al. | 2404.18886 | link |
2024-04-29 | Learning Mixtures of Gaussians Using Diffusion Models | Khashayar Gatmiry et.al. | 2404.18869 | null |
2024-04-29 | Construction of local reduced spaces for Friedrichs' systems via randomized training | Christian Engwer et.al. | 2404.18839 | null |
2024-04-29 | Towards Extreme Image Compression with Latent Feature Guidance and Diffusion Prior | Zhiyuan Li et.al. | 2404.18820 | null |
2024-04-29 | Spectral measures and iterative bounds for effective diffusivity of steady and space-time periodic flows | N. B. Murphy et.al. | 2404.18754 | null |
2024-04-29 | Diffuse scattering from dynamically compressed single-crystal zirconium following the pressure-induced | P. G. Heighway et.al. | 2404.18740 | null |
2024-04-29 | Diffusion coefficient matrix for multiple conserved charges: a Kubo approach | Sourav Dey et.al. | 2404.18718 | null |
2024-04-26 | Tunnel Try-on: Excavating Spatial-temporal Tunnels for High-quality Virtual Try-on in Videos | Zhengze Xu et.al. | 2404.17571 | null |
2024-04-26 | MaPa: Text-driven Photorealistic Material Painting for 3D Shapes | Shangzhan Zhang et.al. | 2404.17569 | null |
2024-04-26 | [OI] fine structure line profiles in Mon R2 and M17 SW: the puzzling nature of cold foreground material identified by [12CII] self-absorption | C. Guevara et.al. | 2404.17538 | null |
2024-04-26 | Reduction of the effective population size in a branching particle system in the moderate mutation-selection regime | Florin Boenkost et.al. | 2404.17527 | null |
2024-04-26 | Chemotaxis-inspired PDE model for airborne infectious disease transmission: analysis and simulations | Pierluigi Colli et.al. | 2404.17506 | null |
2024-04-26 | TextGaze: Gaze-Controllable Face Generation with Natural Language | Hengfei Wang et.al. | 2404.17486 | null |
2024-04-26 | Consistent Second Moment Methods with Scalable Linear Solvers for Radiation Transport | Samuel Olivier et.al. | 2404.17473 | null |
2024-04-26 | Quasi particle model vs lattice QCD thermodynamics: extension to | Maria Lucia Sambataro et.al. | 2404.17459 | null |
2024-04-26 | Vaporization dynamics of a super-heated water-in-oil droplet: modeling and numerical solution | Muhammad Saeed Saleem et.al. | 2404.17457 | null |
2024-04-26 | Multi-view Image Prompted Multi-view Diffusion for Improved 3D Generation | Seungwook Kim et.al. | 2404.17419 | null |
2024-04-25 | The Third Monocular Depth Estimation Challenge | Jaime Spencer et.al. | 2404.16831 | null |
2024-04-25 | Make-it-Real: Unleashing Large Multimodal Model's Ability for Painting 3D Objects with Realistic Materials | Ye Fang et.al. | 2404.16829 | null |
2024-04-25 | ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving | Jiehui Huang et.al. | 2404.16771 | link |
2024-04-25 | Analysis of Ethanol Blending Effects on Auto-Ignition and Heat Release in n-Heptane/Ethanol Non-Premixed Flames | Liang Ji et.al. | 2404.16762 | null |
2024-04-25 | Multimodal Semantic-Aware Automatic Colorization with Diffusion Prior | Han Wang et.al. | 2404.16678 | null |
2024-04-25 | The First Estimation of the Ambipolar Diffusivity Coefficient from Multi-Scale Observations of the Class 0/I Protostar, HOPS-370 | Travis J. Thieme et.al. | 2404.16668 | null |
2024-04-25 | Inferring solid-state diffusivity in lithium-ion battery active materials: improving upon the classical GITT method | A. Emir Gumrukcuoglu et.al. | 2404.16658 | null |
2024-04-25 | Denoising: from classical methods to deep CNNs | Jean-Eric Campagne et.al. | 2404.16617 | link |
2024-04-25 | Stochastic Dissipative Euler's equations for a free body | J. A. de la Torre et.al. | 2404.16613 | null |
2024-04-25 | MuseumMaker: Continual Style Customization without Catastrophic Forgetting | Chenxi Liu et.al. | 2404.16612 | null |
2024-04-24 | Optimizing OOD Detection in Molecular Graphs: A Novel Approach with Diffusion Models | Xu Shen et.al. | 2404.15625 | null |
2024-04-24 | A Dynamic Kernel Prior Model for Unsupervised Blind Image Super-Resolution | Zhixiong Yang et.al. | 2404.15620 | link |
2024-04-23 | Measuring topological constraint relaxation in ring-linear polymer blends | Daniel L. Vigil et.al. | 2404.15560 | null |
2024-04-23 | Thermal boundary conductance of sharp metal-diamond interfaces predicted by machine learning molecular dynamics | Khalid Zobaid Adnan et.al. | 2404.15465 | null |
2024-04-23 | ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning | Weifeng Chen et.al. | 2404.15449 | null |
2024-04-23 | GLoD: Composing Global Contexts and Local Details in Image Generation | Moyuru Yamada et.al. | 2404.15447 | null |
2024-04-23 | Thermal boundary conductance and thermal conductivity strongly depend on nearby environment | Khalid Zobaid Adnan et.al. | 2404.15439 | null |
2024-04-23 | ID-Animator: Zero-Shot Identity-Preserving Human Video Generation | Xuanhua He et.al. | 2404.15275 | link |
2024-04-23 | From Parts to Whole: A Unified Reference Framework for Controllable Human Image Generation | Zehuan Huang et.al. | 2404.15267 | null |
2024-04-23 | Score matching for sub-Riemannian bridge sampling | Erlend Grong et.al. | 2404.15258 | null |
2024-04-23 | Nucleation mechanism of multiple-order parameter ferroelectric domain wall motion in hafnia | Songsong Zhou et.al. | 2404.15251 | null |
2024-04-23 | Local well-posedness for a novel nonlocal model for cell-cell adhesion via receptor binding | Mabel Lizzy Rajendran et.al. | 2404.15222 | null |
2024-04-23 | Heat flow, log-concavity, and Lipschitz transport maps | Giovanni Brigati et.al. | 2404.15205 | null |
2024-04-23 | Signature of Particle Diffusion on the X-ray Spectra of the blazar Mkn 421 | C. Baheeja et.al. | 2404.15171 | null |
2024-04-23 | A general multi-wave quasi-resonance theory for lattice energy diffusion | Wei Lin et.al. | 2404.15147 | null |
2024-04-23 | CutDiffusion: A Simple, Fast, Cheap, and Strong Diffusion Extrapolation Method | Mingbao Lin et.al. | 2404.15141 | link |
2024-04-23 | Taming Diffusion Probabilistic Models for Character Control | Rui Chen et.al. | 2404.15121 | null |
2024-04-22 | Guess The Unseen: Dynamic 3D Scene Reconstruction from Partial 2D Glimpses | Inhee Lee et.al. | 2404.14410 | null |
2024-04-22 | GeoDiffuser: Geometry-Based Image Editing with Diffusion Models | Rahul Sajnani et.al. | 2404.14403 | null |
2024-04-22 | Observational characterisation of large-scale transport and horizontal turbulent diffusivity in the quiet Sun | F. Rincon et.al. | 2404.14383 | null |
2024-04-22 | TAVGBench: Benchmarking Text to Audible-Video Generation | Yuxin Mao et.al. | 2404.14381 | link |
2024-04-22 | Temporal Entanglement Profiles in Dual-Unitary Clifford Circuits with Measurements | Jiangtian Yao et.al. | 2404.14374 | null |
2024-04-22 | Operando Analysis of Adsorption-Limited Hydrogen Oxidation Reaction at Palladium Surfaces | Yukun Liu et.al. | 2404.14348 | null |
2024-04-22 | Full Event Particle-Level Unfolding with Variable-Length Latent Variational Diffusion | Alexander Shmakov et.al. | 2404.14332 | null |
2024-04-22 | X-Ray: A Sequential 3D Representation for Generation | Tao Hu et.al. | 2404.14329 | link |
2024-04-22 | Towards Better Adversarial Purification via Adversarial Denoising Diffusion Training | Yiming Liu et.al. | 2404.14309 | null |
2024-04-22 | Collaborative Filtering Based on Diffusion Models: Unveiling the Potential of High-Order Connectivity | Yu Hou et.al. | 2404.14240 | link |
2024-04-19 | Analysis of Classifier-Free Guidance Weight Schedulers | Xi Wang et.al. | 2404.13040 | null |
2024-04-19 | A multigrain-multilayer astrochemical model with variable desorption energy for surface species | Juris Kalvans et.al. | 2404.13011 | null |
2024-04-19 | RadRotator: 3D Rotation of Radiographs with Diffusion Models | Pouria Rouzrokh et.al. | 2404.13000 | null |
2024-04-19 | Cross-modal Diffusion Modelling for Super-resolved Spatial Transcriptomics | Xiaofei Wang et.al. | 2404.12973 | null |
2024-04-19 | On the McKean-Vlasov SDE with branching | Julien Claisse et.al. | 2404.12964 | null |
2024-04-19 | Robust hybrid finite element methods for reaction-dominated diffusion problems | Thomas Führer et.al. | 2404.12956 | null |
2024-04-19 | Neural Flow Diffusion Models: Learnable Forward Process for Improved Diffusion Modelling | Grigory Bartosh et.al. | 2404.12940 | null |
2024-04-19 | Diffusive contact between randomly driven colloidal suspensions | Galor Geva et.al. | 2404.12929 | null |
2024-04-19 | Zero-Shot Medical Phrase Grounding with Off-the-shelf Diffusion Models | Konstantinos Vilouras et.al. | 2404.12920 | null |
2024-04-19 | Robust CLIP-Based Detector for Exposing Diffusion Model-Generated Images | Santosh et.al. | 2404.12908 | link |
2024-04-18 | G-HOP: Generative Hand-Object Prior for Interaction Reconstruction and Grasp Synthesis | Yufei Ye et.al. | 2404.12383 | null |
2024-04-18 | Lazy Diffusion Transformer for Interactive Image Editing | Yotam Nitzan et.al. | 2404.12382 | null |
2024-04-18 | Learning the Domain Specific Inverse NUFFT for Accelerated Spiral MRI using Diffusion Models | Trevor J. Chan et.al. | 2404.12361 | null |
2024-04-18 | AniClipart: Clipart Animation with Text-to-Video Priors | Ronghuan Wu et.al. | 2404.12347 | null |
2024-04-18 | Customizing Text-to-Image Diffusion with Camera Viewpoint Control | Nupur Kumari et.al. | 2404.12333 | null |
2024-04-18 | Guided Discrete Diffusion for Electronic Health Record Generation | Zixiang Chen et.al. | 2404.12314 | null |
2024-04-18 | Investigation of Spin-Pumping and -Transport in the Ni80Fe20/Pt/Co Asymmetric Trilayer | Shilpa Samdani et.al. | 2404.12307 | null |
2024-04-18 | RISE: 3D Perception Makes Real-World Robot Imitation Simple and Effective | Chenxi Wang et.al. | 2404.12281 | null |
2024-04-18 | A New Computational Method for Energetic Particle Acceleration and Transport with its Feedback | Jeongbhin Seo et.al. | 2404.12276 | null |
2024-04-18 | Tree-Based Nonlinear Reduced Modeling | Diane Guignard et.al. | 2404.12262 | null |
2024-04-17 | Factorized Diffusion: Perceptual Illusions by Noise Decomposition | Daniel Geng et.al. | 2404.11615 | null |
2024-04-17 | InFusion: Inpainting 3D Gaussians via Learning Depth Completion from Diffusion Prior | Zhiheng Liu et.al. | 2404.11613 | null |
2024-04-17 | IntrinsicAnything: Learning Diffusion Priors for Inverse Rendering Under Unknown Illumination | Xi Chen et.al. | 2404.11593 | null |
2024-04-17 | Prompt Optimizer of Text-to-Image Diffusion Models for Abstract Concept Understanding | Zezhong Fan et.al. | 2404.11589 | null |
2024-04-17 | Emulators for scarce and noisy data: application to auxiliary field diffusion Monte Carlo for the deuteron | Rahul Somasundaram et.al. | 2404.11566 | null |
2024-04-17 | MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation | Kuan-Chieh et.al. | 2404.11565 | null |
2024-04-17 | Predicting Long-horizon Futures by Conditioning on Geometry and Time | Tarasha Khurana et.al. | 2404.11554 | null |
2024-04-17 | A Bayesian level-set inversion method for simultaneous reconstruction of absorption and diffusion coefficients in diffuse optical tomography | Anuj Abhishek et.al. | 2404.11552 | null |
2024-04-17 | SSDiff: Spatial-spectral Integrated Diffusion Model for Remote Sensing Pansharpening | Yu Zhong et.al. | 2404.11537 | null |
2024-04-17 | Towards Highly Realistic Artistic Style Transfer via Stable Diffusion with Step-aware and Layer-aware Prompt | Zhanjie Zhang et.al. | 2404.11474 | link |
2024-04-16 | Searching for cold gas traced by MgII quasar absorbers in massive X-ray-selected galaxy clusters | A. Y. Fresco et.al. | 2404.10773 | null |
2024-04-16 | RefFusion: Reference Adapted Diffusion Models for 3D Scene Inpainting | Ashkan Mirzaei et.al. | 2404.10765 | null |
2024-04-16 | LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-Text Generation? | Yuchi Wang et.al. | 2404.10763 | link |
2024-04-16 | A High-Order Conservative Cut Finite Element Method for Problems in Time-Dependent Domains | Sebastian Myrbäck et.al. | 2404.10756 | link |
2024-04-16 | GazeHTA: End-to-end Gaze Target Detection with Head-Target Association | Zhi-Yi Lin et.al. | 2404.10718 | null |
2024-04-16 | Efficient Conditional Diffusion Model with Probability Flow Sampling for Image Super-resolution | Yutao Yuan et.al. | 2404.10688 | link |
2024-04-16 | Generating Human Interaction Motions in Scenes with Text Control | Hongwei Yi et.al. | 2404.10685 | null |
2024-04-16 | StyleCity: Large-Scale 3D Urban Scenes Stylization with Vision-and-Text Reference via Progressive Optimization | Yingshu Chen et.al. | 2404.10681 | null |
2024-04-16 | Arsenic diffusion in MOVPE-Grown GaAs/Ge epitaxial structures | V. Orejuela et.al. | 2404.10669 | null |
2024-04-16 | Continual Offline Reinforcement Learning via Diffusion-based Dual Generative Replay | Jinmei Liu et.al. | 2404.10662 | link |
2024-04-15 | Accurate quantum Monte Carlo forces for machine-learned force fields: Ethanol as a benchmark | Emiel Slootman et.al. | 2404.09755 | null |
2024-04-15 | Electric potential during tokamak disruptions and steady-state current drive | Allen H Boozer et.al. | 2404.09744 | null |
2024-04-15 | Equipping Diffusion Models with Differentiable Spatial Entropy for Low-Light Image Enhancement | Wenyi Lian et.al. | 2404.09735 | link |
2024-04-15 | Photo-Realistic Image Restoration in the Wild with Controlled Vision-Language Models | Ziwei Luo et.al. | 2404.09732 | link |
2024-04-15 | Structure and dynamics of active string fluids and gels formed by dipolar active Brownian particles | Maria Kelidou et.al. | 2404.09693 | null |
2024-04-15 | Deformable MRI Sequence Registration for AI-based Prostate Cancer Diagnosis | Alessa Hering et.al. | 2404.09666 | null |
2024-04-15 | Impact of chirality on active Brownian particle: Exact moments in two and three dimensions | Anweshika Pattanayak et.al. | 2404.09650 | null |
2024-04-15 | All-in-one simulation-based inference | Manuel Gloeckler et.al. | 2404.09636 | link |
2024-04-15 | Branching diffusion processes and spectral properties of Feynman-Kac semigroup | Pierre Collet et.al. | 2404.09568 | null |
2024-04-15 | Entropy on the Path Space and Application to Singular Diffusions and Mean-field Models | Patrick Cattiaux et.al. | 2404.09552 | null |
2024-04-15 | Turbulent ice-ocean boundary layers in the well-mixed regime: insights from direct numerical simulations | Louis-Alexandre Couston et.al. | 2404.09545 | null |
2024-04-12 | Lossy Image Compression with Foundation Diffusion Models | Lucas Relic et.al. | 2404.08580 | null |
2024-04-12 | Functional reducibility of higher-order networks | Maxime Lucas et.al. | 2404.08547 | link |
2024-04-12 | Echoes of darkness: Supernova-neutrino-boosted dark matter from all galaxies | Yen-Hsun Lin et.al. | 2404.08528 | link |
2024-04-12 | Generalized Hydrodynamics for the Volterra lattice: Ballistic and nonballistic behavior of correlation functions | Guido Mazzuca et.al. | 2404.08499 | null |
2024-04-12 | PiRD: Physics-informed Residual Diffusion for Flow Field Reconstruction | Siming Shan et.al. | 2404.08412 | null |
2024-04-12 | Estimate of force noise from electrostatic patch potentials in LISA Pathfinder | Stefano Vitale et.al. | 2404.08340 | null |
2024-04-12 | Struggle with Adversarial Defense? Try Diffusion | Yujie Li et.al. | 2404.08273 | null |
2024-04-12 | An XRISM observation proposal: Gas velocity in the merging cluster Abell 2256 | Takayuki Tamura et.al. | 2404.08267 | null |
2024-04-12 | Balanced Mixed-Type Tabular Data Synthesis with Diffusion Models | Zeyu Yang et.al. | 2404.08254 | null |
2024-04-12 | An Asymptotically-Correct Implicit-Explicit Time Integration Scheme for Finite Volume Radiation-Hydrodynamics | Chong-Chong He et.al. | 2404.08247 | link |
2024-04-11 | OpenBias: Open-set Bias Detection in Text-to-Image Generative Models | Moreno D'Incà et.al. | 2404.07990 | link |
2024-04-11 | ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback | Ming Li et.al. | 2404.07987 | link |
2024-04-11 | View Selection for 3D Captioning via Diffusion Ranking | Tiange Luo et.al. | 2404.07984 | null |
2024-04-11 | Taming Stable Diffusion for Text to 360° Panorama Image Generation | Cheng Zhang et.al. | 2404.07949 | link |
2024-04-11 | Active Carpets in floating viscous films | Felipe A. Barros et.al. | 2404.07856 | null |
2024-04-11 | Adaptive Hyperbolic-cross-space Mapped Jacobi Method on Unbounded Domains with Applications to Solving Multidimensional Spatiotemporal Integrodifferential Equations | Yunhong Deng et.al. | 2404.07844 | null |
2024-04-11 | The Cattaneo-Christov approximation of Fourier heat-conductive compressible fluids | Timothée Crin-Barat et.al. | 2404.07809 | null |
2024-04-11 | ConsistencyDet: Robust Object Detector with Denoising Paradigm of Consistency Model | Lifan Jiang et.al. | 2404.07773 | link |
2024-04-11 | An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization | Minshuo Chen et.al. | 2404.07771 | null |
2024-04-11 | Joint Conditional Diffusion Model for Image Restoration with Mixed Degradations | Yufeng Yue et.al. | 2404.07770 | null |
2024-04-10 | GoodDrag: Towards Good Practices for Drag Editing with Diffusion Models | Zewei Zhang et.al. | 2404.07206 | null |
2024-04-10 | RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion | Jaidev Shriram et.al. | 2404.07199 | null |
2024-04-10 | InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models | Jiale Xu et.al. | 2404.07191 | link |
2024-04-10 | Move Anything with Layered Scene Diffusion | Jiawei Ren et.al. | 2404.07178 | null |
2024-04-10 | Understanding Dynamics in Coarse-Grained Models: IV. Connection of Fine-Grained and Coarse-Grained Dynamics with the Stokes-Einstein and Stokes-Einstein-Debye Relations | Jaehyeok Jin et.al. | 2404.07156 | null |
2024-04-10 | A conservative Eulerian finite element method for transport and diffusion in moving domains | Maxim Olshanskii et.al. | 2404.07130 | link |
2024-04-10 | Open reaction-diffusion systems: bridging probabilistic theory across scales | Mauricio J. del Razo et.al. | 2404.07119 | null |
2024-04-10 | Diffusion-based inpainting of incomplete Euclidean distance matrices of trajectories generated by a fractional Brownian motion | Alexander Lobashev et.al. | 2404.07029 | link |
2024-04-10 | On the conjugate interface conditions and Galilean invariance | Yang Hu et.al. | 2404.07025 | null |
2024-04-10 | Non-Degenerate One-Time Pad and the integrity of perfectly secret messages | Alex Shafarenko et.al. | 2404.07022 | null |
2024-04-09 | Convergence analysis of novel discontinuous Galerkin methods for a convection dominated problem | Satyajith Bommana Boyana et.al. | 2404.06490 | null |
2024-04-09 | Uncovering Tidal Treasures: Automated Classification of Faint Tidal Features in DECaLS Data | Alexander J. Gordon et.al. | 2404.06487 | null |
2024-04-09 | GeoDirDock: Guiding Docking Along Geodesic Paths | Raúl Miñán et.al. | 2404.06481 | null |
2024-04-09 | Magic-Boost: Boost 3D Generation with Mutli-View Conditioned Diffusion | Fan Yang et.al. | 2404.06429 | null |
2024-04-09 | ZeST: Zero-Shot Material Transfer from a Single Image | Ta-Ying Cheng et.al. | 2404.06425 | null |
2024-04-09 | Policy-Guided Diffusion | Matthew Thomas Jackson et.al. | 2404.06356 | link |
2024-04-09 | Quantum State Generation with Structure-Preserving Diffusion Model | Yuchen Zhu et.al. | 2404.06336 | null |
2024-04-09 | Compensating slice emittance growth in high brightness photoinjectors using sacrificial charge | W. H. Li et.al. | 2404.06312 | null |
2024-04-09 | NoiseNCA: Noisy Seed Improves Spatio-Temporal Continuity of Neural Cellular Automata | Ehsan Pajouheshgar et.al. | 2404.06279 | null |
2024-04-09 | A Large-Scale Simulation Method for Neuromorphic Circuits | Amir Shahhosseini et.al. | 2404.06255 | null |
2024-04-08 | The neutrino background from non-jetted active galactic nuclei | P. Padovani et.al. | 2404.05690 | null |
2024-04-08 | MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation | Kunpeng Song et.al. | 2404.05674 | link |
2024-04-08 | NAF-DPM: A Nonlinear Activation-Free Diffusion Probabilistic Model for Document Enhancement | Giordano Cicchetti et.al. | 2404.05669 | link |
2024-04-08 | YaART: Yet Another ART Rendering Technology | Sergey Kastryulin et.al. | 2404.05666 | null |
2024-04-08 | BinaryDM: Towards Accurate Binarization of Diffusion Model | Xingyu Zheng et.al. | 2404.05662 | link |
2024-04-08 | Convergence rates for the finite volume scheme of the stochastic heat equation | Niklas Sapountzoglou et.al. | 2404.05655 | null |
2024-04-09 | The persistence of high altitude non-equilibrium diffuse ionized gas in simulations of star forming galaxies | Lewis McCallum et.al. | 2404.05651 | null |
2024-04-08 | Resistive Memory-based Neural Differential Equation Solver for Score-based Diffusion Model | Jichang Yang et.al. | 2404.05648 | link |
2024-04-08 | eDIG-CHANGES II: Project Design and Initial Results on NGC 3556 | Jiang-Tao Li et.al. | 2404.05628 | null |
2024-04-08 | Learning a Category-level Object Pose Estimator without Pose Annotations | Fengrui Tian et.al. | 2404.05626 | null |
2024-04-05 | Identity Decoupling for Multi-Subject Personalization of Text-to-Image Models | Sangwon Jang et.al. | 2404.04243 | null |
2024-04-05 | ToolEENet: Tool Affordance 6D Pose Estimation | Yunlong Wang et.al. | 2404.04193 | null |
2024-04-05 | Nonlocally coupled moisture model for convective self-aggregation | Tomoro Yanase et.al. | 2404.04146 | null |
2024-04-05 | Rare events, time crystals and symmetry-breaking dynamical phase transitions | Rubén Hurtado-Gutiérrez et.al. | 2404.04135 | null |
2024-04-05 | A posteriori error analysis of a space-time hybridizable discontinuous Galerkin method for the advection-diffusion problem | Yuan Wang et.al. | 2404.04130 | null |
2024-04-05 | Dynamic Prompt Optimizing for Text-to-Image Generation | Wenyi Mo et.al. | 2404.04095 | link |
2024-04-05 | A first passage model of intravitreal drug delivery and residence time, in relation to ocular geometry, individual variability, and injection location | Patricia Lamirande et.al. | 2404.04086 | null |
2024-04-05 | Score identity Distillation: Exponentially Fast Distillation of Pretrained Diffusion Models for One-Step Generation | Mingyuan Zhou et.al. | 2404.04057 | link |
2024-04-05 | InstructHumans: Editing Animated 3D Human Textures with Instructions | Jiayin Zhu et.al. | 2404.04037 | null |
2024-04-05 | Impacts of non-thermal emission on the images of black hole shadow and extended jets in two-temperature GRMHD simulations | Mingyuan Zhang et.al. | 2404.04033 | null |
2024-04-04 | MVD-Fusion: Single-view 3D via Depth-consistent Multi-view Generation | Hanzhe Hu et.al. | 2404.03656 | null |
2024-04-04 | CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching | Dongzhi Jiang et.al. | 2404.03653 | link |
2024-04-04 | The More You See in 2D, the More You Perceive in 3D | Xinyang Han et.al. | 2404.03652 | null |
2024-04-04 | DiffBody: Human Body Restoration by Imagining with Generative Diffusion Prior | Yiming Zhang et.al. | 2404.03642 | null |
2024-04-04 | LCM-Lookahead for Encoder-based Text-to-Image Personalization | Rinon Gal et.al. | 2404.03620 | null |
2024-04-04 | DiffDet4SAR: Diffusion-based Aircraft Target Detection Network for SAR Images | Zhou Jie et.al. | 2404.03595 | link |
2024-04-04 | PointInfinity: Resolution-Invariant Point Diffusion Models | Zixuan Huang et.al. | 2404.03566 | null |
2024-04-04 | Segmentation-Guided Knee Radiograph Generation using Conditional Diffusion Models | Siyuan Mei et.al. | 2404.03541 | null |
2024-04-04 | Impact of the Magnetic Horizon on the Interpretation of the Pierre Auger Observatory Spectrum and Composition Data | The Pierre Auger Collaboration et.al. | 2404.03533 | null |
2024-04-04 | Significantly Enhanced Vacancy Diffusion in Mn-containing Alloys | Huaqing Guan et.al. | 2404.03339 | null |
2024-04-03 | Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction | Keyu Tian et.al. | 2404.02905 | link |
2024-04-03 | LidarDM: Generative LiDAR Simulation in a Generated World | Vlas Zyrianov et.al. | 2404.02903 | link |
2024-04-03 | MatAtlas: Text-driven Consistent Geometry Texturing and Material Assignment | Duygu Ceylan et.al. | 2404.02899 | null |
2024-04-03 | On the Scalability of Diffusion-based Text-to-Image Generation | Hao Li et.al. | 2404.02883 | null |
2024-04-03 | Uniqueness of the blow-down limit for triple junction problem | Zhiyuan Geng et.al. | 2404.02859 | null |
2024-04-03 | Efficient Quantum Circuits for Non-Unitary and Unitary Diagonal Operators with Space-Time-Accuracy trade-offs | Julien Zylberman et.al. | 2404.02819 | null |
2024-04-03 | Fast Diffusion Model For Seismic Data Noise Attenuation | Junheng Peng et.al. | 2404.02767 | null |
2024-04-03 | Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models | Wentian Zhang et.al. | 2404.02747 | link |
2024-04-03 | InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation | Haofan Wang et.al. | 2404.02733 | link |
2024-04-03 | Harnessing the Power of Large Vision Language Models for Synthetic Image Detection | Mamadou Keita et.al. | 2404.02726 | link |
2024-04-02 | Diffusion | Zeyu Yang et.al. | 2404.02148 | link |
2024-04-02 | A Stabilized Parametric Finite Element Method for Surface Diffusion with an Arbitrary Surface Energy | Yulin Zhang et.al. | 2404.02083 | null |
2024-04-02 | WcDT: World-centric Diffusion Transformer for Traffic Scene Generation | Chen Yang et.al. | 2404.02082 | link |
2024-04-02 | Brownian Particles and Matter Waves | Nicos Makris et.al. | 2404.02016 | null |
2024-04-02 | Superionic Fluoride Gate Dielectrics with Low Diffusion Barrier for Advanced Electronics | Kui Meng et.al. | 2404.02011 | null |
2024-04-02 | AUTODIFF: Autoregressive Diffusion Modeling for Structure-based Drug Design | Xinze Li et.al. | 2404.02003 | null |
2024-04-02 | Rigorous derivation of an effective model for coupled Stokes advection, reaction and diffusion with freely evolving microstructure | Markus Gahn et.al. | 2404.01983 | null |
2024-04-02 | Bi-LORA: A Vision-Language Approach for Synthetic Image Detection | Mamadou Keita et.al. | 2404.01959 | link |
2024-04-02 | Nonlinear stability for active suspensions | Helge Dietert et.al. | 2404.01906 | null |
2024-04-02 | On the surface helium abundance of B-type hot subdwarf stars from the WD+MS channel of Type Ia supernovae | Rui-Jie Ji et.al. | 2404.01905 | null |
2024-03-29 | Relation Rectification in Diffusion Model | Yinwei Wu et.al. | 2403.20249 | null |
2024-03-29 | Graph Neural Aggregation-diffusion with Metastability | Kaiyuan Cui et.al. | 2403.20221 | null |
2024-03-29 | Scaled Brownian motion with random anomalous diffusion exponent | Hubert Woszczek et.al. | 2403.20206 | null |
2024-03-29 | Motion Inversion for Video Customization | Luozhou Wang et.al. | 2403.20193 | null |
2024-03-29 | Energy solutions of the Cauchy-Dirichlet problem for fractional nonlinear diffusion equations | Goro Akagi et.al. | 2403.20176 | null |
2024-03-29 | Na Vacancy Driven Phase Transformation and Fast Ion Conduction in W-doped Na$_3$SbS$_4$ from Machine Learning Force Fields | Johan Klarbring et.al. | 2403.20138 | null |
2024-03-29 | FreeSeg-Diff: Training-Free Open-Vocabulary Segmentation with Diffusion Models | Barbara Toniella Corradini et.al. | 2403.20105 | null |
2024-03-29 | SGD: Street View Synthesis with Gaussian Splatting and Diffusion Prior | Zhongrui Yu et.al. | 2403.20079 | null |
2024-03-29 | Efficacy of the Sterile Insect Technique in the presence of inaccessible areas: A study using two-patch models | Pierre-Alexandre Bliman et.al. | 2403.20069 | null |
2024-03-29 | Optimal s-boxes against alternative operations | Marco Calderini et.al. | 2403.20059 | null |
2024-03-28 | GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling | Bowen Zhang et.al. | 2403.19655 | null |
2024-03-28 | Detecting Image Attribution for Text-to-Image Diffusion Models in RGB and Beyond | Katherine Xu et.al. | 2403.19653 | link |
2024-03-28 | InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction | Sirui Xu et.al. | 2403.19652 | null |
2024-03-28 | GANTASTIC: GAN-based Transfer of Interpretable Directions for Disentangled Image Editing in Text-to-Image Diffusion Models | Yusuf Dalva et.al. | 2403.19645 | null |
2024-03-28 | In the driver's mind: modeling the dynamics of human overtaking decisions in interactions with oncoming automated vehicles | Samir H. A. Mohammad et.al. | 2403.19637 | null |
2024-03-28 | Generalisation of the Spectral Difference scheme for the diffused-interface five equation model | Niccolò Tonicello et.al. | 2403.19623 | null |
2024-03-28 | More on Black Holes Perceiving the Dark Dimension | Luis A. Anchordoqui et.al. | 2403.19604 | null |
2024-03-28 | Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model | Zhicai Wang et.al. | 2403.19600 | link |
2024-03-28 | Frame by Familiar Frame: Understanding Replication in Video Diffusion Models | Aimon Rahman et.al. | 2403.19593 | null |
2024-03-28 | Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics | Norman Di Palo et.al. | 2403.19578 | null |
2024-03-27 | ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion | Daniel Winter et.al. | 2403.18818 | null |
2024-03-27 | Garment3DGen: 3D Garment Stylization and Texture Generation | Nikolaos Sarafianos et.al. | 2403.18816 | null |
2024-03-28 | ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation | Suraj Patni et.al. | 2403.18807 | link |
2024-03-27 | Dimension-independent functional inequalities by tensorization and projection arguments | Fabrice Baudoin et.al. | 2403.18799 | null |
2024-03-27 | Object Pose Estimation via the Aggregation of Diffusion Features | Tianfu Wang et.al. | 2403.18791 | link |
2024-03-27 | ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object | Chenshuang Zhang et.al. | 2403.18775 | link |
2024-03-27 | Convergence rates under a range invariance condition with application to electrical impedance tomography | Barbara Kaltenbacher et.al. | 2403.18704 | null |
2024-03-27 | A Diffusion-Based Generative Equalizer for Music Restoration | Eloi Moliner et.al. | 2403.18636 | link |
2024-03-28 | FlexEdit: Flexible and Controllable Diffusion-based Object-centric Image Editing | Trong-Tung Nguyen et.al. | 2403.18605 | null |
2024-03-27 | HandBooster: Boosting 3D Hand-Mesh Reconstruction by Conditional Synthesis and Sampling of Hand-Object Interactions | Hao Xu et.al. | 2403.18575 | link |
2024-03-26 | ConvoFusion: Multi-Modal Conversational Diffusion for Co-Speech Gesture Synthesis | Muhammad Hamza Mughal et.al. | 2403.17936 | null |
2024-03-26 | SLEDGE: Synthesizing Simulation Environments for Driving Agents with Generative Models | Kashyap Chitta et.al. | 2403.17933 | null |
2024-03-26 | The instability mechanism of compact multiplanet systems | Caleb Lammers et.al. | 2403.17928 | null |
2024-03-26 | AID: Attention Interpolation of Text-to-Image Diffusion | Qiyuan He et.al. | 2403.17924 | link |
2024-03-26 | Emergent Anomalous Hydrodynamics at Infinite Temperature in a Long-Range XXZ Model | Ang Yang et.al. | 2403.17912 | null |
2024-03-26 | The Solution to an Impulse Control Problem Motivated by Optimal Harvesting | Zhesheng Liu et.al. | 2403.17875 | null |
2024-03-26 | Boosting Diffusion Models with Moving Average Sampling in Frequency Domain | Yurui Qian et.al. | 2403.17870 | null |
2024-03-26 | Universal entropy transport far from equilibrium across the BCS-BEC crossover | Jeffrey Mohan et.al. | 2403.17838 | null |
2024-03-26 | The memory of Rayleigh-Taylor turbulence | S. Thévenin et.al. | 2403.17832 | null |
2024-03-26 | DiffH2O: Diffusion-Based Synthesis of Hand-Object Interactions from Textual Descriptions | Sammy Christen et.al. | 2403.17827 | null |
2024-03-25 | Exploiting Priors from 3D Diffusion Models for RGB-Based One-Shot View Planning | Sicong Pan et.al. | 2403.16803 | link |
2024-03-25 | Iso-Diffusion: Improving Diffusion Probabilistic Models Using the Isotropy of the Additive Gaussian Noise | Dilum Fernando et.al. | 2403.16790 | null |
2024-03-25 | Diff-Def: Diffusion-Generated Deformation Fields for Conditional Atlases | Sophie Starck et.al. | 2403.16776 | null |
2024-03-25 | Stochastic Inertial Dynamics Via Time Scaling and Averaging | Rodrigo Maulen-Soto et.al. | 2403.16775 | null |
2024-03-25 | Multilevel Modeling as a Methodology for the Simulation of Human Mobility | Luca Serena et.al. | 2403.16745 | null |
2024-03-25 | A Robotic Skill Learning System Built Upon Diffusion Policies and Foundation Models | Nils Ingelhag et.al. | 2403.16730 | null |
2024-03-25 | Improving Diffusion Models's Data-Corruption Resistance using Scheduled Pseudo-Huber Loss | Artem Khrapov et.al. | 2403.16728 | link |
2024-03-25 | The effect of inter-track coupling on H$_2$O$_2$ productions | Ramin Abolfath et.al. | 2403.16722 | null |
2024-03-25 | Phase Transformation in Lithium Niobate-Lithium Tantalate Solid Solutions (LiNb$_{1-x}$Ta$_x$O$_3$) | Fatima El Azzouzi et.al. | 2403.16717 | null |
2024-03-25 | The Directionality of Gravitational and Thermal Diffusive Transport in Geologic Fluid Storage | Anna Herring et.al. | 2403.16659 | null |
2024-03-22 | DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data | Hanrong Ye et.al. | 2403.15389 | null |
2024-03-22 | LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis | Kevin Xie et.al. | 2403.15385 | null |
2024-03-22 | Energy-dependent Boosted Dark Matter from Diffuse Supernova Neutrino Background | Anirban Das et.al. | 2403.15367 | null |
2024-03-22 | Ultrasound Imaging based on the Variance of a Diffusion Restoration Model | Yuxin Zhang et.al. | 2403.15316 | null |
2024-03-22 | Controlled Training Data Generation with Diffusion Models | Teresa Yeo et.al. | 2403.15309 | null |
2024-03-22 | Parametric PDE Control with Deep Reinforcement Learning and Differentiable L0-Sparse Polynomial Policies | Nicolò Botteghi et.al. | 2403.15267 | null |
2024-03-22 | Spectral Motion Alignment for Video Motion Transfer using Diffusion Models | Geon Yeong Park et.al. | 2403.15249 | null |
2024-03-22 | Shadow Generation for Composite Image Using Diffusion model | Qingyang Liu et.al. | 2403.15234 | link |
2024-03-22 | Broad Instantaneous Bandwidth Microwave Spectrum Analyzer with a Microfabricated Atomic Vapor Cell | Yongqi Shi et.al. | 2403.15155 | null |
2024-03-22 | Oxygenation of CO and NO on Amorphous Solid Water | Meenu Upadhyay et.al. | 2403.15141 | null |
2024-03-21 | Simplified Diffusion Schrödinger Bridge | Zhicong Tang et.al. | 2403.14623 | link |
2024-03-21 | GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation | Yinghao Xu et.al. | 2403.14621 | link |
2024-03-21 | Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion | Xiang Fan et.al. | 2403.14617 | null |
2024-03-21 | DreamReward: Text-to-3D Generation with Human Preference | Junliang Ye et.al. | 2403.14613 | null |
2024-03-21 | ReNoise: Real Image Inversion Through Iterative Noising | Daniel Garibi et.al. | 2403.14602 | null |
2024-03-21 | Click to Grasp: Zero-Shot Precise Manipulation via Visual Diffusion Descriptors | Nikolaos Tsagkas et.al. | 2403.14526 | null |
2024-03-21 | Denoising Diffusion Models for 3D Healthy Brain Tissue Inpainting | Alicia Durrer et.al. | 2403.14499 | link |
2024-03-21 | Periodicity from X-ray sources within the inner Galactic disk | Samaresh Mondal et.al. | 2403.14480 | null |
2024-03-21 | Analysing Diffusion Segmentation for Medical Images | Mathias Öttl et.al. | 2403.14440 | null |
2024-03-21 | Style-Extracting Diffusion Models for Semi-Supervised Histopathology Segmentation | Mathias Öttl et.al. | 2403.14429 | null |
2024-03-20 | On Pretraining Data Diversity for Self-Supervised Learning | Hasan Abed Al Kader Hammoud et.al. | 2403.13808 | link |
2024-03-20 | Editing Massive Concepts in Text-to-Image Diffusion Models | Tianwei Xiong et.al. | 2403.13807 | link |
2024-03-20 | ZigMa: Zigzag Mamba Diffusion Model | Vincent Tao Hu et.al. | 2403.13802 | link |
2024-03-20 | TimeRewind: Rewinding Time with Image-and-Events Video Diffusion | Jingxi Chen et.al. | 2403.13800 | null |
2024-03-20 | DepthFM: Fast Monocular Depth Estimation with Flow Matching | Ming Gui et.al. | 2403.13788 | null |
2024-03-20 | Anomalous diffusion in polydisperse granular gases: Monte Carlo simulations | Anna S. Bodrova et.al. | 2403.13772 | null |
2024-03-20 | Disentangling the anisotropic radio sky: Fisher forecasts for 21cm arrays | Zheng Zhang et.al. | 2403.13768 | null |
2024-03-20 | Statistical estimation of full-sky radio maps from 21cm array visibility data using Gaussian Constrained Realisations | Katrine A. Glasscock et.al. | 2403.13766 | null |
2024-03-20 | Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation | Fu-Yun Wang et.al. | 2403.13745 | link |
2024-03-20 | Probabilistic Forecasting with Stochastic Interpolants and Föllmer Processes | Yifan Chen et.al. | 2403.13724 | null |
2024-03-19 | FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis | Linjiang Huang et.al. | 2403.12963 | link |
2024-03-19 | FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation | Shuai Yang et.al. | 2403.12962 | link |
2024-03-19 | TexTile: A Differentiable Metric for Texture Tileability | Carlos Rodriguez-Pardo et.al. | 2403.12961 | link |
2024-03-19 | GVGEN: Text-to-3D Generation with Volumetric Representation | Xianglong He et.al. | 2403.12957 | null |
2024-03-19 | Zero-Reference Low-Light Enhancement via Physical Quadruple Priors | Wenjing Wang et.al. | 2403.12933 | null |
2024-03-19 | You Only Sample Once: Taming One-Step Text-To-Image Synthesis by Self-Cooperative Diffusion GANs | Yihong Luo et.al. | 2403.12931 | link |
2024-03-19 | Ultra-High-Resolution Image Synthesis with Pyramid Diffusion Model | Jiajie Yang et.al. | 2403.12915 | link |
2024-03-19 | H | I. Busa et.al. | 2403.12872 | null |
2024-03-19 | D-Cubed: Latent Diffusion Trajectory Optimisation for Dexterous Deformable Manipulation | Jun Yamada et.al. | 2403.12861 | null |
2024-03-19 | Generative Enhancement for 3D Medical Images | Lingting Zhu et.al. | 2403.12852 | link |
2024-03-18 | Scaling limit of heavy tailed nearly unstable INAR( | Yingli Wang et.al. | 2403.11773 | null |
2024-03-18 | Irradiation induced mineral changes of NWA10580 meteorite determined by infrared analysis | I. Gyollai et.al. | 2403.11725 | null |
2024-03-18 | Generalized Multi-Source Inference for Text Conditioned Music Diffusion Models | Emilian Postolache et.al. | 2403.11706 | link |
2024-03-19 | Urban Scene Diffusion through Semantic Occupancy Map | Junge Zhang et.al. | 2403.11697 | null |
2024-03-18 | Narrow absorption lines from intervening material in supernovae I. Measurements and temporal evolution | Santiago González-Gaitán et.al. | 2403.11677 | null |
2024-03-18 | Binary Noise for Binary Tasks: Masked Bernoulli Diffusion for Unsupervised Anomaly Detection | Julia Wolleb et.al. | 2403.11667 | null |
2024-03-18 | Diffusion-Based Environment-Aware Trajectory Prediction | Theodor Westny et.al. | 2403.11643 | null |
2024-03-18 | Arc2Face: A Foundation Model of Human Faces | Foivos Paraperas Papantoniou et.al. | 2403.11641 | link |
2024-03-18 | Quasinormal Modes of Near-Extremal Electric and Magnetic Black Branes | Swapnil Nitin Shah et.al. | 2403.11640 | null |
2024-03-18 | LoRA-Composer: Leveraging Low-Rank Adaptation for Multi-Concept Customization in Training-Free Diffusion Models | Yang Yang et.al. | 2403.11627 | link |
2024-03-15 | Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primitives | Ronghui Li et.al. | 2403.10518 | link |
2024-03-15 | Active transport of a passive colloid in a bath of run-and-tumble particles | Tanumoy Dhar et.al. | 2403.10508 | null |
2024-03-15 | MusicHiFi: Fast High-Fidelity Stereo Vocoding | Ge Zhu et.al. | 2403.10493 | null |
2024-03-15 | New functional inequalities with applications to the arctan-fast diffusion equation | Rafael Granero-Belinchón et.al. | 2403.10458 | null |
2024-03-15 | Variance sum rule: proofs and solvable models | Ivan Di Terlizzi et.al. | 2403.10442 | null |
2024-03-15 | SculptDiff: Learning Robotic Clay Sculpting from Humans with Goal Conditioned Diffusion Policy | Alison Bartsch et.al. | 2403.10401 | null |
2024-03-15 | Isotropic3D: Image-to-3D Generation Based on a Single CLIP Embedding | Pengkun Liu et.al. | 2403.10395 | link |
2024-03-15 | Denoising Task Difficulty-based Curriculum for Training Diffusion Models | Jin-Young Kim et.al. | 2403.10348 | null |
2024-03-15 | Optimal Control of Stationary Doubly Diffusive Flows on Two and Three Dimensional Bounded Lipschitz Domains: Numerical Analysis | Jai Tushar et.al. | 2403.10282 | null |
2024-03-15 | Towards Generalizable Deepfake Video Detection with Thumbnail Layout and Graph Reasoning | Yuting Xu et.al. | 2403.10261 | link |
2024-03-14 | SCP-Diff: Photo-Realistic Semantic Image Synthesis with Spatial-Categorical Joint Prior | Huan-ang Gao et.al. | 2403.09638 | null |
2024-03-14 | 3D-VLA: A 3D Vision-Language-Action Generative World Model | Haoyu Zhen et.al. | 2403.09631 | null |
2024-03-14 | Generalized Predictive Model for Autonomous Driving | Jiazhi Yang et.al. | 2403.09630 | link |
2024-03-14 | Make-Your-3D: Fast and Consistent Subject-Driven 3D Content Generation | Fangfu Liu et.al. | 2403.09625 | null |
2024-03-14 | Score-Guided Diffusion for 3D Human Recovery | Anastasis Stathopoulos et.al. | 2403.09623 | link |
2024-03-14 | Explore In-Context Segmentation via Latent Diffusion Models | Chaoyang Wang et.al. | 2403.09616 | null |
2024-03-14 | Generative reconstruction of 3D volume elements for Ti-6Al-4V basketweave microstructure by optimization of CNN-based microstructural descriptors | Vincent Blümer et.al. | 2403.09609 | null |
2024-03-14 | The effect of spatially-varying collision frequency on the development of the Rayleigh-Taylor instability | John Rodman et.al. | 2403.09591 | null |
2024-03-14 | MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models | Zunnan Xu et.al. | 2403.09471 | null |
2024-03-14 | Eta Inversion: Designing an Optimal Eta Function for Diffusion-based Real Image Editing | Wonjun Kang et.al. | 2403.09468 | link |
2024-03-13 | VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis | Enric Corona et.al. | 2403.08764 | null |
2024-03-13 | Spatiotemporal Diffusion Model with Paired Sampling for Accelerated Cardiac Cine MRI | Shihan Qiu et.al. | 2403.08758 | null |
2024-03-13 | Efficient Combinatorial Optimization via Heat Diffusion | Hengyuan Ma et.al. | 2403.08757 | link |
2024-03-13 | Sticky-threshold diffusions, local time approximation and parameter estimation | Alexis Anagnostakis et.al. | 2403.08754 | null |
2024-03-13 | Clinically Feasible Diffusion Reconstruction for Highly-Accelerated Cardiac Cine MRI | Shihan Qiu et.al. | 2403.08749 | null |
2024-03-14 | GaussCtrl: Multi-View Consistent Text-Driven 3D Gaussian Splatting Editing | Jing Wu et.al. | 2403.08733 | null |
2024-03-13 | Ambient Diffusion Posterior Sampling: Solving Inverse Problems with Diffusion Models trained on Corrupted Data | Asad Aali et.al. | 2403.08728 | link |
2024-03-13 | Historical Astronomical Diagrams Decomposition in Geometric Primitives | Syrine Kalleli et.al. | 2403.08721 | null |
2024-03-13 | Limits on the OH Molecule in the Smith High Velocity Cloud | Anthony H. Minter et.al. | 2403.08704 | null |
2024-03-13 | Diffusion-based Iterative Counterfactual Explanations for Fetal Ultrasound Image Quality Assessment | Paraskevas Pegios et.al. | 2403.08700 | null |
2024-03-12 | Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation | Shihao Zhao et.al. | 2403.07860 | link |
2024-03-12 | Quantifying and Mitigating Privacy Risks for Tabular Generative Models | Chaoyi Zhu et.al. | 2403.07842 | null |
2024-03-12 | MPCPA: Multi-Center Privacy Computing with Predictions Aggregation based on Denoising Diffusion Probabilistic Model | Guibo Luo et.al. | 2403.07838 | null |
2024-03-12 | Fragmentation of Dense Rotation-Dominated Structures Fed by Collapsing Gravomagneto-Sheetlets and Origin of Misaligned 100 au-Scale Binaries and Multiple Systems | Yisheng Tu et.al. | 2403.07777 | null |
2024-03-13 | SemCity: Semantic Scene Generation with Triplane Diffusion | Jumin Lee et.al. | 2403.07773 | link |
2024-03-12 | A first principles study of the Stark shift effect on the zero-phonon line of the NV center in diamond | Louis Alaerts et.al. | 2403.07771 | null |
2024-03-12 | Stable-Makeup: When Real-World Makeup Transfer Meets Diffusion Model | Yuxuan Zhang et.al. | 2403.07764 | null |
2024-03-13 | Visual Decoding and Reconstruction via EEG Embeddings with Guided Diffusion | Dongyang Li et.al. | 2403.07721 | link |
2024-03-12 | SSM Meets Video Diffusion Models: Efficient Video Generation with Structured State Spaces | Yuta Oshima et.al. | 2403.07711 | link |
2024-03-12 | Genuine Knowledge from Practice: Diffusion Test-Time Adaptation for Video Adverse Weather Removal | Yijun Yang et.al. | 2403.07684 | link |
2024-03-11 | BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion | Xuan Ju et.al. | 2403.06976 | link |
2024-03-11 | Bayesian Diffusion Models for 3D Shape Reconstruction | Haiyang Xu et.al. | 2403.06973 | null |
2024-03-11 | POD-ROM methods: from a finite set of snapshots to continuous-in-time approximations | Bosco Garcia-Archilla et.al. | 2403.06967 | null |
2024-03-11 | SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data | Jialu Li et.al. | 2403.06952 | null |
2024-03-12 | DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations | Tianhao Qi et.al. | 2403.06951 | link |
2024-03-11 | Conditional Score-Based Diffusion Model for Cortical Thickness Trajectory Prediction | Qing Xiao et.al. | 2403.06940 | null |
2024-03-11 | Anderson-Higgs amplitude mode in Josephson junctions | Pierre Vallet et.al. | 2403.06878 | null |
2024-03-11 | Estimation of parameters and local times in a discretely observed threshold diffusion model | Sara Mazzonetto et.al. | 2403.06858 | null |
2024-03-11 | Orbital relaxation length from first-principles scattering calculations | Max Rang et.al. | 2403.06827 | null |
2024-03-11 | A quasilinear Keller-Segel model with saturated discontinuous advection | Maria Gualdani et.al. | 2403.06820 | null |
2024-03-08 | VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models | Yabo Zhang et.al. | 2403.05438 | link |
2024-03-08 | Radiation transport methods in star formation simulations | Richard Wünsch et.al. | 2403.05410 | null |
2024-03-08 | Simulating conditioned diffusions on manifolds | Marc Corstanje et.al. | 2403.05409 | link |
2024-03-08 | An implicit algorithm for simulating the dynamics of small dust grains with smoothed particle hydrodynamics | Daniel Elsender et.al. | 2403.05345 | null |
2024-03-08 | DiffSF: Diffusion Models for Scene Flow Estimation | Yushan Zhang et.al. | 2403.05327 | link |
2024-03-08 | Disorder-induced instability of a Weyl nodal loop semimetal towards a diffusive topological metal with protected multifractal surface states | João S. Silva et.al. | 2403.05298 | null |
2024-03-08 | Neutrino fluxes from different classes of galactic sources | Silvia Gagliardini et.al. | 2403.05288 | null |
2024-03-08 | Patricia's Bad Distributions | Louigi Addario-Berry et.al. | 2403.05269 | null |
2024-03-08 | Non-additivity in many-body interactions between membrane-deforming spheres increases disorder | Ali Azadbakht et.al. | 2403.05253 | null |
2024-03-08 | Noise Level Adaptive Diffusion Model for Robust Reconstruction of Accelerated MRI | Shoujin Huang et.al. | 2403.05245 | null |
2024-03-07 | Effects of mechanical stress, chemical potential, and coverage on hydrogen solubility during hydrogen enhanced decohesion of ferritic steel grain boundaries: A first-principles study | Abril Azocar Guzman et.al. | 2403.04741 | null |
2024-03-07 | Quantum-enhanced joint estimation of phase and phase diffusion | Jayanth Jayakumar et.al. | 2403.04722 | null |
2024-03-07 | ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes | Hashmat Shadab Malik et.al. | 2403.04701 | link |
2024-03-07 | Delving into the Trajectory Long-tail Distribution for Muti-object Tracking | Sijia Chen et.al. | 2403.04700 | link |
2024-03-07 | PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation | Junsong Chen et.al. | 2403.04692 | null |
2024-03-07 | Pix2Gif: Motion-Guided Diffusion for GIF Generation | Hitesh Kandala et.al. | 2403.04634 | null |
2024-03-07 | A Domain Translation Framework with an Adversarial Denoising Diffusion Model to Generate Synthetic Datasets of Echocardiography Images | Cristiana Tiago et.al. | 2403.04612 | null |
2024-03-07 | Dynamic critical behavior of the chiral phase transition from the real-time functional renormalization group | Johannes V. Roth et.al. | 2403.04573 | null |
2024-03-07 | Rescaled Mode-Coupling Scheme for the Quantitative Description of Experimentally Observed Colloid Dynamics | Joel Diaz Maier et.al. | 2403.04556 | null |
2024-03-07 | Poisson equation with measure data, reconstruction formula and Doob classes of processes | Andrzej Rozkosz et.al. | 2403.04543 | null |
2024-03-06 | 3D Diffusion Policy | Yanjie Ze et.al. | 2403.03954 | link |
2024-03-06 | GUIDE: Guidance-based Incremental Learning with Diffusion Models | Bartosz Cywiński et.al. | 2403.03938 | link |
2024-03-06 | Hierarchical Diffusion Policy for Kinematics-Aware Multi-Task Robotic Manipulation | Xiao Ma et.al. | 2403.03890 | null |
2024-03-06 | Towards a Schauder theory for fractional viscous Hamilton--Jacobi equations | Espen R. Jakobsen et.al. | 2403.03884 | null |
2024-03-06 | Latent Dataset Distillation with Diffusion Models | Brian B. Moser et.al. | 2403.03881 | null |
2024-03-06 | Convergence rate of the Smoluchowski-Kramers approximation for diffusions with jumps | Chungang Shi et.al. | 2403.03877 | null |
2024-03-06 | Accelerating Convergence of Score-Based Diffusion Models, Provably | Gen Li et.al. | 2403.03852 | null |
2024-03-06 | Two 100 TeV neutrinos coincident with the Seyfert galaxy NGC 7469 | Giacomo Sommani et.al. | 2403.03752 | null |
2024-03-06 | Diffusion on language model embeddings for protein sequence generation | Viacheslav Meshchaninov et.al. | 2403.03726 | null |
2024-03-06 | Spectral Algorithms on Manifolds through Diffusion | Weichun Xia et.al. | 2403.03669 | null |
2024-03-05 | Moment estimates, exponential integrability, concentration inequalities and exit times estimates on evolving manifolds | Robert Baumgarth et.al. | 2403.03209 | null |
2024-03-05 | Scaling Rectified Flow Transformers for High-Resolution Image Synthesis | Patrick Esser et.al. | 2403.03206 | null |
2024-03-05 | MAGID: An Automated Pipeline for Generating Synthetic Multi-modal Datasets | Hossein Aboutalebi et.al. | 2403.03194 | null |
2024-03-05 | Behavior Generation with Latent Actions | Seungjae Lee et.al. | 2403.03181 | link |
2024-03-05 | The Amplitude Equation for the Space-Fractional Swift-Hohenberg Equation | Christian Kuehn et.al. | 2403.03158 | null |
2024-03-05 | On dynamics of gasless combustion in slowly varying periodic media: periodic fronts, their stability and propagation-extinction-diffusion-reignition pattern | Amanda Matson et.al. | 2403.03144 | null |
2024-03-05 | Enhanced beam-beam modeling to include longitudinal variation during weak-strong simulation | Derong Xu et.al. | 2403.03137 | null |
2024-03-05 | NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models | Zeqian Ju et.al. | 2403.03100 | null |
2024-03-05 | Proof-of-concept for a nonadditive stochastic model of supercooled liquids | Antonio Cesar do Prado Rosa Junior et.al. | 2403.03041 | null |
2024-03-05 | Global N-body Simulation of Gap Edge Structures Created by Perturbations from a Small Satellite Embedded in Saturn's Rings | Naoya Torii et.al. | 2403.03012 | null |
2024-03-02 | Bespoke Non-Stationary Solvers for Fast Sampling of Diffusion and Flow Models | Neta Shaul et.al. | 2403.01329 | null |
2024-03-02 | Longtime behavior of semilinear multi-term fractional in time diffusion | Nataliya Vasylyeva et.al. | 2403.01302 | null |
2024-03-02 | Anomalous mass dependency in Hydra endoderm cell cluster diffusion | Aline Lütz et.al. | 2403.01294 | null |
2024-03-02 | On the Arnold diffusion mechanism in Medium Earth Orbit | Elisa Maria Alessi et.al. | 2403.01283 | null |
2024-03-02 | Rigidity results for group von Neumann algebras with diffuse center | Ionuţ Chifan et.al. | 2403.01280 | null |
2024-03-02 | Analyzing the transport coefficients and observables of a rotating QGP medium in kinetic theory framework with a novel approach to the collision integral | Shubhalaxmi Rath et.al. | 2403.01240 | null |
2024-03-02 | DiffSal: Joint Audio and Video Learning for Diffusion Saliency Prediction | Junwen Xiong et.al. | 2403.01226 | null |
2024-03-02 | TCIG: Two-Stage Controlled Image Generation with Quality Enhancement through Diffusion | Salaheldin Mohamed et.al. | 2403.01212 | null |
2024-03-02 | Atacama Large Aperture Submillimeter Telescope (AtLAST) science: Gas and dust in nearby galaxies | Daizhong Liu et.al. | 2403.01202 | null |
2024-03-02 | Modelling ion acceleration and transport in corotating interaction regions: the mass-to-charge ratio dependence of the particle spectrum | Zheyi Ding et.al. | 2403.01201 | null |
2024-02-29 | DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models | Muyang Li et.al. | 2402.19481 | link |
2024-02-29 | Towards Generalizable Tumor Synthesis | Qi Chen et.al. | 2402.19470 | link |
2024-02-29 | Anomalous contribution to galactic rotation curves due to stochastic spacetime | Jonathan Oppenheim et.al. | 2402.19459 | null |
2024-02-29 | Listening to the Noise: Blind Denoising with Gibbs Diffusion | David Heurtel-Depeiges et.al. | 2402.19455 | link |
2024-02-29 | Structure Preserving Diffusion Models | Haoye Lu et.al. | 2402.19369 | null |
2024-02-29 | A new analytical model of the cosmic-ray energy flux for Galactic diffuse radio emission | Andrea Bracco et.al. | 2402.19367 | null |
2024-02-29 | A Novel Approach to Industrial Defect Generation through Blended Latent Diffusion Model with Online Adaptation | Hanxi Li et.al. | 2402.19330 | link |
2024-02-29 | DiffAssemble: A Unified Graph-Diffusion Model for 2D and 3D Reassembly | Gianluca Scarpellini et.al. | 2402.19302 | link |
2024-02-29 | Modeling the Progenitor Stars of Observed IIP Supernovae | Kai-An You et.al. | 2402.19260 | link |
2024-02-29 | Generative models struggle with kirigami metamaterials | Gerrit Felsch et.al. | 2402.19196 | null |
2024-02-28 | Logarithmic Sobolev Inequalities for Bounded Domains and Applications to Drift-Diffusion Equations | Elie Abdo et.al. | 2402.18572 | null |
2024-02-28 | Diffusion Language Models Are Versatile Protein Learners | Xinyou Wang et.al. | 2402.18567 | null |
2024-02-28 | Photon statistics of resonantly driven spectrally diffusive quantum emitters | Aymeric Delteil et.al. | 2402.18542 | null |
2024-02-28 | Optimality conditions for sparse optimal control of viscous Cahn-Hilliard systems with logarithmic potential | Pierluigi Colli et.al. | 2402.18506 | null |
2024-02-28 | Dynamical Regimes of Diffusion Models | Giulio Biroli et.al. | 2402.18491 | null |
2024-02-28 | Introducing cuDisc: a 2D code for protoplanetary disc structure and evolution calculations | Alfie Robinson et.al. | 2402.18471 | link |
2024-02-28 | Effect of a perpendicular magnetic field on bilayer graphene under dual gating | Mouhamadou Hassane Saley et.al. | 2402.18399 | null |
2024-02-28 | Deep Confident Steps to New Pockets: Strategies for Docking Generalization | Gabriele Corso et.al. | 2402.18396 | link |
2024-02-28 | Topological charge and spin Hall effects due to skyrmions in canted antiferromagnets | A. N. Zarezad et.al. | 2402.18369 | null |
2024-02-28 | Objective and Interpretable Breast Cosmesis Evaluation with Attention Guided Denoising Diffusion Anomaly Detection Model | Sangjoon Park et.al. | 2402.18362 | null |
2024-02-27 | Diffusion Meets DAgger: Supercharging Eye-in-hand Imitation Learning | Xiaoyu Zhang et.al. | 2402.17768 | null |
2024-02-27 | Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners | Yazhou Xing et.al. | 2402.17723 | null |
2024-02-27 | Structure-Guided Adversarial Training of Diffusion Models | Ling Yang et.al. | 2402.17563 | null |
2024-02-27 | Fast Lithium Ion Diffusion in Brownmillerite $\mathrm{Li}_{x}\mathrm{Sr}_{2}\mathrm{Co}_{2}\mathrm{O}_{5}$ | Xin Chen et.al. | 2402.17557 | null |
2024-02-27 | Scribble Hides Class: Promoting Scribble-Based Weakly-Supervised Semantic Segmentation with Its Class Label | Xinliang Zhang et.al. | 2402.17555 | link |
2024-02-27 | Forming 1D Periodic J-aggregates by Mechanical Bending of BNNTs: Evidence of Activated Molecular Diffusion | J. -B. Marceau et.al. | 2402.17537 | null |
2024-02-27 | Diffusion Model-Based Image Editing: A Survey | Yi Huang et.al. | 2402.17525 | link |
2024-02-27 | Label-Noise Robust Diffusion Models | Byeonghu Na et.al. | 2402.17517 | link |
2024-02-27 | The Unwanted Dissemination of Science: The Usage of Academic Articles as Ammunition in Contested Discursive Arenas on Twitter | Richard Zhang et.al. | 2402.17495 | null |
2024-02-27 | EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions | Linrui Tian et.al. | 2402.17485 | null |
2024-02-26 | Stochastic Conditional Diffusion Models for Semantic Image Synthesis | Juyeon Ko et.al. | 2402.16506 | link |
2024-02-26 | Outline-Guided Object Inpainting with Diffusion Models | Markus Pobitzer et.al. | 2402.16421 | null |
2024-02-26 | Renormalisation Group Methods for Effective Epidemiological Models | Stefan Hohenegger et.al. | 2402.16409 | null |
2024-02-26 | Entropy production for diffusion processes across a semipermeable interface | Paul C Bressloff et.al. | 2402.16403 | null |
2024-02-26 | Quantitative Propagation of Chaos for Mean Field Interacting Particle System | Xing Huang et.al. | 2402.16400 | null |
2024-02-26 | Placing Objects in Context via Inpainting for Out-of-distribution Segmentation | Pau de Jorge et.al. | 2402.16392 | link |
2024-02-26 | Generative AI in Vision: A Survey on Models, Metrics and Applications | Gaurav Raut et.al. | 2402.16369 | null |
2024-02-26 | Feedback Efficient Online Fine-Tuning of Diffusion Models | Masatoshi Uehara et.al. | 2402.16359 | null |
2024-02-26 | Referee Can Play: An Alternative Approach to Conditional Generation via Model Inversion | Xuantong Liu et.al. | 2402.16305 | null |
2024-02-26 | Graph Diffusion Policy Optimization | Yijing Liu et.al. | 2402.16302 | link |
2024-02-23 | Seamless Human Motion Composition with Blended Positional Encodings | German Barquero et.al. | 2402.15509 | link |
2024-02-23 | Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition | Chun-Hsiao Yeh et.al. | 2402.15504 | link |
2024-02-23 | Length and Velocity Scales in Protoplanetary Disk Turbulence | Debanjan Sengupta et.al. | 2402.15475 | null |
2024-02-23 | Solute transport due to periodic loading in a soft porous material | Matilde Fiori et.al. | 2402.15451 | null |
2024-02-23 | ProTIP: Probabilistic Robustness Verification on Text-to-Image Diffusion Models against Stochastic Perturbation | Yi Zhang et.al. | 2402.15429 | link |
2024-02-23 | Dendrites with corners | Enugala Sumanth Nani et.al. | 2402.15394 | null |
2024-02-23 | Understanding Oversmoothing in Diffusion-Based GNNs From the Perspective of Operator Semigroup Theory | Weichen Zhao et.al. | 2402.15326 | null |
2024-02-23 | Ubiquitous short-range order in multi-principal element alloys | Ying Han et.al. | 2402.15305 | null |
2024-02-23 | Let's Rectify Step by Step: Improving Aspect-based Sentiment Analysis with Diffusion Models | Shunyu Liu et.al. | 2402.15289 | link |
2024-02-23 | Generative Modelling with Tensor Train approximations of Hamilton-Jacobi-Bellman equations | David Sommer et.al. | 2402.15285 | null |
2024-02-22 | Cameras as Rays: Pose Estimation via Ray Diffusion | Jason Y. Zhang et.al. | 2402.14817 | null |
2024-02-22 | GeneOH Diffusion: Towards Generalizable Hand-Object Interaction Denoising via Denoising Diffusion | Xueyi Liu et.al. | 2402.14810 | link |
2024-02-22 | Consolidating Attention Features for Multi-view Image Editing | Or Patashnik et.al. | 2402.14792 | null |
2024-02-22 | Customize-A-Video: One-Shot Motion Customization of Text-to-Video Diffusion Models | Yixuan Ren et.al. | 2402.14780 | null |
2024-02-22 | Two-stage Cytopathological Image Synthesis for Augmenting Cervical Abnormality Screening | Zhenrong Shen et.al. | 2402.14707 | null |
2024-02-22 | PeriodGrad: Towards Pitch-Controllable Neural Vocoder Based on a Diffusion Probabilistic Model | Yukiya Hono et.al. | 2402.14692 | null |
2024-02-22 | Error Estimates for First- and Second-Order Lagrange-Galerkin Moving Mesh Schemes for the One-Dimensional Convection-Diffusion Equation | Kharisma Surya Putri et.al. | 2402.14691 | null |
2024-02-22 | Structure and thermodynamics of defects in Na-feldspar from a neural network potential | Alexander Gorfer et.al. | 2402.14640 | null |
2024-02-22 | Debiasing Text-to-Image Diffusion Models | Ruifei He et.al. | 2402.14577 | null |
2024-02-22 | DynGMA: a robust approach for learning stochastic differential equations from data | Aiqing Zhu et.al. | 2402.14475 | link |
2024-02-21 | D-Flow: Differentiating through Flows for Controlled Generation | Heli Ben-Hamu et.al. | 2402.14017 | null |
2024-02-21 | SDXL-Lightning: Progressive Adversarial Diffusion Distillation | Shanchuan Lin et.al. | 2402.13929 | null |
2024-02-21 | Non-asymptotic Convergence of Discrete-time Diffusion Models: New Approach and Improved Rate | Yuchen Liang et.al. | 2402.13901 | null |
2024-02-21 | Conformal and nonminimal couplings in fractional cosmology | Kevin Marroquín et.al. | 2402.13850 | null |
2024-02-21 | The influence of thermal pressure gradients and ionization (im)balance on the ambipolar diffusion and charge-neutral drifts | M. M. Gómez-Míguez et.al. | 2402.13813 | null |
2024-02-21 | NeuralDiffuser: Controllable fMRI Reconstruction with Primary Visual Feature Guided Diffusion | Haoyu Li et.al. | 2402.13809 | null |
2024-02-21 | The Geography of Information Diffusion in Online Discourse on Europe and Migration | Elisa Leonardelli et.al. | 2402.13800 | null |
2024-02-21 | Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions | Jiayu Chen et.al. | 2402.13777 | link |
2024-02-21 | Cas-DiffCom: Cascaded diffusion model for infant longitudinal super-resolution 3D medical image completion | Lianghu Guo et.al. | 2402.13776 | null |
2024-02-21 | Music Style Transfer with Time-Varying Inversion of Diffusion Models | Sifei Li et.al. | 2402.13763 | null |
2024-02-20 | Nonequilibrium fluctuations of chemical reaction networks at criticality: The Schlögl model as paradigmatic case | Benedikt Remlein et.al. | 2402.13168 | null |
2024-02-20 | Neural Network Diffusion | Kai Wang et.al. | 2402.13144 | link |
2024-02-20 | Ultrafast lattice disordering can be accelerated by electronic collisional forces | Gilberto A. de la Pena Munoz et.al. | 2402.13133 | null |
2024-02-20 | How accurate are simulations and experiments for the lattice energies of molecular crystals? | Flaviano Della Pia et.al. | 2402.13059 | null |
2024-02-20 | Excited state-specific CASSCF theory for the torsion of ethylene | Sandra Saade et.al. | 2402.13046 | null |
2024-02-20 | Text-Guided Molecule Generation with Diffusion Language Model | Haisong Gong et.al. | 2402.13040 | link |
2024-02-20 | The Anomalous Long-Ranged Influence of an Inclusion in Momentum-Conserving Active Fluids | Thibaut Arnoulx de Pirey et.al. | 2402.12996 | null |
2024-02-20 | Visual Style Prompting with Swapping Self-Attention | Jaeseok Jeong et.al. | 2402.12974 | link |
2024-02-20 | CLIPping the Deception: Adapting Vision-Language Models for Universal Deepfake Detection | Sohail Ahmed Khan et.al. | 2402.12927 | link |
2024-02-20 | RealCompo: Dynamic Equilibrium between Realism and Compositionality Improves Text-to-Image Diffusion Models | Xinchen Zhang et.al. | 2402.12908 | link |
2024-02-19 | FiT: Flexible Vision Transformer for Diffusion Model | Zeyu Lu et.al. | 2402.12376 | link |
2024-02-19 | A Lower Bound for Estimating Fréchet Means | Shayan Hundrieser et.al. | 2402.12290 | null |
2024-02-19 | Analysis of Persian News Agencies on Instagram, A Words Co-occurrence Graph-based Approach | Mohammad Heydari et.al. | 2402.12272 | null |
2024-02-19 | Synthetic location trajectory generation using categorical diffusion models | Simon Dirmeier et.al. | 2402.12242 | link |
2024-02-19 | Diffusion Tempering Improves Parameter Estimation with Probabilistic Integrators for Ordinary Differential Equations | Jonas Beck et.al. | 2402.12231 | link |
2024-02-19 | Adversarial Feature Alignment: Balancing Robustness and Accuracy in Deep Learning via Adversarial Training | Leo Hyun Park et.al. | 2402.12187 | null |
2024-02-19 | Anomalous Diffusion, Prethermalization, and Particle Binding in an Interacting Flat Band System | Mirko Daumann et.al. | 2402.12180 | null |
2024-02-19 | Human Video Translation via Query Warping | Haiming Zhu et.al. | 2402.12099 | null |
2024-02-19 | Malliavin Calculus for rough stochastic differential equations | Fabio Bugini et.al. | 2402.12056 | null |
2024-02-19 | Constraining the stellar populations of ultra-diffuse galaxies in the MATLAS survey using spectral energy distribution fitting | Maria Luisa Buzzo et.al. | 2402.12033 | null |
2024-02-16 | Fusion of Diffusion Weighted MRI and Clinical Data for Predicting Functional Outcome after Acute Ischemic Stroke with Deep Contrastive Learning | Chia-Ling Tsai et.al. | 2402.10894 | null |
2024-02-16 | 3D Diffuser Actor: Policy Diffusion with 3D Scene Representations | Tsung-Wei Ke et.al. | 2402.10885 | null |
2024-02-16 | Electronic Conductivity Measurements in Solid Electrolytes Using an Ion Blocking Microelectrode: Noise Rejection Based on a Median Filter | Veyis Gunes et.al. | 2402.10883 | null |
2024-02-16 | Control Color: Multimodal Diffusion-based Interactive Image Colorization | Zhexin Liang et.al. | 2402.10855 | null |
2024-02-16 | Training Class-Imbalanced Diffusion Model Via Overlap Optimization | Divin Yan et.al. | 2402.10821 | link |
2024-02-16 | VATr++: Choose Your Words Wisely for Handwritten Text Generation | Bram Vanherle et.al. | 2402.10798 | null |
2024-02-16 | Nearly-optimal effective stability estimates around Diophantine tori of Hölder Hamiltonians | Santiago Barbieri et.al. | 2402.10764 | null |
2024-02-16 | Revisiting a Core-Jet Laboratory at High Redshift: Analysis of the Radio Jet in the Quasar PKS 2215+020 at z=3.572 | Sándor Frey et.al. | 2402.10722 | null |
2024-02-16 | Rethinking Human-like Translation Strategy: Integrating Drift-Diffusion Model with Large Language Models for Machine Translation | Hongbin Na et.al. | 2402.10699 | null |
2024-02-16 | Decomposition for Enhancing Attention: Improving LLM-based Text-to-SQL through Workflow Paradigm | Yuanzhen Xie et.al. | 2402.10671 | link |
2024-02-15 | Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation | Huizhuo Yuan et.al. | 2402.10210 | null |
2024-02-15 | Recovering the Pre-Fine-Tuning Weights of Generative Models | Eliahu Horwitz et.al. | 2402.10208 | link |
2024-02-15 | Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment | Rui Yang et.al. | 2402.10207 | link |
2024-02-15 | Radio-astronomical Image Reconstruction with Conditional Denoising Diffusion Model | Mariia Drozdova et.al. | 2402.10204 | link |
2024-02-15 | Tracer dynamics in polymer networks: generalized Langevin description | Sebastian Milster et.al. | 2402.10148 | null |
2024-02-15 | Energy Flux Decomposition in Magnetohydrodynamic Turbulence | D. Capocci et.al. | 2402.10125 | null |
2024-02-15 | A Blob Method for Mean Field Control With Terminal Constraints | Katy Craig et.al. | 2402.10124 | link |
2024-02-15 | Collision efficiency of droplets across diffusive, electrostatic and inertial regimes | Florian Poydenot et.al. | 2402.10117 | null |
2024-02-15 | Quantized Embedding Vectors for Controllable Diffusion Language Models | Cheng Kang et.al. | 2402.10107 | null |
2024-02-15 | Classification Diffusion Models | Shahar Yadin et.al. | 2402.10095 | null |
2024-02-14 | Magic-Me: Identity-Specific Video Customized Diffusion | Ze Ma et.al. | 2402.09368 | link |
2024-02-14 | Investigation of Ga interstitial and vacancy diffusion in | Channyung Lee et.al. | 2402.09354 | null |
2024-02-14 | On the system size dependence of the diffusion coefficients in MD simulations: A simple correction formula for pure dense fluids | Sergey Khrapak et.al. | 2402.09348 | null |
2024-02-14 | Lattice B-field correlators for heavy quarks | Luis Altenkort et.al. | 2402.09337 | null |
2024-02-14 | Leveraging Pre-Trained Autoencoders for Interpretable Prototype Learning of Music Audio | Pablo Alonso-Jiménez et.al. | 2402.09318 | null |
2024-02-14 | Disentangling the origin of chemical differences using GHOST | C. Saffe et.al. | 2402.09278 | null |
2024-02-14 | A Modular Deep Learning-based Approach for Diffuse Optical Tomography Reconstruction | Alessandro Benfenati et.al. | 2402.09277 | null |
2024-02-14 | Synthesizing Knowledge-enhanced Features for Real-world Zero-shot Food Detection | Pengfei Zhou et.al. | 2402.09242 | link |
2024-02-14 | Modeling of groundwater flow in porous medium layered over inclined impermeable bed | Petr Girg et.al. | 2402.09215 | null |
2024-02-14 | A universal scaling limit for diffusive amnesic step-reinforced random walks | Marco Bertenghi et.al. | 2402.09202 | null |
2024-02-13 | IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation | Luke Melas-Kyriazi et.al. | 2402.08682 | null |
2024-02-13 | Chain Reaction of Ideas: Can Radioactive Decay Predict Technological Innovation? | Guilherme S. Y. Giardini et.al. | 2402.08681 | null |
2024-02-13 | Target Score Matching | Valentin De Bortoli et.al. | 2402.08667 | null |
2024-02-13 | Learning Continuous 3D Words for Text-to-Image Generation | Ta-Ying Cheng et.al. | 2402.08654 | null |
2024-02-13 | Clustering of primordial black holes from quantum diffusion during inflation | Chiara Animali et.al. | 2402.08642 | null |
2024-02-13 | Latent Inversion with Timestep-aware Sampling for Training-free Non-rigid Editing | Yunji Jung et.al. | 2402.08601 | null |
2024-02-13 | Denoising Diffusion Restoration Tackles Forward and Inverse Problems for the Laplace Operator | Amartya Mukherjee et.al. | 2402.08563 | null |
2024-02-13 | Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases | Ziyi Zhang et.al. | 2402.08552 | link |
2024-02-13 | Branching Interval Partition Diffusions | Matthew Buckland et.al. | 2402.08548 | null |
2024-02-13 | Hyperballistic transport in dense ionized matter under external AC electric fields | Daniele Gamba et.al. | 2402.08519 | null |
2024-02-12 | Label-Efficient Model Selection for Text Generation | Shir Ashury-Tahan et.al. | 2402.07891 | null |
2024-02-12 | High-order harmonic generation in 2D Transition Metal Disulphides | Jose Manuel Iglesias et.al. | 2402.07850 | null |
2024-02-12 | Self-heating effects and switching dynamics in graphene multiterminal Josephson junctions | Máté Kedves et.al. | 2402.07831 | null |
2024-02-12 | Towards a mathematical theory for consistency training in diffusion models | Gen Li et.al. | 2402.07802 | null |
2024-02-12 | Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models | Jiacheng Ye et.al. | 2402.07754 | link |
2024-02-12 | The GALAH survey: Elemental abundances in open clusters using joint effective temperature and surface gravity photometric priors | Kevin L. Beeson et.al. | 2402.07748 | null |
2024-02-12 | Topological Edge States in Reconfigurable Multi-stable Mechanical Metamaterials | Zhen Wang et.al. | 2402.07707 | null |
2024-02-12 | Metastability and time scales for parabolic equations with drift 2: the general time scale | Claudio Landim et.al. | 2402.07695 | null |
2024-02-12 | Cosmology at the Field Level with Probabilistic Machine Learning | Adam Rouhiainen et.al. | 2402.07694 | null |
2024-02-12 | Higher-order Connection Laplacians for Directed Simplicial Complexes | Xue Gong et.al. | 2402.07631 | null |
2024-02-09 | The impact of different unravelings in a monitored system of free fermions | Giulia Piccitto et.al. | 2402.06597 | null |
2024-02-09 | Diffusion-ES: Gradient-free Planning with Diffusion for Autonomous Driving and Zero-Shot Instruction Following | Brian Yang et.al. | 2402.06559 | null |
2024-02-09 | The role of mobility in epidemics near criticality | Beatrice Nettuno et.al. | 2402.06505 | null |
2024-02-09 | Sequential Flow Matching for Generative Modeling | Jongmin Yoon et.al. | 2402.06461 | null |
2024-02-09 | ControlUDA: Controllable Diffusion-assisted Unsupervised Domain Adaptation for Cross-Weather Semantic Segmentation | Fengyi Shen et.al. | 2402.06446 | null |
2024-02-09 | Improving 2D-3D Dense Correspondences with Diffusion Models for 6D Object Pose Estimation | Peter Hönig et.al. | 2402.06436 | null |
2024-02-09 | Enhanced bubble growth near an advancing solidification front | Jochem G. Meijer et.al. | 2402.06409 | null |
2024-02-09 | Spectral properties of the Dirichlet-to-Neumann operator for spheroids | Denis S. Grebenkov et.al. | 2402.06372 | null |
2024-02-09 | Sparse identification of nonlocal interaction kernels in nonlinear gradient flow equations via partial inversion | Jose A. Carrillo et.al. | 2402.06355 | null |
2024-02-09 | Particle Denoising Diffusion Sampler | Angus Phillips et.al. | 2402.06320 | link |
2024-02-08 | InstaGen: Enhancing Object Detection by Training on Synthetic Dataset | Chengjian Feng et.al. | 2402.05937 | null |
2024-02-08 | Time Series Diffusion in the Frequency Domain | Jonathan Crabbé et.al. | 2402.05933 | link |
2024-02-08 | Dirichlet Flow Matching with Applications to DNA Sequence Design | Hannes Stark et.al. | 2402.05841 | link |
2024-02-08 | AvatarMMC: 3D Head Avatar Generation and Editing with Multi-Modal Conditioning | Wamiq Reyaz Para et.al. | 2402.05803 | null |
2024-02-08 | Determining the significance and relative importance of parameters of a simulated quenching algorithm using statistical tools | Pedro A. Castillo et.al. | 2402.05791 | null |
2024-02-08 | Hydrogen abstraction from metal surfaces: When electron-hole pair excitations strongly affect hot-atom recombination | Oihana Galparsoro et.al. | 2402.05743 | null |
2024-02-08 | First operation of a multi-channel Q-Pix prototype: measuring transverse electron diffusion in a gas time projection chamber | Nora Hoch et.al. | 2402.05734 | null |
2024-02-08 | DiffSpeaker: Speech-Driven 3D Facial Animation with Diffusion Transformer | Zhiyuan Ma et.al. | 2402.05712 | link |
2024-02-08 | Discovery and characterisation of a new Galactic Planetary Nebula | W. E. Celnik et.al. | 2402.05658 | null |
2024-02-08 | Scalable Diffusion Models with State Space Backbone | Zhengcong Fei et.al. | 2402.05608 | link |
2024-02-07 | Nature of the diffuse emission sources in the H I supershell in the galaxy IC 1613 | Anastasiya D. Yarovova et.al. | 2402.05107 | null |
2024-02-07 | On diffusion models for amortized inference: Benchmarking and improving stochastic control and sampling | Marcin Sendera et.al. | 2402.05098 | link |
2024-02-07 | Convergence of spatial branching processes to | Félix Foutel-Rodier et.al. | 2402.05096 | null |
2024-02-07 | Interacting particle approximation of cross-diffusion systems | Jose Antonio Carrillo et.al. | 2402.05094 | null |
2024-02-07 | NITO: Neural Implicit Fields for Resolution-free Topology Optimization | Amin Heyrani Nobari et.al. | 2402.05073 | null |
2024-02-07 | LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation | Jiaxiang Tang et.al. | 2402.05054 | null |
2024-02-07 | Non-reversible lifts of reversible diffusion processes and relaxation times | Andreas Eberle et.al. | 2402.05041 | null |
2024-02-07 | Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design | Andrew Campbell et.al. | 2402.04997 | link |
2024-02-07 | On the Cahn-Hilliard equation with kinetic rate dependent dynamic boundary conditions and non-smooth potentials: Well-posedness and asymptotic limits | Maoyin Lv et.al. | 2402.04965 | null |
2024-02-07 | Hidden non-equilibrium pathways towards crystalline perfection | A. Mangu et.al. | 2402.04962 | null |
2024-02-06 | Geometric theory of (extended) time-reversal symmetries in stochastic processes -- Part I: finite dimension | Jérémy O'Byrne et.al. | 2402.04217 | null |
2024-02-06 | Maximal regularity and optimal control for a non-local Cahn-Hilliard tumour growth model | Matteo Fornoni et.al. | 2402.04204 | null |
2024-02-06 | SHIELD: An Evaluation Benchmark for Face Spoofing and Forgery Detection with Multimodal Large Language Models | Yichen Shi et.al. | 2402.04178 | link |
2024-02-06 | Entropy-regularized Diffusion Policy with Q-Ensembles for Offline Reinforcement Learning | Ruoqi Zhang et.al. | 2402.04080 | link |
2024-02-06 | Generative Modeling of Graphs via Joint Diffusion of Node and Edge Attributes | Nimrod Berman et.al. | 2402.04046 | null |
2024-02-06 | PAC-Bayesian Adversarially Robust Generalization Bounds for Graph Neural Network | Tan Sun et.al. | 2402.04038 | null |
2024-02-06 | Polyp-DDPM: Diffusion-Based Semantic Polyp Synthesis for Enhanced Segmentation | Zolnamar Dorjsembe et.al. | 2402.04031 | link |
2024-02-06 | Space Group Constrained Crystal Generation | Rui Jiao et.al. | 2402.03992 | null |
2024-02-06 | Controllable Diverse Sampling for Diffusion Based Motion Behavior Forecasting | Yiming Xu et.al. | 2402.03981 | null |
2024-02-06 | Weibel- and non-resonant Whistler wave growth in an expanding plasma in a 1D simulation geometry | M E Dieckmann et.al. | 2402.03925 | null |
2024-02-05 | Do Diffusion Models Learn Semantically Meaningful and Efficient Representations? | Qiyao Liang et.al. | 2402.03305 | null |
2024-02-05 | Zero-shot Object-Level OOD Detection with Context-Aware Inpainting | Quang-Huy Nguyen et.al. | 2402.03292 | null |
2024-02-05 | InstanceDiffusion: Instance-level Control for Image Generation | Xudong Wang et.al. | 2402.03290 | link |
2024-02-05 | Estimating position-dependent and anisotropic diffusivity tensors from molecular dynamics trajectories: Existing methods and future outlook | Tiago Domingues et.al. | 2402.03285 | null |
2024-02-05 | Organic or Diffused: Can We Distinguish Human Art from AI-generated Images? | Anna Yoo Jeong Ha et.al. | 2402.03214 | null |
2024-02-05 | Light and Optimal Schrödinger Bridge Matching | Nikita Gushchin et.al. | 2402.03207 | link |
2024-02-05 | Guidance with Spherical Gaussian Constraint for Conditional Diffusion | Lingxiao Yang et.al. | 2402.03201 | link |
2024-02-05 | Direct-a-Video: Customized Video Generation with User-Directed Camera Movement and Object Motion | Shiyuan Yang et.al. | 2402.03162 | null |
2024-02-05 | Nonlinear feedback of the electrostatic instability on the blazar-induced pair beam and GeV cascade | Mahmoud Alawashra et.al. | 2402.03127 | null |
2024-02-05 | DARTS: Diffusion Approximated Residual Time Sampling for Low Variance Time-of-flight Rendering in Homogeneous Scattering Medium | Qianyue He et.al. | 2402.03106 | null |
2024-02-02 | Revealing crucial effects of reservoir environment and hydrocarbon fractions on fluid behaviour in kaolinite pores | Rixin Zhao et.al. | 2402.01633 | null |
2024-02-02 | NeuroCine: Decoding Vivid Video Sequences from Human Brain Activties | Jingyuan Sun et.al. | 2402.01590 | null |
2024-02-02 | Transformation semigroups and their applications | Katarzyna Pichór et.al. | 2402.01572 | null |
2024-02-02 | Boximator: Generating Rich and Controllable Motions for Video Synthesis | Jiawei Wang et.al. | 2402.01566 | null |
2024-02-02 | Resolution dependence of most probable pathways with state-dependent diffusivity | Alice L. Thorneywork et.al. | 2402.01559 | null |
2024-02-02 | The galactic bubbles of starburst galaxies: The influence of galactic large-scale magnetic fields | Z. Meliani et.al. | 2402.01541 | null |
2024-02-02 | Low-Resource Cross-Domain Singing Voice Synthesis via Reduced Self-Supervised Speech Representations | Panos Kakoulidis et.al. | 2402.01520 | null |
2024-02-02 | Cross-view Masked Diffusion Transformers for Person Image Synthesis | Trung X. Pham et.al. | 2402.01516 | link |
2024-02-02 | Binomial-tree approximation for time-inconsistent stopping | Erhan Bayraktar et.al. | 2402.01482 | null |
2024-02-02 | SVI solutions to stochastic nonlinear diffusion equations on general measure spaces | Benjamin Gess et.al. | 2402.01479 | null |
2024-02-01 | AToM: Amortized Text-to-Mesh using 2D Diffusion | Guocheng Qian et.al. | 2402.00867 | null |
2024-02-01 | ViCA-NeRF: View-Consistency-Aware 3D Editing of Neural Radiance Fields | Jiahua Dong et.al. | 2402.00864 | link |
2024-02-01 | An Analysis of the Variance of Diffusion-based Speech Enhancement | Bunlong Lay et.al. | 2402.00811 | null |
2024-02-01 | Distilling Conditional Diffusion Models for Offline Reinforcement Learning through Trajectory Stitching | Shangzhe Li et.al. | 2402.00807 | null |
2024-02-01 | AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning | Fu-Yun Wang et.al. | 2402.00769 | link |
2024-02-01 | The Sonora Substellar Atmosphere Models. IV. Elf Owl: Atmospheric Mixing and Chemical Disequilibrium with Varying Metallicity and C/O Ratios | Sagnick Mukherjee et.al. | 2402.00756 | null |
2024-02-01 | Neutral carbon in diffuse interstellar medium: abundance matching with H2 for DLAs at high redshifts | Sergei Balashev et.al. | 2402.00714 | null |
2024-02-01 | Cylindrically symmetric diffusion model for relativistic heavy-ion collisions | Johannes Hoelck et.al. | 2402.00628 | null |
2024-02-01 | CapHuman: Capture Your Moments in Parallel Universes | Chao Liang et.al. | 2402.00627 | link |
2024-02-01 | Diffusion-based Light Field Synthesis | Ruisheng Gao et.al. | 2402.00575 | null |
2024-01-31 | Motion Guidance: Diffusion-Based Image Editing with Differentiable Motion Estimators | Daniel Geng et.al. | 2401.18085 | null |
2024-01-31 | An electrodynamic wave model for the action potential | Vitaly L. Galinsky et.al. | 2401.18051 | null |
2024-01-31 | Reversible, Irreversible and Mixed Regimes for Periodically Driven Disks in Random Obstacle Arrays | D. Minogue et.al. | 2401.18042 | null |
2024-01-31 | Ljusternik-Schnirelmann eigenvalues for the fractional $m-$Laplacian without the | Julian Fernandez Bonder et.al. | 2401.18041 | null |
2024-01-31 | Diagnosing the particle transport mechanism in the pulsar halo via X-ray observations | Qi-Zuo Wu et.al. | 2401.17982 | null |
2024-01-31 | Convergence Analysis for General Probability Flow ODEs of Diffusion Models in Wasserstein Distances | Xuefeng Gao et.al. | 2401.17958 | null |
2024-01-31 | Investigation of Microstructure and Corrosion Resistance of Ti-Al-V Titanium Alloys Obtained by Spark Plasma Sintering | Aleksey Nokhrin et.al. | 2401.17941 | null |
2024-01-31 | Lipolysis on Lipid Droplets: Mathematical Modelling and Numerical Discretisation | Reymart Salcedo Lagunero et.al. | 2401.17935 | link |
2024-01-31 | AEROBLADE: Training-Free Detection of Latent Diffusion Images Using Autoencoder Reconstruction Error | Jonas Ricker et.al. | 2401.17879 | link |
2024-01-31 | Multiplicity results for mass constrained Allen-Cahn equations on Riemannian manifolds with boundary | Dario Corona et.al. | 2401.17847 | null |
2024-01-30 | Study of X-ray emission from the S147 nebula with SRG/eROSITA: X-ray imaging, spectral characterization and a multiwavelength picture | Miltiadis Michailidis et.al. | 2401.17312 | null |
2024-01-30 | G321.3-3.9: a new supernova remnant observed with multi-band radio data and in the SRG/eROSITA All-Sky Surveys | S. Mantovanini et.al. | 2401.17294 | null |
2024-01-30 | Discovery of the Goat Horn complex: a | Nicola Locatelli et.al. | 2401.17291 | null |
2024-01-30 | A new understanding of the Gemini-Monoceros X-ray enhancement from discoveries with eROSITA | Jonathan R. Knies et.al. | 2401.17289 | null |
2024-01-30 | Probing the physical properties of the IGM using SRG/eROSITA spectra from blazars | E. Gatuzz et.al. | 2401.17283 | null |
2024-01-30 | You Only Need One Step: Fast Super-Resolution with Stable Diffusion via Scale Distillation | Mehdi Noroozi et.al. | 2401.17258 | null |
2024-01-30 | Stochastic motions of the two-dimensional many-body delta-Bose gas | Yu-Ting Chen et.al. | 2401.17243 | null |
2024-01-30 | ContactGen: Contact-Guided Interactive 3D Human Generation for Partners | Dongjun Gu et.al. | 2401.17212 | null |
2024-01-30 | Quantum dynamics in one and two dimensions via recursion method | Filipp Uskov et.al. | 2401.17211 | null |
2024-01-30 | Transfer Learning for Text Diffusion Models | Kehang Han et.al. | 2401.17181 | null |
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-06-10 | Monkey See, Monkey Do: Harnessing Self-attention in Motion Diffusion for Zero-shot Motion Transfer | Sigal Raab et.al. | 2406.06508 | link |
2024-06-10 | Human Gaze and Head Rotation during Navigation, Exploration and Object Manipulation in Shared Environments with Robots | Tim Schreiter et.al. | 2406.06300 | null |
2024-06-07 | SMART: Scene-motion-aware human action recognition framework for mental disorder group | Zengyuan Lai et.al. | 2406.04649 | null |
2024-06-03 | PDP: Physics-Based Character Animation via Diffusion Policy | Takara E. Truong et.al. | 2406.00960 | null |
2024-06-02 | Unsupervised Neural Motion Retargeting for Humanoid Teleoperation | Satoshi Yagi et.al. | 2406.00727 | null |
2024-06-02 | T2LM: Long-Term 3D Human Motion Generation from Multiple Sentences | Taeryung Lee et.al. | 2406.00636 | null |
2024-05-30 | MotionLLM: Understanding Human Behaviors from Human Motions and Videos | Ling-Hao Chen et.al. | 2405.20340 | null |
2024-05-30 | RapVerse: Coherent Vocals and Whole-Body Motions Generations from Text | Jiaben Chen et.al. | 2405.20336 | null |
2024-05-30 | SMPLX-Lite: A Realistic and Drivable Avatar Benchmark with Rich Geometry and Texture Annotations | Yujiao Jiang et.al. | 2405.19609 | null |
2024-05-30 | Multi-Condition Latent Diffusion Network for Scene-Aware Neural Human Motion Prediction | Xuehao Gao et.al. | 2405.18700 | null |
2024-05-30 | Benchmarking Skeleton-based Motion Encoder Models for Clinical Applications: Estimating Parkinson's Disease Severity in Walking Sequences | Vida Adeli et.al. | 2405.17817 | link |
2024-05-28 | MotionLLM: Multimodal Motion-Language Learning with Large Language Models | Qi Wu et.al. | 2405.17013 | null |
2024-05-27 | A Cross-Dataset Study for Text-based 3D Human Motion Retrieval | Léore Bensabath et.al. | 2405.16909 | null |
2024-05-25 | SuDA: Support-based Domain Adaptation for Sim2Real Motion Capture with Flexible Sensors | Jiawei Fang et.al. | 2405.16152 | null |
2024-05-24 | FreeMotion: A Unified Framework for Number-free Text-to-Motion Synthesis | Ke Fan et.al. | 2405.15763 | null |
2024-05-24 | Learning Generalizable Human Motion Generator with Reinforcement Learning | Yunyao Mao et.al. | 2405.15541 | null |
2024-05-24 | Text-guided 3D Human Motion Generation with Keyframe-based Parallel Skip Transformer | Zichen Geng et.al. | 2405.15439 | null |
2024-05-24 | A Systematic Review on Custom Data Gloves | Valerio Belcamino et.al. | 2405.15417 | null |
2024-05-24 | On the Identification of Temporally Causal Representation with Instantaneous Dependence | Zijian Li et.al. | 2405.15325 | null |
2024-05-24 | Off-the-shelf ChatGPT is a Good Few-shot Human Motion Predictor | Haoxuan Qu et.al. | 2405.15267 | null |
2024-05-23 | Event-based dataset for the detection and classification of manufacturing assembly tasks | Laura Duarte et.al. | 2405.14626 | link |
2024-05-21 | MOSS: Motion-based 3D Clothed Human Synthesis from Monocular Video | Hongsheng Wang et.al. | 2405.12806 | null |
2024-05-21 | Towards Using Fast Embedded Model Predictive Control for Human-Aware Predictive Robot Navigation | Till Hielscher et.al. | 2405.12616 | null |
2024-05-21 | Physics-based Scene Layout Generation from Human Motion | Jianan Li et.al. | 2405.12460 | null |
2024-05-23 | Flexible Motion In-betweening with Diffusion Models | Setareh Cohan et.al. | 2405.11126 | null |
2024-05-17 | Semantic Gesticulator: Semantics-Aware Co-Speech Gesture Synthesis | Zeyi Zhang et.al. | 2405.09814 | null |
2024-05-16 | Integrating Uncertainty-Aware Human Motion Prediction into Graph-Based Manipulator Motion Planning | Wansong Liu et.al. | 2405.09779 | null |
2024-05-24 | ContourCraft: Learning to Resolve Intersections in Neural Multi-Garment Simulations | Artur Grigorev et.al. | 2405.09522 | null |
2024-05-13 | Generating Human Motion in 3D Scenes from Text Descriptions | Zhi Cen et.al. | 2405.07784 | null |
2024-05-13 | Establishing a Unified Evaluation Framework for Human Motion Generation: A Comparative Analysis of Metrics | Ali Ismail-Fawaz et.al. | 2405.07680 | link |
2024-05-13 | Motion Keyframe Interpolation for Any Human Skeleton via Temporally Consistent Point Cloud Sampling and Reconstruction | Clinton Mo et.al. | 2405.07444 | null |
2024-05-10 | Shape Conditioned Human Motion Generation with Diffusion Model | Kebing Xue et.al. | 2405.06778 | null |
2024-05-09 | A Mixture of Experts Approach to 3D Human Motion Prediction | Edmund Shieh et.al. | 2405.06088 | link |
2024-05-09 | StableMoFusion: Towards Robust and Efficient Diffusion-based Motion Generation Framework | Yiheng Huang et.al. | 2405.05691 | null |
2024-05-08 | Audio Matters Too! Enhancing Markerless Motion Capture with Audio Signals for String Performance Capture | Yitong Jin et.al. | 2405.04963 | null |
2024-05-08 | WixUp: A General Data Augmentation Framework for Wireless Perception in Tracking of Humans | Yin Li et.al. | 2405.04804 | null |
2024-05-08 | Exploring Vision Transformers for 3D Human Motion-Language Models with Motion Patches | Qing Yu et.al. | 2405.04771 | null |
2024-05-07 | Diff-IP2D: Diffusion-Based Hand-Object Interaction Prediction on Egocentric Videos | Junyi Ma et.al. | 2405.04370 | link |
2024-05-06 | MoDiPO: text-to-motion alignment via AI-feedback-driven Direct Preference Optimization | Massimiliano Pappa et.al. | 2405.03803 | null |
2024-05-06 | LGTM: Local-to-Global Text-Driven Human Motion Diffusion Model | Haowen Sun et.al. | 2405.03485 | link |
2024-05-05 | Multimodal Sense-Informed Prediction of 3D Human Motions | Zhenyu Lou et.al. | 2405.02911 | null |
2024-05-05 | Efficient Text-driven Motion Generation via Latent Consistency Training | Mengxian Hu et.al. | 2405.02791 | null |
2024-05-03 | Physics-informed generative neural networks for RF propagation prediction with application to indoor body perception | Federica Fieramosca et.al. | 2405.02131 | null |
2024-04-30 | MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model | Wenxun Dai et.al. | 2404.19759 | link |
2024-04-30 | PACER+: On-Demand Pedestrian Animation Controller in Driving Scenarios | Jingbo Wang et.al. | 2404.19722 | null |
2024-04-30 | Fake it to make it: Using synthetic data to remedy the data shortage in joint multimodal speech-and-gesture synthesis | Shivam Mehta et.al. | 2404.19622 | null |
2024-04-30 | Physical Non-inertial Poser (PNP): Modeling Non-inertial Effects in Sparse-inertial Human Motion Capture | Xinyu Yi et.al. | 2404.19619 | null |
2024-04-30 | Ultra Inertial Poser: Scalable Motion Capture and Tracking from Sparse Inertial Sensors and Ultra-Wideband Ranging | Rayan Armani et.al. | 2404.19541 | link |
2024-04-29 | 4D-DRESS: A 4D Dataset of Real-world Human Clothing with Semantic Annotations | Wenbo Wang et.al. | 2404.18630 | link |
2024-04-27 | Hybrid 3D Human Pose Estimation with Monocular Video and Sparse IMUs | Yiming Bao et.al. | 2404.17837 | null |
2024-04-26 | Clustering of Motion Trajectories by a Distance Measure Based on Semantic Features | Christoph Zelch et.al. | 2404.17269 | link |
2024-04-25 | SHINE: Social Homology Identification for Navigation in Crowded Environments | Diego Martinez-Baselga et.al. | 2404.16705 | null |
2024-04-23 | WANDR: Intention-guided Human Motion Generation | Markos Diomataris et.al. | 2404.15383 | null |
2024-04-20 | Efficient Verification of a RADAR SoC Using Formal and Simulation-Based Methods | Aman Kumar et.al. | 2404.15371 | null |
2024-04-19 | A Weight-aware-based Multi-source Unsupervised Domain Adaptation Method for Human Motion Intention Recognition | Xiao-Yin Liu et.al. | 2404.15366 | link |
2024-04-23 | TAAT: Think and Act from Arbitrary Texts in Text2Motion | Runqi Wang et.al. | 2404.14745 | null |
2024-04-21 | MLP: Motion Label Prior for Temporal Sentence Localization in Untrimmed 3D Human Motions | Sheng Yan et.al. | 2404.13657 | link |
2024-04-19 | Purposer: Putting Human Motion Generation in Context | Nicolas Ugrinovic et.al. | 2404.12942 | null |
2024-04-19 | MCM: Multi-condition Motion Synthesis Framework | Zeyu Ling et.al. | 2404.12886 | null |
2024-04-17 | Text-controlled Motion Mamba: Text-Instructed Temporal Grounding of Human Motion | Xinghan Wang et.al. | 2404.11375 | null |
2024-04-17 | Following the Human Thread in Social Navigation | Luca Scofano et.al. | 2404.11327 | link |
2024-04-16 | HumMUSS: Human Motion Understanding using State Space Models | Arnab Kumar Mondal et.al. | 2404.10880 | null |
2024-04-15 | in2IN: Leveraging individual Information to Generate Human INteractions | Pablo Ruiz Ponce et.al. | 2404.09988 | null |
2024-04-15 | Learning Human Motion from Monocular Videos via Cross-Modal Manifold Alignment | Shuaiying Hou et.al. | 2404.09499 | null |
2024-04-12 | Synthesis of Through-Wall Micro-Doppler Signatures of Human Motions Using Generative Adversarial Networks | Kainat Yasmeen et.al. | 2404.08739 | null |
2024-04-12 | EventEgo3D: 3D Human Motion Capture from Egocentric Event Streams | Christen Millerdurai et.al. | 2404.08640 | link |
2024-04-11 | Model Predictive Trajectory Planning for Human-Robot Handovers | Thies Oelerich et.al. | 2404.07505 | null |
2024-04-08 | Social-MAE: Social Masked Autoencoder for Multi-person Motion Representation Learning | Mahsa Ehsanpour et.al. | 2404.05578 | null |
2024-04-08 | Multi-agent Long-term 3D Human Pose Forecasting via Interaction-aware Trajectory Conditioning | Jaewoo Jeong et.al. | 2404.05218 | link |
2024-04-07 | A Unified Diffusion Framework for Scene-aware Human Motion Estimation from Sparse Signals | Jiangnan Tang et.al. | 2404.04890 | null |
2024-04-05 | PhysPT: Physics-aware Pretrained Transformer for Estimating Human Dynamics from Monocular Videos | Yufei Zhang et.al. | 2404.04430 | null |
2024-04-04 | Towards more realistic human motion prediction with attention to motion coordination | Pengxiang Ding et.al. | 2404.03584 | null |
2024-04-03 | MotionChain: Conversational Motion Controllers via Multimodal Prompts | Biao Jiang et.al. | 2404.01700 | link |
2024-04-02 | Leveraging Digital Perceptual Technologies for Remote Perception and Analysis of Human Biomechanical Processes: A Contactless Approach for Workload and Joint Force Assessment | Jesudara Omidokun et.al. | 2404.01576 | null |
2024-04-01 | Large Motion Model for Unified Multi-Modal Motion Generation | Mingyuan Zhang et.al. | 2404.01284 | null |
2024-04-02 | SurMo: Surface-based 4D Motion Modeling for Dynamic Human Rendering | Tao Hu et.al. | 2404.01225 | null |
2024-03-29 | A Unified Framework for Human-centric Point Cloud Video Understanding | Yiteng Xu et.al. | 2403.20031 | null |
2024-03-28 | InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction | Sirui Xu et.al. | 2403.19652 | null |
2024-03-28 | RELI11D: A Comprehensive Multimodal Human Motion Dataset and Method | Ming Yan et.al. | 2403.19501 | null |
2024-03-28 | Beyond Talking -- Generating Holistic 3D Human Dyadic Motion for Communication | Mingze Sun et.al. | 2403.19467 | null |
2024-04-01 | BAMM: Bidirectional Autoregressive Motion Model | Ekkasit Pinyoanuntapong et.al. | 2403.19435 | link |
2024-03-30 | Egocentric Scene-aware Human Trajectory Prediction | Weizhuo Wang et.al. | 2403.19026 | null |
2024-03-26 | Move as You Say, Interact as You Can: Language-guided Human Motion Generation with Scene Affordance | Zan Wang et.al. | 2403.18036 | link |
2024-03-26 | ConvoFusion: Multi-Modal Conversational Diffusion for Co-Speech Gesture Synthesis | Muhammad Hamza Mughal et.al. | 2403.17936 | null |
2024-03-30 | MMVP: A Multimodal MoCap Dataset with Vision and Pressure Sensors | He Zhang et.al. | 2403.17610 | null |
2024-03-28 | Gaze-guided Hand-Object Interaction Synthesis: Benchmark and Method | Jie Tian et.al. | 2403.16169 | null |
2024-03-26 | PKU-DyMVHumans: A Multi-View Video Benchmark for High-Fidelity Dynamic Human Modeling | Xiaoyun Zheng et.al. | 2403.16080 | link |
2024-03-23 | Human Motion Prediction under Unexpected Perturbation | Jiangbei Yue et.al. | 2403.15891 | null |
2024-03-23 | Contact-aware Human Motion Generation from Textual Descriptions | Sihan Ma et.al. | 2403.15709 | null |
2024-03-22 | GPT-Connect: Interaction between Text-Driven Human Motion Generator and 3D Scenes in a Training-free Manner | Haoxuan Qu et.al. | 2403.14947 | null |
2024-03-21 | HCTO: Optimality-Aware LiDAR Inertial Odometry with Hybrid Continuous Time Optimization for Compact Wearable Mapping System | Jianping Li et.al. | 2403.14173 | link |
2024-03-21 | Existence Is Chaos: Enhancing 3D Human Motion Prediction with Uncertainty Consideration | Zhihao Wang et.al. | 2403.14104 | null |
2024-03-20 | CoMo: Controllable Motion Generation through Language Guided Pose Code Editing | Yiming Huang et.al. | 2403.13900 | null |
2024-03-20 | LaCE-LHMP: Airflow Modelling-Inspired Long-Term Human Motion Prediction By Enhancing Laminar Characteristics in Human Flow | Yufei Zhu et.al. | 2403.13640 | link |
2024-03-21 | LaserHuman: Language-guided Scene-aware Human Motion Generation in Free Environment | Peishan Cong et.al. | 2403.13307 | link |
2024-03-20 | Map-Aware Human Pose Prediction for Robot Follow-Ahead | Qingyuan Jiang et.al. | 2403.13294 | null |
2024-03-19 | WHAC: World-grounded Humans and Cameras | Wanqi Yin et.al. | 2403.12959 | link |
2024-03-18 | Graph-Jigsaw Conditioned Diffusion Model for Skeleton-based Video Anomaly Detection | Ali Karami et.al. | 2403.12172 | null |
2024-03-18 | UV Gaussians: Joint Learning of Mesh Deformation and Gaussian Textures for Human Avatar Modeling | Yujiao Jiang et.al. | 2403.11589 | null |
2024-03-17 | FORCE: Dataset and Method for Intuitive Physics Guided Human-object Interaction | Xiaohan Zhang et.al. | 2403.11237 | null |
2024-03-17 | THOR: Text to Human-Object Interaction Diffusion via Relation Intervention | Qianyang Wu et.al. | 2403.11208 | null |
2024-03-14 | GazeMotion: Gaze-guided Human Motion Forecasting | Zhiming Hu et.al. | 2403.09885 | null |
2024-03-14 | THÖR-MAGNI: A Large-scale Indoor Motion Capture Recording of Human Movement and Robot Interaction | Tim Schreiter et.al. | 2403.09285 | link |
2024-03-13 | Scaling Up Dynamic Human-Scene Interaction Modeling | Nan Jiang et.al. | 2403.08629 | null |
2024-03-12 | DexCap: Scalable and Portable Mocap Data Collection System for Dexterous Manipulation | Chen Wang et.al. | 2403.07788 | null |
2024-03-19 | Motion Mamba: Efficient and Long Sequence Motion Generation with Hierarchical and Bidirectional Selective SSM | Zeyu Zhang et.al. | 2403.07487 | link |
2024-03-10 | Platypose: Calibrated Zero-Shot Multi-Hypothesis 3D Human Motion Estimation | Paweł A. Pierzchlewicz et.al. | 2403.06164 | link |
2024-03-09 | MATRIX: Multi-Agent Trajectory Generation with Diverse Contexts | Zhuo Xu et.al. | 2403.06041 | null |
2024-03-09 | Enhancing Expressiveness in Dance Generation via Integrating Frequency and Music Style Information | Qiaochu Huang et.al. | 2403.05834 | link |
2024-03-08 | Integrating Predictive Motion Uncertainties with Distributionally Robust Risk-Aware Control for Safe Robot Navigation in Crowds | Kanghyun Ryu et.al. | 2403.05081 | link |
2024-03-11 | Fooling Neural Networks for Motion Forecasting via Adversarial Attacks | Edgar Medina et.al. | 2403.04954 | null |
2024-03-06 | HMD-Poser: On-Device Real-time Human Motion Tracking from Scalable Sparse Observations | Peng Dai et.al. | 2403.03561 | null |
2024-03-01 | Tri-Modal Motion Retrieval by Learning a Joint Embedding Space | Kangning Yin et.al. | 2403.00691 | null |
2024-02-21 | Context-based Interpretable Spatio-Temporal Graph Convolutional Network for Human Motion Forecasting | Edgar Medina et.al. | 2402.19237 | link |
2024-02-29 | MOSAIC: A Modular System for Assistive and Interactive Cooking | Huaxiaoyue Wang et.al. | 2402.18796 | null |
2024-02-27 | SocialCVAE: Predicting Pedestrian Trajectory via Interaction Conditioned Latents | Wei Xiang et.al. | 2402.17339 | link |
2024-02-27 | LiveHPS: LiDAR-based Scene-level Human Pose and Shape Estimation in Free Environment | Yiming Ren et.al. | 2402.17171 | null |
2024-03-06 | Expressive Whole-Body Control for Humanoid Robots | Xuxin Cheng et.al. | 2402.16796 | null |
2024-02-23 | Seamless Human Motion Composition with Blended Positional Encodings | German Barquero et.al. | 2402.15509 | link |
2024-03-05 | 3D Kinematics Estimation from Video with a Biomechanical Model and Synthetic Training Data | Zhi-Yi Lin et.al. | 2402.13172 | null |
2024-02-20 | A Recurrent Neural Network Enhanced Unscented Kalman Filter for Human Motion Prediction | Wansong Liu et.al. | 2402.13045 | null |
2024-02-19 | Human Video Translation via Query Warping | Haiming Zhu et.al. | 2402.12099 | null |
2024-02-04 | Custom IMU-Based Wearable System for Robust 2.4 GHz Wireless Human Body Parts Orientation Tracking and 3D Movement Visualization on an Avatar | Javier González-Alonso et.al. | 2402.09459 | null |
2024-01-30 | Progress in artificial intelligence applications based on the combination of self-driven sensors and deep learning | Weixiang Wan et.al. | 2402.09442 | null |
2024-02-13 | Approximately Piecewise E(3) Equivariant Point Networks | Matan Atzmon et.al. | 2402.08529 | null |
2024-02-11 | Self-Correcting Self-Consuming Loops for Generative Model Training | Nate Gillman et.al. | 2402.07087 | link |
2024-02-06 | Bidirectional Autoregressive Diffusion Model for Dance Generation | Canyu Zhang et.al. | 2402.04356 | link |
2024-02-06 | Novel IMU-based Adaptive Estimator of the Center of Rotation of Joints for Movement Analysis | Sara García-de-Villa et.al. | 2402.04240 | null |
2024-02-05 | Replication of Impedance Identification Experiments on a Reinforcement-Learning-Controlled Digital Twin of Human Elbows | Hao Yu et.al. | 2402.02904 | null |
2024-02-01 | Transferring human emotions to robot motions using Neural Policy Style Transfer | Raul Fernandez-Fernandez et.al. | 2402.00663 | null |
2024-01-25 | Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks | Tianhe Ren et.al. | 2401.14159 | link |
2024-01-24 | Generative Human Motion Stylization in Latent Space | Chuan Guo et.al. | 2401.13505 | null |
2024-01-24 | GTAutoAct: An Automatic Datasets Generation Framework Based on Game Engine Redevelopment for Action Recognition | Xingyu Song et.al. | 2401.13414 | null |
2024-01-23 | Workspace Optimization Techniques to Improve Prediction of Human Motion During Human-Robot Collaboration | Yi-Shiuan Tung et.al. | 2401.12965 | null |
2024-01-23 | Inertial Sensors for Human Motion Analysis: A Comprehensive Review | Sara García-de-Villa et.al. | 2401.12919 | null |
2024-01-23 | A database of physical therapy exercises with variability of execution collected by wearable sensors | Sara García-de-Villa et.al. | 2401.12868 | null |
2024-01-22 | Full-Body Motion Reconstruction with Sparse Sensing from Graph Perspective | Feiyu Yao et.al. | 2401.11783 | link |
2024-01-24 | MotionMix: Weakly-Supervised Diffusion for Controllable Motion Generation | Nhat M. Hoang et.al. | 2401.11115 | link |
2024-01-19 | Equivariant Graph Neural Operator for Modeling 3D Dynamics | Minkai Xu et.al. | 2401.11037 | link |
2024-01-16 | RoHM: Robust Human Motion Reconstruction via Diffusion | Siwei Zhang et.al. | 2401.08570 | null |
Publish Date | Title | Authors | PDF | Code |
---|---|---|---|---|
2024-06-10 | Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation | Peize Sun et.al. | 2406.06525 | link |
2024-06-10 | The Effect of Training Dataset Size on Discriminative and Diffusion-Based Speech Enhancement Systems | Philippe Gonzalez et.al. | 2406.06160 | null |
2024-06-10 | ProcessPainter: Learn Painting Process from Sequence Data | Yiren Song et.al. | 2406.06062 | null |
2024-06-09 | OmniControlNet: Dual-stage Integration for Conditional Image Generation | Yilin Wang et.al. | 2406.05871 | null |
2024-06-09 | Unified Text-to-Image Generation and Retrieval | Leigang Qu et.al. | 2406.05814 | null |
2024-06-11 | MLCM: Multistep Consistency Distillation of Latent Diffusion Model | Qingsong Xie et.al. | 2406.05768 | null |
2024-06-09 | PaRa: Personalizing Text-to-Image Diffusion via Parameter Rank Reduction | Shangyu Chen et.al. | 2406.05641 | null |
2024-06-09 | Can Prompt Modifiers Control Bias? A Comparative Analysis of Text-to-Image Generative Models | Philip Wootaek Shin et.al. | 2406.05602 | null |
2024-06-08 | Medical Vision Generalist: Unifying Medical Imaging Tasks in Context | Sucheng Ren et.al. | 2406.05565 | link |
2024-06-08 | Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis | Zanlin Ni et.al. | 2406.05478 | null |
2024-06-07 | GenHeld: Generating and Editing Handheld Objects | Chaerin Min et.al. | 2406.05059 | link |
2024-06-07 | GANetic Loss for Generative Adversarial Networks with a Focus on Medical Applications | Shakhnaz Akhmedova et.al. | 2406.05023 | link |
2024-06-07 | AttnDreamBooth: Towards Text-Aligned Personalized Text-to-Image Generation | Lianyu Pang et.al. | 2406.05000 | null |
2024-06-07 | TEDi Policy: Temporally Entangled Diffusion for Robotic Control | Sigmund H. Høeg et.al. | 2406.04806 | null |
2024-06-07 | PQPP: A Joint Benchmark for Text-to-Image Prompt and Query Performance Prediction | Eduard Poesina et.al. | 2406.04746 | link |
2024-06-07 | GenzIQA: Generalized Image Quality Assessment using Prompt-Guided Latent Diffusion Models | Diptanu De et.al. | 2406.04654 | null |
2024-06-07 | CLoG: Benchmarking Continual Learning of Image Generation Models | Haotian Zhang et.al. | 2406.04584 | link |
2024-06-06 | Improving Geo-diversity of Generated Images with Contextualized Vendi Score Guidance | Reyhane Askari Hemmat et.al. | 2406.04551 | null |
2024-06-06 | GenAI Arena: An Open Evaluation Platform for Generative Models | Dongfu Jiang et.al. | 2406.04485 | null |
2024-06-06 | Evaluating Large Vision-Language Models' Understanding of Real-World Complexities Through Synthetic Benchmarks | Haokun Zhou et.al. | 2406.04470 | null |
2024-06-06 | Coherent Zero-Shot Visual Instruction Generation | Quynh Phung et.al. | 2406.04337 | null |
2024-06-06 | BitsFusion: 1.99 bits Weight Quantization of Diffusion Model | Yang Sui et.al. | 2406.04333 | link |
2024-06-06 | Diffusion-based image inpainting with internal learning | Nicolas Cherel et.al. | 2406.04206 | null |
2024-06-06 | Machine Learning-Driven Microwave Imaging for Soil Moisture Estimation near Leaky Pipe | Mohammad Ramezaninia et.al. | 2406.04193 | null |
2024-06-06 | Zero-Painter: Training-Free Layout Control for Text-to-Image Synthesis | Marianna Ohanyan et.al. | 2406.04032 | link |
2024-06-06 | Quantum Implicit Neural Representations | Jiaming Zhao et.al. | 2406.03873 | link |
2024-06-06 | Semantic Similarity Score for Measuring Visual Similarity at Semantic Level | Senran Fan et.al. | 2406.03865 | null |
2024-06-06 | Malware Classification Based on Image Segmentation | Wanhu Nie et.al. | 2406.03831 | null |
2024-06-07 | ReDistill: Residual Encoded Distillation for Peak Memory Reduction | Fang Chen et.al. | 2406.03744 | null |
2024-06-06 | JIGMARK: A Black-Box Approach for Enhancing Image Watermarks against Diffusion Model Edits | Minzhou Pan et.al. | 2406.03720 | link |
2024-06-05 | Tackling GenAI Copyright Issues: Originality Estimation and Genericization | Hiroaki Chiba-Okabe et.al. | 2406.03341 | null |
2024-06-05 | Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion | Hao Wen et.al. | 2406.03184 | link |
2024-06-05 | Language-guided Detection and Mitigation of Unknown Dataset Bias | Zaiying Zhao et.al. | 2406.02889 | null |
2024-06-06 | Inv-Adapter: ID Customization Generation via Image Inversion and Lightweight Adapter | Peng Xing et.al. | 2406.02881 | null |
2024-06-04 | Latent Style-based Quantum GAN for high-quality Image Generation | Su Yeon Chang et.al. | 2406.02668 | null |
2024-06-04 | DDGS-CT: Direction-Disentangled Gaussian Splatting for Realistic Volume Rendering | Zhongpai Gao et.al. | 2406.02518 | null |
2024-06-04 | Guiding a Diffusion Model with a Bad Version of Itself | Tero Karras et.al. | 2406.02507 | null |
2024-06-04 | Stable-Pose: Leveraging Transformers for Pose-Guided Text-to-Image Generation | Jiajun Wang et.al. | 2406.02485 | link |
2024-06-04 | Generative Active Learning for Long-tailed Instance Segmentation | Muzhi Zhu et.al. | 2406.02435 | link |
2024-06-05 | Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation | Clement Chadebec et.al. | 2406.02347 | link |
2024-06-04 | I4VGen: Image as Stepping Stone for Text-to-Video Generation | Xiefan Guo et.al. | 2406.02230 | null |
2024-06-04 | Analyzing the Feature Extractor Networks for Face Image Synthesis | Erdi Sarıtaş et.al. | 2406.02153 | link |
2024-06-04 | The Crystal Ball Hypothesis in diffusion models: Anticipating object positions from initial noise | Yuanhao Ban et.al. | 2406.01970 | null |
2024-06-04 | Enhance Image-to-Image Generation with LLaVA Prompt and Negative Prompt | Zhicheng Ding et.al. | 2406.01956 | null |
2024-06-04 | Plug-and-Play Diffusion Distillation | Yi-Ting Hsiao et.al. | 2406.01954 | null |
2024-05-31 | Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling | Jiatao Gu et.al. | 2405.21048 | null |
2024-05-31 | You Only Scan Once: Efficient Multi-dimension Sequential Modeling with LightNet | Zhen Qin et.al. | 2405.21022 | null |
2024-05-31 | Amortizing intractable inference in diffusion models for vision, language, and control | Siddarth Venkatraman et.al. | 2405.20971 | link |
2024-05-31 | Information Theoretic Text-to-Image Alignment | Chao Wang et.al. | 2405.20759 | null |
2024-05-31 | Diffusion Models Are Innate One-Step Generators | Bowen Zheng et.al. | 2405.20750 | link |
2024-05-31 | Cyclic image generation using chaotic dynamics | Takaya Tanaka et.al. | 2405.20717 | link |
2024-05-31 | Enhancing Counterfactual Image Generation Using Mahalanobis Distance with Distribution Preferences in Feature Space | Yukai Zhang et.al. | 2405.20685 | null |
2024-05-31 | Adv-KD: Adversarial Knowledge Distillation for Faster Diffusion Sampling | Kidist Amde Mekonnen et.al. | 2405.20675 | link |
2024-05-31 | Fourier123: One Image to High-Quality 3D Object Generation with Hybrid Fourier Score Distillation | Shuzhou Yang et.al. | 2405.20669 | link |
2024-05-31 | Learning Gaze-aware Compositional GAN | Nerea Aranjuelo et.al. | 2405.20643 | link |
2024-05-30 | SemFlow: Binding Semantic Segmentation and Image Synthesis via Rectified Flow | Chaoyang Wang et.al. | 2405.20282 | link |
2024-05-30 | ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections | Massimo Bini et.al. | 2405.20271 | link |
2024-05-30 | Boost Your Own Human Image Generation Model via Direct Preference Optimization with AI Feedback | Sanghyeon Na et.al. | 2405.20216 | null |
2024-05-30 | RIGID: A Training-free and Model-Agnostic Framework for Robust AI-Generated Image Detection | Zhiyuan He et.al. | 2405.20112 | null |
2024-05-30 | Mitigating annotation shift in cancer classification using single image generative models | Marta Buetas Arcas et.al. | 2405.19754 | link |
2024-05-30 | Text Guided Image Editing with Automatic Concept Locating and Forgetting | Jia Li et.al. | 2405.19708 | null |
2024-05-30 | Uncertainty-guided Optimal Transport in Depth Supervised Sparse-View 3D Gaussian | Wei Sun et.al. | 2405.19657 | null |
2024-05-30 | Creating Language-driven Spatial Variations of Icon Images | Xianghao Xu et.al. | 2405.19636 | null |
2024-05-29 | Quo Vadis ChatGPT? From Large Language Models to Large Knowledge Models | Venkat Venkatasubramanian et.al. | 2405.19561 | null |
2024-05-29 | MemControl: Mitigating Memorization in Medical Diffusion Models via Automated Parameter Selection | Raman Dutt et.al. | 2405.19458 | null |
2024-05-29 | ConceptPrune: Concept Editing in Diffusion Models via Skilled Neuron Pruning | Ruchika Chavhan et.al. | 2405.19237 | link |
2024-05-29 | Going beyond compositional generalization, DDPMs can produce zero-shot interpolation | Justin Deschenaux et.al. | 2405.19201 | link |
2024-05-29 | The ethical situation of DALL-E 2 | Eduard Hogea et.al. | 2405.19176 | null |
2024-05-29 | Patch-enhanced Mask Encoder Prompt Image Generation | Shusong Xu et.al. | 2405.19085 | null |
2024-05-29 | EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture | Jiaqi Xu et.al. | 2405.18991 | link |
2024-05-29 | Topological Perspectives on Optimal Multimodal Embedding Spaces | Abdul Aziz A. B et.al. | 2405.18867 | null |
2024-05-29 | Flow Priors for Linear Inverse Problems via Iterative Corrupted Trajectory Matching | Yasi Zhang et.al. | 2405.18816 | null |
2024-05-29 | SketchTriplet: Self-Supervised Scenarized Sketch-Text-Image Triplet Generation | Zhenbei Wu et.al. | 2405.18801 | null |
2024-05-30 | Inpaint Biases: A Pathway to Accurate and Unbiased Image Generation | Jiyoon Myung et.al. | 2405.18762 | null |
2024-05-29 | SketchDeco: Decorating B&W Sketches with Colour | Chaitat Utintu et.al. | 2405.18716 | link |
2024-05-28 | 3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting | Qihang Zhang et.al. | 2405.18424 | null |
2024-05-28 | Phased Consistency Model | Fu-Yun Wang et.al. | 2405.18407 | null |
2024-05-28 | Multi-modal Generation via Cross-Modal In-Context Learning | Amandeep Kumar et.al. | 2405.18304 | link |
2024-05-28 | Are Image Distributions Indistinguishable to Humans Indistinguishable to Classifiers? | Zebin You et.al. | 2405.18029 | null |
2024-05-28 | Cycle-YOLO: A Efficient and Robust Framework for Pavement Damage Detection | Zhengji Li et.al. | 2405.17905 | null |
2024-05-27 | RefDrop: Controllable Consistency in Image or Video Generation via Reference Feature Guidance | Jiaojiao Fan et.al. | 2405.17661 | null |
2024-05-27 | Prompt Optimization with Human Feedback | Xiaoqiang Lin et.al. | 2405.17346 | link |
2024-05-27 | From Text to Blueprint: Leveraging Text-to-Image Tools for Floor Plan Creation | Xiaoyu Li et.al. | 2405.17236 | null |
2024-05-27 | Training-free Editioning of Text-to-Image Models | Jinqi Wang et.al. | 2405.17069 | null |
2024-05-27 | The Poisson Midpoint Method for Langevin Dynamics: Provably Efficient Discretization for Diffusion Models | Saravanan Kandasamy et.al. | 2405.17068 | null |
2024-05-27 | Glauber Generative Model: Discrete Diffusion Models via Binary Classification | Harshit Varma et.al. | 2405.17035 | null |
2024-05-27 | Anonymization Prompt Learning for Facial Privacy-Preserving Text-to-Image Generation | Liang Shi et.al. | 2405.16895 | null |
2024-05-27 | Think Before You Act: A Two-Stage Framework for Mitigating Gender Bias Towards Vision-Language Tasks | Yunqi Zhang et.al. | 2405.16860 | link |
2024-05-27 | Unified Editing of Panorama, 3D Scenes, and Videos Through Disentangled Self-Attention Injection | Gihyun Kwon et.al. | 2405.16823 | null |
2024-05-27 | TIE: Revolutionizing Text-based Image Editing for Complex-Prompt Following and High-Fidelity Editing | Xinyu Zhang et.al. | 2405.16803 | null |
2024-05-27 | PromptFix: You Prompt and We Fix the Photo | Yongsheng Yu et.al. | 2405.16785 | null |
2024-05-24 | FastDrag: Manipulate Anything in One Step | Xuanjia Zhao et.al. | 2405.15769 | null |
2024-05-24 | Learning to Discretize Denoising Diffusion ODEs | Vinh Tong et.al. | 2405.15506 | null |
2024-05-24 | A Misleading Gallery of Fluid Motion by Generative Artificial Intelligence | Ali Kashefi et.al. | 2405.15406 | link |
2024-05-24 | Stochastic SR for Gaussian microtextures | Emile Pierret et.al. | 2405.15399 | null |
2024-05-24 | Challenges and Opportunities in 3D Content Generation | Ke Zhao et.al. | 2405.15335 | null |
2024-05-24 | Towards Understanding the Working Mechanism of Text-to-Image Diffusion Model | Mingyang Yi et.al. | 2405.15330 | null |
2024-05-24 | SG-Adapter: Enhancing Text-to-Image Generation with Scene Graph Guidance | Guibao Shen et.al. | 2405.15321 | null |
2024-05-24 | Enhancing Text-to-Image Editing via Hybrid Mask-Informed Fusion | Aoxue Li et.al. | 2405.15313 | null |
2024-05-24 | Unlearning Concepts in Diffusion Model via Concept Domain Correction and Concept Preserving Gradient | Yongliang Wu et.al. | 2405.15304 | null |
2024-05-24 | StyleMaster: Towards Flexible Stylized Image Generation with Diffusion Models | Chengming Xu et.al. | 2405.15287 | null |
2024-05-23 | Improved Distribution Matching Distillation for Fast Image Synthesis | Tianwei Yin et.al. | 2405.14867 | link |
2024-05-23 | Semantica: An Adaptable Image-Conditioned Diffusion Model | Manoj Kumar et.al. | 2405.14857 | null |
2024-05-23 | TerDiT: Ternary Diffusion Models with Transformers | Xudong Lu et.al. | 2405.14854 | link |
2024-05-23 | Good Seed Makes a Good Crop: Discovering Secret Seeds in Text-to-Image Diffusion Models | Katherine Xu et.al. | 2405.14828 | null |
2024-05-24 | Fast-DDPM: Fast Denoising Diffusion Probabilistic Models for Medical Image-to-Image Generation | Hongxu Jiang et.al. | 2405.14802 | link |
2024-05-23 | Membership Inference on Text-to-Image Diffusion Models via Conditional Likelihood Discrepancy | Shengfang Zhai et.al. | 2405.14800 | null |
2024-05-23 | RetAssist: Facilitating Vocabulary Learners with Generative Images in Story Retelling Practices | Qiaoyi Chen et.al. | 2405.14794 | null |
2024-05-23 | EditWorld: Simulating World Dynamics for Instruction-Following Image Editing | Ling Yang et.al. | 2405.14785 | link |
2024-05-23 | OpFlowTalker: Realistic and Natural Talking Face Generation via Optical Flow Guidance | Shuheng Ge et.al. | 2405.14709 | null |
2024-05-23 | Learning Multi-dimensional Human Preference for Text-to-Image Generation | Sixian Zhang et.al. | 2405.14705 | null |
2024-05-21 | Personalized Residuals for Concept-Driven Text-to-Image Generation | Cusuh Ham et.al. | 2405.12978 | null |
2024-05-21 | An Empirical Study and Analysis of Text-to-Image Generation Using Large Language Model-Powered Textual Representation | Zhiyu Tan et.al. | 2405.12914 | link |
2024-05-21 | Leveraging Neural Radiance Fields for Pose Estimation of an Unknown Space Object during Proximity Operations | Antoine Legrand et.al. | 2405.12728 | null |
2024-05-21 | EmoEdit: Evoking Emotions through Image Manipulation | Jingyuan Yang et.al. | 2405.12661 | null |
2024-05-21 | CustomText: Customized Textual Image Generation using Diffusion Models | Shubham Paliwal et.al. | 2405.12531 | null |
2024-05-21 | Customize Your Own Paired Data via Few-shot Way | Jinshu Chen et.al. | 2405.12490 | null |
2024-05-20 | Diffusion for World Modeling: Visual Details Matter in Atari | Eloi Alonso et.al. | 2405.12399 | link |
2024-05-20 | Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices | Nathaniel Cohen et.al. | 2405.12211 | link |
2024-05-20 | Diffusion Models for Generating Ballistic Spacecraft Trajectories | Tyler Presser et.al. | 2405.11738 | link |
2024-05-19 | URDFormer: A Pipeline for Constructing Articulated Simulation Environments from Real-World Images | Zoey Chen et.al. | 2405.11656 | null |
2024-05-18 | UPAM: Unified Prompt Attack in Text-to-Image Generation Models Against Both Textual Filters and Visual Checkers | Duo Peng et.al. | 2405.11336 | null |
2024-05-18 | On the Trajectory Regularity of ODE-based Diffusion Sampling | Defang Chen et.al. | 2405.11326 | link |
2024-05-18 | TriLoRA: Integrating SVD for Advanced Style Personalization in Text-to-Image Generation | Chengcheng Feng et.al. | 2405.11236 | null |
2024-05-18 | ReasonPix2Pix: Instruction Reasoning Dataset for Advanced Image Editing | Ying Jin et.al. | 2405.11190 | null |
2024-05-17 | Improving face generation quality and prompt following with synthetic captions | Michail Tarasiou et.al. | 2405.10864 | null |
2024-05-17 | Multi-scale Semantic Prior Features Guided Deep Neural Network for Urban Street-view Image | Jianshun Zeng et.al. | 2405.10504 | null |
2024-05-17 | Lean Attention: Hardware-Aware Scalable Attention Mechanism for the Decode-Phase of Transformers | Rya Sanovar et.al. | 2405.10480 | null |
2024-05-16 | UniRAG: Universal Retrieval Augmentation for Multi-Modal Large Language Models | Sahel Sharifymoghaddam et.al. | 2405.10311 | null |
2024-05-16 | VirtualModel: Generating Object-ID-retentive Human-object Interaction Image by Diffusion Model for E-commerce Marketing | Binghui Chen et.al. | 2405.09985 | null |
2024-05-16 | KPNDepth: Depth Estimation of Lane Images under Complex Rainy Environment | Zhengxu Shi et.al. | 2405.09964 | null |
2024-05-16 | Chameleon: Mixed-Modal Early-Fusion Foundation Models | Chameleon Team et.al. | 2405.09818 | null |
2024-05-16 | MediSyn: Text-Guided Diffusion Models for Broad Medical 2D and 3D Image Synthesis | Joseph Cho et.al. | 2405.09806 | null |
2024-05-16 | Global-Local Image Perceptual Score (GLIPS): Evaluating Photorealistic Quality of AI-Generated Images | Memoona Aziz et.al. | 2405.09426 | null |
2024-05-15 | DeCoDEx: Confounder Detector Guidance for Improved Diffusion-based Counterfactual Explanations | Nima Fathi et.al. | 2405.09288 | link |
2024-05-14 | Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding | Zhimin Li et.al. | 2405.08748 | link |
2024-05-15 | Similarity Metrics for MR Image-To-Image Translation | Melanie Dohmen et.al. | 2405.08431 | null |
2024-05-14 | Compositional Text-to-Image Generation with Dense Blob Representations | Weili Nie et.al. | 2405.08246 | null |
2024-05-13 | RATLIP: Generative Adversarial CLIP Text-to-Image Synthesis Based on Recurrent Affine Transformations | Chengde Lin et.al. | 2405.08114 | link |
2024-05-13 | CTRLorALTer: Conditional LoRAdapter for Efficient 0-Shot Control & Altering of T2I Models | Nick Stracke et.al. | 2405.07913 | null |
2024-05-13 | SAR Image Synthesis with Diffusion Models | Denisa Qosja et.al. | 2405.07776 | null |
2024-05-12 | Understanding and Evaluating Human Preferences for AI Generated Images with Instruction Tuning | Jiarui Wang et.al. | 2405.07346 | link |
2024-05-12 | Stable Signature is Unstable: Removing Image Watermark from Diffusion Models | Yuepeng Hu et.al. | 2405.07145 | null |
2024-05-12 | MAxPrototyper: A Multi-Agent Generation System for Interactive User Interface Prototyping | Mingyue Yuan et.al. | 2405.07131 | null |
2024-05-11 | Semantic Guided Large Scale Factor Remote Sensing Image Super-resolution with Generative Diffusion Prior | Ce Wang et.al. | 2405.07044 | link |
2024-05-11 | Training-free Subject-Enhanced Attention Guidance for Compositional Text-to-image Generation | Shengyuan Liu et.al. | 2405.06948 | null |
2024-05-10 | Deep MMD Gradient Flow without adversarial training | Alexandre Galashov et.al. | 2405.06780 | null |
2024-05-10 | Controllable Image Generation With Composed Parallel Token Prediction | Jamie Stirling et.al. | 2405.06535 | null |
2024-05-14 | SketchDream: Sketch-based Text-to-3D Generation and Editing | Feng-Lin Liu et.al. | 2405.06461 | null |
2024-05-09 | Photonic quantum generative adversarial networks for classical data | Tigran Sedrakyan et.al. | 2405.06023 | null |
2024-05-09 | Frame Interpolation with Consecutive Brownian Bridge Diffusion | Zonglin Lyu et.al. | 2405.05953 | null |
2024-05-09 | Could It Be Generated? Towards Practical Analysis of Memorization in Text-To-Image Diffusion Models | Zhe Ma et.al. | 2405.05846 | null |
2024-05-10 | MasterWeaver: Taming Editability and Identity for Personalized Text-to-Image Generation | Yuxiang Wei et.al. | 2405.05806 | link |
2024-05-09 | DragGaussian: Enabling Drag-style Manipulation on 3D Gaussian Representation | Sitian Shen et.al. | 2405.05800 | null |
2024-05-09 | Exploring Text-Guided Single Image Editing for Remote Sensing Images | Fangzhou Han et.al. | 2405.05769 | null |
2024-05-09 | End-to-End Generative Semantic Communication Powered by Shared Semantic Knowledge Base | Shuling Li et.al. | 2405.05738 | null |
2024-05-09 | VM-DDPM: Vision Mamba Diffusion for Medical Image Synthesis | Zhihan Ju et.al. | 2405.05667 | null |
2024-05-09 | A Survey on Personalized Content Synthesis with Diffusion Models | Xulu Zhang et.al. | 2405.05538 | null |
2024-05-08 | Cross-Modality Translation with Generative Adversarial Networks to Unveil Alzheimer's Disease Biomarkers | Reihaneh Hassanzadeh et.al. | 2405.05462 | null |
2024-05-08 | DrawL: Understanding the Effects of Non-Mainstream Dialects in Prompted Image Generation | Joshua N. Williams et.al. | 2405.05382 | link |
2024-05-08 | Diffusion-HMC: Parameter Inference with Diffusion Model driven Hamiltonian Monte Carlo | Nayantara Mudur et.al. | 2405.05255 | link |
2024-05-08 | Discrepancy-based Diffusion Models for Lesion Detection in Brain MRI | Keqiang Fan et.al. | 2405.04974 | null |
2024-05-08 | HAGAN: Hybrid Augmented Generative Adversarial Network for Medical Image Synthesis | Zhihan Ju et.al. | 2405.04902 | null |
2024-05-08 | FlexEControl: Flexible and Efficient Multimodal Control for Text-to-Image Generation | Xuehai He et.al. | 2405.04834 | null |
2024-05-07 | TexControl: Sketch-Based Two-Stage Fashion Image Generation Using Diffusion Model | Yongming Zhang et.al. | 2405.04675 | null |
2024-05-07 | ResNCT: A Deep Learning Model for the Synthesis of Nephrographic Phase Images in CT Urography | Syed Jamal Safdar Gardezi et.al. | 2405.04629 | null |
2024-05-07 | Towards Geographic Inclusion in the Evaluation of Text-to-Image Models | Melissa Hall et.al. | 2405.04457 | null |
2024-05-07 | Diffusion-driven GAN Inversion for Multi-Modal Face Image Generation | Jihyun Kim et.al. | 2405.04356 | null |
2024-05-08 | Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion Transformer | Zhuoyi Yang et.al. | 2405.04312 | link |
2024-05-07 | Bayesian Simultaneous Localization and Multi-Lane Tracking Using Onboard Sensors and a SD Map | Yuxuan Xia et.al. | 2405.04290 | null |
2024-05-07 | SEED-Data-Edit Technical Report: A Hybrid Dataset for Instructional Image Editing | Yuying Ge et.al. | 2405.04007 | link |
2024-05-07 | Simple Drop-in LoRA Conditioning on Attention Layers Will Improve Your Diffusion Model | Joo Young Choi et.al. | 2405.03958 | null |
2024-05-06 | Generated Contents Enrichment | Mahdi Naseri et.al. | 2405.03650 | null |
2024-05-06 | CCDM: Continuous Conditional Diffusion Models for Image Generation | Xin Ding et.al. | 2405.03546 | link |
2024-05-05 | Data-Efficient Molecular Generation with Hierarchical Textual Inversion | Seojin Kim et.al. | 2405.02845 | null |
2024-05-05 | ImageInWords: Unlocking Hyper-Detailed Image Descriptions | Roopal Garg et.al. | 2405.02793 | link |
2024-05-04 | U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers | Yuchuan Tian et.al. | 2405.02730 | link |
2024-05-03 | Functional Imaging Constrained Diffusion for Brain PET Synthesis from Structural MRI | Minhui Yu et.al. | 2405.02504 | null |
2024-05-03 | Multi-method Integration with Confidence-based Weighting for Zero-shot Image Classification | Siqi Yin et.al. | 2405.02155 | null |
2024-05-03 | AI-generated art perceptions with GenFrame -- an image-generating picture frame | Peter Kun et.al. | 2405.01901 | null |
2024-05-03 | Defect Image Sample Generation With Diffusion Prior for Steel Surface Defect Recognition | Yichun Tai et.al. | 2405.01872 | null |
2024-05-02 | Long Tail Image Generation Through Feature Space Augmentation and Iterated Learning | Rafael Elberg et.al. | 2405.01705 | link |
2024-05-02 | LocInv: Localization-aware Inversion for Text-Guided Image Editing | Chuanming Tang et.al. | 2405.01496 | link |
2024-05-02 | Improving Subject-Driven Image Synthesis with Subject-Agnostic Guidance | Kelvin C. K. Chan et.al. | 2405.01356 | null |
2024-05-02 | Towards Inclusive Face Recognition Through Synthetic Ethnicity Alteration | Praveen Kumar Chandaliya et.al. | 2405.01273 | null |
2024-05-02 | DiffusionPipe: Training Large Diffusion Models with Efficient Pipelines | Ye Tian et.al. | 2405.01248 | null |
2024-05-02 | On Mechanistic Knowledge Localization in Text-to-Image Generative Models | Samyadeep Basu et.al. | 2405.01008 | null |
2024-05-01 | SonicDiffusion: Audio-Driven Image Generation and Editing with Pretrained Diffusion Models | Burak Can Biner et.al. | 2405.00878 | null |
2024-05-01 | Guided Conditional Diffusion Classifier (ConDiff) for Enhanced Prediction of Infection in Diabetic Foot Ulcers | Palawat Busaranuvong et.al. | 2405.00858 | null |
2024-05-01 | TexSliders: Diffusion-Based Texture Editing in CLIP Space | Julia Guerrero-Viu et.al. | 2405.00672 | null |
2024-05-01 | RGB | Zheng Zeng et.al. | 2405.00666 | null |
2024-05-01 | UWAFA-GAN: Ultra-Wide-Angle Fluorescein Angiography Transformation via Multi-scale Generation and Registration Enhancement | Ruiquan Ge et.al. | 2405.00542 | link |
2024-05-01 | Compressive Sensing Imaging Using Caustic Lens Mask Generated by Periodic Perturbation in a Ripple Tank | Doğan Tunca Arık et.al. | 2405.00407 | null |
2024-05-01 | Streamlining Image Editing with Layered Diffusion Brushes | Peyman Gholami et.al. | 2405.00313 | null |
2024-04-30 | DOCCI: Descriptions of Connected and Contrasting Images | Yasumasa Onoe et.al. | 2404.19753 | null |
2024-04-30 | Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation | Yunhao Ge et.al. | 2404.19752 | null |
2024-04-30 | SwipeGANSpace: Swipe-to-Compare Image Generation via Efficient Latent Space Exploration | Yuto Nakashima et.al. | 2404.19693 | null |
2024-04-30 | TwinDiffusion: Enhancing Coherence and Efficiency in Panoramic Image Generation with Diffusion Models | Teng Zhou et.al. | 2404.19475 | null |
2024-04-30 | InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation | Chanran Kim et.al. | 2404.19427 | null |
2024-05-01 | FOTS: A Fast Optical Tactile Simulator for Sim2Real Learning of Tactile-motor Robot Manipulation Skills | Yongqiang Zhao et.al. | 2404.19217 | link |
2024-04-30 | NeRF-Insert: 3D Local Editing with Multimodal Control Signals | Benet Oriol Sabat et.al. | 2404.19204 | null |
2024-04-29 | DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing | Minghao Chen et.al. | 2404.18929 | null |
2024-04-29 | TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation | Junhao Cheng et.al. | 2404.18919 | link |
2024-04-29 | Hide and Seek: How Does Watermarking Impact Face Recognition? | Yuguang Yao et.al. | 2404.18890 | null |
2024-04-29 | Learning Mixtures of Gaussians Using Diffusion Models | Khashayar Gatmiry et.al. | 2404.18869 | null |
2024-04-29 | FlexiFilm: Long Video Generation with Flexible Conditions | Yichen Ouyang et.al. | 2404.18620 | link |
2024-04-29 | Anywhere: A Multi-Agent Framework for Reliable and Diverse Foreground-Conditioned Image Inpainting | Tianyidan Xie et.al. | 2404.18598 | null |
2024-04-29 | SIDBench: A Python Framework for Reliably Assessing Synthetic Image Detection Methods | Manos Schinas et.al. | 2404.18552 | link |
2024-04-29 | Towards Image Synthesis with Photon Counting Stellar Intensity Interferometry | Alessia Spolon et.al. | 2404.18507 | null |
2024-04-29 | Autonomous Quality and Hallucination Assessment for Virtual Tissue Staining and Digital Pathology | Luzhe Huang et.al. | 2404.18458 | null |
2024-04-29 | PKU-AIGIQA-4K: A Perceptual Quality Assessment Database for Both Text-to-Image and Image-to-Image AI-Generated Images | Jiquan Yuan et.al. | 2404.18409 | link |
2024-04-26 | Spatial-frequency Dual-Domain Feature Fusion Network for Low-Light Remote Sensing Image Enhancement | Zishu Yao et.al. | 2404.17400 | null |
2024-04-26 | Trinity Detector:text-assisted and attention mechanisms based spectral fusion for diffusion generation image detection | Jiawei Song et.al. | 2404.17254 | null |
2024-04-26 | Synthesizing Iris Images using Generative Adversarial Networks: Survey and Comparative Analysis | Shivangi Yadav et.al. | 2404.17105 | null |
2024-04-25 | REBEL: Reinforcement Learning via Regressing Relative Rewards | Zhaolin Gao et.al. | 2404.16767 | link |
2024-04-27 | Denoising: from classical methods to deep CNNs | Jean-Eric Campagne et.al. | 2404.16617 | link |
2024-04-25 | MuseumMaker: Continual Style Customization without Catastrophic Forgetting | Chenxi Liu et.al. | 2404.16612 | null |
2024-04-25 | AudioScenic: Audio-Driven Video Scene Editing | Kaixin Shen et.al. | 2404.16581 | null |
2024-04-25 | Conditional Distribution Modelling for Few-Shot Image Synthesis with Diffusion Models | Parul Gupta et.al. | 2404.16556 | null |
2024-04-25 | OpenDlign: Enhancing Open-World 3D Learning with Depth-Aligned Images | Ye Mao et.al. | 2404.16538 | null |
2024-04-25 | Cross-sensor super-resolution of irregularly sampled Sentinel-2 time series | Aimi Okabayashi et.al. | 2404.16409 | link |
2024-04-26 | Guardians of the Quantum GAN | Archisman Ghosh et.al. | 2404.16156 | null |
2024-04-24 | Spinning solar jets explained through the interplay between plasma sheets and vortex columns | Sahel Dey et.al. | 2404.16096 | null |
2024-04-24 | Editable Image Elements for Controllable Synthesis | Jiteng Mu et.al. | 2404.16029 | null |
2024-04-23 | ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning | Weifeng Chen et.al. | 2404.15449 | null |
2024-04-23 | GLoD: Composing Global Contexts and Local Details in Image Generation | Moyuru Yamada et.al. | 2404.15447 | null |
2024-04-23 | From Parts to Whole: A Unified Reference Framework for Controllable Human Image Generation | Zehuan Huang et.al. | 2404.15267 | null |
2024-04-23 | Adaptive Mixed-Scale Feature Fusion Network for Blind AI-Generated Image Quality Assessment | Tianwei Zhou et.al. | 2404.15163 | null |
2024-04-23 | Multimodal Large Language Model is a Human-Aligned Annotator for Text-to-Image Generation | Xun Wu et.al. | 2404.15100 | null |
2024-04-23 | SkinGEN: an Explainable Dermatology Diagnosis-to-Generation Framework with Interactive Vision-Language Models | Bo Lin et.al. | 2404.14755 | null |
2024-04-23 | FINEMATCH: Aspect-based Fine-grained Image and Text Mismatch Detection and Correction | Hang Hua et.al. | 2404.14715 | null |
2024-04-22 | The Adversarial AI-Art: Understanding, Generation, Detection, and Benchmarking | Yuying Li et.al. | 2404.14581 | null |
2024-04-22 | GeoDiffuser: Geometry-Based Image Editing with Diffusion Models | Rahul Sajnani et.al. | 2404.14403 | null |
2024-04-22 | SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation | Yuying Ge et.al. | 2404.14396 | link |
2024-04-22 | MultiBooth: Towards Generating All Your Concepts in an Image from Text | Chenyang Zhu et.al. | 2404.14239 | link |
2024-04-22 | RHanDS: Refining Malformed Hands for Generated Images with Decoupled Structure and Style Guidance | Chengrui Wang et.al. | 2404.13984 | null |
2024-04-23 | Accelerating Image Generation with Sub-path Linear Approximation Model | Chen Xu et.al. | 2404.13903 | null |
2024-04-22 | Towards Better Text-to-Image Generation Alignment via Attention Modulation | Yihang Wu et.al. | 2404.13899 | null |
2024-04-21 | Enforcing Conditional Independence for Fair Representation Learning and Causal Image Generation | Jensen Hwa et.al. | 2404.13798 | null |
2024-04-21 | Object-Attribute Binding in Text-to-Image Generation: Evaluation and Control | Maria Mihaela Trusca et.al. | 2404.13766 | null |
2024-04-21 | ArtNeRF: A Stylized Neural Field for 3D-Aware Cartoonized Face Synthesis | Zichen Tang et.al. | 2404.13711 | link |
2024-04-21 | Concept Arithmetics for Circumventing Concept Inhibition in Diffusion Models | Vitali Petsiuk et.al. | 2404.13706 | null |
2024-04-19 | Robust CLIP-Based Detector for Exposing Diffusion Model-Generated Images | Santosh et.al. | 2404.12908 | link |
2024-04-19 | Generative Modelling with High-Order Langevin Dynamics | Ziqiang Shi et.al. | 2404.12814 | null |
2024-04-19 | PATE-TripleGAN: Privacy-Preserving Image Synthesis with Gaussian Differential Privacy | Zepeng Jiang et.al. | 2404.12730 | null |
2024-04-19 | How Real Is Real? A Human Evaluation Framework for Unrestricted Adversarial Examples | Dren Fazlija et.al. | 2404.12653 | null |
2024-04-18 | Lazy Diffusion Transformer for Interactive Image Editing | Yotam Nitzan et.al. | 2404.12382 | null |
2024-04-18 | Customizing Text-to-Image Diffusion with Camera Viewpoint Control | Nupur Kumari et.al. | 2404.12333 | null |
2024-04-18 | Alleviating Catastrophic Forgetting in Facial Expression Recognition with Emotion-Centered Models | Israel A. Laurensi et.al. | 2404.12260 | null |
2024-04-18 | StyleBooth: Image Style Editing with Multimodal Instruction | Zhen Han et.al. | 2404.12154 | link |
2024-04-18 | First 2D electron density measurements using Coherence Imaging Spectroscopy in the MAST-U Super-X divertor | N. Lonigro et.al. | 2404.12021 | null |
2024-04-18 | ©Plug-in Authorization for Human Content Copyright Protection in Text-to-Image Model | Chao Zhou et.al. | 2404.11962 | null |
2024-04-18 | LD-Pruner: Efficient Pruning of Latent Diffusion Models using Task-Agnostic Insights | Thibault Castells et.al. | 2404.11936 | null |
2024-04-18 | EdgeFusion: On-Device Text-to-Image Generation | Thibault Castells et.al. | 2404.11925 | null |
2024-04-18 | FreeDiff: Progressive Frequency Truncation for Image Editing with Diffusion Models | Wei Wu et.al. | 2404.11895 | null |
2024-04-18 | Multi-view X-ray Image Synthesis with Multiple Domain Disentanglement from CT Scans | Lixing Tan et.al. | 2404.11889 | null |
2024-04-17 | On the Scalability of GNNs for Molecular Graphs | Maciej Sypetkowski et.al. | 2404.11568 | null |
2024-04-17 | MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation | Kuan-Chieh et.al. | 2404.11565 | null |
2024-04-17 | SSDiff: Spatial-spectral Integrated Diffusion Model for Remote Sensing Pansharpening | Yu Zhong et.al. | 2404.11537 | null |
2024-04-17 | Towards Highly Realistic Artistic Style Transfer via Stable Diffusion with Step-aware and Layer-aware Prompt | Zhanjie Zhang et.al. | 2404.11474 | link |
2024-04-17 | Image Generative Semantic Communication with Multi-Modal Similarity Estimation for Resource-Limited Networks | Eri Hosonuma et.al. | 2404.11280 | null |
2024-04-17 | Optical Image-to-Image Translation Using Denoising Diffusion Models: Heterogeneous Change Detection as a Use Case | João Gabriel Vinholi et.al. | 2404.11243 | null |
2024-04-17 | TiNO-Edit: Timestep and Noise Optimization for Robust Diffusion-Based Image Editing | Sherry X. Chen et.al. | 2404.11120 | link |
2024-04-16 | LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-Text Generation? | Yuchi Wang et.al. | 2404.10763 | link |
2024-04-16 | Adversarial Identity Injection for Semantic Face Image Synthesis | Giuseppe Tarollo et.al. | 2404.10408 | null |
2024-04-16 | Generating Counterfactual Trajectories with Latent Diffusion Models for Concept Discovery | Payal Varshney et.al. | 2404.10356 | null |
2024-04-16 | CanvasPic: An Interactive Tool for Freely Generating Facial Images Based on Spatial Layout | Jiafu Wei et.al. | 2404.10352 | null |
2024-04-17 | OmniSSR: Zero-shot Omnidirectional Image Super-Resolution using Stable Diffusion Model | Runyi Li et.al. | 2404.10312 | null |
2024-04-16 | OneActor: Consistent Character Generation via Cluster-Conditioned Guidance | Jiahao Wang et.al. | 2404.10267 | null |
2024-04-16 | Diffusion assisted image reconstruction in optoacoustic tomography | M. G. González et.al. | 2404.10239 | null |
2024-04-15 | Multi-objective evolutionary GAN for tabular data synthesis | Nian Ran et.al. | 2404.10176 | link |
2024-04-15 | ANCHOR: LLM-driven News Subject Conditioning for Text-to-Image Synthesis | Aashish Anantha Ramakrishnan et.al. | 2404.10141 | link |
2024-04-15 | HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing | Mude Hui et.al. | 2404.09990 | null |
2024-04-15 | Photo-Realistic Image Restoration in the Wild with Controlled Vision-Language Models | Ziwei Luo et.al. | 2404.09732 | link |
2024-04-15 | In-Context Translation: Towards Unifying Image Recognition, Processing, and Generation | Han Xue et.al. | 2404.09633 | null |
2024-04-15 | Magic Clothing: Controllable Garment-Driven Image Synthesis | Weifeng Chen et.al. | 2404.09512 | link |
2024-04-15 | Watermark-embedded Adversarial Examples for Copyright Protection against Diffusion Models | Peifei Zhu et.al. | 2404.09401 | null |
2024-04-14 | DreamScape: 3D Scene Creation via Gaussian Splatting joint Correlation Modeling | Xuening Yuan et.al. | 2404.09227 | null |
2024-04-13 | InverseVis: Revealing the Hidden with Curved Sphere Tracing | Kai Lawonn et.al. | 2404.09092 | null |
2024-04-13 | Diffusion Models Meet Remote Sensing: Principles, Methods, and Perspectives | Yidan Liu et.al. | 2404.08926 | null |
2024-04-12 | E3: Ensemble of Expert Embedders for Adapting Synthetic Image Detectors to New Generators Using Limited Data | Aref Azizpour et.al. | 2404.08814 | link |
2024-04-12 | Semantic Approach to Quantifying the Consistency of Diffusion Model Image Generation | Brinnae Bent et.al. | 2404.08799 | link |
2024-04-12 | Counterfactual Explanations for Face Forgery Detection via Adversarial Removal of Artifacts | Yang Li et.al. | 2404.08341 | link |
2024-04-11 | Latent Guard: a Safety Framework for Text-to-image Generation | Runtao Liu et.al. | 2404.08031 | link |
2024-04-11 | Rethinking Artistic Copyright Infringements in the Era of Text-to-Image Generative Models | Mazda Moayeri et.al. | 2404.08030 | null |
2024-04-11 | OpenBias: Open-set Bias Detection in Text-to-Image Generative Models | Moreno D'Incà et.al. | 2404.07990 | link |
2024-04-11 | Taming Stable Diffusion for Text to 360° Panorama Image Generation | Cheng Zhang et.al. | 2404.07949 | link |
2024-04-11 | Generating Synthetic Satellite Imagery With Deep-Learning Text-to-Image Models -- Technical Challenges and Implications for Monitoring and Verification | Tuong Vy Nguyen et.al. | 2404.07754 | null |
2024-04-11 | Applying Guidance in a Limited Interval Improves Sample and Distribution Quality in Diffusion Models | Tuomas Kynkäänniemi et.al. | 2404.07724 | null |
2024-04-11 | Model-based Cleaning of the QUILT-1M Pathology Dataset for Text-Conditional Image Synthesis | Marc Aubreville et.al. | 2404.07676 | link |
2024-04-11 | Implicit and Explicit Language Guidance for Diffusion-based Visual Perception | Hefeng Wang et.al. | 2404.07600 | null |
2024-04-11 | ObjBlur: A Curriculum Learning Approach With Progressive Object-Level Blurring for Improved Layout-to-Image Generation | Stanislav Frolov et.al. | 2404.07564 | null |
2024-04-11 | CAT: Contrastive Adapter Training for Personalized Image Generation | Jae Wan Park et.al. | 2404.07554 | link |
2024-04-10 | Object-Conditioned Energy-Based Attention Map Alignment in Text-to-Image Diffusion Models | Yasi Zhang et.al. | 2404.07389 | null |
2024-04-10 | RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion | Jaidev Shriram et.al. | 2404.07199 | null |
2024-04-10 | A Gauss-Newton Approach for Min-Max Optimization in Generative Adversarial Networks | Neel Mishra et.al. | 2404.07172 | link |
2024-04-10 | Fine color guidance in diffusion models and its application to image compression at extremely low bitrates | Tom Bordin et.al. | 2404.06865 | null |
2024-04-10 | UDiFF: Generating Conditional Unsigned Distance Fields with Optimal Wavelet Diffusion | Junsheng Zhou et.al. | 2404.06851 | null |
2024-04-10 | MedRG: Medical Report Grounding with Multi-modal Large Language Model | Ke Zou et.al. | 2404.06798 | null |
2024-04-10 | Deep Generative Data Assimilation in Multimodal Setting | Yongquan Qu et.al. | 2404.06665 | link |
2024-04-09 | GeoSynth: Contextually-Aware High-Resolution Satellite Image Synthesis | Srikumar Sastry et.al. | 2404.06637 | link |
2024-04-09 | High Noise Scheduling is a Must | Mahmut S. Gokmen et.al. | 2404.06353 | null |
2024-04-09 | Hyperparameter-Free Medical Image Synthesis for Sharing Data and Improving Site-Specific Segmentation | Alexander Chebykin et.al. | 2404.06240 | link |
2024-04-09 | DiffHarmony: Latent Diffusion Model Meets Image Harmonization | Pengfei Zhou et.al. | 2404.06139 | null |
2024-04-09 | Tackling Structural Hallucination in Image Translation with Local Diffusion | Seunghoi Kim et.al. | 2404.05980 | null |
2024-04-09 | StoryImager: A Unified and Efficient Framework for Coherent Story Visualization and Completion | Ming Tao et.al. | 2404.05979 | link |
2024-04-08 | SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing | Jing Gu et.al. | 2404.05717 | null |
2024-04-08 | MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation | Kunpeng Song et.al. | 2404.05674 | link |
2024-04-08 | Automatic Controllable Colorization via Imagination | Xiaoyan Cong et.al. | 2404.05661 | null |
2024-04-08 | UniFL: Improve Stable Diffusion via Unified Feedback Learning | Jiacheng Zhang et.al. | 2404.05595 | null |
2024-04-08 | Investigating the Effectiveness of Cross-Attention to Unlock Zero-Shot Editing of Text-to-Video Diffusion Models | Saman Motamed et.al. | 2404.05519 | null |
2024-04-08 | Mind-to-Image: Projecting Visual Mental Imagination of the Brain from fMRI | Hugo Caselles-Dupré et.al. | 2404.05468 | null |
2024-04-08 | Mask-ControlNet: Higher-Quality Image Generation with An Additional Mask Prompt | Zhiqi Huang et.al. | 2404.05331 | null |
2024-04-08 | MC | Jiaxiu Jiang et.al. | 2404.05268 | link |
2024-04-08 | Text-to-Image Synthesis for Any Artistic Styles: Advancements in Personalized Artistic Image Generation via Subdivision and Dual Binding | Junseo Park et.al. | 2404.05256 | null |
2024-04-08 | A secure and private ensemble matcher using multi-vault obfuscated templates | Babak Poorebrahim Gilkalaye et.al. | 2404.05205 | null |
2024-04-04 | No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance | Vishaal Udandarao et.al. | 2404.04125 | link |
2024-04-05 | 3D Facial Expressions through Analysis-by-Neural-Synthesis | George Retsinas et.al. | 2404.04104 | null |
2024-04-05 | Dynamic Prompt Optimizing for Text-to-Image Generation | Wenyi Mo et.al. | 2404.04095 | link |
2024-04-05 | Physics-Inspired Synthesized Underwater Image Dataset | Reina Kaneko et.al. | 2404.03998 | null |
2024-04-05 | Concept Weaver: Enabling Multi-Concept Fusion in Text-to-Image Models | Gihyun Kwon et.al. | 2404.03913 | null |
2024-04-04 | CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching | Dongzhi Jiang et.al. | 2404.03653 | link |
2024-04-04 | Reference-Based 3D-Aware Image Editing with Triplane | Bahri Batuhan Bilecen et.al. | 2404.03632 | null |
2024-04-04 | Robust Concept Erasure Using Task Vectors | Minh Pham et.al. | 2404.03631 | null |
2024-04-04 | Multi Positive Contrastive Learning with Pose-Consistent Generated Images | Sho Inayoshi et.al. | 2404.03256 | null |
2024-04-04 | Would Deep Generative Models Amplify Bias in Future Models? | Tianwei Chen et.al. | 2404.03242 | null |
2024-04-04 | Diverse and Tailored Image Generation for Zero-shot Multi-label Classification | Kaixin Zhang et.al. | 2404.03144 | null |
2024-04-04 | GaSpCT: Gaussian Splatting for Novel CT Projection View Synthesis | Emmanouil Nikolakakis et.al. | 2404.03126 | null |
2024-04-03 | Many-to-many Image Generation with Auto-regressive Diffusion Models | Ying Shen et.al. | 2404.03109 | null |
2024-04-03 | Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction | Keyu Tian et.al. | 2404.02905 | link |
2024-04-03 | MatAtlas: Text-driven Consistent Geometry Texturing and Material Assignment | Duygu Ceylan et.al. | 2404.02899 | null |
2024-04-03 | Deep Image Composition Meets Image Forgery | Eren Tahir et.al. | 2404.02897 | link |
2024-04-03 | On the Scalability of Diffusion-based Text-to-Image Generation | Hao Li et.al. | 2404.02883 | null |
2024-04-03 | MULAN: A Multi Layer Annotated Dataset for Controllable Text-to-Image Generation | Petru-Daniel Tudosiu et.al. | 2404.02790 | null |
2024-04-03 | InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation | Haofan Wang et.al. | 2404.02733 | link |
2024-04-03 | Model-agnostic Origin Attribution of Generated Images with Few-shot Examples | Fengyuan Liu et.al. | 2404.02697 | null |
2024-04-03 | Severity Controlled Text-to-Image Generative Model Bias Manipulation | Jordan Vice et.al. | 2404.02530 | null |
2024-04-02 | Diffusion | Zeyu Yang et.al. | 2404.02148 | link |
2024-04-02 | 3D Congealing: 3D-Aware Image Alignment in the Wild | Yunzhi Zhang et.al. | 2404.02125 | null |
2024-04-02 | Fashion Style Editing with Generative Human Prior | Chaerin Kong et.al. | 2404.01984 | null |
2024-04-02 | Bi-LORA: A Vision-Language Approach for Synthetic Image Detection | Mamadou Keita et.al. | 2404.01959 | link |
2024-04-02 | Real, fake and synthetic faces -- does the coin have three sides? | Shahzeb Naeem et.al. | 2404.01878 | null |
2024-04-02 | Disentangled Pre-training for Human-Object Interaction Detection | Zhuolong Li et.al. | 2404.01725 | link |
2024-04-01 | PlayFutures: Imagining Civic Futures with AI and Puppets | Supratim Pait et.al. | 2404.01527 | null |
2024-04-01 | Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data | Matthias Gerstgrasser et.al. | 2404.01413 | null |
2024-04-01 | An image speaks a thousand words, but can everyone listen? On translating images for cultural relevance | Simran Khanuja et.al. | 2404.01247 | link |
2024-04-01 | Uncovering the Text Embedding in Text-to-Image Diffusion Models | Hu Yu et.al. | 2404.01154 | null |
2024-03-29 | Benchmarking Counterfactual Image Generation | Thomas Melistas et.al. | 2403.20287 | link |
2024-03-29 | FreeSeg-Diff: Training-Free Open-Vocabulary Segmentation with Diffusion Models | Barbara Toniella Corradini et.al. | 2403.20105 | null |
2024-03-29 | SCINeRF: Neural Radiance Fields from a Snapshot Compressive Image | Yunhao Li et.al. | 2403.20018 | link |
2024-04-02 | FairRAG: Fair Human Generation via Fair Retrieval Augmentation | Robik Shrestha et.al. | 2403.19964 | null |
2024-03-28 | Vision-Language Synthetic Data Enhances Echocardiography Downstream Tasks | Pooria Ashrafian et.al. | 2403.19880 | link |
2024-03-28 | Is Synthetic Image Useful for Transfer Learning? An Investigation into Data Generation, Volume, and Utilization | Yuhang Li et.al. | 2403.19866 | null |
2024-03-28 | CLoRA: A Contrastive Approach to Compose Multiple LoRA Models | Tuna Han Salih Meral et.al. | 2403.19776 | null |
2024-03-28 | Detecting Image Attribution for Text-to-Image Diffusion Models in RGB and Beyond | Katherine Xu et.al. | 2403.19653 | link |
2024-03-28 | GANTASTIC: GAN-based Transfer of Interpretable Directions for Disentangled Image Editing in Text-to-Image Diffusion Models | Yusuf Dalva et.al. | 2403.19645 | null |
2024-03-28 | Collaborative Interactive Evolution of Art in the Latent Space of Deep Generative Models | Ole Hall et.al. | 2403.19620 | link |
2024-03-28 | Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model | Zhicai Wang et.al. | 2403.19600 | link |
2024-03-28 | Frame by Familiar Frame: Understanding Replication in Video Diffusion Models | Aimon Rahman et.al. | 2403.19593 | null |
2024-03-28 | Imperceptible Protection against Style Imitation from Diffusion Models | Namhyuk Ahn et.al. | 2403.19254 | null |
2024-03-28 | QNCD: Quantization Noise Correction for Diffusion Models | Huanpeng Chu et.al. | 2403.19140 | link |
2024-03-28 | Synthetic Medical Imaging Generation with Generative Adversarial Networks For Plain Radiographs | John R. McNulty et.al. | 2403.19107 | null |
2024-03-28 | Automated Black-box Prompt Engineering for Personalized Text-to-Image Generation | Yutong He et.al. | 2403.19103 | null |
2024-03-28 | Purposeful remixing with generative AI: Constructing designer voice in multimodal composing | Xiao Tan et.al. | 2403.19095 | null |
2024-03-27 | ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion | Daniel Winter et.al. | 2403.18818 | null |
2024-03-27 | Conditional Wasserstein Distances with Applications in Bayesian OT Flow Matching | Jannis Chemseddine et.al. | 2403.18705 | link |
2024-03-27 | InstructBrush: Learning Attention-based Instruction Optimization for Image Editing | Ruoyu Zhao et.al. | 2403.18660 | null |
2024-03-28 | FlexEdit: Flexible and Controllable Diffusion-based Object-centric Image Editing | Trong-Tung Nguyen et.al. | 2403.18605 | null |
2024-03-27 | Attention Calibration for Disentangled Text-to-Image Personalization | Yanbing Zhang et.al. | 2403.18551 | link |
2024-03-27 | DiffusionFace: Towards a Comprehensive Dataset for Diffusion-Based Face Forgery Analysis | Zhongxi Chen et.al. | 2403.18471 | link |
2024-03-27 | U-Sketch: An Efficient Approach for Sketch to Image Diffusion Models | Ilias Mitsouras et.al. | 2403.18425 | null |
2024-03-27 | ECNet: Effective Controllable Text-to-Image Diffusion Models | Sicheng Li et.al. | 2403.18417 | null |
2024-03-27 | Ship in Sight: Diffusion Models for Ship-Image Super Resolution | Luigi Sigillo et.al. | 2403.18370 | link |
2024-03-26 | Tutorial on Diffusion Models for Imaging and Vision | Stanley H. Chan et.al. | 2403.18103 | null |
2024-03-26 | Boosting Diffusion Models with Moving Average Sampling in Frequency Domain | Yurui Qian et.al. | 2403.17870 | null |
2024-03-26 | CT Synthesis with Conditional Diffusion Models for Abdominal Lymph Node Segmentation | Yongrui Yu et.al. | 2403.17770 | null |
2024-03-26 | LaRE^2: Latent Reconstruction Error Based Method for Diffusion-Generated Image Detection | Yunpeng Luo et.al. | 2403.17465 | null |
2024-03-25 | DiffusionAct: Controllable Diffusion Autoencoder for One-shot Face Reenactment | Stella Bounareli et.al. | 2403.17217 | null |
2024-03-25 | FlashFace: Human Image Personalization with High-fidelity Identity Preservation | Shilong Zhang et.al. | 2403.17008 | null |
2024-03-25 | SD-DiT: Unleashing the Power of Self-supervised Discrimination in Diffusion Transformer | Rui Zhu et.al. | 2403.17004 | null |
2024-03-25 | Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation | Omer Dahary et.al. | 2403.16990 | null |
2024-03-25 | Isolated Diffusion: Optimizing Multi-Concept Text-to-Image Generation Training-Freely with Isolated Diffusion Guidance | Jingyuan Zhu et.al. | 2403.16954 | null |
2024-03-25 | Iso-Diffusion: Improving Diffusion Probabilistic Models Using the Isotropy of the Additive Gaussian Noise | Dilum Fernando et.al. | 2403.16790 | null |
2024-03-25 | Diff-Def: Diffusion-Generated Deformation Fields for Conditional Atlases | Sophie Starck et.al. | 2403.16776 | null |
2024-03-25 | Multi-Scale Texture Loss for CT denoising with GANs | Francesco Di Feola et.al. | 2403.16640 | link |
2024-03-25 | SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions | Yuda Song et.al. | 2403.16627 | link |
2024-03-25 | An Intermediate Fusion ViT Enables Efficient Text-Image Alignment in Diffusion Models | Zizhao Hu et.al. | 2403.16530 | null |
2024-03-25 | Refining Text-to-Image Generation: Towards Accurate Training-Free Glyph-Enhanced Image Generation | Sanyam Lakhanpal et.al. | 2403.16422 | null |
2024-03-25 | Skews in the Phenomenon Space Hinder Generalization in Text-to-Image Generation | Yingshan Chang et.al. | 2403.16394 | null |
2024-03-25 | FlashEval: Towards Fast and Accurate Evaluation of Text-to-image Diffusion Generative Models | Lin Zhao et.al. | 2403.16379 | null |
2024-03-23 | Feature Manipulation for DDPM based Change Detection | Zhenglin Li et.al. | 2403.15943 | null |
2024-03-23 | Cognitive resilience: Unraveling the proficiency of image-captioning models to interpret masked visual content | Zhicheng Du et.al. | 2403.15876 | link |
2024-03-22 | DragAPart: Learning a Part-Level Motion Prior for Articulated Objects | Ruining Li et.al. | 2403.15382 | null |
2024-03-22 | Long-CLIP: Unlocking the Long-Text Capability of CLIP | Beichen Zhang et.al. | 2403.15378 | link |
2024-03-22 | Controlled Training Data Generation with Diffusion Models | Teresa Yeo et.al. | 2403.15309 | null |
2024-03-22 | A Multimodal Approach for Cross-Domain Image Retrieval | Lucas Iijima et.al. | 2403.15152 | null |
2024-03-22 | MM-Diff: High-Fidelity Image Personalization via Multi-Modal Condition Integration | Zhichao Wei et.al. | 2403.15059 | null |
2024-03-22 | Cartoon Hallucinations Detection: Pose-aware In Context Visual Learning | Bumsoo Kim et.al. | 2403.15048 | null |
2024-03-22 | Generative Active Learning for Image Synthesis Personalization | Xulu Zhang et.al. | 2403.14987 | link |
2024-03-22 | CLIP-VQDiffusion : Langauge Free Training of Text To Image generation using CLIP and vector quantized diffusion model | Seungdae Han et.al. | 2403.14944 | link |
2024-03-22 | Geometric Generative Models based on Morphological Equivariant PDEs and GANs | El Hadji S. Diop et.al. | 2403.14897 | null |
2024-03-21 | Multimodal-Conditioned Latent Diffusion Models for Fashion Image Editing | Alberto Baldrati et.al. | 2403.14828 | link |
2024-03-21 | ReNoise: Real Image Inversion Through Iterative Noising | Daniel Garibi et.al. | 2403.14602 | null |
2024-03-21 | DesignEdit: Multi-Layered Latent Decomposition and Fusion for Unified & Accurate Image Editing | Yueru Jia et.al. | 2403.14487 | link |
2024-03-22 | AnyV2V: A Plug-and-Play Framework For Any Video-to-Video Editing Tasks | Max Ku et.al. | 2403.14468 | null |
2024-03-21 | Analysing Diffusion Segmentation for Medical Images | Mathias Öttl et.al. | 2403.14440 | null |
2024-03-21 | Style-Extracting Diffusion Models for Semi-Supervised Histopathology Segmentation | Mathias Öttl et.al. | 2403.14429 | null |
2024-03-21 | Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models | Pablo Marcos-Manchón et.al. | 2403.14291 | link |
2024-03-21 | Safeguarding Medical Image Segmentation Datasets against Unauthorized Training via Contour- and Texture-Aware Perturbations | Xun Lin et.al. | 2403.14250 | null |
2024-03-21 | StyleCineGAN: Landscape Cinemagraph Generation using a Pre-trained StyleGAN | Jongwoo Choi et.al. | 2403.14186 | link |
2024-03-21 | QSMDiff: Unsupervised 3D Diffusion Models for Quantitative Susceptibility Mapping | Zhuang Xiong et.al. | 2403.14070 | null |
2024-03-21 | LeFusion: Synthesizing Myocardial Pathology on Cardiac MRI via Lesion-Focus Diffusion Models | Hantao Zhang et.al. | 2403.14066 | link |
2024-03-20 | Learning from Models and Data for Visual Grounding | Ruozhen He et.al. | 2403.13804 | null |
2024-03-20 | Step-Calibrated Diffusion for Biomedical Optical Image Restoration | Yiwei Lyu et.al. | 2403.13680 | link |
2024-03-20 | ReGround: Improving Textual and Spatial Grounding at No Cost | Yuseung Lee et.al. | 2403.13589 | null |
2024-03-20 | Ground-A-Score: Scaling Up the Score Distillation for Multi-Attribute Editing | Hangeol Chang et.al. | 2403.13551 | link |
2024-03-20 | Diversity-aware Channel Pruning for StyleGAN Compression | Jiwoo Chung et.al. | 2403.13548 | link |
2024-03-21 | IDAdapter: Learning Mixed Features for Tuning-Free Personalization of Text-to-Image Models | Siying Cui et.al. | 2403.13535 | null |
2024-03-20 | Deepfake Detection without Deepfakes: Generalization via Synthetic Frequency Patterns Injection | Davide Alessandro Coccomini et.al. | 2403.13479 | null |
2024-03-20 | S2DM: Sector-Shaped Diffusion Models for Video Generation | Haoran Lang et.al. | 2403.13408 | null |
2024-03-20 | IIDM: Image-to-Image Diffusion Model for Semantic Image Synthesis | Feng Liu et.al. | 2403.13378 | link |
2024-03-20 | AGFSync: Leveraging AI-Generated Feedback for Preference Optimization in Text-to-Image Generation | Jingkun An et.al. | 2403.13352 | null |
2024-03-19 | FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis | Linjiang Huang et.al. | 2403.12963 | link |
2024-03-19 | Segment Anything for comprehensive analysis of grapevine cluster architecture and berry properties | Efrain Torres-Lomas et.al. | 2403.12935 | null |
2024-03-19 | You Only Sample Once: Taming One-Step Text-To-Image Synthesis by Self-Cooperative Diffusion GANs | Yihong Luo et.al. | 2403.12931 | link |
2024-03-19 | Ultra-High-Resolution Image Synthesis with Pyramid Diffusion Model | Jiajie Yang et.al. | 2403.12915 | link |
2024-03-19 | Generative Enhancement for 3D Medical Images | Lingting Zhu et.al. | 2403.12852 | link |
2024-03-19 | How Spammers and Scammers Leverage AI-Generated Images on Facebook for Audience Growth | Renee DiResta et.al. | 2403.12838 | null |
2024-03-19 | Total Disentanglement of Font Images into Style and Character Class Features | Daichi Haraguchi et.al. | 2403.12784 | null |
2024-03-19 | Towards Controllable Face Generation with Semantic Latent Diffusion Models | Alex Ergasti et.al. | 2403.12743 | link |
2024-03-19 | Tuning-Free Image Customization with Image and Text Guidance | Pengzhi Li et.al. | 2403.12658 | null |
2024-03-19 | LASPA: Latent Spatial Alignment for Fast Training-free Single Image Editing | Yazeed Alharbi et.al. | 2403.12585 | null |
2024-03-19 | Urban Scene Diffusion through Semantic Occupancy Map | Junge Zhang et.al. | 2403.11697 | null |
2024-03-18 | Binary Noise for Binary Tasks: Masked Bernoulli Diffusion for Unsupervised Anomaly Detection | Julia Wolleb et.al. | 2403.11667 | null |
2024-03-18 | QEAN: Quaternion-Enhanced Attention Network for Visual Dance Generation | Zhizhen Zhou et.al. | 2403.11626 | null |
2024-03-18 | CRS-Diff: Controllable Generative Remote Sensing Foundation Model | Datao Tang et.al. | 2403.11614 | link |
2024-03-18 | EffiVED:Efficient Video Editing via Text-instruction Diffusion Models | Zhenghao Zhang et.al. | 2403.11568 | link |
2024-03-18 | Diffusion Models are Geometry Critics: Single Image 3D Editing Using Pre-Trained Diffusion Priors | Ruicheng Wang et.al. | 2403.11503 | null |
2024-03-18 | DreamSampler: Unifying Diffusion Sampling and Score Distillation for Image Manipulation | Jeongsol Kim et.al. | 2403.11415 | null |
2024-03-17 | StainDiffuser: MultiTask Dual Diffusion Model for Virtual Staining | Tushar Kataria et.al. | 2403.11340 | null |
2024-03-17 | Fast Personalized Text-to-Image Syntheses With Attention Injection | Yuxuan Zhang et.al. | 2403.11284 | null |
2024-03-17 | Understanding Diffusion Models by Feynman's Path Integral | Yuji Hirono et.al. | 2403.11262 | null |
2024-03-15 | Denoising Task Difficulty-based Curriculum for Training Diffusion Models | Jin-Young Kim et.al. | 2403.10348 | null |
2024-03-15 | Arbitrary-Scale Image Generation and Upsampling using Latent Diffusion Model and Implicit Neural Decoder | Jinseok Kim et.al. | 2403.10255 | null |
2024-03-15 | SemanticHuman-HD: High-Resolution Semantic Disentangled 3D Human Generation | Peng Zheng et.al. | 2403.10166 | null |
2024-03-15 | E4C: Enhance Editability for Text-Based Image Editing by Harnessing Efficient CLIP Guidance | Tianrui Huang et.al. | 2403.10133 | null |
2024-03-15 | Codebook Transfer with Part-of-Speech for Vector-Quantized Image Modeling | Baoquan Zhang et.al. | 2403.10071 | null |
2024-03-15 | SphereDiffusion: Spherical Geometry-Aware Distortion Resilient Diffusion Model | Tao Wu et.al. | 2403.10044 | null |
2024-03-15 | ST-LDM: A Universal Framework for Text-Grounded Object Generation in Real Images | Xiangtian Xue et.al. | 2403.10004 | null |
2024-03-14 | SCP-Diff: Photo-Realistic Semantic Image Synthesis with Spatial-Categorical Joint Prior | Huan-ang Gao et.al. | 2403.09638 | null |
2024-03-14 | Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering | Zeyu Liu et.al. | 2403.09622 | null |
2024-03-14 | PrompTHis: Visualizing the Process and Influence of Prompt Editing during Text-to-Image Creation | Yuhan Guo et.al. | 2403.09615 | null |
2024-03-14 | Counterfactual contrastive learning: robust representations via causal image synthesis | Melanie Roschewitz et.al. | 2403.09605 | link |
2024-03-14 | Eta Inversion: Designing an Optimal Eta Function for Diffusion-based Real Image Editing | Wonjun Kang et.al. | 2403.09468 | link |
2024-03-14 | Mitigating attribute amplification in counterfactual image generation | Tian Xia et.al. | 2403.09422 | null |
2024-03-14 | Machine Learning Processes as Sources of Ambiguity: Insights from AI Art | Christian Sivertsen et.al. | 2403.09374 | null |
2024-03-14 | Mitigating Data Consistency Induced Discrepancy in Cascaded Diffusion Models for Sparse-view CT Reconstruction | Hanyu Chen et.al. | 2403.09355 | null |
2024-03-14 | Video Editing via Factorized Diffusion Distillation | Uriel Singer et.al. | 2403.09334 | null |
2024-03-14 | Noise Dimension of GAN: An Image Compression Perspective | Ziran Zhu et.al. | 2403.09196 | null |
2024-03-13 | iCONTRA: Toward Thematic Collection Design Via Interactive Concept Transfer | Dinh-Khoi Vo et.al. | 2403.08746 | link |
2024-03-13 | HAIFIT: Human-Centered AI for Fashion Image Translation | Jianan Jiang et.al. | 2403.08651 | link |
2024-03-13 | An Analysis of Human Alignment of Latent Diffusion Models | Lorenz Linhardt et.al. | 2403.08469 | null |
2024-03-13 | Generating Synthetic Computed Tomography for Radiotherapy: SynthRAD2023 Challenge Report | Evi M. C. Huijben et.al. | 2403.08447 | null |
2024-03-13 | Iterative Online Image Synthesis via Diffusion Model for Imbalanced Classification | Shuhan Li et.al. | 2403.08407 | null |
2024-03-13 | Attack Deterministic Conditional Image Generative Models for Diverse and Controllable Generation | Tianyi Chu et.al. | 2403.08294 | null |
2024-03-13 | VIGFace: Virtual Identity Generation Model for Face Image Synthesis | Minsoo Kim et.al. | 2403.08277 | null |
2024-03-13 | Make Me Happier: Evoking Emotions Through Image Diffusion Models | Qing Lin et.al. | 2403.08255 | null |
2024-03-12 | Pix2Pix-OnTheFly: Leveraging LLMs for Instruction-Guided Image Editing | Rodrigo Santos et.al. | 2403.08004 | null |
2024-03-12 | Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation | Shihao Zhao et.al. | 2403.07860 | link |
2024-03-12 | Quantifying and Mitigating Privacy Risks for Tabular Generative Models | Chaoyi Zhu et.al. | 2403.07842 | null |
2024-03-12 | BraSyn 2023 challenge: Missing MRI synthesis and the effect of different learning objectives | Ivo M. Baltruschat et.al. | 2403.07800 | null |
2024-03-12 | Stable-Makeup: When Real-World Makeup Transfer Meets Diffusion Model | Yuxuan Zhang et.al. | 2403.07764 | null |
2024-03-12 | Synth | Sahand Sharifzadeh et.al. | 2403.07750 | null |
2024-03-13 | Visual Decoding and Reconstruction via EEG Embeddings with Guided Diffusion | Dongyang Li et.al. | 2403.07721 | link |
2024-03-12 | SSM Meets Video Diffusion Models: Efficient Video Generation with Structured State Spaces | Yuta Oshima et.al. | 2403.07711 | link |
2024-03-12 | Optimizing Negative Prompts for Enhanced Aesthetics and Fidelity in Text-To-Image Generation | Michael Ogezi et.al. | 2403.07605 | null |
2024-03-12 | Block-wise LoRA: Revisiting Fine-grained LoRA for Effective Personalization and Stylization in Text-to-Image Generation | Likun Li et.al. | 2403.07500 | null |
2024-03-12 | Backdoor Attack with Mode Mixture Latent Modification | Hongwei Zhang et.al. | 2403.07463 | null |
2024-03-11 | Surface-aware Mesh Texture Synthesis with Pre-trained 2D CNNs | Áron Samuel Kovács et.al. | 2403.06855 | null |
2024-03-11 | Medical Image Synthesis via Fine-Grained Image-Text Alignment and Anatomy-Pathology Prompting | Wenting Chen et.al. | 2403.06835 | null |
2024-03-11 | Data-Independent Operator: A Training-Free Artifact Representation Extractor for Generalizable Deepfake Detection | Chuangchuang Tan et.al. | 2403.06803 | link |
2024-03-11 | FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-shot Subject-Driven Generation | Pengchong Qiao et.al. | 2403.06775 | link |
2024-03-11 | Distribution-Aware Data Expansion with Diffusion Models | Haowei Zhu et.al. | 2403.06741 | link |
2024-03-11 | Enhancing Image Caption Generation Using Reinforcement Learning with Human Feedback | Adarsh N L et.al. | 2403.06735 | null |
2024-03-11 | FFAD: A Novel Metric for Assessing Generated Time Series Data Utilizing Fourier Transform and Auto-encoder | Yang Chen et.al. | 2403.06576 | null |
2024-03-11 | Active Generation for Image Classification | Tao Huang et.al. | 2403.06517 | null |
2024-03-11 | Advancing Text-Driven Chest X-Ray Generation with Policy-Based Reinforcement Learning | Woojung Han et.al. | 2403.06516 | null |
2024-03-11 | 3D-aware Image Generation and Editing with Multi-modal Conditions | Bo Li et.al. | 2403.06470 | null |
2024-03-08 | Beyond Finite Data: Towards Data-free Out-of-distribution Generalization via Extrapola | Yijiang Li et.al. | 2403.05523 | null |
2024-03-08 | A Data Augmentation Pipeline to Generate Synthetic Labeled Datasets of 3D Echocardiography Images using a GAN | Cristiana Tiago et.al. | 2403.05384 | null |
2024-03-08 | Fine-tuning a Multiple Instance Learning Feature Extractor with Masked Context Modelling and Knowledge Distillation | Juan I. Pisula et.al. | 2403.05325 | null |
2024-03-08 | Towards Effective Usage of Human-Centric Priors in Diffusion Models for Text-based Human Image Generation | Junyan Wang et.al. | 2403.05239 | null |
2024-03-08 | Synthetic Privileged Information Enhances Medical Image Representation Learning | Lucas Farndale et.al. | 2403.05220 | null |
2024-03-08 | Denoising Autoregressive Representation Learning | Yazhe Li et.al. | 2403.05196 | null |
2024-03-08 | ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment | Xiwei Hu et.al. | 2403.05135 | null |
2024-03-08 | Sora as an AGI World Model? A Complete Survey on Text-to-Video Generation | Joseph Cho et.al. | 2403.05131 | null |
2024-03-08 | Evaluating Text-to-Image Generative Models: An Empirical Study on Human Image Synthesis | Muxi Chen et.al. | 2403.05125 | null |
2024-03-08 | CogView3: Finer and Faster Text-to-Image Generation via Relay Diffusion | Wendi Zheng et.al. | 2403.05121 | null |
2024-03-07 | Photonic probabilistic machine learning using quantum vacuum noise | Seou Choi et.al. | 2403.04731 | null |
2024-03-07 | PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation | Junsong Chen et.al. | 2403.04692 | null |
2024-03-07 | StableDrag: Stable Dragging for Point-based Image Editing | Yutao Cui et.al. | 2403.04437 | null |
2024-03-07 | Discriminative Probing and Tuning for Text-to-Image Generation | Leigang Qu et.al. | 2403.04321 | null |
2024-03-06 | PromptCharm: Text-to-Image Generation through Multi-modal Prompting and Refinement | Zhijie Wang et.al. | 2403.04014 | link |
2024-03-06 | Unifying Generation and Compression: Ultra-low bitrate Image Coding Via Multi-stage Transformer | Naifu Xue et.al. | 2403.03736 | null |
2024-03-06 | Seamless Virtual Reality with Integrated Synchronizer and Synthesizer for Autonomous Driving | He Li et.al. | 2403.03541 | null |
2024-03-06 | NoiseCollage: A Layout-Aware Text-to-Image Diffusion Model Based on Noise Cropping and Merging | Takahiro Shirakawa et.al. | 2403.03485 | link |
2024-03-06 | FLAME Diffuser: Grounded Wildfire Image Synthesis using Mask Guided Diffusion | Hao Wang et.al. | 2403.03463 | null |
2024-03-07 | DLP-GAN: learning to draw modern Chinese landscape photos with generative adversarial network | Xiangquan Gui et.al. | 2403.03456 | null |
2024-03-06 | Towards Understanding Cross and Self-Attention in Stable Diffusion for Text-Guided Image Editing | Bingyan Liu et.al. | 2403.03431 | null |
2024-03-05 | Scaling Rectified Flow Transformers for High-Resolution Image Synthesis | Patrick Esser et.al. | 2403.03206 | null |
2024-03-05 | Behavior Generation with Latent Actions | Seungjae Lee et.al. | 2403.03181 | link |
2024-03-05 | Doubly Abductive Counterfactual Inference for Text-based Image Editing | Xue Song et.al. | 2403.02981 | link |
2024-03-05 | Bias in Generative AI | Mi Zhou et.al. | 2403.02726 | null |
2024-03-04 | Transformer for Times Series: an Application to the S&P500 | Pierre Brugiere et.al. | 2403.02523 | null |
2024-03-04 | NiNformer: A Network in Network Transformer with Token Mixing Generated Gating Function | Abdullah Nazhat Abdullah et.al. | 2403.02411 | link |
2024-03-04 | ResAdapter: Domain Consistent Resolution Adapter for Diffusion Models | Jiaxiang Cheng et.al. | 2403.02084 | link |
2024-03-04 | PLACE: Adaptive Layout-Semantic Fusion for Semantic Image Synthesis | Zhengyao Lv et.al. | 2403.01852 | link |
2024-03-04 | ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models | Lukas Höllein et.al. | 2403.01807 | link |
2024-03-05 | AtomoVideo: High Fidelity Image-to-Video Generation | Litong Gong et.al. | 2403.01800 | null |
2024-03-02 | Bespoke Non-Stationary Solvers for Fast Sampling of Diffusion and Flow Models | Neta Shaul et.al. | 2403.01329 | null |
2024-03-02 | TCIG: Two-Stage Controlled Image Generation with Quality Enhancement through Diffusion | Salaheldin Mohamed et.al. | 2403.01212 | null |
2024-03-01 | Improving Android Malware Detection Through Data Augmentation Using Wasserstein Generative Adversarial Networks | Kawana Stalin et.al. | 2403.00890 | null |
2024-03-01 | Diff-Plugin: Revitalizing Details for Diffusion-based Low-level Tasks | Yuhao Liu et.al. | 2403.00644 | null |
2024-03-01 | Improving Explicit Spatial Relationships in Text-to-Image Generation through an Automatically Derived Dataset | Ander Salaberria et.al. | 2403.00587 | link |
2024-03-01 | Rethinking cluster-conditioned diffusion models | Nikolas Adaloglou et.al. | 2403.00570 | null |
2024-03-01 | VisionLLaMA: A Unified LLaMA Interface for Vision Tasks | Xiangxiang Chu et.al. | 2403.00522 | link |
2024-03-01 | An Ordinal Diffusion Model for Generating Medical Images with Different Severity Levels | Shumpei Takezaki et.al. | 2403.00452 | null |
2024-03-01 | LoMOE: Localized Multi-Object Editing via Multi-Diffusion | Goirik Chakrabarty et.al. | 2403.00437 | null |
2024-03-01 | ChartReformer: Natural Language-Driven Chart Image Editing | Pengyu Yan et.al. | 2403.00209 | null |
2024-02-29 | A Novel Approach to Industrial Defect Generation through Blended Latent Diffusion Model with Online Adaptation | Hanxi Li et.al. | 2402.19330 | link |
2024-02-29 | Disentangling representations of retinal images with generative models | Sarah Müller et.al. | 2402.19186 | null |
2024-02-29 | Trajectory Consistency Distillation | Jianbin Zheng et.al. | 2402.19159 | link |
2024-02-29 | Leveraging Representations from Intermediate Encoder-blocks for Synthetic Image Detection | Christos Koutlis et.al. | 2402.19091 | null |
2024-02-29 | WDM: 3D Wavelet Diffusion Models for High-Resolution Medical Image Synthesis | Paul Friedrich et.al. | 2402.19043 | link |
2024-02-29 | ViewFusion: Towards Multi-View Consistency via Interpolated Denoising | Xianghui Yang et.al. | 2402.18842 | link |
2024-02-29 | A Quantitative Evaluation of Score Distillation Sampling Based Text-to-3D | Xiaohan Fei et.al. | 2402.18780 | null |
2024-02-28 | FineDiffusion: Scaling up Diffusion Models for Fine-grained Image Generation with 10,000 Classes | Ziying Pan et.al. | 2402.18331 | link |
2024-02-28 | Balancing Act: Distribution-Guided Debiasing in Diffusion Models | Rishubh Parihar et.al. | 2402.18206 | null |
2024-02-28 | VulMCI : Code Splicing-based Pixel-row Oversampling for More Continuous Vulnerability Image Generation | Tao Peng et.al. | 2402.18189 | link |
2024-02-28 | Block and Detail: Scaffolding Sketch-to-Image Generation | Vishnu Sarukkai et.al. | 2402.18116 | null |
2024-02-28 | Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis | Yanzuo Lu et.al. | 2402.18078 | link |
2024-02-28 | SynArtifact: Classifying and Alleviating Artifacts in Synthetic Images via Vision-Language Model | Bin Cao et.al. | 2402.18068 | null |
2024-02-27 | CustomSketching: Sketch Concept Extraction for Sketch-based Image Synthesis and Editing | Chufeng Xiao et.al. | 2402.17624 | null |
2024-02-27 | Structure-Guided Adversarial Training of Diffusion Models | Ling Yang et.al. | 2402.17563 | null |
2024-02-27 | Diffusion Model-Based Image Editing: A Survey | Yi Huang et.al. | 2402.17525 | link |
2024-02-28 | DiffuseKronA: A Parameter Efficient Fine-tuning Method for Personalized Diffusion Models | Shyam Marjit et.al. | 2402.17412 | null |
2024-02-27 | Accelerating Diffusion Sampling with Optimized Time Steps | Shuchen Xue et.al. | 2402.17376 | null |
2024-02-27 | One-Shot Structure-Aware Stylized Image Synthesis | Hansam Cho et.al. | 2402.17275 | link |
2024-02-27 | Playground v2.5: Three Insights towards Enhancing Aesthetic Quality in Text-to-Image Generation | Daiqing Li et.al. | 2402.17245 | null |
2024-02-27 | Advancing Generative Model Evaluation: A Novel Algorithm for Realistic Image Synthesis and Comparison in OCR System | Majid Memari et.al. | 2402.17204 | null |
2024-02-27 | Transparent Image Layer Diffusion using Latent Transparency | Lvmin Zhang et.al. | 2402.17113 | link |
2024-02-27 | T-HITL Effectively Addresses Problematic Associations in Image Generation and Maintains Overall Visual Quality | Susan Epstein et.al. | 2402.17101 | null |
2024-02-26 | Stochastic Conditional Diffusion Models for Semantic Image Synthesis | Juyeon Ko et.al. | 2402.16506 | link |
2024-02-26 | Referee Can Play: An Alternative Approach to Conditional Generation via Model Inversion | Xuantong Liu et.al. | 2402.16305 | null |
2024-02-25 | Towards Efficient Quantum Hybrid Diffusion Models | Francesca De Falco et.al. | 2402.16147 | null |
2024-02-23 | Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition | Chun-Hsiao Yeh et.al. | 2402.15504 | link |
2024-02-23 | BSPA: Exploring Black-box Stealthy Prompt Attacks against Image Generators | Yu Tian et.al. | 2402.15218 | null |
2024-02-23 | The Surprising Effectiveness of Skip-Tuning in Diffusion Sampling | Jiajun Ma et.al. | 2402.15170 | null |
2024-02-22 | LLMBind: A Unified Modality-Task Integration Framework | Bin Zhu et.al. | 2402.14891 | null |
2024-02-22 | Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis | Willi Menapace et.al. | 2402.14797 | null |
2024-02-22 | Consolidating Attention Features for Multi-view Image Editing | Or Patashnik et.al. | 2402.14792 | null |
2024-02-25 | Two-stage Cytopathological Image Synthesis for Augmenting Cervical Abnormality Screening | Zhenrong Shen et.al. | 2402.14707 | null |
2024-02-22 | Visual Hallucinations of Multi-modal Large Language Models | Wen Huang et.al. | 2402.14683 | link |
2024-02-22 | Semantic Image Synthesis with Unconditional Generator | Jungwoo Chae et.al. | 2402.14395 | null |
2024-02-22 | MVD | Xin-Yang Zheng et.al. | 2402.14253 | null |
2024-02-21 | T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching | Zizheng Pan et.al. | 2402.14167 | link |
2024-02-21 | SDXL-Lightning: Progressive Adversarial Diffusion Distillation | Shanchuan Lin et.al. | 2402.13929 | null |
2024-02-21 | SRNDiff: Short-term Rainfall Nowcasting with Condition Diffusion Model | Xudong Ling et.al. | 2402.13737 | null |
2024-02-21 | Hybrid Reasoning Based on Large Language Models for Autonomous Car Driving | Mehdi Azarafza et.al. | 2402.13602 | link |
2024-02-21 | Contrastive Prompts Improve Disentanglement in Text-to-Image Diffusion Models | Chen Wu et.al. | 2402.13490 | null |
2024-02-20 | Layout-to-Image Generation with Localized Descriptions using ControlNet with Cross-Attention Control | Denis Lukovnikov et.al. | 2402.13404 | null |
2024-02-20 | CounterCurate: Enhancing Physical and Semantic Visio-Linguistic Compositional Reasoning via Counterfactual Examples | Jianrui Zhang et.al. | 2402.13254 | link |
2024-02-20 | UniEdit: A Unified Tuning-Free Framework for Video Motion and Appearance Editing | Jianhong Bai et.al. | 2402.13185 | null |
2024-02-21 | Visual Style Prompting with Swapping Self-Attention | Jaeseok Jeong et.al. | 2402.12974 | link |
2024-02-20 | RealCompo: Dynamic Equilibrium between Realism and Compositionality Improves Text-to-Image Diffusion Models | Xinchen Zhang et.al. | 2402.12908 | link |
2024-02-20 | Two-stage Rainfall-Forecasting Diffusion Model | XuDong Ling et.al. | 2402.12779 | link |
2024-02-20 | A User-Friendly Framework for Generating Model-Preferred Prompts in Text-to-Image Synthesis | Nailei Hei et.al. | 2402.12760 | link |
2024-02-20 | MuLan: Multimodal-LLM Agent for Progressive Multi-Object Diffusion | Sen Li et.al. | 2402.12741 | link |
2024-02-20 | MVDiffusion++: A Dense High-resolution Multi-view Diffusion Model for Single or Sparse-view 3D Object Reconstruction | Shitao Tang et.al. | 2402.12712 | null |
2024-02-20 | Robust-Wide: Robust Watermarking against Instruction-driven Image Editing | Runyi Hu et.al. | 2402.12688 | null |
2024-02-19 | The (R)Evolution of Multimodal Large Language Models: A Survey | Davide Caffagni et.al. | 2402.12451 | null |
2024-02-19 | Revisiting registration-based synthesis: A focus on unsupervised MR image synthesis | Savannah P. Hays et.al. | 2402.12288 | null |
2024-02-19 | Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability | Xuelin Qian et.al. | 2402.12225 | null |
2024-02-19 | Groot: Adversarial Testing for Generative Text-to-Image Models with Tree-based Semantic Transformation | Yi Liu et.al. | 2402.12100 | null |
2024-02-19 | DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation | Chong Zeng et.al. | 2402.11929 | null |
2024-02-18 | SDiT: Spiking Diffusion Model with Transformer | Shu Yang et.al. | 2402.11588 | null |
2024-02-18 | Visual Concept-driven Image Generation with Text-to-Image Diffusion Model | Tanzila Rahman et.al. | 2402.11487 | null |
2024-02-18 | Deep learning methods for Hamiltonian parameter estimation and magnetic domain image generation in twisted van der Waals magnets | Woo Seok Lee et.al. | 2402.11434 | null |
2024-02-17 | TC-DiffRecon: Texture coordination MRI reconstruction method based on diffusion model and modified MF-UNet method | Chenyan Zhang et.al. | 2402.11274 | link |
2024-02-16 | The Male CEO and the Female Assistant: Probing Gender Biases in Text-To-Image Models Through Paired Stereotype Test | Yixin Wan et.al. | 2402.11089 | null |
2024-02-16 | Universal Prompt Optimizer for Safe Text-to-Image Generation | Zongyu Wu et.al. | 2402.10882 | link |
2024-02-16 | Training Class-Imbalanced Diffusion Model Via Overlap Optimization | Divin Yan et.al. | 2402.10821 | link |
2024-02-16 | Exploring Precision and Recall to assess the quality and diversity of LLMs | Le Bronnec Florian et.al. | 2402.10693 | null |
2024-02-16 | UMAIR-FPS: User-aware Multi-modal Animation Illustration Recommendation Fusion with Painting Style | Yan Kang et.al. | 2402.10381 | link |
2024-02-15 | Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation | Huizhuo Yuan et.al. | 2402.10210 | null |
2024-02-15 | Euclid preparation. Measuring detailed galaxy morphologies for Euclid with Machine Learning | Euclid Collaboration et.al. | 2402.10187 | link |
2024-02-15 | Classification Diffusion Models | Shahar Yadin et.al. | 2402.10095 | null |
2024-02-15 | Accelerating Parallel Sampling of Diffusion Models | Zhiwei Tang et.al. | 2402.09970 | link |
2024-02-15 | Textual Localization: Decomposing Multi-concept Images for Subject-Driven Text-to-Image Generation | Junjie Shentu et.al. | 2402.09966 | link |
2024-02-15 | Social Reward: Evaluating and Enhancing Generative AI through Million-User Feedback from an Online Creative Community | Arman Isajanyan et.al. | 2402.09872 | link |
2024-02-14 | Magic-Me: Identity-Specific Video Customized Diffusion | Ze Ma et.al. | 2402.09368 | link |
2024-02-14 | Switch EMA: A Free Lunch for Better Flatness and Sharpness | Siyuan Li et.al. | 2402.09240 | link |
2024-02-14 | L3GO: Language Agents with Chain-of-3D-Thoughts for Generating Unconventional Objects | Yutaro Yamada et.al. | 2402.09052 | null |
2024-02-14 | Multi-modality transrectal ultrasound video classification for identification of clinically significant prostate cancer | Hong Wu et.al. | 2402.08987 | link |
2024-02-13 | Towards the Detection of AI-Synthesized Human Face Images | Yuhang Lu et.al. | 2402.08750 | null |
2024-02-13 | IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation | Luke Melas-Kyriazi et.al. | 2402.08682 | null |
2024-02-13 | Learning Continuous 3D Words for Text-to-Image Generation | Ta-Ying Cheng et.al. | 2402.08654 | null |
2024-02-13 | Captions Are Worth a Thousand Words: Enhancing Product Retrieval with Pretrained Image-to-Text Models | Jason Tang et.al. | 2402.08532 | null |
2024-02-12 | Using AI for Wavefront Estimation with the Rubin Observatory Active Optics System | John Franklin Crenshaw et.al. | 2402.08094 | null |
2024-02-14 | Score-based generative models break the curse of dimensionality in learning a family of sub-Gaussian probability distributions | Frank Cole et.al. | 2402.08082 | null |
2024-02-12 | Trustworthy SR: Resolving Ambiguity in Image Super-resolution via Diffusion Models and Human Feedback | Cansu Korkmaz et.al. | 2402.07597 | null |
2024-02-12 | Discovering Universal Semantic Triggers for Text-to-Image Synthesis | Shengfang Zhai et.al. | 2402.07562 | null |
2024-02-11 | The Aleph & Other Metaphors for Image Generation | Gonzalo Ramos et.al. | 2402.07104 | null |
2024-02-10 | Disentangled Latent Energy-Based Style Translation: An Image-Level Structural MRI Harmonization Framework | Mengqi Wu et.al. | 2402.06875 | null |
2024-02-09 | Cardiac ultrasound simulation for autonomous ultrasound navigation | Abdoul Aziz Amadou et.al. | 2402.06463 | null |
2024-02-08 | Collaborative Control for Geometry-Conditioned PBR Image Generation | Shimon Vainer et.al. | 2402.05919 | null |
2024-02-08 | CTGAN: Semantic-guided Conditional Texture Generator for 3D Shapes | Yi-Ting Pan et.al. | 2402.05728 | null |
2024-02-08 | Scalable Diffusion Models with State Space Backbone | Zhengcong Fei et.al. | 2402.05608 | link |
2024-02-08 | Minecraft-ify: Minecraft Style Image Generation with Text-guided Image Editing for In-Game Application | Bumsoo Kim et.al. | 2402.05448 | null |
2024-02-08 | Scalable Wasserstein Gradient Flow for Generative Modeling through Unbalanced Optimal Transport | Jaemoo Choi et.al. | 2402.05443 | null |
2024-02-08 | MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis | Dewei Zhou et.al. | 2402.05408 | link |
2024-02-07 | Anatomically-Controllable Medical Image Generation with Segmentation-Guided Diffusion Models | Nicholas Konz et.al. | 2402.05210 | link |
2024-02-07 | ChatScratch: An AI-Augmented System Toward Autonomous Visual Programming Learning for Children Aged 6-12 | Liuqing Chen et.al. | 2402.04975 | null |
2024-02-07 | Noise Map Guidance: Inversion with Spatial Context for Real Image Editing | Hansam Cho et.al. | 2402.04625 | link |
2024-02-07 | Text2Street: Controllable Text-to-image Generation for Street Views | Jinming Su et.al. | 2402.04504 | null |
2024-02-07 | ColorSwap: A Color and Word Order Dataset for Multimodal Evaluation | Jirayu Burapacheep et.al. | 2402.04492 | link |
2024-02-06 | FoolSDEdit: Deceptively Steering Your Edits Towards Targeted Attribute-aware Distribution | Qi Zhou et.al. | 2402.03705 | null |
2024-02-06 | QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning | Haoxuan Wang et.al. | 2402.03666 | link |
2024-02-05 | Assessing the Efficacy of Invisible Watermarks in AI-Generated Medical Images | Xiaodan Xing et.al. | 2402.03473 | null |
2024-02-05 | Do Diffusion Models Learn Semantically Meaningful and Efficient Representations? | Qiyao Liang et.al. | 2402.03305 | null |
2024-02-05 | InstanceDiffusion: Instance-level Control for Image Generation | Xudong Wang et.al. | 2402.03290 | link |
2024-02-05 | Training-Free Consistent Text-to-Image Generation | Yoad Tewel et.al. | 2402.03286 | null |
2024-02-05 | IGUANe: a 3D generalizable CycleGAN for multicenter harmonization of brain MR images | Vincent Roca et.al. | 2402.03227 | link |
2024-02-05 | InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions | Yiyuan Zhang et.al. | 2402.03040 | link |
2024-02-05 | SynthVision -- Harnessing Minimal Input for Maximal Output in Computer Vision Models using Synthetic Image data | Yudara Kularathne et.al. | 2402.02826 | null |
2024-02-04 | DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image Editing | Chong Mou et.al. | 2402.02583 | link |
2024-02-04 | M | Mohammadreza Mofayezi et.al. | 2402.02369 | null |
2024-02-03 | Diffusion Cross-domain Recommendation | Yuner Xuan et.al. | 2402.02182 | link |
2024-02-03 | S-NeRF++: Autonomous Driving Simulation via Neural Reconstruction and Generation | Yurui Chen et.al. | 2402.02112 | null |
2024-02-02 | The galactic bubbles of starburst galaxies: The influence of galactic large-scale magnetic fields | Z. Meliani et.al. | 2402.01541 | null |
2024-02-02 | Cross-view Masked Diffusion Transformers for Person Image Synthesis | Trung X. Pham et.al. | 2402.01516 | link |
2024-02-02 | Cheating Suffix: Targeted Attack to Text-To-Image Diffusion Models with Multi-Modal Priors | Dingcheng Yang et.al. | 2402.01369 | link |
2024-02-02 | Can MLLMs Perform Text-to-Image In-Context Learning? | Yuchen Zeng et.al. | 2402.01293 | link |
2024-02-02 | Can Shape-Infused Joint Embeddings Improve Image-Conditioned 3D Diffusion? | Cristian Sbrolli et.al. | 2402.01241 | null |
2024-02-01 | Unconditional Latent Diffusion Models Memorize Patient Imaging Data | Salman Ul Hassan Dar et.al. | 2402.01054 | null |
2024-02-01 | AI-generated faces free from racial and gender stereotypes | Nouar AlDahoul et.al. | 2402.01002 | link |
2024-02-01 | Examining the Influence of Digital Phantom Models in Virtual Imaging Trials for Tomographic Breast Imaging | Amar Kavuri et.al. | 2402.00812 | null |
2024-02-01 | AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning | Fu-Yun Wang et.al. | 2402.00769 | link |
2024-01-31 | SeFi-IDE: Semantic-Fidelity Identity Embedding for Personalized Diffusion-Based Generation | Yang Li et.al. | 2402.00631 | null |
2024-02-01 | CapHuman: Capture Your Moments in Parallel Universes | Chao Liang et.al. | 2402.00627 | link |
2024-02-01 | Masked Conditional Diffusion Model for Enhancing Deepfake Detection | Tiewen Chen et.al. | 2402.00541 | null |
2024-02-01 | High-Quality Medical Image Generation from Free-hand Sketch | Quan Huu Cap et.al. | 2402.00353 | null |
2024-02-01 | Machine Unlearning for Image-to-Image Generative Models | Guihong Li et.al. | 2402.00351 | link |
2024-01-31 | Motion Guidance: Diffusion-Based Image Editing with Differentiable Motion Estimators | Daniel Geng et.al. | 2401.18085 | null |
2024-01-31 | Image Anything: Towards Reasoning-coherent and Training-free Multi-modal Image Generation | Yuanhuiyi Lyu et.al. | 2401.17664 | null |
2024-01-31 | Head and Neck Tumor Segmentation from [18F]F-FDG PET/CT Images Based on 3D Diffusion Model | Yafei Dong et.al. | 2401.17593 | null |
2024-01-31 | Task-Oriented Diffusion Model Compression | Geonung Kim et.al. | 2401.17547 | null |
2024-01-31 | Fréchet Distance for Offline Evaluation of Information Retrieval Systems with Sparse Labels | Negar Arabzadeh et.al. | 2401.17543 | null |
2024-01-30 | OmniSCV: An Omnidirectional Synthetic Image Generator for Computer Vision | Bruno Berenguel-Baeta et.al. | 2401.17061 | link |
2024-01-30 | Repositioning the Subject within Image | Yikai Wang et.al. | 2401.16861 | link |
2024-01-30 | X-ray Image Generation as a Method of Performance Prediction for Real-Time Inspection: a Case Study | Vladyslav Andriiashen et.al. | 2401.16847 | link |
2024-01-30 | LATENTPATCH: A Non-Parametric Approach for Face Generation and Editing | Benjamin Samuth et.al. | 2401.16830 | null |
2024-01-29 | Bridging Generative and Discriminative Models for Unified Visual Perception with Diffusion Priors | Shiyin Dong et.al. | 2401.16459 | null |
2024-01-29 | Diffutoon: High-Resolution Editable Toon Shading via Diffusion Models | Zhongjie Duan et.al. | 2401.16224 | null |
2024-01-29 | Spatial-Aware Latent Initialization for Controllable Image Generation | Wenqiang Sun et.al. | 2401.16157 | null |
2024-01-29 | Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You | Felix Friedrich et.al. | 2401.16092 | link |
2024-01-29 | Diffusion Facial Forgery Detection | Harry Cheng et.al. | 2401.15859 | link |
2024-01-29 | 2L3: Lifting Imperfect Generated 2D Images into Accurate 3D | Yizheng Chen et.al. | 2401.15841 | null |
Publish Date | Title | Authors | Code | |
---|---|---|---|---|
2024-06-10 | NaRCan: Natural Refined Canonical Image with Integration of Diffusion Prior for Video Editing | Ting-Hsuan Chen et.al. | 2406.06523 | null |
2024-06-10 | FRAG: Frequency Adapting Group for Diffusion Video Editing | Sunjae Yoon et.al. | 2406.06044 | null |
2024-06-09 | Ctrl-V: Higher Fidelity Video Generation with Bounding-Box Controlled Object Motion | Ge Ya Luo et.al. | 2406.05630 | null |
2024-06-08 | Training-Free Robust Interactive Video Object Segmentation | Xiaoli Wei et.al. | 2406.05485 | null |
2024-06-08 | MotionClone: Training-Free Motion Cloning for Controllable Video Generation | Pengyang Ling et.al. | 2406.05338 | null |
2024-06-07 | CoNo: Consistency Noise Injection for Tuning-free Long Video Diffusion | Xingrui Wang et.al. | 2406.05082 | null |
2024-06-07 | Zero-Shot Video Editing through Adaptive Sliding Score Distillation | Lianghan Zhu et.al. | 2406.04888 | null |
2024-06-07 | Ada-VE: Training-Free Consistent Video Editing Using Adaptive Motion Prior | Tanvir Mahmud et.al. | 2406.04873 | null |
2024-06-07 | Online Continual Learning of Video Diffusion Models From a Single Video Stream | Jason Yoo et.al. | 2406.04814 | null |
2024-06-06 | GenAI Arena: An Open Evaluation Platform for Generative Models | Dongfu Jiang et.al. | 2406.04485 | null |
2024-06-06 | ShareGPT4Video: Improving Video Understanding and Generation with Better Captions | Lin Chen et.al. | 2406.04325 | null |
2024-06-06 | SF-V: Single Forward Video Generation Model | Zhixing Zhang et.al. | 2406.04324 | null |
2024-06-06 | VideoTetris: Towards Compositional Text-to-Video Generation | Ye Tian et.al. | 2406.04277 | link |
2024-06-05 | VideoPhy: Evaluating Physical Commonsense for Video Generation | Hritik Bansal et.al. | 2406.03520 | null |
2024-06-05 | Searching Priors Makes Text-to-Video Synthesis Better | Haoran Cheng et.al. | 2406.03215 | null |
2024-06-05 | Follow-Your-Pose v2: Multiple-Condition Guided Character Image Animation for Stable Pose Control | Jingyun Xue et.al. | 2406.03035 | null |
2024-06-05 | Controllable Talking Face Generation by Implicit Facial Keypoints Editing | Dong Zhao et.al. | 2406.02880 | null |
2024-06-06 | Enhancing Temporal Consistency in Video Editing by Reconstructing Videos with 3D Gaussian Splatting | Inkyu Shin et.al. | 2406.02541 | null |
2024-06-04 | ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation | Tianchen Zhao et.al. | 2406.02540 | null |
2024-06-04 | V-Express: Conditional Dropout for Progressive Training of Portrait Video Generation | Cong Wang et.al. | 2406.02511 | null |
2024-06-04 | CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation | Dejia Xu et.al. | 2406.02509 | null |
2024-06-04 | I4VGen: Image as Stepping Stone for Text-to-Video Generation | Xiefan Guo et.al. | 2406.02230 | null |
2024-06-04 | Learning Temporally Consistent Video Depth from Video Diffusion Priors | Jiahao Shao et.al. | 2406.01493 | null |
2024-06-03 | DreamPhysics: Learning Physical Properties of Dynamic 3D Gaussians with Video Diffusion Priors | Tianyu Huang et.al. | 2406.01476 | link |
2024-06-04 | Unleashing Generalization of End-to-End Autonomous Driving with Controllable Long Video Generation | Enhui Ma et.al. | 2406.01349 | null |
2024-06-03 | UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation | Xiang Wang et.al. | 2406.01188 | null |
2024-06-03 | ZeroSmooth: Training-free Diffuser Adaptation for High Frame Rate Video Generation | Shaoshu Yang et.al. | 2406.00908 | link |
2024-05-31 | 4Diffusion: Multi-view Video Diffusion Model for 4D Generation | Haiyu Zhang et.al. | 2405.20674 | null |
2024-05-30 | MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion | Shuyuan Tu et.al. | 2405.20325 | link |
2024-05-30 | Improving the Training of Rectified Flows | Sangyun Lee et.al. | 2405.20320 | link |
2024-05-30 | CV-VAE: A Compatible Video VAE for Latent Generative Video Models | Sijie Zhao et.al. | 2405.20279 | link |
2024-06-02 | MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model | Muyao Niu et.al. | 2405.20222 | link |
2024-05-30 | Promptus: Can Prompts Streaming Replace Video Streaming with Stable Diffusion | Jiangkai Wu et.al. | 2405.20032 | null |
2024-05-30 | Streaming Video Diffusion: Online Video Editing with Diffusion Models | Feng Chen et.al. | 2405.19726 | link |
2024-05-30 | DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark | Haoxing Chen et.al. | 2405.19707 | link |
2024-05-29 | EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture | Jiaqi Xu et.al. | 2405.18991 | link |
2024-05-29 | T2V-Turbo: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback | Jiachen Li et.al. | 2405.18750 | null |
2024-05-28 | Phased Consistency Model | Fu-Yun Wang et.al. | 2405.18407 | null |
2024-05-28 | RACCooN: Remove, Add, and Change Video Content with Auto-Generated Narratives | Jaehong Yoon et.al. | 2405.18406 | link |
2024-05-28 | VITON-DiT: Learning In-the-Wild Video Try-On from Human Dance Videos via Diffusion Transformers | Jun Zheng et.al. | 2405.18326 | null |
2024-05-28 | EG4D: Explicit Generation of 4D Object without Score Distillation | Qi Sun et.al. | 2405.18132 | link |
2024-05-28 | MAVIN: Multi-Action Video Generation with Diffusion Models via Transition Video Infilling | Bowen Zhang et.al. | 2405.18003 | link |
2024-05-28 | Discriminator-Guided Cooperative Diffusion for Joint Audio and Video Generation | Akio Hayakawa et.al. | 2405.17842 | null |
2024-05-27 | RefDrop: Controllable Consistency in Image or Video Generation via Reference Feature Guidance | Jiaojiao Fan et.al. | 2405.17661 | null |
2024-05-27 | ClassDiffusion: More Aligned Personalization Tuning with Explicit Class Guidance | Jiannan Huang et.al. | 2405.17532 | link |
2024-05-27 | Collaborative Video Diffusion: Consistent Multi-video Generation with Camera Control | Zhengfei Kuang et.al. | 2405.17414 | null |
2024-05-27 | Human4DiT: Free-view Human Video Generation with 4D Diffusion Transformer | Ruizhi Shao et.al. | 2405.17405 | null |
2024-05-27 | Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability | Shenyuan Gao et.al. | 2405.17398 | link |
2024-05-28 | Controllable Longer Image Animation with Diffusion Models | Qiang Wang et.al. | 2405.17306 | null |
2024-05-27 | Sync4D: Video Guided Controllable Dynamics for Physics-Based 4D Generation | Zhoujie Fu et.al. | 2405.16849 | null |
2024-05-27 | Unified Editing of Panorama, 3D Scenes, and Videos Through Disentangled Self-Attention Injection | Gihyun Kwon et.al. | 2405.16823 | null |
2024-05-27 | Vidu4D: Single Generated Video to High-Fidelity 4D Reconstruction with Dynamic Gaussian Surfels | Yikai Wang et.al. | 2405.16822 | null |
2024-05-26 | Towards Multi-Task Multi-Modal Models: A Video Generative Perspective | Lijun Yu et.al. | 2405.16728 | null |
2024-05-26 | I2VEdit: First-Frame-Guided Video Editing via Image-to-Video Diffusion Models | Wenqi Ouyang et.al. | 2405.16537 | null |
2024-05-28 | Disentangling Foreground and Background Motion for Enhanced Realism in Human Video Generation | Jinlin Liu et.al. | 2405.16393 | null |
2024-05-24 | A Misleading Gallery of Fluid Motion by Generative Artificial Intelligence | Ali Kashefi et.al. | 2405.15406 | link |
2024-05-24 | iVideoGPT: Interactive VideoGPTs are Scalable World Models | Jialong Wu et.al. | 2405.15223 | link |
2024-05-23 | Video Diffusion Models are Training-free Motion Interpreter and Controller | Zeqi Xiao et.al. | 2405.14864 | null |
2024-05-24 | Fisher Flow Matching for Generative Modeling over Discrete Data | Oscar Davis et.al. | 2405.14664 | null |
2024-05-24 | PoseCrafter: One-Shot Personalized Video Synthesis Following Flexible Pose Control | Yong Zhong et.al. | 2405.14582 | null |
2024-05-23 | MagicDrive3D: Controllable 3D Generation for Any-View Rendering in Street Scenes | Ruiyuan Gao et.al. | 2405.14475 | null |
2024-05-22 | ReVideo: Remake a Video with Motion and Content Control | Chong Mou et.al. | 2405.13865 | null |
2024-05-22 | MotionCraft: Physics-based Zero-Shot Video Generation | Luca Savant Aira et.al. | 2405.13557 | null |
2024-05-22 | Enhanced Creativity and Ideation through Stable Video Synthesis | Elijah Miller et.al. | 2405.13357 | null |
2024-05-21 | CamViG: Camera Aware Image-to-Video Generation with Multimodal Transformers | Andrew Marmon et.al. | 2405.13195 | null |
2024-05-21 | OpenCarbonEval: A Unified Carbon Emission Estimation Framework in Large-Scale AI Models | Zhaojian Yu et.al. | 2405.12843 | link |
2024-05-21 | DisenStudio: Customized Multi-subject Text-to-Video Generation with Disentangled Spatial Control | Hong Chen et.al. | 2405.12796 | null |
2024-05-20 | Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices | Nathaniel Cohen et.al. | 2405.12211 | link |
2024-05-20 | ViViD: Video Virtual Try-on using Diffusion Models | Zixun Fang et.al. | 2405.11794 | null |
2024-05-19 | FIFO-Diffusion: Generating Infinite Videos from Text without Training | Jihwan Kim et.al. | 2405.11473 | link |
2024-05-17 | From Sora What We Can See: A Survey of Text-to-Video Generation | Rui Sun et.al. | 2405.10674 | link |
2024-05-16 | MediSyn: Text-Guided Diffusion Models for Broad Medical 2D and 3D Image Synthesis | Joseph Cho et.al. | 2405.09806 | null |
2024-05-15 | Dance Any Beat: Blending Beats with Visuals in Dance Video Generation | Xuanchen Wang et.al. | 2405.09266 | null |
2024-05-13 | The Lost Melody: Empirical Observations on Text-to-Video Generation From A Storytelling Perspective | Andrew Shin et.al. | 2405.08720 | null |
2024-05-10 | OneTo3D: One Image to Re-editable Dynamic 3D Model and Video Generation | Jinwei Lin et.al. | 2405.06547 | link |
2024-05-08 | Reviewing Intelligent Cinematography: AI research for camera-based video production | Adrian Azzarelli et.al. | 2405.05039 | null |
2024-05-15 | TALC: Time-Aligned Captions for Multi-Scene Text-to-Video Generation | Hritik Bansal et.al. | 2405.04682 | null |
2024-05-07 | Edit-Your-Motion: Space-Time Diffusion Decoupling Learning for Video Motion Editing | Yi Zuo et.al. | 2405.04496 | null |
2024-05-07 | Audio-Visual Speech Representation Expert for Enhanced Talking Face Video Generation and Evaluation | Dogucan Yaman et.al. | 2405.04327 | null |
2024-05-07 | Vidu: a Highly Consistent, Dynamic and Skilled Text-to-Video Generator with Diffusion Models | Fan Bao et.al. | 2405.04233 | null |
2024-05-07 | Sora Detector: A Unified Hallucination Detection for Large Text-to-Video Models | Zhixuan Chu et.al. | 2405.04180 | link |
2024-05-07 | Exposing AI-generated Videos: A Benchmark Dataset and a Local-and-Global Temporal Defect Based Detection Method | Peisong He et.al. | 2405.04133 | null |
2024-05-06 | Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond | Zheng Zhu et.al. | 2405.03520 | link |
2024-05-06 | Video Diffusion Models: A Survey | Andrew Melnik et.al. | 2405.03150 | null |
2024-05-10 | Matten: Video Generation with Mamba-Attention | Yu Gao et.al. | 2405.03025 | null |
2024-05-02 | StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation | Yupeng Zhou et.al. | 2405.01434 | link |
2024-05-05 | VimTS: A Unified Video and Image Text Spotter for Enhancing the Cross-domain Generalization | Yuliang Liu et.al. | 2404.19652 | link |
2024-04-30 | Bridge to Non-Barrier Communication: Gloss-Prompted Fine-grained Cued Speech Gesture Generation with Diffusion Model | Wentao Lei et.al. | 2404.19277 | null |
2024-04-29 | FlexiFilm: Long Video Generation with Flexible Conditions | Yichen Ouyang et.al. | 2404.18620 | link |
2024-04-25 | Synthesizing Audio from Silent Video using Sequence to Sequence Modeling | Hugo Garrido-Lestache Belinchon et.al. | 2404.17608 | link |
2024-04-25 | V2A-Mark: Versatile Deep Visual-Audio Watermarking for Manipulation Localization and Copyright Protection | Xuanyu Zhang et.al. | 2404.16824 | null |
2024-04-25 | TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models | Haomiao Ni et.al. | 2404.16306 | null |
2024-04-26 | Semantically consistent Video-to-Audio Generation using Multimodal Language Large Model | Gehui Chen et.al. | 2404.16305 | null |
2024-04-24 | Beyond Deepfake Images: Detecting AI-Generated Videos | Danial Samadi Vahdati et.al. | 2404.15955 | null |
2024-05-01 | MotionMaster: Training-free Camera Motion Transfer For Video Generation | Teng Hu et.al. | 2404.15789 | null |
2024-04-23 | ID-Animator: Zero-Shot Identity-Preserving Human Video Generation | Xuanhua He et.al. | 2404.15275 | link |
2024-04-22 | TAVGBench: Benchmarking Text to Audible-Video Generation | Yuxin Mao et.al. | 2404.14381 | link |
2024-04-23 | Accelerating Image Generation with Sub-path Linear Approximation Model | Chen Xu et.al. | 2404.13903 | null |
2024-04-27 | Exploring AIGC Video Quality: A Focus on Visual Harmony, Video-Text Consistency and Domain Distribution Gap | Bowen Qu et.al. | 2404.13573 | link |
2024-04-21 | Motion-aware Latent Diffusion Models for Video Frame Interpolation | Zhilin Huang et.al. | 2404.13534 | null |
2024-04-20 | Music Consistency Models | Zhengcong Fei et.al. | 2404.13358 | null |
2024-04-19 | PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation | Tianyuan Zhang et.al. | 2404.13026 | null |
2024-04-19 | ConCLVD: Controllable Chinese Landscape Video Generation via Diffusion Model | Dingming Liu et.al. | 2404.12903 | null |
2024-04-18 | GenVideo: One-shot Target-image and Shape Aware Video Editing using T2I Diffusion Models | Sai Sree Harsha et.al. | 2404.12541 | null |
2024-04-18 | On the Content Bias in Fréchet Video Distance | Songwei Ge et.al. | 2404.12391 | null |
2024-04-18 | RoboDreamer: Learning Compositional World Models for Robot Imagination | Siyuan Zhou et.al. | 2404.12377 | null |
2024-04-18 | AniClipart: Clipart Animation with Text-to-Video Priors | Ronghuan Wu et.al. | 2404.12347 | null |
2024-04-15 | Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model | Han Lin et.al. | 2404.09967 | null |
2024-04-16 | LoopAnimate: Loopable Salient Object Animation | Fanyi Wang et.al. | 2404.09172 | null |
2024-04-13 | THQA: A Perceptual Quality Assessment Database for Talking Heads | Yingjie Zhou et.al. | 2404.09003 | link |
2024-04-16 | LoopGaussian: Creating 3D Cinemagraph with Multi-view Images via Eulerian Motion Field | Jiyang Li et.al. | 2404.08966 | link |
2024-04-11 | S3Editor: A Sparse Semantic-Disentangled Self-Training Framework for Face Video Editing | Guangzhi Wang et.al. | 2404.08111 | null |
2024-04-10 | A Transformer-Based Model for the Prediction of Human Gaze Behavior on Videos | Suleyman Ozdel et.al. | 2404.07351 | null |
2024-04-16 | Deep Generative Data Assimilation in Multimodal Setting | Yongquan Qu et.al. | 2404.06665 | link |
2024-04-10 | Flying with Photons: Rendering Novel Views of Propagating Light | Anagh Malik et.al. | 2404.06493 | null |
2024-04-08 | Investigating the Effectiveness of Cross-Attention to Unlock Zero-Shot Editing of Text-to-Video Diffusion Models | Saman Motamed et.al. | 2404.05519 | null |
2024-04-08 | Action-conditioned video data improves predictability | Meenakshi Sarkar et.al. | 2404.05439 | null |
2024-04-07 | MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators | Shenghai Yuan et.al. | 2404.05014 | link |
2024-04-07 | AnimateZoo: Zero-shot Video Generation of Cross-Species Animation via Subject Alignment | Yuanfeng Xu et.al. | 2404.04946 | null |
2024-04-03 | Translation-based Video-to-Video Synthesis | Pratim Saha et.al. | 2404.04283 | null |
2024-04-03 | MeshBrush: Painting the Anatomical Mesh with Neural Stylization for Endoscopy | John J. Han et.al. | 2404.02999 | null |
2024-04-02 | CameraCtrl: Enabling Camera Control for Text-to-Video Generation | Hao He et.al. | 2404.02101 | link |
2024-04-02 | Co-Speech Gesture Video Generation via Motion-Decoupled Diffusion Model | Xu He et.al. | 2404.01862 | link |
2024-03-28 | A Review of Multi-Modal Large Language and Vision Models | Kilian Carolan et.al. | 2404.01322 | null |
2024-04-01 | Evaluating Text-to-Visual Generation with Image-to-Text Generation | Zhiqiu Lin et.al. | 2404.01291 | link |
2024-03-30 | Grid Diffusion Models for Text-to-Video Generation | Taegyeong Lee et.al. | 2404.00234 | null |
2024-03-29 | Motion Inversion for Video Customization | Luozhou Wang et.al. | 2403.20193 | null |
2024-04-03 | MI-NeRF: Learning a Single Face NeRF from Multiple Identities | Aggelina Chatziagapi et.al. | 2403.19920 | null |
2024-03-28 | Frame by Familiar Frame: Understanding Replication in Video Diffusion Models | Aimon Rahman et.al. | 2403.19593 | null |
2024-03-27 | TextCraftor: Your Text Encoder Can be Image Quality Controller | Yanyu Li et.al. | 2403.18978 | null |
2024-03-26 | Tutorial on Diffusion Models for Imaging and Vision | Stanley H. Chan et.al. | 2403.18103 | null |
2024-03-26 | TC4D: Trajectory-Conditioned Text-to-4D Generation | Sherwin Bahmani et.al. | 2403.17920 | null |
2024-03-26 | Annotated Biomedical Video Generation using Denoising Diffusion Probabilistic Models and Flow Fields | Rüveyda Yilmaz et.al. | 2403.17808 | null |
2024-03-26 | ExpressEdit: Video Editing with Natural Language and Sketching | Bekzat Tilekbay et.al. | 2403.17693 | null |
2024-03-25 | TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models | Zhongwei Zhang et.al. | 2403.17005 | null |
2024-03-25 | A Survey on Long Video Generation: Challenges, Methods, and Prospects | Chengxuan Li et.al. | 2403.16407 | null |
2024-03-24 | Opportunities and challenges in the application of large artificial intelligence models in radiology | Liangrui Pan et.al. | 2403.16112 | null |
2024-03-24 | EVA: Zero-shot Accurate Attributes and Multi-Object Video Editing | Xiangpeng Yang et.al. | 2403.16111 | null |
2024-03-24 | Edit3K: Universal Representation Learning for Video Editing Components | Xin Gu et.al. | 2403.16048 | null |
2024-03-23 | Adaptive Super Resolution For One-Shot Talking-Head Generation | Luchuan Song et.al. | 2403.15944 | link |
2024-03-22 | Spectral Motion Alignment for Video Motion Transfer using Diffusion Models | Geon Yeong Park et.al. | 2403.15249 | null |
2024-03-21 | StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text | Roberto Henschel et.al. | 2403.14773 | link |
2024-03-22 | Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion | Xiang Fan et.al. | 2403.14617 | null |
2024-03-21 | Explorative Inbetweening of Time and Space | Haiwen Feng et.al. | 2403.14611 | null |
2024-03-22 | AnyV2V: A Plug-and-Play Framework For Any Video-to-Video Editing Tasks | Max Ku et.al. | 2403.14468 | null |
2024-03-21 | Enabling Visual Composition and Animation in Unsupervised Video Generation | Aram Davtyan et.al. | 2403.14368 | null |
2024-03-21 | StyleCineGAN: Landscape Cinemagraph Generation using a Pre-trained StyleGAN | Jongwoo Choi et.al. | 2403.14186 | link |
2024-03-21 | Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition | Sihyun Yu et.al. | 2403.14148 | null |
2024-03-20 | Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation | Fu-Yun Wang et.al. | 2403.13745 | link |
2024-03-20 | VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis | Yumeng Li et.al. | 2403.13501 | link |
2024-03-20 | S2DM: Sector-Shaped Diffusion Models for Video Generation | Haoran Lang et.al. | 2403.13408 | null |
2024-03-20 | Mora: Enabling Generalist Video Generation via A Multi-Agent Framework | Zhengqing Yuan et.al. | 2403.13248 | link |
2024-03-19 | AnimateDiff-Lightning: Cross-Model Diffusion Distillation | Shanchuan Lin et.al. | 2403.12706 | null |
2024-03-18 | CoCoCo: Improving Text-Guided Video Inpainting for Better Consistency, Controllability and Compatibility | Bojia Zi et.al. | 2403.12035 | null |
2024-03-18 | Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation | Axel Sauer et.al. | 2403.12015 | null |
2024-03-18 | VideoMV: Consistent Multi-View Generation Based on Large Video Generative Model | Qi Zuo et.al. | 2403.12010 | null |
2024-03-18 | DreamMotion: Space-Time Self-Similarity Score Distillation for Zero-Shot Video Editing | Hyeonho Jeong et.al. | 2403.12002 | null |
2024-03-19 | Subjective-Aligned Dataset and Metric for Text-to-Video Quality Assessment | Tengchuan Kou et.al. | 2403.11956 | link |
2024-03-18 | Virbo: Multimodal Multilingual Avatar Video Generation in Digital Marketing | Juan Zhang et.al. | 2403.11700 | null |
2024-03-18 | EffiVED: Efficient Video Editing via Text-instruction Diffusion Models | Zhenghao Zhang et.al. | 2403.11568 | link |
2024-03-17 | Endora: Video Generation Models as Endoscopy Simulators | Chenxin Li et.al. | 2403.11050 | null |
2024-03-15 | DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers | Xuanlei Zhao et.al. | 2403.10266 | link |
2024-03-15 | Animate Your Motion: Turning Still Images into Dynamic Videos | Mingxiao Li et.al. | 2403.10179 | null |
2024-03-14 | Video Editing via Factorized Diffusion Distillation | Uriel Singer et.al. | 2403.09334 | null |
2024-03-17 | Intention-driven Ego-to-Exo Video Generation | Hongchen Luo et.al. | 2403.09194 | null |
2024-03-13 | VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis | Enric Corona et.al. | 2403.08764 | null |
2024-03-13 | Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts | Yue Ma et.al. | 2403.08268 | link |
2024-03-12 | AesopAgent: Agent-driven Evolutionary System on Story-to-Video Production | Jiuniu Wang et.al. | 2403.07952 | null |
2024-03-12 | SSM Meets Video Diffusion Models: Efficient Video Generation with Structured State Spaces | Yuta Oshima et.al. | 2403.07711 | link |
2024-03-15 | DragAnything: Motion Control for Anything using Entity Representation | Weijia Wu et.al. | 2403.07420 | link |
2024-03-11 | Action Reimagined: Text-to-Pose Video Editing for Dynamic Human Actions | Lan Wang et.al. | 2403.07198 | null |
2024-03-11 | DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation | Guosheng Zhao et.al. | 2403.06845 | null |
2024-03-11 | A Comparative Study of Perceptual Quality Metrics for Audio-driven Talking Head Videos | Weixia Zhang et.al. | 2403.06421 | link |
2024-03-11 | Video Generation with Consistency Tuning | Chaoyi Wang et.al. | 2403.06356 | null |
2024-03-10 | FastVideoEdit: Leveraging Consistency Models for Efficient Text-to-Video Editing | Youyuan Zhang et.al. | 2403.06269 | null |
2024-03-10 | BlazeBVD: Make Scale-Time Equalization Great Again for Blind Video Deflickering | Xinmin Qiu et.al. | 2403.06243 | null |
2024-03-10 | VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models | Wenhao Wang et.al. | 2403.06098 | link |
2024-03-10 | Reframe Anything: LLM Agent for Open World Video Reframing | Jiawang Cao et.al. | 2403.06070 | null |
2024-03-08 | VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models | Yabo Zhang et.al. | 2403.05438 | link |
2024-03-08 | Sora as an AGI World Model? A Complete Survey on Text-to-Video Generation | Joseph Cho et.al. | 2403.05131 | null |
2024-03-07 | A spatiotemporal style transfer algorithm for dynamic visual stimulus generation | Antonino Greco et.al. | 2403.04940 | null |
2024-03-08 | Pix2Gif: Motion-Guided Diffusion for GIF Generation | Hitesh Kandala et.al. | 2403.04634 | null |
2024-03-05 | Tuning-Free Noise Rectification for High Fidelity Image-to-Video Generation | Weijie Li et.al. | 2403.02827 | null |
2024-03-06 | UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified Attention Control | Xuweiyi Chen et.al. | 2403.02332 | link |
2024-03-05 | AtomoVideo: High Fidelity Image-to-Video Generation | Litong Gong et.al. | 2403.01800 | null |
2024-03-02 | SceneCraft: An LLM Agent for Synthesizing 3D Scene as Blender Code | Ziniu Hu et.al. | 2403.01248 | null |
2024-03-01 | Abductive Ego-View Accident Video Understanding for Safe Driving Perception | Jianwu Fang et.al. | 2403.00436 | null |
2024-02-29 | Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers | Tsai-Shien Chen et.al. | 2402.19479 | null |
2024-02-28 | Context-aware Talking Face Video Generation | Meidai Xuanyuan et.al. | 2402.18092 | null |
2024-02-27 | EMO: Emote Portrait Alive -- Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions | Linrui Tian et.al. | 2402.17485 | null |
2024-02-27 | Sora Generates Videos with Stunning Geometrical Consistency | Xuanyi Li et.al. | 2402.17403 | null |
2024-02-28 | Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models | Yixin Liu et.al. | 2402.17177 | link |
2024-02-27 | Video as the New Language for Real-World Decision Making | Sherry Yang et.al. | 2402.17139 | null |
2024-03-04 | Cross-Modal Contextualized Diffusion Models for Text-Guided Visual Generation and Editing | Ling Yang et.al. | 2402.16627 | link |
2024-02-22 | Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis | Willi Menapace et.al. | 2402.14797 | null |
2024-02-22 | Customize-A-Video: One-Shot Motion Customization of Text-to-Video Diffusion Models | Yixuan Ren et.al. | 2402.14780 | null |
2024-02-22 | Place Anything into Any Video | Ziling Liu et.al. | 2402.14316 | null |
2024-02-21 | Hybrid Video Diffusion Models with 2D Triplane and 3D Wavelet Representation | Kihong Kim et.al. | 2402.13729 | null |
2024-02-24 | UniEdit: A Unified Tuning-Free Framework for Video Motion and Appearance Editing | Jianhong Bai et.al. | 2402.13185 | null |
2024-02-20 | Neural Network Diffusion | Kai Wang et.al. | 2402.13144 | link |
2024-02-20 | VGMShield: Mitigating Misuse of Video Generative Models | Yan Pang et.al. | 2402.13126 | link |
2024-02-19 | Dynamic and Super-Personalized Media Ecosystem Driven by Generative AI: Unpredictable Plays Never Repeating The Same | Sungjun Ahn et.al. | 2402.12412 | null |
2024-02-19 | Human Video Translation via Query Warping | Haiming Zhu et.al. | 2402.12099 | null |
2024-02-16 | Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation | Lanqing Guo et.al. | 2402.10491 | link |
2024-02-15 | LAVE: LLM-Powered Agent Assistance and Language Augmentation for Video Editing | Bryan Wang et.al. | 2402.10294 | null |
2024-02-14 | Magic-Me: Identity-Specific Video Customized Diffusion | Ze Ma et.al. | 2402.09368 | link |
2024-02-10 | Denoising Diffusion Probabilistic Models in Six Simple Steps | Richard E. Turner et.al. | 2402.04384 | null |
2024-02-06 | ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation | Weiming Ren et.al. | 2402.04324 | link |
2024-02-05 | Projected Generative Diffusion Models for Constraint Satisfaction | Jacob K Christopher et.al. | 2402.03559 | null |
2024-02-05 | Direct-a-Video: Customized Video Generation with User-Directed Camera Movement and Object Motion | Shiyuan Yang et.al. | 2402.03162 | null |
2024-02-05 | InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions | Yiyuan Zhang et.al. | 2402.03040 | link |
2024-02-04 | Video Editing for Video Retrieval | Bin Zhu et.al. | 2402.02335 | null |
2024-02-06 | DeCoF: Generated Video Detection via Frame Consistency | Long Ma et.al. | 2402.02085 | null |
2024-02-02 | NeuroCine: Decoding Vivid Video Sequences from Human Brain Activities | Jingyuan Sun et.al. | 2402.01590 | null |
2024-02-02 | Boximator: Generating Rich and Controllable Motions for Video Synthesis | Jiawei Wang et.al. | 2402.01566 | null |
2024-02-01 | AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning | Fu-Yun Wang et.al. | 2402.00769 | link |
2024-02-01 | DRSM: efficient neural 4d decomposition for dynamic reconstruction in stationary monocular cameras | Weixing Xie et.al. | 2402.00740 | null |
2024-01-30 | Anything in Any Scene: Photorealistic Video Object Insertion | Chen Bai et.al. | 2401.17509 | null |
2024-01-31 | Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling | Xiaoyu Shi et.al. | 2401.15977 | null |
2024-01-28 | Lips Are Lying: Spotting the Temporal Inconsistency between Audio and Visual in Lip-Syncing DeepFakes | Weifeng Liu et.al. | 2401.15668 | link |
2024-01-29 | Generative Video Diffusion for Unseen Cross-Domain Video Moment Retrieval | Dezhao Luo et.al. | 2401.13329 | null |
2024-01-23 | Lumiere: A Space-Time Diffusion Model for Video Generation | Omer Bar-Tal et.al. | 2401.12945 | null |
2024-01-19 | Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion | Zuoyue Li et.al. | 2401.10786 | null |
2024-01-18 | Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution | Xin Yuan et.al. | 2401.10404 | null |
2024-01-22 | Motion-Zero: Zero-Shot Moving Object Control Framework for Diffusion-Based Video Generation | Changgu Chen et.al. | 2401.10150 | null |
2024-01-18 | WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens | Xiaofeng Wang et.al. | 2401.09985 | null |
2024-01-18 | CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects | Zhao Wang et.al. | 2401.09962 | null |
2024-01-17 | Vlogger: Make Your Dream A Vlog | Shaobin Zhuang et.al. | 2401.09414 | link |