cross-attention
Unofficial implementation of "Prompt-to-Prompt Image Editing with Cross Attention Control" with Stable Diffusion
[TPAMI'23] Unifying Flow, Stereo and Depth Estimation
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
T-GATE: Temporally Gating Attention to Accelerate Diffusion Model for Free!
🚀 Cross attention map tools for huggingface/diffusers
Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google DeepMind
1-shot image segmentation using Stable Diffusion
Project for the paper "Low-Light Video Enhancement via Spatial-Temporal Consistent Decomposition" (IJCAI 2025)
[IV 2025, Oral] Official code of "6Img-to-3D: Few-Image Large-Scale Outdoor Novel View Synthesis"
Implementation of the paper "Enhanced Photovoltaic Power Forecasting: An iTransformer and LSTM-Based Model Integrating Temporal and Covariate Interactions"
Code for selecting an action based on multimodal inputs; in this case the inputs are voice and text.
[NeurIPS 2023] Official implementation of the paper "CAST: Cross-Attention in Space and Time for Video Action Recognition"
The official repository of "Energy-Based Cross Attention for Bayesian Context Update in Text-to-Image Diffusion Models".
[ITSC-2023] HRFuser: A Multi-resolution Sensor Fusion Architecture for 2D Object Detection
A lightweight PyTorch implementation of the Transformer-XL architecture proposed by Dai et al. (2019)
TensorFlow implementation of "Robust Image Watermarking based on Cross-Attention and Invariant Domain Learning"
[ICIP 2025] Official implementation of RT-X Net: RGB-Thermal cross attention network for Low-Light Image Enhancement
Detect Deepfaked Faces Using Multiple Deep Learning Models
Transcription factor binding site prediction for novel DNA sequence data aiding in mutation identification and drug discovery
SOVL System (Self-Organizing Virtual Lifeform): A complex, purpose-agnostic autonomous agent with continuous, asynchronous learning capabilities via a dynamic scaffolded LLM and a frozen base LLM