pretraining
General technology for enabling AI capabilities with LLMs and MLLMs
Official repository of OFA (ICML 2022). Paper: "OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework"
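The unifying idea is that every task, whether captioning, VQA, or visual grounding, is serialized into the same source-to-target text format. A minimal sketch of that formatting step (the task templates and location-bin tokens below are illustrative assumptions, not OFA's exact preprocessing):

```python
# A sketch of casting heterogeneous tasks into one source->target text
# format, the core idea behind unified seq2seq frameworks like OFA.
# Templates and token names here are illustrative, not OFA's actual ones.

def to_seq2seq(task: str, sample: dict) -> tuple[str, str]:
    """Map a task-specific sample to a (source, target) text pair."""
    if task == "caption":
        return "what does the image describe?", sample["caption"]
    if task == "vqa":
        return sample["question"], sample["answer"]
    if task == "grounding":
        # Regions can be serialized as discrete location tokens.
        x0, y0, x1, y1 = sample["box"]
        return (f'which region does the text "{sample["phrase"]}" describe?',
                f"<bin_{x0}> <bin_{y0}> <bin_{x1}> <bin_{y1}>")
    raise ValueError(f"unknown task: {task}")

src, tgt = to_seq2seq("vqa", {"question": "what color is the car?", "answer": "red"})
print(src, "->", tgt)
```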
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
Papers about pretraining and self-supervised learning on Graph Neural Networks (GNN).
[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; PyTorch implementation of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"
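To make the key idea concrete, here is a toy mask-and-reconstruct objective on a plain convnet. The actual paper uses sparse convolutions and hierarchical decoding; the 32x32 input, 8x8 patches, and zero-filled masking below are simplifying assumptions:

```python
import torch
import torch.nn as nn

# Toy MAE-style masked image modeling on a convolutional network:
# mask random patches, encode the visible content, reconstruct the rest.

def random_patch_mask(b, h, w, patch=8, ratio=0.6):
    gh, gw = h // patch, w // patch
    keep = torch.rand(b, 1, gh, gw) > ratio  # True = visible patch
    return keep.float().repeat_interleave(patch, 2).repeat_interleave(patch, 3)

encoder = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(32, 32, 3, padding=1), nn.ReLU())
decoder = nn.Conv2d(32, 3, 3, padding=1)

x = torch.randn(4, 3, 32, 32)
mask = random_patch_mask(4, 32, 32)              # 1 = visible, 0 = masked
recon = decoder(encoder(x * mask))               # encode only visible content
loss = (((recon - x) ** 2) * (1 - mask)).mean()  # reconstruct masked patches
loss.backward()
print(float(loss))
```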
Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
EntitySeg Toolbox: Towards Open-World and High-Quality Image Segmentation
X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).
Official Repository for the Uni-Mol Series Methods
[ICLR 2024🔥] Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
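The core of language-based alignment is contrastive training that pulls each modality's embedding toward the frozen text embedding of its paired caption, making language the shared binding space. A schematic sketch with stand-in linear encoders and random features (not the actual LanguageBind models or data):

```python
import torch
import torch.nn.functional as F

# Schematic language-based alignment: a trainable modality encoder is
# matched against frozen text embeddings with an InfoNCE-style loss.
torch.manual_seed(0)
text_emb = F.normalize(torch.randn(8, 128), dim=-1)   # frozen text features
video_enc = torch.nn.Linear(256, 128)                 # trainable modality encoder
video_feat = F.normalize(video_enc(torch.randn(8, 256)), dim=-1)

logits = video_feat @ text_emb.t() / 0.07             # cosine sim / temperature
labels = torch.arange(8)                              # i-th video pairs with i-th text
loss = F.cross_entropy(logits, labels)
loss.backward()
```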
Official PyTorch implementation of the "ImageNet-21K Pretraining for the Masses" (NeurIPS 2021) paper
Pretraining code for a large-scale depth-recurrent language model
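Depth recurrence means one shared block is unrolled a variable number of times instead of stacking distinct layers, so compute can be scaled at inference time. A schematic sketch (the encoder-style block, sizes, and step counts are illustrative assumptions; a real LM would use causal attention):

```python
import torch
import torch.nn as nn

# Depth recurrence sketch: one shared transformer block reused for a
# variable number of steps, so "depth" becomes recurrent compute.
class DepthRecurrentLM(nn.Module):
    def __init__(self, vocab=1000, d=64, steps=4):
        super().__init__()
        self.embed = nn.Embedding(vocab, d)
        self.block = nn.TransformerEncoderLayer(d, nhead=4, batch_first=True)
        self.head = nn.Linear(d, vocab)
        self.steps = steps

    def forward(self, ids, steps=None):
        h = self.embed(ids)
        for _ in range(steps or self.steps):  # same weights at every step
            h = self.block(h)
        return self.head(h)

model = DepthRecurrentLM()
logits = model(torch.randint(0, 1000, (2, 16)), steps=8)  # extra test-time compute
print(logits.shape)  # torch.Size([2, 16, 1000])
```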
A curated list of 3D vision papers related to the robotics domain in the era of large models (i.e., LLMs/VLMs), inspired by awesome-computer-vision; includes papers, code, and related websites
Official repository for "Craw4LLM: Efficient Web Crawling for LLM Pretraining"
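The idea is to replace breadth-first crawl order with a priority queue keyed by each page's estimated value for pretraining. A schematic sketch in which score() and the fetch callback are hypothetical stand-ins for the paper's influence scorer and crawler:

```python
import heapq

def score(url: str) -> float:
    # Hypothetical placeholder for a pretraining-usefulness estimate.
    return 1.0 if url.endswith(".html") else 0.1

def crawl(seeds, fetch, budget=100):
    # Frontier ordered by score (negated: heapq is a min-heap).
    frontier = [(-score(u), u) for u in seeds]
    heapq.heapify(frontier)
    seen, pages = set(seeds), []
    while frontier and len(pages) < budget:
        _, url = heapq.heappop(frontier)   # best-scoring URL first
        text, links = fetch(url)           # user-supplied fetcher
        pages.append(text)
        for link in links:
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-score(link), link))
    return pages
```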
The official implementation of MARS: Unleashing the Power of Variance Reduction for Training Large Models
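The variance-reduction trick is to correct the raw stochastic gradient with a scaled difference between the current gradient and the previous-iterate gradient evaluated on the same minibatch, then feed the corrected estimate into momentum. A toy sketch on least squares (constants are illustrative, and MARS's preconditioning and clipping are omitted):

```python
import torch

def loss_fn(w, batch):
    x, y = batch
    return ((x @ w - y) ** 2).mean()

w = torch.zeros(5, requires_grad=True)
m, beta, gamma, lr = torch.zeros(5), 0.9, 0.025, 0.1
prev_w = w.detach().clone()

for step in range(50):
    batch = (torch.randn(32, 5), torch.randn(32))
    g_now = torch.autograd.grad(loss_fn(w, batch), w)[0]
    # Gradient at the PREVIOUS weights on the SAME minibatch.
    g_prev = torch.autograd.grad(loss_fn(prev_w.requires_grad_(), batch),
                                 prev_w)[0]
    c = g_now + gamma * (beta / (1 - beta)) * (g_now - g_prev)  # corrected gradient
    prev_w = w.detach().clone()
    m = beta * m + (1 - beta) * c
    with torch.no_grad():
        w -= lr * m
```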
PITI: Pretraining is All You Need for Image-to-Image Translation
PaddlePaddle's large model development suite, providing an end-to-end development toolchain for large language models, cross-modal large models, biocomputing large models, and more.
[ACL 2022] LinkBERT: A Knowledgeable Language Model 😎 Pretrained with Document Links
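The data-side idea is that the second training segment sometimes comes from a hyperlinked document rather than the same one, exposing the model to cross-document knowledge. A simplified sketch of that pairing step (the toy corpus and 50/50 mix are assumptions; the paper additionally mixes in random-document pairs for its document relation prediction objective):

```python
import random

# Toy corpus: each document carries its text and outgoing hyperlinks.
corpus = {
    "doc_a": {"text": ["sentence a1", "sentence a2"], "links": ["doc_b"]},
    "doc_b": {"text": ["sentence b1", "sentence b2"], "links": []},
}

def make_pair(doc_id: str) -> tuple[str, str]:
    doc = corpus[doc_id]
    if doc["links"] and random.random() < 0.5:
        linked = corpus[random.choice(doc["links"])]
        return doc["text"][0], random.choice(linked["text"])  # linked pair
    return doc["text"][0], doc["text"][-1]                    # contiguous pair

print(make_pair("doc_a"))
```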
Personal project: MPP-Qwen14B & MPP-Qwen-Next (Multimodal Pipeline Parallel based on Qwen-LM). Supports [video/image/multi-image] {sft/conversations}. Don't let poverty limit your imagination! Train your own 8B/14B LLaVA-like MLLM on an RTX 3090/4090 with 24GB.
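Pipeline parallelism in this spirit places groups of layers on different GPUs and passes activations across the device boundary, which is what lets a model too big for one 24GB card still train. A bare-bones two-stage sketch (layer sizes and the two-GPU split are illustrative; real pipelines also interleave micro-batches to keep both stages busy):

```python
import torch
import torch.nn as nn

# Two-stage model partitioning: half the layers on cuda:0, half on cuda:1,
# with activations handed across the device boundary in forward().
class TwoStageModel(nn.Module):
    def __init__(self, d=1024, layers=8):
        super().__init__()
        half = layers // 2
        self.stage0 = nn.Sequential(*[nn.Linear(d, d) for _ in range(half)]).to("cuda:0")
        self.stage1 = nn.Sequential(*[nn.Linear(d, d) for _ in range(half)]).to("cuda:1")

    def forward(self, x):
        h = self.stage0(x.to("cuda:0"))
        return self.stage1(h.to("cuda:1"))  # activations cross the GPU boundary

if torch.cuda.device_count() >= 2:
    model = TwoStageModel()
    out = model(torch.randn(4, 1024))
    out.sum().backward()                    # autograd spans both devices
```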