
rft


Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, InternVL3, Ovis2.5, Llava, GLM4v, Phi4, ...) (AAAI 2025).

large-language-model lora llama sft deploy multimodal peft internvl liger qwen2-vl rft deepseek-r1 embedding grpo open-r1 megatron omni llama4 qwen3 qwen3-moe

Python

9384

825

7 hours ago

dafyddg / RFA

Implementation of the Rhythm Formant Analysis methodology for identifying speech rhythms and rhythm variation in the low frequency spectrum and spectrogram.

rft

Python

2 years ago

LifeCoachRay / My-Pocket-Token-Foundation

The My Pocket Token Foundation will make the blockchain better by bridging the blockchain with the World Wide Web. Some of the best developers in the world.

blockchain-technology developer-tools cryptocurrency tokens blogging security rft nftools nfts nft nft-gallery

Solidity

3 years ago

flint-xf-fan / Federated-RLHF

[AAMAS 2025] Privacy-preserving and personalized RLHF, with convergence guarantees. The code contains experiments for training multiple instances of GPT-2 for personalized, sentiment-aligned text generation.

large-language-model reinforcement-learning-from-human-feedback rft rlhf

Python

4 months ago

anasshad / Refungible-Tokens-Fractional-NFT

Smart contract and unit tests for Refungible Token / Fractional NFT

Solidity smart-contracts erc721 erc20 nft rft

JavaScript

4 years ago

XxFChen / awesome-reinforcement-fine-tuning

Awesome Reinforcement Fine Tuning

rft fine-tuning finetuning

8 months ago

JohnTheCoolingFan / RandomFactorioThings

Random Factorio Things mod for Factorio

Factorio mod rft

Lua

8 months ago

Masoudjafaripour / llm-hf-planning

A small Hugging Face LLM for planning and reasoning

fine-tuning large-language-model planning rft sft

Python

6 months ago

Azaijah / Syllogimind

Syllogimind is an application developed in Go, designed to help users strengthen their logical reasoning by generating and solving syllogisms.

rft

1 year ago

aman-maurya / OfficeExporter

Generate MS Word files using PHP

PHP php-library msword rft office-tools

PHP

5 years ago

rft-kolcsonzo / kolcsonzo-api

rft university PHP REST API API Docker

PHP

7 years ago

HINNOTN / syllogisms

Algorithmic Truth Table Method for Proving Validity of Argument Forms

large-language-model philosophy Python reasoning rft statement validation

TeX

1 day ago

ksm26 / Reinforcement-Fine-Tuning-LLMs-with-GRPO

The course teaches how to fine-tune LLMs using Group Relative Policy Optimization (GRPO), a reinforcement learning method that improves model reasoning with minimal data. Learn RFT concepts, reward design, and LLM-as-a-judge evaluation, and deploy jobs on the Predibase platform.

grpo reinforcement-learning rft rlhf language-model machine-learning

Jupyter Notebook

2 months ago