rft

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, InternVL3, Ovis2.5, Llava, GLM4v, Phi4, ...) (AAAI 2025).

Python
9384
7 hours ago

Implementation of the Rhythm Formant Analysis methodology for identifying speech rhythms and rhythm variation in the low frequency spectrum and spectrogram.

Python
16
2 years ago

The My Pocket Token Foundation will make the blockchain better, by bridging the blockchain with the worldwide web. Some of the best Developers in the world.

Solidity
15
3 years ago

[AAMAS 2025] Privacy-preserving and Personalized RLHF, with convergence guarantees. The code contains experiments for training multiple instances of GPT-2 for personalized, sentiment-aligned text generation.

Python
10
4 months ago

Smart contract and unit tests for Refungible Token / Fractional NFT

JavaScript
4
4 years ago

Awesome Reinforcement Fine Tuning

3
8 months ago

Random Factorio Things mod for Factorio

Lua
2
8 months ago

A small Hugging Face LLM for planning and reasoning

Python
2
6 months ago

Syllogimind is an application developed in Go, designed to help users strengthen their logical reasoning by generating and solving syllogisms.

Go
0
1 year ago

Generate MS Word files using PHP

PHP
0
5 years ago

Algorithmic Truth Table Method for Proving Validity of Argument Forms

TeX
0
1 day ago

The course teaches how to fine-tune LLMs using Group Relative Policy Optimization (GRPO), a reinforcement learning method that improves model reasoning with minimal data. Learn RFT concepts, reward design, LLM-as-a-judge evaluation, and deploy jobs on the Predibase platform.

Jupyter Notebook
0
2 months ago