rft
Use PEFT or full-parameter training to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, InternVL3, Ovis2.5, Llava, GLM4v, Phi4, ...) (AAAI 2025).
Implementation of the Rhythm Formant Analysis methodology for identifying speech rhythms and rhythm variation in the low frequency spectrum and spectrogram.
The My Pocket Token Foundation will make the blockchain better by bridging it with the World Wide Web. Built by some of the best developers in the world.
[AAMAS 2025] Privacy-preserving and personalized RLHF, with convergence guarantees. The code contains experiments for training multiple instances of GPT-2 for personalized, sentiment-aligned text generation.
Smart contract and unit tests for Refungible Token / Fractional NFT
Awesome Reinforcement Fine Tuning
A small Hugging Face LLM for planning and reasoning
Syllogimind is an application developed in Go, designed to help users strengthen their logical reasoning by generating and solving syllogisms.
Generate MS Word files using PHP
Algorithmic Truth Table Method for Proving Validity of Argument Forms
The course teaches how to fine-tune LLMs using Group Relative Policy Optimization (GRPO), a reinforcement learning method that improves model reasoning with minimal data. Learn RFT concepts, reward design, and LLM-as-a-judge evaluation, and deploy jobs on the Predibase platform.