#

chunking

chonkie-ai/chonkie

🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library

Python
2872
1 month ago

NCRF++, a Neural Sequence Labeling Toolkit. Easy to use for any sequence labeling task (e.g. NER, POS, Segmentation). It includes character LSTM/CNN, word LSTM/CNN and softmax/CRF components.

Python
1895
3 years ago
C
1517
1 year ago

An extensible Java framework for building event-driven applications that break up XML and non-XML data into chunks for data integration

Java
400
6 days ago

Alternative casync implementation

Go
349
3 days ago

Fully neural approach for text chunking

Python
310
1 day ago

A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks.

Python
290
24 days ago

A package for parsing PDFs and analyzing their content using LLMs.

Python
269
8 months ago

The RAG Experiment Accelerator is a versatile tool designed to expedite and facilitate experiments and evaluations using Azure Cognitive Search and the RAG pattern.

Python
243
12 days ago

A TensorFlow implementation of a neural sequence labeling model that can tackle sequence labeling tasks such as POS tagging, chunking, NER, punctuation restoration, etc.

Python
234
6 years ago

A new chunking strategy developed by ZeroEntropy for general semantic chunking using Llama-70B.

Python
174
3 months ago

Live TS segmenter and HLS manifest creation in Go

Go
94
3 years ago

An LLM GUI application; enables you to interact with your files, offering dynamic parameters that can modify response behavior during runtime.

Python
92
1 year ago

webpack 2, react hotloader 3, react router v4, code splitting and more

JavaScript
85
8 years ago

🍱 semantic-chunking ⇢ semantically create chunks from large documents for passing to LLM workflows

JavaScript
83
2 months ago

📑 Split Laravel jobs into multiple separate job chunks

PHP
83
1 year ago

An asynchronous event-driven HTTP client based on netty.

Java
82
3 years ago

Grammatical Dictionary of the Russian Language (+ English, Japanese, etc.)

C++
75
5 years ago

Fast multi-threaded content-dependent chunking deduplication for Buffers in C++ with a reference implementation in JavaScript. Ships with extensive tests, a fuzz test and a benchmark.

JavaScript
74
5 years ago