data-centric

Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.

Rust
5231
7 hours ago

Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]

Python
566
8 months ago

The toolkit to test, validate, and evaluate your models and surface, curate, and prioritize the most valuable data for labeling.

Python
453
3 months ago

Rust implementation of the Data Distribution Service (DDS)

Rust
122
1 month ago

Simulator framework for analysis of performance, energy consumption, area and cost of multi-node multi-chiplet tile-based manycore designs

C++
68
1 year ago

🔥🔥🔥 KDD2024 Best Student Paper

Python
61
6 months ago

[ICLR'23] Implementation of "Empowering Graph Representation Learning with Test-Time Graph Transformation"

Python
60
2 years ago

Vue Form with Laravel-Inspired Validation and Simply Enjoyable Error Messages API. (Form API, Validator API, Rules API, Error Messages API)

JavaScript
41
3 years ago

A list of data-efficient and data-centric LLM (Large Language Model) papers. Our Survey Paper: Towards Efficient LLM Post Training: A Data-centric Perspective

35
6 months ago

An observer is a wrapper over JSON data that provides an interface to know when data is changed, with a focus on performance and memory efficiency.

JavaScript
24
4 years ago

Code for a top 5% finish in the Data-Centric AI Competition organized by Andrew Ng and DeepLearning.AI

Jupyter Notebook
23
4 years ago

Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data (NeurIPS 2022)

Jupyter Notebook
16
2 years ago

From local functions to cloud deployed pipelines

Python
16
2 years ago

Jaehyung Kim et al.'s ACL 2023 paper on "infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information"

Python
16
2 years ago

Data-SUITE: Data-centric identification of in-distribution incongruous examples (ICML 2022)

Jupyter Notebook
9
2 years ago