# data-centric

Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.

Rust
4494
1 day ago

Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]

Python
546
4 months ago

The toolkit to test, validate, and evaluate your models and surface, curate, and prioritize the most valuable data for labeling.

Python
449
11 days ago

Rust implementation of the Data Distribution Service (DDS)

Rust
114
2 days ago

Simulator framework for analysis of performance, energy consumption, area and cost of multi-node multi-chiplet tile-based manycore designs

C++
64
10 months ago

[ICLR'23] Implementation of "Empowering Graph Representation Learning with Test-Time Graph Transformation"

Python
56
2 years ago

🔥🔥🔥 KDD2024 Best Student Paper

Python
54
2 months ago

Vue Form with Laravel Inspired Validation and Simply Enjoyable Error Messages API. (Form API, Validator API, Rules API, Error Messages API)

JavaScript
41
3 years ago

A list of data-efficient and data-centric LLM (Large Language Model) papers. Our Survey Paper: Towards Efficient LLM Post Training: A Data-centric Perspective

29
2 months ago

An observer is a wrapper over JSON data that provides an interface to know when data is changed, with a focus on performance and memory efficiency.

JavaScript
24
4 years ago

Code for a Top 5% finish in the Data-Centric AI Competition organized by Andrew Ng and DeepLearning.AI

Jupyter Notebook
22
4 years ago

Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data (NeurIPS 2022)

Jupyter Notebook
16
2 years ago

From local functions to cloud deployed pipelines

Python
16
2 years ago

Jaehyung Kim et al.'s ACL 2023 paper on "infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information"

Python
16
2 years ago

Data-SUITE: Data-centric identification of in-distribution incongruous examples (ICML 2022)

Jupyter Notebook
10
2 years ago