ai4code

[ICML'24] Magicoder: Empowering Code Generation with OSS-Instruct

Python
2024
10 months ago

Inference code and configs for the ReplitLM model family

Python
991
2 years ago

A curated list of papers, theses, datasets, and tools related to the application of Machine Learning for Software Engineering

715
1 year ago

Code and data artifact for the NeurIPS 2023 paper "Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context". `multispy` is an LSP client library in Python intended for building applications around language servers.

Python
270
1 year ago

Open-source Self-Instruction Tuning Code LLM

Python
169
2 years ago

[ICLR 2025 Oral] Official repo for the paper "LLM-SR" on Scientific Equation Discovery and Symbolic Regression with Large Language Models

Python
157
19 days ago

[EMNLP 2023] The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation

Jupyter Notebook
98
1 year ago

✅SRepair: Powerful LLM-based Program Repairer with $0.029/Fixed Bug

Python
68
1 year ago

[ICML2025 Oral] LLM-SRBench: A New Benchmark for Scientific Equation Discovery with Large Language Models

Python
66
19 days ago

This repository contains the core methods and models described in the paper “Represent Code as Action Sequence for Predicting Next Method Call.” It uses action sequence modeling to predict method calls in Python code based on developer intentions, treating code editing as a sequence of human-like actions.

Python
56
1 year ago

[ICLR 2021] "Generating Adversarial Computer Programs using Optimized Obfuscations" by Shashank Srikant, Sijia Liu, Tamara Mitrovska, Shiyu Chang, Quanfu Fan, Gaoyuan Zhang, and Una-May O'Reilly

Python
30
4 years ago

[FORGE 2025] Predicting Program Behavior with Dynamic Dependencies Learning

Python
24
1 year ago

[AAAI 2025] The official code of the paper "InverseCoder: Unleashing the Power of Instruction-Tuned Code LLMs with Inverse-Instruct" (https://arxiv.org/abs/2407.05700).

Python
12
1 year ago

[NAACL 2025] Guiding Large Language Models in Code Execution with Fine-grained Multimodal Chain-of-Thought Reasoning

Jupyter Notebook
10
6 months ago

[ACL'2025 Findings] DynaCode: A Dynamic Complexity-Aware Code Benchmark for Evaluating Large Language Models in Code Generation

Python
7
1 month ago

This paper explores using heterogeneous graph neural networks (Het-GNN) to partition legacy monoliths into candidate microservices. We additionally incorporate membership constraints supplied by a subject matter expert with deep domain knowledge of the application.

Python
6
3 years ago