Cypher_Generator

This repository contains code for generating a supervised fine-tuning dataset of question-Cypher query pairs. Each question is generated as a function of node labels, node properties, or relationship types and their properties. Although such questions read more mechanically than naturally phrased ones, they can effectively complement naturally phrased questions in fine-tuning datasets.

Our approach employs approximately 100 generating functions. The question-Cypher pairs are generated from a Neo4j graph database by extracting its knowledge graph schema together with several node and relationship instances.
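To make the idea of a generating function concrete, here is a minimal sketch in Python. It is not taken from the repository: the `SCHEMA` and `INSTANCES` dictionaries, the function names, and the question templates are all hypothetical stand-ins for what would be extracted from a real Neo4j database.

```python
from typing import Any

# Hypothetical stand-in for a schema extracted from a Neo4j database.
SCHEMA = {
    "node_props": {"Movie": ["title", "released"], "Person": ["name", "born"]},
    "relationships": [("Person", "ACTED_IN", "Movie")],
}

# Hypothetical sample instances: one property dictionary per node label.
INSTANCES = {
    "Movie": {"title": "The Matrix", "released": 1999},
    "Person": {"name": "Keanu Reeves", "born": 1964},
}


def cypher_literal(value: Any) -> str:
    """Render a Python value as a Cypher literal, quoting and escaping strings."""
    if isinstance(value, bool):  # checked first because bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def find_by_property(label: str, prop: str, value: Any) -> dict:
    """Generating function over a node label and one of its properties."""
    return {
        "question": f"Find all {label} nodes whose {prop} is {value}.",
        "cypher": f"MATCH (n:{label}) WHERE n.{prop} = {cypher_literal(value)} RETURN n",
    }


def count_related(source: str, rel_type: str, target: str) -> dict:
    """Generating function over a relationship type and its endpoint labels."""
    return {
        "question": (
            f"For each {source}, how many {target} nodes is it connected to "
            f"through {rel_type}?"
        ),
        "cypher": (
            f"MATCH (a:{source})-[:{rel_type}]->(b:{target}) "
            "RETURN a, count(b) AS total"
        ),
    }


def build_pairs(schema: dict, instances: dict) -> list:
    """Apply every generating function to every matching schema element."""
    pairs = []
    for label, props in schema["node_props"].items():
        for prop in props:
            pairs.append(find_by_property(label, prop, instances[label][prop]))
    for source, rel_type, target in schema["relationships"]:
        pairs.append(count_related(source, rel_type, target))
    return pairs


if __name__ == "__main__":
    for pair in build_pairs(SCHEMA, INSTANCES):
        print(pair["question"])
        print(pair["cypher"])
```

Each function maps schema elements and sample values to one question-Cypher pair, so the dataset grows with both the number of functions and the size of the schema.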

For ease of use and transparency, the dataset generation process is provided as a notebook. To generate the dataset, obtain the credentials for your Neo4j knowledge graph and follow the steps outlined in the notebook SFT_Functional_Data_Builder.ipynb. Many steps within the notebook can be adjusted to suit specific needs. Some functionality relies on modules found in the utils directory.

Additionally, we include two fine-tuning notebooks, one for the CodeLlama-13B and one for the StarCoder2-3B large language model. Both use QLoRA to reduce computational demands, together with the PEFT and TRL libraries from Hugging Face.
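As a rough illustration of what a QLoRA setup with these libraries can look like, the sketch below builds a 4-bit quantization config and a LoRA adapter config, and formats one training example. It is not copied from the notebooks: the prompt template, the hyperparameter values, and the function names are assumptions chosen for illustration.

```python
def format_example(schema: str, question: str, cypher: str) -> str:
    """Join schema, question, and target query into one training string.

    The section markers are a hypothetical template, not the notebooks' format.
    """
    return (
        f"### Schema:\n{schema}\n\n"
        f"### Question:\n{question}\n\n"
        f"### Cypher:\n{cypher}"
    )


def build_qlora_configs():
    """Return (quantization config, LoRA config) for a 4-bit QLoRA run."""
    # Imported lazily so format_example can be used without the GPU libraries.
    import torch
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    # Load the base model weights in 4-bit NF4 and compute in bfloat16.
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    # Illustrative adapter hyperparameters (assumed values). target_modules is
    # left unset so PEFT falls back to its defaults for architectures it
    # recognizes; set it explicitly if your model is not covered.
    lora_config = LoraConfig(
        r=16,
        lora_alpha=32,
        lora_dropout=0.05,
        bias="none",
        task_type="CAUSAL_LM",
    )
    return bnb_config, lora_config
```

In a typical setup, the quantization config would be passed to `AutoModelForCausalLM.from_pretrained` through its `quantization_config` argument, and the LoRA config to TRL's `SFTTrainer` through `peft_config`.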

Repository contents:

datas
utils
.gitignore
CodeLlama_13B_Supervised_FineTuning_Text2Cypher.ipynb
LICENSE
README.md
SFT_Functional_Data_Builder.ipynb
StarCoder2_3B_Supervised_FineTuning_Text2Cypher.ipynb
