Stars
Dataframes powered by a multithreaded, vectorized query engine, written in Rust
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Developer-friendly, serverless vector database for AI applications. Easily add long-term memory to your LLM apps!
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
18 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
One-click deploy of a Knowledge Graph powered RAG (GraphRAG) in Azure
A modular graph-based Retrieval-Augmented Generation (RAG) system
New file format for storage of large columnar datasets.
CLI to manage your datacontract.yaml files
liubin / docker-hadoop
Forked from F21/hadoop. For distributed HA HDFS only.
Avro XML Mapper is a Java library that converts XML-formatted data to Apache Avro format
A vulnerability scanner for container images and filesystems
Apache Atlas development image for the Rokku project: https://github.com/ing-bank/rokku
Cluster in docker with Apache Atlas and a minimal Hadoop ecosystem to perform some basic experiments.
This end-to-end demo walks users through extracting text from encoded PDF documents at scale using Apache PDFBox and Databricks, with Scala and Spark.
Simple project to expose a catalog over REST using a Java catalog backend
A tool that makes it easy to run modular Trino environments locally.
📊 Cube — The Semantic Layer for Building Data Applications
An orchestration platform for the development, production, and observation of data assets.
High quality resources & applications for LLMs, multi-modal models and VectorDBs
Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.
Trino connectors for accessing APIs with an OpenAPI spec
dbt (http://getdbt.com) adapter for DuckDB (http://duckdb.org)
A GitOps-based Anthos Multi Cloud installer framework.
MicroK8s is a small, fast, single-package Kubernetes for datacenters and the edge.