Starred repositories
AssertJ is a library providing easy to use rich typed assertions
Lakekeeper: A Rust native Iceberg REST Catalog
This repository contains the dbt-glue adapter
PartiQL libraries and tools in Kotlin.
Apache Polaris, the interoperable, open source catalog for Apache Iceberg
Distributed data engine for Python/SQL designed for the cloud, powered by Rust
Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, Du…
Open, Multi-modal Catalog for Data & AI
DuckDB is an analytical in-process SQL database management system
A practical hands-on tutorial for people who want to start learning Apache Iceberg. You can get started with a single machine that can run containers.
Fancy stream processing made operationally mundane
Awaitility is a small Java DSL for synchronizing asynchronous operations
Specification for storing geospatial vector data (point, line, polygon) in Parquet
Repository for the book "Crafting Interpreters"
Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.
ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
RayDP provides simple APIs for running Spark on Ray and integrating Spark with AI libraries.
Extendable version manager with support for Ruby, Node.js, Elixir, Erlang & more
Your favorite language gets closer to bare metal.
Dataframes powered by a multithreaded, vectorized query engine, written in Rust
A composable and fully extensible C++ execution engine library for data management systems.
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
Serverless ETL and Analytics with AWS Glue, published by Packt