Starred repositories
The most streamlined road map to learn ML for free.
A reactive notebook for Python: run reproducible experiments, execute as a script, deploy as an app, and version with git.
Reconquer the canvas: beautiful Tikz figures without clunky Tikz code
Python module (C extension and plain python) implementing Aho-Corasick algorithm
Fast lexical search library implementing BM25 in Python using Scipy (on average 2x faster than Elasticsearch in single-threaded setting)
Fast & Simple repository for pre-training and fine-tuning T5-style models
distilabel is a framework for synthetic data and AI feedback for AI engineers who require high-quality outputs, full data ownership, and overall efficiency.
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Efficient few-shot learning with Sentence Transformers
Experiments for efforts to train a new and improved t5
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
Paper List for Contrastive Learning for Natural Language Processing
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Octopus is a neural machine generation toolkit for Arabic Natural Language Generation (NLG)
MTEB: Massive Text Embedding Benchmark
Sparsity-aware deep learning inference runtime for CPUs
Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE)
Lite & Super-fast re-ranking for your search & retrieval pipelines. Supports SoTA Listwise and Pairwise reranking based on LLMs and cross-encoders and more. Created by Prithivi Da, open for PRs & C…
AraT5: Text-to-Text Transformers for Arabic Language Understanding
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Sentiment Corpus for Swedish, Norwegian, Danish, and Finnish (and English)
Modeling, training, eval, and inference code for OLMo
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
Leveraging BERT and c-TF-IDF to create easily interpretable topics.