Starred repositories
ShabbyPages is a state-of-the-art corpus of born-digital document images with both ground truth and distorted versions appropriate for use in training models to reverse distortions and recover to o…
MiniCPM-2B: An end-side LLM outperforming Llama2-13B.
List of references and online resources related to data science, machine learning and deep learning.
Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
The official PyTorch implementation of CVPR 2020 paper "Improving Convolutional Networks with Self-Calibrated Convolutions"
CVPR2023 - Activating More Pixels in Image Super-Resolution Transformer Arxiv - HAT: Hybrid Attention Transformer for Image Restoration
When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition (ECCV’2022 Poster).
Papers in the field of natural language processing (with reading notes), reproduced models, data processing, and more (code in both TensorFlow and PyTorch versions)
🎨 Math Formula OCR Pro: recognizes handwritten and printed formulas in Chinese and English, and supports simple symbol reasoning (data structure based on the LaTeX abstract syntax tree).
Chinese NER (Named Entity Recognition) using BERT (Softmax, CRF, Span)
A PyTorch-based OCR algorithm library, including psenet, pan, dbnet, sast, crnn
PyTorch implementation of a CNN
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
(CRNN) Chinese Characters Recognition.
Large Language Models: this repository introduces language models, covering both theoretical and practical aspects.
Practical course about Large Language Models.
NEW - YOLOv8 🚀 in PyTorch > ONNX > OpenVINO > CoreML > TFLite
ACM Multimedia 2023: DocDiff: Document Enhancement via Residual Diffusion Models. Also contains 1597 red seals in Chinese scenes, along with their corresponding binary masks.
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Official PyTorch implementation of `Reading and Writing: Discriminative and Generative Modeling for Self-Supervised Text Recognition`
Convert scans of handwritten notes to beautiful, compact PDFs
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
pycorrector is a toolkit for text error correction. It applies models such as Kenlm, T5, MacBERT, ChatGLM3, and LLaMA to error-correction scenarios and works out of the box.
The official code for “Deep Unrestricted Document Image Rectification”, TMM, 2023.