- MarkAny
- Seoul, Korea
- https://www.linkedin.com/in/yonghye-kwon-91641a174/
Stars
- Claude Engineer is an interactive command-line interface (CLI) that leverages the power of Anthropic's Claude-3.5-Sonnet model to assist with software development tasks. This tool combines the capa…
- PyTorch implementation of the R2Plus1D convolution based ResNet architecture described in the paper "A Closer Look at Spatiotemporal Convolutions for Action Recognition"
- Fast and memory-efficient exact attention
- This repository is an official implementation of the paper "LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection".
- Official implementation of Faceptor: A Generalist Model for Face Perception.
- Official PyTorch implementation of the paper "SwinFace: A Multi-task Transformer for Face Recognition, Facial Expression Recognition, Age Estimation and Face Attribute Estimation"
- [NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
- VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)
- Python library for romanizing Korean text. Converts '안녕하세요' to 'annyeonghaseyo' and more.
- A Perceptual Image Sharpness Metric Based on Local Edge Gradient Analysis
- ONNX-compatible Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
- Official implementation of the CVPR 2020 paper "VIBE: Video Inference for Human Body Pose and Shape Estimation"
- This repository is a curated collection of the most exciting and influential CVPR 2024 papers. 🔥 [Paper + Code + Demo]
- Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
- VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
- [arXiv 2024] MotionLLM: Understanding Human Behaviors from Human Motions and Videos
- From zero to hero: CUDA for accelerating maths and machine learning on GPU.
- YOLOv10: Real-Time End-to-End Object Detection
- [ICLR'23] AIM: Adapting Image Models for Efficient Video Action Recognition
- llama3 implementation one matrix multiplication at a time
- Code for the CVPR 2024 paper "DiffMOT: A Real-time Diffusion-based Multiple Object Tracker with Non-linear Prediction"
- Implementation of XFeat (CVPR 2024). Do you need robust and fast local feature extraction? You are in the right place!
- Official PyTorch implementation of CorrespondentDream: Enhancing 3D Fidelity of Text-to-3D using Cross-View Correspondences (CVPR 2024 Poster)
- [ECCV 2024] Official PyTorch implementation of RoPE-ViT "Rotary Position Embedding for Vision Transformer"