Native and Compact Structured Latents for 3D Generation
Official inference repo for FLUX.2 models
Towards Human-Level Text-to-Speech through Style Diffusion
A lossless video/GIF/image upscaler achieved with waifu2x, Anime4K
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
Ready-to-use OCR with 80+ supported languages
Library for OCR-related tasks powered by Deep Learning
Models and examples built with TensorFlow
Operating LLMs in production
AI-driven neuro-symbolic solver for high-school geometry problems
A speech-text foundation model for real time dialogue
State-of-the-art TTS model under 25MB
Video understanding codebase from FAIR for reproducing video models
Deep learning library
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
Usable Implementation of "Bootstrap Your Own Latent" self-supervised
Official Repo For "Sa2VA: Marrying SAM2 with LLaVA
Image-to-Image Translation in PyTorch
MII makes low-latency and high-throughput inference possible
Audio metadata editor with MusicBrainz integration
RetroVision, a retro image-to-pixel-art converter
MMEditing is a low-level vision toolbox based on PyTorch
Unlimited, private and free Speech-To-Text program
FEATool Multiphysics is an easy-to-use FEA and CFD Simulation Toolbox
Sequence-to-sequence framework, focused on Neural Machine Translation