Install, Run & Control Everything on Your Computer with 1 Click.
Pinokio is a browser that lets you install, run, and manage ANY server application, locally.
Verified
Scripts from Verified Publishers
browser-use
Run AI Agent in your browser. https://github.com/browser-use/web-ui
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fbrowser-use%2Fmain%2Ficon.png&w=256&q=75)
YuE
[NVIDIA ONLY] YuEGP: a Web UI for YuE, an open full-song generation foundation model (10GB VRAM required), via https://github.com/deepbeepmeep/YuEGP
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fyue%2Fmain%2Ficon.png&w=256&q=75)
Open WebUI
User-friendly WebUI for LLMs; supported runners include Ollama and OpenAI-compatible APIs. https://github.com/open-webui/open-webui
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fopen-webui%2Fmain%2Ficon.png&w=256&q=75)
Hunyuan3D-2
[NVIDIA ONLY] Requires 24GB VRAM (or use the low-VRAM version below; it has the same quality). High-resolution 3D asset generation with large-scale Hunyuan3D diffusion models. https://github.com/Tencent/Hunyuan3D-2
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2FHunyuan3D-2%2Fmain%2Ficon.png&w=256&q=75)
Hunyuan3D-2-LowVRAM
[NVIDIA ONLY] Run Hunyuan3D-2 with 6GB VRAM: High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models. https://github.com/deepbeepmeep/Hunyuan3D-2GP
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2FHunyuan3D-2-lowvram%2Fmain%2Ficon.png&w=256&q=75)
bolt.diy
Prompt, run, edit, and deploy full-stack web apps. https://github.com/stackblitz-labs/bolt.diy
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fbolt%2Fmain%2Ficon.png&w=256&q=75)
StyleTTS2 Studio
Build your own voice for StyleTTS2
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2FStyleTTS2_Studio%2Fmain%2Ficon.png&w=256&q=75)
FaceFusion 3.1.1
Industry-leading face manipulation platform
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Ffacefusion%2Ffacefusion-pinokio%2Fmaster%2Ffacefusion.png&w=256&q=75)
MMAudio
Generate synchronized audio from video and/or text inputs https://github.com/hkchengrex/MMAudio
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2FMMAudio%2Fmain%2Ficon.png&w=256&q=75)
PSP
Pinokio System Programming: Make your own custom Pinokio
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiocomputer%2Fpsp%2Fmain%2Fweb%2Fpublic%2Ficon.png&w=256&q=75)
ai-video-composer
The ultimate video editor powered by natural language and FFMPEG https://huggingface.co/spaces/huggingface-projects/ai-video-composer
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fai-video-composer%2Fmain%2Ficon.png&w=256&q=75)
echomimic2
[NVIDIA ONLY] Make virtual avatars talk whatever you want with an image and an audio clip https://github.com/antgroup/echomimic_v2
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fechomimic2%2Fmain%2Ficon.jpeg&w=256&q=75)
ComfyUI
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface. https://github.com/comfyanonymous/ComfyUI
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fcomfy%2Fmain%2Ficon.jpeg&w=256&q=75)
Clarity Refiners UI
An enhanced local port of finegrain-image-enhancer powered by Refiners (https://huggingface.co/spaces/finegrain/finegrain-image-enhancer), which was adapted from philz1337x's Clarity Upscaler (https://github.com/philz1337x/clarity-upscaler)
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fclarity-refiners-ui%2Fmain%2Ficon.png&w=256&q=75)
pyramidflow
Pyramid Flow Video Generation AI (text-to-video & image-to-video) https://github.com/jy0205/Pyramid-Flow
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fpyramidflow%2Fmain%2Ficon.png&w=256&q=75)
RMBG-2-Studio
Enhanced background removal and replacement app built around BRIA-RMBG-2.0 https://huggingface.co/briaai/RMBG-2.0
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2FRMBG-2-Studio%2Fmain%2Ficon.png&w=256&q=75)
InstantIR
Restore low-resolution or broken images, or recreate a new version of an image from a prompt. https://huggingface.co/spaces/fffiloni/InstantIR
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Finstantir%2Fmain%2Ficon.jpeg&w=256&q=75)
Hallucinator
[NVIDIA ONLY] Autocomplete any voice(s), powered by Hertz AI (Standard Intelligence)
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Fhallucinator%2Fmain%2Ficon.gif&w=256&q=75)
fish
Multilingual Text-to-Speech with Voice Cloning (Supports: English, Japanese, Korean, Chinese, French, German, Arabic, and Spanish) https://github.com/fishaudio/fish-speech
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Ffish%2Fmain%2Ficon.png&w=256&q=75)
MFLUX-WEBUI
[MAC ONLY] A powerful and user-friendly web interface for FLUX, powered by MLX and Gradio via MFLUX
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2FMFLUX-WEBUI%2Fmain%2Ficon.png&w=256&q=75)
Allegro-txt2vid
[NVIDIA ONLY] Generate videos with Allegro txt2vid model https://github.com/rhymes-ai/Allegro
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2FAllegro-txt2vid-install%2Fmain%2Ficon.png&w=256&q=75)
omnigen
A unified image generation model that you can use to perform various tasks, including but not limited to text-to-image generation, subject-driven generation, identity-preserving generation, and image-conditioned generation. https://huggingface.co/spaces/Shitao/OmniGen
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fomnigen%2Fmain%2Ficon.png&w=256&q=75)
ditto
the simplest self-building coding agent https://github.com/yoheinakajima/ditto
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fditto%2Fmain%2Ficon.jpeg&w=256&q=75)
e2-f5-tts
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching https://huggingface.co/spaces/mrfakename/E2-F5-TTS
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fe2-f5-tts%2Fmain%2Ficon.png&w=256&q=75)
diamond
Diffusion for World Modeling https://diamond-wm.github.io/
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fdiamond%2Fmain%2Ficon.gif&w=256&q=75)
facepoke
[NVIDIA Only] Select a portrait, click to move the head around https://github.com/jbilcke-hf/FacePoke
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Ffacepoke%2Fmain%2Ficon.gif&w=256&q=75)
MLX-Video-Transcription
[Mac Only] Super Fast MLX Powered Video Transcription https://github.com/RayFernando1337/MLX-Auto-Subtitled-Video-Generator/ by https://x.com/RayFernando1337
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fmlx-video-transcription%2Fmain%2Ficon.jpeg&w=256&q=75)
Invoke
The Gen AI Platform for Pro Studios https://github.com/invoke-ai/InvokeAI
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Finvoke%2Fmain%2Ficon.png&w=256&q=75)
diffusers-image-fill
Remove objects from an image https://huggingface.co/spaces/OzzyGT/diffusers-image-fill
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fdiffusers-image-fill%2Fmain%2Ficon.gif&w=256&q=75)
Whisper-WebUI
A Web UI for easy subtitle generation using the Whisper model.
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fwhisper-webui%2Fmain%2Ficon.png&w=256&q=75)
CogStudio
[NVIDIA ONLY] Advanced Web UI for CogVideo (text-to-video, image-to-video, video-to-video, video extension, etc.) - generate videos with less than 10GB VRAM
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fcogstudio%2Fmain%2Ficon.png&w=256&q=75)
moshi
[Mac only] A speech-text foundation model for real-time dialogue https://github.com/kyutai-labs/moshi
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fmoshi%2Fmain%2Ficon.png&w=256&q=75)
Applio
A simple, high-quality voice conversion tool focused on ease of use and performance. https://github.com/IAHispano/Applio
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fapplio%2Fmain%2Ficon.png&w=256&q=75)
fluxgym
[NVIDIA Only] Dead simple web UI for training FLUX LoRAs with low-VRAM support (from 12GB)
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Ffluxgym%2Fmain%2Ficon.png&w=256&q=75)
cogvideo
[NVIDIA ONLY] Generate videos with less than 10GB VRAM https://github.com/THUDM/CogVideo
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fcogvideo%2Fmain%2Ficon.png&w=256&q=75)
Forge
[NVIDIA ONLY] The most efficient way to run FLUX (Optimized to run even on low memory machines, as low as 3GB VRAM with 512x512 resolution) https://github.com/lllyasviel/stable-diffusion-webui-forge
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fstable-diffusion-webui-forge%2Fmain%2Ficon.jpeg&w=256&q=75)
LivePortrait
Bring portraits to life! https://github.com/KwaiVGI/LivePortrait
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fliveportrait%2Fmain%2Ficon.png&w=256&q=75)
flux-webui
Minimal Flux Web UI powered by Gradio & Diffusers (Flux Schnell + Flux Merged)
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fflux-webui%2Fmain%2Ficon.png&w=256&q=75)
aura-sr-upscaler
AuraSR-v2 - An open reproduction of the GigaGAN Upscaler from fal.ai https://huggingface.co/spaces/gokaygokay/AuraSR-v2
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Faura-sr-upscaler%2Fmain%2Ficon.webp&w=256&q=75)
audiocraft_plus
AudioCraft Plus is an all-in-one WebUI for the original AudioCraft, adding many quality features on top https://github.com/GrandaddyShmax/audiocraft_plus
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Faudiocraft_plus%2Fmain%2Ficon.jpeg&w=256&q=75)
artist
Artist is a training-free, text-driven image stylization method. You provide an image and a prompt describing the desired style, and Artist gives you the image stylized in that style. The detail of the original image and the style you provide are harmoniously integrated. https://huggingface.co/spaces/fffiloni/Artist
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fartist%2Fmain%2Fartist.gif&w=256&q=75)
RC Stable Audio Tools
Advanced Gradio UI for Stable Audio https://github.com/RoyalCities/RC-stable-audio-tools
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Frc-stableaudio%2Fmain%2Ficon.webp&w=256&q=75)
PhotoMaker2
Customizing Realistic Human Photos via Stacked ID Embedding https://huggingface.co/spaces/TencentARC/PhotoMaker-V2
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fphotomaker2%2Fmain%2Ficon.png&w=256&q=75)
Fooocus
Minimal Stable Diffusion UI
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Ffooocus%2Fmain%2Ficon.jpeg&w=256&q=75)
autogpt
AutoGPT is a powerful tool that lets you create and run intelligent agents https://github.com/Significant-Gravitas/AutoGPT
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fautogpt%2Fmain%2Ficon.png&w=256&q=75)
gepeto
Generate Pinokio Launchers, instantly (a minimal launcher sketch follows below). https://gepeto.pinokio.computer
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fgepeto%2Fmain%2Ficon.jpeg&w=256&q=75)
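For context, a Pinokio launcher is essentially a small folder containing a pinokio.js manifest plus the script files it points to. The sketch below is a hypothetical, minimal example of that shape, assuming the publicly documented pinokio.js conventions; the title, icon, and file names are illustrative placeholders, not actual Gepeto output.

```js
// pinokio.js - minimal launcher manifest (hypothetical sketch; field names
// follow Pinokio's documented conventions, values are placeholders)
module.exports = {
  title: "My App",
  description: "Example launcher scaffold",
  icon: "icon.png",
  // The menu function returns the actions shown in the Pinokio UI;
  // each item maps a label to a script file in the same folder.
  menu: async (kernel, info) => {
    return [
      { text: "Install", href: "install.js" },
      { text: "Start", href: "start.js" }
    ]
  }
}
```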
Florence2
An advanced vision foundation model from Microsoft. https://huggingface.co/spaces/gokaygokay/Florence-2
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fflorence2%2Fmain%2Ficon.webp&w=256&q=75)
hallo
[NVIDIA Only] Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation https://github.com/fudan-generative-vision/hallo
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fhallo%2Fmain%2Ficon.gif&w=256&q=75)
chat-with-mlx
[Mac Only] An all-in-one LLM chat UI for Apple Silicon Macs using the MLX framework. https://github.com/qnguyen3/chat-with-mlx
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fmlx%2Fmain%2Ficon.png&w=256&q=75)
flashdiffusion
Accelerating any conditional diffusion model for few-step image generation https://gojasper.github.io/flash-diffusion-project/
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fflashdiffusion%2Fmain%2Ficon.webp&w=256&q=75)
StableAudio
An Open Source Model for Audio Samples and Sound Design https://github.com/Stability-AI/stable-audio-tools
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fstableaudio%2Fmain%2Ficon.jpeg&w=256&q=75)
PCM
Phased Consistency Model - generate high quality images with 2 steps https://huggingface.co/spaces/radames/Phased-Consistency-Model-PCM
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fpcm%2Fmain%2Ficon.png&w=256&q=75)
SillyTavern
A locally installed interface that lets you interact with text-generation AIs (LLMs) to chat and roleplay with custom characters. https://docs.sillytavern.app/
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fsillytavern%2Fmain%2Ficon.png&w=256&q=75)
AITown
Build and customize your own version of AI town - a virtual town where AI characters live, chat and socialize https://github.com/a16z-infra/ai-town
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Faitown%2Fmain%2Ficon.png&w=256&q=75)
LlamaFactory
Unify Efficient Fine-Tuning of 100+ LLMs https://github.com/hiyouga/LLaMA-Factory
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fllamafactory%2Fmain%2Ficon.jpeg&w=256&q=75)
openui
Describe UI and see it rendered live. Ask for changes and convert HTML to React, Svelte, Web Components, etc. Like vercel v0, but open source https://github.com/wandb/openui
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fpinokiofactory%2Fopenui%2Fmain%2Ficon.png&w=256&q=75)
StoryDiffusion Comics
create a story by generating consistent images https://github.com/HVision-NKU/StoryDiffusion
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fstorydiffusion-comics%2Fmain%2Ficon.png&w=256&q=75)
ZeST
ZeST: Zero-Shot Material Transfer from a Single Image. Local port of https://huggingface.co/spaces/fffiloni/ZeST (Project: https://ttchengab.github.io/zest/)
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fzest%2Fmain%2Ficon.png&w=256&q=75)
Openvoice2
Openvoice 2 Web UI - A local web UI for Openvoice2, a multilingual voice cloning TTS https://x.com/myshell_ai/status/1783161876052066793
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fopenvoice2%2Fmain%2Ficon.png&w=256&q=75)
Lobe Chat
An open-source, modern-design ChatGPT/LLM UI and framework. Supports speech synthesis, multi-modal input, and an extensible (function-call) plugin system. https://github.com/lobehub/lobe-chat
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Flobe%2Fmain%2Ficon.png&w=256&q=75)
IDM-VTON
Improving Diffusion Models for Authentic Virtual Try-on in the Wild https://huggingface.co/spaces/yisol/IDM-VTON
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fidm-vton%2Fmain%2Ficon.png&w=256&q=75)
devika
Agentic AI Software Engineer https://github.com/stitionai/devika
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fdevika%2Fmain%2Ficon.png&w=256&q=75)
CosXL
Edit images with just a prompt. An unofficial demo for CosXL and CosXL Edit from Stability AI. https://huggingface.co/spaces/multimodalart/cosxl
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fcosxl%2Fmain%2Ficon.webp&w=256&q=75)
parler-tts
a lightweight text-to-speech (TTS) model that can generate high-quality speech with features that can be controlled using a simple text prompt (e.g. gender, background noise, speaking rate, pitch and reverberation). https://huggingface.co/spaces/parler-tts/parler_tts_mini
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fparler-tts%2Fmain%2Ficon.png&w=256&q=75)
instantstyle
Upload a reference image and generate new images in that image's style. Instant generation with no LoRA required. https://huggingface.co/spaces/InstantX/InstantStyle
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Finstantstyle%2Fmain%2Ficon.jpeg&w=256&q=75)
face-to-all
Diffusers InstantID + ControlNet, inspired by face-to-many by fofr (https://x.com/fofrAI) - a local version of https://huggingface.co/spaces/multimodalart/face-to-all
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fface-to-all%2Fmain%2Ficon.png&w=256&q=75)
CustomNet
A unified encoder-based framework for object customization in text-to-image diffusion models https://huggingface.co/spaces/TencentARC/CustomNet
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fcustomnet%2Fmain%2Ficon.png&w=256&q=75)
spright
Generate images with spatial accuracy https://huggingface.co/spaces/SPRIGHT-T2I/SPRIGHT-T2I
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fspright%2Fmain%2Ficon.webp&w=256&q=75)
brushnet
A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion https://huggingface.co/spaces/TencentARC/BrushNet
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fbrushnet%2Fmain%2Ficon.gif&w=256&q=75)
Arc2Face
A Foundation Model of Human Faces https://huggingface.co/spaces/FoivosPar/Arc2Face
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Farc2face%2Fmain%2Ficon.gif&w=256&q=75)
supir
[NVIDIA ONLY] Text-driven, intelligent restoration, blending AI technology with creativity to give every image a brand new life https://supir.xpixel.group
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fsupir%2Fmain%2Ficon.png&w=256&q=75)
moondream2
a tiny vision language model that kicks ass and runs anywhere https://github.com/vikhyat/moondream
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fmoondream2%2Fmain%2Ficon.png&w=256&q=75)
ZETA
Zero-Shot Text-Based Audio Editing Using DDPM Inversion https://huggingface.co/spaces/hilamanor/audioEditing
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fzeta%2Fmain%2Ficon.png&w=256&q=75)
differential-diffusion-ui
Differential Diffusion modifies an image according to a text prompt and a map that specifies the amount of change in each region. https://differential-diffusion.github.io/
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fdifferential-diffusion-ui%2Fmain%2Ficon.png&w=256&q=75)
dust3r
Geometric 3D Vision Made Easy https://dust3r.europe.naverlabs.com/
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fdust3r%2Fmain%2Ficon.png&w=256&q=75)
Chatbot-Ollama
An open-source chat UI for Ollama. https://github.com/ivanfioravanti/chatbot-ollama
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fchatbot-ollama%2Fmain%2Ficon.png&w=256&q=75)
remove-video-bg
Video background removal tool https://huggingface.co/spaces/amirgame197/Remove-Video-Background
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fremove-video-bg%2Fmain%2Ficon.png&w=256&q=75)
MeloTTS
High-quality multilingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese, and Korean. https://github.com/myshell-ai/MeloTTS
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fmelotts%2Fmain%2Ficon.png&w=256&q=75)
gligen
An intuitive GUI for GLIGEN that uses ComfyUI in the backend https://github.com/mut-ex/gligen-gui
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fgligen%2Fmain%2Ficon.png&w=256&q=75)
Stable Cascade
Stable Cascade from StabilityAI
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fstablecascade%2Fmain%2Ficon.png&w=256&q=75)
Bark Voice Cloning
Upload a clean 20-second WAV file of the vocal persona you want to mimic, type your text-to-speech prompt, and hit submit! A local version of https://huggingface.co/spaces/fffiloni/instant-TTS-Bark-cloning
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fbark%2Fmain%2Ficon.png&w=256&q=75)
[NVIDIA GPU ONLY] LGM
LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation https://huggingface.co/spaces/ashawkey/LGM
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Flgm%2Fmain%2Ficon.gif&w=256&q=75)
BRIA RMBG
Background removal model developed by BRIA.AI, trained on a carefully selected dataset and available as an open-source model for non-commercial use. https://huggingface.co/spaces/briaai/BRIA-RMBG-1.4
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fbria-rmbg%2Fmain%2Ficon.webp&w=256&q=75)
VideoCrafter 2
[Runs fast on NVIDIA GPUs. Works on M1/M2/M3 Macs but slow] VideoCrafter is an open-source video generation and editing toolbox for crafting video content. It currently includes the Text2Video and Image2Video models https://github.com/AILab-CVC/VideoCrafter
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fvideocrafter2%2Fmain%2Ficon.png&w=256&q=75)
Moondream1
moondream1 is a tiny (1.6B parameter) vision language model trained by @vikhyatk that performs on par with models twice its size. It is trained on the LLaVa training dataset, and initialized with SigLIP as the vision tower and Phi-1.5 as the text encoder. https://huggingface.co/spaces/vikhyatk/moondream1
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fmoondream1%2Fmain%2Ficon.png&w=256&q=75)
InstantID
A state-of-the-art tuning-free method for ID-preserving generation from a single image, supporting various downstream tasks. https://instantid.github.io/
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Finstantid%2Fmain%2Ficon.webp&w=256&q=75)
PhotoMaker
Customizing Realistic Human Photos via Stacked ID Embedding https://github.com/TencentARC/PhotoMaker
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fphotomaker%2Fmain%2Ficon.png&w=256&q=75)
MAGNeT
MAGNeT is a text-to-music and text-to-sound model capable of generating high-quality audio samples conditioned on text descriptions https://github.com/facebookresearch/audiocraft/blob/main/docs/MAGNET.md
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fmagnet%2Fmain%2Ficon.webp&w=256&q=75)
vid2pose
Video to Openpose & DWPose (All OS supported) https://github.com/sdbds/vid2pose
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fvid2pose%2Fmain%2Ficon.gif&w=256&q=75)
Moore-AnimateAnyone-Mini
[NVIDIA ONLY] Efficient Implementation of Animate Anyone (13GB VRAM + 2GB model size) https://github.com/sdbds/Moore-AnimateAnyone-for-windows
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fmoore-animateanyone-mini%2Fmain%2Ficon.jpeg&w=256&q=75)
Moore-AnimateAnyone
[NVIDIA GPU ONLY] Unofficial Implementation of Animate Anyone https://github.com/MooreThreads/Moore-AnimateAnyone
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fmoore-animateanyone%2Fmain%2Ficon.jpeg&w=256&q=75)
OpenVoice
Instantly clone any voice from any text to any speech, in any language https://huggingface.co/spaces/myshell-ai/OpenVoice
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fopenvoice%2Fmain%2Ficon.webp&w=256&q=75)
IP-Adapter-FaceID
Enter a face image and transform it into any other image. A demo for the h94/IP-Adapter-FaceID model. https://huggingface.co/spaces/multimodalart/Ip-Adapter-FaceID
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Ffaceid%2Fmain%2Ficon.png&w=256&q=75)
StreamDiffusion
[NVIDIA ONLY] A Pipeline-Level Solution for Real-Time Interactive Generation https://github.com/cumulo-autumn/StreamDiffusion
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fstreamdiffusion%2Fmain%2Ficon.png&w=256&q=75)
dreamtalk
When Expressive Talking Head Generation Meets Diffusion Probabilistic Models (https://github.com/ali-vilab/dreamtalk)
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fdreamtalk%2Fmain%2Ficon.gif&w=256&q=75)
Stable Diffusion web UI
One-click launcher for Stable Diffusion web UI (AUTOMATIC1111/stable-diffusion-webui)
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fautomatic1111%2Fmain%2Ficon.png&w=256&q=75)
Video2Openpose
Turn any video into Openpose video https://huggingface.co/spaces/fffiloni/video2openpose2
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fvid2openpose%2Fmain%2Ficon.gif&w=256&q=75)
StyleAligned
Style Aligned Image Generation via Shared Attention https://style-aligned-gen.github.io/
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2FStyleAligned.pinokio%2Fmain%2Ficon.png&w=256&q=75)
Video2Openpose
Turn any video into Openpose video https://huggingface.co/spaces/fffiloni/video2openpose2
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Fvid2openpose.pinokio%2Fmain%2Ficon.gif&w=256&q=75)
MagicAnimate Mini
[NVIDIA GPU Only] An optimized version of MagicAnimate https://github.com/sdbds/magic-animate-for-windows
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2FMagicAnimateMini%2Fmain%2Ficon.gif&w=256&q=75)
Vid2DensePose
Convert your videos to densepose and use it on MagicAnimate https://github.com/Flode-Labs/vid2densepose
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Fdensepose.pinokio%2Fmain%2Ficon.png&w=256&q=75)
MagicAnimate
[NVIDIA GPU Only] Temporally Consistent Human Image Animation using Diffusion Model https://showlab.github.io/magicanimate/
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2FMagicAnimate.pinokio%2Fmain%2Ficon.gif&w=256&q=75)
Realtime StableDiffusion
Demo showcasing a near-real-time Latent Consistency Model pipeline with Diffusers and an MJPEG stream server (https://github.com/radames/Real-Time-Latent-Consistency-Model)
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Frealtime-lcm.pinokio%2Fmain%2Ficon.png&w=256&q=75)
lavie
Text-to-Video (T2V) generation framework from Vchitect https://github.com/Vchitect/LaVie
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Flavie.pinokio%2Fmain%2Ficon.jpeg&w=256&q=75)
LEDITS++
Limitless Image Editing using Text-to-Image Models
![image](/_next/image?url=https%3A%2F%2Fhuggingface.co%2Fspaces%2Fcocktailpeanut%2Fleditsplusplus%2Fraw%2Fmain%2Fmagician.png&w=256&q=75)
Diffusers SDXL Turbo
Demo showcasing a near-real-time Latent Consistency Model pipeline with Diffusers and an MJPEG stream server (https://github.com/radames/Real-Time-Latent-Consistency-Model)
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Fdiffusers-sdxl-turbo%2Fmain%2Ficon.png&w=256&q=75)
sdxl turbo
A Real-Time Text-to-Image Generation Model
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Fsdxl-turbo%2Fmain%2Ficon.png&w=256&q=75)
Stable Video Diffusion
[NVIDIA ONLY] Stable Video Diffusion Streamlit app.
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Fsvd.pinokio%2Fmain%2Ficon.gif&w=256&q=75)
DEUS
A Realtime Creation Engine
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanutlabs%2Fdeus%2Fmain%2Ficon.png&w=256&q=75)
Mirror
An AI powered mirror
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Fmirror%2Fmain%2Ficon.webp&w=256&q=75)
Realtime BakLLaVA
llama.cpp with the BakLLaVA model describes what it sees (https://github.com/Fuzzy-Search/realtime-bakllava)
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Fbakllava.pinokio%2Fmain%2Ficon.png&w=256&q=75)
LP-MusicCaps
LLM-Based Pseudo Music Captioning
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Flp_music_caps.pinokio%2Fmain%2Ficon.png&w=256&q=75)
AudioSep
Separate Anything You Describe (https://huggingface.co/spaces/Audio-AGI/AudioSep)
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2FAudioSep.pinokio%2Fmain%2Ficon.jpeg&w=256&q=75)
LCM
Fast image generator using Latent Consistency Models. https://replicate.com/blog/run-latent-consistency-model-on-mac
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Flcm.pinokio%2Fmain%2Ficon.png&w=256&q=75)
Text Generation WebUI
A Gradio web UI for Large Language Models https://github.com/oobabooga/text-generation-webui
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Foobabooga.pinokio%2Fmain%2Ficon.png&w=256&q=75)
IllusionDiffusion
Generate stunning illusion artwork with Stable Diffusion (a space by @angrypenguinPNG, created with Monster Labs' QR ControlNet).
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Fillusion.pinokio%2Fmain%2Ficon.png&w=256&q=75)
XTTS
Clone voices into different languages using just a quick 3-second audio clip (a local version of https://huggingface.co/spaces/coqui/xtts)
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Fxtts.pinokio%2Fmain%2Ficon.png&w=256&q=75)
RVC
1 Click Installer for Retrieval-based-Voice-Conversion-WebUI (https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI)
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Frvc.pinokio%2Fmain%2Ficon.png&w=256&q=75)
kohya_ss
1 Click Installer for kohya_ss, a Stable Diffusion LoRa & Dreambooth WebUI (https://github.com/bmaltais/kohya_ss)
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Fkohya.pinokio%2Fmain%2Ficon.png&w=256&q=75)
Tokenflow
Temporally consistent video editing. A local version of https://huggingface.co/spaces/weizmannscience/tokenflow
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Ftokenflow.pinokio%2Fmain%2Ficon.gif&w=256&q=75)
ModelScope Image2Video (Nvidia GPU only)
Turn any image into a video! (Web UI created by fffiloni: https://huggingface.co/spaces/fffiloni/MS-Image2Video)
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Fms-image2video.pinokio%2Fmain%2Ficon.png&w=256&q=75)
VALL-E-X
An open source implementation of Microsoft's VALL-E X zero-shot TTS model
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2FVALL-E-X.pinokio%2Fmain%2Ficon.jpeg&w=256&q=75)
DenseDiffusion
Dense Text-to-Image Generation with Attention Modulation
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Fdensediffusion.pinokio%2Fmain%2Ficon.png&w=256&q=75)
LoRA the Explorer
Stable Diffusion LoRA Playground (HuggingFace: https://huggingface.co/spaces/multimodalart/LoraTheExplorer)
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2FLTE.pinokio%2Fmain%2Ficon.png&w=256&q=75)
1 Click Control-Lora for ComfyUI
Install Control-Lora Models and Workflows to ComfyUI with 1 click
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Fcontrol-lora.comfyui.pinokio%2Fmain%2Ficon.png&w=256&q=75)
LDM 3D
[NVIDIA GPU ONLY] One click installer for Intel's ldm3d
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Fldm3d.pinokio%2Fmain%2Ficon.png&w=256&q=75)
Audio Webui
A web UI for various audio-related neural networks
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Faudio-webui.pinokio%2Fmain%2Ficon.png&w=256&q=75)
AudioLDM 2
[Nvidia GPU only] One click installer for AudioLDM 2 Gradio UI
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2FAudioLDM2.pinokio%2Fmain%2Ficon.png&w=256&q=75)
AudioGradio
One click installer for AudioCraft MusicGen and AudioGen Gradio UI (Requires at least Pinokio v0.0.56)
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Faudiogradio.pinokio%2Fmain%2Ficon.png&w=256&q=75)
AnimateDiff
Install the AnimateDiff Automatic1111 extension and its models with one click
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Fanimatediff.pinokio%2Fmain%2Ficon.jpeg&w=256&q=75)
Xorbits Inference
LLM Web UI and API
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Fxinference.pinokio%2Fmain%2Ficon.png&w=256&q=75)
llamacpp
Port of Facebook's LLaMA model in C/C++
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Fllamacpp.pinokio%2Fmain%2Ficon.png&w=256&q=75)
Pinokio Tutorial
Simple script examples that highlight all the Pinokio APIs (a minimal script sketch follows below)
![image](/_next/image?url=https%3A%2F%2Fraw.githubusercontent.com%2Fcocktailpeanut%2Ftutorial.pinokio%2Fmain%2Ficon.png&w=256&q=75)
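As a rough illustration of what such scripts look like, here is a minimal, hypothetical install script. The repository URL and folder names are placeholders, and the steps assume Pinokio's documented run-array format with the shell.run method.

```js
// install.js - minimal Pinokio install script sketch (hypothetical repo and
// paths; assumes the documented "run" array and shell.run method)
module.exports = {
  run: [
    {
      // Clone the application into a local "app" folder
      method: "shell.run",
      params: { message: "git clone https://github.com/example/app app" }
    },
    {
      // Install Python dependencies inside a virtual env within "app"
      method: "shell.run",
      params: {
        path: "app",
        venv: "env",
        message: "pip install -r requirements.txt"
      }
    }
  ]
}
```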
Latest
Latest Pinokio scripts from the community (tagged as 'pinokio' on GitHub)