Name | Modified | Size
---|---|---
koboldcpp.exe | 2025-04-21 | 507.8 MB
koboldcpp-mac-arm64 | 2025-04-21 | 26.9 MB
koboldcpp-linux-x64-nocuda | 2025-04-21 | 77.8 MB
koboldcpp-linux-x64-cuda1210 | 2025-04-21 | 701.2 MB
koboldcpp-linux-x64-cuda1150 | 2025-04-21 | 612.7 MB
koboldcpp_oldcpu.exe | 2025-04-21 | 507.9 MB
koboldcpp_nocuda.exe | 2025-04-21 | 77.1 MB
koboldcpp_cu12.exe | 2025-04-21 | 624.0 MB
koboldcpp-1.89 source code.tar.gz | 2025-04-20 | 27.7 MB
koboldcpp-1.89 source code.zip | 2025-04-20 | 28.1 MB
README.md | 2025-04-20 | 3.9 kB

Totals: 11 items, 3.2 GB
koboldcpp-1.89
https://github.com/user-attachments/assets/a2969fa0-e637-459d-918a-9eab056a0b94
- NEW: Improved NoScript mode - NoScript mode now has chat mode and image generation support entirely without JavaScript! Access it by default at http://localhost:5001/noscript in your browser. Tested to work on Internet Explorer 5, Netscape Navigator 4, NetSurf, Lynx, Dillo, and basically any browser made after 1999.
- Added new launcher flags `--overridekv` and `--overridetensors`, which work in the same way as llama.cpp's flags. `--overridekv` allows you to specify a single metadata property to be overwritten; the input format is `keyname=type:value`. `--overridetensors` allows you to place tensors matching a pattern onto a specific backend; the input format is `tensornamepattern=buffertype`. Example invocations are sketched after this list.
- Enabled @jeffbolznv's coopmat2 support for Vulkan (supports flash attention, overall slightly faster). CM2 is only enabled if you have the latest Nvidia Game Ready Driver (576.02) and should provide all-round speedups. Note that the (OldCPU) Vulkan binaries will now exclude coopmat, coopmat2 and DP4A, so please use OldCPU mode if you encounter issues.
- Display available GPU memory when estimating layers
- Fixed RWKV model loading
- Added more sanity checks for Zenity and made YAD the default filepicker instead. If you still encounter issues, please select the Legacy TK filepicker on the Extras page and report the issue.
- Minor fixes for StableUI inpainting brush selection.
- Enabled usage of Image Generation LoRAs even with a quantized diffusion model (the LoRA should still be unquantized)
- Fixed a crash when using certain image LoRAs due to graph size limits. Also reverted CLIP quant to f32 changes.
- CLI mode fixes
- Updated Kobold Lite, multiple fixes and improvements
- IMPORTANT: Relocated the Tokens tab and WebSearch tab into the Settings panel (previously in the Context panel). Likewise, the regex and token sequence configs are now stored in settings rather than in the story (and will persist even on a new story).
- Fixed URLs not opening in a new tab
- Reworked thinking tag handling - now separates display and submit regex behaviors (3 modes each)
- Added Retain History toggle for WebSearch to retain some old search results on subsequent queries.
- Added an editable template for the character creator (by @PeterPeet)
- Increased to 10 local and 10 remote save slots.
- Removed aetherroom.club (dead site)
- Merged fixes and improvements from upstream
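
As a rough sketch of the two new flags, here is a hedged example (the model path, metadata key, and tensor pattern below are illustrative placeholders, assuming the same syntax as the corresponding llama.cpp flags):

```
# Overwrite a single GGUF metadata property at load time (key/type/value are illustrative)
./koboldcpp-linux-x64-nocuda --model mymodel.gguf \
    --overridekv tokenizer.ggml.add_bos_token=bool:false

# Keep tensors whose names match a pattern on the CPU backend (pattern is illustrative)
./koboldcpp-linux-x64-nocuda --model mymodel.gguf \
    --overridetensors "ffn_.*=CPU"
```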
To use, download and run koboldcpp.exe, which is a one-file pyinstaller. If you don't need CUDA, you can use koboldcpp_nocuda.exe, which is much smaller. If you have an Nvidia GPU but use an old CPU and koboldcpp.exe does not work, try koboldcpp_oldcpu.exe. If you have a newer Nvidia GPU, you can use the CUDA 12 version koboldcpp_cu12.exe (much larger, slightly faster). If you're using Linux, select the appropriate Linux binary instead (not the .exe). If you're on a modern macOS (M1, M2, M3), you can try the koboldcpp-mac-arm64 macOS binary. If you're using AMD, we recommend trying the Vulkan option (available in all releases) first for best support. Alternatively, you can try koboldcpp_rocm from YellowRoseCx's fork.
Run it from the command line with the desired launch parameters (see `--help`), or manually select the model in the GUI. Once loaded, you can connect at http://localhost:5001 (or use the full KoboldAI client). For more information, be sure to run the program from the command line with the `--help` flag. You can also refer to the readme and the wiki.
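
As a minimal end-to-end sketch (the model filename is a placeholder, and the request below uses the KoboldAI-compatible `/api/v1/generate` endpoint that koboldcpp serves; run with `--help` to confirm the options for your build):

```
# Start the server on the default port (model path is a placeholder)
./koboldcpp-linux-x64-nocuda --model mymodel.gguf --port 5001

# Once the model has loaded, request a completion from another terminal
curl -s http://localhost:5001/api/v1/generate \
    -H "Content-Type: application/json" \
    -d '{"prompt": "Once upon a time,", "max_length": 32}'
```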