Files in v1.77:

Name                               Modified    Size
koboldcpp_nocuda.exe               2024-11-01  55.5 MB
koboldcpp_cu12.exe                 2024-11-01  587.1 MB
koboldcpp.exe                      2024-11-01  468.8 MB
koboldcpp-mac-arm64                2024-11-01  26.7 MB
koboldcpp-linux-x64-nocuda         2024-11-01  55.0 MB
koboldcpp-linux-x64-cuda1210       2024-11-01  661.2 MB
koboldcpp-linux-x64-cuda1150       2024-11-01  576.0 MB
koboldcpp_oldcpu.exe               2024-11-01  469.1 MB
koboldcpp-1.77 source code.tar.gz  2024-11-01  22.7 MB
koboldcpp-1.77 source code.zip     2024-11-01  23.2 MB
README.md                          2024-11-01  4.3 kB

Totals: 11 items, 2.9 GB

koboldcpp-1.77

the road not taken edition


  • NEW: Token Probabilities (logprobs) are now available over the API! Currently they are only returned by the synchronous (non-streaming) API, but a dedicated /api/extra/last_logprobs endpoint is also provided (see the sketch after this changelog). If "logprobs" is enabled in the KoboldAI Lite settings, a link to view alternate token probabilities is shown for both streaming and non-streaming responses. This will also work in SillyTavern when streaming is disabled, once its latest build is out.
  • Response prompt_tokens, completion_tokens and total_tokens are now accurate values instead of placeholders.
  • Enabled CUDA graphs for the cuda12 build, which can improve performance on some cards.
  • Fixed a bug where .wav audio files uploaded directly to the /v1/audio/transcriptions endpoint were fragmented and cut off early (a direct-upload sketch follows the changelog). Audio sent as base64 within a JSON payload is unaffected.
  • Fixed a bug where Whisper transcription blocked generation in non-multiuser mode.
  • Fixed a bug where trim_stop did not remove a stop sequence that was divided across multiple tokens in some cases.
  • Significantly increased the maximum limits for stop sequences, anti-slop token bans, logit biases and DRY sequence breakers (thanks to @mayaeary for the PR, which changes the way some parameters are passed to the CPP side).
  • Added link to help page if user fails to select a model.
  • The Flash Attention toggle in the GUI quick launcher is now hidden by default when Vulkan is selected (it usually reduces performance there).
  • Updated Kobold Lite with multiple fixes and improvements:
  • NEW: Experimental ComfyUI support added! ComfyUI can now be used as an image generation backend API from within KoboldAI Lite. No workflow customization is necessary. Note: ComfyUI must be launched with the flags --listen --enable-cors-header '*' to enable API access. You can then use it like any other image generation backend.
  • Clarified the option for selecting A1111/Forge/KoboldCpp as an image gen backend, since Forge is gradually superseding A1111. This option is compatible with all 3 of the above.
  • You can now generate images from instruct mode via natural language, similar to ChatGPT (e.g. "Please generate an image of a bag of sand"). This requires an image model to be loaded; it uses regex matching, is enabled by default, and can be disabled in settings.
  • Added support for Tavern "V3" character cards. Note that V3 is not a distinct format; it's an augmented V2 card used by Risu that adds extra metadata chunks. Those chunks are not supported in Lite, but the base V2 card functionality will work.
  • Added new scenario "Interactive Storywriter": This is similar to story writing mode, but allows you to secretly steer the story with hidden instruction prompts.
  • Added Token Probability Viewer - You can now see a table of alternative token probabilities in responses. Disabled by default, enable in advanced settings.
  • Fixed JSON file selection problems in some mobile browsers.
  • Fixed Aetherroom importer.
  • Minor Corpo UI layout tweaks by @Ace-Lite
  • Merged fixes and improvements from upstream
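
For those integrating against the API, here is a minimal sketch of requesting token probabilities. The /api/extra/last_logprobs endpoint name comes from the notes above; the "logprobs" request flag and the exact response layout are assumptions, so check --help and the bundled API docs:

```python
# Minimal sketch: ask for token probabilities over the KoboldCpp HTTP API.
# Assumes a server running at the default http://localhost:5001.
import requests

BASE = "http://localhost:5001"

# 1) A normal synchronous (non-streaming) generation, asking for logprobs.
#    The "logprobs" flag name is an assumption based on the release notes.
gen = requests.post(f"{BASE}/api/v1/generate", json={
    "prompt": "The quick brown fox",
    "max_length": 16,
    "logprobs": True,
})
gen.raise_for_status()
print(gen.json()["results"][0]["text"])

# 2) Fetch alternate token probabilities for the last finished generation
#    via the dedicated endpoint (assumed POST; response layout may differ).
lp = requests.post(f"{BASE}/api/extra/last_logprobs", json={})
lp.raise_for_status()
print(lp.json())
```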
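
Similarly, a direct .wav upload to the transcription endpoint might look like the following. The endpoint path is from the fix above; the multipart field name follows the OpenAI convention and is an assumption:

```python
# Minimal sketch: direct .wav upload to the OpenAI-compatible
# /v1/audio/transcriptions endpoint.
import requests

with open("sample.wav", "rb") as f:  # hypothetical local audio file
    r = requests.post(
        "http://localhost:5001/v1/audio/transcriptions",
        files={"file": ("sample.wav", f, "audio/wav")},
    )
r.raise_for_status()
print(r.json().get("text"))  # transcribed text, per the OpenAI response shape
```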

To use, download and run koboldcpp.exe, which is a one-file pyinstaller build. If you don't need CUDA, you can use koboldcpp_nocuda.exe, which is much smaller. If you have an Nvidia GPU but an old CPU and koboldcpp.exe does not work, try koboldcpp_oldcpu.exe. If you have a newer Nvidia GPU, you can use the CUDA 12 version, koboldcpp_cu12.exe (much larger, slightly faster). If you're using Linux, select the appropriate Linux binary instead (not an .exe). If you're on a modern macOS machine (M1, M2, M3), you can try the koboldcpp-mac-arm64 binary. If you're using AMD, you can try the koboldcpp_rocm build from YellowRoseCx's fork.

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI. Once the model is loaded, you can connect at http://localhost:5001 (or use the full KoboldAI client).
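
Once the server is up, a minimal completion request looks like this (the /api/v1/generate endpoint and response shape follow the KoboldAI API; the sampler values here are arbitrary examples):

```python
# Minimal sketch: request a completion from a running KoboldCpp instance.
import requests

resp = requests.post("http://localhost:5001/api/v1/generate", json={
    "prompt": "Once upon a time,",
    "max_length": 64,       # number of tokens to generate
    "temperature": 0.7,     # arbitrary example sampler setting
})
resp.raise_for_status()
print(resp.json()["results"][0]["text"])
```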

For more information, be sure to run the program from command line with the --help flag.

Source: README.md, updated 2024-11-01