OCR Software for BSD

OCR BSD Clear Filters

Browse free open source OCR software and projects for BSD below. Use the toggles on the left to filter open source OCR software by OS, license, language, programming language, and project status.

  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Skillfully - The future of skills based hiring Icon
    Skillfully - The future of skills based hiring

    Realistic Workplace Simulations that Show Applicant Skills in Action

    Skillfully transforms hiring through AI-powered skill simulations that show you how candidates actually perform before you hire them. Our platform helps companies cut through AI-generated resumes and rehearsed interviews by validating real capabilities in action. Through dynamic job specific simulations and skill-based assessments, companies like Bloomberg and McKinsey have cut screening time by 50% while dramatically improving hire quality.
    Learn More
  • 1
    OpenKM Document Management - DMS

    OpenKM Document Management - DMS

    Document Management System and Content Management System

    OpenKM is a electronic document management system and record management system EDRMS ( DMS, RMS, CMS ). It provides modern and flexible architecture that meet today's IT demands, based on open technology (Java, Tomcat, GWT, Lucene, Hibernate, Spring and jBPM), powerful and scalable multiplatform application. OpenKM is a Web 2.0 application that works with Internet Explorer, Firefox, Safari and Opera. Can be configured in major DMBS like Oracle, PostgreSQL and MySQL among others. Due to its technological architecture design, OpenKM meets the document management needs of businesses of all sizes (from SMEs to big corporations). Thanks to its elegant and intuitive interface, OpenKM transforms complex operations into easy tasks. The most relevant functions of OpenKM is the indexing of the most common types of files: text, Office, Office 2007, OpenOffice, PDF, HTML, XML, MP3, JPEG, etc. For a complete feature list take a look at http://goo.gl/au8cQy
    Leader badge
    Downloads: 415 This Week
    Last Update:
    See Project
  • 2
    Provides optical character recognition (OCR) solutions for Vietnamese language.
    Leader badge
    Downloads: 349 This Week
    Last Update:
    See Project
  • 3
    pdfsandwich generates "sandwich" OCR pdf files, i.e. pdf files which contain only images (but no editable text) will be processed by optical character recognition (OCR) and the text will be added to each page invisibly "behind" the images. pdfsandwich is a command line tool which is supposed to be useful to OCR scanned books or journals. It is able to recognize the page layout even for multicolumn text. Essentially, pdfsandwich is a wrapper script which calls the following binaries: convert, unpaper, tesseract, gs, and hocr2pdf (if tesseract < 3.03). It is known to run on Unix systems and has been tested on Linux and MacOS X. It supports parallel processing on multiprocessor systems. In contrast to most competing sandwich programs, it performs preprocessing of the scanned images, such as de-skewing or removal of dark edges etc. For further information please read the manual: http://www.tobias-elze.de/pdfsandwich/index.html
    Leader badge
    Downloads: 290 This Week
    Last Update:
    See Project
  • 4
    gImageReader

    gImageReader

    A graphical frontend to tesseract-ocr

    gImageReader is a simple Gtk/Qt front-end to tesseract. Features include: - Import PDF documents and images from disk, scanning devices, clipboard and screenshots - Process multiple images and documents in one go - Manual or automatic recognition area definition - Recognize to plain text or to hOCR documents - Recognized text displayed directly next to the image - Post-process the recognized text, including spellchecking - Generate PDF documents from hOCR documents **Note**: This page is only a mirror for the downloads. Development is happening on github at https://github.com/manisandro/gImageReader, release binaries are also posted there.
    Leader badge
    Downloads: 178 This Week
    Last Update:
    See Project
  • eProcurement Software Icon
    eProcurement Software

    Enterprises and companies seeking a solution to manage all their procurement operations and processes

    eBuyerAssist by Eyvo is a cloud-based procurement solution designed for businesses of all sizes and industries. Fully modular and scalable, it streamlines the entire procurement lifecycle—from requisition to fulfillment. The platform includes powerful tools for strategic sourcing, supplier management, warehouse operations, and contract oversight. Additional modules cover purchase orders, approval workflows, inventory and asset management, customer orders, budget control, cost accounting, invoice matching, vendor credit checks, and risk analysis. eBuyerAssist centralizes all procurement functions into a single, easy-to-use system—improving visibility, control, and efficiency across your organization. Whether you're aiming to reduce costs, enhance compliance, or align procurement with broader business goals, eBuyerAssist helps you get there faster, smarter, and with measurable results.
    Learn More
  • 5
    A GUI to ease the process of producing a multipage PDF from a scan. gscan2pdf should work on almost any Linux/BSD machine.
    Leader badge
    Downloads: 165 This Week
    Last Update:
    See Project
  • 6
    A Java JNA wrapper for Tesseract OCR API
    Leader badge
    Downloads: 70 This Week
    Last Update:
    See Project
  • 7
    DeepSeek-OCR

    DeepSeek-OCR

    Contexts Optical Compression

    DeepSeek-OCR is an open-source optical character recognition solution built as part of the broader DeepSeek AI vision-language ecosystem. It is designed to extract text from images, PDFs, and scanned documents, and integrates with multimodal capabilities that understand layout, context, and visual elements beyond raw character recognition. The system treats OCR not simply as “read the text” but as “understand what the text is doing in the image”—for example distinguishing captions from body text, interpreting tables, or recognizing handwritten versus printed words. It supports local deployment, enabling organizations concerned about privacy or latency to run the pipeline on-premises rather than send sensitive documents to third-party cloud services. The codebase is written in Python with a focus on modularity: you can swap preprocessing, recognition, and post-processing components as needed for custom workflows.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 8
    CD+Graphics Magic
    Timeline based editor for creating Compact Disc Subcode Graphics (also known as CD+G or CDG). Both karaoke and multimedia styles of content are supported. Please visit cdgmagic.sf.net for examples playable directly in the HTML5 CD+G player. CD+Graphics Scribe utility (separate download -- click "Browse All Files" above) can now convert existing CDG karaoke content to CMP (CD+Graphics Magic Project), LRC (Enhanced Lyrics), and ASS (Advanced SubStation Alpha) format.
    Leader badge
    Downloads: 37 This Week
    Last Update:
    See Project
  • 9
    chessPDFBrowser

    chessPDFBrowser

    Chess application whichs allows working with chess PDF books and PGNs.

    Chess application which allows working with PDFs and PGNs. You can work with the chess games of the PDF and edit their tree of variants. Graphical environment. Standard PGN TAGs. PGN comments. Ocr like (Fen string detection from chess board position images). Connection to Uci chess engines (like stockfish). Position analysis, full game analysis. You can now play games against uci engines. pdf2pgn command line command included. Detailed documentation. Multilanguage currently support for English, Spanish and Catalan. Dark mode option. JDK-17 compatibility
    Downloads: 68 This Week
    Last Update:
    See Project
  • Say goodbye to broken revenue funnels and poor customer experiences Icon
    Say goodbye to broken revenue funnels and poor customer experiences

    Connect and coordinate your data, signals, tools, and people at every step of the customer journey.

    LeanData is a Demand Management solution that supports all go-to-market strategies such as account-based sales development, geo-based territories, and more. LeanData features a visual, intuitive workflow native to Salesforce that enables users to view their entire lead flow in one interface. LeanData allows users to access the drag-and-drop feature to route their leads. LeanData also features an algorithms match that uses multiple fields in Salesforce.
    Learn More
  • 10
    A free OCR-A font, conformant to ANSI X3.17-1977, in TrueType format, with sources.
    Leader badge
    Downloads: 55 This Week
    Last Update:
    See Project
  • 11
    Linux-Intelligent-Ocr-Solution

    Linux-Intelligent-Ocr-Solution

    Easy-OCR solution and Tesseract trainer for GNU/Linux

    Linux-intelligent-ocr-solution Lios is a free and open source software for converting print in to text using either scanner or a camera, It can also produce text out of scanned images from other sources such as Pdf, Image, Folder containing Images or screenshot. Program is given total accessibility for visually impaired. A Tesseract Trainer GUI is also shipped with this package. Forum : https://groups.google.com/forum/#!forum/lios Video Tutorial : https://www.youtube.com/playlist?list=PLn29o8rxtRe1zS1r2-yGm1DNMOZCgdU0i Tesseract Training Tutorial (beta) : https://www.youtube.com/watch?v=qLpCld4cdtk Source Code Github : https://github.com/Nalin-x-Linux/lios-3 Gitlab : https://gitlab.com/Nalin-x-Linux/lios-3 User guide is available in download page
    Downloads: 7 This Week
    Last Update:
    See Project
  • 12
    yagf

    yagf

    YAGF is a tesseract and cuneiform wrapper and helper*

    YAGF is a graphical front-end for cuneiform and tesseract OCR tools. With YAGF you can open already scanned image files or obtain new images via XSane (scanning results are automatically passed to YAGF). Once you have a scanned image you can prepare it for recognition, select particular image areas for recognition, set the recognition language and so on. Recognized text is displayed in a editor window where it can be corrected, saved to disk or copied to clipboard. YAGF also provides some facilities for a multi-page recognition (see the online help for more details).
    Leader badge
    Downloads: 4 This Week
    Last Update:
    See Project
  • 13
    Neuroph OCR - Handwriting Recognition
    Neuroph OCR - Handwriting Recognition is developed to recognize hand written letter and characters. It's engine derived's from the Java Neural Network Framework - Neuroph and as such it can be used as a standalone project or a Neuroph plug in.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 14
    DJVU++

    DJVU++

    The DjVu complete solution,with OCR Technology(Arabic ,English).

    DjVu++ is a user-friendly program that used to manipulate DjVu file formats such as eBooks with a penalty of editing features. The program introduce a free replacement for the property PDF format with similar resolution and smaller file size DjVu++ also support OCR to handle text in scanned books and images. The program shows good performance for English. In addition to the Arabic language to lead free and commercial software in this area. The main features of DjVu++ program are: o Manipulate DjVu files. o Support smaller size than PDF with the same performance. o DjVu++ supports two languages in the OCR technique (Arabic and English). o Read multiple documents at the same time with the new tabs feature. o DjVu++ supports multiple formats:  Convert PDF document into DjVu format with smaller file size and the same performance.  Convert DjVu into PDF format.  Combine images to a single DjVu document. Perform OCR operations on multiple image formats.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 15

    phpSANE

    Web-Based Frontend for SANE

    phpSANE is a web-based frontend for SANE written in HTML/PHP so you can scan with your web-browser. It also supports OCR.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 16
    Common Resource Grep - crgrep

    Common Resource Grep - crgrep

    Common Resource Grep

    CRGREP searches for matching text in databases, various document formats, archives and other difficult to access resources. A command line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image meta-data, scanned documents, maven dependencies and web resources. CRGREP will search resources within resources of any arbitrary combination or depth, so text within a document within a zip archive, and so on. Here you will find binary downloads and discussion (https://sourceforge.net/p/crgrep/discussion/) . The actual development and issue tracking can be found here: https://bitbucket.org/cryanfuse/crgrep
    Downloads: 1 This Week
    Last Update:
    See Project
  • 17
    Tifftool is a high-performance tool to clean scanned documents in preparation for onscreen display or for OCR. Features include skew correction, orientation correction, despeckle, page alignment, split pages and batch processing.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 18
    An image postprocessor for the DIY Book Scanner described on instructables.com and diybookscanner.org. Gets images ready for OCR or for PDF. Written in Java based on a partial port of the Leptonica image processing library.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    Primary goal of Imated is development of handwritten/machine printed - OCR system. And second goal is development text editor, that will be in a position to import scanned documents OCR them on-the-fly, edit them and print/save as a picture again.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    hocr - Hebrew OCR c/c++ library
    Downloads: 1 This Week
    Last Update:
    See Project
  • 21
    Akshara Malayalam OCR is a project for the development of an OCR for printed and handwritten documents in Malayalam language. The inspiration is from similar OCR softwares in other languages etc.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    The Common OCR Service Interface. COSI is an API that allows developpers to easily bring OCR (Optical Character Recognition) capabilities to image processing applications. COSI supports existing OCR tools such as Tesseract, GOCR or GNU Ocrad.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Conjecture is a modular, extensible, open-source C++ framework for Optical Character Recognition (OCR). It is not a single OCR, but rather an extensible collection of OCRs that can be explored, compared, extended and modified within a unified environment
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Optical Character Recognition (OCR) software.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    DigitalEyes is an OCR (Optical Character Recognizer) developed in C/Caml released under GNU GPL by SuM42, as a sophomore project in EPITA.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • Next