Showing 92 open source projects for "ocr and data exporter"

  • 1
    Node exporter

    Exporter for machine metrics

    Power your metrics and alerting with a leading open-source monitoring solution. Prometheus implements a highly dimensional data model. Time series are identified by a metric name and a set of key-value pairs. PromQL allows slicing and dicing of collected time series data in order to generate ad-hoc graphs, tables, and alerts. Prometheus has multiple modes for visualizing data: a built-in expression browser, Grafana integration, and a console template language. Prometheus stores time series...
    Downloads: 40 This Week
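The data model mentioned in the Node exporter description (a time series is identified by a metric name plus a set of key-value pairs) can be illustrated with a small sketch. This is not Prometheus code: `TinyStore`, `series_key`, and `select` are hypothetical names invented for this example, and the metric name and labels are sample values only.

```python
from typing import Dict, FrozenSet, List, Tuple

# A series identity: metric name plus an order-independent set of label pairs.
SeriesKey = Tuple[str, FrozenSet[Tuple[str, str]]]
Sample = Tuple[float, float]  # (timestamp, value)


def series_key(name: str, labels: Dict[str, str]) -> SeriesKey:
    """Build a hashable identity from a metric name and its labels."""
    return (name, frozenset(labels.items()))


class TinyStore:
    """Hypothetical in-memory store keyed by series identity (illustration only)."""

    def __init__(self) -> None:
        self._series: Dict[SeriesKey, List[Sample]] = {}

    def append(self, name: str, labels: Dict[str, str],
               timestamp: float, value: float) -> None:
        # Samples with the same name and labels land in the same series.
        key = series_key(name, labels)
        self._series.setdefault(key, []).append((timestamp, value))

    def select(self, name: str, **matchers: str) -> Dict[SeriesKey, List[Sample]]:
        # Return every series with this name whose labels include all matchers.
        wanted = set(matchers.items())
        return {
            key: samples
            for key, samples in self._series.items()
            if key[0] == name and wanted <= key[1]
        }


store = TinyStore()
store.append("node_cpu_seconds_total", {"cpu": "0", "mode": "idle"}, 1.0, 10.0)
store.append("node_cpu_seconds_total", {"cpu": "0", "mode": "user"}, 1.0, 2.0)
store.append("node_cpu_seconds_total", {"cpu": "0", "mode": "idle"}, 2.0, 11.0)
idle_only = store.select("node_cpu_seconds_total", mode="idle")
```

Here two distinct series exist because the `mode` label differs, and the label matcher narrows the selection to one of them, which is roughly the kind of slicing the description attributes to PromQL.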
  • 2
    Prometheus SNMP Exporter

    SNMP Exporter for Prometheus

    This exporter is the recommended way to expose SNMP data in a format that Prometheus can ingest. To get started, use the if_mib module with switches, access points, or routers, together with the public_v2 auth module, which should be a read-only access community on the target device. Note that community strings in SNMP are not considered secrets, as they are sent unencrypted in SNMP v1 and v2c. For secure access, SNMP v3 is required.
    Downloads: 63 This Week
  • 3
    Rapid LaTeX OCR

    Formula recognition based on LaTeX-OCR and ONNXRuntime

    Formula recognition based on LaTeX-OCR and ONNXRuntime. rapid_latex_ocr is a tool that converts formula images to LaTeX. The inference code in the repo is adapted from LaTeX-OCR; the models have all been converted to ONNX format and the inference code has been simplified, so inference is faster and easier to deploy. The repo contains only inference code based on ONNXRuntime or OpenVINO for ONNX-format models and does not contain model training code. If you want to train your own model, please move...
    Downloads: 7 This Week
  • 4
    OCRmyPDF

    OCRmyPDF adds an OCR text layer to scanned PDF files

    OCRmyPDF adds an optical character recognition (OCR) text layer to scanned PDF files, allowing them to be searched. PDF is the best format for storing and exchanging scanned documents. Unfortunately, PDFs can be difficult to modify. OCRmyPDF makes it easy to apply image processing and OCR (recognized, searchable text) to existing PDFs.
    Downloads: 104 This Week
  • 5
    EcoPaste

    Open source clipboard management tools for Windows, macOS, and Linux

    Open source clipboard management tools for Windows, macOS, and Linux. Built with Tauri, the application is lightweight and refined, consuming minimal resources. It also delivers a uniform user experience across Windows, macOS, and Linux. The application stays resident in the background and wakes up with one click through custom shortcut keys, saving time and improving efficiency. It lets you bookmark clipboard content for easy and fast access. Whether it's crucial data for work...
    Downloads: 22 This Week
  • 6
    Paperless-ngx

    A community-supported supercharged version of paperless

    Paperless-ngx is a community-supported open-source document management system that transforms your physical documents into a searchable online archive so you can keep, well, less paper.
    Downloads: 15 This Week
  • 7
    DeckTape

    PDF exporter for HTML presentations

    DeckTape is a high-quality PDF exporter for HTML presentation frameworks. DeckTape is built on top of Puppeteer which relies on Google Chrome for laying out and rendering Web pages and provides a headless Chrome instance scriptable with a JavaScript API. DeckTape currently supports the following presentation frameworks out of the box. DeckTape also provides a generic command that works by emulating the end-user interaction, allowing it to be used to convert presentations from virtually any kind...
    Downloads: 9 This Week
  • 8
    MinerU

    A high-quality tool for converting PDF to Markdown and JSON

    MinerU is an open-source, high-quality document extraction toolkit focused on converting PDFs (and other document formats) into structured Markdown and JSON. It leverages OCR and layout analysis to preserve semantic structure and metadata, ideal for research and data science workflows.
    Downloads: 5 This Week
  • 9
    docconv

    Converts PDF, DOC, DOCX, XML, HTML, RTF, etc to plain text

    A Go wrapper library to convert PDF, DOC, DOCX, XML, HTML, RTF, ODT, Pages documents and images (see optional dependencies below) to plain text. See go help install for details on the installation location of the installed docd executable. Make sure that the full path to the executable is in your PATH environment variable. To add image support to the docconv library you first need to install and build gosseract. Now you can add -tags ocr to any go command when building/fetching/testing docconv...
    Downloads: 5 This Week
  • 10
    deepdoctection

    A Repo For Document AI

    DeepDoctection is a document AI framework that applies deep learning techniques to analyze and extract structured data from scanned documents, PDFs, and images. deepdoctection is a Python library that orchestrates document extraction and document layout analysis tasks using deep learning models. It does not implement models itself but lets you build pipelines from widely used libraries for object detection, OCR, and selected NLP tasks, and provides an integrated framework for fine-tuning...
    Downloads: 4 This Week
  • 11
    Doctor Dok

    Doctor Dok is an AI based medical data framework

    ... - digitalized - accessible anywhere from mobile or desktop. Using AI, you can translate your health records into one of 50+ languages, making health services abroad more accessible. Doctor Dok uses AI to OCR even a barely readable photo of your health documents, then stores it in the cloud with a Zero Trust Security architecture (nobody but you can decrypt the data).
    Downloads: 6 This Week
  • 12
    DeepDetect

    DeepDetect

    Deep Learning API and Server in C++14 with support for Caffe, PyTorch

    ... of image tagging, object detection, segmentation, OCR, Audio, Video, Text classification, CSV for tabular data and time series. Neural network templates for the most effective architectures for GPU, CPU, and Embedded devices. Training in a few hours and with small data thanks to 25+ pre-trained models. Full Open Source, with an ecosystem of tools (API clients, video, annotation, ...) Fast Server written in pure C++, a single codebase for Cloud, Desktop & Embedded.
    Downloads: 3 This Week
  • 13
    Texify

    Texify

    Math OCR model that outputs LaTeX and markdown

    Texify is an OCR model that converts images or pdfs containing math into markdown and LaTeX that can be rendered by MathJax ($$ and $ are delimiters). It can run on CPU, GPU, or MPS.
    Downloads: 1 This Week
  • 14
    Tarsier

    Tarsier

    Vision utilities for web interaction agents

    ... as buttons, links, or input fields that are visible on the page; Tarsier can also tag all textual elements if you pass tag_text_elements=True. Furthermore, we've developed an OCR algorithm to convert a page screenshot into a whitespace-structured string (almost like ASCII art) that an LLM even without vision can understand. Since current vision-language models still lack fine-grained representations needed for web interaction tasks, this is critical.
    Downloads: 0 This Week
  • 15
    Amplify

    Amplify

    Automatic enrichment, enhancement, and explanation of your data

    Amplify attaches afterburners to your data. Amplify explains metadata extraction, classification, tagging, and reporting. Enriches derivative data generation like thumbnails, previews, conversions, etc. Enhances batteries-included value-adds like data quality reports, image augmentation, OCR, translations, etc. Amplify leverages the decentralized compute provided by Bacalhau to magically enrich your data. A built-in suite of pipelines decides what your data is and how to best improve upon...
    Downloads: 2 This Week
  • 16
    MySQL 2 Excel Exporter 3-105 [I.S.A]

    MySQL 2 Excel: Exporter 3-105 [Improved.Simplified.Alternative]

    'MySQL2Excel_Exporter' is a desktop application developed using Python 3.6.8 and other add-on libraries. The application exports MySQL tables as an Excel file. MySQL2Excel_Exporter has two parts: 1) Export, which converts all records in a MySQL table into an Excel file; 2) Export Filter, which converts selected records in a MySQL table into an Excel file. Compatible only with Windows.
    Downloads: 1 This Week
  • 17
    NAPS2 - Not Another PDF Scanner

    Scan documents to PDF and other file types, as simply as possible.

    Visit NAPS2's home page at www.naps2.com. NAPS2 is a document scanning application with a focus on simplicity and ease of use. Scan your documents from WIA- and TWAIN-compatible scanners, organize the pages as you like, and save them as PDF, TIFF, JPEG, PNG, and other file formats. Available on Windows, Mac, and Linux. NAPS2 is currently available in over 40 different languages. Want to see NAPS2 in your preferred language? Help translate! See the wiki for more details.
    Downloads: 513 This Week
  • 18
    SimSail

    SimSail

    Free navigation, weather, and routing software

    SimSail is free software for navigation, weather analysis, sailing route planning, and performance analysis.
    Downloads: 66 This Week
  • 19
    DocWire SDK

    DocWire SDK

    Award-winning modern data processing SDK in C++20

    DocWire SDK, a standout C++20 AI-driven data processing tool, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, empowering efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims...
    Downloads: 20 This Week
  • 20
    Super PDF Editor (a Batch PDF Processor)

    Create, Edit, Delete, Organize, Convert, Export, Secure & Sign PDF.

    Super PDF Editor is a robust and versatile PDF management software designed to streamline your document handling needs. Whether you're an individual, student, or professional, this software offers a comprehensive suite of tools to create, edit, and manage your PDFs with ease. Key Features: Extract Page: Easily extract specific pages from a PDF document. Split Page: Divide a single PDF page into multiple smaller pages. Rotate Page: Rotate pages to adjust their orientation. Merge Page:...
    Downloads: 14 This Week
  • 21
    MyBox

    Easy Tools of PDF, Image, File, Network, Data, and Medias

    javafx-desktop-apps pdf image ocr icc barcode color-palette text bytes markdown html archive compress digest video audio editor converter media https://github.com/Mararsh/MyBox Self-contained packages need neither a Java environment nor installation. Jar packages need Java 16 or higher.
    Downloads: 3 This Week
  • 22
    cogli2

    A simple tool for the visualisation of coarse-grained systems

    cogli2 (formerly known as cogli1) is a simple tool to visualise configurations of coarse-grained simulations. It is straightforward to use and it has a few features that make it very handy to produce publication-ready figures. It supports its own file format but implementing additional parsers is fairly easy. As of now it is not well documented, but this is something we will improve upon in the near future.
    Downloads: 2 This Week
  • 23

    realwatermark

    A Python application to add watermarks (text or image) to PDF files

    A Python application that adds watermarks (text or image) to PDF files and converts them to images and back to PDF, with options for OCR and compression.
    Downloads: 0 This Week
  • 24
    Super-PDF-Editor-Lite

    World's most comprehensive, powerful, process-based PDF editor

    .... Easy PDF imposition, booklets, n-up pages, and more. OCR works on PDF files, including scanned PDFs, and on image files in multiple formats. Auto and manual image enhancement for better OCR accuracy and quality. Supports 165+ languages with three language data sets. Use multiple languages at once. International languages: 127 languages, in High, Medium, and Fast quality. Scanned images (jpg, png, gif, tiff, bmp), multi-page TIFF and GIF, scanned PDFs.
    Downloads: 17 This Week
  • 25
    Super-PDF-Editor

    World's most comprehensive, powerful, process-based PDF editor

    ... performs in pdf files, scanned pdf files and any pdf files. OCR also works on image files in multiple formats. Auto and manual image enhancement for better OCR accuracy and quality. Supports 165+ languages with three language data sets. Use multiple languages at once. International languages: 127 languages, in High, Medium, and Fast quality. Scanned images (jpg, png, gif, tiff, bmp), multi-page TIFF and GIF, scanned PDFs.
    Downloads: 8 This Week