Search Results for "pdf to text open source"

Showing 10918 open source projects for "pdf to text open source"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Gen AI apps are built with MongoDB Atlas Icon
    Gen AI apps are built with MongoDB Atlas

    Build gen AI apps with an all-in-one modern database: MongoDB Atlas

    MongoDB Atlas provides built-in vector search and a flexible document model so developers can build, scale, and run gen AI apps without stitching together multiple databases. From LLM integration to semantic search, Atlas simplifies your AI architecture—and it’s free to get started.
    Start Free
  • 1
    Open Source Routing Machine

    Open Source Routing Machine

    Open Source Routing Machine - C++ backend

    High-performance routing engine written in C++14 designed to run on OpenStreetMap data. There are several services available via HTTP API, C++ library interface and NodeJs wrapper. Nearest, snaps coordinates to the street network and returns the nearest matches. Route finds the fastest route between coordinates. Table computes the duration or distances of the fastest route between all pairs of supplied coordinates. Match snaps noisy GPS traces to the road network in the most plausible way....
    Downloads: 7 This Week
    Last Update:
    See Project
  • 2
    PDF Arranger

    PDF Arranger

    Small python-gtk application, to merge or split PDFs

    PDF Arranger is a small python-gtk application, which helps the user to merge or split PDF documents and rotate, crop and rearrange their pages using an interactive and intuitive graphical interface. It is a front end for pikepdf. PDF Arranger is a fork of Konstantinos Poulios’s PDF Shuffler (see Savannah or Sourceforge). It’s a humble attempt to make the project a bit more active.
    Downloads: 301 This Week
    Last Update:
    See Project
  • 3
    Asciidoctor PDF

    Asciidoctor PDF

    Asciidoctor PDF: A native PDF converter for AsciiDoc

    A fast text processor & publishing toolchain for converting AsciiDoc to HTML5, DocBook & more. Asciidoctor is a fast, open source, Ruby-based text processor for parsing AsciiDoc® into a document model and converting it to output formats such as HTML 5, DocBook 5, manual pages, PDF, EPUB 3, and other formats. Asciidoctor also has an ecosystem of extensions, converters, build plugins, and tools to help you author and publish content written in AsciiDoc.
    Downloads: 15 This Week
    Last Update:
    See Project
  • 4
    Markdown PDF

    Markdown PDF

    Markdown converter for Visual Studio Code

    This extension converts Markdown files to PDF, HTML, PNG or JPEG files.
    Downloads: 8 This Week
    Last Update:
    See Project
  • No-Nonsense Code-to-Cloud Security for Devs | Aikido Icon
    No-Nonsense Code-to-Cloud Security for Devs | Aikido

    Connect your GitHub, GitLab, Bitbucket, or Azure DevOps account to start scanning your repos for free.

    Aikido provides a unified security platform for developers, combining 12 powerful scans like SAST, DAST, and CSPM. AI-driven AutoFix and AutoTriage streamline vulnerability management, while runtime protection blocks attacks.
    Start for Free
  • 5
    PDF.js

    PDF.js

    A PDF Reader in JavaScript

    PDF.js is a web standards-based platform for parsing and rendering Portable Document Formats (PDFs). Open source and built with HTML5, this PDF viewer is supported by a great community and Mozilla Labs. PDF.js can be used on both modern and older browsers, and is built into version 19+ of Firefox.
    Downloads: 90 This Week
    Last Update:
    See Project
  • 6
    py-pdf-parser

    py-pdf-parser

    A Python tool to help extracting information from structured PDFs

    py-pdf-parser is a Python tool designed to help extract information from structured PDFs. It provides a simple interface to define parsing rules and extract data from PDF documents. ​
    Downloads: 5 This Week
    Last Update:
    See Project
  • 7
    PDF Signature

    PDF Signature

    Free web software for signing PDFs and also organize pages

    Free web software for signing, organizing, editing metadatas or compressing PDFs.
    Downloads: 12 This Week
    Last Update:
    See Project
  • 8
    Google Open Source Project Style Guide

    Google Open Source Project Style Guide

    Chinese version of Google open source project style guide

    Each larger open source project has its own style guide, a series of conventions on how to write code for the project (sometimes more arbitrary). When all the code maintains a consistent style, it is more important when understanding large code bases. easy. The meaning of "style" covers a wide range, from "variables use camelCase" to "never use global variables" to "never use exceptions". The English version of the project maintains the programming style guidelines used in Google...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 9
    Text Generation Web UI

    Text Generation Web UI

    A gradio web UI for running Large Language Models like LLaMA

    A gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA. Dropdown menu for switching between models. Notebook mode that resembles OpenAI's playground. Chat mode for conversation and role playing. Instruct mode compatible with Alpaca and Open Assistant formats. Nice HTML output for GPT-4chan. Markdown output for GALACTICA, including LaTeX rendering. Custom chat characters. Advanced chat features (send images, get audio responses with TTS). Very...
    Downloads: 93 This Week
    Last Update:
    See Project
  • Your top-rated shield against malware and online scams | Avast Free Antivirus Icon
    Your top-rated shield against malware and online scams | Avast Free Antivirus

    Browse and email in peace, supported by clever AI

    Our antivirus software scans for security and performance issues and helps you to fix them instantly. It also protects you in real time by analyzing unknown files before they reach your desktop PC or laptop — all for free.
    Free Download
  • 10
    Open-Sora

    Open-Sora

    Open-Sora: Democratizing Efficient Video Production for All

    Open-Sora is an open-source initiative aimed at democratizing high-quality video production. It offers a user-friendly platform that simplifies the complexities of video generation, making advanced video techniques accessible to everyone. The project embraces open-source principles, fostering creativity and innovation in content creation. Open-Sora provides tools, models, and resources to create high-quality videos, aiming to lower the entry barrier for video production and support diverse...
    Downloads: 32 This Week
    Last Update:
    See Project
  • 11
    Text Editors

    Text Editors

    Sempare Template (scripting) Engine for Delphi

    Sempare Template (scripting) Engine for Delphi allows for flexible dynamic text generation. It can be used for generating email, HTML, reports, source code, xml, configuration, etc.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    text-dedup

    text-dedup

    All-in-one text de-duplication

    text-dedup is a Python library that enables efficient deduplication of large text corpora by using MinHash and other probabilistic techniques to detect near-duplicate content. This is especially useful for NLP tasks where duplicated training data can skew model performance. text-dedup scales to billions of documents and offers tools for chunking, hashing, and comparing text efficiently with low memory usage. It supports Jaccard similarity thresholding, parallel execution, and flexible...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Laravel PDF

    Laravel PDF

    Create PDF files in Laravel apps

    This package provides a simple way to create PDFs in Laravel apps. Under the hood it uses Chromium to generate PDFs from Blade views. You can use modern CSS features like grid and flexbox to create beautiful PDFs.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 14
    Pdf Viewer For Android

    Pdf Viewer For Android

    A Lightweight PDF Viewer Android library

    A Simple PDF Viewer library which only occupies around 80kb while most of the Pdf viewer occupies upto 16MB space. Hello Developers! We're thrilled to share some significant enhancements we've made to our PDF viewer library. We've fine-tuned several aspects to enhance your experience and ensure top-notch performance and security. Step into the future with Jetpack Compose compatibility. Integrating our PDF viewer in Compose projects is now effortless, thanks to the PdfRendererViewCompose...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 15
    Text Generation Inference

    Text Generation Inference

    Large Language Model Text Generation Inference

    Text Generation Inference is a high-performance inference server for text generation models, optimized for Hugging Face's Transformers. It is designed to serve large language models efficiently with optimizations for performance and scalability.
    Downloads: 13 This Week
    Last Update:
    See Project
  • 16
    Slate Text Editor framework

    Slate Text Editor framework

    Completely customizable framework for building rich text editors

    you can do things like turn a selection of text bold, or add a semantically rendered block quote in the middle of the page. The most important part of Slate is that plugins are first-class entities. That means you can completely customize the editing experience, to build complex editors like Medium's or Dropbox's, without having to fight against the library's assumptions. Slate's core logic assumes very little about the schema of the data you'll be editing, which means
    Downloads: 13 This Week
    Last Update:
    See Project
  • 17
    Obsidian Text Generator Plugin

    Obsidian Text Generator Plugin

    Text generator is a handy plugin for Obsidian

    Text Generator is an open-source AI Assistant Tool that brings the power of Generative Artificial Intelligence to the power of knowledge creation and organization in Obsidian. For example, use Text Generator to generate ideas, attractive titles, summaries, outlines, and whole paragraphs based on your knowledge database.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 18
    Stirling-PDF

    Stirling-PDF

    Web application that allows you to perform operations on PDF files

    Stirling PDF is a powerful, locally hosted web-based PDF manipulation tool offering a wide range of editing, conversion, and utility features. It allows users to merge, split, compress, convert, OCR, and perform other operations on PDF files directly from a browser without uploading data to third-party servers. The tool is privacy-conscious, self-hostable via Docker, and built with modularity in mind to allow future expansion and integration.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 19
    Text Search Engine

    Text Search Engine

    A text search engine that supports mixed Chinese and English search

    Text-Search-Engine is a JavaScript-based lightweight search engine that enables full-text search functionality. It allows developers to implement fast search indexing and retrieval in web applications.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 20
    FastReport Open Source

    FastReport Open Source

    Free Open Source Reporting tool for .NET

    Free Open Source Reporting tool for .NET Core/.NET Framework that helps your application generate document-like reports.
    Downloads: 37 This Week
    Last Update:
    See Project
  • 21
    Open QR Code

    Open QR Code

    Open QR Code is an open-source, cross-platform app

    Open QR Code is an open-source cross-platform application developed using Flutter as main framework used to build the application, in common C, C++, Dart, Skia (a 2D rendering engine), and Impeller (the default rendering engine on iOS), Java, Kotlin. Open QR Code allows users to generate and scan QR codes effortlessly. The app is available on Android, Windows, and the Web. Users can generate QR codes from any text input, save them to their gallery, share them directly from the app, and scan QR...
    Downloads: 20 This Week
    Last Update:
    See Project
  • 22
    MCP Text Editor

    MCP Text Editor

    Provides line-oriented text file editing capabilities

    The MCP Text Editor Server provides line-oriented text file editing capabilities through a standardized API, optimized for integration with Large Language Models (LLMs). It enables efficient partial file access, minimizing token usage while ensuring safe concurrent editing.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 23
    react-native-pdf

    react-native-pdf

    A pdf component for react-native

    A react native PDF view component (cross-platform support) Read a PDF from a URL, blob, local file, or asset and can cache it. Display horizontally or vertically. Drag and zoom. Double tap for zoom. Support password-protected pdf. Jump to a specific page in the pdf. We use react-native-blob-util to handle file system access in this package, So you should install react-native-pdf and react-native-blob-util.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 24
    Recognizers-Text

    Recognizers-Text

    Recognition and resolution of numbers, units, date/time, etc.

    Recognizers-Text is a multilingual text recognition library that extracts structured information such as dates, numbers, and currency values from unstructured text.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 25

    Tesseract OCR

    Open Source OCR Engine

    Tesseract is an open source OCR or optical character recognition engine and command line program. OCR is a technology that allows for the recognition of text characters within a digital image. With the latest version of Tesseract, there is a greater focus on line recognition, however it still supports the legacy Tesseract OCR engine which recognizes character patterns. Tesseract can recognize over 100 languages out-of-the-box, and can be trained to recognize other languages. It supports...
    Downloads: 2,248 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next
Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.