Showing 118 open source projects for "duplicate text finder"

  • 1
    text-dedup

    All-in-one text de-duplication

    text-dedup is a Python library that enables efficient deduplication of large text corpora by using MinHash and other probabilistic techniques to detect near-duplicate content. This is especially useful for NLP tasks where duplicated training data can skew model performance. text-dedup scales to billions of documents and offers tools for chunking, hashing, and comparing text efficiently with low memory usage. It supports Jaccard similarity thresholding, parallel execution, and flexible...
    Downloads: 0 This Week
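As an illustration of the idea behind MinHash and Jaccard similarity thresholding mentioned above, here is a minimal standard-library sketch. It is not text-dedup's API: the function names (`shingles`, `jaccard`, `minhash_signature`, `estimate_similarity`), the word 3-gram shingling, and the choice of 128 hash functions are all assumptions made for this example.

```python
import hashlib


def shingles(text, n=3):
    """Return the set of word n-grams in text (hypothetical helper)."""
    words = text.lower().split()
    if len(words) < n:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Exact Jaccard similarity: |A & B| / |A | B|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def _hash(item, seed):
    # A salted 64-bit BLAKE2b digest stands in for one "permutation".
    salt = seed.to_bytes(8, "little")
    digest = hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(items, num_perm=128):
    """Keep the minimum hash value per seeded hash function.

    Assumes items is non-empty; num_perm=128 is an arbitrary choice.
    """
    return [min(_hash(item, seed) for item in items) for seed in range(num_perm)]


def estimate_similarity(sig_a, sig_b):
    """The fraction of matching positions estimates the Jaccard similarity."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


if __name__ == "__main__":
    doc_a = shingles("the quick brown fox jumps over the lazy dog")
    doc_b = shingles("the quick brown fox jumps over the lazy cat")
    print("exact:", jaccard(doc_a, doc_b))
    print("estimate:", estimate_similarity(minhash_signature(doc_a),
                                           minhash_signature(doc_b)))
```

In a deduplication pass, two documents would typically be treated as near-duplicates when the estimate exceeds a chosen threshold. The signatures have a fixed size, which is why comparing them is cheaper than comparing the full shingle sets.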
  • 2
    ... on the first run of the program, a Data directory is created in the location where the program is saved. To choose the path to the directory with images, write the path in the 'settings.txt' file. The program can then be run with: -an, -mnb, -c, -i. GitHub link: https://github.com/Duke-Axer/Duplicate-Finder Please send all questions to b.gabka.nkn@gmail.com
    Downloads: 0 This Week
  • 3
    Text Line Duplicate Remover

    Remove duplicate lines from your text

    This standalone offline web browser tool helps you remove duplicate lines from your text, with additional text processing options. Simply open it in your browser by double-clicking the html file. The source code is included. I made this when I was working with long lists of entries and needed something to automatically clean them up. As a bonus, you can also change the case of the text to lowercase, UPPERCASE, or Sentence case.
    Downloads: 1 This Week
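For readers who want the same clean-up from a script, here is a minimal Python sketch of order-preserving duplicate-line removal. It is not the tool's own source (the tool is a browser-based html file); the function name `remove_duplicate_lines` and the `ignore_case` option are assumptions made for this example.

```python
def remove_duplicate_lines(text, ignore_case=False):
    """Keep the first occurrence of each line and drop later repeats.

    ignore_case is a hypothetical option: when True, lines that differ
    only by letter case count as duplicates.
    """
    seen = set()
    kept = []
    for line in text.splitlines():
        key = line.casefold() if ignore_case else line
        if key not in seen:
            seen.add(key)
            kept.append(line)
    return "\n".join(kept)


if __name__ == "__main__":
    entries = "apple\nbanana\napple\ncherry\nbanana"
    print(remove_duplicate_lines(entries))
```

A set gives constant-time membership checks, and the separate list preserves the original order of the first occurrences.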
  • 4
    Video-subtitle-extractor

    A GUI tool for extracting hard-coded subtitle (hardsub) from videos

    A GUI tool for extracting hard-coded subtitles (hardsub) from videos and generating srt files. It is a deep learning-based video subtitle extraction framework that covers subtitle region detection and subtitle content extraction. Text recognition runs locally with OCR, so there is no need to apply for, set up, or call a third-party API, or to access online OCR services such as Baidu...
    Downloads: 65 This Week
  • 5
    Pylint

    It's not just a linter that annoys you!

    Pylint is a static code analyzer for Python 2 or 3. The latest version supports Python 3.7.2 and above. Pylint analyses your code without actually running it. It checks for errors, enforces a coding standard, looks for code smells, and can make suggestions about how the code could be refactored. Projects that you might want to use alongside pylint include flake8 (faster and simpler checks with very few false positives), mypy, pyright or pyre (typing checks), bandit (security-oriented...
    Downloads: 1 This Week
  • 6
    SurveyJS

    JavaScript Survey and Form Library

    SurveyJS Form Library is distributed as npm packages and as scripts and style sheets that you can reference on your page. You can use it in any React, Angular, Vue, Knockout, or jQuery application. React, Angular, Knockout, and Vue3 are supported natively. To communicate with the server, the libraries use JSON objects that represent form schemas (content and layout of a form) and form results (answers). You have the option to build dynamic JSON-driven forms using our free full-featured...
    Downloads: 2 This Week
  • 7
    Duplicate Agent

    Duplicate Files Finder and Cleaner

    This program was created to detect and delete unnecessary files on your computer that you've unknowingly created, copied, or backed up in some way.
    Downloads: 0 This Week
  • 8
    GrandPerspective

    Graphically shows disk usage for Macs

    GrandPerspective is a utility application for macOS that graphically displays the disk usage of your file system. It can help you to manage your disk, as you can easily spot which files and folders take up the most space.
    Downloads: 3,352 This Week
  • 9
    Swiss File Knife

    One hundred command line tools in a small and portable binary.

    Create zip files, extract zip files, replace text in files, search in files using expressions, stream text editor, instant command line ftp and http server, send folder via network, copy folder excluding sub folders and files, find duplicate files, run a command on all files of a folder, split and join large files, make md5 checksum lists of files, remove tab characters, convert CR/LF, list newest or biggest files of a folder, compare folders, treesize, show first or last lines of a file, find...
    Downloads: 471 This Week
  • 10
    CRC RevEng

    Arbitrary-precision CRC calculator and algorithm finder

    CRC RevEng is a portable, arbitrary-precision CRC calculator and algorithm finder. It calculates CRCs using any of the 113 preset algorithms, or a user-specified algorithm to any width. It calculates reversed CRCs to give the bit pattern that produces a desired forward CRC. CRC RevEng also reverse-engineers any CRC algorithm from sufficient correctly formatted message-CRC pairs and optional known parameters. It comprises powerful input interpretation options. Compliant with Ross Williams...
    Downloads: 118 This Week
  • 11
    XnView MP
    ... image management. Features include batch rename, batch converter, duplicate image finder, and image compare, and you can also create contact sheets and slideshows. XnConvert is a fast and powerful batch image converter that can convert, resize, watermark, add text, enhance, and filter in batch mode. XnResize is a fast and powerful batch image resizer that can convert and resize in batch mode. Online Convert lets you batch resize and convert your files from your web browser.
    Downloads: 39 This Week
  • 12
    Sokoban YASC

    A very richly featured implementation of the Sokoban puzzle game

    Sokoban YASC - Yet Another Sokoban Clone - for Windows. A wealth of features, e.g., deadlock detection, reverse mode, and replay mode. Good import functions and highly configurable, e.g., skins. Tools: Editor, solver, optimizer, generator, capture, duplicate finder.
    Downloads: 54 This Week
  • 13
    File-Studio

    A tool that automates complex file operations.

    File studio is a tool that assists in handling complex file operations such as bulk renaming, organizing folders and more.
    Downloads: 13 This Week
  • 14

    Virtualdub Batch Video DeShake

    Batch to compress [and deshake] all videos [or images] in folder

    Installation: Execute "DeShakInst.BAT"
      VirtualDub2 44282; AviSynth+ 3.7.5 updated to C:\DVD
      DESHAK.BAT updated to C:\UT and added to PATH
    Usage: DESHAK task[s] [parameters]
    Tasks:
      tp1: deshake pass1 LOG generation for 2nd pass
      tp2: deshake pass2 and compress video and audio to MP3
      tcomp: compress (no deshake)
      twav: extract WAV and/or uses external WAV audio
    Parameters (more in help):
      vEXT: video extension (ie: vmov), default: vAVI
      qN: h264...
    Downloads: 25 This Week
  • 15
    Sprint PDF Editor (Smarter PDF Solution)

    Edit, Convert, Extract , Export, Secure and PDF Imposition.

    Sprint PDF Editor® The Productive, Modern, Innovative, Clean & Colourful GUI. Faster, Smarter & Seamless workflows, with 50+ functions. Sprint PDF Editor & Reader, Complete PDF Solution, Supercharge Your Workflows With Imposition, Extract, Compress, Watermark, Protect & Secure, Split & Merge, Crop Pages, Printing, Stamp & more. Your Privacy, Our Priority Protect Your Data with Complete Confidence. Our software is designed to keep your information 100% secure. Unlike cloud-based...
    Downloads: 11 This Week
  • 16

    logwitch

    a simple log file scanner for linux written in shell and lua

    logwitch checks single or rotated log files, either plain text or gzipped. It can work with log4j and GNU/Linux system logs. You configure it to watch for lines of interest in log files by creating a simple text file in its config directory. It emails a daily report to you and timestamps logs so that you do not receive duplicate information. logwitch is GPL3 licensed.
    Downloads: 0 This Week
  • 17
    Synonymizer / ThesaurusPlus

    Spice Up Your Sentences, Make Them Incomprehensible!

    NOTE: The first time you run the application, you must run it as admin. This is a program that replaces all words in a given text with synonyms. I will admit that after one or two passes through this tool, your sentences will no longer be coherent. It might be a good place to start, but please keep that in mind. This was a joke a friend of mine made and I decided to code it up in a day or so. Lastly, there is a config.txt file that gets generated when the program configs change. You can edit...
    Downloads: 0 This Week
  • 18
    pathopener

    Quick and Easy Folder Access

    ... Storage: Automatically saves your folder paths for future sessions
    Duplicate Prevention: Automatically detects and handles duplicate paths
    Path Validation: Verifies paths exist before saving
    Simple Interface: Clean, easy-to-use interface with one-click folder opening
    Read the ReadMe text
    Downloads: 0 This Week
  • 19
    garysfm

    An advanced file manager with qss themes and iso and folder previews

    garysfm, which stands for Gary's File Manager, is a file manager with some advanced features. Those features include bulk renaming and folder image previews. It has rather advanced search functions and tab browsing with persistence between launches. It remembers your folder sorting and view options in icon view. It also remembers your active tabs between sessions. It shows a progress dialog during large operations such as copying large files and folders with many files. python version works on...
    Downloads: 1 This Week
  • 20
    Blaiz Tools

    Tools for working with Gossamer codebase and app source code

    Tools for packing files, converting images, manipulating text, checking code and working with the Gossamer codebase and app source code.
    Downloads: 0 This Week
  • 21

    FSTDFC

    File Search To Date Folder Copy

    Version 0.7 - 31st October 2024
      Added option to specify multiple file wildcards for search criteria, separate multiple by space to use
      Font size changes to make the form easier to use
    Version 0.6 - 5th September 2024
      Increased maximum file count from 100K to 1000K
      Any file with EXIF data showing the date will be used for the output folder naming, all other files will use the "LastWriteTime" as the parameter.
      Added counter showing the number of files still to be copied to the...
    Downloads: 0 This Week
  • 22
    MediaFileDuplicateFinder

    MediaFileDuplicateFinder finds duplicated video, image, & audio files

    Media File Duplicate Finder finds duplicated video and image files based on similarity. It finds duplicates that differ in resolution, frame rate, watermark, or video file tags. It is a cross-platform program that supports Windows, Linux, and macOS. Please use the link below to report bugs, request features, or ask questions. https://github.com/David-Maisonave/MediaFileDuplicateFinder/issues/new/choose For more information, visit the following Wiki page: https://github.com/David-Maisonave...
    Downloads: 41 This Week
  • 23
    Super-PDF-Editor-Lite

    World's most comprehensive, powerful, process-based PDF editor

    World's most comprehensive, powerful, process-based and lightning-fast PDF reader, editor and batch processor. Features include creating PDFs from images, HTML, and text files, and creating a processing log file. Page operations include Extract Page, Split Page, Rotate Page, Merge Page, Duplicate Page, Move Page, Printing, and Compress Page. It enhances images before the OCR operation for better OCR performance, supports PDF imposition, and more. Super PDF Editor is best for bulk PDF processing, especially for the printing industry...
    Downloads: 4 This Week
  • 24
    Remove Duplicate Lines

    Remove duplicate lines in text file

    A handy tool with a graphical interface that removes duplicate lines from a text file. GitHub source: https://github.com/ahmed-fathy/remove-duplicates/
    Downloads: 4 This Week
  • 25
    Burny1250 terminal

    Program for transferring G-code to Burny 1250 or Burny 1250+

    The program is designed to transfer G-code to the controller of Burny1250 plasma cutting machines. If a postprocessor already exists for a modern CAM program, the only remaining issue is transferring the data to the Burny1250 controller. The program may be needed for these machines: B & W SYSTEMS CNC Plasma Cutter (BWPC01) LOCKFORMER Vulcan NT 2000 Aviator XLT Innerlogic Proline 2200 Precision Plasma Cutter Innerlogic SR-45i CNC plasma C&G Aviator XL CNC plasma...
    Downloads: 1 This Week