data scraper free download

Showing 47 open source projects for "data scraper"

1

Linkedin Scraper

A library that scrapes Linkedin for user data

Linkedin Scraper is a library that scrapes Linkedin for user data. Versions 2.0.0 and earlier are named linkedin_user_scraper and can be installed via pip3 install --user linkedin_user_scraper. Because LinkedIn has recently blocked people from viewing certain profiles without signing in first, setting scrape=False stops the library from scraping the profile automatically, although Chrome will still open the LinkedIn page.

Downloads: 1 This Week

Last Update: 2026-01-27
See Project
2

dude uncomplicated data extraction

dude uncomplicated data extraction: A simple framework

Dude is a very simple framework for writing web scrapers using Python decorators. The design, inspired by Flask, aims to let you build a web scraper in just a few lines of code. Dude has an easy-to-learn syntax. Dude is currently in Pre-Alpha, so expect breaking changes. You can run your scraper from a terminal by supplying URLs, the output filename of your choice, and the paths to your Python scripts to the dude scrape command.

Downloads: 0 This Week

Last Update: 2024-03-02
See Project
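The decorator-based design described for Dude can be illustrated with a small standard-library sketch. This is not Dude's actual API: the `select` decorator, the `HANDLERS` registry, and the `run` helper are hypothetical names chosen for illustration, and matching is done on tag name only rather than on real CSS selectors.

```python
from html.parser import HTMLParser

# Hypothetical registry: maps a tag name to the handler functions registered for it.
HANDLERS = {}


def select(tag):
    """Register the decorated function as a handler for elements named `tag`."""
    def decorator(func):
        HANDLERS.setdefault(tag, []).append(func)
        return func
    return decorator


class _Dispatcher(HTMLParser):
    """Calls every registered handler when a matching start tag is seen."""

    def __init__(self):
        super().__init__()
        self.results = []

    def handle_starttag(self, tag, attrs):
        for func in HANDLERS.get(tag, []):
            item = func(dict(attrs))
            if item is not None:
                self.results.append(item)


def run(html):
    """Parse `html` and return everything the registered handlers produced."""
    dispatcher = _Dispatcher()
    dispatcher.feed(html)
    dispatcher.close()
    return dispatcher.results


@select("a")
def link(attrs):
    # Return one dict per link, skipping anchors that have no href.
    href = attrs.get("href")
    return {"url": href} if href else None
```

Calling `run(page_html)` on a page's HTML would return one dictionary per link. The point of the pattern is that each scraping rule is an ordinary function and the decorator only records which elements it applies to.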
3

Colly

Elegant Scraper and Crawler Framework for Golang

Colly provides a clean interface to write any kind of crawler/scraper/spider. With Colly you can easily extract structured data from websites, which can be used for a wide range of applications, like data mining, data processing or archiving. Clean API. Fast (>1k requests/sec on a single core). Manages request delays and maximum concurrency per domain. Automatic cookie and session handling. Sync/async/parallel scraping.

Downloads: 8 This Week

Last Update: 2025-03-27
See Project
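The per-domain request delay and concurrency cap mentioned for Colly can be sketched in Python with asyncio. This is an illustration of the general idea only, not Colly's Go API: `DomainLimiter`, its parameters, and the `fetch` callable passed to `run` are hypothetical.

```python
import asyncio
from urllib.parse import urlparse


class DomainLimiter:
    """Caps concurrent requests per domain and pauses after each one."""

    def __init__(self, max_concurrency=2, delay=0.0):
        self.max_concurrency = max_concurrency
        self.delay = delay
        self._semaphores = {}  # one semaphore per domain, created lazily

    def _semaphore(self, domain):
        if domain not in self._semaphores:
            self._semaphores[domain] = asyncio.Semaphore(self.max_concurrency)
        return self._semaphores[domain]

    async def run(self, url, fetch):
        """Await `fetch(url)` while holding one of the domain's slots."""
        domain = urlparse(url).netloc
        async with self._semaphore(domain):
            result = await fetch(url)
            # Keep the slot occupied for `delay` seconds so the next
            # request to this domain has to wait.
            await asyncio.sleep(self.delay)
            return result
```

Requests to different domains use separate semaphores, so a slow or rate-limited site does not hold up the rest of the crawl.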
4

CyberScraper 2077

A Powerful web scraper powered by LLM | OpenAI, Gemini & Ollama

CyberScraper 2077 is not just another web scraping tool – it's a glimpse into the future of data extraction. Born from the neon-lit streets of a cyberpunk world, this AI-powered scraper uses OpenAI, Gemini and LocalLLM Models to slice through the web's defenses, extracting the data you need with unparalleled precision and style.

Downloads: 0 This Week

Last Update: 2026-01-20
See Project
5

Spider

High-performance Rust web crawler and scraper for large-scale data

Spider is a high-performance web crawler and web scraping library written in Rust that enables developers to crawl and index websites efficiently. It focuses on speed, concurrency, and reliability by using asynchronous and multi-threaded processing to handle large volumes of web pages. It can rapidly crawl websites to collect links, retrieve page content, and extract structured information from HTML documents. Spider can operate concurrently across many pages, allowing it to gather large...

Downloads: 6 This Week

Last Update: 1 day ago
See Project
6

Crawl4AI

Open-source LLM Friendly Web Crawler & Scraper

Crawl4AI is a high-performance, AI‑ready web crawler tailored for LLM data ingestion and RAG pipelines. It supports adaptive crawling heuristics (stopping when enough info is gathered), structured markdown output, and high-speed parallel execution. Designed to operate at scale with optional Docker deployment and framework integrations.

Downloads: 0 This Week

Last Update: 2026-01-16
See Project
7

Email Scraper and Validator

This is a simple desktop application built with Python and Tkinter that allows users to scrape email addresses from websites and validate them using an external API. It also provides features to save the scraped emails to a database, and export the data to various file formats. 1. Enter a list of website URLs or emails in the input field. 2. Click the Scrape button to scrape email addresses from the provided websites. 3. Click the Validate button to validate the scraped email...

Downloads: 1 This Week

Last Update: 2024-03-03
See Project
8

MDCx

Movie metadata scraper and organizer for media libraries and NFO

MDCx is an open source media metadata scraping and organization tool designed to automate the process of collecting detailed information for movie files. It retrieves metadata from multiple online sources and applies it to local media collections, helping users maintain structured and well-organized libraries. MDCx can download information such as titles, cast data, artwork, and other metadata, then generate standardized NFO files compatible with media management systems. It also supports...

Downloads: 8 This Week

Last Update: 6 days ago
See Project
9

Google Maps Extractor

Free Google Map Extractor (With Email) | Google Maps Scraper

A free Google Map extractor for business leads—fast & efficient! This Google Maps scraper extracts phone numbers, emails, locations, and social media profiles, then exports to CSV. Visit: https://gmplus.io/

Downloads: 7 This Week

Last Update: 2025-04-12
See Project
10

ai-scrapper

🚀 Discover AI Web Scraper! 🚀 Tired of copying and pasting data from websites? I developed a desktop application with Electron and Gemini AI to extract structured data easily and efficiently! 🤖✨

1 Review

Downloads: 8 This Week

Last Update: 2025-05-31
See Project
11

linkedin2username

Generate probable usernames from LinkedIn company employee lists

...This process helps security researchers, penetration testers, and investigators perform reconnaissance by building potential username lists for further security testing or OSINT analysis. Unlike tools that rely on official APIs, linkedin2username operates as a pure web scraper and therefore does not require API keys. The script uses Selenium to automate browser interactions and perform searches within LinkedIn to gather employee data.

Downloads: 4 This Week

Last Update: 2026-03-07
See Project
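The username-generation step that linkedin2username's description refers to can be sketched as follows. The four formats and the `candidate_usernames` helper are assumptions made for illustration; the listing does not say which formats the tool actually produces.

```python
import re
import unicodedata


def _normalize(part):
    """Lowercase a name part, strip accents, and drop non-letter characters."""
    decomposed = unicodedata.normalize("NFKD", part)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z]", "", ascii_only.lower())


def candidate_usernames(full_name):
    """Return probable usernames for a 'First ... Last' name, in a stable order."""
    parts = [p for p in (_normalize(x) for x in full_name.split()) if p]
    if len(parts) < 2:
        return []  # a single name is not enough to build the formats below
    first, last = parts[0], parts[-1]
    formats = [
        first + last,        # janedoe
        first + "." + last,  # jane.doe
        first[0] + last,     # jdoe
        first + last[0],     # janed
    ]
    # Remove duplicates while preserving order.
    return list(dict.fromkeys(formats))
```

Normalizing accents and punctuation first keeps the output limited to characters that are commonly allowed in account names.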
12

scraper-with-chatgpt

It is a powerful data scraping tool that helps you extract information from various online sources. Easily collect data from Google SERP, Maps, Shopify, Zillow, and more. With a user-friendly interface, you can scrape and save data in JSON or Excel formats. Unlock insights from the web effortlessly with scrape-it.cloud API.

Downloads: 0 This Week

Last Update: 2023-08-28
See Project
13

URS (Universal Reddit Scraper)

A comprehensive Reddit scraping command-line tool written in Python

Universal Reddit Scraper, a comprehensive Reddit scraping command-line tool written in Python. Whether you are using URS for enterprise or personal use, I am very interested in hearing about your use case and how it has helped you achieve a goal. This is a comprehensive Reddit scraping tool that integrates multiple features. All files except for those generated by the wordcloud tool are exported to JSON by default. Wordcloud files are exported to PNG by default. All exported files are saved...

Downloads: 0 This Week

Last Update: 2023-05-08
See Project
14

Goutte

Goutte, a simple PHP Web Scraper

Goutte is a screen scraping and web crawling library for PHP. Goutte provides a nice API to crawl websites and extract data from the HTML/XML responses. Goutte depends on PHP 7.1+. Add fabpot/goutte as a require dependency in your composer.json file. Create a Goutte Client instance (which extends Symfony\Component\BrowserKit\HttpBrowser). Make requests with the request() method. The method returns a Crawler object (Symfony\Component\DomCrawler\Crawler). To use your own HTTP settings, you may...

Downloads: 0 This Week

Last Update: 2023-04-01
See Project
15

NSFW Data Scraper

Collection of scripts to aggregate image data

NSFW Data Scraper is an open-source project that provides scripts for automatically collecting large datasets of images intended for training NSFW image classification systems. The repository focuses on aggregating image data from various online sources so that developers can build datasets suitable for training content moderation models. These datasets typically contain images categorized into different classes associated with adult or explicit content, which can then be used to train neural networks that detect unsafe or inappropriate material. ...

Downloads: 6 This Week

Last Update: 6 days ago
See Project
16

AutoScraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

This project is made for automatic web scraping to make scraping easy. It gets a URL or the HTML content of a web page and a list of sample data that we want to scrape from that page. This data can be text, URL or any HTML tag value of that page. It learns the scraping rules and returns similar elements. Then you can use this learned object with new URLs to get similar content or the exact same element of those new pages.

Downloads: 1 This Week

Last Update: 2023-04-12
See Project
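The learn-from-examples idea described for AutoScraper can be sketched with the standard library alone. This is not AutoScraper's implementation or API: `learn_rules` and `apply_rules` are hypothetical helpers, a "rule" here is just a (tag, class) pair, and the sketch assumes well-formed HTML in which every element has an explicit closing tag.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Records (signature, text) per element, where signature = (tag, class)."""

    def __init__(self):
        super().__init__()
        self._stack = []    # open elements as [signature, text parts]
        self.elements = []  # closed elements as (signature, direct text)

    def handle_starttag(self, tag, attrs):
        self._stack.append([(tag, dict(attrs).get("class")), []])

    def handle_data(self, data):
        if self._stack:
            self._stack[-1][1].append(data)

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1][0][0] == tag:
            signature, parts = self._stack.pop()
            self.elements.append((signature, "".join(parts).strip()))


def _elements(html):
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    return collector.elements


def learn_rules(html, wanted):
    """Return the signatures of elements whose text equals one of the samples."""
    wanted = set(wanted)
    return {sig for sig, text in _elements(html) if text in wanted}


def apply_rules(html, rules):
    """Return the non-empty text of every element matching a learned signature."""
    return [text for sig, text in _elements(html) if sig in rules and text]
```

Rules learned from one sample page can then be applied to other pages with the same markup, which mirrors the workflow in the description: give one example value, get back all similar elements.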
17

mlscraper

ML-based HTML scraper that learns extraction rules from examples

...Once trained, the generated scraper can process new pages and return the extracted data in structured formats such as dictionaries or lists. This approach simplifies web scraping tasks by shifting the focus from rule-writing to example-based training. Internally, the project processes HTML documents, identifies relevant elements in the DOM, and builds extraction logic based on statistical or heuristic analysis of the training samples.

Downloads: 2 This Week

Last Update: 2 days ago
See Project
18

Vanga

Compiler-like generic data scraper and GUI automation tool.

A Java-based visual compiler for GUI recognition and automation. The screens are described in an XML file which contains the definitions of lexemes and the tokens that comprise them. Upon a successful match of a screen, user-defined code is executed. Within the scope of this code, the user is capable of extracting data from the screen, interpreting it, and driving the GUI accordingly. The demonstration example reads the value of a calculator, displays it for the user, and enables him to...

Downloads: 0 This Week

Last Update: 2021-08-23
See Project
19

NYT Vote Scraper

Scrapes the NYT Votes Remaining Page JSON

NYT Vote Scraper is a small but clever project that periodically fetches JSON data from the “Votes Remaining” page of The New York Times during the 2020 U.S. presidential election and commits the results into the repository, effectively using Git as a time-series database. The idea is to create a historical record — including diffs — of how vote counts and “votes remaining” estimates changed over time.

Downloads: 0 This Week

Last Update: 2025-12-09
See Project
20

JonDoFox Advanced Privacy Browser

Browser with fingerprinting- and psychological profiling protection

In addition to fingerprinting, ad networks collect psychological data about users. This data is primarily based on mouse movement and scrolling (we can't reasonably block clicks). It leaks and is used for anything from spam to blackmail. Our addons block only those JavaScript functions, leaving the Internet intact (unlike NoScript, which makes FB unusable). If a page is broken, hit ctrl+shift+p and retry (private browsing mode). We do reach a "nearly unique" rating at the EFF fingerprinting test page. ...

Downloads: 0 This Week

Last Update: 2019-09-15
See Project
21

X-RAY

The next web scraper, see through the <html> noise

Supports strings, arrays, arrays of objects, and nested object structures. The schema is not tied to the structure of the page you're scraping, allowing you to pull the data in the structure of your choosing. The API is entirely composable, giving you great flexibility in how you scrape each page. Paginate through websites, scraping each page. X-ray also supports a request delay and a pagination limit. Scraped pages can be streamed to a file, so if there's an error on one page, you won't...

Downloads: 0 This Week

Last Update: 2021-10-05
See Project
22

django-dynamic-scraper

Creating Scrapy scrapers via the Django admin interface

...Since it simplifies things, DDS is not usable for all kinds of scrapers, but it is well suited to the relatively common case of regularly scraping a website with a list of updated items (e.g. news, events, etc.) and then digging into the detail page to scrape more information for each item. Django Dynamic Scraper tries to keep its data structure in the database as separate as possible from the models in your app, so it comes with its own Django model classes for defining scrapers, runtime information related to your scraper runs and classes.

Downloads: 0 This Week

Last Update: 2022-09-05
See Project
23

google-play-scraper

Node.js scraper to get data from Google Play

Node.js module to scrape application data from the Google Play store. Retrieves the full detail of an application. Retrieves a list of applications from one of the collections at Google Play. Retrieves a list of apps that result from searching for the given term. Returns the list of applications by the given developer name. Given a string, returns up to five suggestions to complete a search query term. Retrieves a page of reviews for a specific application. Returns a list of similar apps to the...

Downloads: 0 This Week

Last Update: 2022-03-22
See Project
24

WebExtractServer

WebExtractServer, used with WebExtractLte, for use with web browsers

Browse data fetched by WebExtractLte directly in your browser. Designed to be used with Webscraper (webscraper.io), a third-party web scraper tool available as a plugin for Chrome and Firefox.

Downloads: 0 This Week

Last Update: 2019-04-29
See Project
25

YellowPages Australia Scraper

Gather Publicly Available Business Contact Data

Do you need contact information for businesses in Australia? You can get it easily by using YellowPages Australia Scraper. The program automatically navigates YellowPages AU and gathers the data you need.

2 Reviews

Downloads: 0 This Week

Last Update: 2018-11-15
See Project