Search Results for "internet dump spider"

Showing 104 open source projects for "internet dump spider"

1

Spider

High-performance Rust web crawler and scraper for large-scale data

Spider is a high-performance web crawler and web scraping library written in Rust that enables developers to crawl and index websites efficiently. It focuses on speed, concurrency, and reliability by using asynchronous and multi-threaded processing to handle large volumes of web pages. It can rapidly crawl websites to collect links, retrieve page content, and extract structured information from HTML documents. Spider can operate concurrently across many pages, allowing it to gather large...

Downloads: 1 This Week

Last Update: 1 hour ago
See Project
2

xhs-spider

Desktop tool for collecting and exporting Xiaohongshu post data

XHS-Spider is a desktop data collection tool designed to gather content and metadata from the Xiaohongshu platform. It provides a graphical interface that allows users to explore posts, collect information, and download media such as images and videos from individual notes or search results. It was developed primarily as a learning project to demonstrate approaches to building web crawlers and experimenting with technologies such as WebView2 and WPF UI. It supports multiple ways to locate...

Downloads: 2 This Week

Last Update: 2026-03-11
See Project
3

spider_collection

Collection of Python web scraping scripts for data extraction tasks

spider_collection is a collection of Python web crawler scripts created primarily for experimentation, learning, and practical scraping tasks. spider_collection gathers multiple independent spiders designed to collect data from different platforms and services, demonstrating a variety of scraping techniques and workflows. These crawlers make use of common Python scraping tools such as requests, parsel, BeautifulSoup, and the Scrapy framework to extract structured information from web pages....

Downloads: 1 This Week

Last Update: 16 hours ago
See Project
4

EasySpider

A visual no-code/code-free web crawler/spider

A visual code-free/no-code web crawler/spider, supporting both Chinese and English.

Downloads: 12 This Week

Last Update: 2025-01-01
See Project
5

FEAPDER

Powerful Python crawler framework for scalable web scraping tasks

feapder is a Python-based web crawling framework designed to simplify the process of building scalable and efficient web scrapers. It focuses on providing a developer-friendly environment that makes it easier to create, run, and manage crawlers for a variety of data collection tasks. It includes several built-in spider types, such as AirSpider, Spider, TaskSpider, and BatchSpider, which address different crawling scenarios ranging from lightweight scraping to distributed and batch-based...

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
6

Scrapy-Redis

Redis-based components for Scrapy

You can start multiple spider instances that share a single Redis queue. Best suited for broad multi-domain crawls. Scraped items get pushed into a Redis queue, meaning that you can start as many post-processing processes as needed, all sharing the items queue. Components include a scheduler with duplication filter, an item pipeline, and base spiders. The default requests serializer is pickle, but it can be changed to any module with loads and dumps functions. Note that pickle is not compatible between Python versions. Version...

Downloads: 0 This Week

Last Update: 2024-07-06
See Project
7

DB Browser for SQLite

The DB Browser for SQLite

DB Browser for SQLite (DB4S) is a high quality, visual, open source tool to create, design, and edit database files compatible with SQLite. DB4S is for users and developers who want to create, search, and edit databases. DB4S uses a familiar spreadsheet-like interface, and complicated SQL commands do not have to be learned. This program is not a visual shell for the sqlite command line tool, and does not require familiarity with SQL commands. It is a tool to be used by both developers and...

71 Reviews

Downloads: 89 This Week

Last Update: 2025-05-03
See Project
8

Grab Framework Project

Web Scraping Framework

Grab is a Python framework for building web scrapers. With Grab you can build web scrapers of various complexity, from simple 5-line scripts to complex asynchronous website crawlers processing millions of web pages. Grab provides an API for performing network requests and for handling the received content, e.g. interacting with the DOM tree of the HTML document. The single request/response API allows you to build a network request, perform it, and work with the received content. The API is...

Downloads: 0 This Week

Last Update: 2025-09-18
See Project
9

Scrapling

An adaptive Web Scraping framework

Scrapling is an adaptive web scraping framework designed to handle everything from a single HTTP request to large-scale, concurrent crawls. Built for modern websites, it intelligently adapts to structural changes by automatically relocating elements when page layouts update. The framework includes advanced fetchers capable of bypassing anti-bot protections such as Cloudflare Turnstile using stealth and browser automation techniques. Its powerful spider system supports multi-session crawling,...

Downloads: 2 This Week

Last Update: 2026-03-08
See Project
10

DotnetSpider

Lightweight .NET framework for fast web crawling and data scraping

DotnetSpider is a web crawling and data extraction framework built on the .NET Standard platform. It is designed to help developers create efficient and scalable crawlers for collecting structured data from websites. It provides a high-level API that simplifies the process of defining spiders, managing requests, and extracting content from web pages. Developers can create custom spiders by extending base classes and configuring pipelines that handle downloading, parsing, and storing...

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
11

GitHub Actions for Firebase

GitHub Action for interacting with Firebase

This Action for firebase-tools enables arbitrary actions with the firebase command-line client. Starting with version v2.1.2, each version release will point to a versioned Docker image, allowing for hardening our pipeline (so things don't break when I do something dumb). On top of this, you can also point to a master version if you would like to test out what might not be deployed into a release yet. If you want to add a message to a deployment (e.g. the Git commit message) you need to take...

Downloads: 1 This Week

Last Update: 2026-03-19
See Project
12

req

Simple Go HTTP client with Black Magic

Simple and easy to use, providing rich client-level and request-level settings, all of which are intuitive and chainable methods. Provides powerful and convenient debug utilities, including debug logs, performance traces, and even dumps of the complete request and response content. API testing can be done with minimal code, with no need to explicitly create a Request or Client, or even to handle errors. Detects and decodes to UTF-8 automatically where possible to avoid garbled characters (See Auto...

Downloads: 0 This Week

Last Update: 2025-12-16
See Project
13

python-fxxk-spider

Collection of 100+ Python web scraping projects and crawler examples

python-fxxk-spider is a curated collection of Python web scraping and crawler projects gathered in a single repository for reference and learning. It aggregates many independent scraping examples that target a wide range of websites, online services, and public data sources. Instead of being a single crawler tool, it functions as a catalog of ready-made Python spider implementations that demonstrate different scraping techniques. python-fxxk-spider includes scrapers for social media,...

Downloads: 3 This Week

Last Update: 16 hours ago
See Project
14

Web Spider, Web Crawler, Email Extractor

Free tool that extracts emails, phone numbers, and custom text from the web using Java regex

In Files there is WebCrawlerMySQL.jar, which supports a MySQL connection. Free web spider and crawler. Extracts information from the web by parsing millions of pages. Data is stored in a Derby database and is not lost if the spider is force-closed.
- Free web spider, parser, extractor, crawler
- Extraction of emails, phone numbers, and custom text from the web
- Export to Excel file
- Data saved into Derby and MySQL databases
- Written in Java, cross-platform
Also see Free email Sender :...

Downloads: 2 This Week

Last Update: 2025-11-23
See Project
15

AutoWikiBrowser

AutoWikiBrowser is a semi-automated Wikipedia editor, designed to make tedious, repetitive tasks quicker and easier. For more information, see the project homepage at http://en.wikipedia.org/wiki/Wikipedia:AutoWikiBrowser.

6 Reviews

Downloads: 79 This Week

Last Update: 2026-02-11
See Project
16

ahCrawler

A PHP search engine for your website and web analytics tool. GNU GPL3

ahCrawler is a set of tools for implementing your own search on your website and an analyzer for your web content. It can be used on shared hosting. It consists of
* crawler (spider) and indexer
* search for your website(s)
* search statistics
* website analyzer (http header, short titles and keywords, linkchecker, ...)
You need to install it on your own server, so all crawled data stays in your environment. You never know when an external web spider last updated your content. Trigger a rescan...

1 Review

Downloads: 1 This Week

Last Update: 2025-12-11
See Project
17

溫度日記 Hearty Journal

A soothing mood diary app

Hearty Journal is a beautiful diary and personal journal application with a focus on privacy. Securely record your thoughts, feelings, ideas and private moments with the ease of writing on a pad of paper. Its aesthetic looks like a piece of notebook paper with handwritten words on it. Also, beautiful themes, lovely journal stickers and luxury fonts are available in the app. Hearty Journal works on both your computer and phone (Windows, macOS, iOS and Android are supported). Moreover, to keep...

1 Review

Downloads: 0 This Week

Last Update: 2025-06-15
See Project
18

Crawlab

Distributed web crawler admin platform for spiders management

Golang-based distributed web crawler management platform, supporting various languages including Python, NodeJS, Go, Java, and PHP, and various web crawler frameworks including Scrapy, Puppeteer, and Selenium. Use docker-compose for one-click startup; doing so means you don't even have to configure a MongoDB database. The frontend app interacts with the master node, which communicates with other components such as MongoDB, SeaweedFS and worker nodes. Master node and worker nodes communicate...

Downloads: 0 This Week

Last Update: 2023-07-26
See Project
19

Spider-Search

Search multiple engines for a specific string

Downloads: 0 This Week

Last Update: 2023-04-17
See Project
20

Easyspider - Distributed Web Crawler

Easy Spider is a distributed Perl Web Crawler Project from 2006

Easy Spider is a distributed Perl Web Crawler Project from 2006. It features code for crawling web pages, distributing them to a server, and generating XML files from them. The client side can be any computer (Windows or Linux), and the server stores all data. Websites that use EasySpider Crawling for Article Writing...

1 Review

Downloads: 0 This Week

Last Update: 2025-03-16
See Project
21

crawly

High-level web crawling and scraping framework for Elixir apps

Crawly is a high-level application framework for crawling websites and extracting structured data using the Elixir programming language. It provides a complete environment for building web crawlers that systematically visit pages, collect information, and transform that data into structured formats for further processing. Crawly is designed for tasks such as data mining, information processing, and building historical archives of web content. Crawly follows the Elixir and OTP architecture...

Downloads: 0 This Week

Last Update: 2026-03-11
See Project
22

Web Spider, Web Crawler, Email Extractor

Free tool that extracts emails, phone numbers, and custom text from the web using Java regex

In Files there is WebCrawlerMySQL.jar, which supports a MySQL connection. Please follow this link to get the latest version: https://sourceforge.net/projects/web-spider-web-crawler-extract/ Free web spider and crawler. Extracts information from the web by parsing millions of pages. Data is stored in a Derby or MySQL database and is not lost if the spider is force-closed.
- Free web spider, parser, extractor, crawler
- Extraction of emails, phone numbers, and custom text from the web
- Export...

3 Reviews

Downloads: 1 This Week

Last Update: 2022-12-24
See Project
23

Orao Basket

Programming tools for an emulator of the eight-bit computer ORAO

Smederevo, 5 August 2018. A long time ago, around 1986, I became the proud owner of an eight-bit computer, the ORAO, based on the MOS 6502 processor. It was the first and, for me, the best home computer at that time. My whole knowledge of computer programming began with that computer. Recently, for some unknown reason, I became interested in old eight-bit computers again. After a short search on the Internet, I found an emulator of my favorite computer. It literally emulates every piece of hardware...

Downloads: 0 This Week

Last Update: 2023-02-19
See Project
24

sposkpat2

sposkpat2, Single Purpose Operating System Kpat Live Distro

...Please give it a try. 12 card games are included: Aces Up, Forty & Eight, Freecell, Golf, Grandfather, Grandfather's Clock, Gypsy, Klondike, Mod3, Simple Simon, Spider, and Yukon. A safe and silent way to play a card game: blocked from all networks, including the internet. Disks are spun down for quietness and energy saving. No distractions, no nags, ever. Open source. Now for displays up to 4K. Made possible by Debian (made on buster for bullseye) and KDE's kpat. Boots from CD/DVD, USB stick, and inside virtual machines such as QEMU. ...

Downloads: 0 This Week

Last Update: 2022-11-16
See Project
25

rubywebcrawler

Web spider software written in Ruby

Downloads: 0 This Week

Last Update: 2021-12-24
See Project

© 2026 Slashdot Media. All Rights Reserved.
