Goutte is a screen scraping and web crawling library for PHP. It provides a simple API for crawling websites and extracting data from HTML/XML responses. Goutte is a thin wrapper around the following Symfony Components: BrowserKit, CssSelector, DomCrawler, and HttpClient.

Goutte requires PHP 7.1+. To install it, add fabpot/goutte as a require dependency in your composer.json file.

To use it, create a Goutte Client instance (the Client class extends Symfony\Component\BrowserKit\HttpBrowser) and make requests with its request() method, which returns a Crawler object (Symfony\Component\DomCrawler\Crawler).

To use your own HTTP settings, create an HttpClient instance and pass it to Goutte. This is how you would add, for example, a 60-second request timeout. Read the documentation of the BrowserKit, DomCrawler, and HttpClient Symfony Components for more information about what you can do with Goutte.
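The basic usage described above can be sketched roughly as follows. This is a minimal illustration, assuming Goutte was installed through Composer and that the autoloader lives at vendor/autoload.php; the URL and the 'title' selector are placeholders chosen for the example, not part of the original text.

```php
<?php
// Assumption: Goutte was installed with Composer (for example,
// `composer require fabpot/goutte`), so the autoloader is available here.
require __DIR__ . '/vendor/autoload.php';

use Goutte\Client;
use Symfony\Component\HttpClient\HttpClient;

// A client with default HTTP settings.
$client = new Client();

// Alternatively, pass your own HttpClient instance to customize HTTP
// settings, such as a 60-second request timeout.
$client = new Client(HttpClient::create(['timeout' => 60]));

// request() returns a Symfony\Component\DomCrawler\Crawler object.
// The URL is a placeholder for this sketch.
$crawler = $client->request('GET', 'https://example.com/');

// Print the text of the page's <title> element.
echo $crawler->filter('title')->text(), "\n";
```

Passing the HttpClient through the constructor keeps all HTTP-level configuration in one place, so the rest of the scraping code does not need to know about timeouts or similar settings.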

Features

  • Screen scraping and web crawling library
  • Make requests
  • Click on links
  • Extract data
  • Submit forms
  • Depends on PHP 7.1+
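
The features listed above might be combined along these lines. This is only a sketch: the URL, the link text 'More information', the 'h2 > a' selector, the 'Sign in' button label, and the form field names are all hypothetical and would have to match the page you are actually scraping.

```php
<?php
// Assumption: Goutte was installed with Composer.
require __DIR__ . '/vendor/autoload.php';

use Goutte\Client;

$client = new Client();

// Make requests (placeholder URL).
$crawler = $client->request('GET', 'https://example.com/');

// Click on links: select a link by its visible text, then follow it.
// link() throws an InvalidArgumentException if no such link exists.
$link = $crawler->selectLink('More information')->link();
$crawler = $client->click($link);

// Extract data: print the text of every node matching a CSS selector.
$crawler->filter('h2 > a')->each(function ($node) {
    echo $node->text(), "\n";
});

// Submit forms: find the form by its submit button, then send it with
// field values (the field names here are hypothetical).
$form = $crawler->selectButton('Sign in')->form();
$crawler = $client->submit($form, ['login' => 'user', 'password' => 'secret']);
```

Each call to click() or submit() returns a new Crawler for the resulting page, which is why the sketch keeps reassigning $crawler.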

License

MIT License

Additional Project Details

Programming Language

PHP

Related Categories

PHP Browsers, PHP Libraries, PHP Web Scrapers

Registered

2021-07-06