Class: Ronin::Recon::Web::Spider

Inherits:
Ronin::Recon::WebWorker
Defined in:
lib/ronin/recon/builtin/web/spider.rb

Overview

A recon worker that spiders a website.

Constant Summary

Constants included from Mixins::HTTP

Mixins::HTTP::VALID_STATUS_CODES

Instance Method Summary

Methods inherited from Ronin::Recon::Worker

accepts, concurrency, #initialize, intensity, outputs, register, run

Constructor Details

This class inherits a constructor from Ronin::Recon::Worker

Instance Method Details

#process(website) {|url| ... } ⇒ Object

Spiders a website and yields a URL for every spidered page whose HTTP status code is in VALID_STATUS_CODES.

Parameters:

  • website

    The website to start spidering from.

Yields:

  • (url)

    Every spidered URL will be yielded.

Yield Parameters:

  • url

    A spidered URL, carrying the page's status code, headers, and body.

# File 'lib/ronin/recon/builtin/web/spider.rb', line 60

def process(website)
  base_uri = website.to_uri

  Ronin::Web::Spider.site(base_uri) do |agent|
    # yield a URL value for every page with an accepted HTTP status code
    agent.every_page do |page|
      if VALID_STATUS_CODES.include?(page.code)
        yield URL.new(page.url, status:  page.code,
                                headers: page.headers,
                                body:    page.body)
      end
    end

    # enqueue absolute http:// or https:// URLs found in JavaScript strings
    agent.every_javascript_url_string do |url,page|
      uri = URI.parse(url)

      case uri
      when URI::HTTP
        agent.enqueue(uri)
      end
    rescue URI::InvalidURIError
      # ignore invalid URIs
    end

    # resolve path strings found in JavaScript against the page's URL,
    # then enqueue the ones that resolve to http:// or https:// URIs
    agent.every_javascript_path_string do |path,page|
      if (uri = page.to_absolute(path))
        case uri
        when URI::HTTP
          agent.enqueue(uri)
        end
      end
    end
  end
end
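
The two JavaScript callbacks above only enqueue strings that turn out to be HTTP or HTTPS URIs. The following is a minimal sketch of that filtering step using only Ruby's standard `uri` library. The helper names `http_uri_or_nil` and `absolute_http_uri` are hypothetical and are not part of ronin-recon, and `URI.join` is used here as a simplified stand-in for `page.to_absolute`, which may behave differently.

```ruby
require 'uri'

# Hypothetical helper (not part of ronin-recon): returns a URI object only
# when the string parses as an absolute http:// or https:// URL.
def http_uri_or_nil(string)
  uri = URI.parse(string)

  # URI::HTTPS is a subclass of URI::HTTP, so this accepts both schemes,
  # just like `when URI::HTTP` in the method above.
  uri.is_a?(URI::HTTP) ? uri : nil
rescue URI::InvalidURIError
  # unparseable strings are ignored
  nil
end

# Hypothetical helper: resolves a path against the URL of the page it was
# found on. URI.join is an assumption standing in for page.to_absolute.
def absolute_http_uri(page_url, path)
  uri = URI.join(page_url, path)

  uri.is_a?(URI::HTTP) ? uri : nil
rescue URI::InvalidURIError
  nil
end

http_uri_or_nil("https://example.com/app.js") # kept: parses as URI::HTTPS
http_uri_or_nil("mailto:admin@example.com")   # dropped: parses as URI::MailTo
absolute_http_uri("https://example.com/app/index.html", "/api/users")
```

Matching on the parsed URI's class rather than on a string prefix means that non-web schemes such as `mailto:` and bare relative strings are filtered out without any scheme-specific string handling.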