Class: Ronin::Web::Spider::Agent
- Inherits:
- Spidr::Agent
  - Object
  - Spidr::Agent
  - Ronin::Web::Spider::Agent
- Defined in:
- lib/ronin/web/spider/agent.rb
Overview
Extends Spidr::Agent.
Instance Attribute Summary
-
#collected_certs ⇒ Array<Ronin::Support::Crypto::Cert>
readonly
All certificates encountered while spidering.
-
#visited_hosts ⇒ Set<String>?
readonly
The visited host names.
Instance Method Summary
-
#every_cert {|cert| ... } ⇒ Object
Passes every unique TLS certificate to the given block and populates #collected_certs.
-
#every_comment {|comment| ... } ⇒ Object
Passes every HTML and JavaScript comment to the given block.
-
#every_favicon {|favicon| ... } ⇒ Object
Passes every favicon from every page to the given block.
-
#every_host {|host| ... } ⇒ Object
Passes every unique host name that the agent visits to the given block and populates #visited_hosts.
-
#every_html_comment {|comment| ... } ⇒ Object
Passes every non-empty HTML comment to the given block.
-
#every_javascript {|js| ... } ⇒ Object
(also: #every_js)
Passes every piece of JavaScript to the given block.
-
#every_javascript_comment {|comment| ... } ⇒ Object
(also: #every_js_comment)
Passes every JavaScript comment to the given block.
-
#every_javascript_string {|string| ... } ⇒ Object
(also: #every_js_string)
Passes every JavaScript string value to the given block.
-
#initialize(proxy: Support::Network::HTTP.proxy, user_agent: Support::Network::HTTP.user_agent, **kwargs) {|agent| ... } ⇒ Agent
constructor
Creates a new spider agent.
Constructor Details
#initialize(proxy: Support::Network::HTTP.proxy, user_agent: Support::Network::HTTP.user_agent, **kwargs) {|agent| ... } ⇒ Agent
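Creates a new spider agent. A proxy given as an Addressable::URI is converted into a Spidr::Proxy, and a user_agent given as a Symbol is looked up in Support::Network::HTTP::UserAgents. All other keyword arguments, and the block, are passed through to Spidr::Agent#initialize.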
# File 'lib/ronin/web/spider/agent.rb', line 96

def initialize(proxy:      Support::Network::HTTP.proxy,
               user_agent: Support::Network::HTTP.user_agent,
               **kwargs,
               &block)
  proxy = case proxy
          when Addressable::URI
            Spidr::Proxy.new(
              host:     proxy.host,
              port:     proxy.port,
              user:     proxy.user,
              password: proxy.password
            )
          else
            proxy
          end

  user_agent = case user_agent
               when Symbol
                 Support::Network::HTTP::UserAgents[user_agent]
               else
                 user_agent
               end

  super(proxy: proxy, user_agent: user_agent, **kwargs,&block)
end
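The argument normalization in the constructor can be tried in isolation. The following is a minimal sketch, not the library's code: `USER_AGENT_ALIASES`, `ProxySettings`, `normalize_user_agent`, and `normalize_proxy` are hypothetical stand-ins, and the standard library's `URI::Generic` is used in place of `Addressable::URI` so that the example runs without any gems.

```ruby
require 'uri'

# Hypothetical stand-in for Support::Network::HTTP::UserAgents; the key and
# value are made up for illustration.
USER_AGENT_ALIASES = {
  example_browser: 'ExampleBrowser/1.0'
}.freeze

# Hypothetical stand-in for Spidr::Proxy.
ProxySettings = Struct.new(:host, :port, :user, :password, keyword_init: true)

# Mirrors the `case user_agent` branch: Symbols are looked up, any other
# value is returned unchanged.
def normalize_user_agent(user_agent)
  case user_agent
  when Symbol then USER_AGENT_ALIASES[user_agent]
  else             user_agent
  end
end

# Mirrors the `case proxy` branch: a URI is unpacked into a settings object,
# any other value is returned unchanged.
def normalize_proxy(proxy)
  case proxy
  when URI::Generic
    ProxySettings.new(
      host:     proxy.host,
      port:     proxy.port,
      user:     proxy.user,
      password: proxy.password
    )
  else
    proxy
  end
end

proxy = normalize_proxy(URI('http://user:secret@proxy.example:8080'))
puts proxy.host
puts normalize_user_agent(:example_browser)
```

Passing values through the `else` branch unchanged is what lets callers keep handing in whatever the parent class already accepts.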
Instance Attribute Details
#collected_certs ⇒ Array<Ronin::Support::Crypto::Cert> (readonly)
All certificates encountered while spidering.
# File 'lib/ronin/web/spider/agent.rb', line 161

def collected_certs
  @collected_certs
end
#visited_hosts ⇒ Set<String>? (readonly)
The visited host names.
# File 'lib/ronin/web/spider/agent.rb', line 127

def visited_hosts
  @visited_hosts
end
Instance Method Details
#every_cert {|cert| ... } ⇒ Object
Passes every unique TLS certificate to the given block and populates #collected_certs.
# File 'lib/ronin/web/spider/agent.rb', line 178

def every_cert
  @collected_certs ||= []

  serials = Set.new

  every_page do |page|
    if page.url.scheme == 'https'
      cert = sessions[page.url].peer_cert

      if serials.add?(cert.serial)
        cert = Support::Crypto::Cert(cert)

        @collected_certs << cert
        yield cert
      end
    end
  end
end
#every_comment {|comment| ... } ⇒ Object
Passes every HTML and JavaScript comment to the given block.
# File 'lib/ronin/web/spider/agent.rb', line 354

def every_comment(&block)
  every_html_comment(&block)
  every_javascript_comment(&block)
end
#every_favicon {|favicon| ... } ⇒ Object
Passes every favicon from every page to the given block.
# File 'lib/ronin/web/spider/agent.rb', line 215

def every_favicon
  every_page do |page|
    yield page if page.icon?
  end
end
#every_host {|host| ... } ⇒ Object
Passes every unique host name that the agent visits to the given block and populates #visited_hosts.
# File 'lib/ronin/web/spider/agent.rb', line 144

def every_host
  @visited_hosts ||= Set.new

  every_page do |page|
    host = page.url.host

    if @visited_hosts.add?(host)
      yield host
    end
  end
end
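The uniqueness check above relies on `Set#add?`, which returns `nil` when the element is already present. A minimal sketch of that pattern follows, assuming a plain list of URL strings in place of the pages the agent would visit; `each_unique_host` is a hypothetical helper, not part of the library.

```ruby
require 'set'
require 'uri'

# Yields each host name the first time it is seen and returns the Set of all
# hosts seen. `urls` stands in for the pages that a spider would visit.
def each_unique_host(urls)
  seen = Set.new

  urls.each do |url|
    host = URI(url).host

    # Set#add? returns nil for duplicates, so repeated hosts are skipped.
    yield host if seen.add?(host)
  end

  seen
end

hosts   = []
visited = each_unique_host(%w[
  https://example.com/
  https://example.com/about
  https://www.example.com/
]) { |host| hosts << host }

puts hosts.inspect
```

Because the set is kept after iteration, it can serve both as the duplicate filter and as the final record of visited hosts, which appears to be the role #visited_hosts plays here.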
#every_html_comment {|comment| ... } ⇒ Object
Passes every non-empty HTML comment to the given block.
# File 'lib/ronin/web/spider/agent.rb', line 238

def every_html_comment
  every_html_page do |page|
    page.doc.xpath('//comment()').each do |comment|
      comment_text = comment.inner_text.strip

      unless comment_text.empty?
        yield comment_text
      end
    end
  end
end
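The strip-then-skip-empty filtering can be illustrated without an HTML parser. The sketch below swaps the XPath `//comment()` query for a deliberately naive regular expression, so it only approximates what a real parser would return; `HTML_COMMENT` and `each_html_comment` are hypothetical names introduced for this example.

```ruby
# Naive stand-in for the XPath query: captures the text between <!-- and -->.
# A real HTML parser handles edge cases that this pattern does not.
HTML_COMMENT = /<!--(.*?)-->/m

# Yields the stripped text of every non-empty comment found in `html`.
def each_html_comment(html)
  html.scan(HTML_COMMENT) do |(text)|
    text = text.strip

    # Whitespace-only and empty comments are dropped, as in the method above.
    yield text unless text.empty?
  end
end

html = '<p>hi</p><!-- TODO: remove debug login --><!--   --><!---->'

found = []
each_html_comment(html) { |comment| found << comment }

puts found.inspect
```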
#every_javascript {|js| ... } ⇒ Object
Also known as: every_js
Passes every piece of JavaScript to the given block.
# File 'lib/ronin/web/spider/agent.rb', line 266

def every_javascript
  # yield inner text of every `<script type="text/javascript">` tag
  # and every `.js` URL.
  every_html_page do |page|
    page.doc.xpath('//script[@type="text/javascript"]').each do |script|
      unless script.inner_text.empty?
        yield script.inner_text
      end
    end
  end

  every_javascript_page do |page|
    yield page.body
  end
end
#every_javascript_comment {|comment| ... } ⇒ Object
Also known as: every_js_comment
Passes every JavaScript comment to the given block.
# File 'lib/ronin/web/spider/agent.rb', line 327

def every_javascript_comment(&block)
  every_javascript do |js|
    js.scan(Support::Text::Patterns::JAVASCRIPT_COMMENT,&block)
  end
end
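Forwarding the block to `String#scan` means every match of the pattern is yielded directly to the caller. A minimal sketch of that idea follows, assuming a simplified comment pattern; `JS_COMMENT` is a hypothetical stand-in and is not the library's `JAVASCRIPT_COMMENT` pattern. In particular, it would wrongly match `//` inside string literals such as URLs.

```ruby
# Simplified, hypothetical pattern: `// ...` to end of line, or `/* ... */`
# spanning any number of lines.
JS_COMMENT = %r{//[^\n]*|/\*.*?\*/}m

# With no capture groups in the pattern, String#scan yields each whole match
# as a String, so the caller's block can be forwarded as-is.
def each_js_comment(js, &block)
  js.scan(JS_COMMENT, &block)
end

js = "var a = 1; // first\n/* second */ var b = 2;"

comments = []
each_js_comment(js) { |comment| comments << comment }

puts comments.inspect
```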
#every_javascript_string {|string| ... } ⇒ Object
Also known as: every_js_string
Passes every JavaScript string value to the given block.
# File 'lib/ronin/web/spider/agent.rb', line 301

def every_javascript_string
  every_javascript do |js|
    js.scan(Support::Text::Patterns::STRING) do |js_string|
      yield Support::Encoding::JS.unquote(js_string)
    end
  end
end
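The scan-then-unquote flow can be sketched in plain Ruby. Everything named below (`JS_STRING`, `ESCAPES`, `unquote_js`, `each_js_string`) is a hypothetical stand-in: the pattern is a simplified guess at what a string-literal matcher looks like, and the unquoting only handles a handful of single-character escapes, not `\uXXXX` or `\xXX` sequences.

```ruby
# Simplified, hypothetical pattern for single- or double-quoted literals that
# may contain backslash escapes.
JS_STRING = /"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*'/

# The few escapes this sketch decodes; any other escaped character is kept
# as the character itself.
ESCAPES = { 'n' => "\n", 't' => "\t", 'r' => "\r" }.freeze

# Drops the surrounding quotes and decodes simple backslash escapes.
def unquote_js(literal)
  literal[1..-2].gsub(/\\(.)/m) { ESCAPES.fetch($1, $1) }
end

# Yields the unquoted value of every string literal found in `js`.
def each_js_string(js)
  js.scan(JS_STRING) do |literal|
    yield unquote_js(literal)
  end
end

js = %q{var s = "a\"b"; var t = 'c';}

strings = []
each_js_string(js) { |string| strings << string }

puts strings.inspect
```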