NAME

ronin-web-spider - Spiders a website

SYNOPSIS

ronin-web spider [options] {--host HOST | --domain DOMAIN | --site URL}

DESCRIPTION

Spiders a website.

OPTIONS

--host HOST
Spiders the specific HOST.
--domain DOMAIN
Spiders the whole DOMAIN.
--site URL
Spiders the website, starting at the URL.
--open-timeout SECS
Sets the connection open timeout.
--read-timeout SECS
Sets the read timeout.
--ssl-timeout SECS
Sets the SSL connection timeout.
--continue-timeout SECS
Sets the continue timeout.
--keep-alive-timeout SECS
Sets the connection keep alive timeout.
-P, --proxy PROXY
Sets the proxy to use.
-H, --headerNAME: VALUE
Sets a default header.
--host-header NAME=VALUE
Sets a default header.
-u, --user-agent chrome-linux|chrome-macos|chrome-windows|chrome-iphone|chrome-ipad|chrome-android|firefox-linux|firefox-macos|firefox-windows|firefox-iphone|firefox-ipad|firefox-android|safari-macos|safari-iphone|safari-ipad|edge
The User-Agent to use.
-U, --user-agent-string STRING
The raw User-Agent string to use.
-R, --referer URL
Sets the Referer URL.
--delay SECS
Sets the delay in seconds between each request.
-l, --limit COUNT
Only spiders up to COUNT pages.
-d, --max-depth DEPTH
Only spiders up to max depth.
--enqueue URL
Adds the URL to the queue.
--visited URL
Marks the URL as previously visited.
--strip-fragments
Enables/disables stripping the fragment component of every URL.
--strip-query
Enables/disables stripping the query component of every URL.
--visit-scheme SCHEME
Visit URLs with the URI scheme.
--visit-schemes-like /REGEX/
Visit URLs with URI schemes that match the REGEX.
--ignore-scheme SCHEME
Ignore URLs with the URI scheme.
--ignore-schemes-like /REGEX/
Ignore URLs with URI schemes matching the REGEX.
--visit-host HOST
Visit URLs with the matching host name.
--visit-hosts-like /REGEX/
Visit URLs with hostnames that match the REGEX.
--ignore-host HOST
Ignore the host name.
--ignore-hosts-like /REGEX/
Ignore the host names matching the REGEX.
--visit-port PORT
Visit URLs with the matching port number.
--visit-ports-like /REGEX/
Visit URLs with port numbers that match the REGEX.
--ignore-port PORT
Ignore the port number.
--ignore-ports-like /REGEX/
Ignore the port numbers matching the REGEXP.
--visit-link URL
Visit the URL.
--visit-links-like /REGEX/
Visit URLs that match the REGEX.
--ignore-link URL
Ignore the URL.
--ignore-links-like /REGEX/
Ignore URLs matching the REGEX.
--visit-ext FILE_EXT
Visit URLs with the matching file ext.
--visit-exts-like /REGEX/
Visit URLs with file exts that match the REGEX.
--ignore-ext FILE_EXT
Ignore the URLs with the file ext.
--ignore-exts-like /REGEX/
Ignore URLs with file exts matching the REGEX.
-r, --robots
Specifies whether to honor robots.txt.
--print-status
Print the status codes for each URL.
--print-headers
Print response headers for each URL.
--print-header NAME
Prints a specific header.
--history FILE
Sets the history file to write every visited URL to.
--archive DIR
Archive every visited page to the DIR.
--git-archive DIR
Archive every visited page to the git repository.
-X, --xpath XPATH
Evaluates the XPath on each HTML page.
-C, --css-path XPATH
Evaluates the CSS-path on each HTML page.
--print-hosts
Print all discovered hostnames.
--print-certs
Print all encountered SSL/TLS certificates.
--save-certs
Saves all encountered SSL/TLS certificates.
--print-js-strings
Print all JavaScript strings.
--print-js-url-strings
Print URL strings found in JavaScript.
--print-js-path-strings
Print path strings found in JavaScript.
--print-js-absolute-path-strings
Only print absolute path strings found in JavaScript.
--print-js-relative-path-strings
Only print relative path strings found in JavaScript.
--print-html-comments
Print HTML comments.
--print-js-comments
Print JavaScript comments.
--print-comments
Print all HTML and JavaScript comments.
-v, --verbose
Enables verbose output.
-h, --help
Print help information.

ENVIRONMENT

HTTP_PROXY
Sets the global HTTP proxy.
RONIN_HTTP_PROXY
Sets the HTTP proxy for Ronin.

AUTHOR

Postmodern postmodern.mod3@gmail.com

SEE ALSO

ronin-web-server ronin-web-proxy ronin-web-diff ronin-web-new-spider