Passive Information Gathering
Passive information is information gathered without interacting with the target.
Whois
The Internet Corporation for Assigned Names and Numbers (ICANN) requires that accredited registrars enter the holder's contact information, the domain's creation and expiration dates, and other information in the Whois database immediately after registering a domain. In simple terms, the Whois database is a searchable list of all domains currently registered worldwide.
WHOIS itself is a TCP-based, transaction-oriented query/response protocol listening on TCP port 43 by default.
We can use it for querying databases containing domain names, IP addresses, or autonomous systems and provide information services to Internet users.
WHOIS lookups were initially performed using command-line tools. Nowadays, many web-based tools exist (for example, https://whois.domaintools.com/), but command-line options often give us the most control over our queries and help filter and sort the resultant output.
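Because WHOIS is just a plain-text protocol on TCP port 43, we can also talk to a registry WHOIS server directly. A minimal sketch (assuming netcat is installed and using whois.verisign-grs.com, the Verisign WHOIS server for .com domains):
echo "facebook.com" | nc whois.verisign-grs.com 43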
Commands:
whois <Domain Name>
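Example (using the demo domain referenced throughout these notes):
whois facebook.com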
Information Gathered
Organisation
Locations
Domain Email address
Registrar Email address
Phone number
Language: English (US)
Registrar
New Domain
DNSSEC
Name servers
DNS
DNS converts domain names to IP addresses, allowing browsers to access resources on the Internet.
Each Internet-connected device has a unique IP address that other machines use to locate it.
DNS servers minimize the need for people to memorize IP addresses such as 104.17.42.72 in IPv4, or longer alphanumeric addresses such as 2606:4700::6811:2b48 in IPv6. When a user types www.facebook.com into their web browser, a translation must occur between what the user types and the IP address required to reach the www.facebook.com webpage.
Why DNS
It allows names to be used instead of numbers to identify hosts.
It is a lot easier to remember a name than it is to recall a number.
A server can change its numeric address without having to notify everyone on the Internet; the name is simply re-pointed to the new address.
A single name might refer to several hosts, splitting the workload between different servers (see the example below).
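To see the last point in practice, we can resolve a name and check how many addresses are currently returned; a minimal sketch (the exact answers depend on the resolver and your location):
dig +short a www.facebook.com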
DNS Hierarchy
There is a hierarchy of names in the DNS structure.
The top level is the root, which is designated by a dot (.).
The second level is the Top-Level Domains (TLDs); the TLD is the last portion of a hostname (in www.facebook.com, the TLD is com).
After the top-level domains comes the domain of the organization (e.g., facebook, google).
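We can watch this hierarchy being walked by letting dig trace a resolution from the root servers down to the authoritative name servers; a minimal sketch (output varies by resolver and location):
dig +trace www.facebook.com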
Resource Records
Resource Records are the results of DNS queries and have the following structure:
Resource Record
A domain name, usually a fully qualified domain name, is the first part of a Resource Record. If you don't use a fully qualified domain name, the zone's name where the record is located will be appended to the end of the name.
TTL
The Time-To-Live (TTL), in seconds, defaults to the minimum value specified in the SOA record.
Record Class
Internet, Hesiod, or Chaos
Start Of Authority (SOA)
It should be first in a zone file because it indicates the start of a zone. Each zone can only have one SOA record, and it additionally contains the zone's values, such as a serial number and multiple expiration timeouts.
Name Servers (NS)
The distributed database is bound together by NS records. They identify a zone's authoritative name servers and delegate authority for a child zone to its name servers.
IPv4 Addresses (A)
The A record is only a mapping between a hostname and an IP address. 'Forward' zones are those with A records.
Pointer (PTR)
The PTR record is a mapping between an IP address and a hostname. 'Reverse' zones are those that have PTR records.
Canonical Name (CNAME)
An alias hostname is mapped to an A record hostname using the CNAME record.
Mail Exchange (MX)
The MX record identifies a host that will accept email for a specific host. The specified host has an assigned priority value. Multiple MX records can exist for the same host, and together they form a prioritized list of mail exchangers for that host.
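To see the Resource Record structure described above, we can ask dig to print only the answer section; each answer line shows the name, TTL, class (usually IN for Internet), record type, and record data. A minimal sketch:
dig +noall +answer mx facebook.com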
Nslookup and DIG
With Nslookup and DIG, we can search for domain name servers on the Internet and ask them for information about hosts and domains.
Nslookup has two modes: interactive and non-interactive.
We can query A records by just submitting a domain name (e.g., nslookup google.com), but we can also use the -query parameter to search for specific resource records.
We can also specify a nameserver if needed by adding @<Name Server/IP Address of Name Server> to the command (e.g., dig facebook.com @1.1.1.1).
Commands:
We can specify a particular record type, or we can use ANY to get all the data available:
nslookup <-query=A/PTR/ANY/TXT/MX/...> <Domain Name>
Example:
nslookup -query=PTR 31.13.92.36
Example:
nslookup -query=ANY google.com
The same goes for DIG:
dig <a/txt/mx/any/...> <Domain Name>
Example:
dig mx facebook.com
Example:
dig any cloudflare.com
Organizations are given IP addresses on the Internet, but they aren't always their owners. They might rely on ISPs and hosting providers that lease smaller netblocks to them.
We can combine some of the results gathered via nslookup with the whois database to determine if our target organization uses hosting providers.
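A rough sketch of that workflow: resolve the target first, then run the resolved address through whois to check the netblock owner (the IP below is just the example address used earlier in these notes):
nslookup -query=A facebook.com
whois 31.13.92.36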
Passive Subdomain Enumeration
Subdomain enumeration refers to mapping all available subdomains within a domain name.
Virus Total
VirusTotal maintains its own DNS replication service, which is built by preserving the DNS resolutions made when users visit URLs submitted to it.
To receive information about a domain, type the domain name into the search bar and click on the "Relations" tab.
Certificates
Another interesting source of information we can use to extract subdomains is SSL/TLS certificates.
The main reason is Certificate Transparency (CT), a project that requires every SSL/TLS certificate issued by a Certificate Authority (CA) to be published in a publicly accessible log.
We can examine these CT logs using publicly accessible resources; one of the primary ones is https://crt.sh.
We can use this command to get the results in a more organized way:
curl -s "https://crt.sh/?q=<Domain Name>&output=json" | jq -r '.[] | "(.name_value)\n(.common_name)"' | sort -u > "<Domain Name>_crt.sh.txt"
We can also do the enumeration manually using this command:
openssl s_client -ign_eof 2>/dev/null <<<$'HEAD / HTTP/1.0\r\n\r' -connect "<Domain>:<Port Number (443)>" | openssl x509 -noout -text -in - | grep 'DNS' | sed -e 's|DNS:|\n|g' -e 's|^\*.*||g' | tr -d ',' | sort -u
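Example (again against the demo domain, on the default HTTPS port 443):
openssl s_client -ign_eof 2>/dev/null <<<$'HEAD / HTTP/1.0\r\n\r' -connect "facebook.com:443" | openssl x509 -noout -text -in - | grep 'DNS' | sed -e 's|DNS:|\n|g' -e 's|^\*.*||g' | tr -d ',' | sort -u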
Automating Passive Subdomain Enumeration
TheHarvester
Simple-to-use yet powerful and effective tool for early-stage penetration testing and red team engagements.
The tool collects emails, names, subdomains, IP addresses, and URLs from various public data sources for passive information gathering.
There are many modules that can be used with the tool. To automate the process of running multiple modules, we can use this command:
cat sources.txt | while read source; do theHarvester -d "<Domain>" -b $source -f "${source}_<Domain>";done
Example:
cat sources.txt | while read source; do theHarvester -d "facebook.com" -b $source -f "${source}_facebook.com";done
where sources.txt contains the module (source) names, one per line.
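A minimal sketch of such a sources.txt (module names vary between theHarvester versions, so adjust the list to whatever theHarvester -h shows on your install):
cat << EOF > sources.txt
baidu
bufferoverun
crtsh
hackertarget
otx
rapiddns
urlscan
EOF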
Passive Infrastructure Identification
Netcraft
Netcraft can offer us information about the servers without even interacting with them, and this is something valuable from a passive information gathering point of view.
We can use the service by visiting https://sitereport.netcraft.com and entering the target domain.
Some interesting details we can observe from the report are:
Background: General information about the domain, including the date it was first seen by Netcraft crawlers.
Network: Information about the netblock owner, hosting company, nameservers, etc.
Hosting history: Latest IPs used, webserver, and target OS.
We need to pay special attention to the latest IPs used. Sometimes we can spot the actual IP address of the webserver from before it was placed behind a load balancer, web application firewall, or IDS, allowing us to connect directly to it if the configuration allows it.
Wayback Machine
The Internet Archive is an American digital library that provides free public access to digitized materials, including websites, collected automatically via its web crawlers.
We can access several versions of these websites using the Wayback Machine to find old versions that may have interesting comments in the source code or files that should not be there.
This tool can be used to find older versions of a website at a point in time.
We can also use the tool waybackurls to inspect URLs saved by Wayback Machine and look for specific keywords.
We can install the tool using Go:
go install github.com/tomnomnom/waybackurls@latest
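A sketch of typical usage (the -dates option, if available in your waybackurls version, prefixes each URL with the date it was fetched):
waybackurls -dates https://facebook.com > waybackurls.txt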