Minor Lee fixes

Christian Martorella 2018-12-22 21:29:11 +01:00
parent 8b88a662df
commit 53703ffd0f
6 changed files with 471 additions and 473 deletions

102
README.md

@ -6,7 +6,7 @@
* | |_| | | | __/ / __ / (_| | | \ V / __/\__ \ || __/ | *
* \__|_| |_|\___| \/ /_/ \__,_|_| \_/ \___||___/\__\___|_| *
* *
* TheHarvester Ver. 3.0.6 *
* Coded by Christian Martorella *
* Edge-Security Research *
* cmartorella@edge-security.com *
@ -15,78 +15,83 @@
What is this?
-------------

theHarvester is a tool used for gathering names, emails, subdomains, virtual
hosts, open ports/banners, and employee names from different public sources
(search engines, PGP key servers). A really simple, but very effective tool for
the early stages of a penetration test or just to know the visibility of your
company on the internet.

The data sources include:

Passive:
--------
* baidu: Baidu search engine
* bing: Microsoft search engine - www.bing.com
* bingapi: Microsoft search engine, through the API (Requires API key, see below.)
* censys: Censys search engine - www.censys.io
* crtsh: Comodo Certificate search - www.crt.sh
* cymon: Cymon open threat intelligence - www.cymon.io
* dogpile: Dogpile search engine - www.dogpile.com
* google: Google search engine - www.google.com (Optional Google dorking.)
* googleCSE: Google custom search engine
* googleplus: Users that work in target company (Uses Google search.)
* google-certificates: Google Certificate Transparency report
* google-profiles: Google search engine, specific search for Google profiles
* hunter: Hunter search engine (Requires API key, see below.)
* linkedin: Google search engine, specific search for Linkedin users
* netcraft: Netcraft Data Mining
* pgp: PGP key server - mit.edu
* shodan: Shodan search engine, will search for ports and banners from discovered hosts - www.shodanhq.com
* threatcrowd: Open source threat intelligence - www.threatcrowd.org
* trello: Trello boards (Uses Google search.)
* twitter: Twitter accounts related to a specific domain (Uses Google search.)
* vhost: Bing virtual hosts search
* virustotal: VirusTotal domain search
* yahoo: Yahoo search engine
* all: run all of the sources above

Active:
-------
* DNS brute force: dictionary brute force enumeration
* DNS reverse lookup: reverse lookup of IPs discovered in order to find hostnames (see the sketch below)
* DNS TLD expansion: TLD dictionary brute force enumeration
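
Of the active techniques, the reverse lookup is the simplest to picture: each
discovered IP is resolved back to a hostname. Below is a minimal sketch of that
idea using only Python's standard library; it is an illustration, not
theHarvester's own dnssearch implementation.

```python
# Minimal sketch of a DNS reverse lookup (illustration only).
import socket

def reverse_lookup(ip):
    """Return the PTR hostname for an IP address, or None if it does not resolve."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except (socket.herror, socket.gaierror):
        return None

print(reverse_lookup("8.8.8.8"))  # e.g. 'dns.google'
```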

Modules that require an API key:
--------------------------------
* googleCSE: add your API key and CSE ID to discovery/googleCSE.py
* hunter: add your API key to discovery/huntersearch.py
* shodan: add your API key to discovery/shodansearch.py
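
In each case the key is stored as a plain value inside the corresponding
discovery script. The snippet below is a hypothetical excerpt of what
discovery/shodansearch.py could look like after adding a key; the actual
variable name and client setup used by theHarvester may differ.

```python
# Hypothetical excerpt of discovery/shodansearch.py (variable name assumed).
import shodan

SHODAN_API_KEY = "YOUR-API-KEY-HERE"  # paste the key from your Shodan account
api = shodan.Shodan(SHODAN_API_KEY)   # client used for the host/port lookups
```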

Dependencies:
-------------
* Requests library (http://docs.python-requests.org/en/latest/)
  `pip install requests`
* Beautiful Soup 4 (https://pypi.org/project/beautifulsoup4/)
  `pip install beautifulsoup4`
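
theHarvester.py checks for both libraries at start-up (see the import guards
further down in this commit). A quick way to confirm they are installed is to
run the same checks yourself:

```python
# Quick dependency check mirroring the import guards in theHarvester.py.
try:
    import requests
    import bs4
    print("requests", requests.__version__, "- beautifulsoup4", bs4.__version__)
except ImportError as e:
    print("Missing dependency:", e)
```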

Changelog in 3.0.0:
------------------
@ -96,9 +101,8 @@ Changelog in 3.0.0:
* Shodan DB search fixed
* Result storage in Sqlite

Comments, bugs, or requests?
----------------------------
cmartorella@edge-security.com

Updates:
@ -109,6 +113,6 @@ Thanks:
-------
* Matthew Brown @NotoriousRebel
* Janos Zold @Jzold
* John Matherly - Shodan project
* Lee Baird @discoverscripts - suggestions and bugs reporting
* Ahmed Aboul Ela - subdomain names dictionaries (big and small)

0
censysparser.py Normal file → Executable file

4
cymonparser.py Normal file → Executable file

@ -6,12 +6,12 @@ class parser:
def __init__(self, results): def __init__(self, results):
self.results = results self.results = results
self.ipaddresses = [] self.ipaddresses = []
self.soup = BeautifulSoup(results.results,features="html.parser") self.soup = BeautifulSoup(results.results, features="html.parser")
def search_ipaddresses(self): def search_ipaddresses(self):
try: try:
tags = self.soup.findAll('td') tags = self.soup.findAll('td')
allip = re.findall( r'[0-9]+(?:\.[0-9]+){3}',str(tags)) allip = re.findall(r'[0-9]+(?:\.[0-9]+){3}',str(tags))
self.ipaddresses = set(allip) self.ipaddresses = set(allip)
return self.ipaddresses return self.ipaddresses
except Exception as e: except Exception as e:

113
myparser.py Normal file → Executable file

@ -1,6 +1,5 @@
import re import re
class parser: class parser:
def __init__(self, results, word): def __init__(self, results, word):
@ -17,11 +16,10 @@ def genericClean(self):
self.results = re.sub('%3a', ' ', self.results) self.results = re.sub('%3a', ' ', self.results)
self.results = re.sub('<strong>', '', self.results) self.results = re.sub('<strong>', '', self.results)
self.results = re.sub('</strong>', '', self.results) self.results = re.sub('</strong>', '', self.results)
self.results = re.sub('<wbr>','',self.results) self.results = re.sub('<wbr>', '', self.results)
self.results = re.sub('</wbr>','',self.results) self.results = re.sub('</wbr>', '', self.results)
for e in ('<', '>', ':', '=', ';', '&', '%3A', '%3D', '%3C', '/', '\\'):
for e in ('>', ':', '=', '<', '/', '\\', ';', '&', '%3A', '%3D', '%3C'):
self.results = self.results.replace(e, ' ') self.results = self.results.replace(e, ' ')
def urlClean(self): def urlClean(self):
@ -37,7 +35,7 @@ def emails(self):
self.genericClean() self.genericClean()
reg_emails = re.compile( reg_emails = re.compile(
# Local part is required, charset is flexible # Local part is required, charset is flexible
# https://tools.ietf.org/html/rfc6531 (removed * and () as they provide FP mostly ) # https://tools.ietf.org/html/rfc6531 (removed * and () as they provide FP mostly )
'[a-zA-Z0-9.\-_+#~!$&\',;=:]+' + '[a-zA-Z0-9.\-_+#~!$&\',;=:]+' +
'@' + '@' +
'[a-zA-Z0-9.-]*' + '[a-zA-Z0-9.-]*' +
@ -58,11 +56,17 @@ def fileurls(self, file):
urls.append(x) urls.append(x)
return urls return urls
def hostnames(self):
self.genericClean()
reg_hosts = re.compile('[a-zA-Z0-9.-]*\.' + self.word)
self.temp = reg_hosts.findall(self.results)
hostnames = self.unique()
return hostnames
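
The hostnames() method above relies on a single pattern: any run of host-name
characters that ends in the target word. A standalone illustration of that
regex, with an assumed word of "example.com", behaves like this:

```python
# Standalone illustration of the hostname regex used by parser.hostnames().
import re

word = "example.com"
results = "See www.example.com and mail.example.com/login for details."
reg_hosts = re.compile(r'[a-zA-Z0-9.-]*\.' + word)  # same pattern as above
print(reg_hosts.findall(results))  # ['www.example.com', 'mail.example.com']
```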
def people_googleplus(self): def people_googleplus(self):
self.results = re.sub('</b>', '', self.results) self.results = re.sub('</b>', '', self.results)
self.results = re.sub('<b>', '', self.results) self.results = re.sub('<b>', '', self.results)
reg_people = re.compile('>[a-zA-Z0-9._ ]* - Google\+') reg_people = re.compile('>[a-zA-Z0-9._ ]* - Google\+')
#reg_people = re.compile('">[a-zA-Z0-9._ -]* profiles | LinkedIn')
self.temp = reg_people.findall(self.results) self.temp = reg_people.findall(self.results)
resul = [] resul = []
for x in self.temp: for x in self.temp:
@ -75,11 +79,44 @@ def people_googleplus(self):
resul.append(y) resul.append(y)
return resul return resul
def hostnames_all(self):
reg_hosts = re.compile('<cite>(.*?)</cite>')
temp = reg_hosts.findall(self.results)
for x in temp:
if x.count(':'):
res = x.split(':')[1].split('/')[2]
else:
res = x.split("/")[0]
self.temp.append(res)
hostnames = self.unique()
return hostnames
def people_jigsaw(self):
res = []
reg_people = re.compile(
"href=javascript:showContact\('[0-9]*'\)>[a-zA-Z0-9., ]*</a></span>")
self.temp = reg_people.findall(self.results)
for x in self.temp:
a = x.split('>')[1].replace("</a", "")
res.append(a)
return res
def people_linkedin(self):
reg_people = re.compile('">[a-zA-Z0-9._ -]* \| LinkedIn')
self.temp = reg_people.findall(self.results)
resul = []
for x in self.temp:
y = x.replace(' | LinkedIn', '')
y = y.replace(' profiles ', '')
y = y.replace('LinkedIn', '')
y = y.replace('"', '')
y = y.replace('>', '')
if y != " ":
resul.append(y)
return resul
def people_twitter(self): def people_twitter(self):
reg_people = re.compile('(@[a-zA-Z0-9._ -]*)') reg_people = re.compile('(@[a-zA-Z0-9._ -]*)')
#reg_people = re.compile('">[a-zA-Z0-9._ -]* profiles | LinkedIn')
self.temp = reg_people.findall(self.results) self.temp = reg_people.findall(self.results)
users = self.unique() users = self.unique()
resul = [] resul = []
@ -93,21 +130,6 @@ def people_twitter(self):
resul.append(y) resul.append(y)
return resul return resul
def people_linkedin(self):
reg_people = re.compile('">[a-zA-Z0-9._ -]* \| LinkedIn')
#reg_people = re.compile('">[a-zA-Z0-9._ -]* profiles | LinkedIn')
self.temp = reg_people.findall(self.results)
resul = []
for x in self.temp:
y = x.replace(' | LinkedIn', '')
y = y.replace(' profiles ', '')
y = y.replace('LinkedIn', '')
y = y.replace('"', '')
y = y.replace('>', '')
if y != " ":
resul.append(y)
return resul
def profiles(self): def profiles(self):
reg_people = re.compile('">[a-zA-Z0-9._ -]* - <em>Google Profile</em>') reg_people = re.compile('">[a-zA-Z0-9._ -]* - <em>Google Profile</em>')
self.temp = reg_people.findall(self.results) self.temp = reg_people.findall(self.results)
@ -120,34 +142,6 @@ def profiles(self):
resul.append(y) resul.append(y)
return resul return resul
def people_jigsaw(self):
res = []
#reg_people = re.compile("'tblrow' title='[a-zA-Z0-9.-]*'><span class='nowrap'/>")
reg_people = re.compile(
"href=javascript:showContact\('[0-9]*'\)>[a-zA-Z0-9., ]*</a></span>")
self.temp = reg_people.findall(self.results)
for x in self.temp:
a = x.split('>')[1].replace("</a", "")
res.append(a)
return res
def hostnames(self):
self.genericClean()
reg_hosts = re.compile('[a-zA-Z0-9.-]*\.' + self.word)
self.temp = reg_hosts.findall(self.results)
hostnames = self.unique()
return hostnames
def urls(self):
#self.genericClean()
#reg_hosts = re.compile("https://"+ self.word +'*[a-zA-Z0-9.-:/]')
#reg_urls = re.compile('https://trello.com'+'[a-zA-Z0-9]+')
found = re.finditer('https://(www\.)?trello.com/([a-zA-Z0-9\-_\.]+/?)*', self.results)
for x in found:
self.temp.append(x.group())
urls = self.unique()
return urls
def set(self): def set(self):
reg_sets = re.compile('>[a-zA-Z0-9]*</a></font>') reg_sets = re.compile('>[a-zA-Z0-9]*</a></font>')
self.temp = reg_sets.findall(self.results) self.temp = reg_sets.findall(self.results)
@ -158,17 +152,12 @@ def set(self):
sets.append(y) sets.append(y)
return sets return sets
def hostnames_all(self): def urls(self):
reg_hosts = re.compile('<cite>(.*?)</cite>') found = re.finditer('https://(www\.)?trello.com/([a-zA-Z0-9\-_\.]+/?)*', self.results)
temp = reg_hosts.findall(self.results) for x in found:
for x in temp: self.temp.append(x.group())
if x.count(':'): urls = self.unique()
res = x.split(':')[1].split('/')[2] return urls
else:
res = x.split("/")[0]
self.temp.append(res)
hostnames = self.unique()
return hostnames
def unique(self): def unique(self):
self.new = [] self.new = []

143
stash.py Normal file → Executable file

@ -13,32 +13,32 @@ def __init__(self):
self.scanstats = [] self.scanstats = []
self.latestscanresults = [] self.latestscanresults = []
self.previousscanresults = [] self.previousscanresults = []
def do_init(self): def do_init(self):
conn = sqlite3.connect(self.db) conn = sqlite3.connect(self.db)
c = conn.cursor() c = conn.cursor()
c.execute ('CREATE TABLE results (domain text, resource text, type text, find_date date, source text)') c.execute('CREATE TABLE results (domain text, resource text, type text, find_date date, source text)')
conn.commit() conn.commit()
conn.close() conn.close()
return return
def store(self,domain, resource,res_type,source): def store(self, domain, resource, res_type, source):
self.domain = domain self.domain = domain
self.resource = resource self.resource = resource
self.type = res_type self.type = res_type
self.source = source self.source = source
self.date = datetime.date.today() self.date = datetime.date.today()
try: try:
conn = sqlite3.connect(self.db) conn = sqlite3.connect(self.db)
c = conn.cursor() c = conn.cursor()
c.execute ('INSERT INTO results (domain,resource, type, find_date, source) VALUES (?,?,?,?,?)',(self.domain,self.resource,self.type,self.date,self.source)) c.execute('INSERT INTO results (domain,resource, type, find_date, source) VALUES (?,?,?,?,?)',
conn.commit() (self.domain, self.resource, self.type, self.date, self.source))
conn.close() conn.commit()
conn.close()
except Exception as e: except Exception as e:
print(e) print(e)
return return
def store_all(self,domain,all,res_type,source): def store_all(self, domain, all, res_type, source):
self.domain = domain self.domain = domain
self.all = all self.all = all
self.type = res_type self.type = res_type
@ -46,52 +46,56 @@ def store_all(self,domain,all,res_type,source):
self.date = datetime.date.today() self.date = datetime.date.today()
for x in self.all: for x in self.all:
try: try:
conn = sqlite3.connect(self.db) conn = sqlite3.connect(self.db)
c = conn.cursor() c = conn.cursor()
c.execute ('INSERT INTO results (domain,resource, type, find_date, source) VALUES (?,?,?,?,?)',(self.domain,x,self.type,self.date,self.source)) c.execute('INSERT INTO results (domain,resource, type, find_date, source) VALUES (?,?,?,?,?)',
(self.domain, x, self.type, self.date, self.source))
conn.commit() conn.commit()
conn.close() conn.close()
except Exception as e: except Exception as e:
print(e) print(e)
return return
def generatedashboardcode(self,domain): def generatedashboardcode(self, domain):
try: try:
self.latestscandomain["domain"] = domain self.latestscandomain["domain"] = domain
conn = sqlite3.connect(self.db) conn = sqlite3.connect(self.db)
c = conn.cursor() c = conn.cursor()
c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="host"''',(domain,)) c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="host"''', (domain,))
data = c.fetchone() data = c.fetchone()
self.latestscandomain["host"] = data[0] self.latestscandomain["host"] = data[0]
c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="email"''',(domain,)) c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="email"''', (domain,))
data = c.fetchone() data = c.fetchone()
self.latestscandomain["email"] = data[0] self.latestscandomain["email"] = data[0]
c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="ip"''',(domain,)) c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="ip"''', (domain,))
data = c.fetchone() data = c.fetchone()
self.latestscandomain["ip"] = data[0] self.latestscandomain["ip"] = data[0]
c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="vhost"''',(domain,)) c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="vhost"''', (domain,))
data = c.fetchone() data = c.fetchone()
self.latestscandomain["vhost"] = data[0] self.latestscandomain["vhost"] = data[0]
c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="shodan"''',(domain,)) c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="shodan"''', (domain,))
data = c.fetchone() data = c.fetchone()
self.latestscandomain["shodan"] = data[0] self.latestscandomain["shodan"] = data[0]
c.execute('''SELECT MAX(find_date) FROM results WHERE domain=?''',(domain,)) c.execute('''SELECT MAX(find_date) FROM results WHERE domain=?''', (domain,))
data = c.fetchone() data = c.fetchone()
self.latestscandomain["latestdate"] = data[0] self.latestscandomain["latestdate"] = data[0]
latestdate = data [0] latestdate = data[0]
c.execute('''SELECT * FROM results WHERE domain=? AND find_date=? AND type="host"''',(domain,latestdate,)) c.execute('''SELECT * FROM results WHERE domain=? AND find_date=? AND type="host"''', (domain, latestdate,))
scandetailshost = c.fetchall() scandetailshost = c.fetchall()
self.latestscandomain["scandetailshost"] = scandetailshost self.latestscandomain["scandetailshost"] = scandetailshost
c.execute('''SELECT * FROM results WHERE domain=? AND find_date=? AND type="email"''',(domain,latestdate,)) c.execute('''SELECT * FROM results WHERE domain=? AND find_date=? AND type="email"''',
(domain, latestdate,))
scandetailsemail = c.fetchall() scandetailsemail = c.fetchall()
self.latestscandomain["scandetailsemail"] = scandetailsemail self.latestscandomain["scandetailsemail"] = scandetailsemail
c.execute('''SELECT * FROM results WHERE domain=? AND find_date=? AND type="ip"''',(domain,latestdate,)) c.execute('''SELECT * FROM results WHERE domain=? AND find_date=? AND type="ip"''', (domain, latestdate,))
scandetailsip = c.fetchall() scandetailsip = c.fetchall()
self.latestscandomain["scandetailsip"] = scandetailsip self.latestscandomain["scandetailsip"] = scandetailsip
c.execute('''SELECT * FROM results WHERE domain=? AND find_date=? AND type="vhost"''',(domain,latestdate,)) c.execute('''SELECT * FROM results WHERE domain=? AND find_date=? AND type="vhost"''',
(domain, latestdate,))
scandetailsvhost = c.fetchall() scandetailsvhost = c.fetchall()
self.latestscandomain["scandetailsvhost"] = scandetailsvhost self.latestscandomain["scandetailsvhost"] = scandetailsvhost
c.execute('''SELECT * FROM results WHERE domain=? AND find_date=? AND type="shodan"''',(domain,latestdate,)) c.execute('''SELECT * FROM results WHERE domain=? AND find_date=? AND type="shodan"''',
(domain, latestdate,))
scandetailsshodan = c.fetchall() scandetailsshodan = c.fetchall()
self.latestscandomain["scandetailsshodan"] = scandetailsshodan self.latestscandomain["scandetailsshodan"] = scandetailsshodan
return self.latestscandomain return self.latestscandomain
@ -100,7 +104,7 @@ def generatedashboardcode(self,domain):
finally: finally:
conn.close() conn.close()
def getlatestscanresults(self,domain,previousday=False): def getlatestscanresults(self, domain, previousday=False):
try: try:
conn = sqlite3.connect(self.db) conn = sqlite3.connect(self.db)
if previousday: if previousday:
@ -109,10 +113,11 @@ def getlatestscanresults(self,domain,previousday=False):
c.execute(''' c.execute('''
SELECT DISTINCT(find_date) SELECT DISTINCT(find_date)
FROM results FROM results
WHERE find_date=date('now', '-1 day') and domain=?''',(domain,)) WHERE find_date=date('now', '-1 day') and domain=?''', (domain,))
previousscandate = c.fetchone() previousscandate = c.fetchone()
if not previousscandate: #when theHarvester runs first time/day this query will return if not previousscandate: # when theHarvester runs first time/day this query will return
self.previousscanresults = ["No results","No results","No results","No results","No results"] self.previousscanresults = ["No results", "No results", "No results", "No results",
"No results"]
else: else:
c = conn.cursor() c = conn.cursor()
c.execute(''' c.execute('''
@ -120,7 +125,7 @@ def getlatestscanresults(self,domain,previousday=False):
FROM results FROM results
WHERE find_date=? and domain=? WHERE find_date=? and domain=?
ORDER BY source,type ORDER BY source,type
''',(previousscandate[0],domain,)) ''', (previousscandate[0], domain,))
results = c.fetchall() results = c.fetchall()
self.previousscanresults = results self.previousscanresults = results
return self.previousscanresults return self.previousscanresults
@ -129,7 +134,7 @@ def getlatestscanresults(self,domain,previousday=False):
else: else:
try: try:
c = conn.cursor() c = conn.cursor()
c.execute('''SELECT MAX(find_date) FROM results WHERE domain=?''',(domain,)) c.execute('''SELECT MAX(find_date) FROM results WHERE domain=?''', (domain,))
latestscandate = c.fetchone() latestscandate = c.fetchone()
c = conn.cursor() c = conn.cursor()
c.execute(''' c.execute('''
@ -137,7 +142,7 @@ def getlatestscanresults(self,domain,previousday=False):
FROM results FROM results
WHERE find_date=? and domain=? WHERE find_date=? and domain=?
ORDER BY source,type ORDER BY source,type
''',(latestscandate[0],domain,)) ''', (latestscandate[0], domain,))
results = c.fetchall() results = c.fetchall()
self.latestscanresults = results self.latestscanresults = results
return self.latestscanresults return self.latestscanresults
@ -146,7 +151,7 @@ def getlatestscanresults(self,domain,previousday=False):
except Exception as e: except Exception as e:
print("Error connecting to theHarvester database: " + str(e)) print("Error connecting to theHarvester database: " + str(e))
finally: finally:
conn.close() conn.close()
def getscanboarddata(self): def getscanboarddata(self):
try: try:
@ -174,37 +179,42 @@ def getscanboarddata(self):
except Exception as e: except Exception as e:
print(e) print(e)
finally: finally:
conn.close() conn.close()
def getscanhistorydomain(self,domain): def getscanhistorydomain(self, domain):
try: try:
conn = sqlite3.connect(self.db) conn = sqlite3.connect(self.db)
c = conn.cursor() c = conn.cursor()
c.execute('''SELECT DISTINCT(find_date) FROM results WHERE domain=?''',(domain,)) c.execute('''SELECT DISTINCT(find_date) FROM results WHERE domain=?''', (domain,))
dates = c.fetchall() dates = c.fetchall()
for date in dates: for date in dates:
c = conn.cursor() c = conn.cursor()
c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="host" AND find_date=?''',(domain,date[0])) c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="host" AND find_date=?''',
(domain, date[0]))
counthost = c.fetchone() counthost = c.fetchone()
c = conn.cursor() c = conn.cursor()
c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="email" AND find_date=?''',(domain,date[0])) c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="email" AND find_date=?''',
(domain, date[0]))
countemail = c.fetchone() countemail = c.fetchone()
c = conn.cursor() c = conn.cursor()
c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="ip" AND find_date=?''',(domain,date[0])) c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="ip" AND find_date=?''',
(domain, date[0]))
countip = c.fetchone() countip = c.fetchone()
c = conn.cursor() c = conn.cursor()
c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="vhost" AND find_date=?''',(domain,date[0])) c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="vhost" AND find_date=?''',
(domain, date[0]))
countvhost = c.fetchone() countvhost = c.fetchone()
c = conn.cursor() c = conn.cursor()
c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="shodan" AND find_date=?''',(domain,date[0])) c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="shodan" AND find_date=?''',
(domain, date[0]))
countshodan = c.fetchone() countshodan = c.fetchone()
results = { results = {
"date" : str(date[0]), "date": str(date[0]),
"hosts" : str(counthost[0]), "hosts": str(counthost[0]),
"email" : str(countemail[0]), "email": str(countemail[0]),
"ip" : str(countip[0]), "ip": str(countip[0]),
"vhost" : str(countvhost[0]), "vhost": str(countvhost[0]),
"shodan" : str(countshodan[0]) "shodan": str(countshodan[0])
} }
self.domainscanhistory.append(results) self.domainscanhistory.append(results)
return self.domainscanhistory return self.domainscanhistory
@ -216,7 +226,7 @@ def getscanhistorydomain(self,domain):
def getpluginscanstatistics(self): def getpluginscanstatistics(self):
try: try:
conn = sqlite3.connect(self.db) conn = sqlite3.connect(self.db)
c = conn.cursor() c = conn.cursor()
c.execute(''' c.execute('''
SELECT domain,find_date, type, source, count(*) SELECT domain,find_date, type, source, count(*)
FROM results FROM results
@ -229,44 +239,47 @@ def getpluginscanstatistics(self):
print(e) print(e)
finally: finally:
conn.close() conn.close()
def latestscanchartdata(self,domain): def latestscanchartdata(self, domain):
try: try:
self.latestscandomain["domain"] = domain self.latestscandomain["domain"] = domain
conn = sqlite3.connect(self.db) conn = sqlite3.connect(self.db)
c = conn.cursor() c = conn.cursor()
c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="host"''',(domain,)) c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="host"''', (domain,))
data = c.fetchone() data = c.fetchone()
self.latestscandomain["host"] = data[0] self.latestscandomain["host"] = data[0]
c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="email"''',(domain,)) c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="email"''', (domain,))
data = c.fetchone() data = c.fetchone()
self.latestscandomain["email"] = data[0] self.latestscandomain["email"] = data[0]
c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="ip"''',(domain,)) c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="ip"''', (domain,))
data = c.fetchone() data = c.fetchone()
self.latestscandomain["ip"] = data[0] self.latestscandomain["ip"] = data[0]
c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="vhost"''',(domain,)) c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="vhost"''', (domain,))
data = c.fetchone() data = c.fetchone()
self.latestscandomain["vhost"] = data[0] self.latestscandomain["vhost"] = data[0]
c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="shodan"''',(domain,)) c.execute('''SELECT COUNT(*) from results WHERE domain=? AND type="shodan"''', (domain,))
data = c.fetchone() data = c.fetchone()
self.latestscandomain["shodan"] = data[0] self.latestscandomain["shodan"] = data[0]
c.execute('''SELECT MAX(find_date) FROM results WHERE domain=?''',(domain,)) c.execute('''SELECT MAX(find_date) FROM results WHERE domain=?''', (domain,))
data = c.fetchone() data = c.fetchone()
self.latestscandomain["latestdate"] = data[0] self.latestscandomain["latestdate"] = data[0]
latestdate = data [0] latestdate = data[0]
c.execute('''SELECT * FROM results WHERE domain=? AND find_date=? AND type="host"''',(domain,latestdate,)) c.execute('''SELECT * FROM results WHERE domain=? AND find_date=? AND type="host"''', (domain, latestdate,))
scandetailshost = c.fetchall() scandetailshost = c.fetchall()
self.latestscandomain["scandetailshost"] = scandetailshost self.latestscandomain["scandetailshost"] = scandetailshost
c.execute('''SELECT * FROM results WHERE domain=? AND find_date=? AND type="email"''',(domain,latestdate,)) c.execute('''SELECT * FROM results WHERE domain=? AND find_date=? AND type="email"''',
(domain, latestdate,))
scandetailsemail = c.fetchall() scandetailsemail = c.fetchall()
self.latestscandomain["scandetailsemail"] = scandetailsemail self.latestscandomain["scandetailsemail"] = scandetailsemail
c.execute('''SELECT * FROM results WHERE domain=? AND find_date=? AND type="ip"''',(domain,latestdate,)) c.execute('''SELECT * FROM results WHERE domain=? AND find_date=? AND type="ip"''', (domain, latestdate,))
scandetailsip = c.fetchall() scandetailsip = c.fetchall()
self.latestscandomain["scandetailsip"] = scandetailsip self.latestscandomain["scandetailsip"] = scandetailsip
c.execute('''SELECT * FROM results WHERE domain=? AND find_date=? AND type="vhost"''',(domain,latestdate,)) c.execute('''SELECT * FROM results WHERE domain=? AND find_date=? AND type="vhost"''',
(domain, latestdate,))
scandetailsvhost = c.fetchall() scandetailsvhost = c.fetchall()
self.latestscandomain["scandetailsvhost"] = scandetailsvhost self.latestscandomain["scandetailsvhost"] = scandetailsvhost
c.execute('''SELECT * FROM results WHERE domain=? AND find_date=? AND type="shodan"''',(domain,latestdate,)) c.execute('''SELECT * FROM results WHERE domain=? AND find_date=? AND type="shodan"''',
(domain, latestdate,))
scandetailsshodan = c.fetchall() scandetailsshodan = c.fetchall()
self.latestscandomain["scandetailsshodan"] = scandetailsshodan self.latestscandomain["scandetailsshodan"] = scandetailsshodan
return self.latestscandomain return self.latestscandomain
@ -274,5 +287,3 @@ def latestscanchartdata(self,domain):
print(e) print(e)
finally: finally:
conn.close() conn.close()
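
Everything stash.py records ends up in the single results table created by
do_init() (domain, resource, type, find_date, source). Assuming the database
file is named stash.sqlite (the actual name is whatever self.db points to),
the data can also be inspected directly:

```python
# Ad-hoc query against theHarvester's results table (filename assumed).
import sqlite3

conn = sqlite3.connect("stash.sqlite")
c = conn.cursor()
c.execute("SELECT find_date, source, resource FROM results "
          "WHERE domain=? AND type='host' ORDER BY find_date DESC", ("acme.com",))
for find_date, source, resource in c.fetchall():
    print(find_date, source, resource)
conn.close()
```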

theHarvester.py

@ -10,13 +10,13 @@
try: try:
import requests import requests
except: except:
print("Requests library not found, please install it before proceeding\n") print("Requests library not found, please install it before proceeding.\n\n")
sys.exit() sys.exit()
try: try:
import bs4 import bs4
except: except:
print("\nBeautifulSoup library not found, please install it before proceeding\n") print("\nBeautifulSoup library not found, please install it before proceeding.\n\n")
sys.exit() sys.exit()
from discovery import * from discovery import *
@ -43,31 +43,31 @@ def usage():
if os.path.dirname(sys.argv[0]) == os.getcwd(): if os.path.dirname(sys.argv[0]) == os.getcwd():
comm = "./" + comm comm = "./" + comm
print("Usage: theharvester options \n") print("Usage: theHarvester.py <options> \n")
print(" -d: Domain to search or company name") print(" -d: company name or domain to search")
print(""" -b: data source: baidu, bing, bingapi, censys, crtsh, dogpile, print(""" -b: source: baidu, bing, bingapi, censys, crtsh, cymon, dogpile, google,
google, google-certificates, googleCSE, googleplus, google-profiles, googleCSE, googleplus, google-certificates, google-profiles,
hunter, linkedin, netcraft, pgp, threatcrowd, hunter, linkedin, netcraft, pgp, threatcrowd, trello, twitter,
twitter, vhost, virustotal, yahoo, all""") vhost, virustotal, yahoo, all""")
print(" -g: use Google dorking instead of normal Google search") print(" -g: use Google Dorking instead of normal Google search")
print(" -s: start in result number X (default: 0)") print(" -s: start with result number X (default: 0)")
print(" -v: verify host name via DNS resolution and search for virtual hosts") print(" -v: verify host name via DNS resolution and search for virtual hosts")
print(" -f: save the results into an HTML and XML file (both)") print(" -f: save the results into an HTML and/or XML file")
print(" -n: perform a DNS reverse query on all ranges discovered") print(" -n: perform a DNS reverse query on all ranges discovered")
print(" -c: perform a DNS brute force for the domain name") print(" -c: perform a DNS brute force for the domain name")
print(" -t: perform a DNS TLD expansion discovery") print(" -t: perform a DNS TLD expansion discovery")
print(" -e: use this DNS server") print(" -e: use this DNS server")
print(" -p: port scan the detected hosts and check for Takeovers (80,443,22,21,8080)") print(" -p: port scan the detected hosts and check for Takeovers (80,443,22,21,8080)")
print(" -l: limit the number of results to work with(Bing goes from 50 to 50 results,") print(" -l: limit the number of results to work with (Bing goes from 50 to 50 results,")
print(" Google 100 to 100, and PGP doesn't use this option)") print(" Google 100 to 100, and PGP doesn't use this option)")
print(" -h: use SHODAN database to query discovered hosts") print(" -h: use Shodan to query discovered hosts")
print("\nExamples:") print("\nExamples:")
print((" " + comm + " -d microsoft.com -l 500 -b google -f myresults.html")) print((" " + comm + " -d acme.com -l 500 -b google -f myresults.html"))
print((" " + comm + " -d microsoft.com -b pgp, virustotal")) print((" " + comm + " -d acme.com -b pgp, virustotal"))
print((" " + comm + " -d microsoft -l 200 -b linkedin")) print((" " + comm + " -d acme -l 200 -b linkedin"))
print((" " + comm + " -d microsoft.com -l 200 -g -b google")) print((" " + comm + " -d acme.com -l 200 -g -b google"))
print((" " + comm + " -d apple.com -b googleCSE -l 500 -s 300")) print((" " + comm + " -d acme.com -b googleCSE -l 500 -s 300"))
print((" " + comm + " -d cornell.edu -l 100 -b bing -h \n")) print((" " + comm + " -d acme.edu -l 100 -b bing -h \n"))
def start(argv): def start(argv):
@ -134,84 +134,21 @@ def start(argv):
dnstld = True dnstld = True
elif opt == '-b': elif opt == '-b':
engines = set(arg.split(',')) engines = set(arg.split(','))
supportedengines = set( supportedengines = set(["baidu","bing","bingapi","censys","crtsh","cymon","dogpile","google","googleCSE","googleplus",'google-certificates',"google-profiles","hunter","linkedin","netcraft","pgp","threatcrowd","trello","twitter","vhost","virustotal","yahoo","all"])
["baidu", "bing", "crtsh", "censys", "cymon", "bingapi", "dogpile", "google", "googleCSE", "virustotal",
"threatcrowd", "googleplus", "google-profiles", 'google-certificates', "linkedin", "pgp", "twitter",
"trello", "vhost", "yahoo", "netcraft", "hunter", "all"])
if set(engines).issubset(supportedengines): if set(engines).issubset(supportedengines):
print("found supported engines") print("found supported engines")
print(("[-] Starting harvesting process for domain: " + word + "\n")) print(("[-] Starting harvesting process for domain: " + word + "\n"))
for engineitem in engines: for engineitem in engines:
if engineitem == "google": if engineitem == "baidu":
print("[-] Searching in Google:") print("[-] Searching in Baidu..")
search = googlesearch.search_google(word, limit, start) search = baidusearch.search_baidu(word, limit)
search.process(google_dorking)
emails = search.get_emails()
hosts = search.get_hostnames()
all_emails.extend(emails)
all_hosts.extend(hosts)
db = stash.stash_manager()
db.store_all(word, hosts, 'host', 'google')
db.store_all(word, emails, 'email', 'google')
if engineitem == "netcraft":
print("[-] Searching in Netcraft:")
search = netcraft.search_netcraft(word)
search.process() search.process()
all_emails = search.get_emails()
hosts = search.get_hostnames() hosts = search.get_hostnames()
all_hosts.extend(hosts) all_hosts.extend(hosts)
db = stash.stash_manager() db = stash.stash_manager()
db.store_all(word, hosts, 'host', 'netcraft') db.store_all(word, all_hosts, 'host', 'baidu')
db.store_all(word, all_emails, 'email', 'baidu')
if engineitem == "google-certificates":
print("[-] Searching in Google Certificate transparency report..")
search = googlecertificates.search_googlecertificates(word, limit, start)
search.process()
hosts = search.get_domains()
all_hosts.extend(hosts)
db = stash.stash_manager()
db.store_all(word, hosts, 'host', 'google-certificates')
if engineitem == "threatcrowd":
print("[-] Searching in Threatcrowd:")
search = threatcrowd.search_threatcrowd(word)
search.process()
hosts = search.get_hostnames()
all_hosts.extend(hosts)
db = stash.stash_manager()
db.store_all(word, hosts, 'host', 'threatcrowd')
if engineitem == "virustotal":
print("[-] Searching in Virustotal:")
search = virustotal.search_virustotal(word)
search.process()
hosts = search.get_hostnames()
all_hosts.extend(hosts)
db = stash.stash_manager()
db.store_all(word, hosts, 'host', 'virustotal')
if engineitem == "crtsh":
print("[-] Searching in CRT.sh:")
search = crtsh.search_crtsh(word)
search.process()
hosts = search.get_hostnames()
all_hosts.extend(hosts)
db = stash.stash_manager()
db.store_all(word, hosts, 'host', 'CRTsh')
if engineitem == "googleCSE":
print("[-] Searching in Google Custom Search:")
search = googleCSE.search_googleCSE(word, limit, start)
search.process()
search.store_results()
emails = search.get_emails()
all_emails.extend(emails)
db = stash.stash_manager()
hosts = search.get_hostnames()
all_hosts.extend(hosts)
db.store_all(word, emails, 'email', 'googleCSE')
db = stash.stash_manager()
db.store_all(word, hosts, 'host', 'googleCSE')
elif engineitem == "bing" or engineitem == "bingapi": elif engineitem == "bing" or engineitem == "bingapi":
print("[-] Searching in Bing:") print("[-] Searching in Bing:")
@ -221,60 +158,78 @@ def start(argv):
else: else:
bingapi = "no" bingapi = "no"
search.process(bingapi) search.process(bingapi)
emails = search.get_emails() all_emails = search.get_emails()
all_emails.extend(emails)
hosts = search.get_hostnames() hosts = search.get_hostnames()
all_hosts.extend(hosts) all_hosts.extend(hosts)
db = stash.stash_manager() db = stash.stash_manager()
db.store_all(word, emails, 'email', 'bing') db.store_all(word, all_emails, 'email', 'bing')
db.store_all(word, hosts, 'host', 'bing') db.store_all(word, all_hosts, 'host', 'bing')
elif engineitem == "censys":
print("[-] Searching in Censys:")
from discovery import censys
# Import locally or won't work
search = censys.search_censys(word)
search.process()
all_ip = search.get_ipaddresses()
hosts = search.get_hostnames()
all_hosts.extend(hosts)
db = stash.stash_manager()
db.store_all(word, all_hosts, 'host', 'censys')
db.store_all(word, all_ip, 'ip', 'censys')
elif engineitem == "crtsh":
print("[-] Searching in CRT.sh:")
search = crtsh.search_crtsh(word)
search.process()
hosts = search.get_hostnames()
all_hosts.extend(hosts)
db = stash.stash_manager()
db.store_all(word, all_hosts, 'host', 'CRTsh')
elif engineitem == "cymon":
print("[-] Searching in Cymon:")
from discovery import cymon
# Import locally or won't work
search = cymon.search_cymon(word)
search.process()
all_ip = search.get_ipaddresses()
db = stash.stash_manager()
db.store_all(word, all_ip, 'ip', 'cymon')
elif engineitem == "dogpile": elif engineitem == "dogpile":
print("[-] Searching in Dogpilesearch..") print("[-] Searching in Dogpilesearch..")
search = dogpilesearch.search_dogpile(word, limit) search = dogpilesearch.search_dogpile(word, limit)
search.process() search.process()
emails = search.get_emails() all_emails = search.get_emails()
all_emails.extend(emails) all_hosts = search.get_hostnames()
hosts = search.get_hostnames() db.store_all(word, all_emails, 'email', 'dogpile')
all_hosts.extend(hosts) db.store_all(word, all_hosts, 'host', 'dogpile')
db.store_all(word, emails, 'email', 'dogpile')
db.store_all(word, hosts, 'host', 'dogpile')
elif engineitem == "pgp": elif engineitem == "google":
print("[-] Searching in PGP key server..") print("[-] Searching in Google:")
search = pgpsearch.search_pgp(word) search = googlesearch.search_google(word, limit, start)
search.process() search.process(google_dorking)
emails = search.get_emails() emails = search.get_emails()
all_emails.extend(emails) all_emails.extend(emails)
hosts = search.get_hostnames() hosts = search.get_hostnames()
all_hosts.extend(hosts) all_hosts.extend(hosts)
db = stash.stash_manager() db = stash.stash_manager()
db.store_all(word, hosts, 'host', 'pgp') db.store_all(word, all_hosts, 'host', 'google')
db.store_all(word, emails, 'email', 'pgp') db.store_all(word, all_emails, 'email', 'google')
elif engineitem == "yahoo": elif engineitem == "googleCSE":
print("[-] Searching in Yahoo..") print("[-] Searching in Google Custom Search:")
search = yahoosearch.search_yahoo(word, limit) search = googleCSE.search_googleCSE(word, limit, start)
search.process() search.process()
emails = search.get_emails() search.store_results()
all_emails.extend(emails) all_emails = search.get_emails()
db = stash.stash_manager()
hosts = search.get_hostnames() hosts = search.get_hostnames()
all_hosts.extend(hosts) all_hosts.extend(hosts)
db.store_all(word, all_hosts, 'email', 'googleCSE')
db = stash.stash_manager() db = stash.stash_manager()
db.store_all(word, hosts, 'host', 'yahoo') db.store_all(word, all_hosts, 'host', 'googleCSE')
db.store_all(word, emails, 'email', 'yahoo')
elif engineitem == "baidu":
print("[-] Searching in Baidu..")
search = baidusearch.search_baidu(word, limit)
search.process()
emails = search.get_emails()
all_emails.extend(emails)
hosts = search.get_hostnames()
all_hosts.extend(hosts)
db = stash.stash_manager()
db.store_all(word, hosts, 'host', 'baidu')
db.store_all(word, emails, 'email', 'baidu')
elif engineitem == "googleplus": elif engineitem == "googleplus":
print("[-] Searching in Google+ ..") print("[-] Searching in Google+ ..")
@ -289,31 +244,14 @@ def start(argv):
print(user) print(user)
sys.exit() sys.exit()
elif engineitem == "twitter": elif engineitem == "google-certificates":
print("[-] Searching in Twitter ..") print("[-] Searching in Google Certificate transparency report..")
search = twittersearch.search_twitter(word, limit) search = googlecertificates.search_googlecertificates(word, limit, start)
search.process() search.process()
people = search.get_people() hosts = search.get_domains()
all_hosts.extend(hosts)
db = stash.stash_manager() db = stash.stash_manager()
db.store_all(word, people, 'name', 'twitter') db.store_all(word, all_hosts, 'host', 'google-certificates')
print("Users from Twitter:")
print("-------------------")
for user in people:
print(user)
sys.exit()
elif engineitem == "linkedin":
print("[-] Searching in Linkedin..")
search = linkedinsearch.search_linkedin(word, limit)
search.process()
people = search.get_people()
db = stash.stash_manager()
db.store_all(word, people, 'name', 'linkedin')
print("Users from Linkedin:")
print("-------------------")
for user in people:
print(user)
sys.exit()
elif engineitem == "google-profiles": elif engineitem == "google-profiles":
print("[-] Searching in Google profiles..") print("[-] Searching in Google profiles..")
@ -331,7 +269,7 @@ def start(argv):
elif engineitem == "hunter": elif engineitem == "hunter":
print("[-] Searching in Hunter:") print("[-] Searching in Hunter:")
from discovery import huntersearch from discovery import huntersearch
# import locally or won't work # Import locally or won't work
search = huntersearch.search_hunter(word, limit, start) search = huntersearch.search_hunter(word, limit, start)
search.process() search.process()
emails = search.get_emails() emails = search.get_emails()
@ -339,114 +277,107 @@ def start(argv):
hosts = search.get_hostnames() hosts = search.get_hostnames()
all_hosts.extend(hosts) all_hosts.extend(hosts)
db = stash.stash_manager() db = stash.stash_manager()
db.store_all(word, hosts, 'host', 'hunter') db.store_all(word, all_hosts, 'host', 'hunter')
db.store_all(word, emails, 'email', 'hunter') db.store_all(word, all_emails, 'email', 'hunter')
elif engineitem == "censys": elif engineitem == "linkedin":
print("[-] Searching in Censys:") print("[-] Searching in Linkedin..")
from discovery import censys search = linkedinsearch.search_linkedin(word, limit)
# import locally or won't work search.process()
search = censys.search_censys(word) people = search.get_people()
db = stash.stash_manager()
db.store_all(word, people, 'name', 'linkedin')
print("Users from Linkedin:")
print("-------------------")
for user in people:
print(user)
sys.exit()
elif engineitem == "netcraft":
print("[-] Searching in Netcraft:")
search = netcraft.search_netcraft(word)
search.process() search.process()
ips = search.get_ipaddresses()
hosts = search.get_hostnames() hosts = search.get_hostnames()
all_hosts.extend(hosts) all_hosts.extend(hosts)
all_ip.extend(ips)
db = stash.stash_manager() db = stash.stash_manager()
db.store_all(word, hosts, 'host', 'censys') db.store_all(word, all_hosts, 'host', 'netcraft')
db.store_all(word, ips, 'ip', 'censys')
elif engineitem == "cymon": elif engineitem == "pgp":
print("[-] Searching in Cymon:") print("[-] Searching in PGP key server..")
from discovery import cymon search = pgpsearch.search_pgp(word)
# import locally or won't work
search = cymon.search_cymon(word)
search.process() search.process()
ips = search.get_ipaddresses() all_emails = search.get_emails()
all_ip.extend(ips) hosts = search.get_hostnames()
all_hosts.extend(hosts)
db = stash.stash_manager() db = stash.stash_manager()
db.store_all(word, ips, 'ip', 'cymon') db.store_all(word, all_hosts, 'host', 'pgp')
db.store_all(word, all_emails, 'email', 'pgp')
elif engineitem == "threatcrowd":
print("[-] Searching in Threatcrowd:")
search = threatcrowd.search_threatcrowd(word)
search.process()
hosts = search.get_hostnames()
all_hosts.extend(hosts)
db = stash.stash_manager()
db.store_all(word, all_hosts, 'host', 'threatcrowd')
elif engineitem == "trello": elif engineitem == "trello":
print("[-] Searching in Trello:") print("[-] Searching in Trello:")
from discovery import trello from discovery import trello
# import locally or won't work # Import locally or won't work
search = trello.search_trello(word, limit) search = trello.search_trello(word,limit)
search.process() search.process()
emails = search.get_emails() all_emails = search.get_emails()
all_emails.extend(emails) all_hosts = search.get_urls()
hosts = search.get_hostnames()
all_hosts.extend(hosts)
db = stash.stash_manager() db = stash.stash_manager()
db.store_all(word, hosts, 'host', 'trello') db.store_all(word, all_hosts, 'host', 'trello')
db.store_all(word, emails, 'email', 'trello') db.store_all(word, all_emails, 'email', 'trello')
for x in all_hosts: for x in all_hosts:
print(x) print(x)
sys.exit() sys.exit()
elif engineitem == "twitter":
print("[-] Searching in Twitter ..")
search = twittersearch.search_twitter(word, limit)
search.process()
people = search.get_people()
db = stash.stash_manager()
db.store_all(word, people, 'name', 'twitter')
print("Users from Twitter:")
print("-------------------")
for user in people:
print(user)
sys.exit()
# vhost
elif engineitem == "virustotal":
print("[-] Searching in Virustotal:")
search = virustotal.search_virustotal(word)
search.process()
hosts = search.get_hostnames()
all_hosts.extend(hosts)
db = stash.stash_manager()
db.store_all(word, all_hosts, 'host', 'virustotal')
elif engineitem == "yahoo":
print("[-] Searching in Yahoo..")
search = yahoosearch.search_yahoo(word, limit)
search.process()
all_emails = search.get_emails()
hosts = search.get_hostnames()
all_hosts.extend(hosts)
db = stash.stash_manager()
db.store_all(word, all_hosts, 'host', 'yahoo')
db.store_all(word, all_emails, 'email', 'yahoo')
elif engineitem == "all": elif engineitem == "all":
print(("Full harvest on " + word)) print(("Full harvest on " + word))
all_emails = [] all_emails = []
all_hosts = [] all_hosts = []
print("[-] Searching in Google..") # baidu
search = googlesearch.search_google(word, limit, start)
search.process(google_dorking)
emails = search.get_emails()
hosts = search.get_hostnames()
all_emails.extend(emails)
db = stash.stash_manager()
db.store_all(word, emails, 'email', 'google')
all_hosts.extend(hosts)
db = stash.stash_manager()
db.store_all(word, hosts, 'host', 'google')
print("[-] Searching in PGP Key server..")
search = pgpsearch.search_pgp(word)
search.process()
emails = search.get_emails()
hosts = search.get_hostnames()
all_hosts.extend(hosts)
db = stash.stash_manager()
db.store_all(word, hosts, 'host', 'PGP')
all_emails.extend(emails)
db = stash.stash_manager()
db.store_all(word, emails, 'email', 'PGP')
print("[-] Searching in Netcraft server..")
search = netcraft.search_netcraft(word)
search.process()
hosts = search.get_hostnames()
all_hosts.extend(hosts)
db = stash.stash_manager()
db.store_all(word, hosts, 'host', 'netcraft')
print("[-] Searching in ThreatCrowd server..")
try:
search = threatcrowd.search_threatcrowd(word)
search.process()
hosts = search.get_hostnames()
all_hosts.extend(hosts)
db = stash.stash_manager()
db.store_all(word, hosts, 'host', 'threatcrowd')
except Exception:
pass
print("[-] Searching in CRTSH server..")
search = crtsh.search_crtsh(word)
search.process()
hosts = search.get_hostnames()
all_hosts.extend(hosts)
db = stash.stash_manager()
db.store_all(word, hosts, 'host', 'CRTsh')
print("[-] Searching in Virustotal server..")
search = virustotal.search_virustotal(word)
search.process()
hosts = search.get_hostnames()
all_hosts.extend(hosts)
db = stash.stash_manager()
db.store_all(word, hosts, 'host', 'virustotal')
print("[-] Searching in Bing..") print("[-] Searching in Bing..")
bingapi = "no" bingapi = "no"
@ -456,15 +387,62 @@ def start(argv):
hosts = search.get_hostnames() hosts = search.get_hostnames()
all_hosts.extend(hosts) all_hosts.extend(hosts)
db = stash.stash_manager() db = stash.stash_manager()
db.store_all(word, hosts, 'host', 'bing') db.store_all(word, all_hosts, 'host', 'bing')
all_emails.extend(emails) all_emails.extend(emails)
# Clean up email list, sort and uniq all_emails = sorted(set(all_emails))
# all_emails=sorted(set(all_emails)) db.store_all(word, all_emails, 'email', 'bing')
db.store_all(word, emails, 'email', 'bing')
print("[-] Searching in Censys:")
from discovery import censys
search = censys.search_censys(word)
search.process()
all_ip = search.get_ipaddresses()
all_hosts = search.get_hostnames()
db = stash.stash_manager()
db.store_all(word, all_ip, 'ip', 'censys')
db.store_all(word, all_hosts, 'host', 'censys')
print("[-] Searching in CRTSH server..")
search = crtsh.search_crtsh(word)
search.process()
hosts = search.get_hostnames()
all_hosts.extend(hosts)
db = stash.stash_manager()
db.store_all(word, all_hosts, 'host', 'CRTsh')
# cymon
# dogpile
print("[-] Searching in Google..")
search = googlesearch.search_google(word, limit, start)
search.process(google_dorking)
emails = search.get_emails()
hosts = search.get_hostnames()
all_emails.extend(emails)
db = stash.stash_manager()
db.store_all(word, all_emails, 'email', 'google')
all_hosts.extend(hosts)
db = stash.stash_manager()
db.store_all(word, all_hosts, 'host', 'google')
print ("[-] Searching in Google Certificate transparency report..")
search = googlecertificates.search_googlecertificates(word, limit, start)
search.process()
domains = search.get_domains()
all_hosts.extend(domains)
db = stash.stash_manager()
db.store_all(word, all_hosts, 'host', 'google-certificates')
# googleplus
# google-certificates
# google-profiles
print("[-] Searching in Hunter:") print("[-] Searching in Hunter:")
from discovery import huntersearch from discovery import huntersearch
# import locally # Import locally
search = huntersearch.search_hunter(word, limit, start) search = huntersearch.search_hunter(word, limit, start)
search.process() search.process()
emails = search.get_emails() emails = search.get_emails()
@ -473,39 +451,65 @@ def start(argv):
db = stash.stash_manager() db = stash.stash_manager()
db.store_all(word, hosts, 'host', 'hunter') db.store_all(word, hosts, 'host', 'hunter')
all_emails.extend(emails) all_emails.extend(emails)
# all_emails = sorted(set(all_emails)) all_emails = sorted(set(all_emails))
db.store_all(word, emails, 'email', 'hunter') db.store_all(word, all_emails, 'email', 'hunter')
print("[-] Searching in Google Certificate transparency report..") # linkedin
search = googlecertificates.search_googlecertificates(word, limit, start)
search.process()
domains = search.get_domains()
all_hosts.extend(domains)
db = stash.stash_manager()
db.store_all(word, domains, 'host', 'google-certificates')
print("[-] Searching in Censys:") print("[-] Searching in Netcraft server..")
from discovery import censys search = netcraft.search_netcraft(word)
search = censys.search_censys(word)
search.process() search.process()
ips = search.get_ipaddresses()
all_ip.extend(ips)
hosts = search.get_hostnames() hosts = search.get_hostnames()
all_hosts.extend(hosts) all_hosts.extend(hosts)
db = stash.stash_manager() db = stash.stash_manager()
db.store_all(word, ips, 'ip', 'censys') db.store_all(word, all_hosts, 'host', 'netcraft')
db.store_all(word, hosts, 'host', 'censys')
print("[-] Searching in PGP Key server..")
search = pgpsearch.search_pgp(word)
search.process()
emails = search.get_emails()
hosts = search.get_hostnames()
all_hosts.extend(hosts)
db = stash.stash_manager()
db.store_all(word, all_hosts, 'host', 'PGP')
all_emails.extend(emails)
db = stash.stash_manager()
db.store_all(word, all_emails, 'email', 'PGP')
print("[-] Searching in ThreatCrowd server..")
try:
search = threatcrowd.search_threatcrowd(word)
search.process()
hosts = search.get_hostnames()
all_hosts.extend(hosts)
db = stash.stash_manager()
db.store_all(word, all_hosts, 'host', 'threatcrowd')
except Exception: pass
# trello
# twitter
# vhost
print("[-] Searching in Virustotal server..")
search = virustotal.search_virustotal(word)
search.process()
hosts = search.get_hostnames()
all_hosts.extend(hosts)
db = stash.stash_manager()
db.store_all(word, all_hosts, 'host', 'virustotal')
# yahoo
else: else:
usage() usage()
print( print("Invalid search engine, try with: baidu, bing, bingapi, censys, crtsh, cymon, dogpile, google, googleCSE, googleplus, google-certificates, google-profiles, hunter, linkedin, netcraft, pgp, threatcrowd, trello, twitter, vhost, virustotal, yahoo, all")
"Invalid search engine, try with: baidu, bing, bingapi, crtsh, censys, cymon, dogpile, google, googleCSE, virustotal, netcraft, googleplus, google-profiles, linkedin, pgp, twitter, vhost, yahoo, hunter, all")
sys.exit() sys.exit()
# Results############################################################ # Results ############################################################
print("\n\033[1;32;40mHarvesting results") print("\n\033[1;32;40mHarvesting results")
if (len(all_ip) == 0): if (len(all_ip) == 0):
print("No IP addresses found") print("No IP addresses found.")
else: else:
print("\033[1;33;40m \n[+] IP addresses found in search engines:") print("\033[1;33;40m \n[+] IP addresses found in search engines:")
print("------------------------------------") print("------------------------------------")
@ -514,7 +518,7 @@ def start(argv):
print("\n\n[+] Emails found:") print("\n\n[+] Emails found:")
print("------------------") print("------------------")
# Sanity check to see if all_emails and all_hosts is defined # Sanity check to see if all_emails and all_hosts are defined
try: try:
all_emails all_emails
except NameError: except NameError:
@ -527,14 +531,14 @@ def start(argv):
sys.exit() sys.exit()
if all_emails == []: if all_emails == []:
print("No emails found") print("No emails found.")
else: else:
print(("\n".join(all_emails))) print(("\n".join(all_emails)))
print("\033[1;33;40m \n[+] Hosts found in search engines:") print("\033[1;33;40m \n[+] Hosts found in search engines:")
print("------------------------------------") print("------------------------------------")
if all_hosts == [] or all_emails is None: if all_hosts == [] or all_emails is None:
print("No hosts found") print("No hosts found.")
else: else:
total = len(all_hosts) total = len(all_hosts)
print(("\nTotal hosts: " + str(total) + "\n")) print(("\nTotal hosts: " + str(total) + "\n"))
@ -554,7 +558,7 @@ def start(argv):
db = stash.stash_manager() db = stash.stash_manager()
db.store_all(word, host_ip, 'ip', 'DNS-resolver') db.store_all(word, host_ip, 'ip', 'DNS-resolver')
# DNS Brute force#################################################### # DNS Brute force ################################################
dnsres = [] dnsres = []
if dnsbrute == True: if dnsbrute == True:
print("\n\033[94m[-] Starting DNS brute force: \033[1;33;40m") print("\n\033[94m[-] Starting DNS brute force: \033[1;33;40m")
@ -576,15 +580,14 @@ def start(argv):
for x in full: for x in full:
host = x.split(':')[1] host = x.split(':')[1]
domain = x.split(':')[0] domain = x.split(':')[0]
if host != "empty": if host != "empty" :
print(("- Scanning : " + host)) print(("- Scanning : " + host))
ports = [80, 443, 22, 8080, 21] ports = [80,443,22,8080,21]
try: try:
scan = port_scanner.port_scan(host, ports) scan = port_scanner.port_scan(host,ports)
openports = scan.process() openports = scan.process()
if len(openports) > 1: if len(openports) > 1:
print(("\t\033[91m Detected open ports: " + ','.join( print(("\t\033[91m Detected open ports: " + ','.join(str(e) for e in openports) + "\033[1;32;40m"))
str(e) for e in openports) + "\033[1;32;40m"))
takeover_check = 'True' takeover_check = 'True'
if takeover_check == 'True': if takeover_check == 'True':
if len(openports) > 0: if len(openports) > 0:
@ -593,7 +596,7 @@ def start(argv):
except Exception as e: except Exception as e:
print(e) print(e)
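
The port scan behind the -p option only needs to know whether a TCP connection
to 80, 443, 22, 21 or 8080 succeeds. A minimal stand-in for that check, using
socket.connect_ex rather than theHarvester's own port_scanner module, looks
like this:

```python
# Minimal stand-in for the -p port check (not the real discovery/port_scanner).
import socket

def open_ports(host, ports=(80, 443, 22, 21, 8080), timeout=2):
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            if s.connect_ex((host, port)) == 0:  # 0 means the connect succeeded
                found.append(port)
    return found

print(open_ports("scanme.nmap.org"))  # e.g. [80, 22]
```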
# DNS reverse lookup################################################# # DNS reverse lookup ################################################
dnsrev = [] dnsrev = []
if dnslookup == True: if dnslookup == True:
print("\n[+] Starting active queries:") print("\n[+] Starting active queries:")
@ -623,7 +626,7 @@ def start(argv):
for xh in dnsrev: for xh in dnsrev:
print(xh) print(xh)
# DNS TLD expansion################################################### # DNS TLD expansion #################################################
dnstldres = [] dnstldres = []
if dnstld == True: if dnstld == True:
print("[-] Starting DNS TLD expansion:") print("[-] Starting DNS TLD expansion:")
@ -637,7 +640,7 @@ def start(argv):
if y not in full: if y not in full:
full.append(y) full.append(y)
# Virtual hosts search############################################### # Virtual hosts search ##############################################
if virtual == "basic": if virtual == "basic":
print("\n[+] Virtual hosts:") print("\n[+] Virtual hosts:")
print("------------------") print("------------------")
@ -655,7 +658,7 @@ def start(argv):
vhost = sorted(set(vhost)) vhost = sorted(set(vhost))
else: else:
pass pass
# Shodan search#################################################### # Shodan search ####################################################
shodanres = [] shodanres = []
shodanvisited = [] shodanvisited = []
if shodan == True: if shodan == True:
@ -686,7 +689,7 @@ def start(argv):
pass pass
################################################################### ###################################################################
# Here i need to add explosion mode. # Here we need to add explosion mode.
# Tengo que sacar los TLD para hacer esto. # Tengo que sacar los TLD para hacer esto.
recursion = None recursion = None
if recursion: if recursion:
@ -701,7 +704,7 @@ def start(argv):
else: else:
pass pass
# Reporting####################################################### # Reporting #######################################################
if filename != "": if filename != "":
try: try:
print("NEW REPORTING BEGINS:") print("NEW REPORTING BEGINS:")
@ -722,19 +725,16 @@ def start(argv):
graph = reportgraph.graphgenerator(word) graph = reportgraph.graphgenerator(word)
HTMLcode += graph.drawlatestscangraph(word, latestscanchartdata) HTMLcode += graph.drawlatestscangraph(word, latestscanchartdata)
HTMLcode += graph.drawscattergraphscanhistory(word, scanhistorydomain) HTMLcode += graph.drawscattergraphscanhistory(word, scanhistorydomain)
HTMLcode += generator.generatepluginscanstatistics(pluginscanstatistics) HTMLcode += generator.generatescanstatistics(scanstatistics)
HTMLcode += generator.generatedashboardcode(scanboarddata) HTMLcode += '<p><span style="color: #000000;">Report generated on ' + str(datetime.datetime.now())+'</span></p>'
HTMLcode += '<p><span style="color: #000000;">Report generated on ' + str(
datetime.datetime.now()) + '</span></p>'
HTMLcode += ''' HTMLcode += '''
</body> </body>
</html> </html>
''' '''
Html_file = open("report.html", "w") Html_file = open("report.html","w")
Html_file.write(HTMLcode) Html_file.write(HTMLcode)
Html_file.close() Html_file.close()
print("NEW REPORTING FINISHED!") print("NEW REPORTING FINISHED!")
print("[+] Saving files...") print("[+] Saving files...")
html = htmlExport.htmlExport( html = htmlExport.htmlExport(
all_emails, all_emails,
@ -749,7 +749,7 @@ def start(argv):
save = html.writehtml() save = html.writehtml()
except Exception as e: except Exception as e:
print(e) print(e)
print("Error creating the file") print("Error creating the file.")
try: try:
filename = filename.split(".")[0] + ".xml" filename = filename.split(".")[0] + ".xml"
file = open(filename, 'w') file = open(filename, 'w')
@ -773,15 +773,9 @@ def start(argv):
shodanalysis = [] shodanalysis = []
for x in shodanres: for x in shodanres:
res = x.split("SAPO") res = x.split("SAPO")
# print " res[0] " + res[0] # ip/host
# print " res[1] " + res[1] # banner/info
# print " res[2] " + res[2] # port
file.write('<shodan>') file.write('<shodan>')
# page.h3(res[0])
file.write('<host>' + res[0] + '</host>') file.write('<host>' + res[0] + '</host>')
# page.a("Port :" + res[2])
file.write('<port>' + res[2] + '</port>') file.write('<port>' + res[2] + '</port>')
# page.pre(res[1])
file.write('<banner><!--' + res[1] + '--></banner>') file.write('<banner><!--' + res[1] + '--></banner>')
reg_server = re.compile('Server:.*') reg_server = re.compile('Server:.*')
@ -794,9 +788,9 @@ def start(argv):
shodanalysis = sorted(set(shodanalysis)) shodanalysis = sorted(set(shodanalysis))
file.write('<servers>') file.write('<servers>')
for x in shodanalysis: for x in shodanalysis:
# page.pre(x)
file.write('<server>' + x + '</server>') file.write('<server>' + x + '</server>')
file.write('</servers>') file.write('</servers>')
file.write('</theHarvester>') file.write('</theHarvester>')
file.flush() file.flush()
file.close() file.close()
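
The banner analysis above pulls any "Server:" header out of a Shodan banner,
which is exactly what the compiled reg_server pattern matches. A small
standalone illustration:

```python
# Standalone illustration of the reg_server extraction used for the XML report.
import re

banner = "HTTP/1.1 200 OK\nServer: Apache/2.4.41 (Ubuntu)\nContent-Type: text/html"
reg_server = re.compile('Server:.*')  # same pattern as in the report code
print(reg_server.findall(banner))     # ['Server: Apache/2.4.41 (Ubuntu)']
```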