Kjwon15’s Artificial Intelligence Systems
cfile8.uf.146F914B4D68A963025987.rar
cfile3.uf.12225D4E4D68A9662E6C71.exe
cfile30.uf.1929EC4F4D68A9652691A8.exe
UnescapeURI:
http://server.domain/page?name=%76%61%6c%75%65 -> http://server.domain/page?name=value
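The decoding step can be sketched in Python with the standard library. This is only an illustration of the behaviour shown above, not the crawler's own code; the name unescape_uri is an assumed stand-in for UnescapeURI.

```python
from urllib.parse import unquote


def unescape_uri(uri: str) -> str:
    """Decode percent-encoded octets in a URI (for example %76 becomes 'v')."""
    return unquote(uri)


print(unescape_uri("http://server.domain/page?name=%76%61%6c%75%65"))
# prints: http://server.domain/page?name=value
```

Note that unquote decodes every escape, including reserved characters such as %26 ("&"), so a decoded URI is not always equivalent to the original when it is requested again.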
GetHtml:
Downloads the content at a given URI.
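A minimal sketch of such a download step, assuming Python's urllib; the function name get_html, the timeout parameter, and the UTF-8 fallback are assumptions and do not come from the original program.

```python
from urllib.request import urlopen


def get_html(uri: str, timeout: float = 10.0) -> str:
    """Fetch a URI and return its body as text."""
    with urlopen(uri, timeout=timeout) as response:
        # Use the charset declared in the Content-Type header, else assume UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling get_html("http://server.domain/page") would return the page body as a string. A real crawler would also want to handle HTTP errors and non-HTML content types, which this sketch leaves out.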
FindNode:
Finds every “href=” attribute in the downloaded HTML and saves the URIs to the database.
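One way to sketch this step in Python is shown below. It uses an HTML parser rather than a plain search for the text "href=", and an in-memory SQLite database; the table name uris, the parsed column, and the function name find_node are all hypothetical, since the post does not describe the database layout.

```python
import sqlite3
from html.parser import HTMLParser
from urllib.parse import urljoin


class HrefCollector(HTMLParser):
    """Collects the href attribute of every tag, resolved against a base URI."""

    def __init__(self, base_uri: str):
        super().__init__()
        self.base_uri = base_uri
        self.uris = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "href" and value:
                self.uris.append(urljoin(self.base_uri, value))


def find_node(db: sqlite3.Connection, base_uri: str, html: str) -> list:
    """Extract href URIs from html and store them, skipping ones already known."""
    collector = HrefCollector(base_uri)
    collector.feed(html)
    db.executemany(
        "INSERT OR IGNORE INTO uris (uri) VALUES (?)",
        [(uri,) for uri in collector.uris],
    )
    db.commit()
    return collector.uris


# Hypothetical schema: one row per URI, with a flag for "already parsed".
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE uris (uri TEXT PRIMARY KEY, parsed INTEGER NOT NULL DEFAULT 0)")

page = '<a href="/about">About</a> <a href="http://server.domain/page?name=value">Page</a>'
found = find_node(db, "http://server.domain/page", page)
```

Relative links such as "/about" are resolved against the page's own URI before they are stored, so the database only holds absolute URIs.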
Parsing:
Takes an unparsed URI from the database and calls GetHtml and FindNode on it.
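The loop described here might look roughly like the following Python sketch. The download and link-extraction steps are passed in as functions so the loop can be shown without network access; the uris table, the parsed flag, the limit parameter, and the stand-in callables in the usage part are assumptions made for illustration.

```python
import sqlite3
from typing import Callable, Iterable


def parse_pending(
    db: sqlite3.Connection,
    get_html: Callable[[str], str],
    find_node: Callable[[str, str], Iterable[str]],
    limit: int = 100,
) -> int:
    """Process up to `limit` unparsed URIs and return how many were processed."""
    done = 0
    while done < limit:
        row = db.execute("SELECT uri FROM uris WHERE parsed = 0 LIMIT 1").fetchone()
        if row is None:
            break  # nothing left to parse
        uri = row[0]
        # Download the page, extract its links, and queue any new ones.
        for link in find_node(uri, get_html(uri)):
            db.execute("INSERT OR IGNORE INTO uris (uri) VALUES (?)", (link,))
        db.execute("UPDATE uris SET parsed = 1 WHERE uri = ?", (uri,))
        db.commit()
        done += 1
    return done


# Usage with a fake two-page site instead of real downloads.
links = {
    "http://server.domain/": ["http://server.domain/a"],
    "http://server.domain/a": ["http://server.domain/"],
}
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE uris (uri TEXT PRIMARY KEY, parsed INTEGER NOT NULL DEFAULT 0)")
db.execute("INSERT INTO uris (uri) VALUES (?)", ("http://server.domain/",))

processed = parse_pending(
    db,
    get_html=lambda uri: " ".join(links.get(uri, [])),  # stand-in for GetHtml
    find_node=lambda uri, html: html.split(),           # stand-in for FindNode
)
```

Marking each URI as parsed after processing is what stops the loop from revisiting pages that link back to each other.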
URItoFilename:
http://server.domain/page?name=value -> server.domain/page/name=value
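The mapping shown above (drop the scheme, turn the "?" into a path separator) can be sketched like this in Python. The name uri_to_filename is an assumed stand-in, and the handling of anything beyond the single example in the post is a guess.

```python
from urllib.parse import urlsplit


def uri_to_filename(uri: str) -> str:
    """Map a URI to a relative file path: host + path, with the query as a final segment."""
    parts = urlsplit(uri)
    name = parts.netloc + parts.path
    if parts.query:
        # The "?" separator becomes a "/" so the query forms its own path segment.
        name = name.rstrip("/") + "/" + parts.query
    return name


print(uri_to_filename("http://server.domain/page?name=value"))
# prints: server.domain/page/name=value
```

This sketch does not sanitize characters that are invalid in file names on some systems, and it does not decide what to do with URIs that end in "/"; the post does not say how the original handles either case.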
CheckDomain:
Checks whether a URI belongs to the base domain.
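A possible version of this check in Python is sketched below. Treating subdomains of the base domain as a match is an assumption, as is the name check_domain; the post only says that the URI is matched against the base domain.

```python
from urllib.parse import urlsplit


def check_domain(uri: str, base_domain: str) -> bool:
    """Return True if the URI's host is the base domain or one of its subdomains."""
    host = (urlsplit(uri).hostname or "").lower()
    base = base_domain.lower()
    return host == base or host.endswith("." + base)


print(check_domain("http://server.domain/page?name=value", "server.domain"))  # prints: True
print(check_domain("http://other.example/page", "server.domain"))             # prints: False
```

Comparing host names rather than searching the whole URI string avoids false matches such as a foreign host that merely mentions the base domain in its path or query.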
cfile9.uf.151C4F484D68A962064776.rar
cfile4.uf.1741DA4C4D68A96003CE98.exe
2011/02/08 – [Computer/Programing] – Kjwon15’s Web Crawler (WebBot)
cfile24.uf.1738124F4D68A95F0A5239.rar
cfile30.uf.141A6A4A4D68A95F2D2DF7.exe