Number of URLs discovered by crawling differs

sebastien.aucouturier

03 Jun, 2016 12:32 PM

Tasos,
Running with only check=-, how can we explain that the number of URLs discovered and the number of pages audited differ from one scan to another? (The second scan was launched a few seconds after the first one ended.)
Thanks

  1. Support Staff 1 Posted by Tasos Laskos on 03 Jun, 2016 12:36 PM

    It depends. There could be timed-out requests that resulted in missing pages, or dynamic content that changes over time or changed as a result of the previous scan.
    There's no way to know without access to the web application.

  2. 2 Posted by sebastien.aucou... on 03 Jun, 2016 12:51 PM

    I reproduced it using the Acunetix vulnerable HTML5 test website: http://testhtml5.vulnweb.com/

    Most of the time the missing URL is:
    http://testhtml5.vulnweb.com/logout

    I do not see any page timeouts.
    Command used: arachni --check=- http://testhtml5.vulnweb.com/

  3. Support Staff 3 Posted by Tasos Laskos on 03 Jun, 2016 01:09 PM

    I think you'll find that the results will be consistent with the nightlies.
    The inconsistency was due to a fade-in/out transition effect in the login form; the system has been updated to handle such effects better.

  4. 4 Posted by sebastien.aucou... on 03 Jun, 2016 01:14 PM

    Great, Tasos.
    I will check it out, do a run and let you know how it goes.
    Thanks a lot.

  5. Support Staff 5 Posted by Tasos Laskos on 03 Jun, 2016 01:17 PM

    I did notice another thing with the nightlies though.
    Sometimes it doesn't go past /#/latest/page/1; other times it reaches /#/latest/page/2 and beyond.
    Not sure what that's about yet but I'm aware of it.

  6. 6 Posted by sebastien.aucou... on 03 Jun, 2016 04:12 PM

    Tasos,
    Here are the links crawled with the 2.0-deb 26 May nightly build:

     [+] http://testhtml5.vulnweb.com/
     [+] http://testhtml5.vulnweb.com/#/about
     [+] http://testhtml5.vulnweb.com/#/archive
     [+] http://testhtml5.vulnweb.com/#/carousel
     [+] http://testhtml5.vulnweb.com/#/contact
     [+] http://testhtml5.vulnweb.com/#/latest
     [+] http://testhtml5.vulnweb.com/#/latest/page/1
     [+] http://testhtml5.vulnweb.com/#/popular
     [+] http://testhtml5.vulnweb.com/#/popular/page/1
     [+] http://testhtml5.vulnweb.com/#/popular/page/2
     [+] http://testhtml5.vulnweb.com/#/popular/page/3
     [+] http://testhtml5.vulnweb.com/#/popular/page/4
     [+] http://testhtml5.vulnweb.com/ajax/latest
     [+] http://testhtml5.vulnweb.com/ajax/popular
     [+] http://testhtml5.vulnweb.com/contact
     [+] http://testhtml5.vulnweb.com/forgotpw
     [+] http://testhtml5.vulnweb.com/login
     [+] http://testhtml5.vulnweb.com/logout
     [+] http://testhtml5.vulnweb.com/static/css/style.css
     [~] Total: 19
    

    The crawl is better because these 404 links no longer appear:

     [+] http://testhtml5.vulnweb.com/.carousel
     [+] http://testhtml5.vulnweb.com/.carousel-inner
     [+] http://testhtml5.vulnweb.com/.fluid-container
     [+] http://testhtml5.vulnweb.com/.well
     [+] http://testhtml5.vulnweb.com/row
     [+] http://testhtml5.vulnweb.com/span
    

    The JS links have also disappeared:

     [+] http://testhtml5.vulnweb.com/static/app/app.js
     [+] http://testhtml5.vulnweb.com/static/app/controllers/controllers.js
     [+] http://testhtml5.vulnweb.com/static/app/libs/sessvars.js
     [+] http://testhtml5.vulnweb.com/static/app/post.js
     [+] http://testhtml5.vulnweb.com/static/app/services/itemsService.js
    

    You already know about the /#/latest/page/ URLs.
    It looks like the logout URL is always listed (in the few tests I ran).

  7. Support Staff 7 Posted by Tasos Laskos on 03 Jun, 2016 04:55 PM

  8. 8 Posted by sebastien.aucou... on 04 Jun, 2016 08:03 AM

    I tested the nightlies, and the JS URLs are back.
    Keep me informed when you have a fix for /#/latest/page/.
    Thanks Tasos, have a nice day.

  9. Support Staff 9 Posted by Tasos Laskos on 03 Aug, 2016 02:24 PM

    I finally found some time to look into this, and the fix is being uploaded to the nightlies as we speak.
    I'll let you know once the packages are up.

  10. Support Staff 10 Posted by Tasos Laskos on 03 Aug, 2016 04:28 PM

  11. Tasos Laskos closed this discussion on 03 Aug, 2016 04:28 PM.
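
For anyone comparing two crawls the way this thread does by eye, the sketch below shows one way to find which URLs appear in only one run. It assumes the sitemap is printed as `[+] <url>` lines, as in the listings quoted above. The function names `parse_sitemap` and `diff_sitemaps` are hypothetical helpers, not part of Arachni, and the two sample runs are hand-written illustrations rather than real scan output.

```python
import re

# Matches sitemap lines such as " [+] http://testhtml5.vulnweb.com/login".
# Summary lines like "[~] Total: 19" do not match and are ignored.
URL_LINE = re.compile(r"^\s*\[\+\]\s+(\S+)\s*$")


def parse_sitemap(output: str) -> set[str]:
    """Collect the URLs found on '[+] <url>' lines of a crawl report."""
    urls = set()
    for line in output.splitlines():
        match = URL_LINE.match(line)
        if match:
            urls.add(match.group(1))
    return urls


def diff_sitemaps(first: str, second: str) -> tuple[list[str], list[str]]:
    """Return (only_in_first, only_in_second), each sorted for stable output."""
    a, b = parse_sitemap(first), parse_sitemap(second)
    return sorted(a - b), sorted(b - a)


if __name__ == "__main__":
    # Hand-written sample excerpts (assumption), not actual scan results.
    run_1 = """
     [+] http://testhtml5.vulnweb.com/
     [+] http://testhtml5.vulnweb.com/login
     [+] http://testhtml5.vulnweb.com/logout
     [~] Total: 3
    """
    run_2 = """
     [+] http://testhtml5.vulnweb.com/
     [+] http://testhtml5.vulnweb.com/login
     [~] Total: 2
    """
    only_first, only_second = diff_sitemaps(run_1, run_2)
    print("only in first run:", only_first)
    print("only in second run:", only_second)
```

Saving the output of each scan to a file and passing the file contents to `diff_sitemaps` would list the URLs that differ, which is easier to attach to a report than two totals.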
