OK, what I noticed is that if the provided endpoint returns a 404, Arachni doesn't spider it.
http://test-endpoint:8081/ has nothing behind it, so we return a 404. It makes sense to start the scan from this point because the crawler can then spider into all areas, whereas starting from /test1/ means it may never reach /test2/, /test3/, and so on. How can I still get Arachni to run even when the response code is a 404?
Yeah, I'm sure. I changed the microservice to return a 200 instead of a 404 on the "/" resource, and it spiders, but this is hacky; I don't want to change the application just to make it scannable. Is it Arachni's default behaviour not to run on a 404?
I thought it would have been able to spider into other parts of the URL. I'm not too sure how the underlying implementation works, but I thought it would just try different URL endings (/AB, /AC, etc.) even if a 404 was present. I think you're right: if the "/" response is a 404, it doesn't expose any usable paths, so Arachni only scans one page. I wish this were more explicitly stated; maybe it's just my understanding.
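To spell out what I had imagined the crawler doing, here's a quick sketch (plain Python, nothing to do with Arachni's internals) of brute-forcing two-letter endings against the base URL:

```python
from itertools import product
from string import ascii_uppercase

# What I had pictured: generating every two-letter ending (/AA, /AB,
# ..., /ZZ) and probing each one against the base URL. As discussed
# below, this is NOT how the crawl works.
base = "http://test-endpoint:8081/"
candidates = [base + a + b for a, b in product(ascii_uppercase, repeat=2)]

print(len(candidates))  # 26 * 26 = 676 candidate paths
```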
How should I do this then? Is there a way of automatically spidering into URLs from a 404 "/"?
The response code is irrelevant; if the response contains paths that can be followed, they will be followed.
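To make that concrete, here's a minimal sketch (plain Python stdlib, not Arachni's actual code) showing why the status code never enters into link extraction:

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects href attributes from anchor tags, the way a crawler
    harvests follow-up paths from a response body."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# A 404 response whose body still links to real resources. The link
# extraction only ever looks at the body; the status code plays no part.
status = 404
body = '<html><body><a href="/test1/">t1</a> <a href="/test2/">t2</a></body></html>'

parser = LinkExtractor()
parser.feed(body)
print(parser.links)  # both paths are harvested despite the 404
```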
I think what confused you are the checks that discover directories; those aren't part of the crawl and will not run on non-200 codes, so you will see a difference in behaviour there.
The important part is that the scan will not proceed if the seed URL hasn't got any usable paths; you either need to provide a different target URL or extend the paths manually.
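For the second option, something like this should work; the /test1..3 paths are guesses based on your setup and the exact flag spelling is from memory, so double-check it against `arachni --help`:

```shell
# Hypothetical seed-path file listing the areas the crawler can't
# discover on its own from the 404 "/".
cat > extra-paths.txt <<'EOF'
http://test-endpoint:8081/test1/
http://test-endpoint:8081/test2/
http://test-endpoint:8081/test3/
EOF

# Feed the extra paths to the scan; --scope-extend-paths adds them
# on top of whatever the crawl finds by itself.
command -v arachni >/dev/null && \
  arachni http://test-endpoint:8081/ --scope-extend-paths=extra-paths.txt || \
  echo "arachni not on PATH; command shown for reference"
```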