The way you've set up the grid, no Dispatcher other than arachni-master will be used, due to the high weight you've assigned to it.
Also, try removing the --spawns option; it's unstable and will be removed.
Don't specify --grid either, so that you won't have to specify --spawns; the Dispatchers will still load balance the scans amongst themselves.
Also, arachni-master isn't managing anything; no one node is more important than the others. Whichever one you ask will search the Grid for the Dispatcher with the lowest workload score, ask it for an Instance, and then pass that information back to you.
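To illustrate the selection described above, here is a minimal sketch of picking the node with the lowest workload score. It is not Arachni's actual code: the Node class, the pick_least_loaded function, the host names other than arachni-master and the score values are all made up for the example, and how the weight feeds into the score is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical stand-in for a Dispatcher as seen by its Grid peers.
    url: str
    workload_score: float


def pick_least_loaded(nodes):
    """Return the node with the lowest workload score (ties go to the first listed)."""
    if not nodes:
        raise ValueError("the Grid has no Dispatchers")
    return min(nodes, key=lambda node: node.workload_score)


# Illustrative scores only; any node can be asked, the answer is the same.
grid = [
    Node("arachni-master", 4.0),
    Node("arachni-2", 1.5),
    Node("arachni-3", 2.0),
]
preferred = pick_least_loaded(grid)
print(preferred.url)
```

Since every node runs the same search, it makes no difference which Dispatcher the client contacts first.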
Still very unstable. It looks like arachni_rpc is losing its connection to the dispatcher (after pressing Ctrl+C it hangs forever).
And then some of the new scans started with rpc will not start.
I don't know if it is caused by RAM consumption, but some of our servers also become completely unresponsive in some cases. I just think it's hard to control RAM usage when it is balancing a lot of scans by itself.
Do you have any ideas, or is there any debug info that I could provide that would be helpful?
That is good to hear. Around 60% to 80%. I would not describe it as lagging, but rather as the client being unable to contact the dispatcher. Also, some of the boxes are completely dead, meaning I can't even SSH into them. In the Google Cloud console they show ~0% CPU at that time, though.
Good to know, thanks.
I understand. It is hard for me to believe that it is a network issue though, given that it is built in Google Cloud and the servers are next to each other.
Yeah, better stick with one scan per core.
Also, while the scans are running, can you try watch -n1 df and watch -n1 free over SSH? At the point where it gets stuck we'll know how things look resource-wise.
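If the box dies before you can read the watch output, logging the same numbers to a file might help, since the last lines written would show the state just before it hung. A rough sketch, assuming Linux boxes with /proc/meminfo; the function names and the resources.log file name are made up for this example.

```python
import os
import shutil
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def snapshot(path="/", meminfo_path="/proc/meminfo"):
    """Collect free disk space and available memory, roughly what df and free report."""
    disk = shutil.disk_usage(path)
    with open(meminfo_path) as f:
        mem = parse_meminfo(f.read())
    return {
        "time": time.strftime("%Y-%m-%d %H:%M:%S"),
        "disk_free_mb": disk.free // (1024 * 1024),
        "mem_available_mb": mem.get("MemAvailable", 0) // 1024,
        "swap_free_mb": mem.get("SwapFree", 0) // 1024,
    }


def log_samples(logfile="resources.log", interval=1.0, count=3600):
    """Append one snapshot per interval, syncing each line to disk right away."""
    with open(logfile, "a") as out:
        for _ in range(count):
            out.write(f"{snapshot()}\n")
            out.flush()
            os.fsync(out.fileno())  # so the line survives if the box locks up
            time.sleep(interval)
```

Calling log_samples() in a separate terminal or under nohup while the scans run would record about an hour of one-second samples.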