I've got a project where I'm hitting a bunch of custom Windows Performance Counters on multiple servers and aggregating them into a database. If a server is down, I want
Opening a Socket to a specific port usually does the trick. If you really want it to be fast, be sure to set the NoDelay property on the new socket (this disables the Nagle algorithm) so there is no buffering.
Fast will largely depend on latency but this is probably the fastest way I know to connect to an endpoint. It's pretty simple to parallelize using the async methods. How fast you can check will largely depend on your network topology but in tests for 1000 servers (latency between 0-75ms) I've been able to get connectivity state in ~30 seconds. Not scientific data at all but should give you the idea.
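As a rough illustration of the async approach, here is a minimal sketch that probes one port on many hosts at once. The names PortProbe, IsReachableAsync and CheckAllAsync are made up for this example, and the port and timeout are whatever suits your service. It assumes a runtime where Socket.ConnectAsync(string, int) is available (.NET Core 2.0+ or .NET Framework 4.7.1+).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Sockets;
using System.Threading.Tasks;

// Hypothetical helper, not part of any library.
static class PortProbe
{
    // True if a TCP connection to host:port completes within the timeout.
    public static async Task<bool> IsReachableAsync(string host, int port, TimeSpan timeout)
    {
        using (var socket = new Socket(SocketType.Stream, ProtocolType.Tcp))
        {
            socket.NoDelay = true; // disable the Nagle algorithm, as suggested above
            try
            {
                Task connect = socket.ConnectAsync(host, port);
                Task finished = await Task.WhenAny(connect, Task.Delay(timeout));
                if (finished != connect)
                {
                    return false; // timed out; disposing the socket abandons the attempt
                }
                await connect; // rethrows a connection or DNS failure, if any
                return socket.Connected;
            }
            catch (SocketException)
            {
                return false; // refused, unreachable, or the name did not resolve
            }
        }
    }

    // Probes every host concurrently and returns host -> reachable.
    public static async Task<Dictionary<string, bool>> CheckAllAsync(
        IEnumerable<string> hosts, int port, TimeSpan timeout)
    {
        string[] list = hosts.Distinct().ToArray();
        bool[] results = await Task.WhenAll(
            list.Select(h => IsReachableAsync(h, port, timeout)));

        var state = new Dictionary<string, bool>();
        for (int i = 0; i < list.Length; i++)
        {
            state[list[i]] = results[i];
        }
        return state;
    }
}
```

This sketch starts every probe at once. For a very large server list you would probably want to cap the number of connection attempts in flight.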
Also, don't ever do this through UNC file shares: if the server no longer exists, you will have a lot of dangling connections that take forever to time out. So if you have a lot of servers with invalid DNS records and you try to poll them, you will bring Windows down completely over time. Things like File.Exists and any file access will cause this.
The "Full-Blown" option would be to install a monitoring tool like SCOM (System Center Operations Manager). It has an SDK you can use to query SCOM for performance and maintenance information about the machines being monitored. Might be a bridge too far though...
Telnet is another option. Try telnetting to the target machine to see if it responds.
Create a small Windows Service that you install on your target machine, and have the sysadmin stop it when they perform maintenance on that machine (just use a batch file to net stop / net start the service).
The only "quick" way I can think of to see if it's up without relying on ping would be to create a socket and see if you can actually connect to the port of the service you're trying to reach.
This would be the equivalent of telnet servername 135 to see if it's up.
Specifically, use System.Net.Sockets.TcpClient to attempt the connection, and Close() to cancel the connection.
Disclaimer: I have no idea what effect this would have on any threat/firewall protection that may see this type of connect/disconnect activity, with no data sent, as a threat.
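A minimal sketch of that connect-then-close check with TcpClient might look like the following. The class and method names (TcpCheck, CanConnect) and the timeout parameter are invented for illustration; port 135 in the usage line is simply the one from the telnet example above.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical helper, not part of any library.
static class TcpCheck
{
    // True if host accepts a TCP connection on port within timeoutMs.
    public static bool CanConnect(string host, int port, int timeoutMs)
    {
        var client = new TcpClient();
        try
        {
            // Wait returns false if the attempt is still pending at the timeout.
            return client.ConnectAsync(host, port).Wait(timeoutMs) && client.Connected;
        }
        catch (AggregateException)
        {
            return false; // Wait wraps the SocketException (refused, no route, bad name)
        }
        catch (SocketException)
        {
            return false; // failure raised before the task was created
        }
        finally
        {
            client.Close(); // no data is sent; this also abandons a pending attempt
        }
    }
}
```

Usage would be something like `bool up = TcpCheck.CanConnect("servername", 135, 1000);`.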
Ping First, Ask Questions Later
Why not ping first, and then do the di.Exists if you get a response?
That would allow you to fail early when a machine is not reachable, and not waste time on machines that are down hard.
I have, in fact, used this method successfully before.
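Roughly, the ping-then-check idea could be sketched like this. ShareCheck, RespondsToPing and ShareExists are hypothetical names, and the sketch assumes ICMP echo is allowed between you and the target; if a firewall drops pings, every machine will look down.

```csharp
using System;
using System.IO;
using System.Net.NetworkInformation;

// Hypothetical helper, not part of any library.
static class ShareCheck
{
    // True if the host answers an ICMP echo within timeoutMs.
    public static bool RespondsToPing(string host, int timeoutMs)
    {
        try
        {
            using (var ping = new Ping())
            {
                return ping.Send(host, timeoutMs).Status == IPStatus.Success;
            }
        }
        catch (PingException)
        {
            return false; // e.g. the host name does not resolve
        }
    }

    // Only touches the UNC path if the host answered the ping first.
    public static bool ShareExists(string host, string uncPath, int timeoutMs)
    {
        if (!RespondsToPing(host, timeoutMs))
        {
            return false; // fail early, skipping the slow file-share timeout
        }
        var di = new DirectoryInfo(uncPath);
        return di.Exists;
    }
}
```

The point of the ordering is that the cheap check with a short, explicit timeout runs first, so the expensive di.Exists call is only made against machines that are at least answering.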
Parallelize
Another option you have is to parallelize the checking, and act on the servers as they become known to be available.
You could use the Parallel.ForEach() method, and use a thread-safe queue along with a simple consumer thread to do the required action. Combined with the checking method above, this could alleviate almost all of your bottleneck on the up/down checking.
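One way this producer/consumer arrangement could look is sketched below. ServerSweep and its isUp and collect delegates are placeholders invented for the example: isUp stands in for whichever up/down check you choose, and collect for the real work (such as reading the performance counters).

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper, not part of any library.
static class ServerSweep
{
    // Checks servers in parallel; a single consumer processes each one found to be up.
    public static void Run(
        IEnumerable<string> servers,
        Func<string, bool> isUp,      // placeholder for the up/down check
        Action<string> collect)       // placeholder for the real work
    {
        // BlockingCollection over a ConcurrentQueue is the thread-safe queue.
        using (var available = new BlockingCollection<string>(new ConcurrentQueue<string>()))
        {
            // Consumer: runs until the queue is marked complete and drained.
            Task consumer = Task.Run(() =>
            {
                foreach (string server in available.GetConsumingEnumerable())
                {
                    collect(server);
                }
            });

            try
            {
                // Producers: check many servers at once, queue the ones that respond.
                Parallel.ForEach(servers, server =>
                {
                    if (isUp(server))
                    {
                        available.Add(server);
                    }
                });
            }
            finally
            {
                available.CompleteAdding(); // lets the consumer loop end
                consumer.Wait();
            }
        }
    }
}
```

Because the consumer starts before the checks do, work on the first available server begins while the remaining servers are still being probed.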
Knock on the Door
Yet another method would be to check if the required remote service is running (either by hitting its port directly or by querying it with WMI).
Since WMI is almost always running when a machine is up, your connection should be very quick to either succeed or fail.