Q: How will the network monitoring be conducted?
[Network] monitoring is really important, especially if you're committing to some kind of service-level agreement (SLA). In that case, you need to monitor the network to be able to demonstrate that you're meeting the SLA. A network that isn't monitored is just a pile of equipment.
There are two kinds of [network] monitoring: historical monitoring and real-time monitoring. They serve different purposes and help you in different ways. Historical monitoring tracks some characteristic of the network over time. One example is the utilization on a particular pipe. Maybe you're recording how full that pipe is every 10 minutes, and over the course of a year you build a graph that lets you plan for the future. You can often eyeball the chart and see that you're going to run out of capacity in, say, six months at current growth rates. You know it takes six weeks to order more parts, so you can practically pick the day that you need to order more capacity.
This is a lot better than the situation where the customer calls and says, "The network seems to be overloaded," and you say, "I can fix the problem, but it's going to take six weeks to order more capacity." Historical monitoring helps you prevent that kind of situation.
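The "pick the day you need to order" reasoning can be sketched in a few lines of Python. This is only an illustration under stated assumptions: utilization is assumed to grow roughly linearly, the samples are assumed to be daily or monthly averages rather than raw 10-minute readings, and the names `fit_line` and `order_by_date` are hypothetical, not part of any monitoring product.

```python
from datetime import date, timedelta

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need samples from at least two different days")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def order_by_date(samples, capacity_pct=100.0, lead_time_days=42):
    """Estimate the latest date to order more capacity.

    samples: list of (date, utilization_pct) pairs, oldest first.
    lead_time_days: assumed ordering lead time (six weeks = 42 days).
    Returns None when utilization is flat or falling.
    """
    start = samples[0][0]
    xs = [(d - start).days for d, _ in samples]
    ys = [u for _, u in samples]
    slope, intercept = fit_line(xs, ys)
    if slope <= 0:
        return None  # no growth trend, so no projected exhaustion date
    days_until_full = (capacity_pct - intercept) / slope
    full_date = start + timedelta(days=round(days_until_full))
    return full_date - timedelta(days=lead_time_days)

# Hypothetical readings: a pipe growing about 3 points per month.
history = [
    (date(2024, 1, 1), 40.0),
    (date(2024, 1, 31), 43.0),
    (date(2024, 3, 1), 46.0),
]
print(order_by_date(history))
```

A straight-line fit is the simplest stand-in for eyeballing the chart. If growth is seasonal or accelerating, the projected date will be too optimistic, so it is worth ordering with some margin.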
Other things you might want to monitor historically are overall [network] utilization and congestion. Don't go wild and try to collect every bit of information on every possible link. I think it's important to pick your trunk lines and maybe some key servers, and monitor those for utilization and packet loss. That will do you just fine.
The other kind of [network] monitoring is often what people think about first when they think about monitoring, and that's real-time monitoring. That detects whether something [on your network] is up or down. So real-time monitoring tells you when to roll in the repair truck. This is even more important with an N+1 redundant network. [In that case] a device can be down and you won't know it, because the network stays up when one device goes down. So it's important to monitor not just whether data is getting through, but also the individual components. I've been at many sites where there's a router with three fans that could run on two fans just fine, and the third one was for redundancy. I got there and multiple fans had died because no one noticed when the first fan died.
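The fan story suggests a simple rule: alert when redundancy is lost, not only when service is lost. Here is a minimal Python sketch of that rule. The `ComponentGroup` type and the `classify` function are hypothetical names made up for illustration, and how the healthy count is actually collected from the device is left out.

```python
from dataclasses import dataclass

@dataclass
class ComponentGroup:
    name: str
    required: int  # units needed to carry the load (the "N")
    total: int     # units installed (N+1 or more)
    healthy: int   # units currently reporting healthy

def classify(group: ComponentGroup) -> str:
    """Return "DOWN", "DEGRADED", or "OK" for a redundant group."""
    if group.healthy < group.required:
        return "DOWN"      # not enough units left to do the job
    if group.healthy < group.total:
        return "DEGRADED"  # still working, but the spare is gone
    return "OK"

# Hypothetical router with three fans that runs fine on two.
fans = ComponentGroup(name="router fans", required=2, total=3, healthy=2)
print(fans.name, classify(fans))  # prints "router fans DEGRADED"
```

An end-to-end "is data getting through" check would still pass in the DEGRADED case, which is why the per-component state has to be checked and alerted on separately.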