The Data Intensive Distributed Computing Research Group (DIDC) 
(NOTE: Obsolete Web page! For current projects see the new DSD web page.)
Lawrence Berkeley National Laboratory
Distributed Systems Department
Computational Research Division

Self-Configuring Network Monitor Project
Principal Investigators:   Deb Agarwal and Brian Tierney


Vision:


Application developers currently have very few tools to help them build distributed applications that use the network effectively. The tools that do exist are generally accessible only to network engineers and do not provide information about the entire network path (local and wide area networks). Without information about a stream from intermediate hops within the network, the end-to-end system is often unable to identify and diagnose problems within the network. For a distributed application to fully utilize the network, it must first know the current network properties and what is happening to its data.

This project addresses the need for a network monitoring infrastructure that supports passive network monitoring. The ultimate goal of this infrastructure is to provide accurate, comprehensive, on-demand, application-to-application monitoring capabilities throughout the interior of the interconnecting network domains. In this project we are designing and implementing a self-configuring monitoring system that uses special request packets to automatically activate monitoring along the network path between communicating endpoints. Archived monitoring data will help point the way beyond the handcrafted systems of network testbeds to a production environment that can routinely support high performance distributed applications.

This passive monitoring system will integrate with active monitoring efforts and provide an essential component in a complete end-to-end network test and monitoring capability. It will complement the existing network operation efforts. A principal design goal of the system is to provide components that are secure, easy to install, and easy to maintain so that the system does not add a burden to the network's administration. This architecture will not require modifications to the application, network routing, or forwarding infrastructure, nor will it require human intervention once monitoring has been triggered.

Major Goals and Technical Challenges:


Comprehensive end-to-end and top-to-bottom monitoring is critical for developing and debugging high performance distributed applications. However, this service is largely unavailable to the application developer except in testbed environments. Increasingly, these applications rely on "automatic" tuning of transport parameters such as TCP window size and the number of parallel streams. The results of the tuning must still be verified, and sometimes debugged, and both tasks rely on fine-grained network monitoring. In addition, end-to-end approaches are limited in their ability to diagnose problems in the intervening networks and to assess the impact of tuning on other traffic in the network. The information from the monitors will be directly available to applications to aid in debugging and tuning application data transmission.

Applications will be able to send "request" packets to automatically activate monitoring along the network path between communicating endpoints. The request packets pass through passive sensors that are deployed at the ingress and egress routers of the wide-area networks and at critical points in the end site networks. To activate monitoring, an endpoint of a data stream runs a program that sends request packets to the other endpoint. The purpose of these packets is to alert each sensor in the interior of the network that the corresponding application flow is requesting monitoring from the network. Once activated, the sensors open a connection to a remote agent and send it a stream of monitoring data extracted from the packet flow.

We will be deploying this system at critical ESnet ingress and egress sites and at a few prototype end sites. This passive monitoring system will provide an essential component in a complete end-to-end network test and monitoring capability and will complement the existing network operation efforts. Most critically, it will provide a mechanism for applications to determine what is happening to their data in the network. It is expected to be critical in helping to bridge the gap between network engineers and application designers/users.
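The request-packet activation described above can be illustrated with a small in-memory sketch. This is not the project's implementation or wire format: the Packet, Sensor, and Agent classes, the duration field (a count of data packets to report), and the sensor names are all hypothetical, chosen only to show the flow of control. A request packet travels the path, each passive sensor it passes begins reporting on the named flow, and the reports arrive at a remote agent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# A flow is identified here by (src_ip, src_port, dst_ip, dst_port).
Flow = Tuple[str, int, str, int]


@dataclass
class Packet:
    flow: Flow
    size: int               # bytes on the wire
    request: bool = False   # True for a monitoring "request" packet
    duration: int = 0       # hypothetical: number of data packets to report


@dataclass
class Agent:
    """Stands in for the remote agent that collects sensor output."""
    records: List[Tuple[str, Flow, int]] = field(default_factory=list)

    def receive(self, sensor_name: str, flow: Flow, size: int) -> None:
        self.records.append((sensor_name, flow, size))


@dataclass
class Sensor:
    """Passive sensor: it only observes packets and never alters them."""
    name: str
    agent: Agent
    active: Dict[Flow, int] = field(default_factory=dict)  # flow -> reports left

    def observe(self, pkt: Packet) -> None:
        if pkt.request:
            # A request packet activates monitoring for the flow it names.
            self.active[pkt.flow] = pkt.duration
        elif self.active.get(pkt.flow, 0) > 0:
            # Forward a summary of the matching data packet to the agent.
            self.agent.receive(self.name, pkt.flow, pkt.size)
            self.active[pkt.flow] -= 1


def send_along_path(path: List[Sensor], pkt: Packet) -> None:
    """Deliver a packet end to end; every sensor on the path sees it."""
    for sensor in path:
        sensor.observe(pkt)


if __name__ == "__main__":
    agent = Agent()
    path = [Sensor("site-egress", agent), Sensor("wan-ingress", agent)]
    flow = ("10.0.0.1", 5000, "10.0.1.1", 6000)

    # The endpoint requests monitoring, then sends application data.
    send_along_path(path, Packet(flow, 64, request=True, duration=2))
    for size in (1500, 1500, 1500):
        send_along_path(path, Packet(flow, size))

    for record in agent.records:
        print(record)
```

In this sketch the endpoints never address the sensors directly. The sensors activate themselves because they sit on the path the request packet takes, which is the sense in which the system is self-configuring.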

The goals of this project are:



Recent Talks


Page last modified: Friday, 04-Feb-2005 00:39:26 PST