January 20, 2015
Global computer networks are immensely beneficial to many users, but they can also be immensely difficult for network administrators to manage. Running a modern data network – with thousands of computers spread across a wide area – requires juggling myriad systems, including power regulation, maintenance and traffic management, not to mention security.
To meet the needs of ever-expanding systems, researchers at Princeton and Microsoft have created an automated tool that manages the network’s needs. Called Statesman, the new software acts like an air traffic controller for large computer networks: it constantly monitors the needs of the system and coordinates actions of other tools involved in maintenance and operations.
“Companies that run these large clouds have a scale problem,” said Jennifer Rexford BSE ’91, the Gordon Y.S. Wu Professor in Engineering and one of the developers of Statesman. “The size of the networks keeps getting bigger and bigger.”
Working with the team at Microsoft’s Azure network, Rexford and graduate student Peng Sun set out to create a network management system that is reliable, adaptable and requires little or no human intervention. To manage multiple tasks that run independently – like power management and network traffic – the team defined three states for a network: an observed state, a proposed state and a target state. The Statesman program maintains a current view of the network, which is the observed state, and is also responsible for updating the network to the desired target state.
When a subordinate system wants to make a change in the network – say a traffic management system wants to send data requests to a different group of servers on the network – it develops a proposed state and sends this to Statesman. Statesman compares the proposed state to changes proposed by other programs and uses a set of rules to determine whether the change can be allowed. If, for example, the traffic manager wanted to use servers that a power management system needed to take offline, the traffic request would be denied.
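The interplay of observed, proposed and target states can be illustrated with a short Python sketch. This is not Statesman's code: the `Proposal` and `Coordinator` names, the string status values and the rule that the higher-priority proposal wins a conflict are all assumptions made for illustration, since the article does not describe the actual rules.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Proposal:
    """One entry of a proposed state (hypothetical structure)."""
    app: str       # tool submitting the change, e.g. "traffic-manager"
    device: str    # device the change applies to
    value: str     # status the tool wants the device to have
    priority: int  # assumed conflict rule: higher priority wins


@dataclass
class Coordinator:
    """Toy stand-in for the coordinating role described above."""
    observed: dict                              # current view of the network
    target: dict = field(default_factory=dict)  # state the network should reach

    def merge(self, proposals):
        """Build the target state and return the proposals that were denied."""
        self.target = dict(self.observed)  # start from the observed state
        accepted = {}
        rejected = []
        # Consider higher-priority proposals first (sorted() is stable).
        for p in sorted(proposals, key=lambda p: -p.priority):
            winner = accepted.get(p.device)
            if winner is not None and winner.value != p.value:
                rejected.append(p)  # conflicts with an accepted change
                continue
            accepted.setdefault(p.device, p)
            self.target[p.device] = p.value
        return rejected


# The article's scenario: the traffic manager wants a server that the
# power manager needs to take offline.
coordinator = Coordinator(observed={"server-1": "online", "server-2": "online"})
rejected = coordinator.merge([
    Proposal("power-manager", "server-1", "offline", priority=2),
    Proposal("traffic-manager", "server-1", "serving", priority=1),
    Proposal("traffic-manager", "server-2", "serving", priority=1),
])

print(coordinator.target)                        # merged target state
print([(p.app, p.device) for p in rejected])     # denied requests
```

In this sketch the traffic manager's request for `server-1` is denied because it conflicts with the power manager's higher-priority change, while its request for `server-2` goes through. The observed state is left untouched; only the target state changes.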
“We wanted a system that could manage very large-scale infrastructure automatically and handle conflict and safety issues on its own,” Sun said.
Statesman began operating in Microsoft data centers last October.
“It is not a prototype,” Sun said. “It was built for use from day one.”
by John Sullivan, EQuad News Idea to Impact Winter 2015