Musings on personal and enterprise technology (of potential interest to professional technoids and others)

Tuesday, October 23, 2007

Change Management/ITIL/Root Causes - Information Security Magazine

A nice perspective on ITIL, and on the ideal goal of proactively identifying root causes so that follow-ups are meaningful:

Seven Winners, One Mission - Information Security Magazine:
"'...we shouldn't be doing security for the sake of doing security. We should be doing security because we're running a business,' Riggs says. Service management is a Reuters-wide mandate, one spawned three years ago as a regimen of strict best practices based on the popular U.K. ITIL standard. Riggs is also working to integrate IT security as a global discipline into ITIL best practices.

'As a company we've been banging the drum about customer service, and we're pushing hard to ensure things are done in a systematic, disciplined way to make the customer experience even better,' Riggs says. Reuters' customers measure performance in hundredths or thousandths of seconds; latency is not tolerated. Thus it is dogged work tracking a complex environment of real-time data feeds, historical databases and an infrastructure of 30,000 switches, routers and more than 1,300 firewalls. A standardized service management approach is the only logical means of keeping such complexity reined in, Riggs says.

In addition, the company has unified operations and security around incident, problem, configuration, change and release management processes. For example, Reuters' security analysts examine every security incident--whether it caused a disruption or not--to understand a root cause of the management behavior that failed and why a service was not resilient. Finding the root cause allows Riggs' team to apply that information elsewhere and mitigate future events. Modeling exercises, meanwhile, allow them to anticipate problems in the event of future incidents or scheduled network changes, which can number hundreds per week.

"You always expect your infrastructure to come under attack. But if it fails, you have to understand the real underlying root cause. Was it a network design problem, a third-party quality failure, capacity overrun or did it fail because of a configuration problem?" Riggs says. "We want to pinpoint this as well as any aggravating factors and triggers...and then see where we may be exposed elsewhere and fix it before it causes customer pain."

Riggs has tried to instill that uniformity up and down Reuters' supply chain as well.

"I treat them as a virtual extension of my team, and expect them to behave in a certain way," Riggs says. "That's what I expect of the products they deliver...."

