The Department of Veterans Affairs is developing a new “infrastructure operations as a service” model to reduce downtime and streamline services, the agency’s Deputy Chief Information Officer of Infrastructure Operations Reggie Cummings told GovCIO Media & Research.
Cummings is responsible for four main portfolios: VA Enterprise Cloud, VA Platform One, Identity and Access Management (IAM) and SharePoint Online Services. His team is also building out a new model he termed "infrastructure operations (IO) as-a-service" to align with broader VA business models.
“We identified and created a methodology within IO to help us better align to some of the more traditional business models. We call it ‘IO-as-a-service,’ and in our approach, we’re creating and driving our current and future capability roadmaps,” Cummings said.
From the "IO-as-a-service" perspective, Cummings’ team follows three key principles: customer-centric service, operating in a bimodal manner and continuous communication.
IO’s customer-centric service creates a central intake, baselines and assesses product areas, and provides a service catalog to show services offered and how to access them.
Cummings’ team uses a general DevSecOps framework to align with VA’s Office of Information and Technology (OIT), with a focus on strong security operations, automation, modernization and innovation. Under a bimodal approach to software development, IO lets waterfall projects continue alongside DevSecOps work within its infrastructure, without either practice interfering with the other.
“That’s how we’re approaching it from a bimodal perspective. We don't allow one area to infringe on the other. Both are still focused on security operations and automation; it's just the speed at which they do that is a little different,” Cummings said.
The IO team will also take a deeper look at service operations at the component level to ensure resiliency.
One example of this introspective approach is the progress made with VA’s IAM services. VA built IAM for three-nines (99.9%) availability, which allows roughly 50 minutes of downtime per month at most. But 50 minutes of downtime per month is still disruptive to daily operations, Cummings said.
“We had to critically look at how we are instrumented, how we're monitoring, and then how we're building on top of what we already thought was a highly available system,” he said.
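As a rough check on those availability figures, the downtime a target allows follows directly from the percentage. The sketch below is an illustration, not VA tooling: it assumes a 30-day month, and the function name is hypothetical. Under that assumption, three nines (99.9%) allows about 43 minutes of downtime per month, so the 50-minute figure reads as a round approximation, while a four-nines (99.99%) target allows a little over four minutes.

```python
# Downtime budget implied by an availability target.
# Assumption: a 30-day month; a real SLA may define its window differently.

MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60  # 43,200 minutes


def downtime_budget_minutes(availability_percent: float,
                            period_minutes: float = MINUTES_PER_30_DAY_MONTH) -> float:
    """Return the maximum downtime, in minutes, allowed within the period."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return period_minutes * (100 - availability_percent) / 100


if __name__ == "__main__":
    for label, target in [("three nines", 99.9), ("four nines", 99.99)]:
        budget = downtime_budget_minutes(target)
        print(f"{label} ({target}%): {budget:.1f} minutes per 30-day month")
```

Each additional nine cuts the allowed downtime by a factor of ten, which is why moving from three to four nines usually requires architectural changes rather than faster manual recovery.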
To solve the problem, IO set out to build a four-nines environment for IAM to increase resiliency. VA created an “active-active” solution, in which both sites run live, so that if one fails the other is available for use immediately.
“If either one of [the sites] failed at the time a session occurred, it would automatically failover to the non-impacted environment, without any real indication to the user that it occurred,” Cummings explained. “We got through the first two or three phases of that. We have several products that are running and operating in that active-active model.”
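The behavior Cummings describes can be pictured with a minimal sketch of active-active routing. This is not VA's implementation: the site names, the `Site` and `ActiveActiveRouter` classes and the simple round-robin policy are all assumptions made for illustration. Both sites take traffic in normal operation, and when one is marked unhealthy, requests go to the other without the caller doing anything differently.

```python
# Sketch of active-active routing: both sites serve traffic, and requests
# skip any site marked unhealthy. All names here are hypothetical.
from dataclasses import dataclass
from itertools import cycle
from typing import Iterator, List


@dataclass
class Site:
    name: str
    healthy: bool = True

    def handle(self, request: str) -> str:
        if not self.healthy:
            raise ConnectionError(f"{self.name} is unavailable")
        return f"{self.name} served {request}"


class ActiveActiveRouter:
    def __init__(self, sites: List[Site]) -> None:
        if not sites:
            raise ValueError("at least one site is required")
        self._sites = sites
        self._rotation: Iterator[Site] = cycle(sites)

    def route(self, request: str) -> str:
        # Try each site at most once, starting from the next one in rotation.
        for _ in range(len(self._sites)):
            site = next(self._rotation)
            if site.healthy:
                return site.handle(request)
        raise RuntimeError("no healthy site available")


site_a = Site("site-a")
site_b = Site("site-b")
router = ActiveActiveRouter([site_a, site_b])

print(router.route("login-1"))  # served by site-a
print(router.route("login-2"))  # served by site-b
site_a.healthy = False          # simulate an outage at one site
print(router.route("login-3"))  # served by site-b
print(router.route("login-4"))  # served by site-b
```

A production setup would also need health checks to set the `healthy` flag and some way to share session state between sites; both are left out here to keep the routing idea visible.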
Automation is also playing a leading role in VA’s IO unit. To improve automation development and integration, Cummings is focusing on three key areas: creating self-service models and catalogs, infrastructure-as-code (IaC) and AIOps.
“Automation is — to me — part of our customer-centric focus in IO,” Cummings said.
An AIOps pilot is currently underway to automate error handling and incident responses. From the pilot, VA will learn how to address actions “in the air” without causing an outage. The proactive approach is moving VA’s actions “left of boom,” Cummings said.
“That pilot has been going on for the better part of nine months or so. I’m looking forward to putting some things in production and seeing more growth from that point forward,” he added. “Part of AIOps is getting your pillars and post stood up. I think we have that. Now, it's a matter of bringing some collective intelligence to the things that we're doing so that we get to that place where we're more comfortable with the systems’ reaction.”
Moving into the new year, VA’s infrastructure operations team will focus on improving user responsiveness by building out its IO-as-a-service model and bolstering resiliency for its core infrastructure services.
“We commissioned IO as a service about two years ago. We're getting into a majority place where we now understand and have real personas out there to elevate some of the items that make a real difference for our customers,” Cummings said. “I hope that 2023 results in some real service catalogs that are usable, meaningful and valuable for our customers.”