The General Services Administration (GSA), National Science Foundation (NSF) and Department of Veterans Affairs are bolstering data capabilities by accelerating data sharing efforts to improve cross-agency collaboration and gain insights faster, ultimately driving agency missions.
“One of the things that’s the most challenging about this is that we are trying to change the gasoline in an airplane that’s already in flight, and it’s not landing,” Dorothy Aronson, NSF CIO, said during an FCW virtual event. “We have to deal with today’s problems, and answer today’s questions, at the same time as we’re working to build toward the future and make it better.”
Aronson explained that a standardized data inventory is critical to data sharing and accurate analysis. She noted that the coding systems NSF has used over the years were not shared, which makes it difficult to link datasets and derive insights.
“One of the things that people want to understand is ‘what’s the history here?’ We can design, and we are designing, our new systems with standardization more in mind than we did 40 years ago,” Aronson said. “We've escalated that to a point where we want to use shared services, we want to use standards of performance and tools.”
GSA is also building out its data governance model to better leverage data assets to inform decision-making and improve the data lifecycle process, Payman Sadegh, the agency’s chief data officer and presidential innovation fellow, explained. To get there, GSA is improving people processes to streamline the integration and use of technology tools and systems.
“Data is a lifeblood of this entire system,” Sadegh said. “We need to make sure that we are managing our data properly, so it can be processed through this system that we are thinking about building, and we have the data inventory defined, and so on. Without that, it will be very difficult to accomplish any goal as far as your data environment is concerned.”
As agencies look to accelerate data sharing, they are emphasizing the importance of privacy. Sadegh explained that data resides in many different environments and formats, so data sharing and standardization are essential to ensuring needs are met; however, data protection should not be overlooked.
As GSA fosters more collaborative efforts and data sharing agreements, the agency will strengthen transparency and protect privacy. Ultimately, Sadegh said that he’d like to “share data by default,” and GSA is improving its governance structure and technology tools to get there.
“There’s always a tradeoff to be made. On one hand, we have to have transparency and make sure that we provide the data to folks that are in need of the data. On the other, we need to make sure we’re protecting privacy,” Sadegh said. “It’ll take a combination of technology and governance to really” address this problem.
At VA, leaders are accelerating efforts around paperless medical records to improve the understanding, standardization and use of veteran health data, Dr. Amy Justice, CNH Long Professor of Medicine and of Public Health at Yale University and Staff Physician with VA Connecticut Healthcare System, said during GovCIO Media & Research’s AI Gov: Mastering Data virtual event. To properly analyze and interpret data, VA is driving data sharing and collaboration.
“It’s not like having the data solves the problem. You have to have the data, but you also have to make sense of it clinically. That requires close collaboration between clinical experts who understand how that data has been recorded and people with epidemiology and informatics expertise,” Justice said.
Rafael Fricks, Lead AI Tech Sprints Coordinator at VA, explained that the agency is also leveraging synthetic data to understand the functions of certain AI tools that sift through data stores. He explained that using synthetic data expedites results and boosts collaboration because there are fewer privacy concerns compared to sharing real patient data.
“We can generate a completely artificial set of data points that is reminiscent of the population, but doesn’t represent any particular individual,” Fricks explained. “There are significantly fewer or no privacy concerns with sharing synthetic data, so we can often share through the open data initiative, and it’s an ongoing project.”
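The approach Fricks describes can be illustrated with a minimal sketch. This is not VA's actual tooling (which is not specified in the article); it is a hypothetical example of the general technique: fitting simple per-column statistics from real records and sampling each column independently, so the synthetic rows resemble the population's distributions while no row corresponds to any real individual.

```python
import random

def synthesize(records, n, seed=0):
    """Generate n synthetic records mimicking the column-wise statistics
    of `records` without copying any individual row.

    Numeric columns are drawn from a normal distribution fitted to the
    observed values; other columns are resampled from their empirical
    frequencies. Columns are sampled independently, which breaks the
    linkage between values that identified a real person.
    (Illustrative only -- real synthetic-data tools model joint
    distributions and add formal privacy guarantees.)
    """
    rng = random.Random(seed)
    cols = list(records[0].keys())
    synthetic = []
    for _ in range(n):
        row = {}
        for c in cols:
            vals = [r[c] for r in records]
            if all(isinstance(v, (int, float)) for v in vals):
                mean = sum(vals) / len(vals)
                var = sum((v - mean) ** 2 for v in vals) / len(vals)
                row[c] = rng.gauss(mean, var ** 0.5)
            else:
                # drawing uniformly from the value list reproduces
                # the empirical category frequencies
                row[c] = rng.choice(vals)
        synthetic.append(row)
    return synthetic
```

For example, `synthesize(patient_records, 100)` would yield 100 rows whose ages and diagnosis codes are statistically "reminiscent of the population," in Fricks' phrase, yet each row is an artificial combination that matches no particular patient.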
Looking to the future, building trust is pivotal. Many agencies are looking to build out new strategies to ensure that the systems, tools and technology handling large, sensitive datasets are secure and trustworthy. Aronson explained that NSF looks to continuously improve systems to ensure they are trustworthy, as well as educate senior leadership on the value of technology and new projects.
“You have to prove the value of the capability with small things. You call your shot, then you make your shot, and then you show people what we did, so that you can build trust in your organization's capabilities,” Aronson explained. “Then people will give you more problems to solve and more resources that you need, and it builds on itself.”
As agencies continue to build out their strategies, near real-time insights will remain at the forefront of data initiatives. Justice explained that cross-agency data sharing will be pivotal to successful data analysis and interpretation.
“The federal government and agencies need to get together and find ways to share data that are in compliance ... in near real time and appropriately linked ... There are a number of ‘mother may I’s’ that really need to be in place before we can bring these data sets together and really have an understanding of an individual’s experience,” Justice said.