U.S. Citizenship and Immigration Services "is a data agency," according to its chief data officer, Beth Puchek.
With the agency handling thousands of citizenship applications every year and receiving more Freedom of Information Act (FOIA) requests than any other federal agency, managing data is its strong suit.
"I'm a structural engineer by training, and a systems engineer by experience," Puchek said at a Jan. 21 ATARC event. "That makes me really passionate about data analysis, process and visualizing abstract concepts."
To manage the overwhelming amount of data, the agency in 2020 set up a robotic process automation center of excellence to help "establish the relationship between [office of information technology], the [office of the CDO] and all of these really excited and willing partners that wanted to automate really redundant processes," Puchek said.
The agency focused its initial efforts on standing up its governance model and building out the infrastructure and pipeline to roll out bots, she added.
“Right now we're operating attended bots, speeding up the manual work that individuals have to do, and this year we're focused on getting out the unattended bots,” she said. “We wanted to start with more of the administrative-type tasks to automate, and this year we'll start to pivot not just into the unattended bots, but also some more of our operational processes.”
At the Office of Intelligence & Analysis at the Department of Homeland Security headquarters, Deputy Chief Data & Analytics Officer Emily Barbero said data sharing is incredibly important to the office’s mission.
“We set up our data and analysis functions separate from our CDO — we are that bridge between the department and the intelligence community,” she said during the event. “We manage our responsibilities on the classified domain. We take a lot of best practices from the intelligence community and put those in place in the department. So we work with a lot of components of the data to make it shareable from a policy standpoint.”
Her office is leaning heavily into natural language processing to understand data before sharing it. This means automating the digitization of handwritten documents or notes that provide intelligence value.
“We don't necessarily want artificial intelligence to completely take the operator or analyst out of the loop,” Barbero said. “We're looking at capabilities that help us, from a machine-learning standpoint, assess the risk calculus of some of our data. There's a lot of inherent U.S. persons information in DHS data, and we'd like to build out machine-learning capabilities to help us identify the likelihood that this individual may or may not be a U.S. citizen.”
Immigration and Customs Enforcement Chief Data Officer Ken Clark said he works closely with Puchek given how much their agencies' missions overlap. Interoperability, he said, is one of the biggest challenges with regard to data at DHS. ICE and the rest of DHS are already working hard to address those issues. Because ICE’s FOIA, data records and privacy teams all operate out of Clark’s office, he’s able to coordinate the best data strategy to serve each team’s needs.
“One of the things we're doing is establishing a data standards program and a taxonomy that gives classification of our data into various high-level categories based on the business needs and drivers,” he said. “We also have a data governance board established and we're working on a data strategy roadmap, and also recently developed an information-sharing and access agreement policy, which will identify roles and responsibilities and practices on how to do sharing of data outside of the organization in a bidirectional manner.”
Like Barbero's office, ICE sees natural language processing, AI and machine learning as key drivers of efficient data management, though Clark is wary of seeing these technologies as full problem solvers.
Clark, the AI point of contact at the agency, said he’s excited about ICE’s Data Innovation Lab work with AI, biometrics, data analytics and the potential for more efficient and effective data management.
“AI cannot solve all the problems, you have to have the human aspect in there, particularly in law enforcement,” he said. “One of the things we've been looking at is how tools such as natural language processing and AI can help sift through the vast amount of data that's out there.”
Clark said he wants ICE to really focus on its data program, taxonomy and interagency exchanges “that are going to help document policy and process.”
“We're also looking at a data maturity model to help put some measures and benchmarks into growing our program, but also growing the workforce and helping to move the culture to a more data-driven organization,” he added.
Barbero said her focus in 2021 is overhauling the data acquisition process to make it more “supportive” of the office’s mission.
“What's really critical and an enhanced focus for us is really looking at our acquisitions process and data literacy,” Barbero said. “That's a focus for us, as well as the data literacy piece, so analysts know about the data, use it appropriately and have the tools to enhance the mission. Data literacy is really critical for us so we have a workforce that's really empowered.”
USCIS plans to focus more on implementation in 2021 after spending 2020 bringing interoperability across systems. In addition to setting up a “single source of truth” data set to provide clarity across the agency, Puchek said the agency is working on a data dashboard to “give executives the insight into the data they need.”