In our rapidly changing world, agencies recognize the importance of data. However, data’s potential can only be unlocked when it is effectively understood. Data proficiency must extend beyond specialists like data scientists, analysts, and engineers. To meet mission objectives, staff at all levels of the federal workforce must confidently engage with data and use it to advance the mission.
This is where data literacy comes in. The term “data literacy” refers to the ability to read, understand, create, and communicate data as information. It is a critical skill that empowers individuals to make informed decisions based on data insights. When personnel better understand data, agencies reduce the risk of inappropriate use and are better positioned for priorities such as inter- and cross-agency data sharing.
Despite its importance, expanding data literacy remains a significant challenge for agencies. Complex data analysis tools, the need for specialized knowledge in programming or query languages, and the intricacies of building reports and dashboards all create barriers. Fortunately, advancements in generative AI offer promising solutions to these challenges, making data more accessible and comprehensible.
Across the federal government, Chief Data Officers and their partners in Human Resources are implementing data-centric training programs. However, facing time and resource constraints, federal leaders can benefit from additional strategies that augment and enhance data literacy initiatives. By harnessing technology, the federal workforce can gain greater confidence in using data in their core tasks, increasing efficiency and productivity.
Generative AI—a subset of artificial intelligence that focuses on creating new content and generating human-like text, images, or other media—can play a pivotal role in advancing data literacy.
The term “generative AI” came to the forefront with the advent of ChatGPT, soon followed by tools such as Gemini and Copilot. These are examples of publicly available generative AI managed services. As such, these tools are not appropriate for use with data such as personally identifiable information (PII), health records, intelligence data, and other sensitive data sets.
The public sector benefits from agency-managed platforms that keep AI models and data within their networks. The data literacy examples we describe in this white paper are best suited to an agency-managed generative AI solution. Here we explore six applications of generative AI aimed at improving data literacy, making it easier for the federal workforce to interact with and leverage data.
Generative AI models equipped with natural language processing capabilities can interpret complex data sets and generate easy-to-understand narratives. Instead of deciphering intricate charts or raw data, federal staff can receive concise, plain-language explanations of what the data represents and the insights it offers, which makes the data more accessible to non-experts.
Consider the applicability of natural language processing in analyzing shipping documents and customs declarations. By leveraging natural language processing, a generative AI solution could extract and interpret key information from documents such as invoices, bills of lading, and certificates of origin. Customs and Border Protection agents could more readily identify discrepancies, such as mismatched quantities or incorrect tariff codes, which could indicate potential smuggling or fraud.
Additionally, generative AI could help predict shipping patterns and detect anomalies that deviate from established norms, thus alerting officials to high-risk shipments that require further inspection. This combination of generative AI and natural language processing enhances the efficiency and accuracy of customs processes and strengthens national security by ensuring the integrity of international trade.
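The discrepancy check in this example can be sketched in a few lines. The sketch assumes a language model has already extracted fields from each document into dictionaries. The field names (`quantity`, `tariff_code`, `origin_country`) and the sample values are hypothetical and are not drawn from any real customs system.

```python
from typing import Any

# Hypothetical field names; a real extraction step would define its own schema.
FIELDS_TO_COMPARE = ("quantity", "tariff_code", "origin_country")


def find_discrepancies(invoice: dict[str, Any], bill_of_lading: dict[str, Any]) -> list[str]:
    """Return a plain-language note for each compared field whose values differ."""
    notes = []
    for field in FIELDS_TO_COMPARE:
        invoice_value = invoice.get(field)
        lading_value = bill_of_lading.get(field)
        if invoice_value != lading_value:
            notes.append(
                f"{field}: invoice shows {invoice_value!r}, "
                f"bill of lading shows {lading_value!r}"
            )
    return notes


# Illustrative values only.
invoice = {"quantity": 500, "tariff_code": "8471.30", "origin_country": "VN"}
bill_of_lading = {"quantity": 650, "tariff_code": "8471.30", "origin_country": "VN"}

for note in find_discrepancies(invoice, bill_of_lading):
    print(note)
```

A comparison like this only surfaces candidates for review. Whether a mismatch indicates fraud or a clerical error remains a human judgment.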
Generative AI lets users ask questions about data in natural language without the need to learn SQL or other query languages. Asking questions in plain language reduces reliance on data analysts and data scientists for simple and even moderately complex queries.
A Centers for Medicare & Medicaid Services (CMS) staffer without experience writing SQL queries wants to investigate potential correlations between specific prescribed medications and hospital readmission rates within a particular time frame, broken down by insurance provider. Writing in plain language, the staffer could ask: “For Medicare beneficiaries admitted between January 1st, 2023, and December 31st, 2023, what is the readmission rate for patients prescribed empagliflozin compared to those who weren’t, segmented by insurance provider?”
The AI can query relevant electronic health records and insurer data, combining results from both data sources to calculate readmission rates. Generative AI empowers the staffer to explore data without extensive SQL knowledge and frees up time for data analysis.
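To make the example concrete, the sketch below shows the kind of SQL a natural language tool might generate from the staffer's question, run against a small in-memory SQLite database. The `admissions` table, its columns, and every row are invented for illustration. The sketch also assumes the health record and insurer data have already been combined into one table. Any generated query would need review before its results are relied on.

```python
import sqlite3

# Hypothetical schema: one row per admission, with 1/0 flags for
# whether empagliflozin was prescribed and whether the patient was readmitted.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE admissions (
        patient_id INTEGER,
        insurer TEXT,
        admit_date TEXT,
        prescribed_empagliflozin INTEGER,
        readmitted INTEGER
    )
    """
)
rows = [
    (1, "Plan A", "2023-02-10", 1, 0),
    (2, "Plan A", "2023-03-05", 1, 1),
    (3, "Plan A", "2023-06-21", 0, 1),
    (4, "Plan B", "2023-08-14", 0, 1),
    (5, "Plan B", "2023-11-30", 0, 0),
    (6, "Plan B", "2024-01-15", 1, 1),  # outside the 2023 window, so excluded
]
conn.executemany("INSERT INTO admissions VALUES (?, ?, ?, ?, ?)", rows)

# The query a tool might produce from the plain-language question.
GENERATED_SQL = """
    SELECT insurer,
           prescribed_empagliflozin,
           COUNT(*) AS admissions,
           AVG(readmitted) AS readmission_rate
    FROM admissions
    WHERE admit_date BETWEEN '2023-01-01' AND '2023-12-31'
    GROUP BY insurer, prescribed_empagliflozin
    ORDER BY insurer, prescribed_empagliflozin
"""

results = conn.execute(GENERATED_SQL).fetchall()
for insurer, prescribed, count, rate in results:
    group = "prescribed" if prescribed else "not prescribed"
    print(f"{insurer} | {group} | {count} admissions | readmission rate {rate:.0%}")
```

Showing the generated query alongside the answer is one way to let an analyst spot-check the tool's interpretation of the question.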
Generative AI-powered tools can create interactive data exploration experiences. Users can ask questions and instantly receive dynamic visualizations that evolve based on their inquiries, making the process both intuitive and engaging. By providing immediate, visually compelling answers to complex queries, generative AI helps staff uncover insights and patterns that might otherwise remain hidden.
A homeland security professional wants to monitor potential threats based on social media activity in specific locations. Data of interest may include keywords or hashtags related to potential threats and sentiment analysis as well as geospatial data focused on critical infrastructure locations.
The generative AI could be set up to continuously scan social media data for sentiment and for keywords related to threats (e.g., bomb-making, demonstrations). The AI would generate real-time, map-based visualizations showing the locations of concerning social media activity, the sentiment of that activity, and nearby critical infrastructure. The user could expand on the visualized data through a natural language query: “Show me areas with an increase in negative social media posts mentioning potential violence near government buildings.”
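The filtering step behind such a query can be sketched as follows. The sketch assumes each post already carries a sentiment score (negative values meaning negative sentiment) and coordinates. The keyword list, the 2 km radius, the site, and the posts are all hypothetical.

```python
import math

# Hypothetical watch list and search radius, chosen only for illustration.
KEYWORDS = {"violence", "threat"}
RADIUS_KM = 2.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def flag_posts(posts, sites):
    """Return (post text, site name) pairs for negative, keyword-matching posts near a site."""
    flagged = []
    for post in posts:
        words = set(post["text"].lower().split())
        if post["sentiment"] >= 0 or not (words & KEYWORDS):
            continue  # skip posts that are not negative or mention no keyword
        for site in sites:
            distance = haversine_km(post["lat"], post["lon"], site["lat"], site["lon"])
            if distance <= RADIUS_KM:
                flagged.append((post["text"], site["name"]))
    return flagged


# Illustrative data only.
sites = [{"name": "Federal Building", "lat": 38.8951, "lon": -77.0364}]
posts = [
    {"text": "rumors of violence downtown", "sentiment": -0.8, "lat": 38.8960, "lon": -77.0370},
    {"text": "great parade today", "sentiment": 0.9, "lat": 38.8955, "lon": -77.0360},
    {"text": "threat reported far away", "sentiment": -0.5, "lat": 40.0, "lon": -75.0},
]

for text, site_name in flag_posts(posts, sites):
    print(f"Near {site_name}: {text}")
```

A production system would rely on far more robust language understanding than whole-word matching. The point here is only how keyword, sentiment, and location filters combine.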
By analyzing user behavior and preferences, generative AI can provide personalized insights tailored to a user’s roles and responsibilities. This ensures that employees receive relevant and actionable information without being overwhelmed by unnecessary data. As federal systems increase in complexity, helping staff focus on the right information at the right time helps increase efficiency.
As agencies modernize their applications, staff have become accustomed to home pages and dashboards tailored to their responsibilities and functions. Generative AI can expand personalization based on learned preferences and past experiences. For example, dynamic help functionality can highlight frequently used features to improve efficiency or recommend alternative workflows based upon historical success rates of similar users. Systems could generate role-aligned tutorials, providing contextual help beyond user guides or FAQs.
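As a rough illustration of highlighting frequently used features, the sketch below ranks features by how often users in a given role invoke them. A simple frequency count stands in for the learned preferences a generative AI system might use. The roles, feature names, and log entries are hypothetical.

```python
from collections import Counter


def top_features(usage_log, user_role, n=3):
    """Return the n features most often used by people in the given role."""
    counts = Counter(feature for role, feature in usage_log if role == user_role)
    return [feature for feature, _ in counts.most_common(n)]


# Illustrative usage log of (role, feature) events.
usage_log = [
    ("analyst", "export_csv"),
    ("analyst", "filter_by_date"),
    ("analyst", "export_csv"),
    ("manager", "approve_report"),
    ("analyst", "export_csv"),
    ("analyst", "filter_by_date"),
    ("analyst", "share_dashboard"),
]

print(top_features(usage_log, "analyst", n=2))
```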
Creating comprehensive reports and dashboards can be time-consuming and complex, often requiring development resources with experience in the agency’s visualization tools. Generative AI can automate this process, easing the creation of detailed reports that highlight key metrics, trends, and insights. With data summarized and compiled, federal staff can focus on analysis and recommendations.
When disasters strike, time is of the essence. Instead of manually compiling large volumes of data from the National Weather Service, local and state emergency management personnel, and federal sources on population density and infrastructure in the affected zone, a FEMA staffer could leverage generative AI for real-time data integration, automated analysis and visualization, and creation of maps that predict storm path and intensity. Generative AI could chart resource needs (e.g., food, water, shelter) based on historical data and population density information. The result: less time accessing and compiling data, and more time evaluating impact to inform action plans.
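The resource estimate in this example reduces to simple arithmetic once the inputs are assembled. The sketch below shows one way to compute it. The per-person planning factors, the displaced share, and the zone data are placeholders for illustration, not FEMA figures.

```python
from dataclasses import dataclass

# Hypothetical planning factors per displaced person per day (not official figures).
PER_PERSON_PER_DAY = {"water_liters": 4.0, "meals": 3.0}


@dataclass
class Zone:
    name: str
    population: int
    share_displaced: float  # assumed fraction of residents needing assistance


def estimate_needs(zone: Zone, days: int) -> dict:
    """Estimate supplies for a zone over a number of days."""
    displaced = round(zone.population * zone.share_displaced)
    needs = {
        item: displaced * amount * days
        for item, amount in PER_PERSON_PER_DAY.items()
    }
    needs["shelter_spaces"] = displaced  # one space per displaced person
    return needs


# Illustrative zone only.
zone = Zone(name="County A", population=120_000, share_displaced=0.05)
for item, amount in estimate_needs(zone, days=3).items():
    print(f"{zone.name} {item}: {amount:,.0f}")
```

In practice the displaced share would come from the historical data and storm forecasts the paragraph describes rather than a fixed constant.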
Federal staffers often need to brief senior leadership on priority topics quickly. Combing through available information can be burdensome and time-consuming: drowning in documents, pictures, and videos, there’s little time for analysis. Generative AI can review structured and unstructured data, then create concise summaries that highlight the most critical information. Staff can quickly produce professional briefings that summarize findings, freeing up time to focus on insights and impact.
An NIH staffer is tasked with briefing senior leadership on the latest research developments related to a newly identified and concerning infectious disease. This briefing will inform decisions about funding research efforts and developing public health strategies. Facing the need to summarize research papers, grant proposals, news articles, social media data, and perhaps even images and videos, the staffer finds the task daunting.
Using generative AI, the staffer can summarize research papers, proposals, and news articles, extracting key findings; analyze microscopic images to highlight significant virus features; interpret map-based visualizations, showing trends and patterns; and create a concise briefing summarizing the current state of knowledge about the disease, potential public health risks, and promising research avenues.
As agencies continue to invest in data literacy, generative AI supports more efficient and timely insights. By strategically deploying generative AI, leaders can empower the workforce to unlock data’s potential and foster a culture of data-driven decision-making. When integrating generative AI into an agency’s ways of working, consider the following best practices:
Generative AI offers powerful capabilities to overcome common barriers to data literacy, making data more accessible, understandable, and actionable. By embracing generative AI, organizations can enhance data literacy, democratize data insights, empower the workforce, and drive better outcomes.