Boosting Data Literacy with Generative AI

In our rapidly changing world, agencies recognize the importance of data. However, data’s potential can only be unlocked when it is effectively understood. Data proficiency must extend beyond specialists like data scientists, analysts, and engineers. To meet mission objectives, staff at all levels of the federal workforce must confidently engage with data and use it to advance the mission.

This is where data literacy comes in. The term “data literacy” refers to the ability to read, understand, create, and communicate data as information. It is a critical skill that empowers individuals to make informed decisions based on data insights. When personnel better understand data, agencies reduce the risk associated with inappropriate use and optimize potential for priorities like inter- and cross-agency data sharing.

The importance of a data-literate federal workforce

Informed decision-making

Data-literate employees make more informed decisions based on accurate and relevant data. This leads to better policy development, resource allocation, and overall improved outcomes for public services.

Enhanced efficiency

Understanding data allows federal workers to identify inefficiencies and areas for improvement within government operations. This can lead to more effective use of taxpayer dollars and streamlined processes.

Innovation

A data-literate workforce fosters a culture of curiosity and innovation. When federal employees are comfortable with data, they are more likely to explore and experiment, leading to more innovative problem solving.

Strategic edge

Even within the public sector, staying ahead of the curve is vital. Agencies with a data-literate workforce can leverage data insights to improve their services, making them more responsive and effective in meeting the needs of the public.

Risk management

Data literacy is essential for identifying potential risks and mitigating them proactively. By understanding data trends and patterns, federal employees can foresee potential issues and take preemptive actions to protect public interests.

Despite its importance, expanding data literacy remains a significant challenge for agencies. Complex data analysis tools, the need for specialized knowledge in programming or query languages, and the intricacies of building reports and dashboards all create barriers. Fortunately, advancements in generative AI offer promising solutions to these challenges, making data more accessible and comprehensible.

Advancing Data Literacy through Generative AI

Across the federal government, Chief Data Officers and their partners in Human Resources are implementing data-centric training programs. However, facing time and resource constraints, federal leaders can benefit from additional strategies that augment and enhance data literacy initiatives. By harnessing technology, the federal workforce can gain greater confidence in using data in their core tasks, increasing efficiency and productivity.

Generative AI—a subset of artificial intelligence that focuses on creating new content and generating human-like text, images, or other media—can play a pivotal role in advancing data literacy.

The term “generative AI” came to the forefront with the advent of ChatGPT, soon followed by tools such as Gemini and Copilot. These are examples of publicly available generative AI managed services. As such, these tools are not appropriate for use with data such as personally identifiable information (PII), health records, intelligence data, and other sensitive data sets.

The public sector benefits from agency-managed platforms that keep AI models and data within their networks. The data literacy examples we describe in this white paper are best suited to an agency-managed generative AI solution. Here we explore six applications of generative AI aimed at improving data literacy, making it easier for the federal workforce to interact with and leverage data.

Data interpretation using natural language processing (NLP)
Automated data query
Interactive data exploration
Personalized data insights based on roles and preferences
Automated report generation
Data summarization

A cargo ship with containers guided by tugboats, symbolizing AI's role in predicting shipping patterns and detecting high-risk anomalies.

1. Data interpretation using natural language processing

Generative AI models equipped with natural language processing capabilities can interpret complex data sets and generate easy-to-understand narratives. Instead of deciphering intricate charts or raw data, federal staff can receive concise, plain-language explanations of what the data represents and the insights it offers. By converting complex datasets into natural language summaries, AI tools can make data more accessible to non-experts.

Solution at work

Consider the applicability of natural language processing in analyzing shipping documents and customs declarations. By leveraging natural language processing, a generative AI solution could extract and interpret key information from documents such as invoices, bills of lading, and certificates of origin. Customs and Border Protection agents could more readily identify discrepancies, such as mismatched quantities or incorrect tariff codes, which could indicate potential smuggling or fraud.
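The discrepancy check described above can be sketched in a few lines once the key fields have been extracted from each document. This is a minimal illustration, not an agency implementation: the `LineItem` fields, the `find_discrepancies` helper, and the sample values are all assumptions made for this example, and the extraction step itself (the part a generative AI solution would perform) is taken as given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineItem:
    """One goods line as extracted from a shipping document (hypothetical fields)."""
    description: str
    tariff_code: str
    quantity: int


def find_discrepancies(invoice: list[LineItem], bill_of_lading: list[LineItem]) -> list[str]:
    """Compare two extracted documents and describe each mismatch in plain language."""
    lading_by_description = {item.description: item for item in bill_of_lading}
    findings = []
    for item in invoice:
        match = lading_by_description.get(item.description)
        if match is None:
            findings.append(f"{item.description}: on invoice but missing from bill of lading")
            continue
        if match.quantity != item.quantity:
            findings.append(
                f"{item.description}: quantity {item.quantity} on invoice vs "
                f"{match.quantity} on bill of lading"
            )
        if match.tariff_code != item.tariff_code:
            findings.append(
                f"{item.description}: tariff code {item.tariff_code} on invoice vs "
                f"{match.tariff_code} on bill of lading"
            )
    return findings


# Illustrative values only; they do not come from real shipments.
invoice = [
    LineItem("ceramic tiles", "6907.21", 1200),
    LineItem("steel bolts", "7318.15", 500),
]
bill_of_lading = [
    LineItem("ceramic tiles", "6907.21", 1000),
    LineItem("steel bolts", "7318.16", 500),
]

for finding in find_discrepancies(invoice, bill_of_lading):
    print(finding)
```

Returning plain-language findings rather than raw codes keeps the output readable for staff who are not data specialists, which is the point of the use case.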

Additionally, generative AI could help predict shipping patterns and detect anomalies that deviate from established norms, thus alerting officials to high-risk shipments that require further inspection. This combination of generative AI and natural language processing enhances the efficiency and accuracy of customs processes and strengthens national security by ensuring the integrity of international trade.
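To make "deviating from established norms" concrete, the sketch below flags new shipment weights that sit far from a historical baseline. It uses a plain z-score test, a classical statistical technique swapped in here for illustration rather than a generative model; the `flag_anomalies` helper, the threshold of 3 standard deviations, and the sample weights are assumptions.

```python
from statistics import mean, stdev


def flag_anomalies(history, new_values, threshold=3.0):
    """Return (value, z_score) pairs for new values far from the historical mean."""
    baseline = mean(history)
    spread = stdev(history)  # sample standard deviation; needs at least two points
    if spread == 0:
        raise ValueError("history has no variation, so a z-score is undefined")
    flagged = []
    for value in new_values:
        z_score = (value - baseline) / spread
        if abs(z_score) > threshold:
            flagged.append((value, round(z_score, 1)))
    return flagged


# Made-up declared weights in tonnes for one route.
historical_weights = [20.1, 19.8, 20.5, 20.0, 19.6, 20.3, 20.2, 19.9]
incoming_weights = [20.4, 27.5, 19.7]

for weight, z_score in flag_anomalies(historical_weights, incoming_weights):
    print(f"Review shipment declared at {weight} t (z-score {z_score})")
```

In practice the threshold would be tuned to the volume of shipments officials can actually inspect.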

2. Automated data query

Generative AI lets users ask questions about data in natural language without the need to learn SQL or other query languages. Asking questions in plain language reduces reliance on data analysts and data scientists for simple and even moderately complex queries.

Solution at work

A CMS staffer without experience writing SQL queries wants to investigate potential correlations between specific medications prescribed and hospital readmission rates within a particular time frame, broken down by insurance provider. Writing in plain language, the staffer could inquire: “For Medicare beneficiaries admitted between January 1st, 2023, and December 31st, 2023, what is the readmission rate for patients prescribed empagliflozin compared to those who weren’t, segmented by insurance provider?”

The AI can query relevant electronic health records and insurer data, combining results from both data sources to calculate readmission rates. Generative AI empowers the staffer to explore data without extensive SQL knowledge and frees up time for data analysis.
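For readers curious what sits behind such a plain-language question, the sketch below shows the kind of SQL a generative AI tool might produce and run on the staffer's behalf. The `admissions` table, its column names, and every row are invented for this example and carry no clinical meaning; a real system would query the agency's own schemas and would need its generated SQL reviewed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE admissions (
        patient_id INTEGER,
        insurer TEXT,
        admit_date TEXT,                  -- ISO 8601 date string
        prescribed_empagliflozin INTEGER, -- 1 = yes, 0 = no
        readmitted INTEGER                -- 1 = readmitted, 0 = not
    )
    """
)

# Toy rows; the last one falls outside the requested date range.
rows = [
    (1, "Plan A", "2023-02-10", 1, 0),
    (2, "Plan A", "2023-03-05", 1, 1),
    (3, "Plan A", "2023-06-21", 0, 1),
    (4, "Plan A", "2023-09-14", 0, 1),
    (5, "Plan B", "2023-04-02", 1, 0),
    (6, "Plan B", "2023-11-30", 0, 0),
    (7, "Plan B", "2023-12-15", 0, 1),
    (8, "Plan B", "2024-01-03", 1, 1),
]
conn.executemany("INSERT INTO admissions VALUES (?, ?, ?, ?, ?)", rows)

# The query a tool might generate from the staffer's question.
QUERY = """
SELECT insurer,
       prescribed_empagliflozin,
       COUNT(*) AS admissions,
       ROUND(100.0 * SUM(readmitted) / COUNT(*), 1) AS readmission_rate_pct
FROM admissions
WHERE admit_date BETWEEN '2023-01-01' AND '2023-12-31'
GROUP BY insurer, prescribed_empagliflozin
ORDER BY insurer, prescribed_empagliflozin
"""

results = conn.execute(QUERY).fetchall()
for insurer, prescribed, admissions, rate in results:
    group = "prescribed" if prescribed else "not prescribed"
    print(f"{insurer} ({group}): {rate}% readmitted across {admissions} admissions")
```

Showing the generated query alongside the answer lets an analyst verify the date filter and grouping before anyone acts on the result.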

A portrait view of a glowing digital globe with interconnected lines, symbolizing AI-powered integration of electronic health records and data.

3. Interactive data exploration

Generative AI-powered tools can create interactive data exploration experiences. Users can ask questions and instantly receive dynamic visualizations that evolve based on their inquiries, making the process both intuitive and engaging. By providing immediate, visually compelling answers to complex queries, generative AI helps staff uncover insights and patterns that might otherwise remain hidden.

Solution at work

A homeland security professional wants to monitor potential threats based on social media activity in specific locations. Data of interest may include keywords or hashtags related to potential threats and sentiment analysis as well as geospatial data focused on critical infrastructure locations.

The generative AI could be set up to continuously scan social media data for keywords related to threats (e.g., bomb-making, demonstrations) and sentiment. The AI would generate real-time, map-based visualizations showing locations of concerning social media activity, sentiment of the activity, and critical infrastructure. The user could expand on the data visualized through natural language query: “Show me areas with an increase in negative social media posts mentioning potential violence near government buildings.”
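A natural language query like the one above ultimately resolves to a filter over scored, geotagged records. The sketch below shows one way that filter could look, assuming sentiment scores on a scale from -1 to 1 have already been produced upstream; the record layout, the `concerning_posts_near` helper, the cutoff values, and the coordinates are all hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points in kilometers."""
    d_lat = radians(lat2 - lat1)
    d_lon = radians(lon2 - lon1)
    a = sin(d_lat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(d_lon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def concerning_posts_near(posts, sites, radius_km=1.0, sentiment_cutoff=-0.5):
    """Return (post id, site name) pairs for negative posts within radius_km of a site."""
    hits = []
    for post in posts:
        if post["sentiment"] > sentiment_cutoff:
            continue  # not negative enough to surface
        for site in sites:
            distance = haversine_km(post["lat"], post["lon"], site["lat"], site["lon"])
            if distance <= radius_km:
                hits.append((post["id"], site["name"]))
                break  # report each post once, against the first matching site
    return hits


# Placeholder coordinates chosen only to make the distances easy to reason about.
sites = [{"name": "Building A", "lat": 0.0, "lon": 0.0}]
posts = [
    {"id": "p1", "lat": 0.0, "lon": 0.005, "sentiment": -0.8},  # close and negative
    {"id": "p2", "lat": 0.0, "lon": 0.5, "sentiment": -0.9},    # negative but far away
    {"id": "p3", "lat": 0.0, "lon": 0.004, "sentiment": 0.6},   # close but positive
]

print(concerning_posts_near(posts, sites))
```

The resulting pairs are what a map layer would plot; widening `radius_km` or relaxing `sentiment_cutoff` corresponds to the user broadening the question.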

4. Personalized data insights based on roles and preferences

By analyzing user behavior and preferences, generative AI can provide personalized insights tailored to a user’s roles and responsibilities. This ensures that employees receive relevant and actionable information without being overwhelmed by unnecessary data. As federal systems increase in complexity, helping staff focus on the right information at the right time helps increase efficiency.

Solution at work

As agencies modernize their applications, staff have become accustomed to home pages and dashboards tailored to their responsibilities and functions. Generative AI can expand personalization based on learned preferences and past experiences. For example, dynamic help functionality can highlight frequently used features to improve efficiency or recommend alternative workflows based upon historical success rates of similar users. Systems could generate role-aligned tutorials, providing contextual help beyond user guides or FAQs.

A row of cars trapped in high floodwater on a rainy day, surrounded by dense green trees.

5. Automated report generation

Creating comprehensive reports and dashboards can be time-consuming and complex, often requiring development resources with experience in the agency’s visualization tools. Generative AI can automate this process, easing the creation of detailed reports that highlight key metrics, trends, and insights. With data summarized and compiled, federal staff can focus on analysis and recommendations.

Solution at work

When disasters strike, time is of the essence. Instead of manually compiling large volumes of data from the National Weather Service, local and state emergency management personnel, and federal data on population density and infrastructure in the affected disaster zone, a FEMA staffer could leverage generative AI for real-time data integration, automated analysis and visualization, and creation of maps that predict storm path and intensity. Generative AI could chart resource needs (e.g., food, water, shelter) based on historical data and population density information. The result: less time accessing and compiling data, and more time evaluating impact for informed action plans.
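The resource-needs estimate mentioned above comes down to simple arithmetic once population figures are in hand: affected population multiplied by a per-person rate and a number of days. The sketch below illustrates that calculation. The `Zone` type, the `estimate_needs` helper, and especially the per-person planning factors are placeholders invented for this example, not official figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """An affected area described by size and population density (hypothetical type)."""
    name: str
    area_sq_km: float
    people_per_sq_km: float


# Placeholder planning factors, NOT official guidance.
DAILY_NEEDS_PER_PERSON = {"water_liters": 4.0, "meals": 3.0}
SHELTER_SHARE = 0.25  # assumed share of residents needing shelter


def estimate_needs(zone: Zone, days: int) -> dict[str, int]:
    """Estimate supplies for a zone over a number of days."""
    population = zone.area_sq_km * zone.people_per_sq_km
    needs = {
        item: round(population * rate * days)
        for item, rate in DAILY_NEEDS_PER_PERSON.items()
    }
    needs["shelter_beds"] = round(population * SHELTER_SHARE)
    return needs


zone = Zone("Example County", area_sq_km=50.0, people_per_sq_km=200.0)
print(estimate_needs(zone, days=3))
```

A generative AI tool would add value around this calculation, for example by pulling the density figures from several sources and charting the output, rather than by replacing the arithmetic.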

6. Data summarization

Federal staffers often need to brief senior leadership on priority topics quickly. Combing through available information can be burdensome and time-consuming: drowning in documents, pictures, and videos, there’s little time for analysis. Generative AI can review structured and unstructured data, then create concise summaries that highlight the most critical information. Staff can quickly produce professional briefings that summarize findings, freeing up time to focus on insights and impact.

Solution at work

An NIH staffer is tasked with briefing senior leadership on the latest research developments related to a newly identified and concerning infectious disease. This briefing will inform decisions about funding research efforts and developing public health strategies. Facing the need to summarize research papers, grant proposals, news articles, social media data, and perhaps even images and videos, the staffer finds the task daunting.

Using generative AI, staff can summarize research papers, proposals and news articles, extracting key findings; analyze microscopic images to highlight significant virus features; interpret map-based visualizations, showing trends and patterns; and create a concise briefing summarizing the current state of knowledge about the disease, potential public health risks, and promising research avenues.

Close-up of a pipette dispensing liquid into a test tube, with hexagonal patterns and a blue-orange gradient backdrop.

Empowering the workforce: best practices for employing generative AI to improve data literacy

As agencies continue to invest in data literacy, generative AI supports more efficient and timely insights. By strategically deploying generative AI, leaders can empower the workforce to unlock data’s potential and foster a culture of data-driven decision-making. When integrating generative AI into an agency’s ways of working, consider the following best practices:

1. Identify highest-value use cases

Determine the areas where generative AI can have the most significant impact. Assess where data collection, understanding, and visualization prove most burdensome today, and measure potential return on investment in terms of overall savings in cost, time, and staff toil.
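One rough way to compare candidate use cases is a first-year return-on-investment estimate based on staff time saved. The sketch below is a simplification under stated assumptions: the `annual_roi` helper, the 48 working weeks per year, and all input figures are illustrative, and it ignores benefits that are harder to price, such as better decisions.

```python
def annual_roi(hours_saved_per_week, staff_count, hourly_cost, annual_tool_cost,
               weeks_per_year=48):
    """Return ROI as a ratio: (annual savings - annual cost) / annual cost."""
    if annual_tool_cost <= 0:
        raise ValueError("annual_tool_cost must be positive")
    annual_savings = hours_saved_per_week * staff_count * hourly_cost * weeks_per_year
    return (annual_savings - annual_tool_cost) / annual_tool_cost


# Illustrative inputs: 50 staff each saving 2 hours a week at a loaded cost of 60 per hour.
roi = annual_roi(hours_saved_per_week=2, staff_count=50, hourly_cost=60,
                 annual_tool_cost=120_000)
print(f"Estimated first-year ROI: {roi:.0%}")
```

Running the same calculation across several candidate use cases gives a consistent, if coarse, basis for ranking them.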

2. Communicate governance and security guidelines

Clearly outline the intentions of the generative AI program, what data types it will use, what outputs it will generate, and who can use it. Guide teams in the implementation of strong access measures to minimize unauthorized use and mitigate risk.
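Guidelines like these are easier to enforce when they are also written down as data that a system can check. The sketch below shows one possible shape for such a policy, covering who can use the tool, which data types it may touch, and which outputs it may produce. The `UsagePolicy` class and the role, data type, and output names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsagePolicy:
    """Who may use the tool, on which data types, to produce which outputs."""
    allowed_roles: frozenset
    allowed_data_types: frozenset
    allowed_outputs: frozenset

    def permits(self, role: str, data_type: str, output: str) -> bool:
        """Allow a request only if every part of it is explicitly listed."""
        return (
            role in self.allowed_roles
            and data_type in self.allowed_data_types
            and output in self.allowed_outputs
        )


# Example policy with invented names.
policy = UsagePolicy(
    allowed_roles=frozenset({"program_analyst", "budget_officer"}),
    allowed_data_types=frozenset({"public_statistics", "internal_budget"}),
    allowed_outputs=frozenset({"summary", "chart"}),
)

print(policy.permits("program_analyst", "internal_budget", "summary"))
print(policy.permits("contractor", "internal_budget", "summary"))
```

Denying anything not explicitly listed is a deliberate choice here: it means a new data type or role must be reviewed and added before the tool will accept it.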

3. Integrate with existing systems

Ensure that generative AI solutions integrate with existing data infrastructure and tools. This minimizes disruptions and leverages existing investments in data technology. Consider how to best incorporate new generative AI capabilities into existing systems and workflows to increase adoption.

4. Provide training and support

While generative AI can simplify data interaction, employees will still need training and ongoing support to help them optimize use of these tools and understand the insights they generate.

5. Foster a data-driven culture

Encourage a culture of data literacy by promoting the use of data in everyday decision-making. Recognize and reward employees who effectively use data to drive results.

6. Monitor and evaluate

Continuously monitor the effectiveness of generative AI tools in improving data literacy. Gather feedback from users and make necessary adjustments to enhance the experience.

Generative AI offers powerful capabilities to overcome common barriers to data literacy, making data more accessible, understandable, and actionable. By embracing generative AI, organizations can enhance data literacy, democratize data insights, empower the workforce, and drive better outcomes.
