Raspberry Pi Foundation research sets out new framework for teaching AI and data science


8 Jan

Findings from the Raspberry Pi Computing Education Research Centre outline a shift in how learners should be taught about AI, machine learning, and data-driven systems.

The Raspberry Pi Foundation has published new research arguing that teaching learners about artificial intelligence and data science requires a fundamentally different approach from traditional programming instruction.

The research, shared in a blog post by Manni Cheung at the Raspberry Pi Computing Education Research Centre, introduces a new “data paradigms framework”. The framework is designed to help educators understand how students learn about data-driven systems, and how to structure classroom activities so that learners do not form misconceptions about AI and machine learning.

The Raspberry Pi Foundation is a UK-based charity focused on enabling young people to understand, create with, and critically evaluate digital technologies. Its research arm examines how computing is taught and learned, with the aim of improving classroom practice and curriculum design.

Shift from rule-based programming creates teaching challenges

The research distinguishes between knowledge-based and data-driven approaches to system design. In knowledge-based systems, developers explicitly define rules that determine outputs. These systems are explainable by design, with a clear path from input to output.

By contrast, data-driven systems rely on large datasets to train models rather than on predefined rules. The internal workings of these models are often opaque, meaning that even their developers may be unable to explain clearly why a specific output was produced.
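The contrast between the two approaches can be illustrated with a minimal Python sketch. It is not taken from the research; the temperature task, the function names, and the 25-degree threshold are all hypothetical. The first function encodes a rule chosen by a person, while the second derives its rule from labelled examples.

```python
# Knowledge-based: a person writes the rule, so the path from input to
# output can be read straight from the code.
def rule_based_is_hot(temp_c: float) -> bool:
    return temp_c >= 25.0  # hypothetical threshold chosen by the developer


# Data-driven: the threshold is not written down in advance; it is
# chosen to fit labelled examples as well as possible.
def learn_threshold(examples: list[tuple[float, bool]]) -> float:
    """Return the candidate threshold that misclassifies the fewest examples."""
    candidates = sorted(temp for temp, _ in examples)

    def errors(threshold: float) -> int:
        return sum((temp >= threshold) != label for temp, label in examples)

    return min(candidates, key=errors)


# Hypothetical training data: (temperature in Celsius, labelled "hot"?)
examples = [(12.0, False), (18.0, False), (22.0, False), (27.0, True), (31.0, True)]
threshold = learn_threshold(examples)


def data_driven_is_hot(temp_c: float) -> bool:
    return temp_c >= threshold


print(threshold)                 # the learned rule depends on the data
print(rule_based_is_hot(26.0))   # decided by the hand-written rule
print(data_driven_is_hot(26.0))  # decided by the learned threshold
```

This toy model is data-driven but still transparent, because the learned threshold can be printed and inspected. Opacity tends to arise when a model has far more learned parameters than a person can reasonably read.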

According to the research, this shift represents a significant challenge for education. Learners accustomed to rule-based programming may expect systems to produce a single correct answer, while data-driven systems often generate probabilistic or non-deterministic outcomes. When instruction does not address this difference directly, students risk developing incorrect assumptions about how AI systems work.

To investigate how data science and AI are currently taught, researchers analyzed 84 academic studies focused on teaching and learning in data science. Learning activities were categorized based on whether they were knowledge-based or data-driven, and whether the underlying models were transparent or opaque.

This analysis led to the development of four data paradigms that describe the types of modeling activities students encounter in classrooms.

Four paradigms highlight gaps in current instruction

The framework identifies four combinations: knowledge-based and transparent, data-driven and transparent, data-driven and opaque, and knowledge-based and opaque. The research found that most classroom activities involving machine learning fall into the data-driven and opaque category, where students can see the data but cannot inspect or explain how the model produces outputs.
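One way to picture the framework is as a two-by-two grid. The sketch below models the two dimensions as Python enums and tags a few classroom activities with them. The class names and the example activities are illustrative assumptions, not items from the 84 studies.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    KNOWLEDGE_BASED = "knowledge-based"
    DATA_DRIVEN = "data-driven"


class Visibility(Enum):
    TRANSPARENT = "transparent"
    OPAQUE = "opaque"


@dataclass(frozen=True)
class Activity:
    name: str
    source: Source
    visibility: Visibility

    def paradigm(self) -> str:
        """Name the paradigm using the wording of the framework."""
        return f"{self.source.value} and {self.visibility.value}"


# Hypothetical activities, chosen only to illustrate the categories.
activities = [
    Activity("hand-written quiz-scoring rules", Source.KNOWLEDGE_BASED, Visibility.TRANSPARENT),
    Activity("fitting a line to measurements", Source.DATA_DRIVEN, Visibility.TRANSPARENT),
    Activity("training an image classifier", Source.DATA_DRIVEN, Visibility.OPAQUE),
    Activity("training a text classifier", Source.DATA_DRIVEN, Visibility.OPAQUE),
]

# Count how many activities fall into each paradigm.
counts = Counter(activity.paradigm() for activity in activities)
for paradigm, count in counts.items():
    print(f"{paradigm}: {count}")
```

Tallying a scheme of work in this way would make an imbalance between paradigms visible at a glance.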

Researchers argue that this imbalance can lead learners to overestimate the reliability or objectivity of AI systems. Without explicit instruction on concepts such as model confidence, data quality, and evaluation, students may assume data-driven systems function like fully explainable rule-based programs.

The study also found no examples of knowledge-based but opaque systems in K–12 education, highlighting how opacity is closely tied to data-driven modeling rather than rule-based programming.

Teaching opaque models requires explicit scaffolding

The research suggests that data-driven opaque models should still be taught, but with greater emphasis on evaluation and testing than on explanation. Learners need to explore how outputs change when inputs vary and to understand the limitations of models trained on data.
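The sketch below shows what evaluating a model without explaining it might look like in code. It assumes the model is available only as a function from inputs to a label. The stand-in model, its two features, and the test data are hypothetical; a real classroom tool would supply its own trained model.

```python
from typing import Callable, Sequence

Model = Callable[[Sequence[float]], str]


def probe(model: Model, base: Sequence[float], index: int,
          values: Sequence[float]) -> list[tuple[float, str]]:
    """Vary one input while holding the others fixed and record each output."""
    results = []
    for value in values:
        trial = list(base)
        trial[index] = value
        results.append((value, model(trial)))
    return results


def accuracy(model: Model, labelled: Sequence[tuple[Sequence[float], str]]) -> float:
    """Fraction of labelled test examples that the model gets right."""
    correct = sum(model(features) == label for features, label in labelled)
    return correct / len(labelled)


# Stand-in for an opaque model. Learners would treat it as a black box
# and would not read its body. (Hypothetical; not a trained model.)
def stand_in_model(features: Sequence[float]) -> str:
    brightness, size = features
    return "cat" if 0.7 * brightness - 0.4 * size > 0.1 else "dog"


# How does the output change as the first input varies?
print(probe(stand_in_model, [0.5, 0.5], index=0, values=[0.0, 0.5, 1.0]))

# How often does the model agree with labelled test examples?
test_set = [([0.0, 0.5], "dog"), ([1.0, 0.5], "cat"), ([0.5, 0.5], "dog")]
print(accuracy(stand_in_model, test_set))
```

Neither function looks inside the model. Both describe its behaviour from the outside, which is the kind of testing available when a model cannot be explained.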

To support this, the researchers propose that instruction should begin with rule-based transparent systems or simpler data-driven transparent activities, such as linear regression or data visualization. These approaches can act as a bridge, helping learners distinguish between systems built from logic and those trained on data before engaging with opaque machine learning models.
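Linear regression is transparent because the whole model comes down to two numbers that a learner can read. The sketch below fits a line by ordinary least squares using only the Python standard library. The study-hours data is invented for illustration and deliberately lies on an exact line.

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours of practice and quiz scores.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]

slope, intercept = fit_line(hours, scores)

# The entire model is these two numbers, so its predictions can be explained.
print(f"score = {slope:.1f} * hours + {intercept:.1f}")
print(slope * 6.0 + intercept)  # prediction for six hours of practice
```

The model is learned from data, yet every prediction can be traced back to the slope and intercept, which is what lets it serve as a bridge towards opaque models.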

New study launched with primary teachers in England

Alongside the framework, the Raspberry Pi Foundation has announced a new collaborative study focused on teaching data-driven computing to learners aged nine to 11. The study will work with upper key stage 2 teachers in England to examine how pupils understand data, probability, and data-driven systems.

The project will run through 2026 and include workshops with participating teachers, with the aim of identifying practical classroom strategies that build confidence and conceptual understanding rather than procedural knowledge alone.
