Atomisation – Compound Shapes

Recently I used the word coined by Bruno Reddy, atomisation, to describe the process of breaking ‘solve two simultaneous equations’ into 13 different sub-tasks.


Even more recently, Ben Gordon used a similar approach to turn ‘find the area of a compound shape’ into 28 different sub-tasks!

I’m struggling to find anything missing… the obvious one that gets overlooked is ‘find unknown lengths between parallel lines,’ but it’s in there.


So, can anyone find anything that’s missing?  Can we split the atom any further, and turn 28 sub-tasks into even more?  (Assuming calculation / arithmetic is already secure.)




Mixed Ability, Sets, and Streams – a teacher’s perspective – Part 1

I’ve taught sets, mixed ability, and streams.

What follows isn’t any rigorous analysis, or appeal to research.  From what I’m aware of in the research, the conclusions aren’t exactly conclusive: lower attainers benefit from mixed groupings, higher attainers suffer.  Mark McCourt reiterates this point, and takes it further by pointing out that those conclusions aren’t necessarily subject or key stage specific, while Lucy Rycroft-Smith suggested there was broadly no impact either way (MrBartonMaths Interview, roughly 52 min in.  Research Espresso.)

So, this has nothing to do with research, just my thoughts and feelings having had some experience of all three.

I’ve split it into four parts:

  1. Setting
  2. Mixed Ability
  3. Streaming
  4. Conclusion


This is Part 1 – Setting



In my first school, where I taught for two years, pupils were set, as seems to be most common.  They were in up to 8 ability groups.  Each child knew which group they were in, and classes were named based on that number e.g. 9.7 for Year 9 Set 7.

There were many things I disliked about this system.


First, language.

I found all teachers used the language of sets when talking to pupils.  For example:


“I expect better behaviour from a top set!”

Thus implying that we don’t expect better behaviour from every other class…

“Maybe if you work hard, you’ll be able to move up a set…”

…up a set where you’ll finally learn something, because no-one learns in this class.

I was not above this.  In clueless moments of desperation I have uttered these words and hated myself while saying them.  I think the language that I’ve seen sets engender in teachers probably sums up all their worst features. But then…


Then, set changes.

Every time a child moves sets, information is destroyed.

I found I could teach the pupils I knew best, best, since I knew what they knew, knew what we’d discussed, and could draw on historic experiences as prompts, or build relationships between knowledge.

I deeply disliked it when a new pupil joined my class, and wasn’t thrilled when one left.  This wasn’t just caused by human bias against change, and the emotional effort of forming new relationships; it was also, arguably mostly, because I wasn’t confident that I could be a better teacher to them than the previous person who knew them better.

If these changes happened rarely, I might consider them manageable.  In my experience in this school, with the exception of the ‘protected’ top set, the churn was incessant, and the consequences dire.  Considering our relatively poor ability to accurately measure performance, never mind learning, it’s doubtful that these set changes were truly meaningful, or helpful.

See this slide by Dylan Wiliam for a glorious example of what I mean.


Test reliability 0.9, predictive validity 0.7, 100 students, 50% in ‘correct’ set

FYI – 0.9 reliability is like, stupidly high compared to what you can expect from most school/teacher set assessments (which I think from memory is typically closer to 0.7, but shout out if you know better and I have that wrong.)
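If you want to get a feel for why imperfect measurement shuffles pupils into the ‘wrong’ set, here is a minimal simulation sketch.  It is not a reconstruction of the slide: I’m assuming that ‘predictive validity’ can be treated as the correlation between a test score and a later outcome, that pupils are split into equal-sized sets, and the function name and the number of sets are my own hypothetical choices.

```python
import random


def fraction_in_correct_set(n_students, n_sets, correlation, seed=0):
    """Share of pupils whose test-based set matches their outcome-based set.

    Assumption: the test score and the later outcome are standard normal
    variables with the given correlation (standing in for predictive validity).
    """
    rng = random.Random(seed)
    noise_weight = (1 - correlation ** 2) ** 0.5
    outcome = [rng.gauss(0, 1) for _ in range(n_students)]
    test = [correlation * o + noise_weight * rng.gauss(0, 1) for o in outcome]

    def assign_sets(scores):
        # Rank pupils from highest to lowest score, then cut into equal sets.
        order = sorted(range(n_students), key=lambda i: scores[i], reverse=True)
        sets = [0] * n_students
        for rank, student in enumerate(order):
            sets[student] = rank * n_sets // n_students
        return sets

    by_test = assign_sets(test)
    by_outcome = assign_sets(outcome)
    matches = sum(a == b for a, b in zip(by_test, by_outcome))
    return matches / n_students


# Hypothetical set-up: 100 pupils, 4 sets, correlation of 0.7.
print(fraction_in_correct_set(100, 4, 0.7))
```

Changing the seed, the number of sets, or the correlation shows how sensitive the ‘correct set’ figure is to each of them.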


The protected top set.

There was an undercurrent of belief throughout my school that the top set were the only people who would truly learn.  Everyone else was grist for the C grade mill.  Once they banked it, they would be chucked out of the target groups and into the second set, where they were expected to more or less languish… maybe pick up a B if they were lucky, but you know, no biggie.


Child trading.

A sub-set of set changes – if a pupil and teacher didn’t get along, it was generally understood that one or the other could petition the head of department to have the pupil moved up or down a set, so they didn’t have to be together.

Again, I was not immune to this.  I would be more than happy to advocate that maybe a belligerent should move up a set, if I dreaded seeing them each day.


Finally, obviously, self-concept and stereotype threat.

Best demonstrated by this 9 second clip from Tough Young Teachers


“Bottom set, what does that mean to you?”

“Dumb!  We’re dumb.”

“Not very smart.”

“So, is that what you think of yourselves…?”



Pretty typical of students in the bottom set.  But then actually, pretty typical of almost all students who aren’t in the top set, I found!

Then you get the kids in the bottom of the top set, who are obviously pretty high achievers, but think they’re the worst, because in their little sphere of experience, they are.

Then you get the kids who do well, think they’re really smart, think everyone else thinks they’re smart, and now don’t want to try for fear of failure and losing that impression – it’s okay to fail if you weren’t even trying, not okay to try and fail and signal that you’re not that smart after all.


So, yeah, sets sucked in a lot of ways.



The world’s most effective learning experience

Create and provide the world’s most effective learning experience, accessible to all.

This is the promise of Up Learn, a growing educational start up driven by the belief that every child can learn whatever we have to teach them, if only we could get the teaching right.  It reminds me of a point I made at the bottom of this post.


Cognitive science forms the bedrock of its instructional approach, and machine learning adaptive algorithms personalise the experience to each student.

I know, everyone’s promising computers will change everything, but I’m writing about this, in part, because having met and supported more edu-entrepreneurs than I can now recall, two things about this one stood out.

The first was just how far its co-founders had come in terms of reviewing material from the literature of cognitive science, applying it to their service, and writing it up into an internal summary that would give Craig Barton a run for his money.

The second was their promise of an A or A* to every student who used their service, or they would return the money of anyone who paid; they were backing themselves to get this right in a way I haven’t seen from others.


For now, Up Learn funds its research by providing an alternative to expensive one-to-one tuition for students studying A Levels.  Video explanations are scripted in full, and lessons are driven by a careful selection of questions before, during and after, based on principles from cognitive science and other research into educational video design.  Explanations are then recorded and animated by a professional team.

In its first year of operation, 95% of students who took its pilot Economics course achieved an A or A* in their final exam.  Despite this, the team somehow decided the videos weren’t good enough (?!) and so redesigned the course from scratch.  Last year, 100% achieved an A or A*.

To date, over 20,000 students have signed up to the platform, and this forms the other part of the reason I’m writing about this: a couple of weeks ago I joined Up Learn as their Director of Education, and we’re trying to find really, really, really good lesson designers to create content for new A Level subjects, and it’s not easy.  I’m hoping you might know someone who’ll be interested, and point them our way.

You don’t get the day to day hustle and bustle of a lively school environment, but what you do get is the chance to sit for hours on end, uninterrupted, super-deep in thought about the best possible way to sequence and communicate your subject’s content; then, best of all, you get to roll it out to thousands of students and have intelligent deep-learning analytics tell you what’s working, and what isn’t, so you know exactly what needs to be improved.  I love it, but it’s probably not for everyone; this one’s a job that will probably best suit someone you know who massively loves their subject, and spends forever learning more, and thinking about the minutiae of how to teach it.  They may even live as a bit of a tortured soul where they see students not getting the top grades as resulting from their failure to teach well enough; if only they had more time to plan…

From all this work, we’re hoping to eventually derive some general principles of instruction that can be applied by teachers of all subjects, and all ages, in all classrooms, up and down the country, so it will probably also suit someone who values scale of impact over proximity to impact.

We’re prioritising chemistry, biology and psychology for now, but would still be keen to hear from teachers and tutors of other subjects.

Please, if you know any great teachers who you think would love to do this, ask them to take a look and get in touch.


Knowledge Organisation

There was all kinds of fuss and frustration expressed by individuals earlier in the year, when several teachers started extolling the virtues of Knowledge Organisers.

A quick Google for this will reveal several examples of these on Google Images.

Most of the criticism was asinine, in some cases seeming to go out of its way to be obtuse.  For example, one criticism I can recall was ‘But pupils need to learn more than this!’ at a time that precisely no-one had claimed otherwise, and most had expressed how they were using knowledge organisers as a tool to develop schema forming (e.g. through self-quizzing key factual knowledge outside of lesson time, so the teacher could focus on fleshing out further knowledge, relationships, interpretations etc. during lessons.)

This is unfortunate, because it is possible to level real criticism at knowledge organisers.

Another way of putting it would be to question whether knowledge organisers are just the first step in a greater journey of expanding our understanding of knowledge organisation, more broadly.

Organising Structures

Frederick Reif defines three kinds of knowledge organisation:

  1. Associative network
  2. List
  3. Hierarchy

The first is how our mind works.  Concepts are associated with other concepts through some relationship.


Lists are exactly what they sound like.  A list has a heading, and then its sub-points simply continue in length.  If more than one list is presented, the assumption is that there is no real relationship between them:


A hierarchy can be thought of as a series of lists connected by grouping or categorisation.  E.g. Susan is one example of ‘human beings,’ which are collectively one example of ‘mammals,’ which are collectively one example of ‘life forms’…


Reif argues that hierarchies are the most desirable structure – associative networks are difficult to mentally navigate, and lists have too little organising structure, being little more than a single grouping.
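For the more programmatically minded, the difference between a list and a hierarchy can be sketched in a few lines.  This is only an illustration using the Susan example above; the variable and function names are hypothetical and none of this is Reif’s own notation.

```python
# A list: the items sit side by side, with nothing saying how they relate.
flat_list = ["Susan", "human beings", "mammals", "life forms"]

# A hierarchy: each item points to the group it is one example of.
parent_of = {
    "Susan": "human beings",
    "human beings": "mammals",
    "mammals": "life forms",
}


def path_to_top(item, parent_of):
    """Follow the 'is one example of' links upwards until they run out."""
    path = [item]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path


print(path_to_top("Susan", parent_of))
# ['Susan', 'human beings', 'mammals', 'life forms']
```

The list holds the same four names, but only the hierarchy lets you navigate from one to the next.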

These are not the only ways of organising knowledge; a glance through the work of Nancy Duarte will quickly reveal that.


Or even just taking a look at MS Smart Art:


Siegfried Engelmann also has a fascinating series of principles around how knowledge should be organised and presented to learners, but that’s a story for another day.


The reasonable criticism that I think can be levelled at Knowledge Organisers, as they currently stand, is that they all seem to be making use of the list structure, and only the list structure.

This doesn’t mean they will be ineffective, not at all, but it does mean they might be less effective than they could be.

By ‘less effective,’ I mean it might take a learner more time to commit the facts to long-term memory, that they might do so with lesser storage strength, and that the relationships between the facts will be weaker, than if an alternative structure were adopted.

Note: Before I go any further, it’s worth noting that I can also see very good reasons for constructing knowledge organisers the way that everyone is constructing them, and I’ll speak to that at the end.


For now, I’m going to use a single example to talk through this, characters in The Tempest.

I finally saw this for the first time last month, and set about learning the plot and characters so that I could follow what was happening on stage.  SparkNotes presents character information as a list, similar to how it would be presented in a conventional Knowledge Organiser:


While reviewing this, I found I had to do a lot of mental work to formulate a mental image of who was whom, and how they related to one another.  I voluntarily undertook that work, but it would be easy for this to be presented to a child who didn’t automatically undertake that effortful work, or didn’t know how to, or struggled to hold and process everything in working memory simultaneously.

In that instance, the pupil would only see a list of unrelated, or at best loosely connected, names, and wouldn’t develop the mental schema of how they relate to one another – it would resemble the so-called ‘rote learning’ that many teachers viscerally fear.

An alternative would be to present the information like this (click for larger image):


Here, relationships between characters are made explicit.  Size of box and colour show relative importance of the character in the play, and hint at who to read about first.  Ultimately, this is the mental schema we would want pupils to construct in their own mind; by laying it out transparently we guarantee success for everyone.
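One way to think about what the network adds over the list is to store each relationship as a labelled link between two characters.  The sketch below is not a transcription of my diagram; it uses a handful of relationships from the play, and the names `relationships` and `links_for` are hypothetical.

```python
# Each entry is (character, relationship, other character).
relationships = [
    ("Prospero", "father of", "Miranda"),
    ("Prospero", "master of", "Ariel"),
    ("Prospero", "master of", "Caliban"),
    ("Antonio", "brother of", "Prospero"),
    ("Alonso", "father of", "Ferdinand"),
    ("Sebastian", "brother of", "Alonso"),
    ("Ferdinand", "falls in love with", "Miranda"),
]


def links_for(character, relationships):
    """Return every stored relationship that involves the given character."""
    links = []
    for source, relation, target in relationships:
        if character in (source, target):
            links.append(f"{source} {relation} {target}")
    return links


print(links_for("Miranda", relationships))
# ['Prospero father of Miranda', 'Ferdinand falls in love with Miranda']
```

A plain list of names would hold the same characters, but none of the ‘father of’ or ‘brother of’ links that a pupil otherwise has to reconstruct for themselves.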

Note: I had wanted to try to follow Reif’s advice and construct a hierarchy, but for something like this there didn’t seem to be a useful way of grouping the relationships into ‘higher’ and ‘lower’ levels.  Anything I might have chosen – e.g. relationship to one another, importance in play, proximity on stage – all fell apart when trying to bring the groups together.

Pros and Cons

The associative network I drew, above, has some drawbacks.

If presented all together, it is probably more overwhelming – more likely to induce cognitive overload – than the list; where do you start reading?  What do you read next?  etc.  (I ran into this problem a few times when trying to present several different overviews of all of mathematics to pupils – it was always too much information shared at once, and I didn’t see that they couldn’t possibly navigate it the way that I did.)

This can be overcome by introducing it to pupils in stages – something that Engelmann would do e.g. Start by showing only Prospero’s box, then the Orange Boxes…  However, this restricts its utility as a self-quizzing tool, something that pupils can make use of independently of teachers (indeed, Engelmann’s chapter on knowledge organisation assumes that the information is being presented to pupils by a teacher.)  It also limits its ability to serve as a single sheet of paper, given to pupils at the start of a new sequence of lessons, that presents all the most important facts in a given unit.

It’s a lot more difficult to construct: prone to perfectionism, and consumes much more time than a simple list.

Finally, the benefit of the spatial layout is offset by its consumption of space – you can fit far less information into the same space.


As I said right from the start, there are good reasons that Knowledge Organisers are being constructed the way they are, and their utility and ease of construction mean that they probably shouldn’t be abandoned.

There are alternative structures, though, that could increase the probability of a given child’s speed in committing facts to memory, the storage strength and enduring retrieval strength of those facts, and their knowledge of the relationships between the facts.

It would be interesting to see the results if teachers started to experiment with these structures, in addition to the list.


My best planning. Part 4

Craig Barton interviewed me recently, during which I discussed a series of lessons I planned and taught on solving simultaneous equations.

I could be wrong, but I think this was the best planning and teaching I ever did.

Several people have asked if I would share examples of what I described during the interview, so I’m adding that here. It’s a bit lengthy, but hopefully provides the detail many people were asking for, as well as some insight into how Siegfried Engelmann’s Theory of Instruction can be applied to the classroom.

I’m splitting the post into four parts:

  1. Specification of content
  2. Sequencing of content
  3. Pedagogy / Instructional Approach
  4. Limitations of Atomisation


This is Part 4 – Limitations of Atomisation


The Limits of Atomisation

I promised some commentary on the question of how far we should break concepts into a greater number of smaller and smaller pieces.  I’m borrowing a word that Bruno Reddy suggested to me years ago to describe this: atomisation.

The benefits of atomisation are simple: it increases the probability of success for each child.

Picture one of the classes you teach.  Now for each of those children, picture them with a probability hovering over them: for any given thing you might strive to teach them, selected at random, in any particular lesson, this is the probability that they will succeed in learning it.

Note: The way I’m using the word ‘learning’ here is incorrect, but speaks to our intuition about what’s happening in the classroom.  Learning is a long-term effect resulting in changes in long-term memory.  What I really mean here is the probability that pupils will respond successfully to predetermined questions, for that lesson only, which I would argue is a necessary but not sufficient condition for learning to eventually take place.

Seating Plan 1

The way in which we teach can affect these probabilities, but the general distribution will likely remain – I suspect it is probably not true that ‘some methods work better for some children and less well for others,’ and there is a deeply sinister and insidious consequence of this line of thinking, as well, hinted at towards the end.

If, as Daniel Willingham says, we are more alike in how we learn than we are different, then changing the instructional method likely increases or decreases everyone’s probability of learning successfully, while more or less maintaining the distribution, the landscape of probabilities.

Important Note: Sometimes we think that ‘one way of teaching is better for some children than others’ because we switch to a different explanation, analogy, instructional method, and find that, when we do, a given child ‘finally gets it.’  We conclude that we had finally hit upon ‘the successful method’ for that particular child, but Willingham argues that it is more likely that the child simply needed more time, and trying different things gave them the time they needed to process the idea, or that they may have simply needed more examples / analogies.  In other words, the eventual success was not caused by the particular example we gave them last, but by the cumulative effect of having given them three examples; whichever example we started with would still have resulted in failure.  This is important because it’s this kind of reasoning that led us into the traps of ‘VAK learning styles’ and ‘left-brain / right-brain dominance,’ ideas that seek to categorise and therefore limit what we believe people to be capable of.

In this model, I am suggesting that atomisation raises the probability of each child being successful:

Seating Plan 2

Suddenly, a class that seemed to have just a few ‘super smart’ kids in it now looks as though it has a whole bunch of them, with only a narrow gap in ‘ability’ for most.

This increased success likely results for several reasons that cognitive science can explain; I won’t go into them all here, but a simple one would involve the way atomisation helps us to avoid overloading Working Memory.

But there is no such thing as a free lunch.

The limitation: With increased atomisation, comes an increase in time needed to cover the content.

There were people in the class who previously had a very high chance of learning the content before atomisation, so what happens to them, do they lose out as more time is spent on the same topic?

What happens if we take this to an extreme, and break things down into the smallest components possible; what was once treated as a single idea, is now treated as a hundred – in this case is it possible that the time needed to cover everything in such minute detail would result in a diminished return?

The solution

…is not a simple one, but does exist.

As with most things in life, it’s a balancing act.

First, we’re generally so bad at this (speaking for myself,) and our standard textbooks tend to be equally bad at it – in other words, at the moment, atomising more will probably lead to huge gains for most children, in most circumstances, so I would judge that there’s little risk in you striving to apply it.

Second, yes, it leads to an increase in time to teach initially, but it also results in the guaranteed initial apprehension of concepts that otherwise would have had a very low probability of being communicated successfully.  This means the increase in time spent at the start is rewarded by increased probability of learning future content, thus spending time now, in order to save time in the future (a return on investment.)

Time for Mastery

Third, atomisation reveals concepts that are otherwise implicit, and overlooked in the curriculum.  For example, we played with adding and subtracting three and four equations, and looked at adding equations without any intention of eliminating a variable, ideas that are often overlooked if the ‘process of solving simultaneous equations’ is simply taught from beginning to end.  As a result, even the ‘higher attainers’ are learning more than they would otherwise (I spoke to this in the podcast.)

Finally, there is still a balance to be had.  Too much of anything is bad, by definition, and too much atomisation is probably possible.  I wonder whether the appropriate balance shifts from pupil to pupil, and this in turn leads me to wonder what role streams (a potentially better version of setting) and differentiating by time might play.  Perhaps a top stream would experience less atomisation compared with a lower stream, but the lower stream would be gifted more time with their teacher to mitigate the time cost.


The problem with traditional teaching

Engelmann has a way with words.  Much as I find labelling unhelpful, what most people refer to as ‘progressive’ teaching, for him, is ‘traditional.’  To him, everything that has come before his ideas is ‘traditional.’

It is traditional, because in his mind the traditional position in education is that some kids can learn very well, and others can’t.

In stark contrast with this, Engelmann believes that if you get the teaching right, all children will be successful.  The diagrams with the percentages hopefully speak to the image this conjures in my mind.

Consider the following three classes, with their respective probabilities of success in any given lesson:

Seating Plan 1


Seating Plan 2


Seating Plan 3

Engelmann would argue that the traditional teacher sees three classes, with different pupils.

He sees three classes of the same pupils, with different teaching.


My best planning. Part 3

Craig Barton interviewed me recently, during which I discussed a series of lessons I planned and taught on solving simultaneous equations.

I could be wrong, but I think this was the best planning and teaching I ever did.

Several people have asked if I would share examples of what I described during the interview, so I’m adding that here. It’s a bit lengthy, but hopefully provides the detail many people were asking for, as well as some insight into how Siegfried Engelmann’s Theory of Instruction can be applied to the classroom.

I’m splitting the post into four parts:

  1. Specification of content
  2. Sequencing of content
  3. Pedagogy / Instructional Approach
  4. Limitations of Atomisation


This is Part 3 – Pedagogy / Instructional Approach


Pedagogy / Instructional Approach

The choice of instructional approach varies depending on the type of concept being communicated, but in the podcast I talked about the sequence I used to teach addition and subtraction of equations, so I thought I’d share that in full here.

In this instance, the concept was treated as a Transformation, in Engelmann’s taxonomy.

My words for this sequence were scripted, most equations were chosen in advance, there was a single teacher example, and pupils responded to questions on mini-whiteboards.  There were 14 questions, and the whole sequence took around 20 minutes (I think it might have been possible to reduce the time used.)

My responses to what pupils wrote on their whiteboards were not scripted.  I wrote all equations up onto the whiteboard, and where minimal changes from one example to the next had been planned, I allowed pupils to watch me rub out the thing that had changed, and watch me input its replacement (e.g. changing a + sign to a – sign.)  This is referred to by Engelmann as Continuous Conversion, which I may write about separately at a later date.

Teacher Example

“In primary school you were taught how to add together numbers.  When you joined us in Year 7, we taught you how to add letters, unknowns, variables.  I’m now going to show you how to add together entire equations.  My turn first:”

The following was already written up onto the board as shown.  I then proceeded to fill in the three boxes, in silence, with the numbers 7x, +10y, and 25.

Equation Example

I then paused for around 6-7 seconds, giving pupils reading and thinking time, before saying:

“Because 5x add 2x is 7x, 4y add 6y is +10y, and 20 add 5 is 25.”
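For readers who like to see the rule spelled out, here is a minimal sketch of the same ‘add the left sides, add the right sides’ idea.  I’m assuming, from the spoken explanation, that the two equations on the board were 5x + 4y = 20 and 2x + 6y = 5; the dictionary layout and the name `add_equations` are hypothetical, chosen only for illustration.

```python
def add_equations(first, second):
    """Add two linear equations term by term.

    Each equation is a dict with a 'left' side (variable -> coefficient)
    and a 'right' side (a single number).
    """
    left = dict(first["left"])
    for variable, coefficient in second["left"].items():
        left[variable] = left.get(variable, 0) + coefficient
    return {"left": left, "right": first["right"] + second["right"]}


# Assumed board example: 5x + 4y = 20 and 2x + 6y = 5.
first = {"left": {"x": 5, "y": 4}, "right": 20}
second = {"left": {"x": 2, "y": 6}, "right": 5}

print(add_equations(first, second))
# {'left': {'x': 7, 'y': 10}, 'right': 25}
```

The result corresponds to 7x + 10y = 25, matching the three boxes filled in on the board.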


“Your turn.  Add together the following two equations.  I only want to see the final result on your boards; what would go into the three boxes at the bottom.”

The following is the sequence of questions pupils were now asked to respond to.  I will add the intention at each stage, and whether it was designed to be minimally different from the previous example (apologies, some images have come out a little distorted.)  Although not shown here, each of these was set up the same as in my example above, with response boxes, until I state otherwise.  In every case, the response success rate was 80-100%, unless stated otherwise.

Equation 1

Identical to my example pair of equations, just with different numbers, still chosen for the arithmetic to be within everyone’s ability to compute mentally.

Equation 2

Minimally different, interleaving practice with simple negative arithmetic (5 add negative 2.)

Equation 3

Minimally different, intended to expose elimination of a variable as a possible consequence of adding two equations together.  Responses from pupils were a roughly even mixture of ‘0y’ and ‘     ‘  (blank space.)  We paused to discuss whether each was acceptable.

Equation 4

Back to the original set up.  Pupils were this time asked to subtract the second equation from the first.

Equation 5

Minimally different, intended to test whether pupils would correctly interleave a more complex negative arithmetic calculation with what they learnt (10 subtract negative 3.) Success rate dropped to ~70%, due to a variety of errors with this negative arithmetic.

Equation 6

Minimally different, but two changes made at the same time.  This was intended to test one of the most difficult negative arithmetic calculations (negative subtract negative,) at the same time as introducing the idea that elimination could occur through subtraction as well as addition.

Success rate dropped to ~50%, mostly involving errors around the negative arithmetic.  I regretted making two changes at once, especially where one was such an important concept, and the other was predictably going to result in many mistakes.

I opted to return to addition only for the rest of this sequence, and further develop that conceptualisation.

Equation 7

A return to our initial set up again, but with a single change designed to catch out lazy System 1 thinking.  About 50% success rate, but the failures all involved writing ’12x’ and ‘8y.’  When the mistake was pointed out, all of those pupils reacted in a way that suggested they immediately recognised their mistake, and wouldn’t make it so easily again!  (As opposed to not understanding why their response was incorrect.)

For this question, and future questions, the structure given by the response boxes was removed.

Equation 8

Minimally different, designed to test how pupils would respond to having an atypical set up, with no other x term to add.

Equation 9

Minimally different, designed to test how pupils would respond when there were no x or y terms to add.  This question added a layer of challenge, since the x and y terms presented were lined up.  This didn’t prevent the same typically high success rate as before, however (10x + 2y = 20,) which might hint that pupils learnt their lesson from the trap that was set two questions earlier.

Equation 10

Two new equations, intended to show that this addition concept was a feature of equations, not a feature of ‘letters,’ and to provide something concrete to show that the addition was producing true statements.

Pupils were asked not to simplify the left side of their equations after they added them; however, many of the higher attaining pupils in the class did, responding with “25 = 25.”

Equation 11

Maximally different, designed to offer some significant challenge now, both by incorporating negative numbers again, and by removing the ‘set up’ that previous questions had enjoyed.  Pupils had to align these two for themselves, or else process them in their heads.

Equation 12

Maximally different.  Following our previous toying with changes to the variables presented, I wanted to test how pupils would respond when the variables were different in each equation (i.e. the equations are not simultaneous.)

To my surprise, the high success rate continued, with nearly every pupil responding with 10x + 5y + 2a + 3b = 20.

Equation 13

Minimally different.  This question was intended to push things even further, and test whether pupils returned ’13x’ on the left hand side.  I had reason to believe that they wouldn’t; up until now the inference they should have formed is ‘add the expressions on the left, then add the expressions on the right,’ but it is possible that some would generate a false inference – a misconception – upon seeing the x term return, and this would be an opportunity to correct that if so.

To my enduring surprise, every child bar one responded with “10x + 5y  + 2a + 3b = 20 + 3x.”

It took me a while to understand why one of the highest attainers had not got this right, to unpick her misconception (since she hadn’t written ’13x’,) until I finally realised that she had rearranged the equation, to return “7x + 5y + 2a + 3b = 20.”

I’m sure that should have been obvious, but you know what it can be like when under pressure in front of a class!
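Written out, the two responses are the same equation, one step apart; subtracting 3x from both sides of the expected answer gives the rearranged version:

```latex
\begin{align*}
10x + 5y + 2a + 3b &= 20 + 3x \\
10x - 3x + 5y + 2a + 3b &= 20 \\
7x + 5y + 2a + 3b &= 20
\end{align*}
```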

Equation 14

This was a final question, intended to apply the concept in one final new way, before wrapping up.  Again, the success rate was 100%.



My best planning. Part 2

Craig Barton interviewed me recently, during which I discussed a series of lessons I planned and taught on solving simultaneous equations.

I could be wrong, but I think this was the best planning and teaching I ever did.

Several people have asked if I would share examples of what I described during the interview, so I’m adding that here. It’s a bit lengthy, but hopefully provides the detail many people were asking for, as well as some insight into how Siegfried Engelmann’s Theory of Instruction can be applied to the classroom.

I’m splitting the post into four parts:

  1. Specification of content
  2. Sequencing of content
  3. Pedagogy / Instructional Approach
  4. Limitations of Atomisation


This is Part 2 – Sequencing of Content.


Question 3

How should we sequence this content over the time we have available?

In some ways, this is the wrong question; rather than trying to cram content into the time available, we should allocate time based on what we believe is needed for 100% comprehension.

However, time is always a constraint for us teachers – in fact, it is the ultimate constraint.  At the least, we should perhaps be prepared to argue ‘There isn’t enough time to cover all of what we might like to, so we will cut the content short… cut it off at X’ rather than trying to rush through content, and risk leaving pupils feeling that ‘they’re stupid,’ because they do not understand (when they couldn’t possibly, at the pace we were going.)

In this instance, we demonstrated this restraint when actively choosing not to teach solving by substitution.  In recognising explicitly that we have not taught that, however, we can plant a flag that says we need to come back to it at a later date (e.g. in Year 10.)  Using simultaneous equations for modelling (e.g. converting word problems into simultaneous equations, to solve the word problem) was also grudgingly left out, for now, as our time was cut shorter than initially expected.

The table below shows how the thirteen components settled upon were sequenced across the five lessons.

I – Initial instruction

R – Revisit (which can include interleaving concepts together)

Table 2

It’s probably worth noting a few points:

1) We spent three lessons adding and subtracting equations, before we switched to having to decide whether to add or subtract, in order to eliminate one variable.

2) We spent four lessons (7 hours) before finally ‘putting it all together,’ i.e. before actually solving any simultaneous equations.  This is in stark contrast with my own prior planning, when I would begin a series of lessons on this topic by showing pupils how to solve what I considered to be a simple pair of equations, from start to finish, right in the very first lesson.

3) In this instance, no more than two new things were introduced each time, yet anything from five to eight ideas could be tested, expanded, retested and integrated in any given lesson.

4) Finally, it might seem strange that deciding whether to add or subtract was covered in only one lesson.  In the following lesson we moved on to multiplying to find a common coefficient; one of the great challenges for weaker pupils is in deciding between many options available to them i.e. ‘knowing what to do.’  To minimise the number of decisions they needed to make at this stage of the learning process, we taught them to always cross multiply the coefficients of x, then subtract – there was no longer any ‘decision’ to be made as to whether to add or subtract.

This certainly isn’t optimal.  By approaching the task of solving this way, pupils will sometimes have to deal with larger numbers or more complicated mental arithmetic, when they could have used various short cuts, e.g. sometimes one coefficient is a multiple of another, so both equations don’t need to be altered, or sometimes multiplying by the coefficients of y would yield smaller numbers to work with, or sometimes adding would result in simpler arithmetic.
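As a rough illustration of the ‘always cross multiply the coefficients of x, then subtract’ routine, here is a minimal Python sketch.  The function name, the (a, b, c) representation for ax + by = c, and the example equations are all assumptions made for illustration; it also assumes integer coefficients and that neither x coefficient is zero.

```python
from fractions import Fraction


def solve_always_cross_multiply(eq1, eq2):
    """Solve a pair of equations, each given as (a, b, c) meaning ax + by = c.

    Follows one fixed routine: multiply each equation by the other's
    x coefficient, subtract, solve for y, then substitute back for x.
    """
    a1, b1, c1 = eq1
    a2, b2, c2 = eq2
    if a1 == 0 or a2 == 0:
        raise ValueError("this sketch assumes both x coefficients are non-zero")

    # Cross multiply: both equations now have a1 * a2 as the coefficient of x.
    scaled1 = (a1 * a2, b1 * a2, c1 * a2)
    scaled2 = (a2 * a1, b2 * a1, c2 * a1)

    # Subtract the second scaled equation from the first; the x terms cancel.
    y_coefficient = scaled1[1] - scaled2[1]
    right_side = scaled1[2] - scaled2[2]
    if y_coefficient == 0:
        raise ValueError("no unique solution")

    y = Fraction(right_side, y_coefficient)
    x = (c1 - b1 * y) / a1  # substitute y back into the first equation
    return x, y


# Hypothetical pair: 2x + 3y = 12 and 5x + 2y = 19
x, y = solve_always_cross_multiply((2, 3, 12), (5, 2, 19))
print(x, y)
```

For this pair the routine produces 10x + 15y = 60 and 10x + 4y = 38, then 11y = 22, so y = 2 and x = 3.  With a pair such as 2x + 3y = 12 and 4x + y = 14, doubling the first equation alone would do, but the fixed routine still works; it just carries larger numbers.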

We made the choice fully aware of these limitations, which I often argue is what is most important.  For now, we judged that ‘always do it this way’ was more likely to result in consistent success for the weakest members of our team.  Also, recall that we have taught all of the components required to take advantage of those more efficient processes (addition as well as subtraction, and deciding whether to add or subtract.)  When placed at the heart of the full length process, however, we knew that having to make those decisions would result in cognitive overload for the weaker pupils, whereas our stronger pupils would likely (and indeed did) leverage what we taught them to spot that there were more efficient processes they could make use of.  For those who wouldn’t spot this on their own (most pupils) this is something we can come back to in Year 10, once we have more time available – we can say ‘You know that thing you can do really well?  Turns out, we can make that simpler and easier for you – let me show you how,’ and we would then be focusing only on decision making, and efficiencies of different processes that all result in the same solutions.

As we saw it, the choice was between ‘Try to teach them all the clever short cuts, that seem really easy to us as experts, or don’t.  If we do, most will fail completely, only our strongest pupils will succeed.  If we don’t, everyone succeeds, our strongest pupils will probably spot the short cuts independently, and we can still come back and fill in those gaps for every other pupil at a later date, once they’re 100% secure in the foundation process.’

So we opted for 100% success.

More on this in Part 3.


My best planning. Part 1

Craig Barton interviewed me recently, during which I discussed a series of lessons I planned and taught on solving simultaneous equations.

I could be wrong, but I think this was the best planning and teaching I ever did.

Several people have asked if I would share examples of what I described during the interview, so I’m adding that here. It’s a bit lengthy, but hopefully provides the detail many people were asking for, as well as some insight into how Siegfried Engelmann’s Theory of Instruction can be applied to the classroom.

I’m splitting the post into four parts:

  1. Specification of content
  2. Sequencing of content
  3. Pedagogy / Instructional Approach
  4. Limitations of Atomisation


This is Part 1 – Specification of Content.



Year 9, mixed prior attainment (no sets.)  The spread of prior attainment reached from what would be the ‘bottom set’ in most schools, to the ‘top set.’

9 hours of time spread across five lessons, to teach ‘solving simultaneous equations.’

I created this process, guided by Theory, but taught only one of the three Year 9 classes, and so much of the overview was co-planned with Lydia Povey, who taught the remaining Year 9 classes.  For this reason I will sometimes refer to ‘I,’ and sometimes refer to ‘we.’

Question 1

What’s the most difficult question type we would like all pupils to be able to respond to correctly, by the end?

Solve a pair of simultaneous equations, where both equations must be changed to provide a common coefficient for one of the variables.


Final Goal

Solving by elimination will be covered, but substitution will not.

While needing to rearrange one or more of the equations is not stated explicitly here, it is an obvious additional step to include that would interleave prior content, and could therefore easily be incorporated into worksheets to challenge pupils – it is not, however, the 100% goal.

It might also be possible to touch on more than two equations and/or more than two variables during classroom practice, but again, this won’t be our measure of success.

Question 2

What are the sub-components of solving simultaneous equations that we should teach explicitly?

In this case, thirteen were identified:

  1. Solve 1-Step equations
  2. Substitute into x and y
  3. Show that (x, y) is a solution to an equation
  4. Identify when equations are unsolvable e.g. 3y+2x=10
  5. Add / Subtract two or more equations
  6. Identify when equations have an infinity of solutions e.g. 3y+2x=10
  7. Find some solutions to an equation that has infinite solutions
  8. Decide whether to add or subtract a pair of equations
  9. Identify when equations have an infinity of solutions, from their graph
  10. Determine whether a given value for (x, y) is a solution, based on the graph
  11. Multiply two equations to get a common coefficient
  12. Put everything together to solve a pair of simultaneous equations
  13. Find the unique solution to a pair of simultaneous equations based on their graphs

The first three were recognised as having been covered in previous lessons, but were not assumed to be known by the pupils.
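To make a couple of these components concrete, here is a small Python sketch of components 3 and 7, using the equation 3y+2x=10 from the list.  The function names are assumptions made for illustration, and exact fractions are used so that non-integer solutions are handled without rounding.

```python
from fractions import Fraction


def is_solution(x, y):
    """Component 3: does (x, y) satisfy 3y + 2x = 10?"""
    return 3 * y + 2 * x == 10


def some_solutions(x_values):
    """Component 7: pick any x, then find the y that makes 3y + 2x = 10 true."""
    return [(x, Fraction(10 - 2 * x, 3)) for x in x_values]


print(is_solution(2, 2))
print(is_solution(1, 2))
for x, y in some_solutions([-1, 1, 2, 5]):
    print(x, y, is_solution(x, y))
```

Every x value chosen yields a matching y, which is one way of seeing why a single equation in two unknowns has an infinity of solutions.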


At the time, this felt pretty comprehensive.  Looking back at it now, I can see how it could be broken down much further.  For example, ‘Identify whether two equations are simultaneous’ is an important component that was left out.

In The Myth of Ability, John Mighton explains how tutors learning to use his JUMP Math programme are often shocked by how many components a concept can be broken down into; what they used to consider ‘one step,’ it turns out, might be five.  Realising this, though, naturally invites the question: How far should we take it?  Should we break one idea into a hundred micro-pieces if we can?  Is more always better, or is there a trade off?

I’ll add some commentary on this in Part 4.


An important point to note:

Each of these is written in terms of a behaviour that we would expect to see a pupil exhibit.  Take Point 4 as an example.  This could be written ‘Know why some equations are unsolvable.’  This is superficially more desirable, since knowing *why* things are the way they are is obviously our end goal.  An implicit assumption is also often made: that if a pupil can express ‘why,’ then they should be able to apply that knowledge to the task of identifying equations that can’t be solved.  But… how do you assess whether a pupil really knows why…?  Ultimately, all we have are proxy measures, inferences we draw from behaviours that are actually observable.  A pupil might write an explanation as to ‘why’ quite convincingly, but perhaps they are just ‘regurgitating’ what they have been told, or copying and pasting what they read in the textbook…  This then leads to teachers feeling they have to withhold the why until the pupil ‘figures it out for themselves,’ introducing an extraordinary (I might offer, unacceptable) level of risk as to whether or not any given pupil will figure it out.  Then, even if successful, we still run into problems with transfer: a pupil who can articulate why quite eloquently still can’t necessarily solve related problems (e.g. the problem of identifying unsolvable equations.)

For these reasons, every single goal is expressed in terms of observable behaviours – if the pupil can do X, they have succeeded in learning what we intended to convey.

The bet being placed, here, is that ‘understanding’ and ‘why’ are functions of this web of related knowledge that is being slowly constructed, piece by piece.  We may screw that up by missing out important tasks or explanations in places, but that would mean those tasks or explanations simply need to be included so that the list of 13 concepts grows in size, rather than necessitating a change in approach.

That said, Point 4 was treated as one type of concept in Engelmann’s taxonomy that requires a follow up question, one which does ask for a ‘why.’  E.g.

Is this equation solvable?

– No.

How do you know?

– Because it has more than one unknown.

The explanation given here has been communicated to pupils directly, and the options available are very limited (either it has more than one unknown, or it doesn’t.)

This is not an exercise in reasoning; in Engelmann’s taxonomy, concepts of this type are ‘understood’ or recognised by their correlation with some other concept.  In this instance, solvable/unsolvable correlates with the number of unknowns in the equation:

Number of unknowns 2

Concepts of this kind are referred to by Engelmann as correlated-feature joining forms.
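The correlation between solvable/unsolvable and the number of unknowns is simple enough to write as a rule.  The Python sketch below is only an illustration: the function names are assumptions, it treats each distinct letter as one unknown, and ‘solvable’ is used in the limited sense of this lesson (not more than one unknown).

```python
def count_unknowns(equation):
    """Count the distinct letters in an equation written as a string."""
    return len({character for character in equation if character.isalpha()})


def is_solvable(equation):
    """Solvable, in the sense used here, unless there is more than one unknown."""
    return count_unknowns(equation) <= 1


def how_do_you_know(equation):
    """Give the follow-up answer that was communicated to pupils directly."""
    if is_solvable(equation):
        return "Because it does not have more than one unknown."
    return "Because it has more than one unknown."


for equation in ["3y+2x=10", "3y+2=10"]:  # the second equation is hypothetical
    print(equation, is_solvable(equation), how_do_you_know(equation))
```

The follow-up answer is read straight off the count of unknowns, which is the sense in which this is recognition by a correlated feature rather than an exercise in reasoning.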


In Part 2 I’ll take a look at Question 3:

Question 3

How should we sequence this content over the time we have available?


Maths: Conceptual understanding first, or procedural fluency?

Should you teach conceptual understanding first, or focus on raw procedural fluency?

This question drives endless debate in maths education, but its answer is very straightforward: it depends.

I can demonstrate this quickly and easily with a single example, by teaching you how to multiply logadeons (e.g. 5-:-9,) something you’re probably not familiar with already.

Observe the following examples:

8-:-20     *   2-:-5       = 10-:-25

9-:-20     *   2-:-5       = 11-:-25

100-:-50 *   30-:-7     = 130-:-57

19-:-20    *   5-:-5      = 24-:-25

By this point, you can probably multiply logadeons together quite comfortably.  If you’d like to give it a go, try these two (answers at the end.)

30-:-17  * 4-:-3

17-:-0.5 * 9-:-2

But even if you can evaluate those correctly, you’re probably still not comfortable about all this; you probably don’t feel like you understand it, and for two reasons:

The first, is *why* does multiplying two logadeons result in us adding the digits?  It’s reasonable to assume there is a perfectly valid reason, just as is the case for adding indices when we multiply numbers in index form, but we don’t yet understand why it’s the case.

The second, is what the hell is a logadeon anyway??  I’ve spent a hundred or so words now discussing the multiplication of something that probably feels like a mental black hole in your mind; it’s very difficult to focus on the process, and leave behind that question: “What on Earth is a logadeon?  What on Earth is he talking about?!”
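The pattern in the four worked examples can be written as a procedure in a couple of lines, sketched here in Python (the function name and the pair representation are assumptions made for illustration).  Being able to run it is exactly the kind of fluency described above: it says nothing about what a logadeon is, or why the rule holds.

```python
def multiply_logadeons(first, second):
    """Multiply two logadeons, each written as a pair (a, b) for a-:-b."""
    (a, b), (c, d) = first, second
    return (a + c, b + d)


# The four worked examples, as (first, second, stated result).
examples = [
    ((8, 20), (2, 5), (10, 25)),
    ((9, 20), (2, 5), (11, 25)),
    ((100, 50), (30, 7), (130, 57)),
    ((19, 20), (5, 5), (24, 25)),
]
for first, second, stated in examples:
    print(first, second, multiply_logadeons(first, second) == stated)
```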


Let’s switch to multiplying fractions instead.

If our goal is to teach pupils how to multiply together two fractions, then I would argue they first need to understand what we mean by the word ‘fraction,’ otherwise we’re saying meaningless things from their perspective (they also need to understand something of what we mean by the word ‘multiply.’)  This is to say, they need to understand fractions as a concept.  Arguably, they should be able to conceptualise fractions as parts of a whole, as ratios, as representing division of two numbers, and as positions on the number line, all of this, before we try to teach pupils how to perform arithmetic operations on this concept we call ‘fraction.’

Assuming we’re successful in that, we now wish to teach them multiplication.  Traditionally, we might simply explain that we multiply numerators together, and denominators together, and nothing more.  There is a reason that process works, and it can be explained in several ways.  It is also possible to relate the arithmetic process of multiplying fractions to other conceptualisations, such as a visual representation.  Here, we run into problems, though.  I’ve heard people argue that this visual representation offers an explanation, a why for the arithmetic process; it doesn’t.  It offers no proof, therefore no ‘why,’ simply an alternative means of conceptualisation, or to think of it differently, it’s another process for multiplying fractions, one that is significantly less efficient than the arithmetic process.

Visualisation of fraction multiplication
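For comparison, the arithmetic process described above (multiply numerators together, and denominators together) is very short when written out.  The Python sketch below uses a hypothetical function name and example fractions, and checks its results against Python’s built-in Fraction type.

```python
from fractions import Fraction


def multiply_fractions(numerator1, denominator1, numerator2, denominator2):
    """Multiply numerators together, and denominators together (unsimplified)."""
    return numerator1 * numerator2, denominator1 * denominator2


# Hypothetical example: 2/3 multiplied by 4/5
numerator, denominator = multiply_fractions(2, 3, 4, 5)
print(numerator, denominator)

# The built-in type agrees with the result of the procedure.
print(Fraction(numerator, denominator) == Fraction(2, 3) * Fraction(4, 5))
```

The procedure itself is two multiplications; neither the proof nor the visual representation is needed in order to carry it out.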

Now, is the proof necessary?  Are the alternative conceptualisations necessary?  We *must* conclude that no, they are not.  We must conclude that no, they are not, first because pupils can comfortably learn to multiply fractions without them, but more, because there is an almost infinity of proofs and conceptualisations within mathematics that can be related to the things we teach, we cannot possibly teach them all, and we therefore cannot possibly deem them all necessary.

But are they desirable?  Absolutely, and this is an important clarification.

Where something is necessary, such as developing the understanding of a concept before we move on to discuss what we can do with that concept, we have no choice.

Where something is desirable, such as providing proof for a process, or alternative conceptualisations, we have a lot of important decisions to make, and we must make them with careful and deliberate intentionality.

Often, standard processes are simple.  Proofs are relatively complex, as are alternative conceptualisations.  Try proving why the formula for the volume of the pyramid works, for example, which requires calculus.  Things get even messier for the cone, and sphere.  Deriving the quadratic formula is far, far more complex than using it to solve quadratics.  Pedagogically, it’s very difficult to turn these proofs into cognitive work for pupils, as well, making it tricky to fit them meaningfully into lessons (though certainly not impossible.)

In conclusion, there are times when we have to teach the concept first, and we often do not pay close enough attention to this, jumping into processes before the concept is locked in place e.g. vectors, equations, irrational numbers… almost everything really.

There are other times, however, when this is being conflated with proof and alternative conceptualisations.  These should usually come after fluency with processes and algorithms is achieved, since those processes are often simpler than their derivations.  They must also be selected and taught carefully, since we have finite time: which proofs and which alternative conceptualisations will most help pupils develop a relational understanding of mathematics?  And when in a sequence of learning – which can take place over years – is teaching them going to be most successful?



30-:-17  * 4-:-3 = 34-:-20

17-:-0.5 * 9-:-2 = 26-:-2.5

And logadeons are not a thing, but sometimes it’s good to remember what it’s like to learn something without pre-existing expert knowledge.





Never ask pupils a question to which they have not already been told the answer.

Never ask pupils a question to which they have not already been told the answer, unless they know enough that answering the question requires them only inching forwards.

Years ago I wrote on questions and questioning, a seemingly important aspect of teaching.  For anyone interested, here:

28th May 2013

29th May 2013

16th June 2013

15th March 2014

Which is really all to say that despite the irreverence, three years on and the question of questions hasn’t disappeared.

At this point, I would say they are a vitally important part of teaching.  During my training I was told that they were a vitally important part of teaching.  So where did it all go wrong?  Our understanding of the role of questions is flawed.

Consider the following two views:

1 – “Never ask pupils a question to which they have not already been told the answer.”


2 – “Use questions to ‘move pupils’ thinking forward,’ or to give them a chance to ‘apply what they have learned.'”

The current status quo is focused on Point 2, but does it badly, leading to bad results.

Point 1 is immunisation from questions such as:

“What do you think we mean by Globalisation?”

A question I’ve seen posed to a Y12 BTEC class.


“What is a revolution? What revolutions have you heard of? What might be the key features of a revolution?”

Posed to a Y9 class *before* being taught anything about revolutions.

These are both highly typical, but terrible sets of questions.  I’ve discussed them with teachers, and can say that people really, really think that these are appropriate questions to use in educating.

They stem from the view that we need to ‘explore what pupils know,’ or that pupil voice matters. It’s true that prior knowledge is the greatest indicator of future success, and that pupil voice in a lesson can be important, but these kinds of questions:

(a) aren’t the best way of evaluating prior knowledge *and*

(b) given the context it’s probably simpler and more efficient to assume no knowledge, and try simply to anticipate prior knowledge that might interfere with current understanding

With respect to (b), a good example of this for me was in learning about the Carnot Engine in Year 1 Thermodynamics. The lecturer didn’t quite set up the introduction well enough, so when she started talking about this hypothetical heat ‘engine,’ I struggled to dissociate it from the kinds of physical mechanical engines we’re already familiar with in everyday life, like in a car.


A Carnot Engine is absolutely *nothing* like this

So there *is* some important need to estimate the kinds of prior knowledge that might interfere with future conceptualisation, but, again, as in point (a) these kinds of questions are not the best way to do it (I’m not going to go into what is.)

Point 1 above is therefore powerful inoculation against this kind of sloppy thinking – if you’re going to ask pupils a question, make sure you’ve first taught them the answer.

Problem: following this line of reasoning, all Q&A becomes ‘factual regurgitation;’ to use less loaded language, there is no opportunity for pupils to try to generalise or apply what they’ve learnt to novel contexts – this would be a limited form of education.

So Point 2 above *is* necessary; the real question is how do we find the line of demarcation between when Point 1 is valid, and Point 2 becomes valid.

For this, variation theory in mathematics and ideas such as those from the Michel Thomas and Pimsleur language courses become a source of inspiration. These all rely on:

  1. Telling pupils explicit facts
  2. Asking them to recall those facts in response to questions
  3. Then carefully moving them on to something that hasn’t been previously taught
  4. But which is eminently within reach of their minds, given the new knowledge.


Michel Thomas

‘Voy’ means ‘I am going,’ and ‘a’ means ‘to.’

How would you say ‘I am going to?’ (Voy a)


If 2x + 5x = 7x, what is 2y + 5y equal to? What about 5x + 2x? 5x – 2x?

How do we apply this to modify the kinds of bad questions I noted at the top?

History example:

Do lots of work explaining what revolutions are. This can come in many forms, including teacher talk, reading, lists of key features, knowledge organisation, fact systems (see Engelmann), comparative case studies of situations that are and are not considered to have been revolutions.

In terms of questions, we now have two forms:

(1) Having studied the French revolution, ask pupils to explain what made it a ‘revolution’ (‘regurgitation of facts,’ or more precisely, responding to a question with the answer they’ve been taught – recall / testing effect)

Then later

(2) Give them something about the Russian revolution to read, and then ask whether it was a revolution or not. Or, the industrial revolution.

(I’m not saying these are great examples considering the structure and constraints of a real school curriculum and time in class, and my limitations as a history teacher!  I’m just using them in an attempt to exemplify the theory)

Globalisation example:

In this case, preempt in speaking to pupils that they have probably heard the word before, along with some of the things they *might* think it means; explain that there is much more to it and that it has some technical specification beyond how we use it in everyday speech; explain these features as above with revolutions; then go into questions around the fuzzy boundaries ‘Are these a feature of / or caused by Globalisation?’ ‘How will Globalisation impact on that other thing we previously learnt about?’ etc.  The challenge is in ensuring enough knowledge has been previously embedded that these don’t become questions that require ‘guessing,’ but rather require only a small leap in logic.

Perhaps this is a good summarisation:

Never ask pupils a question to which they have not already been told the answer, unless they know enough that answering the question requires them only inching forwards.

This is worlds apart from ‘guess what’s in my head,’ sprawling ‘what do you think,’ or ‘who knows how to do this and can tell everyone else, so I don’t have to’ style questions.

It’s hard.  Really hard, to do this and get it right.  The goal in variation theory and the language courses mentioned is not to ‘explore’ what students think about the language, but to help them connect prior knowledge to new knowledge in a novel context (whilst leveraging the retrieval effect.)

The goal is generalisation, transfer, flexible knowledge.  In these programmes, students should always be able to respond correctly to the questions based upon what they have been taught before, and getting that right is damned difficult; it places all of the responsibility for pupil success viscerally on the shoulders of the teaching.


