Bobby Caples (Education)

Education & Youth Development Consultant


Category: Data & Statistics

New RtI Study: Not What It Seems

The education world seems to be up in arms about a recent federal report indicating that, on the surface, RtI doesn’t work. The report used an interesting statistical model to compare students who received RtI services with students who did not, and found that students who received services either did not benefit or ended up doing worse. On the surface, that appears to be a true claim. Generalizing this finding as evidence that RtI “doesn’t work,” however, simply doesn’t work. Here’s why:

First, the study only compared students right around the cutoff score for receiving services. It did not examine whether RtI was effective for the rest of the students receiving RtI; there’s simply no data to support a conclusion about them. Second, the study found that 40% of students who received RtI services were in schools in which all students received RtI services – not just struggling students.
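The design described here – comparing students just above and just below a cutoff score – is a regression discontinuity design, and its core limitation is easy to see in a few lines of code. This is a minimal sketch with entirely made-up numbers (the cutoff, bandwidth, and score distribution are all hypothetical, not from the federal report):

```python
import random

random.seed(0)

CUTOFF = 40     # hypothetical screening score below which students receive RtI
BANDWIDTH = 5   # only students within +/- 5 points of the cutoff are compared

# Simulate screening scores for a population of students.
scores = [random.gauss(50, 15) for _ in range(10_000)]

# The design only estimates an effect for students near the cutoff;
# it says nothing about students scoring far below it.
near_cutoff = [s for s in scores if abs(s - CUTOFF) <= BANDWIDTH]
share_analyzed = len(near_cutoff) / len(scores)

print(f"Students informing the estimate: {share_analyzed:.0%} of the population")
```

The point the sketch makes concrete: only a modest slice of students ever informs the estimate, so the finding cannot be generalized to the struggling students well below the cutoff.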

More importantly, this report does NOT suggest that RtI failed as a model. It failed, for the specific students in the study, as implemented. Granted, the researchers focused on schools that appeared to implement RtI with fidelity – but fidelity to what? Looking at the practices of some of the schools (e.g., all students receiving RtI services, schools using single assessment scores as the criterion for inclusion), it’s questionable that RtI was implemented with fidelity to best practice.

So, what this report does say is that – for the students studied – RtI didn’t work. It doesn’t say RtI didn’t work for the rest of the students served, nor does it suggest that RtI can’t work. It suggests we’ve got more work to do.

Necessity & sufficiency with educational interventions: The myth of heightened expectation

In education, we’re fans of magic bullets. Single-gender classrooms, personalized learning, standardized/state testing – the list goes on. Education is hard, so when we come across something that works, our optimism kicks in and we want to think it will work really well – so well, in fact, that said strategy will transform education. What inevitably happens is that the strategy in question does not magically fix all of education, so critics come in and argue that it didn’t work, that we’ve been wrong all along, and that if we just do this instead, all will be well. Of course, that doesn’t work either, and so the cycle continues.

So, this may be obvious – we all know the problems with expecting something to be a cure-all. What we don’t catch as often is that we do essentially the same thing in a lot of more specific research scenarios: we expect something to singularly effect change, and conclude that it didn’t work if it didn’t.

Case in point: Head Start. Head Start, in the research community, has an infamous reputation as something that doesn’t work. Any initial gains, researchers note, fade quickly over time, with no real difference between kids who did or did not participate past a certain point. The central point of this post is that I believe this is a faulty conclusion: Just because an independent variable is not powerful enough to affect the dependent variable by itself does not mean the strategy either 1) did not work, or 2) is not necessary.

I’m a fan of car analogies, so let’s go down that path for an example: If a car’s engine is broken and the tires are flat, you have problems. If you fix the engine, the car will still not function correctly. However, it would be incorrect to assume that fixing the engine is unimportant or unnecessary – it just wasn’t enough. Head Start could very well fit into this category – Head Start may not be enough to transform the educational career of a child from an at-risk background, but that form of early educational intervention may very well be necessary.

To be sure, I’m not making the case for Head Start here (nor am I making a case against it, though). My point is simply that discovering something wasn’t enough by itself doesn’t mean we should throw it out.

Let me take a more specific, and slightly more contemporary, example. In this study, Marcus Winters & Jay Greene found that certain educational placement strategies had an initial effect that faded over time. They were studying retention based on state tests, along with assignment to a “more effective” teacher, summer supports, etc. In one intervention condition, they noted that the effect faded over time. Their conclusion? That the intervention condition didn’t work. I disagree – at least with drawing that conclusion solely from the fact that the effect didn’t sustain.

In reality, that educational placement could have been huge – it could have provided the child exactly what s/he needed at the time, and been the difference between failure, at that point in time, and success.

Let me take a step back and give another example, this time in medicine. Let’s say you take a medication that lowers blood pressure, and for 5 years this helps keep your blood pressure down, preventing a heart attack. Then, in year 6, you stop taking that medication and have a heart attack. Could you reasonably conclude that the medication didn’t work in Year 1 because it didn’t provide a benefit in Year 6?

In short, I think we need to be reasonable about what contributions we expect an intervention to make, and not rule it out as a failure simply because it isn’t a panacea, or because it doesn’t live up to some other expectation we might have. If we aren’t going to expect an oil change to last 5 years, and we aren’t going to expect blood pressure medication to last 5 years, why would we expect Head Start to last 5 years? Just because we want it to?

“Pruning” Systems

I love to modify programs. I love to change them, tweak them, grow them, add procedures, add forms, add spreadsheets, etc. This often frustrates (to no end) those I work with, even though what I do is in the spirit of continuous improvement. Procedures & systems, after all, are just as much about efficiency as they are about efficacy. To the extent that they change constantly – even for the better – they hinder efficiency, because the people who use and rely on those systems have to keep re-learning them, and lose fluidity in how they move through the system.

So, if systems and procedures are to change, I think they need to change effectively – in a certain way – and not just because the outcome may be better (i.e., the system is technically better). Rather, the way in which they change for the better is vital if you’re going to keep staff happy, or keep them at all.

First, a disclaimer, since I’m throwing around the term “systems” here: “Systems change” often refers to large-scale (and accompanying small-scale) structural changes to a system (e.g., a new special education model in an elementary school). What I’m talking about in this article may be related, but it’s not the same. I’m talking about procedural systems – not what’s happening, but the way in which it’s happening: the forms, routines, expectations, and protocols users follow to accomplish the same (or similar) tasks within the same large-scale “system” that was in place before. That could be anything from how users navigate the internal web system to how special education processes are documented.

Back to effective procedural systems change, then: There are better and worse ways to do it, and I’d like to spend a few minutes in this article highlighting one particularly helpful element to include in procedural change: pruning.

Do you remember TPS reports from the movie Office Space? If you haven’t seen it, see it – worth it beyond just understanding this analogy. The issues with TPS reports, and all of the office procedures the concept was satirizing, were duplication, redundancy, and over-documentation. The interesting thing is that TPS reports (using the term metaphorically here) don’t actually (always) start off bad. Generally, some need is noticed that requires fixing, so a procedure is developed to address it. The problem is that the procedure is often just tacked onto the existing procedural structure without considering the overall context in which the TPS report is placed – there may already be other forms that address the same issue, for example.

So, over time, the TPS reports pile up. Teachers know them well – many call them “assessments.” All sarcasm aside, assessments are crucial to education, but many teachers have experienced assessments upon assessments tacked on in a seemingly haphazard manner, leading to duplication of assessments, over-testing, etc. On a smaller scale, teachers are asked to document so many things in so many ways that just documenting becomes a major part of the job. (Side note: If you teachers think you have it bad, talk to someone who has to bill Medicaid.)

Needless to say, when too many TPS reports build up, something needs to happen. One such thing that could happen is pruning. If you’ve studied neurobiology (which I haven’t really), you’re probably already familiar with the analogy of “pruning.” I’m not a neurologist, so pardon my paraphrasing here, but the idea is that the brain grows tons of connections over the first years of life, then eventually starts to “prune” away less used or needed connections to promote efficiency.

The same is a vital task within systems. The problem, though, is that pruning isn’t automatic the way it is in your brain. There is no natural system of checks and balances within organizations that triggers pruning once a certain number of TPS reports builds up; leaders and change agents need to do it manually. That said, good systems managers tend to do this naturally – they remove forms, merge forms, simplify procedures, streamline processes, etc. Sometimes the execution is simple, sometimes it’s complex. But the good news is that – at least conceptually – the process is relatively straightforward.

In short, the idea of pruning is simple: Find the easiest, shortest, and most efficient path from Point A to Point B. Period. Here’s an example:

A few years back, I was working for an organization in which we needed to track details of kids’ behavior across various contexts. We wanted to know who kids behaved well for, when they behaved well, in which activities, etc., and exactly which behaviors were happening – in what frequency – each day of the week. We then wanted a way to analyze these data strategically and flexibly, for example being able to run reports to answer specific problem-solving questions generated in our behavioral assessment process. What we ended up doing during the first year was creating an elaborate electronic data collection system in which 5-10 data points were entered for each behavioral incident, then aggregated in Microsoft Access and analyzed via Crystal Reports. The system was, in short, beautiful. It was unlike anything I had ever seen or worked with, and we created it. We were proud, and could give you all kinds of information about all kinds of behavior – crazy levels of detail about behavior you’d never previously been able to have. The problem? It was cumbersome. It took staff forever to enter data, and encouraged them to focus more on data-entry (and problem behavior, since that’s what we were recording) than their actual interactions with kids. In short, we had created a system and procedural structure that was amazingly powerful, but needed pruning.
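To make the shape of that system concrete, here’s a hypothetical sketch of the kind of per-incident record and report it supported. All field names and values are invented for illustration – the real system used Microsoft Access and Crystal Reports, not Python:

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Incident:
    student: str
    staff: str      # who the student was with
    activity: str   # what was happening at the time
    behavior: str   # which behavior occurred
    day: str        # day of the week

# A few invented incidents standing in for a day's data entry.
incidents = [
    Incident("A", "Ms. Lee", "recess", "aggression", "Mon"),
    Incident("A", "Mr. Cho", "math", "off-task", "Tue"),
    Incident("B", "Ms. Lee", "recess", "aggression", "Mon"),
]

# One of the problem-solving questions the reports answered:
# in which activities do incidents cluster?
by_activity = Counter(i.activity for i in incidents)
print(by_activity.most_common(1))  # recess is the most frequent context here
```

The analytic payoff is real – but so is the cost: every one of those fields had to be entered by a staff member for every incident, which is exactly where the pruning described below came in.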

Over the next few months, then, we pruned. We ended up moving to a manual entry system (rather than digital) because we found that our fancy iPods were actually slowing us down. We collected less information about each behavioral incident, because we found that – while pretty cool – we weren’t really using some data as much as we thought we would. We also restructured data entry at the end of the day, as well as data aggregation and reporting afterward. In short, we pruned. We started off building up an elaborate system of procedures that were pretty effective, but not really efficient. We then went through the necessary pruning stages to reduce inefficiencies and duplications, and wound up with something that was not only effective, but usable. Don’t get me wrong – it was still a lot of work, but the work at least seemed commensurate with the result – worth our time.

Our organization was, undoubtedly, still change-prone and unstable (in a good way) because we were focused on continuous improvement. However, our pruning – in my opinion and experience – provided some level of counter-balance against the added systems and procedures we developed over time as well. This led to a net result of changed procedures, not additional ones, and a system that was ultimately usable.


Note: This commentary has been cross-published on

Just a quick observation in response to a recent post on another education blog. Over the past few years, there has been continuous focus on the “over-testing” of students, from curriculum-based assessments such as DIBELS to, of course, full-blown state tests that “count.” A recent blogger felt that, in her experience, the education system had crossed the line into over-testing.

My response is far from novel, but, I believe, an important one: If “over-testing” has indeed occurred, it’s quite easy to pin the blame on all testing. That is, because too many tests have been given, every single one is greeted with spite and animosity – yet another one. Even the recent emphasis on using data in education has come under fire because, at times, data is misused.

If you can’t already see my point, here it is: Data is GOOD. Assessments are GOOD. Sure, too many assessments are bad, and using data badly doesn’t work – but when we attack, let’s stay focused on the over-use and misuse. Instead, we’re seeing more and more posts condemning it all – tests, assessments, data.

Question: If we add another test, will you view it as inevitably bad? Is there no chance that it could be effective? If we integrate data in a new way into our educational framework, is it bound to overburden, or does it stand the chance of helping?

It’s easy to get frustrated with “too much” and start levying the blame on each individual component, rather than remembering that it isn’t every particular assessment measure that’s broken, but a flaw in the overall system in which those individual components are implemented.

Fixing Indicators

Linking to another article I recently published on another site: Do enjoy!


© 2018 Bobby Caples (Education). All rights reserved.
