Online Trust Alliance (OTA) Executive Director and President Craig Spiezle testified today before the U.S. Senate’s Homeland Security and Governmental Affairs Permanent Subcommittee on Investigations, outlining the risks of malicious advertising, and possible solutions to stem the rising tide.
“Today, companies have little, if any, incentive to disclose their role in or knowledge of a security event, leaving consumers vulnerable and unprotected for potentially months or years, during which time untold amounts of damage can occur,” said Spiezle. “Failure to address these threats suggests the need for legislation not unlike state data breach laws, requiring mandatory notification, data sharing and remediation for those who have been harmed.”
It is important to recognize there is no absolute defense against a determined criminal. At the hearing, OTA proposed incentives to companies who adopt best practices and comply with codes of conduct.
Spiezle emphasized that these companies “should be afforded protection from regulatory oversight as well as frivolous lawsuits. Perceived anti-trust and privacy issues must be resolved to facilitate data sharing to aid in fraud detection and forensics.”
When education investors talk about so-called adaptive learning, in which a computer tailors instructional software personally for each student, the name Knewton invariably surfaces. The ed-tech start-up began five years ago as an online test prep service. But it transformed the personalization technology it uses for test prep classes into a “recommendations” engine that any software publisher or educational institution can use. Today the New York City company boasts it can teach you almost any subject better and faster than a traditional class can. At the end of 2012, 500,000 students were using its platform. By the end of this year, the company estimates it will be more than 5 million. By next year, 15 million students. Most users will be unaware that Knewton’s big data machine is the hidden engine inside the online courses provided by Pearson or Houghton Mifflin Harcourt or directly by a school, such as Arizona State University or the University of Alabama.
The Hechinger Report talked with David Kuntz, Knewton’s vice president of research, to understand how the company’s adaptive learning system works. Kuntz hails from the testing industry. He previously worked at Educational Testing Service (ETS), which makes the GRE and administers the SAT for The College Board. Before that, Kuntz worked for the company that makes the LSAT law school exam.
(Edited for length and clarity)
Question: On your company’s home page, there’s a McDonald’s-style counter that says you’ve served more than 276 million recommendations to students. What exactly are they? Are they like a book recommendation on Amazon?
Answer: It’s not like book recommendations on Amazon. Amazon’s goal is for you to buy the book. The goals that are driving our recommendations are the big things you need to learn. This recommendation is just one piece along the way for you to get there.
The question our machine is trying to answer is, of all of the content that’s available to me in the system, what’s the best thing to teach you next that maximizes the probability of you understanding the big things that you need to know? What’s best next?
It’s not just what you should learn next, but how you should learn it. Depending on your learning style, it might be best to introduce linear equations through a visual, geometric approach, where you plot the lines and show the intersection. Others might respond better to an algebraic introduction.
Q: How are the recommendations “served”?
A: That depends on how our partner [the developer of the educational application, such as Pearson or University of Arizona] designs its online course.
Sometimes, the recommendations drive the whole course experience. The student comes in, signs on, and the system will say to them, “Let’s work on this.” And they work on this. There’s formative feedback all the way through. And then the machine picks the next lesson based on how the student did in that lesson. “Now, let’s work on this other thing.”
In other cases, it may be a study aid sidebar, “Okay, you’ve just completed your assignment, and didn’t do as well as you might have liked. Here’s something you should do now that will help improve that.”
It can be tailored remediation, or the full scope and sequence of the course or a blend of those.
Q: Bring us inside your black box. How have you programmed your adaptive learning platform to come up with these recommendations?
A: We have a series of mathematical models that each try to understand a different dimension of how a student learns. We model engagement, boredom, frustration, proficiency, the extent to which a student knows or doesn’t know a particular topic. Imagine three dozen models.
Take proficiency. We use an IRT, or Item Response Theory Model, which is commonly used in testing. It estimates the probability that a student is able to do something based on an answer to a particular question.
The data gets filtered through all those models in order to come up with a recommendation in real time.
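The IRT proficiency model Kuntz mentions can be sketched in a few lines. Below is a generic two-parameter logistic (2PL) IRT model with a simple grid-based Bayesian update of the ability estimate — an illustration of the general technique, not Knewton’s actual implementation; every name, parameter, and value is an assumption.

```python
import math

def irt_probability(theta, difficulty, discrimination=1.0):
    """2PL IRT: probability that a student with ability `theta`
    answers an item of the given difficulty correctly."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))

def update_ability(theta_grid, prior, item, correct):
    """Grid-based Bayesian update of the ability distribution
    after observing one response (correct=True/False)."""
    posterior = []
    for theta, p in zip(theta_grid, prior):
        likelihood = irt_probability(theta, item["difficulty"],
                                     item["discrimination"])
        if not correct:
            likelihood = 1.0 - likelihood
        posterior.append(p * likelihood)
    total = sum(posterior)
    return [p / total for p in posterior]

# Hypothetical usage: ability grid from -3 to 3, uniform prior.
grid = [i / 10 for i in range(-30, 31)]
prior = [1.0 / len(grid)] * len(grid)
item = {"difficulty": 0.5, "discrimination": 1.2}
posterior = update_ability(grid, prior, item, correct=True)

# A correct answer shifts probability mass toward higher ability.
mean_before = sum(t * p for t, p in zip(grid, prior))
mean_after = sum(t * p for t, p in zip(grid, posterior))
```

In a real system, an estimate like this would be one input among the “three dozen models” Kuntz describes, combined with engagement and other signals before a recommendation is made.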
Q: Where does the data come from?
A: We know nothing about a student until he logs on. But then we can have the full click stream. That’s every mouse movement, every key press. If they click on a link to review part of a chapter and then they scroll down and scroll back up a couple times, those are things we can know. If they highlight stuff on the page, if they ask for a hint, if they select option choice C and then change their mind 4 seconds later and select option choice D, those are all pieces of information we can know. That’s part of that click stream.
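The click stream Kuntz describes is essentially a sequence of timestamped, anonymous events. A minimal sketch of how such events might be recorded, using the answer-change example above — the schema and field names are hypothetical, not Knewton’s:

```python
from dataclasses import dataclass, field
import time

@dataclass
class ClickEvent:
    """One raw click-stream event; fields are illustrative only."""
    student_id: str   # anonymous identifier, no personal data
    event_type: str   # e.g. "scroll", "highlight", "hint_request"
    payload: dict = field(default_factory=dict)
    timestamp: float = field(default_factory=time.time)

# The answer-change behavior described above, as two events:
events = [
    ClickEvent("anon-1", "answer_select", {"choice": "C"}, timestamp=100.0),
    ClickEvent("anon-1", "answer_change", {"from": "C", "to": "D"},
               timestamp=104.0),
]
hesitation = events[1].timestamp - events[0].timestamp  # 4 seconds
```

Signals like that four-second hesitation are what downstream models (proficiency, engagement, frustration) would consume.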
Q: I’m told this is “big data.” How much data are we talking about?
A: It’s a ton. The storage component of that data is the largest portion of our Amazon Web Services bill (laughs). It’s fair to say that we have more than a million data points for each student who’s taking a Knewton platform course for a semester. That’s just the raw click stream data. After the raw data goes through our models, there’s exponentially more data that we’ve produced.
Q: There’s a lot of concern by parents and policymakers about how companies are exploiting and safeguarding private student data. How do you keep it private?
A: We don’t know anything personal about the student at all. We don’t know their name. We don’t know their gender. We don’t know where they live. No demographic information whatsoever. All we know is that this is a student in this particular course.
Q: Can all subjects be taught through an adaptive learning platform? Or is it best for math?
A: We love math. Math is great because it has a rich, deep structure to it. The concepts build upon one another. Physics and chemistry are similar to math that way.
Biology has a totally different structure. It’s more about different clusters of things, connected by crosswalks.
Often, it’s less about the subject than the goals of the course.
Take freshman philosophy. If it’s a survey course of great ideas and the evolution of those great ideas, it may start with the Greek philosophers and run all the way up to René Descartes (“I think therefore I am”), up through twentieth-century Western European thought. You talk about logical positivism, and then post-positivist philosophy…
Q: (As Kuntz rattles this off, I can’t help but multitask and Google his bio. Yep, he was a philosophy student at Brown.)
A: In that case, there really is an evolution to those ideas that can be described in a knowledge graph. And our models can recommend content on this knowledge graph for students to learn.
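The knowledge graph Kuntz describes can be sketched as a prerequisite graph: recommend any topic whose prerequisites are all mastered but which the student has not yet mastered. The graph below uses the philosophy survey’s progression purely as illustrative content; the structure, names, and function are assumptions, not Knewton’s model.

```python
# Prerequisite graph: topic -> topics that should be mastered first.
# Content is illustrative, drawn from the survey-course example.
PREREQS = {
    "greek_philosophy": [],
    "rationalism_descartes": ["greek_philosophy"],
    "logical_positivism": ["rationalism_descartes"],
    "post_positivism": ["logical_positivism"],
}

def recommend_next(mastered):
    """Return topics whose prerequisites are all mastered
    but which the student has not yet mastered."""
    mastered = set(mastered)
    return [
        topic for topic, prereqs in PREREQS.items()
        if topic not in mastered and all(p in mastered for p in prereqs)
    ]
```

For example, a student who has mastered only `greek_philosophy` would be recommended `rationalism_descartes` next. A production engine would weight candidates by the proficiency and engagement models rather than returning all eligible topics.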
But the other kind of freshman philosophy course is less about the subject of philosophy itself. It still exposes students to some of the great ideas of the survey course, but the goal is to use those great ideas as a spur to promote creative critical analysis and discussion. In that case, most of the interaction and most of the evaluation takes place in class discussions and in written papers.
For a freshman philosophy course that is focused around teaching students how to think on their feet, come up with counter examples rapidly, and interact with other students in an engaging and intelligent way, our approach may not work as well.
Q: How does your machine decide whether to focus on a student’s weaknesses or to go deeper into an area that a student is really interested in?
A: Student engagement and interest – those are factors we try to take into account. We try to balance areas where a student is having problems with areas that a student is really interested in.
If our partner [such as Pearson] has enabled direct expression of a student’s interest, we can take that data and incorporate that into the process of making a recommendation.
Q: Does the data know better than an experienced teacher’s wisdom? Does the Knewton machine ever recommend something that runs completely counter to what a veteran teacher would do?
A: One day we’ll have some really good answers to that question. What we have seen, in some cases, is that the engine has made recommendations that teachers have found surprising. But pleasantly so. Something they hadn’t anticipated that the student would need. But when presented with it, the teacher recognized that it was something good for the student to be doing.
Q: This is hard to understand.
A: It requires putting aside a lot of the things we take for granted because we grew up and were educated in the current system. We think about things in terms of syllabi and chapters. It’s hard to step back from that. Are there better and different ways that we can organize and present content?