Data Series: Ensemble Learning for Healthcare

Oct 5

“I hate Benchmarking! Benchmarking is Stupid! Why is it stupid? Because we pick the current industry leader and then we launch a five-year program, the goal of which is to be as good as whoever was best five years ago, five years from now. Which to me is not an Olympian aspiration.”
- Tom Peters, Innovation is Actually Easy

There is an unintentionally harmful mindset in healthcare, and still prevalent in other industries, that there should be one right way of doing things. One right way to do an operating room checklist, one right way to do nursing assessment and documentation, etc. And that there are static best practices that should be learned and applied to all organizations in each local environment. Data science principles demonstrate that this is both harmful to patients and to the people caring for patients. This kind of thinking tends to comfort some of the organization’s leadership, but also harms the organization over the long-term and squelches innovation.

In recent blog posts, I’ve described the need to move away from fragmented silos and build small, diverse teamsaround whole definable patient processes. I’ve demonstrated that one of the core principles of systems and data science is to decentralize data into the context of each whole definable patient process in each local clinical environment. In this blog post, I want to explain one of data sciences most powerful principles – the Ensemble Model for Learning. From it, we realize that we are all connected and that we literally need each other if we want to have a sustainable healthcare system. It demonstrates the need to not only decentralize data, but to decentralize learning and innovation.

In the reductionist science model for learning, there are mythical static “truths” that can only be discovered by the best and the brightest researchers, at the most esteemed institutions, funded with millions of dollars to perform rigorously controlled research studies. But this science is not valid. In our real biologic world, change is constant and biologic variability is not controllable. The principles of systems and data science are valid in the real world and are all about measurement and improvement, relying on decentralized learning systems in each local clinical environment.

In previous blog posts, I’ve also described how much faster we can learn with a systems and data science infrastructure in place and how much our world suffers when we don’t have this infrastructure in our global healthcare system, especially during a pandemic. I’ve explained how data analytics should be used in healthcare to identify patient subpopulations and match them with the optimal treatment, because there is no one-size-fits-alltreatment. The same drug, device or diagnostics tool that might benefit one group of people, will contribute to unintended harm for another group and will be wasteful in yet another group of people. There is nothing in healthcare that is “safe and effective” for all people – that is just not reality.

The Ensemble Model for Learning has mostly been applied as a machine learning tool and has been demonstrated in the programming of IBM Watson to compete against humans on Jeopardy and in a competition called Kaggle.

I previously wrote that IBM Watson’s programming team needed to program the computer to understand the “context” of Jeopardy questions and answers, but they also needed to program the computer to understand the English language used in the context of playing Jeopardy. Ultimately, a team using trial and error over a period of years developed an approach they called DeepQA that demonstrated ensemble learning. They used over 100 different natural language processing programs, combining them, to allow the computer to understand and answer the questions correctly most of the time, and usually faster than the human competitors.

The winners of a data science competition called Kaggle are teams that regularly use ensemble learning by combining their algorithms. For example, one contest was called the Netflix Prize. The winner had to improve a Netflix algorithm by 10%. It took three years, but the winners figured out that combining their algorithms would likely be more successful than attempting to continue to improve their own single algorithm. In September 2008, the team from AT&T Research (Bellkor) reached out to a small start-up team from Austria (Big Chaos), and formed an alliance, Bellkor in Big Chaos. Although this collaboration led to winning the annual progress prize of $50,000, they didn’t achieve the Grand Prize. So, this collaborative team reached out to another team, Pragmatic Theory, and became a team of three teams, Bellkor’s Pragmatic Chaos. On June 26, 2009, they broke through the 10% barrier and won the $1 million Grand Prize. The second-place team achieved the 10% barrier only ten minutes later. They were also a team of teams, called “The Ensemble.”

If we don’t collaborate to solve complex problems that we see in healthcare, the goal of a sustainable healthcare system will continue to be elusive. The ensemble learning concepts have been described by the Google AI team as Federated Learning and Federated Analytics. A book by Frederick Vester has described these concepts as Network Thinking.

There are significant benefits of this model for learning. By combining the learning from many small teams and networking that knowledge (ensemble learning), the complexity is reduced. And because the raw data remains in each local environment, the privacy and security of the data is maintained. There is even one published example of ensemble learning in healthcare, used to accurately predict the diagnosis of Type-2 diabetes and maintain privacy.

For healthcare to be sustainable, we will need to learn and apply the principles of systems and data science I’ve described in this blog the past seven weeks. We’ll also need to change our thinking and break free from our lower-brain mindset that craves certainty and control. The concepts of certainty and control are harmful illusions. The higher brain can overcome this reductionist thinking and allow our human potential for empathy, creativity, discovery, and innovation to be unleashed. The concept of one static “best practice” is an oxymoron. We’re all in the global healthcare system together and we need each other, learning together and networking those learnings and algorithms if we want a sustainable system to emerge. Who doesn’t want this? Why wouldn’t they?

For over 25 years, I’ve had the privilege of traveling around the world to teach and learn from others in healthcare. My first international trip was to Moscow, Russia in 1994, where I performed the first laparoscopic procedure in the largest cancer hospital in Eastern Europe. My last international trip was to Nanjing, China in October 2019, just before COVID-19 was discovered in Wuhan. I can assure you that the healthcare professionals–doctors, nurses, hospital administrators, etc.–I’ve met do not agree with the divisiveness that we see in the media and in politics. We know that we need each other if we are to successfully learn together and improve our global healthcare system. If we continue to focus on what divides us, especially when we focus on insignificant things like skin color or job title, this has the potential to destroy us. There are much more significant core values that should unite us, starting with the need for health and healing in our world. This will only happen if we learn to change our thinking. In the next series for the Transforming Healthcare blog, I’ll describe the higher-brain mindset that can lead to positive change and sustainable improvement for our world.

Data SeriesEnsemble Model for LearningMindset Shift

Bruce Ramshaw

Data Series: Ensemble Learning for Healthcare

Lower Brain, Higher Brain

Data Series: Human-Computing Symbiosis in Healthcare

Home About Blog Contact