Earlier this year, I completed the ‘Artificial Intelligence in Health Care’ course at the MIT Sloan School of Management. I found it fascinating, thought-provoking and very worthwhile. Specifically, it took a subject I was generally interested in and helped me focus on the issues most relevant not only to my practice but to the future of NHS health care overall.
In this series of blog posts, I am repurposing my written assignments as articles so colleagues in the NHS and globally can share in that thought process. You will see that, at this stage, there are more questions than answers, but hopefully together we can work to move the theoretical towards the practical.
How AI can work with the Quality and Outcomes Framework to improve the management of chronic health conditions within the NHS
I have been working as a primary care physician in the UK since 2010. The sector manages acute and chronic conditions; the latter include hypertension, atrial fibrillation, ischaemic heart disease, chronic kidney disease, dementia, learning difficulties, hypothyroidism and stroke, among others. This work is organised under a national framework known as the Quality and Outcomes Framework (QOF), which runs in annual cycles from April to April. It relies on guidance from the National Health Service (NHS) and the National Institute for Health and Care Excellence (NICE). NICE produces guidelines that draw on clinical trial data and feedback loops from various segments of the healthcare system.
Payment is made according to targets set under this framework. QOF relies on myriad sources of incoming data, such as bloodwork results, letters from specialist clinics, and inputs from patients and clinical staff. All of this generates a huge volume of data that requires significant effort and workforce to process and then translate into activities and patient-facing encounters. I propose that QOF is an excellent use case for AI: a model could harness the vast data the framework generates and, in doing so (a rough sketch of this kind of processing follows the list below):
Recognize patterns contained in incoming data that would likely be missed by human operators.
Homogenize incoming data to aid clinical decision-making; this would likely lead to precision-medicine outcomes.
Improve the quality of chronic disease management outcomes and reduce avoidable hospital admissions.
Allow for better allocation of workforce and financial resources.
Enhance planning for future cycles.
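To make this more concrete, here is a minimal sketch (in Python) of the kind of routine processing an AI-assisted QOF workflow would need to automate: consolidating blood pressure readings that arrive from different sources and flagging patients whose latest reading breaches a target and who have had no recent review. Everything here is illustrative; the data structures, the 140/90 threshold and the twelve-month review window are assumptions made for the purpose of the example, not QOF rules.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Illustrative limits only; real QOF indicators vary by age group and change between annual cycles.
SYSTOLIC_LIMIT, DIASTOLIC_LIMIT = 140, 90
REVIEW_WINDOW = timedelta(days=365)

@dataclass
class Reading:
    patient_id: str
    source: str          # e.g. "home_monitor", "clinic", "hospital_letter"
    systolic: int
    diastolic: int
    taken_on: date

def flag_for_review(readings: list[Reading], last_reviews: dict[str, date], today: date) -> set[str]:
    """Return patient IDs whose most recent reading breaches the limits
    and who have no recorded review within the last year."""
    latest: dict[str, Reading] = {}
    for r in readings:
        if r.patient_id not in latest or r.taken_on > latest[r.patient_id].taken_on:
            latest[r.patient_id] = r

    flagged = set()
    for pid, r in latest.items():
        above_target = r.systolic >= SYSTOLIC_LIMIT or r.diastolic >= DIASTOLIC_LIMIT
        reviewed_recently = pid in last_reviews and (today - last_reviews[pid]) <= REVIEW_WINDOW
        if above_target and not reviewed_recently:
            flagged.add(pid)
    return flagged

if __name__ == "__main__":
    readings = [
        Reading("p1", "home_monitor", 152, 94, date(2024, 2, 1)),
        Reading("p1", "clinic", 138, 86, date(2023, 11, 3)),
        Reading("p2", "clinic", 128, 78, date(2024, 1, 20)),
    ]
    print(flag_for_review(readings, {"p2": date(2023, 9, 1)}, today=date(2024, 3, 1)))  # {'p1'}
```

In practice, the value of an AI model would come from doing this across every register and data stream at once, and from spotting patterns (such as readings that look implausible for a given patient) that simple threshold rules would miss.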
To gauge the suitability of AI in the QOF use case, I would propose the following questions:
1. Is the quality of data generated by QOF high enough to use in AI training? As stated above, QOF-generated data is a combination of multiple streams of incoming data. The human element is dominant, with significant variation in skill levels among those processing the data. One example is that the data in specialist clinic letters is coded manually by human operators using the SNOMED CT coding system rather than by an NLP model. Thus, the quality of data used to train the AI model could be affected, which in turn could reduce the quality of its output and undermine its area-under-the-curve (AUC) score. Similarly, other sources of data, such as patient inputs (blood pressure, pulse rate, etc.) and pathology results, can suffer due to the human factor.
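As a rough illustration of the gap between manual coding and even a basic automated approach, the sketch below uses simple keyword matching as a stand-in for a genuine NLP pipeline. The term-to-concept map is tiny and purely illustrative, and the SNOMED CT concept IDs shown are placeholders that would need to be verified against the official terminology; a production system would use a trained clinical NLP model with human confirmation of every suggested code.

```python
import re

# Placeholder term-to-concept map. A real pipeline would use a trained clinical NLP model
# and the full SNOMED CT terminology, not a hand-written dictionary; codes shown are illustrative.
TERM_TO_SNOMED = {
    "hypertension": "38341003",
    "atrial fibrillation": "49436004",
    "chronic kidney disease": "709044004",
}

def suggest_codes(letter_text: str) -> list[tuple[str, str]]:
    """Return (term, concept_id) pairs found in a clinic letter.
    A human coder would still confirm each suggestion."""
    text = letter_text.lower()
    return [
        (term, concept_id)
        for term, concept_id in TERM_TO_SNOMED.items()
        if re.search(r"\b" + re.escape(term) + r"\b", text)
    ]

letter = (
    "Thank you for referring this patient with long-standing hypertension. "
    "A new diagnosis of atrial fibrillation was made during admission."
)
print(suggest_codes(letter))
# [('hypertension', '38341003'), ('atrial fibrillation', '49436004')]
```

Even a crude sketch like this makes the point about data quality: whatever the model learns downstream is only as good as the coding decisions, human or automated, made at this step.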
2. Who is responsible for AI training, regulation and protection against bias? In the British healthcare system, there are dividing lines between its various segments (primary care, secondary care, social care and political decision-makers). The flow of data between these segments is not always smooth or timely, and it suffers from delays, conflicting goals and competing agendas. Although the ultimate owner of the healthcare data is the NHS, that data is all too often subject to heavy political and financial influence. It would therefore be important to introduce a mechanism, or an impartial entity, responsible for establishing the data sets and feature vectors used in AI training. Will the same entity be responsible for building guardrails into the model against bias and distributional shift? Or should a different body be charged with oversight and regulation?
3. Will the AI model be able to help address regional variations and inequality in chronic conditions management? “Postcode lottery” is a phrase used in the UK to describe the unfortunate inequality of access to healthcare management and resources. This is demonstrated by the wide regional variation in the numbers of patients on waiting lists for surgical and medical treatment. The current national figure stands at approximately 7.3 million. In some cases, patients can wait up to nine months before receiving treatment, and waiting times vary between nations, regions and counties across the UK. Can an AI model help identify the underlying reasons for these variations? Will it suffer from problems with reporting accuracy or distributional shift?
4. Who should be auditing the AI model's outcomes, and how regularly? Current AI technology has limitations, such as imperfect reporting accuracy and a reliance on hardware that may become obsolete as algorithms and data sets grow beyond its capacity. Given this, how do we determine how frequently the outcomes reported by the AI model should be audited? It is crucial that this question is continually posed, and the answer adjusted, as large language models (LLMs) and deep learning develop. Do we rely on AI to audit AI if and when AI exceeds our human ability to audit and assess its output?
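To show what a routine audit might involve in practice, the sketch below samples records, compares the model's suggested codes against clinician-verified codes, and reports the rate of exact agreement so that a drop below a threshold could trigger a human-led review. The field names and the 90% threshold are assumptions made for the example, not an agreed NHS or QOF standard.

```python
# Minimal audit sketch: compare model-suggested codes with clinician-verified codes
# on a sample of records and report exact agreement. The threshold is illustrative only.
AGREEMENT_THRESHOLD = 0.90

def audit_agreement(samples: list[dict]) -> float:
    """Each sample holds 'model_codes' and 'clinician_codes' as sets of concept IDs."""
    if not samples:
        return 0.0
    exact_matches = sum(1 for s in samples if s["model_codes"] == s["clinician_codes"])
    return exact_matches / len(samples)

audit_sample = [
    {"model_codes": {"38341003"}, "clinician_codes": {"38341003"}},
    {"model_codes": {"49436004"}, "clinician_codes": {"49436004", "38341003"}},  # model missed a code
    {"model_codes": {"38341003"}, "clinician_codes": {"38341003"}},
]

rate = audit_agreement(audit_sample)
print(f"Exact agreement: {rate:.0%}")
if rate < AGREEMENT_THRESHOLD:
    print("Below threshold - trigger a human-led review of the model and its training data.")
```

How often such an audit should run, and who owns the threshold, are exactly the governance questions raised above; the code is the easy part.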
These questions are obviously just the start of a much larger and very exciting conversation as we advance towards the inevitable integration of AI into our health care systems. In the next blog post, I will continue my deep dive into the NHS case study, with a focus on disease diagnosis and patient monitoring. I hope you’ll join me.