S3-E14.2 – Can We Manage Statistical Error in Ballooned Hepatocyte Analysis Better?

This Episode is Sponsored By HistoIndex: What Can We Learn about the Sources of Statistical Error We Create in Ballooned Hepatocyte Analysis and How Can We Manage It Better?

One major discussion at NASH-TAG this year was about the inconsistency in ballooned hepatocyte identification and how this inconsistency inflates screen fail rates and possibly placebo response across studies.

This conversation is part of a thorough exploration of this issue. It starts with Mazen Noureddin raising two questions about the entire subject of ballooned hepatocyte scoring in NAS assessment: should we use it at all, and if we should, should we move immediately to AI as the key to analysis?

Quentin provides nuanced answers to both questions. On the issue of ballooned hepatocytes, he notes that these were originally intended to characterize individual patients, not to create semi-quantitative scores. Today, he notes, we are asking far more of ballooned hepatocyte assessment than it was designed to do. On the issue of AI, Quentin notes that he is careful to use the phrase “AI-assisted technology” because we still need pathologists to confirm that what the patient has is, in fact, NAFLD rather than, for example, autoimmune hepatitis. Once that is established, we can ask AI to provide a more quantitatively consistent assessment.

Jörn Schattenberg begins his comments by noting and agreeing with the idea that we are asking more of ballooned hepatocyte assessment than it was designed to do. He proceeds to ask whether we can augment hepatocyte analysis with a liquid biomarker or with a different stain.

Quentin suggests that the solution will not lie in stains. The idea of liquid biomarkers is more promising, but we will first need to reduce error in the assessment and only then focus our attention on biomarkers (he mentions NIS-4) that were designed to assess hepatocytes.

Roger Green finishes this conversation by noting that the phrase “semi-quantitative” itself invites analytical error. We power studies assuming that the important error is statistical and can be resolved by sample size, whereas the bigger challenge is in the qualitative assessment of the slides. If qualitative variability is dramatic, as it is here, it will dwarf statistical error and mean that all samples are underpowered. Roger concludes by asking whether we should forestall major shifts until we understand how much we can reduce analytical error through AI assistance. Quentin concurs.
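
To make Roger's point about power concrete, here is a rough sketch rather than anything presented in the episode. It assumes a simple two-arm comparison of means, a normal approximation, and the classical attenuation model in which reader variability shrinks the observable standardized effect by the square root of scoring reliability. The function name n_per_arm, the effect size of 0.5 and the reliability values are all hypothetical choices made for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_arm(true_effect, reliability=1.0, alpha=0.05, power=0.80):
    """Approximate per-arm sample size for a two-sided, two-arm comparison.

    true_effect: standardized difference we would see with error-free scoring
    reliability: share of observed score variance that is real signal (0 to 1);
                 an illustrative stand-in for agreement between readers
    """
    if not 0 < reliability <= 1:
        raise ValueError("reliability must be in (0, 1]")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)
    # Assumption: measurement noise attenuates the effect by sqrt(reliability).
    observed_effect = true_effect * sqrt(reliability)
    return ceil(2 * (z_alpha + z_beta) ** 2 / observed_effect ** 2)


if __name__ == "__main__":
    # Hypothetical reliabilities, from perfect scoring down to poor agreement.
    for r in (1.0, 0.7, 0.5, 0.3):
        print(f"reliability={r:.1f}  patients per arm={n_per_arm(0.5, r)}")
```

Under these assumptions the required sample size grows in proportion to 1 divided by reliability, so a study powered as though scoring were error-free falls short as soon as readers disagree. The sketch covers only random reader noise; systematic differences between readers or sites would need a different model.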

The episode and this conversation are sponsored by HistoIndex. Conversation 14.5 is a discussion of how artificial intelligence driven assistive technology can improve the consistency of ballooned hepatocyte scoring in advanced fibrosis and support development of robust outcomes for fibrosis studies.