
Making a model: Part 2 - achieving calibration criteria


Now we have a nice set of calibration criteria. How do we go about fulfilling them? What I usually saw or heard being used, and what was a natural starting point for me as well, is what I’d call a “manual approach”. This involves a lot of plotting of the state variables of a model-to-be-improved, poking the model with various protocols, and long nights spent staring at the screen until it becomes apparent how to improve the model so that it fulfills a calibration criterion it didn’t fulfill previously. A crucial and infinitely annoying complication of such manual approaches is that to improve the model, you usually first have to make it worse than where you started.

Example: consider model X, which behaves very nicely in most features of interest (perfect action potential and calcium transient shape, etc.) except, say, feature F. You ultimately find that this is partially due to a typo in the code: compared to experimental data, current C should have an order of magnitude higher conductance [1]. Very happy about this finding, you plug the data-driven improvement into the model… and of course, the perfect action potential and calcium transient break down as a direct consequence of your improvement, even though you may (or may not) have fixed feature F.

Unfortunately, this is pretty much how it is – almost always. Models are constructed in such a way that when they contain a problem [2], they also need to have other properties broken (compared to physiology/nature) to compensate for the problem and allow the model to behave consistently with living cells in the main features. Obviously, when you fix the original problem, the compensatory factors start messing things up. But how do you find them? There is no reason why it should be a single isolated problem that you can easily pin down – it can be a matter of the balance of ionic currents (that’s the good case), but it may also be that there are structural problems or issues with the formulations of various ionic currents. If the latter, which currents have structural problems? What are the problems? How do you fix them without violating the currents’ agreement with data? Or is it that some of the data on which a current is built are problematic? Also, if we accept that in order to make the model better, we first need to make it worse, what do we try in order to make the model better? There are infinite possibilities with regard to making a model worse, so where do we start? Parts 4 and 5 of this series go over some concrete and relatively systematic approaches to manual model development. However, to give a minor spoiler – this can take quite a long time. Can’t we do it differently?

… Of course, that was a rhetorical question. We can. What if we specified the changeable parameters of our model (e.g., ionic current conductances), formulated the calibration criteria as a function which may be optimized, and used an automatic optimizer to find the best model parameters for the given criteria? That’s what I’d term an “automated approach”. In the context of the previous paragraphs, automated approaches have one very desirable property – they test and search models much more quickly than a human can. Some of them may also apply many changes to the model at once, potentially completely bypassing the problem of having to make the model worse first. It may simply be that among the many models considered in an automated model development, there will eventually be one which not only contains the “correct” improvement, but also removes [3] the compensatory means of the original model. There are at least two caveats to the automated approach to model development. First, it may not give you any solution which fulfills the calibration criteria well. Second, it may give you a solution which fulfills the calibration criteria very well, but is completely useless.
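To make the idea concrete, here is a deliberately toy sketch in Python. Everything in it – the surrogate “model”, the target values, and the plain random search standing in for a real optimizer (in practice you’d use something like a genetic algorithm, and you’d run an actual cell model) – is made up purely for illustration:

```python
import random

random.seed(1)

# Toy surrogate "model": two scaling factors multiply baseline
# conductances, and the model's "features" depend on them.
# (Entirely made-up relationships, just to have something to optimize.)
def run_model(scale_gKr, scale_gCaL):
    apd = 270.0 / scale_gKr      # AP duration shortens as gKr grows
    cat_amp = 0.35 * scale_gCaL  # CaT amplitude grows with gCaL
    return apd, cat_amp

# Calibration criteria expressed as a single cost (lower = better).
TARGET_APD_MS = 300.0
TARGET_CAT_AMP = 0.5

def cost(params):
    apd, cat = run_model(*params)
    return ((apd - TARGET_APD_MS) / TARGET_APD_MS) ** 2 \
         + ((cat - TARGET_CAT_AMP) / TARGET_CAT_AMP) ** 2

# Minimal random-search "optimizer" over bounded scaling factors.
best, best_cost = None, float("inf")
for _ in range(5000):
    candidate = [random.uniform(0.2, 5.0), random.uniform(0.2, 5.0)]
    c = cost(candidate)
    if c < best_cost:
        best, best_cost = candidate, c

print(best, best_cost)
```

The point is only the shape of the setup: changeable parameters with bounds, criteria folded into one function, and a search loop that evaluates candidate models far faster than any human could.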

In an early phase of ToR-ORd development, when I was testing various approaches to automated model optimization, I used criteria of goodness (“fitness” of a solution) based on AP morphology (measured at 10 points in time on the reference median trace of experimental data, starting around 20 ms, as I didn’t want to constrain the peak too much initially) and on calcium transient amplitude. Both criteria were fitted super-well, with almost no error, by an automatic optimizer. Quick victory? Let’s check the membrane potential of the super-good cell over time…
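For the curious, a fitness of this kind might look roughly like the sketch below. This is a reconstruction in Python from the description above, not the actual code – the sample times, the weight, and the trace values are all made up:

```python
# Hypothetical reconstruction of the early fitness: Vm compared to a
# reference median trace at 10 time points from ~20 ms on, plus the
# calcium transient amplitude. All numbers are illustrative.

SAMPLE_TIMES_MS = [20, 50, 80, 110, 140, 170, 200, 230, 260, 290]

def ap_morphology_error(vm_model, vm_reference):
    """Sum of squared Vm differences (mV) at the sampled time points.
    vm_* map time in ms -> membrane potential in mV."""
    return sum((vm_model[t] - vm_reference[t]) ** 2 for t in SAMPLE_TIMES_MS)

def fitness(vm_model, vm_reference, cat_amp_model, cat_amp_ref, w_cat=100.0):
    """Lower is better; w_cat is an arbitrary weighting."""
    return ap_morphology_error(vm_model, vm_reference) \
         + w_cat * (cat_amp_model - cat_amp_ref) ** 2

# The pitfall: nothing before 20 ms ever enters the fitness, so a trace
# that spikes and dips wildly early on scores just as well as one with a
# physiological upstroke.
ref = {t: 20.0 - 0.3 * t for t in SAMPLE_TIMES_MS}  # crude repolarising ramp
weird = dict(ref)
weird[5] = 25.0    # unphysiological excursion before the first sample...
weird[10] = -60.0  # ...completely invisible to the optimizer

print(fitness(weird, ref, 0.5, 0.5))  # 0.0 -- a "perfect" fit
```

A fitness function constrains exactly what it samples, and nothing else – which is the lesson of the figure below.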



I didn’t keep the figure, but this is a reasonably accurate reconstruction. Yes, the model did have a nice calcium transient, and yes, its AP morphology was excellent… from 20 ms on (which is precisely what I asked the optimizer to optimize for). Unfortunately, a living human myocyte that manifests a peak of 25 mV, drops to -60 mV, and then returns to 20 mV is yet to be published, which is why the model swiftly went from the hard drive to an imaginary bin.

This example shows nicely how automatic optimizers are very good at delivering something that fits your fitness function well, but is not what you wanted. It reminds me of a short story by the Czech writer Karel Michal, called Dead Cat [4]. It is about a journalist conversing with a dead cat that he finds. While the cat is dead, it speaks, and in a grammatically rather strange way. The journalist eventually finds the cat is very smart and logical – below is my translation of an excerpt:

“Sorry”, the journalist said to the cat and turned a knob on the radio next to her head.
“…and that’s why we have to focus on the question which is particularly relevant to our work: how to prevent the disease of cereals in a radical way?”
The journalist reached to the radio and turned the knob back. Cereals didn’t have much place in his inner world. He played with a new cigarette.
“Don’t grow cereals,” the cat broke the silence.
The journalist quickly put down a box of matches.
“What?”
“Don’t grow cereals,” the cat repeated. “Don’t grow new cereals and burn the old ones. It’s radical.”
“That’s stupid, don’t you think? We could not live without any cereals.”
“The question was not whether you can live without cereals, but how to prevent their disease in a radical way. The answer is: don’t grow new cereals, burn the old ones.”

Even though this story comes from the sixties, it captures perfectly what the issue with today’s machine learning may be. I also heartily recommend this article, which summarizes various funny fails/cheats of automatic optimization and/or learning of this sort.

OK, so manual approaches to model optimization are not great. Automated approaches are not great. Does that mean we don’t make models, then? Yes, run for it while you can. Of course not! What seemed to work reasonably well for me was to simply interlace the two approaches. As a default, I’d start with an automated solver (the next section is about this). First, for testing, I took only the most basic criteria to optimize for (e.g., AP shape and calcium transient amplitude), gradually adding more. If you’re lucky, it works and you get a great model. In my case, after the addition of some nontrivial criteria, the optimizer would stop producing models that fulfilled them. In such a case, I’d investigate more manually to understand why it is not possible to achieve the criteria at once. Upon gaining the understanding and coming up with a solution (these two are both important, but unfortunately rather distinct events), I’d make some changes to the model that go beyond the optimized parameters (e.g., swapping a current formulation for a different one) and could run the automated optimizer again. This is repeated ad libitum, ad nauseam, or ad mortem.
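The interlaced workflow can be caricatured in a few lines of Python. Everything here is a toy stand-in (a “model” is just a dict of scaling factors, criteria are error functions, the “optimizer” is a random search, and the “manual” step mechanically introduces a new parameter), but the control flow is the one described above:

```python
import random

random.seed(0)

def automated_optimizer(model, criteria, tries=2000):
    """Random-search stand-in for a real optimizer (e.g., a GA):
    rescales existing parameters, keeps changes that reduce total error."""
    best = dict(model)
    best_err = sum(c(best) for c in criteria)
    for _ in range(tries):
        cand = {k: v * random.uniform(0.8, 1.25) for k, v in best.items()}
        err = sum(c(cand) for c in criteria)
        if err < best_err:
            best, best_err = cand, err
    return best

def fulfils(model, criteria, tol=0.05):
    return all(c(model) < tol for c in criteria)

def manual_structural_change(model):
    """Stand-in for the human step: understand the failure, then change
    the model beyond the optimized parameters (here, crudely, by
    introducing a new parameter, e.g. a swapped current formulation)."""
    model = dict(model)
    model["gNew"] = 1.0
    return model

def develop_model(model, all_criteria):
    active = []                      # start with the most basic criteria...
    for criterion in all_criteria:   # ...and gradually add more
        active.append(criterion)
        while not fulfils(model, active):
            model = automated_optimizer(model, active)
            if not fulfils(model, active):
                model = manual_structural_change(model)
    return model                     # ...ad libitum, ad nauseam, ad mortem

criteria = [
    lambda m: abs(m["gKr"] - 2.0),            # stand-in for "AP shape OK"
    lambda m: abs(m.get("gNew", 0.0) - 1.0),  # unreachable without the structural change
]
final = develop_model({"gKr": 1.0}, criteria)
```

The second criterion is deliberately impossible for the optimizer alone (it can only rescale existing parameters), so the loop has to fall back on the “manual” step – which is exactly the situation the paragraph above describes.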



[1] This case is entirely made up and any resemblance to anything of this sort that may have happened is purely coincidental. For once, I’m even serious.
[2] This is largely about the level of detail one looks at. One can probably find issues in pretty much any aspect of a model; it’s just a matter of how closely you look, and subsequently whether or not it’s relevant. The out-of-the-Box statement that “all models are wrong but some are useful” definitely holds true. In the sentence, I mean things that are obviously incorrect (sort of like using IKs data to create IKr, etc.). A lovely variation on the quotation, this time by Prof. Flavio Fenton, is: “all models are wrong but some are more wrong than others”.
[3] A pessimist would remark that it can also just add a new layer of compensation - that is also possible.
[4] Massively niche-within-a-niche, but there is a song by Eman E.T. called Dead Cat that’s obviously linked to this short story: https://www.youtube.com/watch?v=E-kbHAbRd0c – you don’t get much more niche than 3000 views. Eman E.T. is a pseudonym of my favourite Czech drummer Ludvik Kandl, who played with Hudba Praha at a certain stage, or Plastic People of the Universe – and this song is by his spin-off band, I believe. And this is him with Hudba Praha, about 20 years younger (one of the central songs of my generation’s childhood, soon after the Velvet revolution): https://www.youtube.com/watch?v=IgFDoihzujU
