Salutations!
I am a physician, computational biologist, and medical statistician.

About me

Conjuring Slander and Lies

Let’s commence with a brief introduction, shall we? As a sprout of the nineties, I witnessed both the rise of the internet and the inception of social media megalomania, as well as the demise of decent television, which I still haven’t gotten over. This website is my digital business card, portfolio, archive, and a personal playground of sorts.

After graduating, I co-founded a culture magazine with some former classmates. It quickly became one of the largest national webzines at the time, collaborating with notable global players such as Sony and Warner. Besides managing the production team and negotiating contracts with PR agencies from around the world, I pursued press photography, conducted interviews, and stood in front of cameras. Some years later, I came to the unwavering conclusion that writing essays about artists was hardly a sustainable career. Taking with me many unforgettable memories, I therefore decided to close this chapter.

My fascination with medicine led me to the lab, where I infamously failed to cure either cancer or Covid by shaking and shattering flasks ad nauseam. Nevertheless, I came to appreciate the complexity of life. Having become quite adept at programming, I decided to combine my data science skills with my biomedical knowledge. Over time, I ventured into more hands-on areas of medicine, working as a hospital physician and leading medical digitization projects in the private sector. This is where I have found my niche and where I have felt quite comfortable ever since.

I hold a bachelor's degree in biochemistry and a medical doctorate, complemented by postgraduate training in economics, data science, and healthcare digitization. I specialize in bioinformatics, healthcare digitization, health econometrics, and medical statistics. Some colleagues spread preposterous rumors of me being in a mutually abusive relationship with my stats software, R – but what do pesky SPSS noobs know anyway, right?

Outside of work, I like to indulge in creative pursuits every once in a while. I am quite enthusiastic about global history and mythology, always enjoy a decent neo-noir or cyberpunk story, and practice some good old Japanese martial arts. No satori for me yet, though.

So long!
Arne Kowalewski

Below, I present excerpts of my professional expertise in Medicine, Bioinformatics, Data Science, Statistics, and Health Economics.

Medicine

The Vigor of Cut-Throat Therapies

My medical dissertation focused on achalasia — a rare motility disorder of the esophagus. This section provides a rough overview of its content, methodology, and results.

The lower esophageal sphincter is a muscle that normally provides the muscular tension needed to prevent reflux of gastric fluids. Its ability to relax during deglutition is an important function that allows food to pass into the stomach after being swallowed. This complex mechanism is regulated by specific esophageal ganglion cells, which have been observed to degenerate in achalasia patients, causing contractile malfunctions along the entire swallowing process. Major clinical symptoms typically reported for achalasia are dysphagia, recurrent episodes of chest pain, food regurgitation, weight loss, and eventually pulmonary aspiration. In rare end stages of the disease, the esophagus may dilate and bend sideways, losing its typically tubular form to a characteristic sigmoidal shape.

Per-oral endoscopic myotomy is a novel minimally-invasive treatment option for achalasia introduced in 2010. An endoscope is inserted through the patient’s mouth into the esophagus, where the mucosa is cut above the esophagogastric sphincter. Through this opening, a submucosal tunnel is created that reaches down into the stomach, where the luminal circular muscle layer of the sphincter is then dissected without the need for open surgery.

At the time of first publication, mid- and long-term outcomes after per-oral endoscopic myotomy had only sparsely been reported in the literature, and the impact of previous treatments in particular was in dire need of clarification. I conducted a study to assess whether preceding interventions affect the outcome after per-oral endoscopic myotomy. It includes 374 patients with a mean follow-up duration of 37 months. Multiply imputed multivariable regression models were fit for the odds of treatment failure after two, three, and five years, and for the hazard over time. All models were thoroughly validated using computer-aided statistical methods and deemed well fit. A literature review is provided. Implications and limitations are discussed.

My doctoral thesis and other exquisitely crafted publications by yours truly are freely available below:

Bioinformatics

A BLAST of Comparative Genomics

In my bachelor’s thesis, entitled “Structure and Sequence Analysis of Potentially Virus-Specific Proteins”, I studied a collection of protein superfamilies that had previously been postulated to be exclusive to the virosphere. Advanced computational algorithms were utilized to identify structural homologs of these protein fragments in the genomes of cellular organisms. Subsequently, evolutionary analyses were conducted to establish sequence-level links between the viral protein fragments and their newly discovered non-viral counterparts.

Viruses have been playing a pivotal role in medical research for over a century. Their number in the biosphere is expected to far exceed the number of living cells in existence. Through vivid interactions with their hosts’ metabolism, proteome, and sometimes even genome, viruses exert a heavy impact on cellular organisms and may well have served as an important driving force of evolution. As it stands today, about eight percent of the human genome has been traced back to viral origins. To put this number in perspective: the Neanderthal contribution to present-day human DNA is commonly estimated at below five percent.

Since viruses are capable of acting as transmitters of genetic material to, and possibly between, cellular organisms, the analysis of viral protein sequences and structures may provide keys for reconstructing the evolutionary descent of species, perhaps even of those whose genetic links have been lost to time. The geometrical comparison of protein folds is of particular interest, for a protein’s structure tends to be more highly conserved than its sequence.

Computational methods utilized in the study include, among others: three-dimensional structure alignments of proteins and protein fragments by unleashing the SALAMI search server on the PDB90, geometric molecular structure visualization and comparison with UCSF Chimera, pairwise sequence alignments with Clustal, reiterative multiple sequence alignments with PSI-BLAST, and evolutionary guide tree visualization with SplitsTree.
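
As a small taste of the sequence side of this toolbox, here is a hedged sketch of a pairwise protein alignment in R with the Bioconductor package Biostrings; the two sequences are short, invented toys for illustration, not the actual fragments studied in the thesis:

```r
library(Biostrings)
data(BLOSUM62)   # substitution matrix shipped with Biostrings

# Two invented toy peptide sequences, for illustration only.
seq_a <- AAString("MKVLATGEHLVR")
seq_b <- AAString("MKILSTGDHLIR")

aln <- pairwiseAlignment(pattern = seq_a, subject = seq_b,
                         substitutionMatrix = BLOSUM62,
                         gapOpening = 10, gapExtension = 0.5,
                         type = "local")
aln          # aligned regions
score(aln)   # alignment score under the chosen scoring scheme
```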

Computational protein structure alignment
Figure 1: Computational Protein Structure Alignment. Shown are two protein fragments with remarkable three-dimensional similarities after computational structure alignment and visual superposition: 1kafX, a viral transcription factor (beige), and 2bsiA, a secretion chaperone from Yersinia (blue). (Real data)

Clinical Data Science

Machine Learning & Knowledge Discovery in Medicine

Data science employs analytical and predictive methods to gain knowledge from data. The term mostly refers to the computer-aided process of excavating and deciphering previously unknown information hidden within data, finding structures and relations among noise and chaos, and transforming such information into actionable knowledge.

Traditional medical research presupposes the impact of preselected predictors on each other and on a defined outcome. As such, it aims to confirm what it already suspects. The methods of a data analyst allow for a more generalist approach. In simple terms, where classic statistics aim to accept or reject pre-formulated hypotheses, machine learning algorithms are capable of processing high-dimensional data without knowing beforehand where to look or what to look for. The pillars of data science are association, clustering, and approximation.

Association, or rule induction, is the detection of data attributes that tend to occur together. It formulates logical pairs of premises and conclusions, and provides each of these with a confidence measure to estimate its reliability. This may be utilized to detect common denominators of critical incidents, or to uncover conditions and complications that frequently co-occur in patients.
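
The sketch below illustrates the idea in R with the arules package on a toy set of per-patient “baskets” of conditions; the itemsets and thresholds are made up purely for demonstration:

```r
library(arules)

# Toy transactions: conditions and complications observed per patient (invented).
cases <- list(
  c("diabetes", "neuropathy", "retinopathy"),
  c("diabetes", "neuropathy"),
  c("hypertension", "stroke"),
  c("diabetes", "retinopathy"),
  c("hypertension", "diabetes", "neuropathy")
)
trans <- as(cases, "transactions")

# Induce rules of the form {premise} => {conclusion}, each with support and confidence:
rules <- apriori(trans, parameter = list(supp = 0.4, conf = 0.8))
inspect(sort(rules, by = "confidence"))
```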

Clustering divides data into groups based on shared attributes, which are weighted by different kinds of distance measures. These groups are unknown beforehand. In a clinical setting, clustering may be used to detect whether a collection of patients treated with a specific method breaks into distinct partitions. If so, the underlying reasons and implications may be studied to improve future treatment. For example, patients within the distinct clusters may respond differently to the same treatment and might therefore profit from individual therapy adaptations. This might hint at underlying and previously unknown common denominators in patients with shared treatment complications. Such patterns tend to be hardly visible when looking at the patient population as a whole.
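
A minimal clustering sketch in R; the data frame `patients` and its feature columns are hypothetical placeholders:

```r
# Hypothetical numeric patient features; column names are placeholders.
X <- scale(patients[, c("age", "bmi", "crp", "hemoglobin")])  # standardize first

# Partitioning approach: k-means with three clusters (k chosen for illustration only).
km <- kmeans(X, centers = 3, nstart = 25)
table(km$cluster)

# Hierarchical alternative: Ward linkage on Euclidean distances.
hc <- hclust(dist(X), method = "ward.D2")
table(cutree(hc, k = 3))
```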

Approximation models functional correlations in data based upon the assumed influence of predicting effectors on a dependent outcome variable. As such, approximation provides ways to infer expected outcomes for future and as of yet unobserved effector constellations. A prime example is the widely used regression analysis, which underlies the majority of clinical studies. It can either yield metric outcome estimates, or be transformed to estimate probabilities or polytomous state expectancies instead. Among the most commonly used regression models are the logit and probit models for the prediction of an event occurring by a specific point in time, and the proportional hazards model for the prediction of the instantaneous risk of an event occurring “the very next moment” over time, also called survival analysis.

Another approximation method is the algorithmic construction of decision trees, which can illustrate complex conditional categorization flows in quite intuitive ways and can also be used to model complex high-dimensional prediction curves. Such algorithms are commonly referred to as classifiers.
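
A short sketch of such a tree in R with the rpart package, using the same hypothetical data frame and variable names as in the other sketches on this page:

```r
library(rpart)

# Hypothetical classification tree for a binary treatment outcome.
tree <- rpart(treatment_failure ~ age + sex + prior_therapy + bmi,
              data = patients, method = "class",
              control = rpart.control(maxdepth = 4, cp = 0.01))

plot(tree); text(tree, use.n = TRUE)   # flow-chart-like display of the decision rules
```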

Gradient Boosting of Decision Tree Training
Figure 2: Gradient Boosting of Decision Tree Training. Shown is the iterative process of gradient boosted decision tree training. In each iteration, the model’s prediction error is approximated by a weak learner, which is then added to the model to refine its fit. Blue: model prediction, red: actual observations. (Modified from developers.google.com under the CC BY 4.0)
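
For illustration, a hedged sketch of gradient boosted trees with the R package gbm; the data, outcome coding, and tuning values are placeholders rather than settings from any actual analysis:

```r
library(gbm)

# Hypothetical data frame `patients`; binary outcome coded as 0/1.
# All remaining columns serve as candidate predictors.
boost <- gbm(treatment_failure ~ ., data = patients,
             distribution = "bernoulli",
             n.trees = 500, interaction.depth = 3, shrinkage = 0.05,
             cv.folds = 5)

best_iter <- gbm.perf(boost, method = "cv")   # stop at the iteration chosen by cross-validation
summary(boost, n.trees = best_iter)           # relative influence of each predictor
```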

When working with high-dimensional data and possibly many hundreds or thousands of predictors, regularization or shrinkage becomes an important task, i.e. the selection of relevant predictors from among unimportant ones. This can, for example, be achieved by adding a penalty term for the number or magnitude of the model coefficients to the model’s minimization problem.
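
A compact lasso example with the R package glmnet, which adds exactly such a penalty term; data and variable names remain hypothetical, and the data frame is assumed complete for this sketch:

```r
library(glmnet)

# Build a numeric predictor matrix from a hypothetical (complete) data frame.
X <- model.matrix(treatment_failure ~ . - 1, data = patients)
y <- patients$treatment_failure

# alpha = 1 requests the lasso (L1) penalty; lambda is tuned by cross-validation.
cvfit <- cv.glmnet(X, y, family = "binomial", alpha = 1)
coef(cvfit, s = "lambda.min")   # irrelevant predictors are shrunk exactly to zero
```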

At the top of the machine learning food chain, recent innovations in technology have made advanced computational approaches to approximation feasible. This is where novel concepts such as evolutionary algorithms and neural networks pave the way up the staircase to artificial intelligence and deep learning.

Approximation via a Neural Network
Figure 3: Approximation via a Neural Network. Shown are the input layer (blue), an array of hidden layers (gray), and the output layer (red). If the weighted input signals received by a single perceptron through its dendrites add up to a value above the stimulus threshold, a new signal is fired across the axon to the perceptrons of the next layer (red lines). Neural networks are essential for deep machine learning. (Stylized)
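
The mechanism described in the caption boils down to a few lines of arithmetic; here is a stylized single perceptron in base R, with made-up weights and inputs:

```r
# Stylized forward pass of a single perceptron with a hard threshold.
inputs    <- c(0.8, 0.1, 0.5)        # signals arriving at the "dendrites"
weights   <- c(0.4, -0.7, 0.9)       # learned connection weights
threshold <- 0.3                     # stimulus threshold

activation <- sum(inputs * weights)  # weighted sum of the incoming signals
fired <- activation > threshold      # if TRUE, a signal travels down the "axon"
fired
```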

Medical Statistics

What are the Odds!

Applied statistics is an ever-important pillar of medical research. This section offers a brief journey through a selection of methods and algorithms I have utilized in the past. For the interested reader, I also provide a sample lecture. It was originally intended for oral presentation, though, so its stand-alone value might be somewhat limited.

Regression Analysis

Regression models are approximation methods and, as such, a substantial pillar of data intelligence. They predict the value of a dependent outcome variable as a function of independent predictor variables. A predictor’s contribution to the outcome estimate is its effect size, or coefficient.

Logistic regression is a generalized linear model that allows for the prediction of a binary dependent variable. Since the primary outcome in medical studies, usually treatment failure, is frequently modeled as such, logistic regression is among the most prevalent statistical models of choice in clinical trials. Assuming linearity on the scale of the linear predictor, logistic regression estimates the probability of an event occurring. To allow for an intuitive interpretation, the logistic regression equation is usually transformed via the logit function, which is the logarithm of the odds:
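
logit(p) = ln( p / (1 − p) ) = β₀ + β₁x₁ + … + βₖxₖ

As a minimal illustration, here is a sketch in R of how such a model might be fit; the data frame `patients` and its variables are hypothetical placeholders, not data from any actual study:

```r
# Hypothetical example: logistic regression for a binary outcome.
# `patients`, `treatment_failure`, `age`, `sex`, `prior_therapy` are placeholders.
fit <- glm(treatment_failure ~ age + sex + prior_therapy,
           data = patients, family = binomial(link = "logit"))

summary(fit)                                  # coefficients on the log-odds scale
exp(cbind(OR = coef(fit), confint(fit)))      # odds ratios with 95% confidence intervals
```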

Survival analysis is mostly built upon the processing of right-censored event times. This is commonly performed using proportional hazards regression. Such a model estimates the hazard with regard to a specific event. “Right-censored” indicates that patients may withdraw their participation or drop out of the study for unknown reasons throughout the observation period, which often happens in actual studies. It is a major benefit of survival analysis that partial information gained from the time until censoring can be included in the regression model. The hazard is the probability at any given time t of experiencing an event during the forthcoming infinitesimally short time span, given that no event has occurred before. In less technical terminology, it is the probability that a patient who has not yet experienced an event will experience it the very next moment.
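
Under the same hypothetical data assumptions as above, such a model is commonly fit in R with the survival package; this is a sketch, with placeholder variable names:

```r
library(survival)

# Hypothetical example: proportional hazards (Cox) regression.
# `time_to_failure` is the follow-up time, `failure_event` is 1 = event, 0 = censored.
fit_cox <- coxph(Surv(time_to_failure, failure_event) ~ age + sex + prior_therapy,
                 data = patients)

summary(fit_cox)   # hazard ratios with confidence intervals
cox.zph(fit_cox)   # diagnostic check of the proportional hazards assumption
```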

Model Building

Model building is a complex and challenging aspect of statistics. It looks at the essential question of which parameters to incorporate into a statistical model.

There is no gold standard set of parameters that are commonly considered mandatory inclusions in statistical models. It is, however, often expected that certain general factors such as sex and age be incorporated. Too few parameters bear the apparent risk of oversimplification. Too many parameters, on the other hand, may produce an overfitted model whose predictions end up so tightly molded to the underlying data that the model reproduces the noise and errors contained in the data rather than approximating the underlying trends. Algorithmic procedures tend to be especially susceptible to overfitting because intelligent, accurate, and adequate logical conditions to end iterations and break recursion are often very hard to formulate. It can be surprisingly difficult to tell a machine when to stop.
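
A toy simulation in base R may illustrate the point; the data are synthetic and the polynomial degrees arbitrary, chosen purely to demonstrate the effect:

```r
# Synthetic illustration of overfitting: a lean vs. an overparameterized model.
set.seed(1)
n  <- 120
df <- data.frame(x = runif(n))
df$y <- sin(2 * pi * df$x) + rnorm(n, sd = 0.3)   # noisy underlying trend
train <- sample(n, 80)

lean  <- lm(y ~ poly(x, 3),  data = df[train, ])  # few parameters
bloat <- lm(y ~ poly(x, 17), data = df[train, ])  # many parameters

rmse <- function(fit, rows) sqrt(mean((df$y[rows] - predict(fit, df[rows, ]))^2))
round(c(lean_train  = rmse(lean, train),  lean_test  = rmse(lean, -train),
        bloat_train = rmse(bloat, train), bloat_test = rmse(bloat, -train)), 3)
# The overparameterized model typically shows the lower training error
# but the higher test error: it has learned the noise rather than the trend.
```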

There is a multitude of statistical measures aiming to assess model fit. Some of them can be used for a quick comparison of a model’s fit with the fit of another model, such as the null model, to assess whether the chosen predictors improved the model. Others are commonly used to determine the goodness of fit of a model as-is, without requiring a second model for comparison. An important measure used in linear regression is the coefficient of determination, R². Measures for logistic regression include, among others, the log-likelihood, deviance, miscellaneous information criteria, different log-likelihood based pseudo-R²s, Kendall’s τa, Goodman and Kruskal’s γ, Somers’ D, and the Hosmer-Lemeshow test. Last but not least, measures applicable in proportional hazards regression include the log-rank test and Harrell’s concordance.
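
As a brief, hedged sketch of how a few of these measures can be obtained in R, assuming `fit` is a fitted logistic model and `fit_cox` a fitted proportional hazards model as in the sketches above:

```r
# Compare against the null model via an information criterion:
fit_null <- update(fit, . ~ 1)
AIC(fit, fit_null)

# McFadden's log-likelihood based pseudo-R²:
1 - as.numeric(logLik(fit) / logLik(fit_null))

# Hosmer-Lemeshow test (ResourceSelection package):
ResourceSelection::hoslem.test(fit$y, fitted(fit), g = 10)

# Harrell's concordance for the proportional hazards model:
summary(fit_cox)$concordance
```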

Missing Data

Missing information is ubiquitous in clinical trials and can have major negative impacts on statistical procedures. If unaccounted for, biases within incomplete data may remain undetected and possibly render conclusions drawn from them specious at best, deceptive at worst. Many statistical procedures expect data to be missing completely at random in order to yield valid results and reliable effect size estimates. It is therefore essential to analyze missing data patterns before applying statistical methods. Such assessments mostly rely on testing for either homogeneity of means or homogeneity of covariances, of which the latter is also known as homoscedasticity.
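
In R, a first look at the missingness structure is often taken with the mice package; `patients` is again a hypothetical data frame:

```r
library(mice)

md.pattern(patients)        # tabulates and plots the observed missing-data patterns
colMeans(is.na(patients))   # fraction of missing values per variable
# A formal check of the MCAR assumption (Little's test) is available,
# for example, as naniar::mcar_test(patients).
```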

In the unique and, sadly, quite rare situation of data missing completely at random, incomplete observations may simply be discarded. This process of elimination is called listwise deletion. Models built upon data reduced in this way are complete case analyses. Collections of complete cases can easily be fed to most statistical procedures, though with the caveat that potentially valuable partial information has been thrown away. Depending on the fraction of missing information, this loss may severely diminish the precision and quality of subsequent statistical analyses. It may even render the remaining data ineligible for statistical modeling altogether. Then again, it is rarely justified to assume data to be missing completely at random to begin with.

If it is plausible that data are missing at random, yet not missing completely at random, analyses may still be performed using a variety of statistical tools. The difference between those two scenarios is sometimes hard to grasp. It essentially boils down to data missing at random not actually being missing at random in the colloquial sense: their missingness follows a system, but one that can be explained by the observed part of the data. As previously stated, listwise deletion as a commonly practiced technique unfortunately tends to introduce new biases in this scenario and may therefore lead to sub-par effect size estimates. A more recent approach is multiple imputation, which aims to produce better results where incomplete data are at least missing at random.

Ultimately, when data is missing not at random, the observation method should be revised to yield better data quality. If data is compromised to begin with, statistics should abstain from attempting to salvage it. Outside of scientific research, however, these techniques may still be utilized to great benefit, for example in the field of data intelligence.

Multiple Imputation

Multiple imputation is an advanced method for the handling of missing data. Since its introduction, the increasing processing power of modern computer systems has allowed it to become the focus of attention of many researchers. It aims to improve upon older strategies that required the removal of incomplete observations and thus a reduction of the available information.

Unlike methods such as listwise deletion, multiple imputation does not remove incompletely observed data records. Instead, a multitude of copies of the entire data set is created. In each of these, the missing observations are interpolated with plausible values based on the observed data. This factors in the uncertainty associated with the unknown values while avoiding an over-representation of the imputed values compared to the actually observed ones. Multiple imputation keeps valuable partial information from incomplete records in the data and thus makes it available for use in subsequent statistical models and procedures. As such, this approach is especially useful in clinical settings, where a drop-out of patients over time is quite common.

There is a plethora of algorithms that can be used to impute missing data. One of the most common choices is predictive mean matching. It replaces each missing observation with an actually observed value from a so-called donor, which is sampled from those observed cases whose predicted values lie closest to the value predicted for the missing entry. A major advantage of this method is its independence of distributional assumptions, which it maintains by not generating any new values. Thus, predictive mean matching yields good results even if the imputed variable is not normally distributed.

Once multiple imputation has been performed and a multitude of imputed data sets have been generated, the statistical procedure of choice, such as regression analysis, is applied separately to each imputed data set. The resulting statistical measures of interest, for instance effect sizes, standard errors, or probability values, are then each pooled into a single estimate by following Rubin’s rules, a defined set of combination formulas.
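
A hedged end-to-end sketch of this workflow with the R package mice, using hypothetical data and variable names rather than any actual study data:

```r
library(mice)

# `patients` is a hypothetical, incompletely observed data frame.
# mice() defaults to predictive mean matching ("pmm") for numeric variables.
imp <- mice(patients, m = 20, seed = 42)

# Fit the analysis model of choice separately on each of the m imputed data sets:
fits <- with(imp, glm(treatment_failure ~ age + sex + prior_therapy,
                      family = binomial))

# Pool effect sizes and standard errors into single estimates via Rubin's rules:
pooled <- pool(fits)
summary(pooled, conf.int = TRUE, exponentiate = TRUE)   # odds ratios with 95% CIs
```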

Multiple imputation
Figure 4: Multiple Imputation. Shown are the densities of an incompletely observed variable (blue) and its imputations (red). (Real data)

Health Economics

The Cost of Wellness

Medical care differs from other economic markets in that its demand is temporary and tied to illness, which carries crucial risks for the personal integrity and earning capacity of those affected. By increasing labor supply and productivity, improvements in health have historically made important contributions to economic growth. However, many countries are experiencing rising healthcare costs. In Germany, health expenditure in 2021 amounted to 13 % of the country’s gross domestic product. Given the severe impact of health on the economy and welfare, reducing healthcare cost inefficiencies is a crucial socio-economic and political task.

Regrettably, there is a plethora of factors that render optimizing healthcare spending a challenge. Technological advances in machine learning have brought new strategies to the forefront of scientists’ minds, which are capable of dealing with vast current-day data sets and large arrays of highly correlated predictors, neither of which can be easily modeled using traditional statistical techniques. A key advantage of machine learning is that it has the potential to uncover hidden insights without relying on traditional, hypothesis-driven models, which typically incorporate a limited number of preselected predictors whose impact on the primary outcome is assumed a priori. As such, machine learning on large and high-dimensional data can provide new tools for the analysis and prediction of healthcare insufficiencies.

Historically, models of wasteful spending in health care have often focused on moral hazard, which is the assumption that the availability of diagnostics creates incentives for doctors to use them and at the same time increases demand from patients. In recent years, however, the body of literature has gradually opened up to other sources of inefficiency. For example, some authors suggest that differences in treatment rates between hospitals may be caused by different levels of expertise in particular diseases. Thus, they may reflect specialization rather than misdiagnosis or treatment errors. Similarly, studies on guideline adherence by health professionals have aimed to ascertain whether treatments that deviate from guidelines indicate a lack of knowledge on the part of the professional, or, on the contrary, the incorporation of additional knowledge that the guidelines do not take into account. As can be seen from these examples, the accurate identification of real insufficiencies is a daunting challenge.