Prognostic Factors for Evolution of Non Alcoholic Fatty Liver Disease Patients Utilizing Poisson Regression and Continuous Time Markov Chains

In this article, a Poisson regression model is used to relate the rates of transition among states to the covariates.


I. INTRODUCTION
Continuous-time Markov chains (CTMCs) are valuable mathematical and statistical tools for evaluating disease progression over time. CTMCs are a subtype of multistate models and can be used to study this progression in patients with non-alcoholic fatty liver disease (NAFLD), with its characteristic phenotypes, non-alcoholic fatty liver (NAFL) and non-alcoholic steatohepatitis (NASH), together with the presence of associated fibrosis and its stages. The prevalence of NAFLD is growing quickly worldwide and parallels the epidemics of obesity and type 2 diabetes. Metabolic syndrome is a well-known risk factor. Its diagnosis requires abdominal obesity, defined by a waist circumference >94 cm for males and >80 cm for females in eastern countries and >120 cm for males and >88 cm for females in western countries, plus two or more of the following: blood glucose ≥100 mg/dL or drug treatment for diabetes; arterial blood pressure ≥130/85 mmHg or drug treatment for hypertension; triglyceride levels ≥150 mg/dL or drug treatment for elevated triglycerides; or high-density lipoprotein (HDL) levels <40 mg/dL for males and <50 mg/dL for females or drug treatment for this condition.
NAFLD can be modeled using the simplest form of the health, disease, and death model. This form has one state for susceptible individuals with risk factors such as type 2 diabetes, dyslipidemia, and hypertension; a second state for the NAFLD phenotypes; and two competing death states, one for liver-related mortality as a complication of NAFLD and the other for death from causes unrelated to liver disease [1]. This is shown in figure 1. In addition, NAFLD can be modeled in a more elaborate, expanded form with nine states: the first eight are the states of disease progression as time elapses, and the ninth is the death state [1], as illustrated in figure 2. Moreover, a subset of the states that explicitly represents the phases of the fibrosis process, which develops early in the disease course if the risk factors are not treated or eliminated, is modeled with a CTMC to demonstrate how covariates incorporated in a log-linear model can be related to the transition rates among states, as illustrated in figure 3 [2], [3]. The presence of fibrosis is considered an ominous predictor of disease progression. This subset is drawn from the expanded model and covers especially the early stages, in which each stage can be reversed if properly treated and controlled, so as to prevent the patient from reaching the irreversibly damaged state of liver cirrhosis (F4).
Singh et al. (2015) conducted a meta-analysis to evaluate the rate of fibrosis progression. They searched multiple databases systematically, contacted authors, and found 11 cohort studies of adult NAFLD patients with paired liver biopsy specimens taken at least one year apart. From these they calculated a pooled weighted annual fibrosis progression rate (number of stages changed between the two biopsy samples) with 95% confidence intervals (CIs), and characterized the clinical risk factors accompanying this progression.
They identified 411 patients with biopsy-proven NAFLD (150 with NAFL and 261 with NASH) included in those studies. Initially, the distribution of fibrosis for stages 0, 1, 2, 3, and 4 was 35.8%, 32.5%, 16.7%, 9.3%, and 5.7% respectively. Over 2145.5 person-years of follow-up evaluation, 33.6% had fibrosis progression, 43.1% had stable fibrosis, and 22.3% had an improvement in fibrosis stage. The annual fibrosis progression rate in patients with NAFL who had stage 0 fibrosis at baseline was 0.07 stages (95% CI, 0.02-0.11 stages), compared with 0.14 stages in patients with NASH (95% CI, 0.07-0.21 stages). These findings correspond to 1 stage of progression over 14.3 years for patients with NAFL (95% CI, 9.1-50.0 y) and 7.1 years for patients with NASH (95% CI, 4.8-14.3 y).
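The conversion from an annual progression rate to years per stage is a simple reciprocal. The following minimal sketch reproduces that arithmetic from the pooled rates quoted above; the function and variable names are illustrative and are not taken from the cited meta-analysis.

```python
def years_per_stage(rate_per_year):
    """Average number of years to progress by one fibrosis stage."""
    return 1.0 / rate_per_year

# Pooled annual progression rates (stages per year) with 95% CI bounds,
# as quoted in the text for patients with stage 0 fibrosis at baseline.
pooled = {
    "NAFL": {"rate": 0.07, "ci_low": 0.02, "ci_high": 0.11},
    "NASH": {"rate": 0.14, "ci_low": 0.07, "ci_high": 0.21},
}

summary = {}
for group, r in pooled.items():
    summary[group] = {
        "years": years_per_stage(r["rate"]),
        # A faster rate means fewer years, so the CI bounds swap places.
        "years_low": years_per_stage(r["ci_high"]),
        "years_high": years_per_stage(r["ci_low"]),
    }

for group, s in summary.items():
    print(f"{group}: {s['years']:.1f} y "
          f"(95% CI {s['years_low']:.1f}-{s['years_high']:.1f} y)")
```

Because the reciprocal is a decreasing function, the upper bound of the rate gives the lower bound of the time, which is why the interval limits are swapped in the sketch.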
Kalbfleisch and Lawless [4] related the instantaneous rates of transition from state to state to covariates by regression modeling of the transition rate matrix Q, using a log-linear model for the Markov rates.
In the present study, Poisson regression is used to model the rates among states. The counts of each transition can be modeled as a function of explanatory variables reflecting the characteristics of the patients. This can be accomplished with a Poisson regression model, also called a log-linear model. The Poisson regression model specifies that each response $y_i$ is drawn from a Poisson population with parameter $\lambda_i$, which is related to the regressors, or covariates, $x_i$. The primary equation of the model is

$$P(Y = y_i \mid x_i) = \frac{e^{-\lambda_i}\,\lambda_i^{y_i}}{y_i!}, \qquad y_i = 0, 1, 2, \ldots$$

The most common formulation for $\lambda_i$ is the log-linear model:

$$\ln \lambda_i = x_i'\beta$$

and the expected number of events per period is given by:

$$E[y_i \mid x_i] = \lambda_i = e^{x_i'\beta}$$

The observed counts in the transition counts matrix are used as the response variables, and the covariates are the risk factors for fatty liver. The estimated counts obtained from the Poisson regression model are then used to estimate the rates with the CTMC, since the initially observed transition rates approximately equal the estimated transition rates among states, as illustrated by the author in two previous papers; the estimated rate matrix is then exponentiated. To illustrate this procedure, a hypothetical example is used, in the form of a study conducted on 150 participants over 28 years to follow the progression of NAFLD from F0 to F4.
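The Poisson model described above can be sketched in a few lines. This is a minimal illustration of the log-linear link and the Poisson probability function only, not the fitting procedure used in the paper; the coefficient values in `beta` are hypothetical and chosen purely for demonstration.

```python
import math

def poisson_mean(x, beta):
    """Log-linear link: lambda = exp(x'beta)."""
    return math.exp(sum(xj * bj for xj, bj in zip(x, beta)))

def poisson_pmf(y, lam):
    """P(Y = y) for a Poisson variable with mean lam (computed on the log scale)."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

# Hypothetical coefficients: intercept and one covariate (not from the paper).
beta = [1.0, 0.5]
lam_low = poisson_mean([1.0, 0.0], beta)   # covariate = 0
lam_high = poisson_mean([1.0, 1.0], beta)  # covariate = 1

# Incidence rate ratio for a one-unit increase in the covariate.
irr = lam_high / lam_low            # equals exp(beta[1])
pct_change = (irr - 1.0) * 100.0    # percent change in the expected count

print(f"lambda at x=0: {lam_low:.3f}, at x=1: {lam_high:.3f}")
print(f"IRR = {irr:.3f} ({pct_change:.1f}% change)")
print(f"P(Y=2 | x=0) = {poisson_pmf(2, lam_low):.4f}")
```

Under the log-linear link, exponentiating a coefficient gives the incidence rate ratio for a one-unit increase in that covariate, which is the quantity interpreted later in the paper.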
The paper is divided into 3 sections. Section 1 describes the study design. Section 2 presents the results and discussion of running the Poisson regression model. Section 3 gives the conclusions. The supplementary materials complement this paper, as some information is presented only there and not in the main paper: tables 1, 6, 8, and 23, and figures 13 to 21.

Study Design
One hundred fifty participants were followed up every year for 28 years. At each visit the characteristics of the participants were recorded: sex (0 = female, 1 = male), age, BMI, LDL-chol, HOMA2_IR, and systolic and diastolic blood pressure. The observed transition counts matrix is:

          State0  State1  State2  State3  State4  total
State0      1909     120      15       6       0   2050
State1        36    1116      67      28       0   1247
State2        13      30     703      37       0    783
State3        11      14      23      50      22    120
State4         0       0       0       0       0      0
total                                               4200

The initial observed rates are obtained from this matrix. Using the CTMC, the estimated rates approximately equal the initially observed rates, as illustrated by the author, Iman Attia, in two previous papers that used the simplest small model and the expanded model, in which no covariates were included in the analysis [5]. The distribution of the transition counts is Poisson, as illustrated in the figures produced with the Statgraphics-19 software. A lowess smoother shows that the relationship between each response rate and each variable is not strictly linear but curvilinear: the initial part of the relationship is nearly horizontal, and it starts to curve upwards at a predictor value located inside the second category of each predictor. The figures illustrating these relationships are in the supplementary materials, figures (13) to (21), for each response rate against the 7 variables. For example, the number of transitions from state 0 to state 1 starts to bend upwards where each of the six predictors lies inside its second category: age approximately ≥37, BMI approximately ≥26, LDL-chol approximately ≥85 mg/dL, HOMA-IR approximately ≥1.7, systolic blood pressure approximately 142 mmHg, and diastolic blood pressure approximately ≥85 mmHg.
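As a quick consistency check on the transition counts above, the row totals and the empirical one-year transition proportions can be computed directly. This is only a sketch: the proportions below are simple row-wise fractions and are an assumption about how one might summarize the counts; they are not necessarily the "initial observed rates" used by the author.

```python
states = ["State0", "State1", "State2", "State3", "State4"]

# Observed transition counts (rows: from-state, columns: to-state).
counts = [
    [1909, 120, 15, 6, 0],
    [36, 1116, 67, 28, 0],
    [13, 30, 703, 37, 0],
    [11, 14, 23, 50, 22],
    [0, 0, 0, 0, 0],
]

row_totals = [sum(row) for row in counts]
grand_total = sum(row_totals)

# Empirical one-year transition proportions; rows with no observations
# (State4 here) are left as zeros rather than divided by zero.
proportions = [
    [c / total if total else 0.0 for c in row]
    for row, total in zip(counts, row_totals)
]

print("row totals:", row_totals, "grand total:", grand_total)
for name, row in zip(states, proportions):
    print(name, " ".join(f"{p:.4f}" for p in row))
```

The grand total equals 150 participants times 28 yearly intervals, which matches the study design described in the text.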
This gives good guidance on the functional form of the variables to be used in the regression model and avoids the misspecification that results from an incorrect functional form of the predictors. In this work, restricted cubic splines with 5 knots are used for the predictors, following Harrell's approach, which is the default procedure in the Stata 14 software. The knot locations are given in table (10), and the correlations between the transformed variables are presented in table (11). The Poisson regression was applied with the observed counts of the transition counts matrix as the response variable, and the results are discussed in the next section.
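A restricted cubic spline basis with 5 knots can be sketched as follows. The knot percentiles (5, 27.5, 50, 72.5, 95) are those recommended by Harrell for 5 knots; the basis formula follows the usual restricted cubic spline construction. The `ages` values are simulated for illustration and are not the study data, and the interpolated `percentile` helper is an assumption that may differ slightly from the centile definition Stata uses.

```python
import random

HARRELL_5_KNOT_PERCENTILES = (5.0, 27.5, 50.0, 72.5, 95.0)

def percentile(sorted_values, pct):
    """Linearly interpolated percentile of an already sorted list."""
    pos = (len(sorted_values) - 1) * pct / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

def rcs_row(x, knots):
    """Restricted cubic spline basis for one value: x plus len(knots) - 2 terms."""
    pos3 = lambda u: max(u, 0.0) ** 3
    k_first, k_pen, k_last = knots[0], knots[-2], knots[-1]
    scale = (k_last - k_first) ** 2
    row = [x]
    for k in knots[:-2]:
        # The tail term cancels the cubic and quadratic parts beyond the last knot.
        tail = (pos3(x - k_pen) * (k_last - k)
                - pos3(x - k_last) * (k_pen - k)) / (k_last - k_pen)
        row.append((pos3(x - k) - tail) / scale)
    return row

random.seed(0)
ages = sorted(random.uniform(20, 70) for _ in range(150))  # simulated, not study data
knots = [percentile(ages, p) for p in HARRELL_5_KNOT_PERCENTILES]
basis = [rcs_row(a, knots) for a in ages]

print("knots:", [round(k, 2) for k in knots])
print("first basis row:", [round(v, 4) for v in basis[0]])
```

With 5 knots each predictor is expanded into 4 columns (the original variable plus 3 spline terms), and the fitted curve is constrained to be linear below the first knot and above the last knot.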

Results and Discussion:
This section presents the results of running the Poisson regression. Running Poisson regression on the transformed variables gives the estimated counts shown in table (13).

The transition probability matrix is obtained by exponentiation of the estimated Q matrix.
Step 3: Calculate the expected counts in this interval by multiplying each row of the probability matrix by the corresponding marginal total in the observed transition counts matrix for the same interval, giving the expected counts in table (24).
Therefore, from the above results, the null hypothesis is rejected and the alternative hypothesis is accepted: the model fits the data, meaning that the future state depends on the current state, with the estimated transition rate and probability matrices as obtained.
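The exponentiation of the rate matrix and the expected-count calculation can be sketched as below. The matrix `Q` is a hypothetical rate matrix invented for illustration (rows sum to zero, state 4 absorbing); it is not the estimated Q matrix of the paper, which is not reproduced in this text. The row totals are those of the observed transition counts matrix. The matrix exponential is computed with a simple scaling-and-squaring Taylor scheme, which is an assumption made here to keep the sketch dependency-free.

```python
def mat_mul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_exp(q, t=1.0, squarings=10, terms=20):
    """exp(Q * t) by scaling and squaring with a truncated Taylor series."""
    n = len(q)
    scaled = [[q[i][j] * t / 2 ** squarings for j in range(n)] for i in range(n)]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms + 1):
        term = [[v / m for v in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

# Hypothetical rate matrix (per year), NOT the paper's estimate.
Q = [
    [-0.060, 0.050, 0.007, 0.003, 0.000],
    [0.030, -0.090, 0.045, 0.015, 0.000],
    [0.015, 0.035, -0.090, 0.040, 0.000],
    [0.050, 0.100, 0.150, -0.450, 0.150],
    [0.000, 0.000, 0.000, 0.000, 0.000],
]
row_totals = [2050, 1247, 783, 120, 0]  # marginal totals of the observed counts

P = mat_exp(Q, t=1.0)  # one-year transition probability matrix
expected = [[total * p for p in row] for row, total in zip(P, row_totals)]

for row in expected:
    print(" ".join(f"{v:9.2f}" for v in row))
```

The expected counts produced this way can then be compared with the observed counts, for example with a goodness-of-fit statistic, to judge how well the fitted chain reproduces the data.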
Of the patients starting at F0, only 5.51% will move to F1 in one year. This declines to 4.69% of patients starting at F1 moving to F2, and 3.48% of patients starting at F2 moving to F3; however, 13.57% of patients starting at F3 will move to F4. This high percentage may be because advanced fibrosis is defined as F3 and F4: once a patient reaches F3, the chance of progressing to F4 is higher than the chance of progressing from any of the less advanced stages, F0 to F2. This is also evident from the incidence rate ratio of this transition being the highest (5.237e+6). Progression from F0 to F1 and from F1 to F2 is approximately equal, while transition from F2 to F3 is less frequent, which may be due to more aggressive intervention by patients to hinder the progression of fibrosis through more intensive lifestyle modification; but once the patient reaches stage F3, progression to F4 is by far the most frequent of the forward transitions. Of patients starting at F1, 2.74% will move to F0; this decreases to 1.44% for those starting at F2 and is lower still for those starting at F3 (only 0.23%). Hence it is more feasible to move to F0 from F1 than from F2, and from F2 than from F3; that is, the more advanced the fibrosis stage, the less likely the patient is to return to F0. There is a paradox, however, in the movement to F1 from F2 or F3.
Movement to F1 is more frequent if the patient is in F3 (8.63% of patients move to F1) than in F2 (3.27%); that is, the more advanced the fibrosis stage, the more likely the patient is to move to F1. This may be due to the extensive lifestyle modification the patient undertakes to reduce the degree of fibrosis, although reaching F0 remains difficult (only 0.23% of patients move from F3 to F0). It is also noted that 2.74% of patients move from F1 to F0 and 3.27% from F2 to F1, while 12.45% move from F3 to F2; in other words, the more advanced the fibrosis stage, the more likely the movement to the immediately preceding stage. Moreover, if the starting stage is F3, 13.57% of patients move to F4, slightly more than move to F2 (12.45%), whereas movement to F1 and F0 declines (8.63% and 0.23% respectively; movement to F0 is approximately 2.66% of that to F1). Of the patients starting in F2, 3.48% move to F3, slightly more than move to F1 (3.27%); movement to F0 (1.44% of patients) is almost 44% of that to F1.
The mean time spent by the patient in state 0 is approximately 17 years. This declines to 12 years and 6 months in state 1, to approximately 10 years and 9 months in state 2, and ultimately to 2 years and 3.7 months in state 3. Thus the time spent in each stage decreases as the disease evolves. The sharp decline in time spent in state 3 is due to advanced fibrosis induced by dead hepatocytes, especially if no treatment is introduced, such as lifestyle modification, treatment of risk factors, or anti-inflammatory and anti-fibrotic drugs. In that case it is a matter of time before the patient reaches state 4, the irreversible stage of liver cell damage, which soon manifests as reduced liver cell function and may lead to hepatocellular carcinoma and eventually death if not managed with liver transplantation.
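In a CTMC the time spent in a state before leaving it is exponentially distributed, so the mean sojourn time in state i is the reciprocal of the total exit rate, 1/(-q_ii). The sketch below backs out the approximate exit rates implied by the mean times reported above; since those times are rounded in the text, the resulting rates are approximations and not the paper's estimated diagonal entries.

```python
# Mean sojourn times reported in the text, converted to years.
reported_years = {
    "state 0": 17.0,
    "state 1": 12.0 + 6.0 / 12.0,
    "state 2": 10.0 + 9.0 / 12.0,
    "state 3": 2.0 + 3.7 / 12.0,
}

def exit_rate(mean_sojourn_years):
    """Total rate of leaving a state, i.e. -q_ii, from its mean sojourn time."""
    return 1.0 / mean_sojourn_years

implied_rates = {s: exit_rate(y) for s, y in reported_years.items()}

for state, rate in implied_rates.items():
    print(f"{state}: mean stay {reported_years[state]:.2f} y, "
          f"implied exit rate {rate:.4f} per year")
```

The implied exit rate rises from state to state, which is the same pattern as the shrinking sojourn times described in the text.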

Conclusions:
Insulin resistance is a keystone in triggering all these abnormalities: the more sensitive the body's cells are to insulin, the less likely the complications of NAFLD are to develop. The effects of risk factors, or covariates, as mainstay players, such as increased insulin resistance, hyperlipidemia with increased LDL-cholesterol, and high systolic and diastolic blood pressure, are explained using the Poisson regression model combined with the CTMC. In the hypothetical model, for every unit increase in the transformed HOMA, the incidence rate ratio (IRR) for transition from state 0 to state 1 is increased by 5909.7%, and this elevation keeps rising with each forward move to the next state: for every unit increase in the transformed HOMA, the IRR for transition from state 1 to state 2 is increased by 24017.9%, for transition from state 2 to state 3 by 47931.8%, and for transition from state 3 to state 4 by 5237498.4%. This increase is almost always highly statistically significant. By comparison, for every unit increase in the transformed LDL, the IRR for transition from state 0 to state 1 is increased by 68.7%, for transition from state 1 to state 2 by 36.4%, and for transition from state 3 to state 4 by 57.1%; it is highly statistically significant only for the transition from state 3 to state 4. Systolic blood pressure, however, is almost highly statistically significant for the transition from state 2 to state 3: for every unit increase in the transformed systolic pressure, the IRR for this transition is increased by 1114.3%.
Moreover, for every unit decrease in the transformed HOMA, the IRR for transition from state 1 to state 0 is increased by 1.1%, from state 2 to state 1 by 3.7%, from state 3 to state 2 by 0.5%, from state 2 to state 0 by 6.6%, and from state 3 to state 1 by 8.4%. This emphasizes that better control of insulin resistance helps patients reverse their condition. To sum up, the precipitating factors should be rigorously treated and controlled by lifestyle modifications, namely avoiding high-calorie diets and a sedentary life; predisposed persons should therefore consume healthy diets and regularly practice physical exercise suited to their medical conditions. Newly discovered drugs, such as anti-fibrotic drugs that treat fibrotic changes in the liver, are promising and await further longitudinal studies to reveal the most effective protocol for administering them to patients for better control of the rate of progression of liver fibrosis. Such control protects the patient from loss of liver function and, subsequently, from end-stage liver disease, which necessitates liver transplantation with all its accompanying post-transplantation complications.

Hint (programs and supplementary materials):
The above example is published with the Stata data, the accompanying do-file, and the supplementary materials file on the Code Ocean site at the following URL: Codeocean.com/capsule/4752445/tree/v1