An Autoregressive Moving Average Model for Short-Term Prediction of Non-Insulin Dependent Diabetes Among Farmers in Benue State

Citation

Agada, J., Kuhe, D. A., & Anthony, O. N. (2026). An Autoregressive Moving Average Model for Short-Term Prediction of Non-Insulin Dependent Diabetes Among Farmers in Benue State. International Journal of Research, 13(4), 255–278. https://doi.org/10.26643/ijr/edupub/22

John Agada1, David Adugh Kuhe 2 and Ojochegbe Noah Anthony 3*

1Department of Mathematics and Computer Science, Rev, Fr. Moses Orshio Adasu University Makurdi, Benue State, Nigeria

2Department of Statistics, Joseph Sarwuan Tarka University, Makurdi, Benue State, Nigeria

3Department of Mathematics and Computer Science, Rev, Fr. Moses Orshio Adasu University Makurdi, Benue State, Nigeria

Corresponding Author: Email: davidkuhe@gmail.com; Tel: 2348064842229

ABSTRACT

This study employs an Autoregressive Moving Average (ARMA) time series model to forecast the short-term incidence of non-insulin-dependent diabetes mellitus (Type 2 Diabetes) among farmers in Benue State, Nigeria. The data were collected from the Benue State Epidemiological Unit, Makurdi, and cover the period from January 2005 to June 2025. Descriptive statistics and normality measures, the Augmented Dickey-Fuller (ADF) unit root test and the ARMA(p,q) model were the principal analytical techniques used to examine the data. The descriptive statistics indicated moderate variability in diabetes cases over the years, while the ADF test confirmed the stationarity of the series in level. Model choice based on the Akaike Information Criterion (AIC), Schwarz Information Criterion (SIC), and Hannan–Quinn Criterion (HQC) identified the ARMA(3,3) model as the best fit for forecasting diabetic cases in the study area. The model's high coefficient of determination (R² = 0.8905) and statistically significant parameters (p < 0.05) demonstrated its robustness and predictive accuracy. Diagnostic checks using autocorrelation, partial autocorrelation, and the Ljung–Box Q-statistics showed that the residuals behaved like white noise, indicating a well-specified model. Forecast evaluations using Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE) confirmed that the model predicts out-of-sample values accurately. The forecast for July 2025 to June 2027 revealed a potential average of approximately 6,420 diabetes cases per month among farmers, with expected fluctuations over time. The study underscored the growing public health concern of diabetes among the farming population in Benue State and its implications for agricultural productivity and postharvest losses.
The study concluded that predictive modeling can serve as a vital tool for health planners to design early intervention strategies, integrate health management with agricultural development, and enhance the overall well-being of rural farmers.

Keywords: Diabetes, ARIMA, Time Series Forecasting, Non-Insulin Dependent Diabetes, Farmers, Benue State, Public Health, Postharvest Losses

1.0       INTRODUCTION

Diabetes mellitus, often simply referred to as diabetes, is a group of metabolic disorders characterized by high blood sugar levels over a prolonged period. The two main types are type-1 diabetes, which results from the body's inability to produce insulin, and type-2 diabetes, which develops when the body either becomes resistant to insulin or produces insufficient insulin to control blood sugar levels effectively. Glucose is a crucial source of energy for cells, and insulin, a hormone produced by the pancreas, plays a central role in regulating its uptake into cells. In diabetes mellitus, this regulation is disrupted, leading to persistent hyperglycemia (high blood sugar) (American Diabetes Association, 2022).

Diabetes mellitus is a significant public health concern worldwide, and its prevalence has increased steadily over the past few decades. According to the International Diabetes Federation (IDF, 2019), an estimated 537 million adults aged 20-79 years were living with diabetes globally in 2021, and this number is projected to rise to 783 million by 2045. The prevalence of diabetes varies by region, with higher rates observed in low- and middle-income countries, particularly in urban areas undergoing rapid socioeconomic development and lifestyle changes (ADA, 2022).

In Nigeria, the prevalence is estimated at 7%, and at 11.35% in the South-South zone. The Diabetes Association of Nigeria (DAN) reported that the mortality rate of diabetes arising from inadequate management far outweighs that of HIV/AIDS, malaria and cancer (Olamoyegun et al., 2024).

Diabetes mellitus is significantly affecting farmers in Benue State, with a prevalence rate among the yam farming population estimated at 24.9% and a mortality rate of 8.61%. It has led to reduced labor productivity, economic losses and health complications (Teran, 2017).

Diabetes is associated with numerous complications that can affect nearly every organ system in the body. These complications include microvascular conditions such as retinopathy (vision loss), neuropathy (nerve damage) and nephropathy (kidney damage); macrovascular conditions such as cardiovascular disease (heart attack and stroke); and foot ulcers and amputations. The burden of diabetes-related complications is substantial, leading to increased medical costs, reduced quality of life, and higher risk of premature mortality (ADA, 2022).

Type-2 diabetes, also known as non-insulin dependent diabetes, is a long-term condition that affects how the body processes sugar (glucose), which is an important source of energy. In this condition, the body either becomes resistant to insulin, a hormone that helps move sugar into cells, or doesn’t produce enough insulin to keep blood sugar levels normal (Sun et al., 2021). Unlike type-1 diabetes, where the immune system attacks and destroys insulin-producing cells in the pancreas, type-2 diabetes usually develops slowly over time. While it was once mostly seen in adults, more children and teenagers are now being diagnosed, largely due to increasing obesity and less active lifestyles (Sun et al., 2021).

A major characteristic of type-2 diabetes is insulin resistance, which means the body’s cells don’t respond to insulin as they should. When this happens, the pancreas tries to make more insulin to help move sugar into the cells. However, over time, the pancreas may struggle to keep up with this increased demand. As a result, sugar starts to accumulate in the blood, causing high blood sugar levels (Cloete, 2022).

Several factors contribute to the risk of developing type-2 diabetes. These include obesity, particularly excess fat around the abdomen (central obesity), a sedentary lifestyle, unhealthy eating habits such as eating too many sugary and processed foods, a family history of diabetes, older age (especially after 45), and belonging to certain ethnic groups (ADA, 2022).

In addition to insulin resistance, type-2 diabetes can also involve problems with the pancreas, the organ that makes insulin. Sometimes, the pancreas doesn't produce enough insulin to keep blood sugar levels in check, making high blood sugar worse (Desai & Deshmukh, 2020).

Symptoms of type-2 diabetes often develop slowly and can include increased thirst, frequent urination, fatigue, blurred vision, slow wound healing, and repeated infections. In the early stages, some people may not notice any symptoms at all, which is why regular screenings are essential (IDF, 2019).

Treatment for type-2 diabetes aims to maintain blood sugar levels within a target range to prevent serious health problems and complications. This typically involves lifestyle modifications such as regular exercise, healthy eating habits (including portion control and selecting nutrient-rich foods), weight management, and monitoring blood sugar levels (Desai & Deshmukh, 2020).

The management and treatment of type-2 diabetes can impose financial burdens on individuals, families, and healthcare systems. In regions where healthcare costs are primarily borne by the individual or are not adequately covered by insurance, the expenses associated with diabetes care can divert resources away from agricultural investments and productivity-enhancing measures. This can directly affect agricultural communities through reduced investment in agricultural production, reduced income and crop losses, thereby affecting their livelihoods (Huang et al., 2016).

Diabetes Mellitus is diagnosed when certain blood sugar levels are met or exceeded. Specifically, a person may be diagnosed if their A1C is 6.5% or higher, which reflects average blood glucose over the past few months. Alternatively, if fasting blood sugar is 126 mg/dL or higher, or if a 2-hour blood sugar reading during an oral glucose tolerance test reaches 200 mg/dL or more, a diagnosis may be made. Additionally, if an individual has a random blood sugar of 200 mg/dL or higher along with symptoms like excessive thirst, frequent urination, or unexplained weight loss, they may also be diagnosed with diabetes (Jaeger et al., 2025).

Agricultural activities, like applying chemical fertilizers and pesticides, can have environmental consequences that indirectly affect diabetes risk factors. For instance, exposure to chemicals such as glyphosate or organophosphates used in farming has been associated with a higher likelihood of developing metabolic disorders. Additionally, environmental factors such as air pollution and climate change may exacerbate diabetes risk factors and health outcomes, potentially affecting agricultural productivity and crop yields (Whiting et al., 2011).

Overall, while the direct impact of type-2 diabetes on agricultural productivity and postharvest losses may be limited, the interplay between diabetes, dietary patterns, healthcare access, and environmental factors can have broader implications for agricultural communities and food systems. Addressing the complex relationship between health, agriculture, and the environment requires a holistic approach that considers socioeconomic factors, public health interventions, and sustainable agricultural practices (Whiting et al., 2011; Huang et al., 2016).

This study therefore attempts to extend the existing literature and contribute to the existing body of knowledge by modeling and forecasting non-insulin-dependent diabetes among farmers in Benue State using the autoregressive moving average (ARMA) time series model with more recent data.

2.0       MATERIALS AND METHODS

2.1       Method of Data Collection

The data used in this research are monthly secondary time series data on the morbidity incidence of type-2 diabetes in Benue State for the period January 2005 to June 2025, making a total of 246 observations. The data were collected from the Benue State Epidemiological Unit, Makurdi. The data were transformed to natural logarithms using the following formula:

$$Y_t = \ln(X_t)$$

where $X_t$ is the confirmed type-2 diabetes series observation indexed by time $t$, while $\ln$ is the natural logarithm. Henceforth $Y_t$ will be regarded as the series.

2.2 Methods of Data Analysis

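The statistical tools employed in the analysis of data in this work are described below.

2.2.1 Descriptive statistics and normality measures

The mean of any given set of data $Y_1, \ldots, Y_n$ can be computed as follows:

$$\bar{Y} = \frac{1}{n}\sum_{t=1}^{n} Y_t$$

The sample standard deviation of any given set of data over a given period of time is computed using the formula:

$$s = \sqrt{\frac{1}{n-1}\sum_{t=1}^{n}\left(Y_t - \bar{Y}\right)^2}$$

where $\bar{Y}$ is the sample mean and $n$ is the sample size.

The Jarque-Bera test is a normality test of whether a given sample has skewness and kurtosis similar to those of a normal distribution. The test was proposed by Jarque and Bera (1980, 1987) and tests the null hypothesis that the series is normally distributed. Given any data set, the test statistic JB is defined as:

$$JB = \frac{T}{6}\left[S^2 + \frac{(K-3)^2}{4}\right]$$

where $S$ is the sample skewness denoted as:

$$S = \frac{\frac{1}{T}\sum_{t=1}^{T}\left(Y_t - \bar{Y}\right)^3}{\hat{\sigma}^3}$$

and $K$ is the sample kurtosis given below:

$$K = \frac{\frac{1}{T}\sum_{t=1}^{T}\left(Y_t - \bar{Y}\right)^4}{\hat{\sigma}^4}$$

where $T$ is the total number of observations and $\hat{\sigma}^2 = \frac{1}{T}\sum_{t=1}^{T}(Y_t - \bar{Y})^2$. The JB normality test checks the following pair of hypotheses:

$H_0: S = 0$ and $K = 3$ (i.e., $Y_t$ follows a normal distribution)

$H_1: S \neq 0$ or $K \neq 3$ (i.e., $Y_t$ does not follow a normal distribution).

The test rejects the null hypothesis if the p-value of the JB test statistic is less than the $\alpha = 0.05$ level of significance.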

2.2.2 Augmented Dickey-Fuller (ADF) unit root test

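The Augmented Dickey-Fuller (ADF) test helps to identify whether a time series is stationary or has a unit root, which indicates a persistent stochastic trend over time (Dickey and Fuller, 1979). It accounts for higher-order correlation by assuming that the series follows an AR($p$) process and adding lagged differences of the series to the test regression to enhance the test's precision:

$$\Delta Y_t = \alpha Y_{t-1} + x_t'\delta + \beta_1 \Delta Y_{t-1} + \beta_2 \Delta Y_{t-2} + \cdots + \beta_p \Delta Y_{t-p} + \varepsilon_t$$

where $x_t$ are optional exogenous regressors which may consist of a constant, or a constant and trend, $\alpha$ and $\delta$ are parameters to be estimated, the $\beta_i$ are the coefficients of the lagged difference terms, and the $\varepsilon_t$ are assumed to be white noise. The null and alternative hypotheses are written as:

$$H_0: \alpha = 0 \quad \text{against} \quad H_1: \alpha < 0 \qquad (8)$$

and evaluated using the conventional $t$-ratio for $\alpha$:

$$t_{\alpha} = \frac{\hat{\alpha}}{se(\hat{\alpha})}$$

where $\hat{\alpha}$ is the estimate of $\alpha$ and $se(\hat{\alpha})$ is the coefficient standard error.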

2.2.3 Portmanteau test

A Portmanteau test, also called the Ljung-Box Q-statistic test, is used to determine whether there is any remaining serial correlation or autocorrelation in the residuals of a time series. The test checks the following pair of hypotheses:

$H_0: \rho_1 = \rho_2 = \cdots = \rho_h = 0$ (all lag correlations are zero)

$H_1: \rho_k \neq 0$ for some $k \le h$ (there is at least one lag with non-zero correlation). The test statistic is given by:

$$Q = T(T+2)\sum_{k=1}^{h}\frac{\hat{\rho}_k^2}{T-k}$$

where $\hat{\rho}_k$ denotes the sample autocorrelation of the residuals at lag $k$, $h$ is the number of lags tested and $T$ is the sample size. Under $H_0$, $Q$ is approximately chi-square distributed. We reject $H_0$ if the p-value is less than the $\alpha = 0.05$ level of significance (Ljung and Box, 1979).
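The Q-statistic can be illustrated with a minimal Python sketch. It assumes the standard Ljung-Box form Q = T(T+2) Σ ρ̂_k² / (T − k); the function name `ljung_box_q` and the input values are hypothetical, chosen only for illustration.

```python
def ljung_box_q(rho, T):
    """Ljung-Box Q-statistic from sample autocorrelations.

    rho : autocorrelations at lags 1..h (rho[0] is lag 1)
    T   : sample size
    """
    return T * (T + 2) * sum(r ** 2 / (T - k) for k, r in enumerate(rho, start=1))


# Hypothetical autocorrelations at lags 1 and 2 for a series of length 100.
q = ljung_box_q([0.1, -0.2], T=100)
print(round(q, 4))
```

The resulting value is compared with a chi-square distribution whose degrees of freedom equal the number of lags (less the number of estimated ARMA parameters when the test is applied to model residuals).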

2.3 Time Series Models Specification

To specify an ARIMA model, which is the model framework used in this study, we first specify the autoregressive (AR) model, the moving average (MA) model and the autoregressive moving average (ARMA) model before specifying the autoregressive integrated moving average (ARIMA) model. These models are specified as follows.

2.3.1 The autoregressive (AR) model

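A stochastic time series process $\{Y_t\}$ is an autoregressive process of order $p$, denoted AR($p$), if it satisfies the difference equation

$$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \cdots + \phi_p Y_{t-p} + \varepsilon_t$$

where $\varepsilon_t$ is a white noise and $\phi_1, \phi_2, \ldots, \phi_p$ are constants to be determined.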

2.3.2 Moving average (MA) model

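A time series $\{Y_t\}$ which satisfies the difference equation

$$Y_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \cdots + \theta_q \varepsilon_{t-q}$$

where $\theta_1, \theta_2, \ldots, \theta_q$ are fixed constants with $\varepsilon_t$ as white noise is called a moving average process of order $q$, denoted MA($q$).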

2.3.3 Autoregressive moving average (ARMA) model

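A stochastic time series process $\{Y_t\}$ which results from a linear combination of autoregressive and moving average processes is called an Autoregressive Moving Average (ARMA) process of order $p$, $q$, denoted ARMA($p$, $q$), if it satisfies the following difference equation:

$$Y_t = \phi_1 Y_{t-1} + \cdots + \phi_p Y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}$$

where $\phi_1, \ldots, \phi_p$ are fixed constants associated with the AR terms and $\theta_1, \ldots, \theta_q$ are fixed constants associated with the MA terms, with $\varepsilon_t$ being a white noise. The stationarity of an ARMA($p$, $q$) process is guaranteed if the roots of the polynomial

$$\phi(z) = 1 - \phi_1 z - \phi_2 z^2 - \cdots - \phi_p z^p$$

lie outside the unit circle.

For example, an ARMA(3,3) model, the order selected later in this study, is specified as:

$$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \phi_3 Y_{t-3} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \theta_3 \varepsilon_{t-3}$$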
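To make the ARMA recursion concrete, the following minimal Python sketch builds a series from a given sequence of innovations. It assumes the convention in which the MA terms enter with positive signs, Y_t = Σ φ_i Y_{t−i} + ε_t + Σ θ_j ε_{t−j}, and treats values before the start of the sample as zero. The function name `arma_path` and the inputs are hypothetical.

```python
def arma_path(eps, phi, theta):
    """Generate an ARMA(p, q) path from innovations eps.

    phi   : AR coefficients [phi_1, ..., phi_p]
    theta : MA coefficients [theta_1, ..., theta_q]
    Pre-sample values of the series and the innovations are taken as zero.
    """
    y = []
    for t, e in enumerate(eps):
        # AR part: weighted sum of previously generated values
        ar = sum(phi[i] * y[t - 1 - i] for i in range(len(phi)) if t - 1 - i >= 0)
        # MA part: weighted sum of past innovations
        ma = sum(theta[j] * eps[t - 1 - j] for j in range(len(theta)) if t - 1 - j >= 0)
        y.append(ar + e + ma)
    return y


# A single unit shock traced through an ARMA(1,1) with phi_1 = theta_1 = 0.5.
print(arma_path([1.0, 0.0, 0.0], phi=[0.5], theta=[0.5]))
```

Tracing a single shock in this way shows how the AR coefficients carry past values forward while the MA coefficients carry past shocks forward.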

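2.3.4 Autoregressive integrated moving average (ARIMA) model

Autoregressive (AR), Moving Average (MA) or Autoregressive Moving Average (ARMA) models in which differences have been taken are collectively called Autoregressive Integrated Moving Average or ARIMA models. A time series $\{Y_t\}$ is said to follow an integrated autoregressive moving average model if the $d$th difference $W_t = \nabla^d Y_t$ is a stationary ARMA process. If $\{W_t\}$ follows an ARMA($p$, $q$) model, we say that $\{Y_t\}$ is an ARIMA($p$, $d$, $q$) process. For practical purposes, we can usually take $d = 1$ or at most 2.

Consider then an ARIMA($p$, 1, $q$) process. With $W_t = Y_t - Y_{t-1}$, we have

$$W_t = \phi_1 W_{t-1} + \phi_2 W_{t-2} + \cdots + \phi_p W_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}$$

In terms of the observed series,

$$Y_t - Y_{t-1} = \phi_1 (Y_{t-1} - Y_{t-2}) + \cdots + \phi_p (Y_{t-p} - Y_{t-p-1}) + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}$$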

2.4 Model Order Selection

We use the following information criteria for model order selection in conjunction with the log likelihood function: the Akaike information criterion (AIC) due to Akaike (1978), the Schwarz information criterion (SIC) due to Schwarz (1978) and the Hannan-Quinn information criterion (HQC) due to Hannan (1980). The formulae for the information criteria, expressed per observation, are:

$$AIC = -\frac{2\ln L}{T} + \frac{2k}{T}$$

$$SIC = -\frac{2\ln L}{T} + \frac{k\ln T}{T}$$

$$HQC = -\frac{2\ln L}{T} + \frac{2k\ln(\ln T)}{T}$$

where $k$ is the number of free parameters to be estimated in the model, $T$ is the number of observations and $L$ is the maximized value of the likelihood function of the fitted model.

Thus, given a set of estimated ARMA models for a given set of data, the preferred model is the one with the minimum information criteria and maximum log likelihood.
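As a rough check on how such criteria are computed, the sketch below assumes the per-observation forms AIC = (−2 ln L + 2k)/T, SIC = (−2 ln L + k ln T)/T and HQC = (−2 ln L + 2k ln ln T)/T. It uses the log likelihood reported for ARMA(3,3) in Table 4 (−24.0103) together with two assumptions that the paper does not state: k = 7 parameters (constant, three AR and three MA terms) and T = 243 effective observations (246 less three lags). The function name is hypothetical.

```python
import math


def information_criteria(log_lik, k, T):
    """Per-observation AIC, SIC and HQC from a maximized log likelihood."""
    base = -2.0 * log_lik / T
    aic = base + 2.0 * k / T
    sic = base + k * math.log(T) / T
    hqc = base + 2.0 * k * math.log(math.log(T)) / T
    return aic, sic, hqc


# Assumed inputs: k = 7 and T = 243 are not stated in the paper.
aic, sic, hqc = information_criteria(log_lik=-24.0103, k=7, T=243)
print(round(aic, 4), round(sic, 4), round(hqc, 4))
```

Under these assumptions the AIC and HQC come out close to the tabulated 0.2552 and 0.2958. The SIC computed this way is about 0.356 rather than the tabulated 0.3159, so the assumed T and k may not match what the estimation software used for that criterion.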

2.5 Model Forecast Evaluation

We employed the Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) accuracy measures to select an optimal model that is both parsimonious and forecasts the data accurately, based on minimum values of the accuracy measures.

2.5.1 Root Mean Square Error (RMSE)

The Root Mean Square Error is a statistical tool for measuring the accuracy of a forecast method. It is computed as:

$$RMSE = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(\hat{Y}_t - Y_t\right)^2}$$

where $\hat{Y}_t$ is the forecast value of the series, $Y_t$ is the actual series and $n$ is the number of forecast observations.

2.5.2 Mean Absolute Error (MAE)

The mean absolute error (MAE) is a statistical tool for measuring the average size of the errors in a collection of predictions, without taking their directions into account. It is measured as the average absolute difference between the predicted values and the actual values and is used to assess the effectiveness of a model. It is given as:

$$MAE = \frac{1}{n}\sum_{t=1}^{n}\left|Y_t - \hat{Y}_t\right|$$

where $Y_t$ is the actual value of the series at time $t$, $\hat{Y}_t$ is the forecasted value of the series and $n$ is the number of observations. The lower the values of RMSE and MAE, the better the model is able to forecast future values.
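The two accuracy measures can be sketched in a few lines of Python. The function names and the sample values are hypothetical and serve only to illustrate the definitions of RMSE (root of the mean squared forecast error) and MAE (mean of the absolute forecast errors).

```python
import math


def rmse(actual, forecast):
    """Root mean square error between actual and forecast values."""
    n = len(actual)
    return math.sqrt(sum((f - a) ** 2 for a, f in zip(actual, forecast)) / n)


def mae(actual, forecast):
    """Mean absolute error between actual and forecast values."""
    n = len(actual)
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / n


# Hypothetical actual and forecast values (not taken from the study data).
actual = [1.0, 2.0, 3.0]
forecast = [1.0, 2.0, 5.0]
print(round(rmse(actual, forecast), 4), round(mae(actual, forecast), 4))
```

Because RMSE squares the errors before averaging, a single large miss raises it more than it raises MAE, which is why the two measures are usually reported together.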

3.0       RESULTS AND DISCUSSION

3.1 Summary Statistics and Normality Measures

This study seeks to provide a short-term prediction of non-insulin-dependent diabetes (Type-2 diabetes mellitus) among farmers in Benue State using the Autoregressive Moving Average (ARMA) time series model. Before model estimation, a preliminary analysis of the dataset was conducted to summarize its key characteristics and assess the normality of the distribution. Table 1 below presents the descriptive statistics and normality test results for the observed monthly diabetes cases.

Table 1: Summary Statistics and Normality Measures

Variable                  Statistic
Mean                      5571.321
Maximum                   9661.00
Minimum                   3624.000
Standard Deviation        1769.088
Skewness                  0.010212
Kurtosis                  1.767498
Jarque-Bera Statistic     15.57465
p-value                   0.000415
Number of Observations    246

From the summary statistics and normality measures reported in Table 1 above, the mean of approximately 5571 cases is the average monthly number of recorded non-insulin-dependent diabetes cases among farmers during the study period, while the maximum and minimum values (9661 and 3624, respectively) show the range of variation in the data. The standard deviation (1769) is about 32% of the mean, implying moderate variability in the monthly incidence of diabetes cases.

The skewness value (0.010212), being close to zero, indicates that the distribution of the series is approximately symmetric. However, the kurtosis value (1.767498) is less than 3, signifying a platykurtic distribution, that is, the data are relatively flatter than a normal distribution with lighter tails.

The Jarque–Bera statistic (15.57465) with an associated p-value of 0.000415 is statistically significant at the 1% level, leading to the rejection of the null hypothesis of normality. This implies that the series does not follow a perfectly normal distribution, which is a common characteristic of real-world time series data.

Overall, the results suggest that while the data are fairly symmetric, they depart from normality because of their low kurtosis, a factor to be considered when fitting and diagnosing the ARMA model for accurate short-term forecasting.
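The Jarque-Bera figures in Table 1 can be approximately reproduced from the reported skewness and kurtosis. The sketch below assumes the standard form JB = (T/6)[S² + (K − 3)²/4] and uses the fact that the upper-tail probability of a chi-square distribution with 2 degrees of freedom is exp(−x/2). The function name is hypothetical.

```python
import math


def jarque_bera(T, skewness, kurtosis):
    """Jarque-Bera statistic and its chi-square(2) p-value."""
    jb = T / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # For 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


# Inputs taken from Table 1: T = 246, skewness = 0.010212, kurtosis = 1.767498.
jb, p = jarque_bera(246, 0.010212, 1.767498)
print(f"JB = {jb:.4f}, p-value = {p:.6f}")
```

Almost all of the statistic comes from the kurtosis term, which is consistent with the reading that the departure from normality is driven by the flat (platykurtic) shape rather than by asymmetry.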

3.2 Graphical Examination of Diabetes Mellitus Series

Examining the morbidity cases of diabetes mellitus is essential for identifying trends and patterns over time, which can provide insights into the progression and fluctuations of the disease within a population. By analyzing these visual representations, healthcare providers and policymakers can better understand peak periods, seasonal variations, and the impact of interventions. This information is crucial for planning targeted healthcare responses, optimizing resource allocation, and developing strategies to reduce disease incidence and manage complications, ultimately improving health outcomes for affected populations. The time plots of the level and log transform series of diabetes mellitus are plotted in Figures 1 and 2 respectively as shown below.

The time plots of the level series and the log-transformed series reported in Figures 1 and 2 below suggest that both series are covariance (weakly) stationary, which points to the absence of a unit root in the series in level. This is suggested by the smooth, stable pattern of both series over time.

Figure 1: Time Series Plot of Diabetes Mellitus in Benue State from 2005 to 2025

Figure 2: Time Series Plot of Natural Log of Diabetes Mellitus in Benue State from 2005 to 2025

3.3 Augmented Dickey-Fuller (ADF) Unit Root Test Result

To ensure the appropriateness of applying an Autoregressive Moving Average (ARMA) model for short-term prediction of non–insulin-dependent diabetes cases among farmers in Benue State, it is necessary to examine the time series properties of the data. A key requirement for ARMA modeling is that the underlying series must be stationary. Therefore, the Augmented Dickey–Fuller (ADF) unit root test was conducted to determine whether the series $Y_t$ is stationary. Table 2 below presents the results of the ADF test under two specifications: with an intercept only, and with both intercept and trend.

The ADF statistics reported in Table 2 below for the two model specifications (intercept only, and intercept with trend) are -15.3344 and -15.4304, respectively. These values are far more negative than their corresponding 5% critical values (-2.8731 and -3.4283). In addition, the associated p-values are 0.0000, indicating strong statistical significance. Because the ADF test statistics are well below the critical values and the p-values are less than 0.05, the null hypothesis of a unit root is rejected under both model specifications. This confirms that the series is stationary in its level form. Stationarity implies that the mean and variance of the diabetes case series remain stable over time, making it suitable for direct ARMA modeling without differencing. The strong evidence of stationarity enhances the reliability of subsequent short-term forecasts produced by the ARMA model.

Table 2: Augmented Dickey-Fuller (ADF) Unit Root Test Result

Option               ADF Test Statistic    p-value    5% Critical Value
Intercept only       -15.3344              0.0000     -2.8731
Intercept & Trend    -15.4304              0.0000     -3.4283

3.4 Autocorrelations and Partial Autocorrelations Functions of the Series

After confirming that the series of non–insulin-dependent diabetes cases among farmers in Benue State is stationary, the next step in the ARMA modeling process involves examining the autocorrelation structure of the series. The Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) are used to identify the dependence pattern between current and past observations, which guides the selection of appropriate autoregressive (AR) and moving-average (MA) orders.

Furthermore, the Ljung-Box Q-statistics were computed to test for the joint significance of autocorrelations up to various lags. Applied to the series itself, this test determines whether the observations are serially uncorrelated. Table 3 below presents the ACF, PACF, and Ljung-Box Q-statistics results for the series, while Figure 3 below presents the ACF and PACF plots of the series.

The results of ACF and PACF reported in Table 3 below and Figure 3 show that the autocorrelation (ACF) and partial autocorrelation (PACF) coefficients for all lags are small in magnitude, fluctuating around zero. This indicates the absence of significant serial correlation in the data. None of the autocorrelations exceed the approximate 95% confidence bounds (±1.96/√T ≈ ±0.125 for a sample size of 246), suggesting that the time series behaves like a white-noise process.

The Ljung-Box Q-statistics and their corresponding p-values across all lags (p > 0.05) further confirm that there is no significant autocorrelation in the series. This means that the null hypothesis of no autocorrelation cannot be rejected at any lag, implying that the series is adequately described by a stationary stochastic process (Ljung & Box, 1979).
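The comparison of the sample autocorrelations with their 95% bounds can be sketched as follows. The code assumes the usual large-sample bound ±1.96/√T with T = 246 and uses the ACF values listed in Table 3. The variable names are hypothetical.

```python
import math

T = 246  # number of observations reported in Table 1

# ACF values at lags 1..36 as listed in Table 3
acf = [
    0.014, -0.019, 0.004, -0.049, 0.022, 0.037, 0.022, 0.017, -0.007,
    -0.110, -0.025, 0.078, -0.008, -0.017, 0.052, -0.035, -0.012, -0.088,
    -0.054, -0.092, -0.026, -0.115, 0.007, -0.053, -0.056, -0.047, 0.055,
    -0.011, 0.060, 0.056, 0.040, -0.001, -0.027, -0.109, -0.056, 0.066,
]

bound = 1.96 / math.sqrt(T)  # approximate 95% bound under white noise
outside = [lag for lag, r in enumerate(acf, start=1) if abs(r) > bound]

print(f"bound = {bound:.3f}, lags outside the bound: {outside}")
```

The largest autocorrelation in absolute value is 0.115 at lag 22, which lies inside the bound of about 0.125, so no lag is flagged.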

Table 3: Autocorrelations and Ljung-Box Q-Statistics Test Results

Lag    ACF       PACF      Q-Statistic    p-value
1      0.014     0.014     0.0458         0.831
2      -0.019    -0.019    0.1338         0.935
3      0.004     0.005     0.1380         0.987
4      -0.049    -0.050    0.7497         0.945
5      0.022     0.024     0.8747         0.972
6      0.037     0.034     1.2165         0.976
7      0.022     0.023     1.3420         0.987
8      0.017     0.015     1.4126         0.994
9      -0.007    -0.005    1.4260         0.998
10     -0.110    -0.107    4.5659         0.918
11     -0.025    -0.022    4.7227         0.944
12     0.078     0.075     6.2944         0.901
13     -0.008    -0.012    6.3115         0.934
14     -0.017    -0.027    6.3907         0.956
15     0.052     0.055     7.0970         0.955
16     -0.035    -0.022    7.4226         0.964
17     -0.012    -0.008    7.4599         0.977
18     -0.088    -0.093    9.5213         0.946
19     -0.054    -0.050    10.302         0.945
20     -0.092    -0.114    12.567         0.895
21     -0.026    -0.032    12.750         0.917
22     -0.115    -0.115    16.369         0.797
23     0.007     0.008     16.381         0.838
24     -0.053    -0.074    17.165         0.842
25     -0.056    -0.036    18.032         0.841
26     -0.047    -0.056    18.643         0.851
27     0.055     0.057     19.482         0.852
28     -0.011    -0.032    19.514         0.882
29     0.060     0.057     20.511         0.876
30     0.056     0.042     21.381         0.876
31     0.040     0.061     21.828         0.888
32     -0.001    -0.015    21.828         0.912
33     -0.027    -0.007    22.036         0.927
34     -0.109    -0.121    25.432         0.855
35     -0.056    -0.074    26.342         0.854
36     0.066     0.025     27.604         0.841

Figure 3: Plots of ACF and PACF of Log Transformed Series

Collectively, these findings suggest that the series is not driven by persistent temporal dependence, and any ARMA model fitted to the data should yield uncorrelated and well-behaved residuals. Therefore, the dataset is suitable for ARMA model identification and estimation, and the absence of significant autocorrelation validates the appropriateness of proceeding with short-term forecasting using the ARMA framework.

3.5 Model Order Selection

Following the establishment of stationarity and the absence of significant autocorrelation in the diabetes time series, various ARMA model orders were estimated to determine the most parsimonious and best-fitting specification for short-term prediction. Model selection was based on several statistical criteria, including the Log Likelihood (LogL), Akaike Information Criterion (AIC), Schwarz Information Criterion (SIC), and Hannan–Quinn Criterion (HQC). Generally, the preferred model is the one with the highest Log Likelihood and the lowest values of AIC, SIC, and HQC. Table 4 below presents the results of the model order selection process.

Among the twenty-four ARMA model specifications estimated, the ARMA(3,3) model exhibits the highest Log Likelihood value (-24.0103) and the lowest AIC (0.2552), SIC (0.3159), and HQC (0.2958) values. These results indicate that the ARMA(3,3) model provides the best balance between goodness-of-fit and parsimony.

Table 4: Model Order Selection using Log Likelihood and Information Criteria

S/n    Model          LogL        AIC       SIC       HQC
1.     ARMA(0,1)      -34.4597    0.2964    0.3349    0.3079
2.     ARMA(1,0)      -34.8194    0.3006    0.3391    0.3121
3.     ARMA(1,1)      -32.9444    0.2934    0.3363    0.3107
4.     ARMA(0,2)      -34.4107    0.3042    0.3469    0.3214
5.     ARMA(2,0)      -35.1256    0.3125    0.3555    0.3298
6.     ARMA(1,2)      -32.9256    0.3014    0.3586    0.3245
7.     ARMA(2,1)      -33.2988    0.3057    0.3631    0.3288
8.     ARMA(2,2)      -30.3771    0.2899    0.3616    0.3188
9.     ARMA(0,3)      -34.4060    0.3122    0.3692    0.3352
10.    ARMA(3,0)      -35.4688    0.3248    0.3823    0.3480
11.    ARMA(1,3)      -28.0912    0.2701    0.3616    0.3089
12.    ARMA(3,1)      -32.9028    0.3119    0.3838    0.3409
13.    ARMA(2,3)      -30.3708    0.2981    0.3841    0.3328
14.    ARMA(3,2)      -30.5304    0.3007    0.3859    0.3354
15.    ARMA(3,3)**    -24.0103    0.2552    0.3159    0.2958
16.    ARMA(0,4)      -34.1157    0.3180    0.3893    0.3467
17.    ARMA(4,0)      -35.3492    0.3335    0.4056    0.3625
18.    ARMA(1,4)      -34.4466    0.3302    0.4159    0.3647
19.    ARMA(4,1)      -35.3432    0.3417    0.4282    0.3765
20.    ARMA(2,4)      -32.0099    0.3198    0.4201    0.3602
21.    ARMA(4,2)      -26.7027    0.2785    0.3795    0.3192
22.    ARMA(3,4)      -25.4065    0.2799    0.3899    0.3213
23.    ARMA(4,3)      -33.4797    0.3428    0.4581    0.3893
24.    ARMA(4,4)      -31.4253    0.2962    0.4060    0.3285

** denotes the selected model.

Therefore, based on the information criteria, the ARMA(3,3) model is selected as the optimal model for forecasting short-term variations in non–insulin-dependent diabetes cases among farmers in Benue State. This suggests that both autoregressive and moving average components up to the third order significantly contribute to capturing the dynamic structure of the series.

3.6 Parameter Estimates of ARMA(3,3) Model

After selecting the ARMA(3,3) model as the optimal specification based on the information criteria, the model parameters were estimated to evaluate the dynamic relationship between past observations and random disturbances in the series of non–insulin-dependent diabetes cases among farmers in Benue State. Table 5 below presents the estimated coefficients of the ARMA(3,3) model, along with their corresponding standard errors, t-statistics, and p-values. Goodness-of-fit measures such as the R-squared, Adjusted R-squared, F-statistic, and Durbin–Watson statistic are also reported to assess the adequacy of the fitted model.

Table 5: Parameter Estimates of ARMA(3,3) Model

Variable    Coefficient    Std. Error    t-Statistic    p-value
C           8.768664       0.017218      509.2761       0.0000
AR(1)       0.366096       0.024641      14.85713       0.0000
AR(2)       0.311203       0.029382      10.59171       0.0000
AR(3)       -0.912359      0.024212      -37.68166      0.0000
MA(1)       -0.372828      0.009593      -38.86277      0.0000
MA(2)       -0.386923      0.009312      -41.55086      0.0000
MA(3)       0.982389       0.007644      128.5160       0.0000

R-squared        0.890511    AIC                    0.255229
Adjusted R2      0.867389    SIC                    0.315852
F-statistic      6.914400    HQC                    0.295759
Prob(F-stat.)    0.000951    Durbin-Watson stat.    2.011502

The model estimation results reported in Table 5 show that all autoregressive (AR) and moving average (MA) coefficients are statistically significant at the 1% level, as indicated by their very low p-values (p < 0.01). This implies that past values and past error terms up to the third lag significantly influence the current level of non–insulin-dependent diabetes cases among farmers.

Specifically, the positive coefficients of AR(1) and AR(2) suggest a direct persistence effect, meaning that increases in diabetes cases in the immediate past periods tend to raise current cases. Conversely, the negative AR(3) coefficient indicates a corrective mechanism, implying that after about three periods, the series tends to revert toward its mean. The MA(1) and MA(2) coefficients are negative while the MA(3) coefficient is positive, suggesting that short-term shocks are first dampened and then amplified at the third lag before dissipating.

The high R-squared (0.8905) and adjusted R-squared (0.8674) values indicate that approximately 89% of the variation in diabetes cases is explained by the model, signifying a very good fit. The F-statistic (6.9144) with a significant probability value (0.000951) confirms the overall significance of the model. The Durbin–Watson statistic (2.0115) is close to 2, suggesting the absence of serial correlation in the residuals, while the information criteria (AIC = 0.2552, SIC = 0.3159, HQC = 0.2958) reaffirm that the ARMA(3,3) model remains the most parsimonious and efficient choice.

Overall, the ARMA(3,3) model adequately captures the temporal dynamics and short-term fluctuations in non–insulin-dependent diabetes cases among farmers in Benue State, making it suitable for reliable short-term forecasting.
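How the estimated coefficients translate into a one-step-ahead forecast can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it assumes that C in Table 5 is the mean of the log series, that the MA terms enter with positive signs, and that forecasts are made on the log scale. The recent observations and residuals are hypothetical placeholders, since the underlying data are not reproduced in the paper.

```python
import math


def arma_one_step(y_hist, resid_hist, mu, phi, theta):
    """One-step-ahead ARMA forecast in mean-deviation form.

    y_hist, resid_hist : past values and residuals, ordered oldest to newest
    mu                 : series mean (assumed to be C in Table 5)
    phi, theta         : AR and MA coefficients, lag 1 first
    """
    ar = sum(p * (y_hist[-1 - i] - mu) for i, p in enumerate(phi))
    ma = sum(th * resid_hist[-1 - j] for j, th in enumerate(theta))
    return mu + ar + ma


# Coefficients from Table 5
MU = 8.768664
PHI = [0.366096, 0.311203, -0.912359]
THETA = [-0.372828, -0.386923, 0.982389]

# Hypothetical last three log observations and residuals (placeholders only).
y_hist = [8.70, 8.80, 8.75]
resid_hist = [0.01, -0.02, 0.03]

log_forecast = arma_one_step(y_hist, resid_hist, MU, PHI, THETA)
# exp() returns the forecast to the case scale without any bias correction.
print(log_forecast, math.exp(log_forecast))
```

Forecasts further ahead follow the same recursion, with future residuals set to zero and earlier forecasts used in place of unobserved values.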

4.7 Model Diagnostic Checks

Following the estimation of the ARMA(3,3) model for predicting non-insulin-dependent diabetes cases among farmers in Benue State, diagnostic checks, namely a multicollinearity test and Ljung-Box Q-statistic tests, were conducted to verify the adequacy of the fitted model. This assessment checks that the residuals behave like white noise, that is, that they are uncorrelated, homoscedastic, and pattern-free over time. The tests are presented in the following subsections.

4.7.1 Multicollinearity test result

Multicollinearity diagnostics were performed to check that the terms in the ARMA(3,3) model are not strongly correlated with one another. Using the Variance Inflation Factor (VIF) for each autoregressive (AR) and moving average (MA) term, the test assessed how multicollinearity might affect the stability and reliability of the parameter estimates. Generally, VIF values above 10 indicate severe multicollinearity, values between 5 and 10 suggest moderate correlation, and values below 5 imply no serious concern. The results presented in Table 6 show both uncentered and centered VIF statistics for the ARMA(3,3) model parameters.

The results of the multicollinearity test reported in Table 6 reveal that all centered VIF values are low, ranging between 1.11 and 2.55, which is far below the critical threshold of 10. This indicates that there is no serious multicollinearity among the explanatory variables (AR and MA terms) in the estimated ARMA(3,3) model.

Therefore, the estimated parameters are statistically reliable, and the standard errors are not inflated by multicollinearity. This implies that the ARMA(3,3) model is well-conditioned, and the coefficients can be interpreted with confidence.

Table 6: Test for Multicollinearity (Variance Inflation Factors)

Variable   Coefficient Variance   Uncentered VIF   Centered VIF
C          0.000296               1.018813         NA
AR(1)      0.000607               1.779456         1.779044
AR(2)      0.000863               2.552345         2.552344
AR(3)      0.000586               1.768375         1.768101
MA(1)      9.20E-05               1.257613         1.255458
MA(2)      8.67E-05               1.213557         1.203709
MA(3)      5.84E-05               1.121942         1.111356
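As a rough illustration of how the rule of thumb quoted above applies to the centered VIFs in Table 6, the sketch below classifies each term and inverts the textbook relation VIF = 1/(1 − R²) to show the implied auxiliary R². That relation is the ordinary regression definition and is used here only as an interpretive assumption; the software that produced Table 6 may compute coefficient VIFs differently. The function names are illustrative.

```python
# Centered VIFs copied from Table 6.
centered_vif = {
    "AR(1)": 1.779044,
    "AR(2)": 2.552344,
    "AR(3)": 1.768101,
    "MA(1)": 1.255458,
    "MA(2)": 1.203709,
    "MA(3)": 1.111356,
}


def implied_r_squared(vif):
    """Invert VIF = 1 / (1 - R^2) to get the implied auxiliary R^2
    (textbook definition, assumed here for interpretation only)."""
    return 1.0 - 1.0 / vif


def classify_vif(vif):
    """Apply the thresholds quoted in the text: >10, 5 to 10, <5."""
    if vif > 10:
        return "severe"
    if vif >= 5:
        return "moderate"
    return "no serious concern"


for term, vif in centered_vif.items():
    print(f"{term}: VIF={vif:.3f}, implied R^2={implied_r_squared(vif):.3f}, "
          f"{classify_vif(vif)}")
```

Even the largest value, for AR(2), corresponds to an implied R² of only about 0.61, well short of the 0.80 that a VIF of 5 would imply.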

4.7.2 Ljung-Box Q-statistic test result for serial correlation

The Autocorrelation Function (ACF), Partial Autocorrelation Function (PACF), and Ljung-Box Q-statistics were used to test for serial correlation. High p-values (greater than 0.05) for the Q-statistics indicate no significant autocorrelation, suggesting that the residuals are random and the model is well specified. Table 7 presents these diagnostic test results for the ARMA(3,3) model residuals.

The Q-statistics reported in Table 7 and the ACF and PACF plots in Figure 4 show that all residual autocorrelations and partial autocorrelations are very small and fluctuate closely around zero across all 36 lags. None of the autocorrelation coefficients appears significant, suggesting that the residuals from the ARMA(3,3) model are approximately white noise.

Furthermore, the Ljung–Box Q-statistics have p-values consistently greater than 0.05, indicating that the null hypothesis of no autocorrelation cannot be rejected at any lag. This confirms that there is no statistically significant serial correlation remaining in the residuals. In addition, the Durbin–Watson statistic from the model estimation (2.0115) supports this conclusion by indicating near-zero autocorrelation in the residuals.

Overall, these diagnostic results confirm that the ARMA(3,3) model is well specified, the residuals are independently and randomly distributed, and the model provides a statistically adequate fit to the data. Therefore, the model is suitable for reliable short-term forecasting of non-insulin-dependent diabetes cases among farmers in Benue State.

Table 7: Autocorrelations and Ljung-Box Q-Statistic Test Results of Residuals

Lag   ACF      PACF     Q-Statistic   p-value
1     -0.024   -0.024   0.1415        0.707
2     -0.012   -0.012   0.1760        0.916
3     -0.069   -0.070   1.3558        0.716
4      0.007    0.003   1.3669        0.850
5     -0.126   -0.128   5.3247        0.378
6     -0.036   -0.048   5.6541        0.463
7     -0.017   -0.024   5.7294        0.572
8      0.142    0.124   10.812        0.213
9     -0.042   -0.042   11.254        0.259
10     0.046    0.032   11.802        0.299
11    -0.021   -0.015   11.918        0.370
12     0.052    0.044   12.628        0.397
13    -0.025    0.012   12.794        0.464
14    -0.009   -0.008   12.815        0.541
15     0.062    0.080   13.804        0.540
16     0.068    0.053   15.019        0.523
17     0.112    0.147   18.316        0.369
18     0.109    0.127   21.475        0.256
19    -0.008    0.027   21.493        0.310
20    -0.087   -0.066   23.529        0.264
21    -0.066   -0.032   24.707        0.260
22    -0.020    0.010   24.810        0.306
23    -0.062   -0.057   25.855        0.308
24    -0.048   -0.064   26.480        0.329
25     0.021   -0.044   26.599        0.376
26     0.020   -0.037   26.704        0.425
27    -0.033   -0.069   27.003        0.464
28     0.065    0.050   28.156        0.456
29     0.052    0.030   28.898        0.470
30     0.062    0.046   29.969        0.467
31     0.014    0.040   30.023        0.516
32     0.010    0.016   30.053        0.565
33     0.042    0.050   30.555        0.589
34     0.003    0.004   30.558        0.637
35    -0.039   -0.013   30.994        0.662
36    -0.008   -0.001   31.014        0.705

Figure 4: Plot of Correlogram of Residuals of Estimated ARMA(3,3) Model
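The Q-statistics in Table 7 follow the Ljung-Box formula Q(m) = n(n + 2) Σ r_k² / (n − k), summed over lags k = 1 to m. The sketch below recomputes the first few values from the rounded ACF column. The sample size n = 243 is an assumption (246 monthly observations from January 2005 to June 2025, less three observations lost to the AR lags), and because the ACF values are rounded to three decimals the results only approximate the table. The two p-value helpers use closed forms of the chi-square survival function that hold for 1 and 2 degrees of freedom only; all function names are illustrative.

```python
import math


def ljung_box_q(acf, n):
    """Cumulative Ljung-Box Q-statistics for lags 1..len(acf)."""
    q_values = []
    running = 0.0
    for k, r in enumerate(acf, start=1):
        running += r * r / (n - k)
        q_values.append(n * (n + 2) * running)
    return q_values


def chi2_sf_df1(x):
    """Chi-square survival function for 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def chi2_sf_df2(x):
    """Chi-square survival function for 2 degrees of freedom."""
    return math.exp(-x / 2.0)


# Rounded residual autocorrelations for lags 1 to 5, copied from Table 7.
acf_first_lags = [-0.024, -0.012, -0.069, 0.007, -0.126]
n_obs = 243  # assumption: 246 observations minus 3 AR lags

for lag, q in enumerate(ljung_box_q(acf_first_lags, n_obs), start=1):
    print(f"lag {lag}: Q = {q:.4f}")

# p-values for the Q-statistics reported at lags 1 and 2 in Table 7.
print(round(chi2_sf_df1(0.1415), 3), round(chi2_sf_df2(0.1760), 3))
```

The p-values at lags 1 and 2 match Table 7 when the degrees of freedom equal the lag, so the reported p-values appear not to subtract the number of estimated ARMA terms. Under the stricter convention that does subtract them, p-values would only be defined from lag 7 onward.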

4.8 Forecast and Forecast Evaluation

To evaluate the predictive performance of the ARMA(3,3) model in forecasting non–insulin-dependent diabetes cases among farmers in Benue State, forecast accuracy measures were computed. The Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE) were used to assess both in-sample and out-of-sample forecast accuracy. Lower values of these statistics indicate better model performance and predictive reliability. The result is presented in Table 8.

The forecast comparison reported in Table 8 shows that the out-of-sample forecast achieved slightly lower RMSE (0.2671), MAE (0.2310), and MAPE (2.6490) values than the in-sample forecast (RMSE = 0.2715, MAE = 0.2446, MAPE = 2.6781). This suggests that the ARMA(3,3) model demonstrates strong predictive capability, with minimal forecast error and good generalization performance. The out-of-sample forecast mode, which the accuracy measures favour (marked ** in Table 8), provides reliable short-term predictions of non-insulin-dependent diabetes cases.

Table 8: Forecast Comparison using Accuracy Measures

                  RMSE       MAE        MAPE
In-Sample         0.271510   0.244615   2.678116
Out-of-Sample**   0.267100   0.231048   2.649005

Note: ** denotes forecast mode selected by accuracy measures.
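For reference, the three accuracy measures can be computed as in the minimal sketch below. The input values are hypothetical and serve only to show the arithmetic; the function name is illustrative. MAPE is expressed in percent and is undefined when any actual value is zero.

```python
import math


def forecast_accuracy(actual, predicted):
    """Return RMSE, MAE and MAPE (in percent) for paired observations."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    # MAPE divides by the actual values, so none of them may be zero.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"RMSE": rmse, "MAE": mae, "MAPE": mape}


# Hypothetical log-scale values, for illustration only (not the study's data).
actual_log = [8.70, 8.82, 8.65, 8.91]
predicted_log = [8.75, 8.60, 8.80, 8.70]
print(forecast_accuracy(actual_log, predicted_log))
```

The magnitudes in Table 8 (an MAE of about 0.23 alongside a MAPE of about 2.6%) are consistent with errors measured on the natural-log scale, where the series is close to 8.8, rather than on the scale of case counts.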

4.8.1 Forecast of Diabetes Mellitus in Benue State from July 2025 to June 2027

To evaluate the short-term predictive performance of the ARMA(3,3) model, forecasts of non-insulin-dependent diabetes (Type-2 Diabetes Mellitus) cases among farmers in Benue State were generated for the period July 2025 to June 2027. The forecasts were computed in natural logarithmic form and then back-transformed by exponentiation to the number of persons. For each forecast, the standard error, lower confidence limit (LCL), and upper confidence limit (UCL) were calculated at a 95% confidence level, using the standard normal critical value z = 1.96. These limits give a range within which the true number of diabetes cases is expected to fall with high probability, and so indicate the uncertainty of the forecasts. The forecast results are reported in Table 9 and the forecast graph is presented in Figure 5.

Table 9: Forecast of Diabetes Mellitus Cases in Benue State from July 2025 to June 2027

              Forecast (natural log form)   Actual Forecast (No. of Persons)
Year:Month    Forecast      Std. error      LCL     Forecast    UCL
2025:06       6.99678896
2025:07       8.77405       0.271243        3799    6464        11000
2025:08       8.72655       0.271669        3619    6165        10499
2025:09       8.78204       0.271670        3826    6516        11098
2025:10       8.77132       0.272065        3782    6447        10988
2025:11       8.80141       0.272672        3893    6644        11337
2025:12       8.74519       0.272672        3680    6281        10717
2026:01       8.76088       0.272790        3738    6380        10889
2026:02       8.74585       0.273455        3677    6285        10741
2026:03       8.79725       0.273466        3871    6616        11308
2026:04       8.77366       0.273476        3781    6462        11044
2026:05       8.77825       0.274040        3794    6492        11107
2026:06       8.73648       0.274110        3638    6226        10654
2026:07       8.76803       0.274114        3755    6426        10996
2026:08       8.76810       0.274473        3752    6426        11005
2026:09       8.79729       0.274652        3862    6616        11335
2026:10       8.76026       0.274669        3722    6376        10923
2026:11       8.76113       0.274824        3724    6381        10936
2026:12       8.74504       0.275111        3662    6279        10767
2027:01       8.78341       0.275121        3805    6525        11188
2027:02       8.77734       0.275152        3782    6486        11121
2027:03       8.78223       0.275481        3798    6517        11183
2027:04       8.74716       0.275481        3667    6293        10798
2027:05       8.76058       0.275481        3717    6378        10944
2027:06       8.76313       0.275759        3724    6394        10978
Total         210.40663                             154075
Average       8.766942917                           6419.7917

Note: For 95% confidence intervals, z = 1.96. LCL and UCL denote the lower and upper confidence limits, respectively.
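The conversion from the log-scale columns of Table 9 to the case-count columns can be sketched as follows. This assumes the limits are formed on the log scale as forecast ± 1.96 × standard error and then exponentiated, an assumption that agrees with the tabulated values for the rows checked. The function name is illustrative.

```python
import math

Z_95 = 1.96  # standard normal critical value for a 95% interval


def back_transform(log_forecast, std_error, z=Z_95):
    """Convert a log-scale forecast and its standard error into a
    point forecast and confidence limits on the original scale."""
    lower = math.exp(log_forecast - z * std_error)
    point = math.exp(log_forecast)
    upper = math.exp(log_forecast + z * std_error)
    return lower, point, upper


# July 2025 row of Table 9: log forecast 8.77405, standard error 0.271243.
lcl, point, ucl = back_transform(8.77405, 0.271243)
# Table 9 reports 3799, 6464 and 11000 for this row.
print(round(lcl), round(point), round(ucl))
```

Because the interval is symmetric on the log scale, it is asymmetric in case counts: the upper limit lies further from the point forecast than the lower limit does. The exponentiated point forecast is also best read as a median rather than a mean of the forecast distribution.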

Figure 5: Forecast Graph of Diabetes Mellitus in Benue State from July 2025 to June 2027

The forecast results reported in Table 9 and Figure 5 reveal that the predicted number of non-insulin-dependent diabetes cases among farmers in Benue State is expected to fluctuate moderately over the two-year forecast horizon (July 2025 to June 2027). The monthly point forecasts range from 6,165 to 6,644 cases, while the 95% confidence limits span approximately 3,600 to 11,300 cases. The overall average is about 6,420 cases per month, giving a total forecast of 154,075 cases over the forecast period. The forecast standard errors change very little across months (from 0.2712 to 0.2758 on the log scale), which indicates that forecast uncertainty remains stable over the horizon.

Overall, the ARMA(3,3) model demonstrates strong forecasting capability, indicating that diabetes prevalence among farmers in Benue State is likely to remain fairly stable with mild month-to-month variations over the forecast period.

4.9 Implications of the Study to Farmers and Postharvest Losses in Benue State

The implications of this study for farmers and postharvest losses in Benue State are significant from both public health and socio-economic perspectives. The findings, which forecast the incidence of non-insulin-dependent diabetes (Type-2 Diabetes Mellitus) among farmers, suggest that a substantial portion of the agricultural workforce may experience declining health and productivity over time. Poor health conditions such as diabetes can reduce farmers' physical capacity to engage in strenuous agricultural activities, particularly during critical periods like harvesting and processing. This in turn increases the likelihood of postharvest losses, as crops may remain unharvested or inadequately stored due to reduced labour efficiency and absenteeism resulting from illness.

Moreover, a higher number of diabetes cases among farmers implies increased medical expenditures and a diversion of household income away from agricultural investment, further compounding the problem of low productivity and waste. The study underscores the urgent need for integrated health and agricultural policies (including improved rural healthcare services, regular medical screening, health education on diet and lifestyle, and the promotion of labour-saving technologies) to mitigate the dual burden of disease and postharvest losses. Ultimately, addressing the health challenges of farmers is crucial for achieving food security, sustaining agricultural livelihoods, and enhancing overall economic resilience in Benue State.

5.0 Conclusion

The study demonstrates that the ARMA(3,3) model effectively forecasts the incidence of non-insulin-dependent diabetes among farmers in Benue State, Nigeria. The analysis revealed that the ARMA(3,3) model provided the best fit based on information criteria and diagnostic tests, with residuals behaving like white noise, indicating a well-specified and reliable model. The forecasts from July 2025 to June 2027 suggest a steady and relatively high incidence of diabetes cases among farmers, implying that the disease poses an ongoing public health concern within the agricultural population. This condition could adversely affect farmers' productivity, increase medical costs, and indirectly contribute to higher postharvest losses due to reduced labour availability and inefficiencies in farm management. These findings highlight the interconnectedness between health and agricultural output, emphasizing that the burden of chronic diseases like diabetes extends beyond healthcare into the realm of food security and economic stability. Therefore, proactive health interventions and policy integration between the health and agricultural sectors are vital. Ensuring farmers' wellness through preventive care, early detection, and education can significantly reduce the impact of diabetes and its broader economic consequences. The study provides empirical evidence to guide policymakers, healthcare providers, and agricultural development agencies in formulating context-specific strategies to improve both health outcomes and agricultural sustainability in Benue State.

REFERENCES

Al Zahrani, S., Al Rahman Al Sameeh, F., Musa, A. C. M., &Shokeralla, A. A. A. (2020).            Forecasting diabetes patients attendance at Al-Baha hospitals using autoregressive        fractional integrated moving average (ARFIMA) models. Journal of Data Analysis and              Information Processing, 8, 183-194.

American College of Obstetricians and Gynecologists. (2018). ACOG Practice Bulletin No. 190: Gestational Diabetes Mellitus. Obstetrics & Gynecology, 131(2), e49–e64.

American Diabetes Association. (ADA 2022). Diagnosis and Classification of Diabetes Mellitus. Diabetes Care, 45(1), S17-S38.

Atkinson, M. A., Eisenbarth, G. S., & Michels, A. W. (2014). Type 1 diabetes. The Lancet, 383(9911), 69–82.

Benue State Epidemiological Unit, Makurdi, Nigeria.
(Unpublished secondary data on type-2 diabetes incidence, 2005–2025).

Box, G. E. P., Jenkins, G. M., & Reinsel, G. C. (2015). Time series analysis: forecasting and control. John Wiley & Sons.

Carlos M. Jarque, C. M., &Anil K. Bera, A. K. (1980).Efficient tests for normality,homoscedasticity and serial independence of regression residuals. Economics Letters, 6(3), 255–259.

Carlos M. Jarque, C. M., & Anil K. Bera, A. K. (1987).A test for normality of observations and regression residuals. International Statistical Review, 55(2), 163–172

Cloete, L. (2022). Diabetes mellitus: an overview of the types, symptoms, complications and management. Nursing Standard, 37(1), 61-66.

David A.Dickey, D. A., &Wayne A. Fuller, W. A. (1979).Distribution of the estimators for autoregressive time series with a unit root.Journal of the American StatisticalAssociation, 74(366), 427–431.

Deberneh, H. M. & Kim, I. (2021). Prediction of type 2 diabetes based on machine learning algorithm. International Journal of Environmental Research and Public Health, 18, 3317-3329.

Desai, S. & Deshmukh, A. (2020). Mapping of type-1 diabetes mellitus. Current Diabetes Reviews16(5), 438-441.

Dickey, D. A., & Fuller, W. A. (1979). Distribution of the Estimators for Autoregressive Time Series with a Unit Root. Journal of the American Statistical Association, 74(366), 427–431. https://doi.org/10.2307/2286348

Diogo, M. V., Nunopombo, F., & Brandão, P. (2022).Hypoglycemia prediction models with auto explanation. IEEE Access, 10, 57930-57941.

Donath, M. Y., & Shoelson, S. E. (2011). Type-2 diabetes as an inflammatory disease. Nature Reviews Immunology, 11(2), 98–107.

Edward J. Hannan, E. J., &Barry G. Quinn, B. G. (1979).The determination of the order of an autoregression.Journal of the Royal Statistical Society: Series B, 41(2), 190–195

George Casella, G., &Roger L. Berger, R. L. (2002).Statistical Inference (2nd ed.). Duxbury.

George E. P. Box, G. E. P., Gwilym M. Jenkins, G. M., &Gregory C. Reinsel, G. C. (2015).
Time Series Analysis: Forecasting and Control (5th ed.). Wiley

Greta M. Ljung, G. M., &George E. P. Box, G. E. P. (1978).On a measure of lack of fit in time series models.Biometrika, 65(2), 297–303

Hirotugu Akaike, H. (1974).A new look at the statistical model identification.IEEE Transactions on Automatic Control, 19(6), 716–723.

Huang, Y., Vemer, P., Zhu, J., & Postma, M. J. (2016). The economic burden of diabetes mellitus in rural southwest China. International Journal of Environmental Research and Public Health, 13(9), 875-889.

International Diabetes Federation. (IDF, 2019). IDF Diabetes Atlas, 9th Edition. Brussels, Belgium: International Diabetes Federation.https://www.diabetesatlas.org/en/

Jaeger, B., Casanova, R., Demesie, Y., Stafford, J., Wells, B., & Bancks, M. P. (2025). Development and Validation of a Diabetes Risk Prediction Model With Individualized Preventive Intervention Effects. The Journal of Clinical Endocrinology and Metabolism, 110(12), e4023–e4029. https://doi.org/10.1210/clinem/dgaf250

Kahn, S. E., Cooper, M. E., & Del Prato, S. (2014). Pathophysiology and treatment of type-2 diabetes: perspectives on the past, present, and future. The Lancet, 383(9922), 1068–1083.

Katsarou, D. N., Georga, E. I., Christou, M., Tigas, S., Papaloukas, C., & Fotiadis, D. I. (2022). Short term glucose prediction in patients with type-1 diabetes mellitus. Annual International Conference of IEEE Engineering, Medical & Biological Society, 2022, 329-332.

Ljung, G. M., & Box, G. E. P. (1979). The Likelihood Function of Stationary Autoregressive-Moving Average Models. Biometrika, 66(2), 265. https://doi.org/10.2307/2335657

Ma, N., Zhao, Y., Wen, S., Yang, T., Wu, R., Tao, R., Yu, X., & Li, H. (2020). Online blood glucose prediction using autoregressive moving average model with residual compensation network. Journal of science, 12(2), 115-128.

Matthew, P. K., Timothy, K. N., Ajia, R., & Antyev, S. (2022).Time series modelling of diabetes disease in Taraba state, Nigeria. Science World Journal, 17(3), 406-412.

Olamoyegun, M. A., Alare, K., Afolabi, S. A., Aderinto, N., & Adeyemi, T. (2024). A systematic      review and meta-analysis of the prevalence and risk factors of type 2 diabetes mellitus in Nigeria. Clinical Diabetes and Endocrinology, 10(1). https://doi.org/10.1186/s40842-024-00209-        1

Olivares-Vera, D. A., Gutiérrez-Hernández, D. A., Escobar-Acevedo, M. A., Lara-Rendón, C., & Velázquez-Vázquez, D. A. (2021). Comparison of algorithms for the prediction of glucose levels in patients with diabetes. Nova Scientia, 13(2), 1-19.

Powers, A. C., D’Alessio, D., & Endocrine Society. (2016). Diabetes Mellitus: Diagnosis, Classification, and Pathophysiology. In Endotext. MDText.com, Inc.

Rob J. Hyndman, R. J., & George Athanasopoulos, G. (2021).Forecasting: Principles and Practice (3rd ed.). OTexts.Available online: https://otexts.com/fpp3/

Robertson, R. P. (2004). Chronic oxidative stress as a central mechanism for glucose toxicity in pancreatic islet beta cells in diabetes. Journal of Biological Chemistry, 279(41), 42351-42354.

Rodríguez-Rodríguez, I., Chatzigiannakis, L., Rodríguez, J., Maranghi, M., Gentili, M., & Zamora-Izquierdo, M. (2019). Utility of big data in predicting short-term blood glucose levels in type 1 diabetes mellitus through machine learning techniques. Sensors, 19, 4482-4498.

Schwarz, G. (1978c). Estimating the Dimension of a Model. The Annals of Statistics, 6(2), 461–464. https://doi.org/10.1214/aos/1176344136

Sheldon M. Ross, S. M. (2014).Introduction to Probability and Statistics for Engineers and Scientists (5th ed.). Academic Press.

Singye, T. &Unhapipat, S. (2018). Time series analysis of diabetes patients: A case study of Jigme Dorji Wangchuk National Referral Hospital in Bhutan. Journal of physics: conference series, 1039, 1-11.

Smith, S. M., Boppana, A., Traupman, J. A., Unson, E., Maddock, D. A., Chao, K., Dobesh, D. P., Brufsky, A., & Connor, R. I. (2021). Impaired glucose metabolism in patients with diabetes, prediabetes, and obesity is associated with severe COVID-19. Journal of Medical Virology, 93(1), 409-415.

Spyros Makridakis, S., Steven C. Wheelwright, S. C., & Rob J. Hyndman, R. J. (1998).Forecasting: Methods and Applications (3rd ed.). Wiley

Sun, Y., Tao, Q., Wu, X., Zhang, L., Liu, Q., & Wang, L. (2021). The utility of exosomes in diagnosis and therapy of diabetes mellitus and associated complications. Frontiers in Endocrinology (Lausanne), 12, 75-88.

Teran, A. D. (2017). Effects of diabetic prevalence and mortality on households farm labour        productivity in Benue State. IOSR Journal of Agriculture and Veterinary Science, 10(7),    63-72.

Villani, M., Nanayakkara, N., Ranasinha, S., Earnest, A., Smith, K., Soldatos, G., Teede, H. &Zoungas, S. (2017). Utilisation of prehospital emergency medical services for hyperglycemia: a community-based observational study. PLoS ONE, 12, e0182413.

Wang, J., Zhang, T., Lu, X., Zhang, H., Dong, Y., & Chen, X. (2019). Application of ARIMA model in forecasting diabetes mellitus mortality in China from 2019 to 2023. Chinese Journal of Preventive Medicine, 53(11), 1121–1125.

Whiting, D. R., Guariguata, L., Weil, C., & Shaw, J. (2011). IDF diabetes atlas: global estimates of the prevalence of diabetes for 2011 and 2030. Diabetes Research and Clinical Practice, 94(3), 311–321.

Zhu, D., Zhou, D., Li, N., & Han, B. (2022). Predicting Diabetes and Estimating Its Economic Burden in China Using Autoregressive Integrated Moving Average Model. International Journal of Public Health, 66, 1604449.

Zhu, H., Capistrant, B. D., & Peng, Y. (2017). Investigating the impact of socioeconomic factors, air quality, and built environment on diabetes mellitus in China. Journal of Environmental and Public Health, 2017, 1–12.
