ISSN : 2005-0461(Print)
ISSN : 2287-7975(Online)
Journal of Society of Korea Industrial and Systems Engineering Vol.48 No.2 pp.35-44
DOI : https://doi.org/10.11627/jksie.2025.48.2.035

Air Passenger Demand Forecasting at Singapore's Changi Airport

Geun-Cheol Lee*, Heejung Lee**, Hoon-Young Koo***
*College of Business, Konkuk University
**School of Interdisciplinary Industrial Studies, Hanyang University
***School of Business, Chungnam National University
Corresponding Author : koohy@cnu.ac.kr
25/03/2025 21/04/2025 21/04/2025

Abstract


The COVID-19 pandemic has caused significant disruptions in global air travel demand, presenting new challenges for accurately forecasting passenger volumes. This study analyzes the monthly air passenger demand data from 2010 to 2022 to identify key external factors that influence passenger demand. Our analysis shows that the number of international visitors to Singapore is a critical determinant of passenger demand. Consequently, we propose a SARIMAX (Seasonal AutoRegressive Integrated Moving Average with eXogenous variables) model to forecast monthly air passenger demand at Singapore's Changi Airport, integrating international visitor numbers as an exogenous variable. Through comprehensive model identification and parameter estimation, we select the best SARIMAX configuration. To validate the performance of the model, traditional time series methods such as SARIMA, various exponential smoothing methods, and advanced machine learning methods like LSTM (Long Short-Term Memory) and Prophet were compared for forecasting monthly air passenger demand at Changi Airport in 2023. The results show that the SARIMAX model significantly outperforms all other tested models, achieving the best performance across multiple forecasting metrics, including the Mean Absolute Percentage Error.





    1. Introduction

    The COVID-19 pandemic in 2020 caused a drastic decline in global air travel demand, bringing the industry to its lowest point in recent history. However, as the situation transitions into an endemic phase, air passenger demand is gradually increasing, approaching pre-pandemic levels. Before the pandemic, air travel demand had been on a steady upward trajectory, driven by increasing globalization, until 2019. Projections had even forecasted that by 2037, the number of air passengers would reach 8.2 billion [7].

    Air passenger demand forecasting is a critical component of effective airport management as well as strategic planning. As the aviation industry continues to grow, accurately predicting passenger demand becomes increasingly important for ensuring operational efficiency and long-term sustainability. Accurate forecasts enable airport authorities to optimize resource allocation, manage infrastructure development, and enhance service delivery, all of which are essential for maintaining competitiveness in a rapidly changing global market [12].

    Air passenger demand forecasting can be categorized based on the time horizon into short-term, medium-term, and long-term forecasts, each serving distinct purposes within the aviation industry. Long-term forecasts, spanning several years to decades, are essential for strategic planning, such as airport infrastructure development, fleet acquisition, and long-term investment decisions. Medium-term forecasts, which cover a period of several months to a few years, are used for tactical planning, including route development, marketing strategies, and capacity adjustments. Short-term forecasts, typically ranging from a few days to several months, are crucial for operational planning, such as scheduling flights, managing airport resources, and adjusting staff levels to meet immediate passenger needs [16].

    In this study, we propose a short-term air passenger demand forecasting method, specifically focusing on Singapore's Changi Airport. Changi Airport is one of the largest and most significant transportation hubs in Southeast Asia, serving as a critical gateway for both regional and international travel. As Singapore continues to develop as a financial and tourism hub, the demand for air travel through Changi Airport has consistently grown, making accurate demand forecasting essential for maintaining efficient operations and planning future expansions [8, 9]. The importance of Changi Airport's role in regional connectivity and its impact on Singapore's economy emphasize the need for precise and reliable short-term demand forecasts to ensure the airport's continued success and sustainability in a dynamic aviation environment.

    Given the importance of forecasting air passenger demand, extensive research has been conducted on this topic. To understand the current state of research, it is useful to look at recent survey papers that review and summarize these studies. Most recently, Zachariah et al. [16] provided a systematic review of passenger demand forecasting, examining how forecasting methods have changed over time. Their study covers a wide range of techniques, from traditional time-series methods to newer machine learning approaches, and emphasizes the need for more advanced methods to improve accuracy in predicting air travel demand. Wang and Gao [12] conducted a detailed review of air travel demand studies published between 2010 and 2020. They examined 87 studies, focusing on the types of data used, the methods applied, and how these studies are related.

    Several studies have focused on forecasting air passenger demand at Changi Airport. For example, Rui and Zhaowei [8] evaluated various time-series models to forecast the annual air passenger volume at Changi Airport, emphasizing the importance of accurate long-term predictions for infrastructure development and strategic planning. Vu and Zhong [11] explored forecasting methods aimed at understanding long-term air traffic trends at Changi Airport, further contributing to the body of research on annual demand. Similarly, Sailauov and Zhong [9] proposed an optimization approach to forecast air traffic at Changi Airport, predicting significant growth in passenger movements by 2023, which underscores the need for long-term planning to accommodate this growth. Xie and Zhong [15] forecast the passenger volume at Changi Airport using an ANN (Artificial Neural Network). Their study targeted medium-term forecasts covering a period of 3 to 10 years and trained the neural network on historical data from the preceding 20 years, including factors such as GDP, population, and other relevant variables. Li Long et al. [7] took a more innovative approach by focusing on short-term air passenger forecasting using Neural Granger Causality, leveraging Google Trends data to predict monthly passenger arrivals at Changi Airport. With the exception of Li Long et al. [7], existing research has centered on long-term, or annual, demand forecasting, whereas the increasing competition among global hub airports highlights the need for more precise short-term forecasting. Accurate short-term forecasts are essential for enhancing service delivery and operational efficiency. In this study, we aim to address this gap by proposing a method for forecasting monthly passenger demand at Changi Airport.

    Several studies have focused on improving short-term air passenger demand forecasting across major airports in the Asia-Pacific region, where air travel demand is rapidly increasing. Jin et al. [5] introduced a hybrid ensemble approach that integrates Variational Mode Decomposition, ARMA (AutoRegressive Moving Average), and Kernel Extreme Learning Machine to forecast monthly demand at three major Chinese airports. Do et al. [2] compared two forecasting models, LSTM (Long Short-Term Memory) and SARIMA (Seasonal AutoRegressive Integrated Moving Average), for predicting air passenger demand. Their study focused on Incheon International Airport in South Korea and aimed to improve short-term and mid-term forecasts using monthly and weekly passenger data. Kim and Shin [6] utilized big data from search engine queries to enhance monthly passenger predictions at Incheon International Airport, demonstrating the potential of real-time data in improving forecast accuracy. Lastly, Xie et al. [14] developed a hybrid model combining seasonal decomposition with LSSVR (Least Squares Support Vector Regression) to effectively capture and predict monthly passenger fluctuations at Hong Kong International Airport.

    Considering the major changes in air travel patterns caused by the COVID-19 pandemic, this study proposes a new method for forecasting monthly passenger demand at Singapore's Changi Airport that takes these changes into account. The rest of this paper is organized as follows. In the next section, we analyze the data, looking at the key features of the monthly passenger demand time series and the factors that influenced demand during COVID-19. Section 3 introduces the forecasting methods used in this study, focusing on the SARIMA and SARIMAX (Seasonal AutoRegressive Integrated Moving Average with eXogenous variables) models. Section 4 explains the model identification and parameter estimation process. In Section 5, we conduct computational experiments to assess the forecasting accuracy of the proposed model and compare its results with those of other well-known benchmarks. Finally, the last section discusses the implications of our findings and suggests directions for future research.

    2. Data Analysis

    In this section, we analyze the time series data on air passenger demand at Changi Airport from 2008 to 2022. This analysis includes visualizing the data using graphs to identify trends, seasonal patterns, and any irregularities. We also perform stationarity tests to evaluate the suitability of the data for time series modeling. Additionally, we identify the external factors, particularly in the context of the COVID-19 pandemic, that led to a significant and unprecedented decline in passenger demand.

    2.1 Overall Characteristics of the Time Series

    <Figure 1> illustrates the monthly passenger numbers at Changi Airport from January 2008 to December 2022. The data were collected from the Department of Statistics Singapore website (www.singstat.gov.sg). The chart shows a general upward trend in passenger traffic from 2008 to late 2019, reflecting steady growth in air passenger demand at Changi Airport. This period is characterized by typical seasonal fluctuations and a gradual increase in the number of passengers. However, a sharp decline is observed in early 2020, at the onset of the COVID-19 pandemic. The number of passengers dropped to nearly zero, reflecting the global impact of travel restrictions and lockdowns. The chart also indicates a slow recovery beginning in late 2021, with passenger numbers gradually increasing throughout 2022, though still not reaching pre-pandemic levels by the end of the time series.

    As seen in the figure, the line chart also reflects a somewhat irregular decrease in 2009, which coincides with the global financial crisis triggered by the collapse of Lehman Brothers. Given that the goal of this study is to forecast monthly passenger demand for the year 2023, we set the training data to run from 2010 to 2022, excluding the irregularities of 2009. By doing so, we aim to focus on the more stable demand patterns of the normal period, apart from the pandemic era. In the following subsections, this training dataset, covering the monthly passenger demand from 2010 to 2022, is used for further analysis.

    2.2 Stationarity of the Time Series

    In this subsection, we investigate the stationarity of the monthly air passenger time series data. Stationarity is a fundamental assumption for employing ARIMA-based time series forecasting methods. As observed in <Figure 1>, the original time series appears to be non-stationary. To confirm this, we conduct the ADF (Augmented Dickey-Fuller) test. Additionally, we examine the stationarity of the first differenced series, as well as the series that has undergone both first differencing and seasonal differencing, referred to as the double-differenced time series in this study. Since the time series is monthly, the lag was set to 12 for seasonal differencing. The null hypothesis of the ADF test is that there is a unit root, indicating non-stationarity. Therefore, stationarity can be confirmed if the test results allow us to reject the null hypothesis, indicated by a p-value less than the chosen significance level. <Table 1> summarizes the ADF test results for the three series.

    The table confirms that the original series is non-stationary, as expected, since the null hypothesis cannot be rejected. However, differencing the series significantly improves its stationarity. Notably, the p-value for the double-differenced series is smaller than that of the first differenced series, indicating that double differencing provides sufficient stationarity for fitting ARIMA-based time series models.
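
    The two differencing steps described above can be sketched in a few lines of Python. This is a minimal illustration on a made-up toy series, not the Changi data; the function names are hypothetical. It shows that first differencing followed by lag-12 seasonal differencing removes both a linear trend and a fixed 12-month pattern. In practice, an ADF test (for example, the `adfuller` function in the statsmodels library, which is an assumption here since the software used in this study is not stated) would then be applied to each of the three series.

```python
def difference(series, lag=1):
    """Return the lag-differenced series: y[t] - y[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def double_difference(series, season=12):
    """Apply first differencing, then seasonal differencing with period `season`."""
    return difference(difference(series, 1), season)


# Toy monthly series (illustrative only): linear trend plus a repeating
# 12-month pattern, four years long.
seasonal_pattern = [0, 2, 5, 3, 1, 4, 8, 9, 4, 2, 3, 7]
toy_series = [10 * t + seasonal_pattern[t % 12] for t in range(48)]

double_diffed = double_difference(toy_series)

# 48 points lose 1 observation to first differencing and 12 to seasonal
# differencing; the deterministic trend and seasonality cancel out exactly.
print(len(double_diffed), set(double_diffed))  # 35 {0}
```

    Real data will of course not reduce to zeros; the point is only that the remaining series no longer carries the trend or the seasonal cycle, which is what the ADF test is then asked to confirm.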

    2.3 External Factor on the Time Series

    In this subsection, we investigate the external factors that may have influenced air passenger demand during the COVID-19 pandemic. Traditionally, many studies have explored the external factors that affect air passenger demand. The survey paper by Zachariah et al. [16] provides a detailed review of various economic, geographic, and social factors that typically influence air travel. However, these common factors are usually used to predict long-term demand for strategic decisions. Moreover, these factors may not fully explain the unique situation caused by COVID-19.

    In this study, we focus on an important point mentioned by Gunter and Zekan [3]. They state that "the majority of air passengers are also tourists." This observation is especially relevant for Singapore’s Changi Airport. Singapore is known as home to famous tourist attractions like Marina Bay Sands, Sentosa Island, and the Botanic Gardens, which draw visitors from around the world. The country also hosts various international conferences and events. Most of these visitors arrive in Singapore by air, making tourism a key factor that directly affects air passenger demand at Changi Airport.

    <Figure 2> presents the number of international visitors to Singapore (blue line) and the number of air passengers at Changi Airport (red line) from 2010 to 2022. The data on international visitors to Singapore were collected from the Department of Statistics Singapore website (www.singstat.gov.sg, accessed on 26 August 2024).

    The above graph shows the monthly data for air passengers at Changi Airport and international visitors to Singapore during the corresponding period. One clear feature of the graph is the strong similarity in the patterns of the passenger and visitor lines. This similarity suggests that a significant portion of Changi Airport's passengers are also visitors to Singapore. The close correlation between these two variables provides a basis for predicting Changi Airport's passenger numbers based on the international visitor numbers to Singapore. This relationship highlights the potential of using visitor data as a predictive factor in forecasting air passenger demand at the airport.
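
    The visual similarity between the two lines can be quantified with a correlation coefficient. The following is a minimal sketch of the Pearson correlation in plain Python; the numbers are hypothetical placeholders (in thousands), not the actual SingStat figures, and the study itself does not report which correlation measure, if any, was computed.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical monthly values in thousands (NOT the real data): a normal
# period, a collapse, and a partial recovery.
visitors = [1500, 1620, 1480, 300, 20, 15, 60, 400]
passengers = [5600, 5900, 5500, 1100, 100, 80, 250, 1500]

print(round(pearson_correlation(visitors, passengers), 3))
```

    A value close to 1 would support using visitor numbers as a predictor, although a high correlation over a period containing one large shock should be interpreted with some care.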

    3. Methodology

    In this section, we present the methodology used in this study for forecasting air passenger demand at Changi Airport. In the previous section, stationarity was established for the differenced time series, which confirms the suitability of ARIMA-based time series models. Furthermore, given the significant changes observed during the COVID-19 pandemic, it is essential to include external variables that can capture these shifts in air travel patterns. In this study, we consider the number of international visitors to Singapore as an external predictor in our model, recognizing its strong correlation with passenger numbers at Changi Airport. Consequently, we now introduce the SARIMAX model, a time series model that can additionally account for external factors.

    The SARIMAX model extends the SARIMA model by incorporating exogenous variables—those external factors that can influence the dependent variable. This allows the model not only to capture the seasonal and trend components of the time series but also to adjust for the impact of relevant external factors, such as international visitor numbers in our case.

    The mathematical expression of the SARIMAX model essentially extends the SARIMA model by incorporating exogenous variables. The SARIMA model itself consists of six key components: non-seasonal autoregressive terms, non-seasonal differencing, non-seasonal moving averages, seasonal autoregressive terms, seasonal differencing, and seasonal moving averages. These components are represented by the parameters p, d, q, P, D, and Q , respectively, where each component plays a critical role in capturing different aspects of the time series data. To simplify the expression, we will henceforth refer to the seasonal autoregressive and seasonal moving average as SAR and SMA, respectively, and the non-seasonal autoregressive and moving average as AR and MA, respectively.

    The SARIMAX model builds on this structure by including an additional term to account for the influence of external variables. For a time series {Y_t} with time index t, the model is represented mathematically as:

    $$\phi_p(B)\,\Phi_P(B^s)\,(1-B)^d\,(1-B^s)^D\,Y_t = \theta_q(B)\,\Theta_Q(B^s)\,\epsilon_t + \gamma X_t \qquad (1)$$

    where $s$ represents the seasonal frequency. Using the backward shift operator $B$, $\phi_p(B)$, $\theta_q(B)$, $\Phi_P(B^s)$, and $\Theta_Q(B^s)$ are compact forms of the polynomials in $B$, which are $1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p$, $1 + \theta_1 B + \theta_2 B^2 + \cdots + \theta_q B^q$, $1 - \Phi_1 B^s - \Phi_2 B^{2s} - \cdots - \Phi_P B^{Ps}$, and $1 + \Theta_1 B^s + \Theta_2 B^{2s} + \cdots + \Theta_Q B^{Qs}$, respectively. $\phi_p(B)$ and $\theta_q(B)$ represent the non-seasonal autoregressive and moving average components, $\Phi_P(B^s)$ and $\Theta_Q(B^s)$ denote the seasonal autoregressive and moving average components, $(1-B)^d$ and $(1-B^s)^D$ are the non-seasonal and seasonal differencing operators, $\epsilon_t$ is the error term, assumed to be white noise, and lastly, $\gamma X_t$ is the term that introduces the exogenous variable $X_t$ into the model, allowing it to account for factors outside the primary time series that influence $Y_t$.
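
    As a brief worked illustration of the operator notation, recall that the backward shift operator satisfies $B^k Y_t = Y_{t-k}$. Taking $d = D = 1$ and $s = 12$, which corresponds to the double-differenced monthly series examined in Section 2.2, the differencing part of the model expands as follows:

```latex
\begin{aligned}
(1-B)(1-B^{12})\,Y_t
  &= (1 - B - B^{12} + B^{13})\,Y_t \\
  &= Y_t - Y_{t-1} - Y_{t-12} + Y_{t-13}
\end{aligned}
```

    In words, the month-over-month change in the current year is compared with the month-over-month change in the same month of the previous year.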

    The use of SARIMAX allows us to address the limitations of traditional univariate models by considering the broader context in which the time series operates. In the following sections, we will discuss the steps taken to identify the appropriate model parameters, fit the SARIMAX model to the data, and evaluate its performance.

    4. Fitting SARIMAX Models

    To perform demand forecasting using ARIMA-based time series models, it is essential to estimate the various parameters included in the model. Before the estimation, we must first determine the orders of the components within the model, namely p, d, q, P, D, and Q, which specify the number of parameters to be estimated. The process of determining these order values is known as model identification. The entire sequence of fitting the model, including the model identification process, generally follows the Box-Jenkins procedure [1]. However, this procedure often involves a certain degree of trial and error rather than being strictly defined by precise rules. In this study, we conduct a model identification process based on the Box-Jenkins procedure to identify the most suitable SARIMAX model for our forecasting needs.

    4.1 Model Identification

    In this subsection, we explain the model identification process, which includes how the values of p, d, q, P, D, and Q for the SARIMAX model are determined. Based on the stationarity analysis discussed in Section 2.2, where it was found that both first differencing and seasonal differencing are necessary to achieve stationarity, we set both d and D to 1. Additionally, since the time series considered in this study consists of monthly data, the seasonal period s in the SARIMAX model is set to 12.

    Next, the selection of the remaining orders, p, q, P, and Q, is addressed. Each of these parameters holds specific significance: p represents the number of recent past values of air passenger demand that influence the current demand; q denotes the number of recent past errors that impact the current demand; P indicates the number of past values from the same seasonal period that affect the current demand; and Q reflects the number of past errors from the same seasonal period that influence the current demand. For example, if p is set to 2, it means that the air passenger demand from the previous two months has a significant effect on the current month's demand. These order values play a critical role in the effectiveness of the time series model, and they are typically determined through visual and intuitive analysis of the ACF and PACF graphs of the time series.

    <Figure 3> displays the ACF and PACF graphs for the air passenger time series data used in this study. Since we have already established d and D as 1, the graphs are generated based on the double-differenced series, which includes both first differencing and seasonal differencing. In the figure, any spike that extends beyond the shaded region at a given lag can be considered statistically significant. Such lags suggest the presence of meaningful autocorrelation, indicating potential candidates for inclusion in the autoregressive or moving average terms of the model.
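
    To make the notion of a "significant spike" concrete, the sketch below computes the sample autocorrelation function in plain Python and flags lags whose autocorrelation exceeds an approximate 95% bound. The constant bound of 1.96 divided by the square root of n is a common white-noise approximation and is an assumption here; the shaded region in the figure may have been drawn with a different formula. The toy series and function names are illustrative only.

```python
import math


def sample_acf(series, max_lag):
    """Return the sample autocorrelations r_0, ..., r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    acf = []
    for k in range(max_lag + 1):
        num = sum(centered[t] * centered[t - k] for t in range(k, n))
        acf.append(num / denom)
    return acf


def white_noise_bound(n, z=1.96):
    """Approximate 95% bound for the autocorrelations of white noise."""
    return z / math.sqrt(n)


def significant_lags(series, max_lag):
    """Lags (from 1 to max_lag) whose autocorrelation exceeds the bound."""
    bound = white_noise_bound(len(series))
    acf = sample_acf(series, max_lag)
    return [k for k in range(1, max_lag + 1) if abs(acf[k]) > bound]


# Toy series (illustrative only): a pattern that repeats every 4 steps,
# so the autocorrelation at lag 4 is strongly positive.
toy = [1.0, 3.0, 2.0, 5.0] * 10
print(significant_lags(toy, 8))
```

    The PACF requires an additional recursion (for example, Durbin-Levinson) and is omitted from this sketch.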

    However, as mentioned earlier, the model identification process is often described as an art rather than a precise science. Therefore, instead of focusing exclusively on the most prominent spikes that exceed the shaded region, we also consider adjacent lags. In particular, for the AR order p, the PACF graph shows a clear spike at lag 1, suggesting that not only p = 1 but also p = 0 or 2 are worth considering. Similarly, the MA order q is identified from the ACF graph, where a prominent spike at lag 1 suggests that q = 0, 1, or 2 is likely appropriate. For the seasonal AR order P, the PACF graph shows no significant spike at the seasonal lag 12, suggesting that P values of 0 or 1 could be considered. Likewise, the seasonal MA order Q is determined by a similar spike pattern in the ACF graph, leading to potential values of 0, 1, or 2 for Q. As a result, the candidate values for the SARIMAX model orders are p = 0, 1, or 2; q = 0, 1, or 2; P = 0 or 1; and Q = 0, 1, or 2, which will be further tested to identify the best SARIMAX model.

    In this study, we considered two or three candidate values for each of the four orders p, q, P, and Q, resulting in a total of 54 (= 3×3×2×3) different SARIMAX model combinations. Each model was fitted to the data, and the results were evaluated using the AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion). Both AIC and BIC are widely used for model selection, with lower values indicating a better fit to the data. The outcomes are summarized in <Table 2>.
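
    The candidate grid described above can be enumerated mechanically. The sketch below, with hypothetical names, builds the 54 combinations and shows one simple way to pick the best-scoring model once each candidate has been fitted. The fitting step itself is left out because the software used in this study is not stated; the selection helper simply assumes that an (AIC, BIC) pair is available for each candidate.

```python
from itertools import product

# Candidate orders from the identification step; d, D, and s are fixed.
AR_ORDERS = (0, 1, 2)      # p
MA_ORDERS = (0, 1, 2)      # q
SAR_ORDERS = (0, 1)        # P
SMA_ORDERS = (0, 1, 2)     # Q
D_NONSEASONAL, D_SEASONAL, SEASON = 1, 1, 12


def candidate_orders():
    """Yield ((p, d, q), (P, D, Q, s)) for every candidate model."""
    for p, q, P, Q in product(AR_ORDERS, MA_ORDERS, SAR_ORDERS, SMA_ORDERS):
        yield (p, D_NONSEASONAL, q), (P, D_SEASONAL, Q, SEASON)


def pick_best(scores):
    """Given {candidate: (aic, bic)}, return the candidate with the lowest
    AIC, using BIC to break ties (tuples compare lexicographically)."""
    return min(scores, key=lambda candidate: scores[candidate])


candidates = list(candidate_orders())
print(len(candidates))  # 54
```

    As one possible way to obtain the scores, each pair could be passed as the `order` and `seasonal_order` arguments of the `SARIMAX` class in the statsmodels library, together with the visitor series as `exog`; the fitted results object exposes `aic` and `bic` attributes.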

    <Table 2> gives basic descriptive statistics for the AIC and BIC values across the 54 SARIMAX models tested. As the table shows, even within the same SARIMAX model structure, the choice of specific orders can significantly impact the model's fit. Our primary goal is to identify the model with the lowest AIC and BIC values, and one model, SARIMAX(0, 1, 2)(1, 1, 2)12, stands out with both the lowest AIC and the lowest BIC. The models with the second and third lowest AIC and BIC values are structurally similar, namely SARIMAX(1, 1, 2)(1, 1, 2)12 (AIC: 3138.93; BIC: 3160.96) and SARIMAX(2, 1, 2)(1, 1, 2)12 (AIC: 3140.85; BIC: 3165.63), respectively, but involve more parameters. According to the principle of parsimony, which favors simpler models when performance is comparable, SARIMAX(0, 1, 2)(1, 1, 2)12 can be considered the best model.
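
    For reference, the standard textbook definitions of the two criteria make the link to parsimony explicit. Here $k$ is the number of estimated parameters, $n$ is the number of observations, and $\hat{L}$ is the maximized likelihood; the exact conventions for counting $k$ and $n$ vary between software packages, so these formulas are a general guide rather than a reproduction of the values reported in this study.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L},
\qquad
\mathrm{BIC} = k\ln n - 2\ln\hat{L}
```

    Each additional parameter therefore raises the AIC by 2 and the BIC by $\ln n$ unless it is offset by a sufficient gain in log-likelihood, which is why models with extra AR terms can score worse despite fitting at least as well.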

    4.2 Parameter Estimation

    In the previous subsection, the final model was selected based on the model identification process. The selected model, SARIMAX(0, 1, 2)(1, 1, 2)12, indicates that the current month's air passenger demand is influenced by the error terms from the previous two months, the demand from the same month in the previous year, and the error terms from the same months one and two years ago. The magnitude of these influences is determined by the values of the corresponding parameters, which need to be estimated. The table below summarizes the estimated values of the parameters in the selected model.
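
    Written out explicitly, substituting the selected orders $p = 0$, $d = 1$, $q = 2$, $P = 1$, $D = 1$, $Q = 2$, and $s = 12$ into the general form of Equation (1) gives the following expression, in which $\theta_1, \theta_2$ are the MA coefficients, $\Phi_1$ is the SAR coefficient, $\Theta_1, \Theta_2$ are the SMA coefficients, and $\gamma$ is the coefficient of the exogenous variable:

```latex
(1 - \Phi_1 B^{12})\,(1 - B)\,(1 - B^{12})\,Y_t
  = (1 + \theta_1 B + \theta_2 B^{2})\,
    (1 + \Theta_1 B^{12} + \Theta_2 B^{24})\,\epsilon_t
  + \gamma X_t
```

    The six coefficients appearing here are the quantities to be estimated in this subsection.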

    <Table 3> summarizes the estimated coefficients for the five SARIMA model parameters as well as the coefficient for the exogenous variable. Among the coefficients, the estimate for MA1 ($\hat{\theta}_1$) is relatively small, indicating a minimal influence on the model. In contrast, the coefficients for MA2, SAR, SMA1, and SMA2 ($\hat{\theta}_2$, $\hat{\Phi}_1$, $\hat{\Theta}_1$, and $\hat{\Theta}_2$) are statistically significant, with very low p-values, indicating their importance in the model. Notably, the test statistic for the exogenous variable ($\hat{\gamma}$) is the largest (45.04), suggesting a strong impact of international visitor numbers on air passenger demand. This suggests that, while the moving average, seasonal autoregressive, and seasonal moving average components play comparatively smaller roles, the external factor of international visitor numbers is the main driver of air travel demand in the model.

    4.3 Model Validation

    In this subsection, we perform model diagnostics on the SARIMAX model selected in our study. Given the focus on forecasting accuracy, instead of conducting rigorous statistical tests for model validation, we rely on visual inspection of residuals through a couple of diagnostic plots. <Figure 4> presents two diagnostic plots: (a) the ACF of the residual series, and (b) a histogram of the residuals. In the ACF plot, no significant autocorrelation is observed among the residuals, indicating that the model fit is adequate. Although the histogram is not perfectly symmetric, it suggests that the residuals are reasonably close to normality, and therefore, the model's assumptions are not severely violated.

    5. Computational Experiments

    In this section, we conduct a comparative experiment to validate the forecasting performance of the proposed SARIMAX model. As previously mentioned, the period from 2010 to 2022 is used as the training period, and the 12 months of 2023 serve as the validation period. To assess the model's forecasting accuracy, we will compare the forecasts with the actual values for each month of 2023 and calculate the forecast errors. The performance of the model will be evaluated using well-known metrics: Mean Absolute Percentage Error (MAPE), Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE). The mathematical expressions for each of these metrics are as follows.

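    $$\mathrm{MAPE} = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{A_t - F_t}{A_t}\right| \times 100\%, \qquad \mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n}\left|A_t - F_t\right|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(A_t - F_t\right)^2}$$

    where $A_t$ represents the actual value, $F_t$ denotes the forecasted value, and $n$ is the number of observations.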
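
    The three metrics translate directly into code. The following is a minimal plain-Python sketch with hypothetical function names; the sample numbers are made up for illustration and are not results from this study.

```python
import math


def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent. Assumes no actual value is zero."""
    n = len(actual)
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n * 100.0


def mae(actual, forecast):
    """Mean Absolute Error, in the units of the data."""
    n = len(actual)
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / n


def rmse(actual, forecast):
    """Root Mean Squared Error, in the units of the data."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


# Made-up example: two months of actual and forecast values.
actual_values = [100.0, 200.0]
forecast_values = [110.0, 180.0]

print(mape(actual_values, forecast_values))
print(mae(actual_values, forecast_values))
print(rmse(actual_values, forecast_values))
```

    MAPE is scale-free, which makes it convenient for comparing across series, whereas MAE and RMSE are expressed in passengers; RMSE penalizes large individual errors more heavily than MAE.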

    Now, we introduce several benchmark methods to compare with our proposed SARIMAX model. These benchmarks are primarily univariate time series forecasting techniques. The first benchmark is SARIMA. The inclusion of SARIMA as a benchmark is particularly important, as it allows us to directly assess the impact of incorporating exogenous variables in our SARIMAX model. The well-known basic forecasting methods Simple Exponential Smoothing (SES), Double Exponential Smoothing (DES), and Triple Exponential Smoothing (TES) [13] serve as additional benchmarks. LSTM networks [4], a type of recurrent neural network capable of learning long-term dependencies in time series data, are included as a machine learning method. Prophet [10], a procedure developed by Meta for forecasting time series data based on an additive model with non-linear trends and multiple seasonal components, is also included.
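
    For readers unfamiliar with the exponential smoothing benchmarks, the sketch below illustrates the two simplest variants in plain Python. It assumes that DES refers to Holt's linear-trend formulation, which this study does not specify, and uses naive initial values; the smoothing constants and the toy data are arbitrary. TES, which adds a seasonal component, is omitted for brevity.

```python
def ses_forecast(series, alpha, horizon):
    """Simple exponential smoothing: the forecast is flat at the last level."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon


def holt_forecast(series, alpha, beta, horizon):
    """Double exponential smoothing (Holt's linear trend, an assumed variant).

    The level is initialized to the first observation and the trend to the
    first difference; both are then updated recursively.
    """
    level, trend = series[0], series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]


# Toy upward-trending series (illustrative only).
toy = [10.0, 12.0, 14.0, 16.0]
print(ses_forecast(toy, alpha=0.5, horizon=2))
print(holt_forecast(toy, alpha=0.5, beta=0.3, horizon=2))
```

    Because SES produces a flat forecast, it cannot follow a series that is still trending, whereas the trend term in DES allows the forecast to continue along the most recent slope.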

    The results of our comparative analysis, as presented in <Table 4>, demonstrate the superior performance of the SARIMAX model in forecasting monthly air passenger demand. With the lowest MAPE of 3.94%, as well as the smallest MAE and RMSE values, SARIMAX significantly outperforms all other models tested. The contrast between SARIMAX and SARIMA (MAPE 16.24%) highlights the crucial role of exogenous variables in enhancing forecast accuracy. Traditional methods such as the exponential smoothing models (SES, DES, TES) showed moderate performance, with DES achieving the best results among them. Interestingly, the LSTM model, despite its sophisticated architecture, did not surpass the SARIMAX model's performance, though it outperformed most traditional methods. The Prophet model, surprisingly, exhibited the poorest performance with a MAPE of 64.38%, which can be attributed to its inability to adequately account for anomalous events such as the COVID-19 pandemic. Overall, these findings strongly support the efficacy of the SARIMAX model for this specific forecasting task, emphasizing the value of incorporating relevant exogenous variables and the need for careful model selection in time series forecasting.

    To explore the predictive accuracy of each method, we plotted the actual values and forecasts for 2023 alongside the actual values from 2022 in the graph below. This visualization excludes the results of SES and Prophet due to their relatively poor performance. By including the actual values from 2022, the graph provides additional context, allowing us to better assess how well each model captures the ongoing trends and adjusts to new data in the validation period.

    <Figure 5> compares the actual air passenger demand with the forecasts generated by various models, including SARIMAX, SARIMA, DES, TES, and LSTM, over the validation period. The black line represents the actual passenger data, while the colored lines represent the predictions from each model. From the graph, we can observe that the SARIMAX model (light blue line) closely follows the actual data, especially during the early months of 2023, demonstrating its robustness in capturing the underlying patterns in the time series data. The SARIMA model (orange line) shows a significant divergence from the actual values, particularly in the latter half of 2023, indicating its lower accuracy compared to SARIMAX. The DES and TES models (green and red lines, respectively) generally provide smoother forecasts but fail to capture the sharp increases and decreases observed in the actual data, particularly around the transition from 2022 to 2023. The LSTM model (purple line) exhibits more volatility and shows a larger deviation from the actual data during certain periods, suggesting that while it captures some patterns, it struggles with consistency in this dataset.

    6. Conclusion

    In this study, we proposed a SARIMAX model for air passenger demand forecasting at Singapore's Changi Airport, incorporating the number of international visitors as an exogenous variable. Our comparative analysis demonstrated that the SARIMAX model significantly outperforms traditional time series forecasting methods such as SARIMA, exponential smoothing models, and even advanced machine learning techniques like LSTM and Prophet. The results showed that the SARIMAX model achieved the lowest MAPE, MAE, and RMSE values, indicating superior accuracy in capturing the dynamics of air passenger demand in the post-COVID-19 period.

    Furthermore, the proposed approach can be extended to other areas of transportation and logistics. For example, the model could be effectively applied to forecast cargo throughput at major international ports or passenger demand on domestic high-speed railroads by identifying and incorporating appropriate exogenous variables relevant to each problem.

    Despite the promising results and applications, this study has some limitations that open avenues for future research. First, while the SARIMAX model demonstrated superior performance, it is still based on historical data and may not fully capture emerging trends or shifts in passenger behavior that could arise in the future. Therefore, future research could explore the integration of more advanced machine learning techniques or hybrid models that combine the strengths of traditional time series models with the flexibility of neural networks. Moreover, as the aviation industry continues to evolve in the post-pandemic era, there may be new exogenous factors that emerge, such as changes in travel regulations, environmental concerns, or technological advancements. Incorporating these factors into future models could further enhance their predictive power. Additionally, expanding the scope of this research to include a broader set of airports or applying the model to different time horizons could provide further insights into its generalizability and robustness.

    Acknowledgement

    This work was supported by the research fund of Chungnam National University.

    Figure

    <Figure 1> Trend of Air Passenger Time Series at Changi Airport from 2008 to 2022

    <Figure 2> Monthly Visitors to Singapore and Passengers at Changi Airport from 2010 to 2022

    <Figure 3> ACF and PACF Graphs of Double-Differenced Air Passenger Time Series from Training Data

    <Figure 4> Residual Analysis: (a) ACF Graph of Residuals; (b) Histogram of Residuals

    <Figure 5> Comparison of Actual Values and Forecasts of the Tested Forecasting Methods

    Table

    Results of the ADF Test

    * At the significance level of 0.05.

    Descriptive Statistics of AICs and BICs of 54 Combinations of SARIMAX Models

    * Both the lowest AIC and the lowest BIC are obtained from SARIMAX(0, 1, 2)(1, 1, 2)_12.

    Summary of the Estimated Coefficient Values of the SARIMAX Model

    Results of the Comparison Tests

    Reference

    1. Box, G.E.P., Jenkins, G.M., Reinsel, G.C., and Ljung, G.M., Time Series Analysis: Forecasting and Control, 5th ed., John Wiley & Sons, 2015.
    2. Do, Q.H., Lo, S.-K., Chen, J.-F., Le, C.-L., and Anh, L.H., Forecasting Air Passenger Demand: A Comparison of LSTM and SARIMA, Journal of Computer Science, 2020, Vol. 16, No. 7, pp. 1063-1084.
    3. Gunter, U. and Zekan, B., Forecasting Air Passenger Numbers with a GVAR Model, Annals of Tourism Research, 2021, Vol. 89, 103252.
    4. Hochreiter, S. and Schmidhuber, J., Long Short-Term Memory, Neural Computation, 1997, Vol. 9, No. 8, pp. 1735-1780.
    5. Jin, F., Li, Y., Sun, S., and Li, H., Forecasting Air Passenger Demand with a New Hybrid Ensemble Approach, Journal of Air Transport Management, 2020, Vol. 83, 101744.
    6. Kim, S. and Shin, D.H., Forecasting Short-Term Air Passenger Demand Using Big Data from Search Engine Queries, Automation in Construction, 2016, Vol. 70, pp. 98-108.
    7. Li Long, C., Guleria, Y., and Alam, S., Air Passenger Forecasting Using Neural Granger Causal Google Trend Queries, Journal of Air Transport Management, 2021, Vol. 95, 102083.
    8. Rui, G. and Zhaowei, Z., Forecasting the Air Passenger Volume in Singapore: An Evaluation of Time Series Models, International Journal of Technology and Engineering Studies, 2017, Vol. 3, No. 3, pp. 117-123.
    9. Sailauov, T. and Zhong, Z.W., An Optimization Approach towards Air Traffic Forecasting: A Case Study of Air Traffic in Changi Airport, Statistics, Optimization and Information Computing, 2019, Vol. 7, No. 1, pp. 40-54.
    10. Taylor, S.J. and Letham, B., Forecasting at Scale, PeerJ Preprints, 2017, e3190v2.
    11. Vu, G.X.M. and Zhong, Z., Forecasting Air Passengers of Changi Airport Based on Seasonal Decomposition and an LSSVM Model, Review of Information Engineering and Applications, 2018, Vol. 5, pp. 12-30.
    12. Wang, S. and Gao, Y., A Literature Review and Citation Analyses of Air Travel Demand Studies Published between 2010 and 2020, Journal of Air Transport Management, 2021, Vol. 97, 102135.
    13. Winters, P.R., Forecasting Sales by Exponentially Weighted Moving Averages, Management Science, 1960, Vol. 6, No. 3, pp. 324-342.
    14. Xie, G., Wang, S., and Lai, K.K., Short-Term Forecasting of Air Passenger by Using Hybrid Seasonal Decomposition and Least Squares Support Vector Regression Approaches, Journal of Air Transport Management, 2014, Vol. 37, pp. 20-26.
    15. Xie, X. and Zhong, Z., Changi Airport Passenger Volume Forecasting Based on an Artificial Neural Network, Far East Journal of Electronics and Communications, 2016, pp. 163-170.
    16. Zachariah, R.A., Sharma, S., and Kumar, V., Systematic Review of Passenger Demand Forecasting in Aviation Industry, Multimedia Tools and Applications, 2023, Vol. 82, No. 30, pp. 46483-46519.