1.Introduction
All participants in the stock market have an interest in developing trading strategies that provide profitable returns. In particular, events such as the global financial crisis have sparked a growing interest in asset management techniques, such as arbitrage strategies, that promise stable yields. Arbitrage trading is the trading of securities aimed at exploiting differences in value between financial markets.
Arbitrage methodologies have been developed for financial market analysis in recent years. Fung and Mok [9] examined the arbitrage efficiency of the options-futures markets by using both real-time transaction costs and the bid/ask quotes of options provided in the open-outcry system to eliminate the estimation errors that stem from dynamic arbitrage frameworks. Paul and Fung [22] examined the relationship between arbitrage profit and spread costs, time to maturity, types of strategies adopted, and market volatility to solve the problems that have affected prior research on the relationship between options or futures prices and the underlying index. Ofek et al. [21] empirically investigated the no-arbitrage relationship in the context of short sales restrictions and found a strong relationship between the rebate rate spread and put-call parity violations. Henker and Martens [12] investigated the impact of a decrease in spreads on S&P 500 index-futures arbitrage and found a substantial increase in the number of arbitrage trades reported to the Securities and Exchange Commission. Taylor [31] introduced a new econometric model of the mispricing associated with differences between spot and futures prices and indicated that the intraday periodicity in arbitrageur behavior can be expressed as a pattern. Teddy et al. [32] proposed a novel brain-inspired cerebellar associative memory model for pricing American-style call options on British pound vs. US dollar currency futures. The model was applied in a mispriced option arbitrage trading system and displayed an encouraging return on investment of 23.1% for some of the traded options. Montana et al. [18] developed an algorithmic trading system based on flexible least squares to establish an investment strategy that exploits patterns detected in financial data streams. The system yielded profits for the S&P 500 Futures Index.
Alexakis [1] examined co-integration relationships among equity indices using statistical arbitrage strategies that exploit the mean-reversion property of the long-run relationships under consideration. The relationships suggested that arbitrageurs should rebalance among the examined indices when a change in a market trend is evident. Hsu et al. [14] proposed an approach based on the extended classifier system, which was adopted for knowledge rule discovery, and performed an empirical study to verify the accuracy and profitability of the system in an inter-market setting. Song and Zhang [28] considered an optimal pairs trading rule in which a pairs (long-short) position consisted of a long position in one stock and a short position in another stock. They demonstrated how to implement the results using a pair of stocks and their historical prices.
Several studies have examined arbitrage trading based on statistical analyses of stock, futures, and options markets. However, no decision-support system has been reported for establishing arbitrage investment strategies that employ the price ratio (PR) between two securities using computational intelligence techniques. The PR indicates the relative strength of two securities and is important when comparing the performance of one security with that of another security in the market. Thus, some traders use the PR as a general tool for selecting outperforming stocks.
This study proposes pairs trading rules (PTRs) for arbitrage trading using the PR between two securities in the stock futures market. Pairs trading is a relative-value trading strategy that sells an overpriced security and simultaneously buys an underpriced security. The trading rules are generated by rough set analysis applied to various technical indicators derived from the PR, which is the closing price of ‘Security A’ divided by the closing price of ‘Security B.’ The input variables are created by applying the PR to a GARCH (1, 1) model [3] and to the moving-average convergence-divergence (MACD) indicator [2]. More details concerning the indicators and the indicator generation process are given in Sections 2 and 3. In the experiments, we employ the PR between the KOSPI 200 and S&P 500 index futures, the relative pricing of which differs from that in the equilibrium state. A moving-window method is used to generate a profitable trading rule. In the empirical study, the PTRs yield higher returns than the original pairs trading rules (OPTRs).
The remainder of the paper is organized as follows. Section 2 reviews the concept of the futures market, the MACD indicator, volatility, the GARCH (1, 1) model, and rough set theory. Section 3 describes the PTR development procedure. An empirical study aimed at verifying the performance of the PTRs is presented in Section 4. Section 5 presents concluding remarks.
Note that RRTS used technical indicators consisting of trend-following indicators and oscillators as the input variables for rule generation. It also used a manual reduct method for reduction and employed the Euclidean distance, which is a static method, to find a reference pattern close to the current movement. In contrast, PRTS uses an oscillator that provides a signal (i.e., buy or sell) by recognizing trend reversals, and it uses a genetic algorithm (GA) for the reduction and DTW for the recognition of dynamic stock patterns. DTW is used as the core recognizer to identify similar patterns in the dynamic stock futures market. The algorithm is one of many pattern recognition techniques that can be used to measure the similarity between two time series (or two patterns), particularly when the two are not properly aligned on the time axis. In empirical studies, the PRTS yielded profitable earnings from the market and outperformed RRTS.
2.Research Background
The futures market is a market in which individuals exchange standardized futures contracts. Investors enter into contracts to buy specific quantities of commodities, such as soybeans, gold, and oil, or financial instruments (i.e., cash instruments and derivative instruments) at specific prices, with delivery scheduled for the maturity date. A stock futures market is a futures market in which stock price movements are traded in a manner similar to non-financial commodities. In the stock futures market, traders can earn returns by acquiring a contract to buy when a bull market is expected or to sell when a bear market is expected. Thus, the market position hinges on the direction of stock price fluctuations, which offers profit opportunities to traders in both rising and falling markets.
The most crucial determinant of a profitable trading strategy in the futures market is the accurate prediction of price fluctuations. Globalization, removal of local regulations, and irregular behaviors by market participants, such as investors, traders, and professional analysts, make the futures market unpredictable. Technical analysis has been widely used to forecast movements in the futures market. Such analysis examines historical data on stock prices and trading volumes and uses these data to predict future price movements [19]. Although no solid theoretical foundation has been established for technical analysis, many investors have used technical indicators to make buy or sell decisions [33]. In particular, the MACD indicator, developed by Appel in the 1960s, has proven to be a valuable tool for traders [30]. The MACD is calculated as the difference between two exponential moving averages (EMAs). The most common MACD is the security’s 12-day EMA minus its 26-day EMA at time t [2]. The formula is as follows:
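A standard statement of this definition is sketched below. The symbol names are assumptions chosen for illustration: P_t is the closing price at time t and EMA_{n,t} is the n-day EMA, updated with the usual smoothing factor 2/(n+1).

```latex
\mathrm{MACD}_t = \mathrm{EMA}_{12,t} - \mathrm{EMA}_{26,t},
\qquad
\mathrm{EMA}_{n,t} = \frac{2}{n+1}\,P_t + \left(1 - \frac{2}{n+1}\right)\mathrm{EMA}_{n,t-1}
\tag{1}
```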
Additionally, volatility is fundamental in predicting the futures market and can be interpreted as the uncertainty that investors face over their investments [4]. Volatility is a statistical measure of the dispersion of returns for a given stock and can be calculated as the standard deviation of those returns. Many financial analysts have developed models to predict time-series volatilities. For instance, the autoregressive conditional heteroskedasticity (ARCH) model was proposed by Engle [8] to model time series that exhibit volatility clustering and fat tails. This model takes the following form:
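In a standard statement of the ARCH(m) variance equation, the shock at time t is written ε_t and its conditional variance σ_t²; both symbol names, the lag-polynomial shorthand α(L), and the equation number are assumptions made here for illustration:

```latex
\sigma_t^2 = c + \alpha(L)\,\varepsilon_t^2
           = c + \sum_{i=1}^{m} \alpha_i\,\varepsilon_{t-i}^2
\tag{2}
```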
where c > 0 and αi ≥ 0 for i = 1, 2, ⋯, m because σt² must be positive, and L is the lag operator. However, despite its usefulness, the ARCH model often requires a long lag length to capture the volatility dynamics, which increases the number of parameters to be estimated. Bollerslev [3] proposed the generalized ARCH (GARCH) model, which reduces the dimensionality by adding auto-regressive terms as follows:
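A compact way to write the generalized model uses two lag polynomials, one acting on the squared shocks and one on the lagged conditional variances. The polynomial names α(L) and β(L), the coefficients β_j, and the equation number are assumptions for illustration; the orders of the two polynomials correspond to the two numbers in the model label.

```latex
\sigma_t^2 = c + \alpha(L)\,\varepsilon_t^2 + \beta(L)\,\sigma_t^2,
\qquad
\alpha(L) = \sum_{i} \alpha_i L^i, \quad
\beta(L) = \sum_{j} \beta_j L^j
\tag{3}
```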
The GARCH model described above is typically referred to as the GARCH (r, m) model. The (r, m) is a standard notation in which the first number refers to how many ARCH terms are specified and the second number refers to how many moving-average lags are specified [3]. Models with more than one lag may be necessary for obtaining good variance forecasts. The simplest GARCH model, GARCH (1, 1), is expressed as follows:
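With one lagged squared shock and one lagged variance, the standard GARCH(1, 1) variance equation reads as follows (the coefficient name β_1 is an assumption; c and α_1 follow the notation used in the text):

```latex
\sigma_t^2 = c + \alpha_1\,\varepsilon_{t-1}^2 + \beta_1\,\sigma_{t-1}^2
\tag{4}
```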
This model incorporates mean reversion, and the dynamics of σt² are driven by past volatility shocks through the coefficient α1. A more detailed discussion of the GARCH process can be found in Bollerslev [3].
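To make the GARCH(1, 1) recursion concrete, the following minimal sketch computes the conditional variance path for given parameters. It does not estimate the parameters; the function name, the demeaning of returns, and the choice to start the recursion at the long-run variance c / (1 - α1 - β1) are assumptions made for illustration.

```python
def garch11_variance(returns, c, alpha1, beta1):
    """Conditional variances from the GARCH(1, 1) recursion
    sigma2[t] = c + alpha1 * eps[t-1]**2 + beta1 * sigma2[t-1]."""
    if c <= 0 or alpha1 < 0 or beta1 < 0 or alpha1 + beta1 >= 1:
        raise ValueError("need c > 0, alpha1 >= 0, beta1 >= 0, alpha1 + beta1 < 1")
    mean = sum(returns) / len(returns)
    eps = [r - mean for r in returns]          # demeaned shocks (assumption)
    long_run = c / (1.0 - alpha1 - beta1)      # unconditional variance
    sigma2 = [long_run]                        # assumed starting value
    for t in range(1, len(eps)):
        sigma2.append(c + alpha1 * eps[t - 1] ** 2 + beta1 * sigma2[t - 1])
    return sigma2


# Hypothetical inputs: one large return followed by calm periods.
path = garch11_variance([0.0, 0.0, 5.0, 0.0, 0.0], c=0.1, alpha1=0.1, beta1=0.8)
```

In this toy path the variance jumps in the period after the large shock and then decays, which is the mean-reverting behavior described above.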
Market investors have pursued absolute returns because various risks are caused by unpredictable events in the futures market. Among various asset management techniques, pairs trading has been highlighted as a simple and powerful strategy, and it is also a popular speculation strategy. Pairs trading consists of a buy position in one security and a sell position in another security combined in a predetermined ratio [7]. Most investors and professional analysts have been searching for a profitable trading rule for two securities that utilizes the predetermined ratio involved in pairs trading [10].
Pairs trading involves the formation of a portfolio of two related securities whose relative pricing differs from that in the “equilibrium” state. Thus, prior to pairs trading, two statistical tests are conducted. The first is a unit root test; the presence of a unit root means that the observed time series is not stationary. The test determines whether a time-series variable is non-stationary using an auto-regressive model. When non-stationary time series are used in a regression model, a significant relationship may be observed between unrelated variables. This phenomenon is called spurious regression [11]. In this study, because causality tests are sensitive to non-stationarity, one of the unit root tests, the augmented Dickey-Fuller (ADF) test [25], was used to determine whether the time series {Xt} should be differenced to create stationary data. The ADF test regression is as follows:
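A common form of the ADF regression is sketched below. The coefficient α on the lagged level matches the hypotheses stated for Eq. (5) in Section 3.1; the drift μ, the trend coefficient β, the lag order k, and the coefficients γ_i are assumed names, and the drift or trend terms may be omitted depending on the specification used.

```latex
\Delta X_t = \mu + \beta t + \alpha X_{t-1}
           + \sum_{i=1}^{k} \gamma_i\,\Delta X_{t-i} + \varepsilon_t
\tag{5}
```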
The second test is a co-integration test. If two or more time-series variables are themselves non-stationary but a linear combination of them is stationary, then the series are considered co-integrated. In practice, co-integration analysis allows hypotheses concerning the relationship between two time series that have unit roots to be tested correctly. If a stationary time series is obtained after differencing the series once, the Johansen trace test [16] is applied to verify the long-run equilibrium relationship between the two time series. As stated by Johansen [16], the likelihood ratio test statistic for the hypothesis of (at most) r co-integrated relationships and (at least) m = n - r common trends is given by:
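The trace statistic is usually written as follows, where n is the number of series; the symbol λ_trace and the equation number are assumptions made here for illustration:

```latex
\lambda_{\mathrm{trace}}(r) = -T \sum_{i=r+1}^{n} \ln\left(1 - \lambda_i\right)
\tag{6}
```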
where T is the sample size and λi are the eigenvalues, that is, the squared canonical correlations between the two residual vectors from the level and first-difference regressions. Pairs trading can proceed once the co-integration test confirms such a relationship.
Rough set theory has emerged as a data mining tool for managing uncertainties associated with inexact, noisy, and incomplete data [24]. In rough set theory, an information table represents knowledge as objects described by attributes [23]. The rows in the table correspond to objects, and the cells in the columns contain the attribute values. The main concept is that rows with identical values for one or more attributes are indiscernible, which defines an indiscernibility relation on the finite set of objects (referred to as the universe). Any set of all mutually indiscernible objects is referred to as an elementary set, and the elementary sets form the basic granules of knowledge about the universe. Every subset of the universe can be described either precisely or roughly. If a set of objects is a union of elementary sets, then it is regarded as a crisp set. Otherwise, the set is regarded as a rough set. A rough set can be represented by a pair of crisp sets, which are called the lower and upper approximations of the subset. The lower approximation comprises all objects that certainly belong to the set, whereas the upper approximation comprises all objects that may belong to the set. The boundary region of the vague concept is the difference between the two approximations.
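The approximations described above can be illustrated with a minimal sketch. The toy information table, the attribute names, and the function name are hypothetical; the code groups objects into elementary sets and then collects those that lie inside, or overlap with, the target set.

```python
from collections import defaultdict


def approximations(table, attributes, target):
    """Return (lower, upper) approximations of `target` (a set of object ids)
    under the indiscernibility relation induced by `attributes`."""
    # Objects with identical values on the chosen attributes are indiscernible.
    elementary = defaultdict(set)
    for obj, row in table.items():
        elementary[tuple(row[a] for a in attributes)].add(obj)
    lower, upper = set(), set()
    for block in elementary.values():
        if block <= target:      # elementary set lies entirely inside the target
            lower |= block
        if block & target:       # elementary set overlaps the target
            upper |= block
    return lower, upper


# Hypothetical toy table: discretized indicator levels for four trading days.
table = {
    "d1": {"macd": "high", "vol": "low"},
    "d2": {"macd": "high", "vol": "low"},
    "d3": {"macd": "low", "vol": "low"},
    "d4": {"macd": "low", "vol": "high"},
}
buy_days = {"d1", "d3"}
lower, upper = approximations(table, ["macd", "vol"], buy_days)
boundary = upper - lower
```

Here d1 and d2 are indiscernible but only d1 is in the target, so both fall into the boundary region, while d3 is the only object that certainly belongs to the target.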
Slowinski [27] introduced decision rules, which are constructed in the form of ‘IF condition(s), THEN decision(s)’, to explain the approximations. Certain rules apply to the lower approximations, whereas uncertain rules apply to the boundary region. The validity and coverage factors of decision rules are expressed as conditional probabilities over the universe. Dimitras et al. [6] noted that each decision rule is distinguished by the strength of its conclusion, which is determined by the number of objects that satisfy the condition portion of the rule and belong to the decision portion of the rule. The objects are treated as examples of decisions in the process of generating decision rules based on inductive learning principles. The examples that belong to a given decision class are referred to as positive, and all other objects are negative. Thus, a decision rule is discriminant if it distinguishes between positive and negative examples and is minimal. The generated decision rules support decision making because they prescribe a decision under given conditions.
Not all of the condition attributes in the information table need to be used, because the goal is the highest-quality classification approximation with a minimal subset of attributes (called a reduct). Identifying the minimal subset of condition attributes that provides the same classification quality as the full set of attributes is a step in the rough set approach. Attributes excluded from a reduct may not be necessary for classifying the elements of the universe [24]. If an information table has several reducts, the core of the attributes can be derived as their intersection. The core is a collection of the most meaningful attributes and is critical for ensuring the quality of the classification. Refer to Slowinski [27], Susmaga [29], Jackson et al. [15], and Dimitras et al. [6] for a more detailed description of rough set theory.
The Korean stock futures market fluctuates greatly in real time. Thus, traders require more powerful support in their investment decisions because their capability for analyzing enormous real-time data sets is limited. To construct a PRTS for the derivative, we used Korea Stock Price Index 200 (KOSPI 200) data at 30-minute intervals as the dataset and considered the period from Jul. 1996 to Dec. 2006. The period is divided into a pattern base-constructed period (Jul. 1996 to Dec. 2004) and a real-time trading period (Jan. 2005 to Dec. 2006). Then, the pattern base-constructed period was divided into a training period and a testing period. System trading was applied to the real-time data from the pattern base-constructed period. As a default condition for the system trading, the initial capital was set to 1,000,000 won (approximately 1,000 US dollars), the open market interest rate was set to 5.00%, the transaction cost was set to 10,000 won, and slippage was set to 25,000 won. To evaluate the trading system, the return rates were calculated for the underlying asset. The return rates represent the yearly profit rates that are calculated from the ratio of the current capital value to the initial capital value after one year of trading. The yearly profit is defined as the yearly gross profit minus transaction costs and slippage, in which the yearly gross profit is the yearly short position minus the yearly long position. The slippage cost is an additional amount set aside to prevent missed trades. These conditions were applied in this empirical study.
3.Generation of PTRs
This section provides a detailed description of the procedure for generating PTRs (see <Figure 1>). The input data consist of the closing prices of two stock index futures (the KOSPI 200 index and the S&P 500 index). In this study, a trading simulation is conducted using the moving-window method, which maintains a constant window size for the testing period while varying the window size of the training period to evaluate PTRs. The moving-window method is used to update the training data before the next testing period. Within each experiment, the moving window remains at a constant size and discards trailing samples (earlier data) as it moves forward. The return rate in this study is the yearly profit rate defined by Lee et al. [17].
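The moving-window scheme can be sketched as an index generator over futures contract periods. The function name and parameters are hypothetical. The sketch also assumes that the first test period starts only after the longest training window has elapsed, so that every training size is evaluated on the same test periods; with 40 quarterly contract periods (ten years) and a longest training window of four periods, this yields 36 train/test pairs.

```python
def moving_windows(n_periods, train_size, test_size=1, max_train=4):
    """Yield (train_indices, test_indices) over contract periods 0..n_periods-1.

    The window slides forward by one test period at a time and drops the
    oldest training period, so the training window keeps a constant size."""
    if not 1 <= train_size <= max_train:
        raise ValueError("train_size must be between 1 and max_train")
    start = max_train                      # assumed common starting point
    while start + test_size <= n_periods:
        train = list(range(start - train_size, start))
        test = list(range(start, start + test_size))
        yield train, test
        start += test_size


# Hypothetical setup: 40 quarterly contract periods, 9-month (3-period) training.
splits = list(moving_windows(n_periods=40, train_size=3))
```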
3.1.Co-integration Analysis between Two Securities
This section describes the two-step co-integration analysis between two securities that is used to form pairs in the previous trading period. In the first step, the differences of each security are calculated to identify the presence of a unit root. The unit root test uses the existence of a unit root as the null hypothesis through Eq. (5), i.e., H0 : α = 0, H1 : α < 0. In the second step, the co-integration relationship between two securities that have unit roots is verified. The pair is formed when the “null hypothesis of no co-integration” is rejected, which indicates that the two securities have a long-run equilibrium relationship.
3.2.Input Variable Creation Using the PR
As input variables, this study uses five indicators that were created using the PR, defined as the closing price of one stock index futures contract divided by the closing price of the other in the training period. The five indicators are generated using the MACD indicator, volatility, and the GARCH (1, 1) model presented in Section 2. The five indicators are as follows: MACDPR, VOLPR, GARCH(1, 1)PR, GARCH(1, 1)MACDPR, and GARCH(1, 1)VOLPR. MACDPR is calculated using the PR and Eq. (1), and VOLPR, a basic volatility indicator, is the standard deviation of the continuous PR within a specific time horizon. GARCH(1, 1)PR, the GARCH volatility indicator, is calculated using the PR in Eq. (4). GARCH(1, 1)MACDPR and GARCH(1, 1)VOLPR are the volatility indicators derived from the GARCH (1, 1) model using the MACDPR and VOLPR indicators, respectively.
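The first two indicators can be sketched as follows. All function names are hypothetical, and several details are assumptions rather than the study's settings: the EMA is seeded with the first observation, the 12/26 spans are the common MACD defaults, and the 20-observation volatility window is arbitrary.

```python
from statistics import stdev


def price_ratio(close_a, close_b):
    """PR: closing price of security A divided by that of security B."""
    return [a / b for a, b in zip(close_a, close_b)]


def ema(values, span):
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    k = 2.0 / (span + 1)
    out = [values[0]]                       # seed with the first value (assumption)
    for v in values[1:]:
        out.append(k * v + (1 - k) * out[-1])
    return out


def macd_pr(pr, fast=12, slow=26):
    """MACD of the PR: fast EMA minus slow EMA."""
    return [f - s for f, s in zip(ema(pr, fast), ema(pr, slow))]


def vol_pr(pr, window=20):
    """Rolling sample standard deviation of the PR; None until the window fills."""
    return [None if i + 1 < window else stdev(pr[i + 1 - window:i + 1])
            for i in range(len(pr))]


# Hypothetical closing prices whose ratio is constant at 2.0.
pr = price_ratio([10.0, 20.0, 30.0], [5.0, 10.0, 15.0])
macd_values = macd_pr(pr)
vol_values = vol_pr(pr, window=2)
```

A constant ratio gives a MACD of zero and zero volatility, which is a quick sanity check before applying the functions to real price series.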
3.3.Generation of PTRs Using Rough Set Analysis
This section presents the rough set modeling process that is applicable to real-time trading. In the first step, the outliers are removed through input data exploration to make the data complete. Further data transformation is performed primarily through discretization, which essentially shrinks the set of attribute values. Several discretization techniques can be considered, including equal frequency binning, a naive algorithm, Boolean reasoning, and manual cuts [22, 28]. This study utilizes an equal frequency-binning method to transform the input data.
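Equal frequency binning can be sketched in a few lines. The function names and the toy values are hypothetical, and the number of bins used in the study is not stated, so three bins are used purely for illustration.

```python
def equal_frequency_cuts(values, n_bins):
    """Cut points that place roughly the same number of observations in each bin."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]


def discretize(value, cuts):
    """Bin index of `value`: the number of cut points that are <= value."""
    return sum(value >= c for c in cuts)


# Hypothetical indicator values split into three equally populated bins.
values = [5, 1, 4, 2, 3, 6]
cuts = equal_frequency_cuts(values, n_bins=3)
bins = [discretize(v, cuts) for v in values]
```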
The second step involves the creation of reducts using the transformed data. The creation of reducts is a core process in rough set analysis because the reducts capture the essential information in the transformed data and are indispensable for producing specific rules. Reducts can be created through several methods, including manual reducers, genetic algorithms, Johnson's algorithm, and dynamic reducts [22]. In this step, the manual reducer method is used to create possible reduct combinations of the five indicators.
The final step is rule generation for trading. Using the reducts created in Step 2, the rules are expressed in IF-THEN form, which combines the condition values and decision values. An example of a generated decision rule is provided below.
IF {(the indicator 1 is X1) AND (the indicator 2 is X2) …
AND (the indicator N is XN)} THEN BUY (or SELL).
To apply the generated decision rules in practice, it must be decided whether the rules are applied successively and how many positions may be held at once. Therefore, the following trading rule is used to limit the number of positions held to one.
IF {(the position at time t is BUY (SELL)) AND (the position at time t - 1 is BUY (SELL))} THEN HOLD ELSE SELL (BUY).
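One possible reading of this rule is sketched below: when the signal at time t repeats the signal at time t - 1, the existing position is held; when the signal changes, the trader switches to the new side. This interpretation and the function name are assumptions, since the rule as stated leaves the mixed cases open.

```python
def limit_positions(signals):
    """Map raw BUY/SELL signals to actions so that at most one position is held."""
    actions, previous = [], None
    for signal in signals:
        if signal == previous:
            actions.append("HOLD")     # same side as at t - 1: keep the position
        else:
            actions.append(signal)     # side changed: switch to the new side
        previous = signal
    return actions


# Hypothetical signal sequence.
actions = limit_positions(["BUY", "BUY", "SELL", "SELL", "BUY"])
```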
Finally, the PTRs are simulated in the testing period using the moving-window scheme.
4.Empirical Study
For an empirical example of PTR development, this study uses the daily data from the KOSPI 200 and S&P 500 index futures between June 9, 2000 and June 9, 2010. Each futures contract can be traded for a specific period and ends on the maturity date. The maturity date is a specific date on which the contract expires and delivery of the underlying assets takes place. The maturity date of the KOSPI 200 futures is the second Thursday of March, June, September, and December. In this study, the first date of the period is the initial date of the second futures contract period in 2000. The last date of the period is the maturity date of the second futures contract period in 2010. The window sizes of the training period and testing period mentioned in Section 3 correspond to futures contract periods. Further, the window size of the training period increases incrementally in three-month steps, from one to four futures contract periods. In contrast, the window size of the testing period is fixed at one contract period. This method allows us to determine the appropriate training period for the establishment of the pairs trading rules and to measure performance. <Table 1> presents four testing periods corresponding to the window sizes of the training period. The number of experiment sets is 36 based on the window size of the training period.
Nicholas [20] stated that the concept of pairs trading can be applied to any equilibrium relationship in the stock market portfolios of securities, some of which are held short and others long. A co-integration trace test [16] is performed to verify the long-term equilibrium relationship between the two securities (KOSPI 200 and S&P 500). The basic premise of the test is that the two time series are integrated of the same order. Prior to the test, an ADF test [5] is conducted to verify stationarity. Because there is a unit root in the sample data, the two time series are non-stationary and must be differenced. Therefore, the two time series are transformed into stationary KOSPI 200 and S&P 500 logarithmic return series, that is, the first differences of the logarithmic index values (see <Figure 2> and <Figure 3>). As shown in <Table 2>, the ADF test rejects the null hypothesis of a unit root for the two transformed time series, indicating that the transformed series are stationary and oscillate around a mean value of 0.
Based on the result of the ADF test, the co-integration test is conducted according to the Johansen trace test procedure. The results of the Johansen trace test are reported in <Table 3>. The trace test rejects the null hypotheses of no co-integration relationship and at most one co-integration relationship at the α = 0.05 significance level. Thus, the KOSPI 200 and S&P 500 have a long-run equilibrium relationship, and the two securities can be paired.
For the development and evaluation of PTRs, the data of five indicators (presented in Section 3) are obtained from each training period. Using the data for each training period, decision rules are generated through a rough set analysis. The five indicators are combined into sets of three indicators to create ten reducts using the manual reducer, which in turn generates a set of rules for the reducts in each training period. ROSETTA software [22] is used to perform the rough set analysis. <Figure 4> presents an example set of trading rules extracted with the 9-month window size of the training period in the first experimental set.
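The count of ten follows from choosing three of the five indicators. A minimal sketch that enumerates the candidate attribute subsets is shown below; the identifier spellings are hypothetical stand-ins for the indicator names, and the code only lists the combinations, it does not reproduce the ROSETTA workflow.

```python
from itertools import combinations

# Hypothetical identifiers for the five PR-based indicators.
INDICATORS = ["MACD_PR", "VOL_PR", "GARCH_PR", "GARCH_MACD_PR", "GARCH_VOL_PR"]

# Every way of choosing three of the five indicators: C(5, 3) = 10 subsets.
candidate_reducts = list(combinations(INDICATORS, 3))
```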
The return rates of the trading rules for each window size of the training period (i.e., 3, 6, 9, and 12 months) are measured using the moving-window method over the entire period. In addition, the performance of the PTRs is evaluated against the results of the OPTRs. The OPTRs use the trading signals (i.e., buy and sell) created by the price ratio of two securities [13]. The trading rules of the OPTRs specify that a position should be opened when the ratio of the two securities' prices deviates from its rolling mean by two rolling standard deviations and should be closed when the ratio returns to the mean. <Table 4> compares the return rates of PTRs and OPTRs as the window size of the training period increases. The average return rates of the PTRs are higher than those of the OPTRs for all window sizes of the training period. As observed in <Table 4>, when a 9-month window size is applied, the average return rate of the PTRs is 8.74%, which is high compared to the average of 4% for open market interest rates. The average return rates for the 3-, 6-, and 12-month window sizes are 3.46%, 4.41%, and 3.68%, respectively. This result suggests that a trading rule is not properly generated when the training period is too short or too long.
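The baseline rule can be sketched as a small state machine over the price ratio. The function name, the 20-observation rolling window, and the use of past observations only are assumptions; the band of two standard deviations and the close at the mean follow the description above.

```python
from statistics import mean, stdev


def optr_positions(pr, window=20, k=2.0):
    """Baseline ratio rule: open at +/- k rolling std, close at the rolling mean.

    Returns one position per observation: +1 long the ratio, -1 short, 0 flat."""
    position, out = 0, []
    for i, x in enumerate(pr):
        if i < window:
            out.append(position)           # not enough history yet
            continue
        hist = pr[i - window:i]            # past observations only (assumption)
        m, s = mean(hist), stdev(hist)
        if position == 0:
            if x > m + k * s:
                position = -1              # ratio is rich: short A, long B
            elif x < m - k * s:
                position = 1               # ratio is cheap: long A, short B
        elif (position == -1 and x <= m) or (position == 1 and x >= m):
            position = 0                   # ratio has reverted to the mean
        out.append(position)
    return out


# Hypothetical ratio series with one spike at index 6.
positions = optr_positions([1.0, 1.1, 0.9, 1.0, 1.1, 0.9, 2.0, 1.0, 0.9], window=4)
```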
To select the window size of the training period in a practical manner, the Sharpe ratio is used to evaluate the performance of the PTRs for each window size. Here, the Sharpe ratio is defined as the ratio of the expected difference between the return rates of a given portfolio and those of a risk-free asset to the standard deviation of that difference [26]. In this case, the return rate of the risk-free asset used for the Sharpe ratio calculation was based on a Treasury bill with a maturity of 3 years. <Figure 5> presents the Sharpe ratios of PTRs and OPTRs for different training window periods. The Sharpe ratios of the OPTRs are all negative, whereas the Sharpe ratio of the PTRs with a 9-month window size is higher than those with the other window sizes (see <Figure 5>(c)).
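The definition above translates directly into code. The function name and the sample figures are hypothetical and are not results from this study; the sample standard deviation is used, which is an assumption.

```python
from statistics import mean, stdev


def sharpe_ratio(portfolio_returns, risk_free_returns):
    """Mean excess return divided by the standard deviation of the excess return."""
    excess = [p - f for p, f in zip(portfolio_returns, risk_free_returns)]
    spread = stdev(excess)                 # sample standard deviation (assumption)
    if spread == 0:
        raise ValueError("excess returns have zero dispersion")
    return mean(excess) / spread


# Hypothetical per-period return rates against a constant 4% risk-free rate.
ratio = sharpe_ratio([0.10, 0.06, 0.08], [0.04, 0.04, 0.04])
```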
This result demonstrates that a trading rule can yield the highest return rate when a 9-month period (3 futures contract periods) is used to generate the trading rule in the stock futures market. More specifically, a 9-month period is an appropriate duration for generating a pairs trading rule applied to the stock futures market. This result is satisfactory compared to the OPTRs and implies that PTRs are useful in arbitrage trading.
5.Concluding Remarks
This study proposed trading rules for pairs trading in the stock futures market. PTRs that consider the price ratio between assets were found to yield sizable profits. For the development of PTRs, the window size of the training period was incrementally changed in 3-month intervals (3, 6, 9, and 12 months). The 9-month window size of the training period produced an 8.74% average return rate, which was greater than the open market interest rate (4%) as well as the return rates obtained with the other window sizes. Moreover, the most stable Sharpe ratio was obtained with the 9-month window size. Although this study evaluated PTRs by trading only one security (KOSPI 200), the PTRs were more profitable than the OPTRs. The simultaneous trading of two securities should be considered for future PTRs. This study also considered the period over which rules were generated for pairs trading in the futures market. Thus, it is important to choose an appropriate training duration, because it affects the generation of trading rules with high returns in the futures market.