A COMPARISON OF CAPM AND FAMA-FRENCH THREE-FACTOR MODEL UNDER MACHINE LEARNING APPROACHING

With the economy growing rapidly in recent years, more individuals have started investing in the stock market. Accurately forecasting the rate of return can mitigate investment risk for stock investors and enhance their returns. The Capital Asset Pricing Model (CAPM) and the Fama-French three-factor model (FF3) are widely recognized in academic and practical settings. Comparing these models provides a framework for analyzing the relationship between portfolio risk and return in inefficient markets and contributes to applied data science in behavioral finance. This research used the Support Vector Regression (SVR) algorithm to forecast the returns of diversified portfolios on the Hanoi stock market (HNX) from 2010 to 2022. First, the study calculated the factors and constructed diversified portfolios. Next, the explanatory power of the CAPM and FF3 models was compared using Ordinary Least Squares (OLS). Finally, the SVR algorithm was incorporated within the FF3 framework to develop a predictive model. The findings show that the FF3 model explains returns better than the CAPM. The study also shows that the SVR algorithm outperforms OLS, as it yields lower Root Mean Square Error (RMSE) values. Nevertheless, the FF3 model still lacks explanatory factors. The next research direction is therefore to replace the FF3 model with a more comprehensive multi-factor model in order to obtain an improved predictive model.


INTRODUCTION
The Capital Asset Pricing Model (CAPM) establishes the link between portfolio risk and return. Because of its accessibility and simplicity, the CAPM has become a vital resource for asset management in recent years. The CAPM classifies a stock's total risk as either idiosyncratic or systematic. Portfolio diversification may help reduce exposure to idiosyncratic risk, but it has little effect on systematic risk (Zaimovic et al., 2021; Adaramola et al., 2011). CAPM presupposes that finding the optimum risk-return profile for a portfolio is feasible.
Furthermore, to accomplish this, the ideal portfolio must include all assets, each value-weighted, since adding a new asset increases the portfolio's diversification. The efficient frontier is the set of all possible optimum portfolios, one for each possible rate of return. Because unsystematic risk can be spread over several investments, a portfolio's overall risk can be represented by a single number, beta. However, there is debate about whether CAPM should be used in actual investment settings. Some empirical CAPM tests found a linear relationship between expected return and portfolio risk, but with a slope too flat to be considered meaningful (Khoa & Huynh, 2022; López Prol & Kim, 2022). Dhankar (2019) analyzed data for 158 equities traded on the Bombay Stock Exchange using a battery of tests spanning 1991 through 2002, a period that closely corresponds to the timeframe after liberalization and the implementation of capital market reforms, and concluded that CAPM did not apply to the Indian stock market over the period under consideration. Peng (2021) examined data from the UK market between December 2016 and December 2019 and concluded that, empirically, CAPM is not appropriate because it makes too many assumptions that are hard to satisfy. Building on the CAPM, the Fama-French three-factor model (FF3) adds two factors, size and value. FF3 is still an incomplete model of expected returns since it accounts for only part of the variation in average returns related to profitability and investment (Khoa et al., 2022). When volatile, the market obscures the value effect (Fama & French, 2021); therefore, ensuring a linear relationship between the explanatory variables and the outcome is challenging. In traditional econometrics, FF3 and CAPM are often estimated by Ordinary Least Squares (OLS); however, OLS is less effective than SVR when the relationship between the variables is nonlinear (Li & Li, 2021).
In recent years, Machine Learning (ML) algorithms have emerged as a viable alternative in econometrics. The Support Vector Regression (SVR) algorithm is highly efficient for forecasting continuous variables (Khoa & Huynh, 2023). SVR's power lies in its ability to exploit the nonlinear relationship between variables (Zheng et al., 2021). Chiu et al. (2020) estimated future gold prices with Least Squares Support Vector Regression (LSSVR), using the spot price of gold and an opinion score calculated by mining news articles published in Taiwan between January 1, 2016, and December 31, 2017. Gold, silver, platinum, palladium, and the opinion score are the independent variables, each lagged by one period. MAPE was used as the benchmark for quality evaluation. The Wilcoxon and Friedman tests were used to compare the performance of two prediction models and showed that the LSSVR method, when its parameters are optimized by a Genetic Algorithm, achieves lower MAPE values and higher prediction accuracy. These studies indicate that the SVR algorithm is valuable for constructing predictive models. Khoa and Huynh (2021) noted that the Ho Chi Minh Stock Exchange (HOSE) is inefficient, which makes CAPM invalid on HOSE for the time being. The similarity between HOSE and HNX motivated this research to combine the advantages of the FF3 theoretical model and the SVR algorithm (FF3SVR). The combination is expected to be more efficient than the original models, and this research tests it on monthly HNX data from January 2010 to December 2022. The primary aim is to compare the FF3SVR model with the CAPM and the FF3 to assess the FF3SVR's efficacy. The benchmark models are estimated with OLS, and the F-test is then used to determine how successful the models were. Notable results from this research include the following:
• Validating the Fama-French three-factor model's superior performance over the capital asset pricing model (CAPM) in the HNX market.
• Constructing a return-rate forecasting model using the FF3 framework and the SVR algorithm and verifying that the resultant model is very efficient.

LITERATURE REVIEW
CAPM and 3-factor Fama-French model
Using a linear relationship between expected return and risk, the Capital Asset Pricing Model (CAPM) can be used to determine the optimal portfolio allocation. CAPM distinguishes two types of risk: systematic and unsystematic. Portfolio diversification can reduce unsystematic risk but cannot eliminate systematic risk. Inflation, interest rates, and economic cycles are all systematic risks that affect the market broadly. Unsystematic risks, however, are unique to each business and include potential changes in corporate leadership, strategy, and culture. Since investors can lessen their exposure to company-specific risk through diversification, CAPM states that only market risk should be compensated (Peng, 2021).
In the CAPM, the time-series regression equation is:

$R_{it} - R_{ft} = \alpha_i + \beta_i \mathrm{Mkt}_t + \varepsilon_{it}$

where $R_{it} - R_{ft}$ is the excess return of portfolio i at time t and $\mathrm{Mkt}_t$ is the market's excess return. Some research found that firm size affects returns (Horvath & Wang, 2021; Hu et al., 2019). For example, small businesses may earn higher returns than their larger counterparts. According to these findings, the CAPM fails to explain the size effect adequately. Fama and French (1992) documented a further stock-related effect, the value effect: returns are higher for value firms (those with a high book-to-market (B/M) ratio) than for growth stocks (those with a low B/M). The market is one of the factors in the three-factor model proposed by Fama and French (later called the Fama-French 3-factor model). As soon as it was released, the 3-factor model supplanted the CAPM as the dominant financial explanation.
Size, market, and B/M are the three factors of the Fama-French model. More specifically, the market factor (Mkt) is the excess return of the market portfolio; the size factor (SMB, small minus big) is the return of small firms minus the return of big firms; and the B/M factor (HML, high minus low) is the return of high-B/M stocks minus the return of low-B/M stocks. For a time series, the regression equation looks like this:

$R_{it} - R_{ft} = \alpha_i + \beta_i \mathrm{Mkt}_t + s_i \mathrm{SMB}_t + h_i \mathrm{HML}_t + \varepsilon_{it}$

Where:
$R_{it}$ = total return of a stock or portfolio i at time t.
$R_{ft}$ = Risk-Free Interest Rate.
$\mathrm{Mkt}_t$ = The market's excess rate of return.
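As a hedged illustration of how the size and value factors are commonly built, the sketch below uses the standard 2×3 sort (two size groups by three B/M groups, giving six portfolios). The function name, the dictionary labels, and the sample returns are hypothetical and are not taken from this study's data.

```python
from statistics import mean

def smb_hml(r):
    """Compute SMB and HML for one period from six size/B-M portfolio returns.

    `r` maps hypothetical labels to returns: the first letter is size
    (S = small, B = big), the second is book-to-market (L = low, M = medium, H = high).
    """
    small = mean([r["SL"], r["SM"], r["SH"]])   # average of the three small-cap portfolios
    big = mean([r["BL"], r["BM"], r["BH"]])     # average of the three big-cap portfolios
    high = mean([r["SH"], r["BH"]])             # average of the two high-B/M portfolios
    low = mean([r["SL"], r["BL"]])              # average of the two low-B/M portfolios
    return small - big, high - low

# Made-up monthly returns (in percent), for illustration only.
returns = {"SL": 1.0, "SM": 2.0, "SH": 3.0, "BL": 0.5, "BM": 1.0, "BH": 1.5}
smb, hml = smb_hml(returns)
```

With these made-up inputs, small stocks beat big stocks and high-B/M stocks beat low-B/M stocks, so both factors come out positive.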
Fama and French tested their model using thousands of randomly chosen companies listed on the US market. They found that when size and value factors were added to the model along with the beta factor, it could account for 89 percent of returns in a diversified portfolio of companies. When investors can explain 89 percent of a portfolio's return in this way, they can construct a portfolio whose average expected return is commensurate with the relative risk they have accepted.
Multiple further studies confirmed that the FF3 model performs better than the CAPM. CAPM and FF3 models were tested in several Asian markets, including those in China, Taiwan, Malaysia, Korea, and Singapore; the results indicated that the FF3 model better captured the link between expected return (Khoa et al., 2023) and market volatility (He et al., 2015). Additionally, the FF3 model's adjusted coefficient of determination is greater than that of the single-factor model, and the market factor is the model's most influential variable. Sehrawat et al. (2020) tested the models on the Indian market between 2003 and 2019. Adjusted R² values for CAPM ranged from 64% to 91%, lower than those for FF3 (64% to 93%), which indicates that CAPM is less effective than FF3.

Support Vector Regression
Building on the success of the Support Vector Machine (SVM), a classification algorithm, some research extended SVM to real-valued prediction, the so-called SVR (Parbat & Chakraborty, 2020; Zhong et al., 2019). In machine learning, SVM is often used to analyze high-dimensional data. Solving the SVM classification problem means finding the hyperplane that best separates the training data; this hyperplane is then used to classify the test data. The optimal hyperplane is found by maximizing the margin, that is, the distance between the hyperplane and the support vectors (the points closest to the separating hyperplane). Instead of identifying a separating hyperplane, SVR uses an ε-insensitive loss function to compute a hyperplane such that the predicted values lie within an interval of width ε around the observations; for the regression problem this is termed the ε-SVR model. The linear ε-SVR problem entails approximating a function $f(x) = \langle w, x \rangle + b$ such that the ε-insensitive tube is as flat as possible. Flatness is obtained by minimizing the norm of w, as in Eq. (1):

$\min_{w,b} \ \frac{1}{2}\lVert w \rVert^2 \quad \text{subject to} \quad \lvert y_i - \langle w, x_i \rangle - b \rvert \le \varepsilon$ (1)
Small external perturbations may cause extreme changes in Eq. (1). Noise then becomes more prominent and leads to larger errors. To dampen the effect of extreme data points, slack variables $\xi_i$ and $\xi_i^*$ are introduced to shift the tube boundary outward. Eq. (2) gives the resulting ε-SVR problem:

$\min_{w,b,\xi,\xi^*} \ \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*) \quad \text{subject to} \quad y_i - \langle w, x_i \rangle - b \le \varepsilon + \xi_i, \quad \langle w, x_i \rangle + b - y_i \le \varepsilon + \xi_i^*, \quad \xi_i, \xi_i^* \ge 0$ (2)

A regularization parameter C > 0 determines the trade-off between the flatness of the function f and the prediction errors. For a particular value of C, the flatness is reduced as the slack variables decrease and ε increases. The ε-insensitive loss function is given by Eq. (3):

$\lvert \xi \rvert_\varepsilon = \begin{cases} 0 & \text{if } \lvert \xi \rvert \le \varepsilon \\ \lvert \xi \rvert - \varepsilon & \text{otherwise} \end{cases}$ (3)

When the relationship between the variables is not linear, a linear function may produce significant errors. To handle the nonlinear case, ε-SVR projects the original data onto a higher-dimensional space where a kernel ensures a linear relationship. Standard kernels include the radial, linear, and polynomial kernels, all of which are useful for problems with a structure similar to SVM (Benkraiem & Zopounidis, 2021; Awad & Khanna, 2015).
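To make the ε-insensitive loss and the role of C concrete, the following minimal sketch evaluates the loss and the primal objective of a linear ε-SVR for a given weight vector. The function names are hypothetical, and the defaults C = 0.5 and ε = 0.5 are only borrowed from the hyperparameters quoted later in the methodology; this evaluates the objective and is not a solver.

```python
def eps_insensitive_loss(y_true, y_pred, epsilon=0.5):
    """Return the epsilon-insensitive loss: zero inside the tube, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

def svr_objective(w, b, X, y, C=0.5, epsilon=0.5):
    """Primal objective of linear epsilon-SVR: flatness term plus C times total tube violation."""
    flatness = 0.5 * sum(wj * wj for wj in w)
    violations = sum(
        eps_insensitive_loss(yi, sum(wj * xj for wj, xj in zip(w, xi)) + b, epsilon)
        for xi, yi in zip(X, y)
    )
    return flatness + C * violations

# Toy data (illustration only): the second point lies outside the tube.
objective = svr_objective(w=[1.0], b=0.0, X=[[1.0], [2.0]], y=[1.0, 4.0])
```

Residuals smaller than ε contribute nothing to the objective, which is why only points on or outside the tube (the support vectors) influence the fitted function.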

METHODOLOGY
Monthly data from the Ha Noi Stock Exchange (HNX) were gathered from January 2010 to December 2022. The data collected are market size, stock price (M), and book value (B). The HML and SMB factors are determined following the methodology of Dirkx and Peter (2020). The market factor (Mkt) is the return on the HNX-Index minus the interest rate on a one-year government bond (Fama & French, 2015). Stocks are sorted by beta from low to high and divided into ten equally weighted portfolios, which are rebalanced every six months (Fama & French, 2021). This study gives equal weight to the portfolios' respective rates of return. Table 1 provides an overview of the variables: the return rate of portfolios, the excess rate of return on portfolios, yields on 1-year government bonds, the rate of return of the HNX-Index, the excess rate of return of the market portfolio, the size factor, and the value factor. The size factor and value factor were calculated from the financial statements of businesses listed on HNX. The three models (CAPM, FF3, and FF3SVR) are fitted to this data set.
The Capital Asset Pricing Model (CAPM) is based on several key assumptions that form the model's foundation. Firstly, it assumes that investors are rational and risk-averse, seeking to maximize their wealth. Secondly, it assumes a single-period investment horizon, meaning all investments are evaluated over the same time frame. Thirdly, it assumes that investors have access to the same information and can freely borrow or lend at a risk-free rate. Fourthly, it assumes that the market is efficient, implying that all relevant information is reflected in stock prices. Lastly, it assumes that asset returns are normally distributed and that investors hold well-diversified portfolios. FF3 extends CAPM, so it shares CAPM's assumptions. FF3SVR is the FF3 model estimated with the SVR algorithm instead of OLS.
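The portfolio formation step described above (sorting stocks by beta and splitting them into ten equally weighted portfolios) can be sketched as follows. This is an illustrative outline under stated assumptions: the tickers, betas, and returns are invented, the helper names are hypothetical, and how a remainder is distributed when the number of stocks is not divisible by ten is a choice made here, not one reported by the study.

```python
from statistics import mean

def beta_sorted_portfolios(betas, n_groups=10):
    """Sort tickers by beta (low to high) and split them into n_groups portfolios."""
    ordered = sorted(betas, key=betas.get)
    size, extra = divmod(len(ordered), n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        end = start + size + (1 if g < extra else 0)  # spread any remainder over the first groups
        groups.append(ordered[start:end])
        start = end
    return groups

def equal_weighted_return(group, returns):
    """Equal-weighted return of one portfolio for one period."""
    return mean(returns[t] for t in group)

# Hypothetical tickers, betas and one month of returns (illustration only).
betas = {f"STK{i:02d}": 0.1 * i for i in range(1, 21)}
returns = {t: 0.01 * i for i, t in enumerate(betas, start=1)}
portfolios = beta_sorted_portfolios(betas)
p1_return = equal_weighted_return(portfolios[0], returns)
```

Re-running `beta_sorted_portfolios` with freshly estimated betas every six months would correspond to the rebalancing schedule described in the text.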
In practice, these assumptions are difficult to meet in Vietnam. First, lending and borrowing rates differ, although both are based on the risk-free rate. Second, stock trading incurs transaction costs when investors make purchases through brokers. Third, securities in Vietnam are traded in lots of hundreds, so investors cannot divide securities into arbitrarily small parcels. Finally, investors in Vietnam do not hold homogeneous expectations. Because these assumptions are violated, CAPM and FF3 operate inefficiently in Vietnam.
The CAPM regression model is given in Eq. (4):

$R_{it} - R_{ft} = \alpha_i + \beta_i \mathrm{Mkt}_t + \varepsilon_{it}$ (Eq. 4)

The FF3 regression model is given in Eq. (5):

$R_{it} - R_{ft} = \alpha_i + \beta_i \mathrm{Mkt}_t + s_i \mathrm{SMB}_t + h_i \mathrm{HML}_t + \varepsilon_{it}$ (Eq. 5)

This research compared the time series regression models of CAPM and FF3 using the adjusted R². Eq. 4 and Eq. 5 were used to estimate the coefficients of the CAPM and FF3, respectively.
FF3SVR's coefficients were estimated using Eq. 5 and the SVR algorithm. The metrics were then compared to find the most efficient model (Gharaibeh et al., 2022).
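The comparison of CAPM and FF3 by adjusted R² can be illustrated with a small OLS sketch. It assumes NumPy is available and uses synthetic data rather than the HNX sample; the helper name `adjusted_r2` and the factor loadings are made up for the example.

```python
import numpy as np

def adjusted_r2(y, X):
    """Fit y = a + X b by OLS and return the adjusted R-squared."""
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])          # add the intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)  # least-squares estimate
    resid = y - design @ coef
    r2 = 1.0 - resid.var() / y.var()
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# Synthetic monthly data (NOT the HNX sample): 156 months, three factors.
rng = np.random.default_rng(0)
factors = rng.normal(size=(156, 3))                    # columns: Mkt, SMB, HML
excess_ret = factors @ np.array([1.0, 0.8, 0.6]) + rng.normal(size=156)

capm_adj_r2 = adjusted_r2(excess_ret, factors[:, :1])  # market factor only (Eq. 4)
ff3_adj_r2 = adjusted_r2(excess_ret, factors)          # market, size and value (Eq. 5)
```

Because the synthetic returns load on all three factors by construction, the three-factor regression shows the higher adjusted R²; on real data the size of the gap is an empirical question.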
The study divided the data set into two parts at a ratio of roughly 70:30 to train and test the model. Specifically, the data from January 2010 to December 2018 are used for the training set, and the data from January 2019 to December 2022 are used for the testing set. In addition, the study used the Root Mean Square Error (RMSE), defined in Eq. (6), as in the previous study by Nguyen et al. (2021). For the SVR algorithm, the study used a linear kernel, a cost of 0.5, and epsilon = 0.5, as in the previous study by Khoa et al. (2021).

$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n} (\hat{y}_t - y_t)^2}$ (Eq. 6)

Where: $\hat{y}_t$ is the forecast value and $y_t$ is the actual value.
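The train/test procedure can be sketched as below, assuming scikit-learn and NumPy are installed. The data are synthetic stand-ins for the 156 monthly observations, the coefficients are invented, and only the split sizes and the SVR settings (linear kernel, C = 0.5, epsilon = 0.5) come from the text.

```python
import numpy as np
from sklearn.svm import SVR

def rmse(actual, forecast):
    """Root mean square error between actual and forecast values."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.sqrt(np.mean((forecast - actual) ** 2)))

# Synthetic stand-in for 156 monthly observations (Jan 2010 - Dec 2022).
rng = np.random.default_rng(0)
X = rng.normal(size=(156, 3))                       # columns: Mkt, SMB, HML
y = X @ np.array([1.0, 0.5, -0.3]) + 0.1 * rng.normal(size=156)

# Chronological split: first 108 months (2010-2018) train, last 48 (2019-2022) test.
n_train = 108
X_train, X_test = X[:n_train], X[n_train:]
y_train, y_test = y[:n_train], y[n_train:]

# Linear-kernel SVR with the hyperparameters quoted in the text.
model = SVR(kernel="linear", C=0.5, epsilon=0.5)
model.fit(X_train, y_train)
test_rmse = rmse(y_test, model.predict(X_test))
```

The split is chronological rather than random so that the test months always follow the training months, which avoids using future information when forecasting.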
Finally, to compare the three predictive models, an F-test is used to test the null hypothesis, H0, that "there is no difference across models."

RESULTS
The HNX equities were separated into ten portfolios, each representing a different beta level. There are 156 observations, one for each month from January 2010 to December 2022. Average excess returns vary widely across the portfolios, ranging from -0.369 to 0.287. The P4 portfolio is concentrated in banks and technology, so it has a high rate of return. Due to the impact of the COVID-19 pandemic and recent crises, the stock market has tended to decline. In addition, banks' deposit interest rates increased sharply, so companies faced difficulties in raising capital. As a result, some portfolios (P3, P6, and P10) have negative average excess returns. Descriptive statistics are reported in Table 2.
In general, the risk-free rate moves smoothly and trends downward. The risk-free interest rate hovered around zero for all of 2019 and was particularly low at the start of the year. At this point, businesses had many issues with production and supply due to the COVID-19 outbreak, and the government issued several stimulus measures, such as preferential interest rates. In addition, the market's volatility at this early stage meant that returns were unpredictable. Following the global financial crisis of 2010-2013, the central bank instituted several measures to revive the economy, including the free trading of securities and stock market regulation. There was substantial inflation at that time, and several stimulus measures helped lower the risk-free interest rate to about 8% (previously up to 15-20 percent due to the impact of inflation). After the global financial crisis, market risks generally stabilized.
Standard deviations of the components' average returns ranged from 3.982 to 5.851. The Mkt factor has an average return of -0.005, against an average risk-free rate of 0.052, implying that investing in the market portfolio is inefficient compared with investing in bonds. Fig. 1 depicts the changes in the factors and the risk-free rate.
This study used time series regression to estimate the CAPM and FF3 models. See Table 3 and Table 4 for the results. The adjusted R² for the CAPM ranges from 0.049 to 0.107, whereas the adjusted R² for the FF3 model is higher, at 0.189 to 0.406. By this measure, the FF3 model is more effective than the CAPM. Several authors reported similar findings in prior research, including Sehrawat et al. (2020) and Fama and French (1993). Specifically, all beta estimates in CAPM are statistically significant and positive, which implies a positive linear relationship between expected return and risk. Consistent with the CAPM prediction, the intercept is not statistically significant in seven out of ten portfolios. The significant estimated coefficients for the size factor suggest a size effect: companies with a smaller market capitalization often earn a larger risk premium. This finding agrees with similar research (Basu, 1983). Table 3 and Table 4 report the CAPM and FF3 regression results.
Compared with the CAPM, the FF3 model is more effective when the same estimation method is used. With the OLS method, the mean forecast error is 2.947 for the FF3 model versus 3.147 for the CAPM. The average error for FF3 estimated with the SVR algorithm is 2.674, the lowest of all tested methods. This is consistent with the time series regression results in Table 4, where the FF3 model outperforms the CAPM in terms of R², a measure of model fit. This result shows that both the theoretical framework and the estimation method are crucial components of predictive modeling. The forecast errors are summarized in Table 5.
The results in Table 5 argue against H0, the hypothesis that there is no discernible variation among the models; in particular, the FF3 model estimated with the SVR method has the lowest average RMSE. A formal statistical test is required to verify this finding, however. We examine this hypothesis using a one-way Analysis of Variance (ANOVA) F-test. The factor of interest (referred to as the "model") has three levels: the forecasting models CAPM, FF3, and FF3SVR. There are 30 RMSE values in total, one for each combination of model and portfolio. Table 6 provides a summary of the analysis of variance. According to Table 6, Fstat = MSSB/MSSW = 6.176 and Fcritical = 3.354, so Fstat > Fcritical (P = 0.006 < 0.05). This study therefore rejects H0: the mean RMSE differs between models. Considering these findings, it can be concluded that the FF3 model combined with the SVR algorithm has shown promising results.
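The F statistic used above is the ratio of the between-group mean square to the within-group mean square. The sketch below computes it from scratch for three groups of RMSE values; the numbers are invented for illustration and are not the values in Table 5, and the function name is hypothetical.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over lists of values."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = mean(v for g in groups for v in g)
    group_means = [mean(g) for g in groups]
    # Between-group sum of squares: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-group sum of squares: spread of the values around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up RMSE values for three models (illustration only, not Table 5).
rmse_by_model = [[3.0, 3.2, 3.4], [2.8, 3.0, 3.2], [2.4, 2.6, 2.8]]
f_stat, df_b, df_w = one_way_anova_f(rmse_by_model)
```

With three models and 30 RMSE values, the same function would yield 2 and 27 degrees of freedom, which is the setting in which the reported F statistic is compared with its critical value.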

DISCUSSION
As with any endeavor, there is always a give-and-take between potential gain and potential loss, making risk management crucial for investors. The larger the potential reward, the greater the associated risk. Diversification can mitigate unsystematic risk but not systematic risk. The CAPM quantifies the linear relationship between the expected return and the systematic risk of a diversified portfolio. Based on the empirical results, a linear relationship holds in all portfolios. The CAPM theoretical framework predicts a value of 0 for the intercept, meaning that the estimated intercept should not be statistically significant. The empirical results depart from the CAPM prediction in three out of ten portfolios; that is, seven portfolios agree with CAPM and three disagree. Findings supporting CAPM align with those of several prior studies (Sehrawat et al., 2020; Pei, 2019). However, whereas earlier research found R² values between 0.64 and 0.92, indicative of CAPM's explanatory ability (Sehrawat et al., 2020), the current study's R² values ranged only from 0.049 to 0.107.
Because it lacks explanatory variables, the CAPM cannot adequately explain why some portfolios have higher or lower expected returns than others. The CAPM serves as the basis for the FF3 model, which adds two variables, one for the size effect and one for the value effect. These factors have a statistically significant effect on the portfolios' expected returns, and the adjusted coefficient of determination (adjusted R²) is also greatly improved compared with CAPM, with values ranging from 0.189 to 0.406 based on empirical data from the HNX market. Previous research, such as Sehrawat et al. (2020), Chui and Wei (1998), and Fama and French (1993), has shown that the three-factor model is more explanatory than CAPM; our finding is therefore in line with that literature. Phong and Hoang (2012) analyzed the application of the FF3 model to the Vietnamese stock market from 2007 to 2011. The authors found a positive relationship between size and stock returns. This result contradicts the conclusions of previous research and likely reflects characteristics specific to the stock market in Vietnam. Nguyen et al. (2019)'s study of the Vietnamese market from 2010 to 2017 reveals a strong foreign ownership effect, in which more foreign ownership boosts stock liquidity, profitability, and size and exposes investors to greater risk. This finding suggests that the FF3 model's assumed linear relationship is unstable or that the model lacks necessary explanatory factors. Our research showed that the FF3 model had only a relatively low coefficient of determination from 2010 to 2022, ranging from 0.19 to 0.41.
Predicting stock market prices and trends is considered a formidable undertaking because of the inherently chaotic nature of financial markets. The stock market can be viewed as a nonlinear, non-parametric, and noisy deterministic system, driven by factors such as liquidity, stock availability, human behavior, news affecting the market, speculative activity, and international monetary fluctuations. Given its significance as an emerging sector of the economy and the involvement of many stakeholders, researchers and experts are interested in exploring this domain and clarifying the underlying chaotic system for trend pattern recognition.
Using the advantages of machine learning algorithms, combined with the underlying theoretical framework of the CAPM and FF3 valuation models, this research built a model to predict the return of diversified portfolios. The combination worked well in the experiment, as the results in Tables 5 and 6 show. Since the FF3 model has been shown statistically to be more efficient than CAPM, the combined FF3SVR model is, as a corollary, more efficient than CAPM estimated with OLS. This result is consistent with Gogas et al. (2018), who found that the combined SVR and FF3 model is more effective than CAPM. However, a limitation of Gogas et al. (2018) is that some necessary tests were not performed to support the conclusion. This study overcame that limitation by using the ANOVA table and the F-test to demonstrate the model's effectiveness.

CONCLUSION AND RECOMMENDATION
Each year, at the end of June, stocks are ranked by market capitalization, book-to-market ratio, and other measures. Combining the size sort with the book-to-market (B/M) sort gives six groups from which the HML and SMB factors are derived. For Mkt, this research uses the market index (HNX-Index), or more precisely, the difference between the return of the HNX-Index and the yield on a 1-year government bond. This research provides empirical evidence that the FF3 model outperforms the CAPM on the HNX exchange. Further, portfolio returns may be accurately predicted when the FF3 model is estimated with the SVR algorithm. This research is novel in that it applies a combined empirical model to the HNX market and shows how effective this method is. Based on its superior predictive power, the FF3SVR model should be used by investors and risk managers in place of the more common CAPM and FF3 models.
The CAPM shows how a portfolio's expected return may be linearly related to its risk. However, it is difficult to apply CAPM empirically because it relies on too many assumptions. The Fama-French three-factor model, derived from CAPM by adding size and value factors, has outperformed CAPM. This research showed the same result in the HNX market. Predictive modeling has benefited from machine learning methods, particularly SVR. Based on these findings, the research suggests replacing the CAPM and FF3 models with the combined FF3 and SVR model for forecasting portfolio returns.
The FF3 model has been shown to have gaps in its ability to explain expected returns. A more comprehensive model, such as the Fama-French 5-factor model, is therefore required. A natural extension is to predict portfolio returns by combining a 5-factor model with machine learning algorithms such as SVR or a Long Short-Term Memory recurrent neural network.

Figure 1. Factor returns and the risk-free rate. Source: author's work

Table 1: Variable Description

Table 5: RMSE error of the models