1.Introduction
Measuring the percentage of body fat has become a popular and standard practice for many ergonomists, physicians, athletic trainers, and work physiologists. Evidence indicates that obesity (excessive fat) is closely related to musculoskeletal injury, reduced motor performance, and many health problems in industry [1, 3]. Overweight individuals have a higher risk of some musculoskeletal disorders, particularly of the lower back [4]. Craig et al. [2] demonstrated a close relationship between the rate of handlers' injuries at work and a high percentage of body fat. In another study performed at an aluminum manufacturing company, approximately 85% of the employees who had sustained at least one injury were classified as overweight or obese [9]. In addition, several studies have shown that indirect medical costs are higher for obese workers than for non-obese workers [7, 14].
Hydrostatic or underwater weighing is the most widely used laboratory procedure for measuring body density. This method uses Archimedes’ principle that a body immersed in a fluid is acted on by a buoyancy force, evidenced by a loss of weight equal to the weight of the displaced fluid. The body density measured through underwater weighing can then be used to calculate the percentage of body fat using Siri’s equation [13]. The hydrostatic technique has been shown to be highly reliable when measurements were repeated over time intervals ranging from 30 minutes to a couple of days, and a standard error of measurement of less than 0.002 g/cc has been observed [10]. However, the underwater weighing procedure is relatively expensive, difficult to perform, and requires a large space.
Anthropometry deals with the measurement of size, weight, and proportion of the human body. Anthropometric techniques are popular for predicting body composition in the field setting because they are cheap to implement, require little space, and are easy to perform. In addition, anthropometric procedures are noninvasive, and training can be provided without prerequisite courses. Consequently, anthropometric methods are applicable to large samples [11].
Fitting the percentage of body fat (measured through the underwater weighing procedure) to other anthropometric measurements using multiple regression analysis provides a convenient way of estimating body fat. Accordingly, the objective of the present study was to examine which anthropometric variables are related to the percentage of body fat and how they are related. For this study, the percentage of body fat obtained using Siri’s equation and ten anthropometric variables (i.e., body circumference measures) for 252 men were analyzed.
In general, the objective of a regression analysis is to control, describe, or predict response variables in relation to predictor variables. The present study aimed to achieve the last two purposes. For description, this study sought to explain which body circumference variables are related to the percentage of body fat and how they are related. For prediction, a statistical regression model developed by analyzing the relationship between the percentage of body fat and the body circumference variables can be used to predict body fat. Such a regression model is easy to use, inexpensive, and convenient compared with the underwater weighing technique. To meet these purposes, a multiple regression analysis was performed.
2.Source and Characteristics of Data
The data for the percentage of body fat and body circumferences were obtained from the web site http://math.arizona.edu/~jwatkins/505d/body.htm. The data were originally supplied by Dr. A. Garth Fisher, Human Performance Research Center, Brigham Young University, Provo, Utah 84602, who gave permission to freely distribute the data and use them for non-commercial purposes [8].
The percentage of body fat and ten body circumference measurements were recorded for 252 men (cross-sectional data). Body fat was estimated through an underwater weighing technique. The percentage of body fat was treated as the dependent variable, and the 10 body circumferences were treated as independent variables, as follows :
Dependent (response) variable : The percentage of body fat (PCTFAT) is a measure of health. The values were obtained in two steps. First, the body density (Db, g/cc) was measured through underwater weighing. The percentage of body fat was then calculated using Siri’s equation [13] (i.e., PCTFAT = 495/Db - 450).
Independent (predictor) variables : The 10 independent variables are neck circumference (NECKCIR, cm), chest circumference (CHESTCIR, cm), abdomen circumference (ABDOCIR, cm), hip circumference (HIPCIR, cm), thigh circumference (THIGHCIR, cm), knee circumference (KNEECIR, cm), ankle circumference (ANKLECIR, cm), extended biceps circumference (BICEPCIR, cm), forearm circumference (ARMCIR, cm), and wrist circumference (WRISTCIR, cm). In taking circumference measurements, the tape measure was positioned in a horizontal plane or perpendicular to the length of the segment being measured [10].
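The two-step calculation of the response variable can be illustrated with a short sketch. The function name and the example density below are hypothetical and are not taken from the data set; only the formula PCTFAT = 495/Db - 450 comes from the text.

```python
def siri_percent_fat(body_density_g_per_cc: float) -> float:
    """Convert body density (g/cc) to percent body fat with Siri's equation."""
    if body_density_g_per_cc <= 0:
        raise ValueError("body density must be positive")
    return 495.0 / body_density_g_per_cc - 450.0


# Hypothetical density chosen only for illustration (not an observed value).
example_pctfat = siri_percent_fat(1.05)
print(round(example_pctfat, 2))  # about 21.43
```

Because the density appears in the denominator, small errors in Db translate into noticeable changes in the estimated percentage of body fat.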
3.Development of Prediction Model
3.1.Full Model (Initial Model with 10 Predictor Variables)
It was expected that all 10 predictor variables would be positively related to the percentage of body fat (PCTFAT). Therefore, a model with all 10 predictor variables was chosen as the initial model :
PCTFAT = β0 + β1 NECKCIR + β2 CHESTCIR + β3 ABDOCIR + β4 HIPCIR + β5 THIGHCIR + β6 KNEECIR + β7 ANKLECIR + β8 BICEPCIR + β9 ARMCIR + β10 WRISTCIR + ε
All regression coefficients were anticipated to be positive because all anthropometric circumferences tend to increase as the percentage of body fat increases. In some cases, such as a body builder, an increase in BICEPCIR or THIGHCIR may accompany a decrease in the percentage of body fat. However, such cases were not expected to change the overall positive relation to body fat. Interaction terms are not included in the initial model. If other variables such as sex or a history of physical exercise were included, some interaction terms could be suspected. For example, it is easy to suspect an interaction between sex and HIPCIR (due to the difference in HIPCIR between males and females). In the initial model, however, there are no known pairs of variables that interact with each other. Therefore, no interaction terms are included.
Scatter plots of the response variable against each predictor variable can aid in determining the nature and strength of the bivariate relationships between each predictor variable and the response variable. A complement to the scatter plot matrix that may be useful at times is the correlation matrix [6]. To obtain preliminary information about the variables, scatter plots of the response variable against each predictor variable and the correlation matrix were generated. An example scatter plot and the correlation matrix are presented in <Figure 1> and <Table 1>, respectively. The plots show positive linear relationships between the response variable and each predictor variable. The correlation matrix contains several values greater than 0.7, implying some multicollinearity (MC) between the predictor variables. These findings were subjected to further analysis. The variance inflation factor (VIF) was analyzed to check for MC problems. The VIF is often used as a measure of the severity of MC, and a maximum VIF greater than 10 is generally taken as an indication of serious MC between the predictor variables [5]. The largest VIF was 9.868 for HIPCIR. Even though MC exists, its degree was not severe by this criterion.
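As a hedged illustration of how a VIF can be computed, the sketch below regresses each predictor on the remaining predictors and returns 1/(1 - R²) for each. It is not the SAS routine used in this study; the function name and the synthetic data are assumptions made only for the example, and NumPy is assumed to be available.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return VIF_j = 1 / (1 - R_j^2) for every column j of X (n rows, p columns)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        target = X[:, j]
        # Regress predictor j on an intercept plus all other predictors.
        design = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        sse = resid @ resid
        sst = ((target - target.mean()) ** 2).sum()
        vifs[j] = sst / sse  # identical to 1 / (1 - R_j^2)
    return vifs


# Synthetic illustration (not the body fat data): x2 is nearly a copy of x1.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.3 * rng.normal(size=200)
x3 = rng.normal(size=200)
vifs = variance_inflation_factors(np.column_stack([x1, x2, x3]))
print(np.round(vifs, 2))  # the first two VIFs are inflated, the third stays near 1
```

Under the rule of thumb cited above, a predictor would be flagged when its VIF exceeds 10.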
As shown in <Figure 1>, there may be some outliers with distinctly high or low PCTFAT. For example, an extremely short person or an extremely well-trained person (e.g., a body builder) might appear as an outlier. Such outliers, if they exist, could violate the normality assumption. However, because the sample size is relatively large (n = 252), it is anticipated that one or two outliers, even if present, will not affect the overall relationship between the dependent variable and the independent variables.
3.1.1.Analysis of the Full Model
All 10 predictor variables were included in the full model. The regression coefficients of intercept and 10 predictor variables are shown in <Table 2>. The coefficient of multiple determination (R2) and mean squared error (MSE) for the full model are 0.7347 and 19.3462, respectively.
3.1.2.Variable Selection
Roche indicated that there were no known pairs of body circumference measures that were good predictors of total body composition [11]. Therefore, all possible regression models were investigated. The RSQUARE procedure of SAS 9.2 performs all possible regressions for a collection of independent variables (SAS Institute Inc., 2009). Using the RSQUARE procedure and its options, the R2, adjusted R2, Cp, and MSE statistics were obtained for all possible models, and the results for the 5-, 6-, and 7-variable models with high R2 are presented in <Table 3>. Here, Cp was introduced by Mallows as a criterion for selecting a regression model [6]. A model with little bias tends to fall near the line Cp = p, where p is the number of parameters in the model including the intercept. The first five-predictor model (i.e., the model with NECKCIR, ABDOCIR, HIPCIR, ARMCIR, and WRISTCIR) was selected based on the parsimony principle. This model was subjected to the further analysis presented in the next section.
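The all-possible-regressions search can be sketched as follows. This is an illustrative Python version under stated assumptions, not the SAS RSQUARE output: the function names, the column labels, and the synthetic data are hypothetical, and NumPy is assumed to be available. Mallows' Cp is computed as SSE_p/MSE_full - (n - 2p), with p counting the intercept.

```python
from itertools import combinations

import numpy as np


def sse_of_fit(X, y):
    """Residual sum of squares of an OLS fit of y on an intercept plus X."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)


def all_subsets(X, y, names):
    """R2, adjusted R2, Mallows' Cp, and MSE for every non-empty predictor subset."""
    n, k = X.shape
    mse_full = sse_of_fit(X, y) / (n - k - 1)
    sst = float(((y - y.mean()) ** 2).sum())
    rows = []
    for size in range(1, k + 1):
        for cols in combinations(range(k), size):
            sse = sse_of_fit(X[:, list(cols)], y)
            p = size + 1  # number of parameters, including the intercept
            rows.append({
                "vars": tuple(names[c] for c in cols),
                "p": p,
                "r2": 1.0 - sse / sst,
                "adj_r2": 1.0 - (sse / (n - p)) / (sst / (n - 1)),
                "cp": sse / mse_full - (n - 2 * p),
                "mse": sse / (n - p),
            })
    return rows


# Synthetic illustration (not the body fat data): only x0 and x2 drive y.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(100, 4))
y_demo = 1.0 + 2.0 * X_demo[:, 0] - X_demo[:, 2] + 0.5 * rng.normal(size=100)
results = all_subsets(X_demo, y_demo, ["x0", "x1", "x2", "x3"])
for row in sorted(results, key=lambda r: r["cp"])[:3]:
    print(row["vars"], round(row["cp"], 2), round(row["adj_r2"], 4))
```

With 10 predictors the same loop would fit 1,023 models. The full model always has Cp = p by construction, so Cp is informative only for comparing the smaller models.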
3.2.Reduced Model (Model with 5 Predictor Variables)
The model with 5 predictor variables (NECKCIR, ABDOCIR, HIPCIR, ARMCIR, and WRISTCIR) selected from the initial model was subjected to further analysis. As shown in <Table 4>, all 5 variables were significant at the 0.05 significance level. The coefficient of multiple determination (R2) and mean squared error (MSE) for the reduced model are 0.7312 and 19.2103, respectively. The model is summarized as follows :
PCTFAT = 2.704 - 0.601 NECKCIR + 0.974 ABDOCIR - 0.332 HIPCIR + 0.409 ARMCIR - 1.618 WRISTCIR + ε
3.2.1.Evaluation of Assumptions for the Reduced Model
In general, linear regression modeling rests on several assumptions about the model, including linearity, constant variance, and normality of the errors. To see whether a particular variable should enter the model linearly, partial residual plots were examined. An example plot for PCTFAT vs. ABDOCIR is shown in <Figure 2>. The absence of visible curvature indicates that linear terms are adequate. The linearity assumption for the reduced model is therefore not violated.
To check the constant variance assumption, a plot of the residuals vs. the predicted values was generated, as presented in <Figure 3>. The absence of a visible systematic pattern indicates that the constant variance assumption is not violated.
To identify outliers, the studentized residuals and Cook’s D statistics were investigated. Here, Cook’s Di is an overall measure of the influence of the ith observation on the estimated regression coefficients [6]. All the studentized residuals were less than 3 in absolute value. The largest absolute values were 2.527, 2.482, and 2.613 for observations 39, 82, and 207, respectively. All the Cook’s D values were less than 1. The values for observations 39, 82, and 207 were 0.450, 0.026, and 0.026, respectively. Based on this analysis, no outliers were found. To support this finding, a normal probability plot was examined and is presented in <Figure 4>. The approximately linear pattern in the figure supports the normality assumption.
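The two diagnostics can be sketched as below. This is a minimal illustration, assuming internally studentized residuals (the text does not state which variant SAS reported); the function name and the synthetic data with one planted outlier are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def influence_diagnostics(X, y):
    """Return (studentized residuals, Cook's D) for an OLS fit with an intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    p = design.shape[1]  # number of parameters, including the intercept
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    # Leverages h_i: diagonal of the hat matrix design (design' design)^-1 design'.
    xtx_inv = np.linalg.inv(design.T @ design)
    leverage = np.einsum("ij,jk,ik->i", design, xtx_inv, design)
    mse = resid @ resid / (n - p)
    # Internally studentized residuals (assumed variant).
    student = resid / np.sqrt(mse * (1.0 - leverage))
    cooks_d = student ** 2 * leverage / (p * (1.0 - leverage))
    return student, cooks_d


# Synthetic illustration (not the body fat data) with one planted outlier.
rng = np.random.default_rng(2)
x = np.linspace(0.0, 10.0, 30)
y = 1.0 + 2.0 * x + 0.5 * rng.normal(size=30)
y[29] += 10.0  # outlier placed at a high-leverage point
student, cooks = influence_diagnostics(x.reshape(-1, 1), y)
flagged = np.where((np.abs(student) > 3) | (cooks > 1))[0]
print(flagged)
```

The flagging rule mirrors the cutoffs used in the text: an absolute studentized residual above 3 or a Cook's D above 1.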
3.2.2.Multicollinearity
As with the full model, variance inflation factors (VIFs) were investigated to check for possible multicollinearity (MC) problems in the reduced model (see <Table 5>). The highest VIF was 4.81776 for ABDOCIR, which is less than 10. Even though some MC exists, its degree was not severe. Compared with the full model, the VIFs were relatively small.
3.2.3.Interactions
As mentioned for the initial model, no interaction terms were included because there are no known pairs of variables that interact with each other. Although no interaction was anticipated, all possible two-way interaction terms of the reduced model were investigated, as shown in <Table 6>. No interaction term was significant at the 5% significance level, although ABDOCIR×ARMCIR (p = 0.0517) was marginal.
4.Discussion and Conclusion
Roche (1996) reported that relatively accurate estimates of body composition for men were obtained with bicep circumference, hip circumference, abdomen circumference, and arm circumference. However, there were no known pairs of variables that were good predictors of total body composition [11]. Based on the data analysis in the present study, the five variables selected were hip circumference, abdomen circumference, forearm circumference (ARMCIR), neck circumference, and wrist circumference. The bicep circumference did not turn out to be a good predictor of body fat percentage. This may be due to the high variability between individuals in biceps muscle development (e.g., different physical training among individuals).
It was also anticipated that all regression coefficients would be positive because all anthropometric circumferences tend to increase as the percentage of body fat increases. However, the coefficients of NECKCIR, HIPCIR, and WRISTCIR were negative. Wrong signs caused by multicollinearity (MC) between the predictor variables were suspected. However, the VIF analysis showed no serious MC problems. Therefore, it is concluded that the partial relationships differ from the marginal relationships: with the other predictors in the model held fixed, a larger neck, hip, or wrist circumference is associated with a lower percentage of body fat, even though each of these circumferences is positively related to body fat when considered alone.
The fitted prediction model is : % Body Fat = 2.704 - 0.601 (Neck Circumference) + 0.974 (Abdomen Circumference) - 0.332 (Hip Circumference) + 0.409 (Forearm Circumference) - 1.618 (Wrist Circumference) + ε. This model can now be used to estimate the percentage of body fat using only a measuring tape. The units are percent (%) for the percentage of body fat and cm for the body circumferences. In taking circumference measurements, the tape measure should be placed as follows [10] :
- Neck circumference : The tape measure is placed in a horizontal plane at the level of the widest part of the neck as seen from the front aspect.
- Abdomen circumference : The tape measure is positioned horizontally at the level of the greatest anterior extension of the abdomen.
- Hip circumference : The tape measure is placed in a horizontal plane at the level of maximum extension of the buttocks.
- Forearm circumference : The tape measure is placed around the proximal part of the forearm, perpendicular to its long axis, at the level of maximum circumference.
- Wrist circumference : The tape measure is placed perpendicular to the long axis of the forearm and in the same plane on the anterior and posterior aspects of the wrist.
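Applying the fitted model to a set of tape measurements is a single weighted sum, as the sketch below shows. The coefficients are those of the reduced model reported above; the function name and the example measurements are hypothetical and do not correspond to any subject in the data set.

```python
# Coefficients of the reduced model reported in the text (circumferences in cm).
INTERCEPT = 2.704
COEFFICIENTS = {
    "NECKCIR": -0.601,
    "ABDOCIR": 0.974,
    "HIPCIR": -0.332,
    "ARMCIR": 0.409,   # forearm circumference
    "WRISTCIR": -1.618,
}


def predict_pctfat(measurements_cm):
    """Estimate percent body fat from the five circumferences (cm)."""
    missing = set(COEFFICIENTS) - set(measurements_cm)
    if missing:
        raise KeyError(f"missing measurements: {sorted(missing)}")
    return INTERCEPT + sum(
        beta * measurements_cm[name] for name, beta in COEFFICIENTS.items()
    )


# Hypothetical measurements chosen only for illustration.
example = {"NECKCIR": 38.0, "ABDOCIR": 90.0, "HIPCIR": 100.0,
           "ARMCIR": 29.0, "WRISTCIR": 18.0}
print(round(predict_pctfat(example), 2))  # about 17.06
```

Because the model was fitted to data from men only, estimates for measurements outside the range of that sample should be treated with caution.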
Some limitations of the present study should be addressed in future work. The underwater weighing technique is not free from measurement error. Its measurement errors are mainly due to errors in determining the residual volume of the lungs, and these errors can have a considerable effect on the measured body density [9, 10]. Another possible error results from the conversion of body density to percent fat. Although universally accepted, Siri’s equation is based on the results of direct compositional analysis of human cadavers, but only a few cadavers were used and they did not represent the distribution of the normal population. Measuring anthropometric circumferences is also not free from errors associated with measurement devices and techniques. In addition, the model did not consider age and sex. The anthropometric circumferences may vary with age, sex, and race. The data were collected only for men of unspecified race. For future research, it would be interesting to investigate the effects of age, sex, and race in predicting the percentage of body fat.