[Stata] Chi-square test & Post-hoc analysis (tab and tabchi)
The chi-square test is an analysis used when both the independent and dependent variables are categorical variables.
In Stata, the relationship between two categorical variables can be examined with a cross-tabulation, and the chi-square test can be requested as an option of the same command.
tab yvar xvar, chi column
As above, list the two categorical variables after the tab command and add “chi” as an option to perform the chi-square test. Here, categorical variables also include binary variables.
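To try the syntax without your own data, here is a minimal sketch using the auto dataset that ships with Stata. The variables rep78 (repair record, 5 categories) and foreign (binary) are stand-ins for yvar and xvar and are not part of the example discussed in this post. The option is spelled out as chi2 here; chi is an accepted abbreviation of it.

```stata
* Load a dataset shipped with Stata (stand-in for your own data)
sysuse auto, clear

* Cross-tabulate two categorical variables and request Pearson's chi-square test
* rep78 plays the role of yvar, foreign the role of xvar
tabulate rep78 foreign, chi2
```

Observations with missing values on either variable are dropped from the table by default, so the N reported under the table can be smaller than the number of observations in the dataset.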
Application
Here, x is the independent variable and y is the dependent variable. For example, suppose I want to analyze differences in health status by sex in the dataset. Sex is coded as a categorical variable (1 = Male, 2 = Female), and health status is measured in 6 categories. Because both variables are categorical, the chi-square test is appropriate.
tab yvar xvar, chi col
If you add the column option (abbreviated col) after the chi option, column percentages are also reported.
According to the results, the distribution of nativity differs significantly across the sub-ethnic groups at the 0.001 significance level (p < .001).
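If you want the exact test statistic and p-value rather than reading them off the output, tabulate leaves them behind as stored results. The sketch below again uses the built-in auto dataset as a stand-in, so the numbers will not match the nativity example above.

```stata
* Stand-in data; replace rep78 and foreign with your own yvar and xvar
sysuse auto, clear

* Chi-square test with column percentages
tabulate rep78 foreign, chi2 column

* Show everything tabulate stored, then print the key results
return list
display "chi2 = " %6.3f r(chi2) "   p = " %6.4f r(p)

* Degrees of freedom: (rows - 1) * (columns - 1)
display "df = " (r(r) - 1) * (r(c) - 1)
```

Stored results are overwritten by the next r-class command, so copy them into scalars or locals if you need them later in a do-file.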
Post-hoc analysis
You may want to explore which specific cells account for the significant difference. This is called post-hoc analysis. Since Stata does not provide this as a built-in option, you must use a user-written package called tab_chi (developed by Nicholas Cox). Its tabchi command performs the same test that is available in SPSS (see this video for SPSS), based on the adjusted residuals of the contingency table. Please note that post-hoc analysis following a chi-square test is not widely recommended and remains debated (see this discussion post).
ssc install tab_chi
tabchi yvar xvar, adj noo noe
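Afterward, post-hoc analysis for the chi-square test can be performed using the above syntax. Here, the adj option is an abbreviation for adjusted, which means “adjusted residuals, Pearson residuals divided by an estimate of their standard error.” In other words, each cell's difference between the observed and expected count is standardized using the row and column totals, so that under the null hypothesis of independence the adjusted residual approximately follows a standard normal distribution. The noo and noe options suppress the observed and expected frequencies, so that only the adjusted residuals are displayed.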
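To make the definition concrete, the adjusted residual of a cell can be written as (O - E) / sqrt(E * (1 - row total / n) * (1 - column total / n)), where O and E are the observed and expected counts and n is the table total. The sketch below runs tabchi and then recomputes the same quantity by hand in Mata, so that the two outputs can be compared. It is an illustration under assumptions, not the package's internal code: it uses the built-in auto dataset as stand-in data, and the names O, E and adj are chosen here for illustration.

```stata
* Install tab_chi from SSC only if tabchi is not already available
capture which tabchi
if _rc ssc install tab_chi

* Stand-in data; replace rep78 and foreign with your own yvar and xvar
sysuse auto, clear

* Adjusted residuals from tabchi
tabchi rep78 foreign, adj noo noe

* Same quantity by hand: save the observed counts in matrix O
tabulate rep78 foreign, matcell(O)

mata:
O      = st_matrix("O")          // observed counts (rows x columns)
n      = sum(O)                  // table total
rowtot = rowsum(O)               // row totals (column vector)
coltot = colsum(O)               // column totals (row vector)
E      = rowtot * coltot / n     // expected counts under independence
// Pearson residual divided by its estimated standard error
adj    = (O - E) :/ sqrt(E :* ((1 :- rowtot / n) * (1 :- coltot / n)))
adj
end
```

With a small table like this one, several expected counts fall below 5, so the residuals here are for illustration only.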
As p-values are not provided here, you need to judge significance yourself by comparing each adjusted residual with the standard normal (Z) distribution, using two-tailed tests.
You can interpret the significance level through the Z-score distribution table above.
- If the absolute value of the test statistic is 1.645 or more, it is significant at the p<0.1 level.
- If the absolute value of the test statistic is 1.96 or more, it is significant at the p<0.05 level.
- If the absolute value of the test statistic is 2.576 or more, it is significant at the p<0.01 level.
- If the absolute value of the test statistic is 3.291 or more, it is significant at the p<0.001 level.
Please see this post if you are unfamiliar with the concepts of z-score and p-value: https://pro.arcgis.com/en/pro-app/latest/tool-reference/spatial-statistics/what-is-a-z-score-what-is-a-p-value.htm
Also, you can get the exact p-value with this calculator, selecting the two-tailed hypothesis: https://www.socscistatistics.com/pvalues/normaldistribution.aspx
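The same two-tailed p-value can also be computed directly in Stata with the built-in normal() function, and the cutoffs listed above can be recovered with invnormal(). This is a small sketch; the value 2.576 stored in z is only an example input, to be replaced with an adjusted residual from your own table.

```stata
* Example adjusted residual (hypothetical value; use one from your own output)
scalar z = 2.576

* Two-tailed p-value from the standard normal distribution
display "two-tailed p = " %6.4f 2 * normal(-abs(z))

* Critical values for common two-tailed significance levels
display "p < 0.10  cutoff: " %5.3f invnormal(1 - 0.10 / 2)
display "p < 0.05  cutoff: " %5.3f invnormal(1 - 0.05 / 2)
display "p < 0.01  cutoff: " %5.3f invnormal(1 - 0.01 / 2)
display "p < 0.001 cutoff: " %5.3f invnormal(1 - 0.001 / 2)
```

Because normal() returns the cumulative probability below its argument, doubling the lower tail at -|z| gives the two-tailed p-value.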
In the result table, the absolute value of the adjusted residual in the East Asian cells is 2.576 or more, and in the South Asian cells it is 1.96 or more, so these cells are significant at the p<0.01 and p<0.05 levels, respectively. The adjusted residuals can be reported as shown in the table above. Significance means “they are more extreme than what would be expected if the null hypothesis of independence was true.”
Please refer to the two papers below for more details on how to report adjusted residuals in tables and how to interpret them.
To examine the relationship between the provider-trained group and scoring on the knowledge items, chi-square tests were conducted to test for independence of the accuracy of responses to knowledge questions and whether a respondent had received training or counseling in FABM from a provider.
Chi-square tests were then used to test for independence of the groupings and how users rated attraction and functionality and ranked types of evidence. The adjusted residuals were calculated to determine where the largest differences between observed and expected counts arose, while accounting for sample size of each of the three respondent groups. Statistical analyses were conducted in Stata v. 15 (StataCorp LLC; College Station, TX).
Starling, M. S., Kandel, Z., Haile, L., & Simmons, R. G. (2018). User profile and preferences in fertility apps for preventing pregnancy: an exploratory pilot study. Mhealth, 4.