The Chi-Squared Test of Independence
The chi-squared test of independence also uses the chi-squared statistic and the chi-squared distribution, but it is used to test whether the frequency of an outcome differs among two or more groups. The outcome is categorical (2 or more levels) or ordinal. Therefore, there can be multiple rows or columns in our contingency table, and the degrees of freedom are
df = (r-1)*(c-1)
where r = the number of rows in the contingency table, and c = the number of columns.
For example, in the following contingency table, df = (r-1)*(c-1) = (3-1)*(3-1) = 4:

|                 | Good | Fair | Poor |
|-----------------|------|------|------|
| High Exposure   |      |      |      |
| Medium Exposure |      |      |      |
| Low Exposure    |      |      |      |
There are 3 exposure categories and 3 outcome categories, so df = (3-1) * (3-1) = 2*2 = 4.
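The same calculation can be sketched in R. This is only an illustration: the matrix `tab` and its zero counts are hypothetical placeholders, since only the dimensions of the table matter for the degrees of freedom.

```r
# Hypothetical 3 x 3 layout matching the table above; the zero counts are
# placeholders because only the table dimensions enter the df formula.
tab <- matrix(0, nrow = 3, ncol = 3,
              dimnames = list(
                Exposure = c("High", "Medium", "Low"),
                Outcome  = c("Good", "Fair", "Poor")))

# df = (r - 1) * (c - 1)
df <- (nrow(tab) - 1) * (ncol(tab) - 1)
df
```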
The research question can be phrased as either:
 Is there a difference in outcome between two or more groups?
 Is there an association between two variables?
Therefore,
 H_{0}: The distribution of the outcome is independent of the groups
 H_{1}: H_{0} is false
Example 1:
Investigators wanted to study factors related to whether an HIV-positive individual would disclose the fact that they were HIV+ to their sexual partners.
[Stein MD, Freedberg KA, Sullivan LM, Savetsky J, Levenson SM, Hingson R, Samet JH. Sexual ethics. Disclosure of HIV-positive status to partners. Arch Intern Med. 1998 Feb 9;158(3):253-7.]
The abstract stated:
"We interviewed 203 consecutive patients presenting for primary care for HIV at 2 urban hospitals. One hundred twenty-seven reported having sexual partners during the previous 6 months. The primary outcome of interest was whether patients had told all the sexual partners they had been with over the past 6 months that they were HIV positive."
The investigators sought to determine whether the frequency of disclosure varied with the potential mode of HIV transmission risk; their findings are shown in Table 1 below.
Table 1: Observed Data
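| HIV Transmission Risk | Disclosed  | Not Disclosed | Total |
|-----------------------|------------|---------------|-------|
| Injection Drug Use    | 35 (67%)   | 17            | 52    |
| Homosexual contact    | 13 (52%)   | 12            | 25    |
| Heterosexual contact  | 29 (58%)   | 21            | 50    |
| Total                 | 77 (60.6%) | 50            | 127   |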
Note that a total of 77 individuals out of 127 reported disclosure, and the other 50 did not. Therefore, the overall frequency of disclosure was 77/127 = 60.6%. If there were no differences among the three groups, one would expect the frequency of disclosure to be 60.6% in each of the three groups. We can then calculate the expected number of disclosures in each of the three risk categories by multiplying the number of subjects in each category by 0.606. For example, there were 52 injection drug users, so the expected number of disclosures would be 52 x 0.606 = 31.5. We can compute the expected number of non-disclosures in this category by simply subtracting 31.5 from 52, so the expected number of non-disclosures for injection drug use is 52 - 31.5 = 20.5. If we repeat this procedure for the other two risk categories, we can create the table of frequencies that would be expected if the null hypothesis were true, as shown in Table 2 below.
Table 2: Expected Under the Null Hypothesis
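| HIV Transmission Risk | Disclosed    | Not Disclosed    | Total |
|-----------------------|--------------|------------------|-------|
| Injection Drug Use    | 31.5 (60.6%) | 52 - 31.5 = 20.5 | 52    |
| Homosexual contact    | 15.2 (60.6%) | 25 - 15.2 = 9.8  | 25    |
| Heterosexual contact  | 30.3 (60.6%) | 50 - 30.3 = 19.7 | 50    |
| Total                 | 77 (60.6%)   | 50               | 127   |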
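The expected counts in Table 2 can also be reproduced in R. The sketch below is one possible way to do it, not the method used in the original study; the object names `observed` and `expected` are chosen here for illustration. It uses the equivalent rule that each expected count is the row total times the column total divided by the grand total.

```r
# Observed counts from Table 1 (rows: risk category; columns: disclosure)
observed <- matrix(c(35, 17,
                     13, 12,
                     29, 21),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(
                     Risk = c("Injection drug use",
                              "Homosexual contact",
                              "Heterosexual contact"),
                     Disclosure = c("Disclosed", "Not disclosed")))

# Expected count under the null = row total * column total / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Rounded to one decimal place for comparison with Table 2
round(expected, 1)
```

Multiplying each row total by the overall proportion 77/127, as done in the text, gives the same "Disclosed" column.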
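Now we can compute the chi-squared statistic using the formula
χ^{2} = Σ (O - E)^{2} / E
where O is the observed count and E is the expected count in each cell, and the sum is taken over all cells of the table.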
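As a hedged illustration, the statistic can be computed by hand in R by summing (observed - expected)^2 / expected over all six cells. The object names are assumptions made for this sketch. Because R keeps the expected counts unrounded, the result is about 1.90, slightly below the 1.95 quoted in the text, which presumably reflects rounding in the hand calculation; the conclusion of the test is the same either way.

```r
# Observed counts from Table 1
observed <- matrix(c(35, 17,
                     13, 12,
                     29, 21),
                   nrow = 3, byrow = TRUE)

# Expected counts under the null hypothesis (unrounded)
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Chi-squared statistic: sum over all cells of (O - E)^2 / E
chi_sq <- sum((observed - expected)^2 / expected)
chi_sq
```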
Next, we need to compute the degrees of freedom, which is
df = (r-1)*(c-1)
where r = the number of category rows and c = the number of category columns. In this case:
df = (3-1)*(2-1) = 2
We can see from the chi-squared table that the critical value of χ^{2} with 2 degrees of freedom and α=0.05 is 5.99, but our computed χ^{2} is only 1.95, so we fail to reject the null hypothesis: there is insufficient evidence to conclude that the frequency of disclosure varies among these three risk categories.
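Instead of looking up a printed table, the critical value can be obtained from R's qchisq() quantile function. The sketch below uses the statistic reported in the text; the variable names are illustrative only.

```r
alpha <- 0.05   # significance level
df    <- 2      # degrees of freedom: (3 - 1) * (2 - 1)

# Critical value: the point with probability alpha in the upper tail
critical_value <- qchisq(1 - alpha, df)
critical_value

# Statistic reported in the text
chi_sq <- 1.95

# TRUE would mean "reject the null hypothesis"
reject_null <- chi_sq > critical_value
reject_null
```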
However, we can get a better idea of the actual p-value by using the 1-pchisq() command in R and providing the chi-squared statistic and the degrees of freedom in parentheses.
> 1-pchisq(1.95,2)
[1] 0.3771924
Therefore, the p-value is 0.38.
Note that we use 1-pchisq() because pchisq() returns the lower-tail (cumulative) probability by default, and we want the probability in the upper tail of the chi-squared distribution.
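Two alternatives may be worth knowing, sketched here under the assumption that the counts from Table 1 are entered as a matrix named `observed` (a name chosen for this example). First, pchisq() accepts lower.tail = FALSE, which returns the upper-tail probability directly. Second, chisq.test() runs the whole test from the observed counts. Because chisq.test() works with unrounded expected counts, its statistic (about 1.90) and p-value (about 0.39) differ slightly from the hand-calculated values above, but the conclusion does not change.

```r
# Upper-tail probability directly, equivalent to 1 - pchisq(1.95, 2)
p_upper <- pchisq(1.95, df = 2, lower.tail = FALSE)
p_upper

# Full test from the observed counts in Table 1
observed <- matrix(c(35, 17,
                     13, 12,
                     29, 21),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(
                     Risk = c("Injection drug use",
                              "Homosexual contact",
                              "Heterosexual contact"),
                     Disclosure = c("Disclosed", "Not disclosed")))

# The continuity correction in chisq.test() applies only to 2 x 2 tables,
# so it has no effect on this 3 x 2 table.
result <- chisq.test(observed)
result$statistic   # chi-squared statistic
result$parameter   # degrees of freedom
result$p.value     # p-value
result$expected    # expected counts under the null hypothesis
```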