The t-test is used to compare the means of two samples and to test whether it is likely that the samples come from populations with different mean values.
When two samples are taken from the same population it is very unlikely that the means of the two samples will be identical. When two samples are taken from two populations with very different mean values, it is likely that the means of the two samples will differ. Our problem is how to differentiate between these two situations using only the data from the two samples.
Worked example
A study of the effect of caffeine on muscle metabolism used eighteen male volunteers who each underwent arm exercise tests. Nine of the men were randomly selected to take a capsule containing pure caffeine one hour before the test. The other nine men received a placebo capsule. During each exercise the subject's respiratory exchange ratio (RER) was measured. (RER is the ratio of CO_{2} produced to O_{2} consumed and is an indicator of whether energy is being obtained from carbohydrates or fats.)
The question of interest to the experimenter was whether, on average, caffeine changes RER.
The two populations being compared are “men who have not taken caffeine” and “men who have taken caffeine”. If caffeine has no effect on RER the two sets of data can be regarded as having come from the same population.
The results were as follows:

RER (%)

          Placebo    Caffeine
            105         96
            119         99
            100         94
             97         89
             96         96
            101         93
             94         88
             95        105
             98         88

Mean      100.56      94.22
SD          7.70       5.61
The means show that, on average, caffeine appears to have altered RER from about 100.6% to 94.2%, a change of 6.4%. However, there is a great deal of variation between the data values in both samples and considerable overlap between them. So is the difference between the two means simply due to sampling variation, or do the data provide evidence that caffeine does, on average, reduce RER? The p-value obtained from an independent samples t-test answers this question.
The t-test tests the null hypothesis that the mean of the caffeine treatment equals the mean of the placebo treatment, versus the alternative hypothesis that the mean of the caffeine treatment is not equal to the mean of the placebo treatment.
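As a rough check on what the test computes, the pooled-variance t statistic can be worked out directly from the table of results. The following is a minimal sketch in Python using only the standard library; the variable names are illustrative and are not taken from Minitab or SPSS.

```python
import math
import statistics

# RER (%) values from the results table
placebo = [105, 119, 100, 97, 96, 101, 94, 95, 98]
caffeine = [96, 99, 94, 89, 96, 93, 88, 105, 88]

n_caf, n_pla = len(caffeine), len(placebo)

# Difference between the sample means (caffeine minus placebo)
mean_diff = statistics.mean(caffeine) - statistics.mean(placebo)

# Pooled variance: the two sample variances weighted by their degrees of freedom
pooled_var = (
    (n_caf - 1) * statistics.variance(caffeine)
    + (n_pla - 1) * statistics.variance(placebo)
) / (n_caf + n_pla - 2)
pooled_sd = math.sqrt(pooled_var)

# Standard error of the difference between the two means
se_diff = pooled_sd * math.sqrt(1 / n_caf + 1 / n_pla)

# t statistic and its degrees of freedom
t_stat = mean_diff / se_diff
df = n_caf + n_pla - 2

print(f"pooled SD = {pooled_sd:.2f}, SE = {se_diff:.2f}, t = {t_stat:.2f}, df = {df}")
```

For these data the sketch gives a pooled standard deviation of about 6.74 and a t statistic of about -1.99 on 16 degrees of freedom.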
Computer output obtained for the RER data gives the sample means and the 95% confidence interval for the difference between the means.
Computer output
The Independent Samples t-test in Minitab
Enter the data from both samples into one column and the group identity in a second column, then select
Stat > Basic Statistics > 2-Sample t... to perform an independent samples t-test in Minitab.
Two-Sample T-Test and Confidence Interval
Two-sample T for Caffeine vs Placebo

            N     Mean    StDev   SE Mean
Caffeine    9    94.22     5.61       1.9
Placebo     9   100.56     7.70       2.6

95% CI for mu Caffeine - mu Placebo: (-13.1, 0.4)
T-Test mu Caffeine = mu Placebo (not =): T = -1.99  P = 0.063  DF = 16
Both use Pooled StDev = 6.74
N.B. mu = μ = mean
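The figures in this output can be approximately reproduced from the summary statistics alone. The sketch below is an illustration in Python, not Minitab's own procedure; it assumes the standard tabulated critical value of 2.120 for a two-sided 95% interval with 16 degrees of freedom, and the variable names are made up for the example.

```python
import math

# Summary statistics as reported in the output (N, Mean, StDev)
n_caf, mean_caf, sd_caf = 9, 94.22, 5.61
n_pla, mean_pla, sd_pla = 9, 100.56, 7.70

# Standard error of each mean: SD divided by the square root of the sample size
se_caf = sd_caf / math.sqrt(n_caf)
se_pla = sd_pla / math.sqrt(n_pla)

# Pooled standard deviation across the two groups
pooled_sd = math.sqrt(
    ((n_caf - 1) * sd_caf**2 + (n_pla - 1) * sd_pla**2) / (n_caf + n_pla - 2)
)

# Standard error of the difference, then the 95% confidence interval
se_diff = pooled_sd * math.sqrt(1 / n_caf + 1 / n_pla)
T_CRIT_16 = 2.120  # assumed: t-table value, two-sided 95%, 16 degrees of freedom
diff = mean_caf - mean_pla
lower = diff - T_CRIT_16 * se_diff
upper = diff + T_CRIT_16 * se_diff

print(f"SE means: {se_caf:.1f}, {se_pla:.1f}; pooled SD: {pooled_sd:.2f}")
print(f"95% CI for the difference: ({lower:.1f}, {upper:.1f})")
```

Because the interval contains zero, it agrees with a two-tailed p-value above 0.05.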
The Independent Samples t-test in SPSS
Enter the data from both samples into one column and the group identity in a second column, then select
Analyze > Compare Means > Independent Samples T Test ...
T-Test
Note: The difference in signs obtained in the two outputs arises because one calculation considers caffeine - placebo values and the other placebo - caffeine. It makes no difference to the conclusions of the test, i.e. p = 0.063.
Results
The p-value is 0.063 and, therefore, the difference between the two means is not significantly different from zero at the 5% level of significance. There is an estimated change of 6.4% (SE = 3.17%). However, there is insufficient evidence (p = 0.063) to suggest that caffeine changes the mean RER.
Alternative suggestion
It could be argued, however, that the researcher might only be interested in whether 'caffeine reduces RER'. That is, the researcher is looking for a specific direction for the difference between the two population means. This is an example of a one-tailed t-test, as opposed to the two-tailed t-test outlined above.
It is possible to choose a one-tailed test in Minitab.
SPSS only performs a two-tailed test (the non-directional alternative hypothesis). To obtain the p-value for the directional alternative hypothesis (one-tailed test), the p-value should be halved, provided the observed difference lies in the hypothesised direction. Hence, in this example, p = 0.032.
A suitable null hypothesis in both cases is H_{0}: On average, caffeine has no effect on RER,
with an alternative (or experimental) hypothesis,
H_{1}: On average, caffeine changes RER (two-tailed test), or
H_{1}: On average, caffeine reduces RER (one-tailed test).
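The two alternative hypotheses lead to two different p-values from the same t statistic. The sketch below illustrates this in Python with the standard library only: it integrates the Student's t density numerically (Simpson's rule) rather than calling a statistics package, so the helper functions `t_pdf` and `upper_tail` are illustrative assumptions, not part of Minitab or SPSS.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """P(T > |t|): 0.5 minus the Simpson's-rule integral of the density on [0, |t|]."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# RER (%) values from the results table
placebo = [105, 119, 100, 97, 96, 101, 94, 95, 98]
caffeine = [96, 99, 94, 89, 96, 93, 88, 105, 88]

# Pooled-variance t statistic (equal group sizes, so the pooled variance
# is the simple average of the two sample variances)
n = len(caffeine)
pooled_var = (statistics.variance(caffeine) + statistics.variance(placebo)) / 2
se_diff = math.sqrt(pooled_var * 2 / n)
t_stat = (statistics.mean(caffeine) - statistics.mean(placebo)) / se_diff
df = 2 * n - 2

# Two-tailed: H1 "caffeine changes RER"; both tails count as evidence
p_two = 2 * upper_tail(t_stat, df)

# One-tailed: H1 "caffeine reduces RER"; halving is valid here only because
# the observed difference (caffeine minus placebo) is negative, as hypothesised
p_one = upper_tail(t_stat, df)

print(f"t = {t_stat:.2f}, two-tailed p = {p_two:.3f}, one-tailed p = {p_one:.3f}")
```

The one-tailed value is exactly half the two-tailed value, which is why the same data can fall either side of the 5% level depending on the hypothesis chosen in advance.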
Results for the alternative suggestion could be reported as something along the lines:
The mean RER in the caffeine group (94.2 ± 1.9) was significantly lower (t = 1.99, 16 df, one-tailed t-test, p = 0.032) than the mean of the placebo group (100.6 ± 2.6).
The number after a mean value and the ± sign is the standard error of the mean.
Note: It is important to decide whether a one- or two-tailed test is to be carried out before analysis takes place.
Otherwise it might be tempting to see what the p-value is before making your decision!
Assumptions underlying the independent samples t-test
Both the paired and independent samples t-tests make assumptions about the data, although both tests are fairly robust against departures from these assumptions.
For the independent samples t-test it is assumed that both samples come from normally distributed populations with equal standard deviations (or variances), although some statistical packages (e.g. Minitab and SPSS) allow you to relax the assumption of equal population variances and perform a t-test that does not rely on this assumption. Statistical tests are available to assess whether the two sample variances are significantly different, but a simple rule of thumb is to check whether one standard deviation is more than twice the size of the other. If it is, use the 'unequal variances' option.
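The rule of thumb can be applied to the caffeine data as a quick illustration. The sketch below, in Python, also shows one common form of the unequal-variances test, Welch's t-test with the Welch-Satterthwaite degrees of freedom; that this is the form behind a given package's 'unequal variances' option is an assumption here, and the variable names are illustrative.

```python
import math

# Summary statistics from the worked example (N, Mean, SD)
n_caf, mean_caf, sd_caf = 9, 94.22, 5.61
n_pla, mean_pla, sd_pla = 9, 100.56, 7.70

# Rule of thumb: is the larger SD more than twice the smaller one?
ratio = max(sd_caf, sd_pla) / min(sd_caf, sd_pla)
use_unequal = ratio > 2

# Welch's t-test: each group contributes its own variance to the standard error
v_caf = sd_caf**2 / n_caf
v_pla = sd_pla**2 / n_pla
se_welch = math.sqrt(v_caf + v_pla)
t_welch = (mean_caf - mean_pla) / se_welch

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch = (v_caf + v_pla) ** 2 / (
    v_caf**2 / (n_caf - 1) + v_pla**2 / (n_pla - 1)
)

print(f"SD ratio = {ratio:.2f}, use unequal-variances test: {use_unequal}")
print(f"Welch t = {t_welch:.2f}, approximate df = {df_welch:.1f}")
```

Here the ratio is well below 2, so the pooled test is reasonable. With equal group sizes the Welch t statistic matches the pooled one (up to rounding of the summary statistics); only the degrees of freedom are reduced.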
If normality cannot be assumed, the Mann-Whitney test is often used, but it is less powerful than the t-test when the normality assumption does hold.
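The Mann-Whitney test is based on a statistic U that counts, over all pairs of observations, how often a value from one sample exceeds a value from the other, with ties counted as one half. The following minimal Python sketch computes U for the caffeine data only; it does not produce a p-value, and the function name `mann_whitney_u` is made up for the illustration.

```python
# RER (%) values from the results table
placebo = [105, 119, 100, 97, 96, 101, 94, 95, 98]
caffeine = [96, 99, 94, 89, 96, 93, 88, 105, 88]

def mann_whitney_u(x, y):
    """Count pairs (a, b) with a from x and b from y where a > b; ties count 0.5."""
    return sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in x
        for b in y
    )

u_caffeine = mann_whitney_u(caffeine, placebo)
u_placebo = mann_whitney_u(placebo, caffeine)

# The two U values always add up to the number of pairs (9 x 9 = 81 here)
print(f"U (caffeine) = {u_caffeine}, U (placebo) = {u_placebo}")
```

A U value far from half the number of pairs (40.5 here) indicates that one group tends to have larger values than the other; a statistics package would turn U into a p-value.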