When to use the z-test vs t-test?

When you know the population standard deviation, use the z-test. When you estimate the standard deviation from the sample, use the t-test.

The t-distribution has heavier tails than the normal distribution (it is leptokurtic). The heavier tails compensate for the extra uncertainty that comes from estimating the standard deviation, because the sample standard deviation is itself a statistic with its own standard deviation.

Usually we don't have the population standard deviation, so we use the t-test.

You should use the t-test!

The t-test is always the correct test when you estimate the standard deviation from the sample. I guess the reason for the confusion is historical. The degrees of freedom equal the sample size minus one. When the sample size is greater than 30, the t-distribution is very similar to the normal distribution, and in the limit of infinite degrees of freedom the t-distribution is the normal distribution. In the past, people used tables to calculate the cumulative probability. A t-table needs a separate set of values for every DF value, hence the z-table is more detailed and more accurate than the t-table.

Z distribution vs t distribution

Figure: the leptokurtic shape of the t-distribution (DF=4) compared to the normal distribution (Z).
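To see how quickly the t-distribution approaches the normal distribution, the two-sided 5% critical values of both can be compared directly. This is a minimal sketch in base R; the vector `df_values` and the degrees of freedom chosen in it are illustrative assumptions, not part of the original simulation.

```r
# Two-sided 5% critical value of the standard normal distribution
z_crit <- qnorm(0.975)

# Illustrative degrees of freedom (hypothetical choice for this sketch)
df_values <- c(4, 10, 30, 100, 1000)

# Matching critical values of the t-distribution
t_crit <- qt(0.975, df = df_values)

# Side-by-side comparison: the gap shrinks as DF grows
comparison <- data.frame(df     = df_values,
                         t_crit = round(t_crit, 3),
                         z_crit = round(z_crit, 3),
                         gap    = round(t_crit - z_crit, 3))
print(comparison)
```

The t critical value is always the larger of the two, which is the heavier tail at work, and the gap becomes small once DF passes about 30.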
Z-test type I error - using sample standard deviation

The following simulation ran over 300,000 samples of a normal population and compared the sample mean to the true mean, using the t-test and the z-test with a significance level of 0.05.

Blue Z - the actual type I error for the z-test when using the sample standard deviation.
Green T - the actual type I error for the t-test.

Figure: T vs Z - type I error chart

Simulation results for the type I error when using the sample standard deviation in the z-test.
Z-test type I error - using population standard deviation

The following simulation ran over 300,000 samples of a normal population and compared the sample mean to the true mean. This time we know the population standard deviation.

Why does the following chart look the same as the t-test vs z-test type I error chart?

The green line in the previous chart shows the type I error for the t-test when it is the correct test (sample S). The red line in the current chart shows the type I error for the z-test when it is the correct test (σ). For any statistical test that is used correctly, we expect the type I error to be around the significance level (α0).

Figure: Z-test type I error chart

    library(BSDA)   # provides z.test()

    reps <- 300000   # number of simulations per sample size
    # population
    sigma1 <- 12     # true SD
    mu1 <- 40        # true mean
    n_vec <- c(4, 5, 6, 7, 8, 10, 12, 15, 17, 20, 25, 30, 35, 40, 45, 50, 60, 80)
    pvt <- numeric(length(n_vec))   # type I error of the t-test per sample size
    pvz <- numeric(length(n_vec))   # type I error of the z-test per sample size
    j <- 1
    for (n1 in n_vec) {             # sample size
      pvalues_t <- numeric(reps)
      pvalues_z <- numeric(reps)
      set.seed(1)
      for (i in 1:reps) {
        x1 <- rnorm(n1, mu1, sigma1)   # take a sample
        s1 <- sd(x1)                   # sample standard deviation
        pvalues_t[i] <- t.test(x1, mu = mu1, alternative = "two.sided")$p.value
        pvalues_z[i] <- z.test(x1, y = NULL, alternative = "two.sided", mu = mu1, sigma.x = s1)$p.value
      }
      # share of false rejections at the 0.05 level
      pvt[j] <- mean(pvalues_t < 0.05)
      pvz[j] <- mean(pvalues_z < 0.05)
      j <- j + 1
    }

Z-test with sample S vs z-test with σ

We used the same code, but instead of the t-test we used the z-test with sigma1:

    pvalues_z[i] <- z.test(x1, y = NULL, alternative = "two.sided", mu = mu1, sigma.x = sigma1)$p.value

Are the observed changes in a mean statistically significant? This is perhaps the major consideration when forming a hypothesis that is meant to support a sound analysis of a situation. Such analyses are excellent candidates for hypothesis testing, in other words significance testing. Various test statistics are used for testing hypotheses, such as the t-test and the z-test, and they are the main subject of this blog. The main topics are covered in the sections that follow.
About Hypothesis Testing

Let's start with a simple situation: you are a company monitoring the daily clicks on your blogs, and you want to analyze whether the outcomes of the current month differ from those of the previous month, for example because of a particular marketing campaign or for some other reason. To check this, hypothesis testing is performed in terms of a null hypothesis and an alternative hypothesis. Hypotheses are predictive statements that can be tested in order to establish a connection between an independent variable and some dependent variables. Here, the research question is converted into a null hypothesis and an alternative hypothesis.
Assume that the average number of clicks on the blogs was 2000 per day before the marketing campaign, and that you believe the population now has a higher average because of the campaign, such that the null hypothesis is H0: μ = 2000 and the alternative hypothesis is H1: μ > 2000. Here the observed mean is greater than 2000, and the expected population mean is 2000. The next step is to compute a test statistic that compares the two means.

What is p-value?

The calculated value of the test statistic is converted into a p-value, which indicates whether the outcome is statistically significant. In brief, the p-value is the probability of obtaining results at least as extreme as those in the sample data if the null hypothesis is true, and it ranges from 0% to 100%. These values are usually written in decimal format, so a p-value of 5% is written as 0.05. Lower p-values are stronger evidence against the null hypothesis, because they indicate that the data would be unlikely to arise by chance alone. For example, a p-value of 0.01 means there is a 1% probability of seeing results this extreme by chance when the null hypothesis is true. By convention, a p-value of 0.05 or less is treated as statistically significant.

Here, the test statistic is a numerical summary of the data that is compared with what would be expected under the null hypothesis. It can take many forms, such as the t-test (usually used when the dataset is small), the z-test (preferred when the dataset is large), the ANOVA test, etc.

Level of significance: the probability of rejecting the null hypothesis when it is true, denoted by α (alpha). In general, alpha is taken as 1%, 5% or 10%.

Confidence level: (1 - α), the probability of not rejecting the null hypothesis when it is true.

For instance, with a level of significance of 0.05, a small p-value (p ≤ 0.05) leads to rejecting the null hypothesis, because it is substantial evidence against the null hypothesis. If the p-value is greater than 0.05, the null hypothesis is not rejected, because the evidence for the alternative hypothesis is weak.

Significance of p-value

The p-value is only one piece of information about whether the data are consistent with the null hypothesis. The decision to retain or reject the null hypothesis follows the rule above: reject when p ≤ α, otherwise do not reject.
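The decision rule can be sketched for the blog-clicks example in base R. Only the pre-campaign mean of 2000 comes from the text; the known standard deviation `sigma`, the number of days `n` and the observed mean `x_bar` are hypothetical values chosen for illustration.

```r
# Inputs for the blog-clicks example (all but mu0 are assumptions)
mu0   <- 2000   # mean daily clicks before the campaign (from the text)
sigma <- 250    # assumed known population standard deviation
n     <- 30     # assumed number of days observed
x_bar <- 2100   # assumed observed mean daily clicks after the campaign
alpha <- 0.05   # level of significance

# Test statistic for H0: mu = 2000 against H1: mu > 2000
z <- (x_bar - mu0) / (sigma / sqrt(n))

# Right-tailed p-value: probability of a z at least this large if H0 is true
p_value <- pnorm(z, lower.tail = FALSE)

# Decision rule: reject H0 when the p-value is at most alpha
reject_h0 <- p_value <= alpha
cat("z =", round(z, 3), " p-value =", round(p_value, 4),
    " reject H0:", reject_h0, "\n")
```

With these assumed numbers the p-value falls below 0.05, so the null hypothesis of an unchanged mean would be rejected. Different assumed inputs could easily lead to the opposite decision.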
One-tailed Test

At a level of significance of 0.05, a one-tailed test puts the whole alpha in one direction of interest, which means that alpha = 0.05 lies in one tail of the distribution of the test statistic. A test is one-tailed when the alternative hypothesis is stated in terms of "less than" or "greater than", but not both. The direction must be selected before testing. The test detects an effect in that one direction only, not in the other.

A one-tailed test can take two forms:

Left-tailed test: used when the alternative hypothesis is stated as "less than".

Figure: Left tailed test

Right-tailed test: used when the alternative hypothesis is stated as "greater than".

Figure: Right tailed test

Two tailed Test

At a significance level of 0.05, a two-tailed test uses half of the alpha to test for statistical significance in one direction and half in the other, so that there is a significance level of 0.025 in each tail of the distribution of the test statistic.
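How the tails change the p-value can be sketched in base R. The observed statistic `z = 1.8` is a hypothetical value chosen so that the one-tailed and two-tailed decisions differ.

```r
# Hypothetical observed z-statistic (an assumption for illustration)
z     <- 1.8
alpha <- 0.05

# Right-tailed test: H1 says the parameter is greater than the hypothesized value
p_right <- pnorm(z, lower.tail = FALSE)

# Left-tailed test: H1 says the parameter is less than the hypothesized value
p_left <- pnorm(z)

# Two-tailed test: H1 says the parameter differs in either direction,
# so both tails beyond |z| count toward the p-value
p_two <- 2 * pnorm(abs(z), lower.tail = FALSE)

print(c(right = p_right, left = p_left, two_sided = p_two))
```

With this z the right-tailed test rejects at 0.05 while the two-tailed test does not, which is why the direction has to be chosen before looking at the data.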
Figure: Two tailed test

A two-tailed test is used when the alternative hypothesis is not stated as "greater than" or "less than", but as a difference in values (such as the means of samples), or as the observed value not being equal to the expected value. No specific direction needs to be defined before testing, and a two-tailed test takes into account the chance of both a positive and a negative effect.

What is Z-test?

The z-test is a statistical test used to analyze whether two population means are different when the variances are known and the sample size is large. The test statistic is assumed to follow a normal distribution, and the standard deviation must be known to perform an accurate z-test.

A z-statistic, or z-score, is a number that expresses how far a value lies from the mean of a group of values, in units of standard deviations. It is computed with population parameters such as the population standard deviation and is used to test a hypothesis. For example, the null hypothesis is "the sample mean is the same as the population mean", and the alternative hypothesis is "the sample mean is not the same as the population mean".

One-sample Z-test

The z-statistic is the statistic computed for testing a hypothesis about a single population mean.
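For reference, the standard textbook form of the one-sample z-statistic is given below. The symbols are assumptions introduced here, since the original formula did not survive: $\bar{x}$ is the sample mean, $\mu_0$ the population mean under the null hypothesis, $\sigma$ the known population standard deviation and $n$ the sample size.

```latex
z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}
```

The denominator is the standard error of the mean, so z counts how many standard errors the sample mean lies from the hypothesized mean.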
Two-sample Z-test

The one-sample z-test compares one sample mean with a hypothesized value. A two-sample z-test instead compares the means of two independent samples, so its z-statistic has a different formula.
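The usual textbook form of the two-sample z-statistic is sketched below, with symbols that are assumptions of this sketch: $\bar{x}_1, \bar{x}_2$ are the sample means, $\mu_1, \mu_2$ the population means, $\sigma_1, \sigma_2$ the known population standard deviations and $n_1, n_2$ the sample sizes of two independent samples.

```latex
z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}
         {\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}
```

Under a null hypothesis of equal population means, $\mu_1 - \mu_2 = 0$ and the numerator reduces to $\bar{x}_1 - \bar{x}_2$.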
What is T-test?

The t-test is used to find out how significant the difference between two groups is. Basically, it tells whether a difference (measured in means) between two separate groups could have occurred by chance. The test assumes normally distributed data and is based on the t-distribution, with population parameters such as the mean or the standard deviation unknown.

The ratio of the difference between two groups to the difference within the groups is known as the t-score. The greater the t-score, the larger the difference between the groups; the smaller the t-score, the more similar the groups are. For example, a t-score of 2 indicates that the groups are twice as different from each other as they are within each other.

Also, the larger the t-value obtained from a t-test, the more likely it is that the outcome is repeatable.
Mainly, there are three types of t-test:
One sample T-test

The t-statistic is the statistic computed for hypothesis testing when the population standard deviation is unknown and is estimated from the sample.
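The standard textbook form of the one-sample t-statistic is sketched below. The symbols are assumptions of this sketch: $\bar{x}$ is the sample mean, $\mu_0$ the population mean under the null hypothesis, $s$ the sample standard deviation and $n$ the sample size.

```latex
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}},
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2},
\qquad
\mathrm{df} = n - 1
```

The only change from the z-statistic is that the estimated $s$ replaces the known $\sigma$, which is why the statistic is compared against the t-distribution with $n - 1$ degrees of freedom rather than the normal distribution.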
Two-sample T-test
T-test vs Z-test

Choosing which test statistic to use under which conditions can be tricky. The diagram below compares the z-test and the t-test under specific conditions.
Figure: Comparing T-test and Z-test

Because the sample size differs from analysis to analysis, a suitable test can be adopted for any sample size. For example, the z-test is used when the sample size is large, generally n > 30, whereas the t-test is used when the sample size is small, usually n < 30, where n denotes the sample size.

The t-test is a statistical test that analyzes whether the means of two populations are different when the standard deviation is not known. The z-test is a parametric test that determines whether the means of two datasets differ from each other when the standard deviation is known.

The two tests rely on different distributions to draw conclusions in hypothesis testing: the t-test is based on Student's t-distribution, and the z-test on the normal distribution.

The population variance plays a central role in obtaining the t-score and the z-score: it is known in the z-test and unknown in the t-test.

Some major assumptions are made when conducting either a t-test or a z-test. In a t-test, the data are assumed to be normally distributed and the population standard deviation is unknown.
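In the z-test, the test statistic is assumed to follow a normal distribution, the population standard deviation is known, and the sample size is large.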
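The practical difference between the two tests can be sketched on a single small sample in base R. The population values reuse the mean of 40 and standard deviation of 12 from the simulation code earlier; the sample size `n = 10` and the seed are assumptions made for this illustration.

```r
set.seed(1)   # reproducible sample

# Population and sample size (n is an assumption for this sketch)
mu0   <- 40   # hypothesized mean under H0
sigma <- 12   # population SD, treated as known for the z-test
n     <- 10   # small sample
x     <- rnorm(n, mean = mu0, sd = sigma)

# z-test: uses the known population SD and the normal distribution
z   <- (mean(x) - mu0) / (sigma / sqrt(n))
p_z <- 2 * pnorm(abs(z), lower.tail = FALSE)

# t-test: estimates the SD from the sample and uses the t-distribution (df = n - 1)
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))
p_t    <- 2 * pt(abs(t_stat), df = n - 1, lower.tail = FALSE)

# Cross-check the manual t-test against base R's t.test()
p_t_builtin <- t.test(x, mu = mu0, alternative = "two.sided")$p.value

cat("z-test p-value:", round(p_z, 4), "\n")
cat("t-test p-value:", round(p_t, 4), "\n")
```

The two p-values differ because the t-test replaces σ with the sample estimate and pays for that with heavier tails. The manual t-test should agree with `t.test()` up to floating-point error.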
Conclusion

The t-test and the z-test are the main tests for determining whether the difference between a sample and a population is significant. While the formulas are similar, the choice of test depends on the sample size and on whether the standard deviation of the population is known. From the discussion above, we can conclude that the t-test and the z-test are similar but apply in different situations: the t-test is applicable when the sample size is less than 30 units, and the z-test is used in practice when the sample size exceeds 30 units. There are other essential differences as well, which have been covered in the blog. We hope this gave a clear understanding of the differences between the z-test and the t-test.