Cluster sampling: A probability sampling technique
Cluster sampling is a sampling method in which the population is divided into multiple groups, called clusters, and a simple random sample of those clusters is selected. The clusters should resemble one another, each reflecting the characteristics of the population as a whole, and every cluster has an equal chance of being part of the sample. Because the clusters are chosen at random, this is a probability sampling procedure.
Examples
Area sampling: Area sampling is a method of sampling used when no complete sampling frame is available. The total area under investigation is divided into small sub-areas, which are sampled at random or according to a restricted process (such as stratified sampling). Each of the chosen sub-areas is then fully inspected and enumerated, and may form the basis for further sampling if desired.
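The area sampling procedure described above can be sketched in Python. This is only an illustration under stated assumptions: the area is taken to be a rectangle split into square grid cells, and the function names (choose_sub_areas, enumerate_units), the 100 x 100 area, and the simulated household coordinates are all hypothetical. Only the grid is needed up front; units are listed after the sub-areas have been chosen.

```python
import math
import random

def choose_sub_areas(width, height, cell_size, n_cells, seed=None):
    """Split a width x height area into square cells and pick n_cells at random."""
    cols = math.ceil(width / cell_size)
    rows = math.ceil(height / cell_size)
    cells = [(c, r) for c in range(cols) for r in range(rows)]
    rng = random.Random(seed)
    return rng.sample(cells, n_cells)  # sampling without replacement

def enumerate_units(units, chosen_cells, cell_size):
    """Fully enumerate every unit (x, y) that lies inside a chosen cell."""
    chosen = set(chosen_cells)
    return [
        (x, y)
        for x, y in units
        if (int(x // cell_size), int(y // cell_size)) in chosen
    ]

# Hypothetical data: 200 households scattered over a 100 x 100 area.
rng = random.Random(2)
households = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(200)]

cells = choose_sub_areas(100, 100, cell_size=20, n_cells=5, seed=1)
sample = enumerate_units(households, cells, cell_size=20)
```

In a real survey the list of households would not exist beforehand; it stands in here for what field workers would record when they inspect each chosen sub-area.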
Types of cluster sampling
There are three types:
Single-stage Cluster: In this process, sampling is applied only once: whole clusters are selected at random, and every unit in the selected clusters is included. For example, an NGO wants to create a sample of girls across five neighboring towns to provide education. Using single-stage sampling, the NGO randomly selects towns (clusters) to form a sample and extends help to the girls deprived of education in those towns.
Two-stage Cluster: In this process, clusters are chosen first, and then a sample is drawn from within each chosen cluster using simple random sampling or another procedure. For example, a business owner wants to explore the performance of his or her plants, which are spread across various parts of the U.S. The owner groups the plants into clusters and then selects random samples from these clusters to conduct the research.
Multistage Cluster: When further sampling steps are added to the two-stage design, it is called multistage cluster sampling. For example, an organization intends to run a survey to analyze the performance of smartphones across Germany. It can divide the entire country's population into cities (clusters), select the cities with the highest population, and then narrow the sample further to the people who use mobile devices.
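The difference between the single-stage and two-stage designs can be shown with a short Python sketch. It is a minimal illustration, not a survey-grade implementation: the function names, the five towns, and the ten residents per town are hypothetical, and simple random sampling is assumed at the second stage.

```python
import random

def single_stage_cluster_sample(clusters, n_clusters, seed=None):
    """Pick whole clusters at random and keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Pick clusters at random, then draw a simple random sample within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(
            list(clusters[name]), min(n_per_cluster, len(clusters[name]))
        )
        for name in chosen
    }

# Hypothetical population: five towns (clusters) with ten residents each.
towns = {
    f"town_{i}": [f"town_{i}_person_{j}" for j in range(10)]
    for i in range(1, 6)
}

single = single_stage_cluster_sample(towns, n_clusters=2, seed=0)
two = two_stage_cluster_sample(towns, n_clusters=2, n_per_cluster=3, seed=0)
```

Here single holds all ten residents of each of the two selected towns, while two holds only three randomly drawn residents per selected town. A multistage design would repeat the second step at further levels, for example districts within towns and then households within districts.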
Advantages
· Requires less time and cost
· Convenient access
· Little loss in the accuracy of the data
· Ease of implementation