GSU Library Research Guides: SAS: Frequency Tables

What is a Frequency Distribution?

A frequency table shows how the observations in a dataset are distributed across the options (values) of a variable. It makes it easy to see which options occur more or less often, which helps you understand each variable and decide whether it needs to be recoded. There is no formula for a frequency table because it simply reports the count of each option in a variable.

Below is the code for the frequency distribution for the variable language in the SLID dataset.

Code

PROC FREQ DATA = SLID;
TABLES language;
RUN;

In the code above, we run PROC (procedure) FREQ (frequency) on the dataset named in the DATA= option, SLID. In the TABLES statement we specify the variable language. The RUN statement ends the step and tells SAS to execute it.
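To tabulate more than one variable in the same step, several names can be listed in the TABLES statement. The sketch below assumes the dataset also contains a variable named sex. That name is an assumption for illustration, so substitute any variable in your copy of SLID.

```sas
/* One frequency table is produced for each variable listed in TABLES. */
/* Assumption: SLID contains a variable named sex; replace it as needed. */
PROC FREQ DATA = SLID;
TABLES language sex;
RUN;
```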

Output

The output table shows us the frequency distribution of the variable language.

Each row in the table is an option that respondents could have selected during data collection. We can see the options are 1, 2, and 3, which correspond to the labels English, French, and Other. By default, SAS presents options in order of their coded values, from the smallest number to the greatest, so we know that English is coded with a numerical value of 1, French is coded as 2, and Other is 3, which is helpful when recoding.
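If it would be more useful to see the most common option first, the ORDER= option on the PROC FREQ statement changes the row order. A minimal sketch, assuming the same SLID dataset:

```sas
/* ORDER=FREQ sorts the rows by descending frequency instead of by coded value. */
PROC FREQ DATA = SLID ORDER = FREQ;
TABLES language;
RUN;
```

The row order then no longer reflects the underlying codes, so keep the default order when checking how options are coded.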

We can see in the Frequency column that most observations, 5,716, selected English as their language. The least commonly reported option is French, with 497 observations, while 1,091 observations selected Other.

In the next column to the right, Percent, SAS shows the percentage of observations that selected each option, calculated from the non-missing observations only. For example, 78.26% of observations selected English.
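Because the Percent column leaves out missing values by default, it can help to run the table a second time with missing values counted as their own option. The sketch below uses the MISSING option of the TABLES statement. Whether language has any missing values in your copy of SLID is not stated here, so the output may be identical to the original table.

```sas
/* MISSING treats missing values as a valid option, so they appear as a row */
/* and are included in the percentages and cumulative statistics. */
PROC FREQ DATA = SLID;
TABLES language / MISSING;
RUN;
```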

The next column to the right, Cumulative Frequency, is the count of each option plus the option(s) above it. For example, 6,213 observations in the dataset (5,716 + 497) speak either English or French.

The column furthest to the right, Cumulative Percent, is the percentage of each option plus the option(s) above it. Cumulative Percent can be used to determine cutoffs such as quartiles. For example, 85.06% of observations in the dataset speak either English or French.
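To check the cumulative columns by hand, the three counts reported in the Frequency column can be entered as a small dataset and passed back to PROC FREQ with a WEIGHT statement. This is only a sketch: the dataset name language_counts and the variable name count are made up for illustration.

```sas
/* Hypothetical summary dataset holding the counts from the Frequency column. */
DATA language_counts;
INPUT language $ count;
DATALINES;
English 5716
French 497
Other 1091
;
RUN;

/* WEIGHT tells PROC FREQ that each row stands for that many observations. */
/* ORDER=DATA keeps the rows in the order they were entered. */
PROC FREQ DATA = language_counts ORDER = DATA;
TABLES language;
WEIGHT count;
RUN;
```

The French row should show a cumulative frequency of 5,716 + 497 = 6,213 and a cumulative percent of 6,213 / 7,304 = 85.06%, where 7,304 is the sum of the three counts.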