Having a good understanding of the different data types, also called measurement scales, is a crucial prerequisite for doing Exploratory Data Analysis (EDA), since you can use certain statistical measurements only for specific data types. You also need to know which data type you are dealing with to choose the right visualization method. Think of data types as a way to categorize different types of variables. We will discuss the main types of variables and look at an example for each. We will sometimes refer to them as measurement scales.

Categorical Data

Categorical data represents characteristics. It can therefore represent things like a person’s gender, language etc. Categorical data can also take on numerical values (Example: 1 for female and 0 for male). Note that those numbers don’t have mathematical meaning.

Nominal values represent discrete units and are used to label variables that have no quantitative value. Note that nominal data has no order. Therefore, if you changed the order of its values, the meaning would not change.

Ordinal values, by contrast, have a meaningful order. Take an educational-background feature with the values Elementary, High School and College: the difference between Elementary and High School is not the same as the difference between High School and College. This is the main limitation of ordinal data: the differences between the values are not really known. Because of that, ordinal scales are usually used to measure non-numeric features like happiness, customer satisfaction and so on.

Numerical Data (Discrete, Continuous, Interval, Ratio)

1. Discrete Data

We speak of discrete data if its values are distinct and separate. In other words: we speak of discrete data if the data can only take on certain values. This type of data can’t be measured but it can be counted. It basically represents information that can be categorized into a classification. An example is the number of heads in 100 coin flips. You can check whether you are dealing with discrete data by asking two questions: can you count it, and can it be divided up into smaller and smaller parts? If you can count it but cannot meaningfully divide it further, the data is discrete.

2. Continuous Data

Continuous data represents measurements, so its values can’t be counted but they can be measured. An example would be the height of a person, which you can describe by using intervals on the real number line.

Interval values represent ordered units that have the same difference. Therefore we speak of interval data when we have a variable that contains numeric values that are ordered and where we know the exact differences between the values. An example would be a feature that contains the temperature of a given place.

Data types are an important concept because statistical methods can only be used with certain data types. You have to analyze continuous data differently than categorical data, otherwise the analysis would be wrong. Therefore, knowing the types of data you are dealing with enables you to choose the correct method of analysis.

We will now go over every data type again, but this time with regard to which statistical methods can be applied. To properly understand what we will now discuss, you have to understand the basics of descriptive statistics. If you don’t know them, you can read my blog post (9min read) about it.

When you are dealing with nominal data, you collect information through:

Frequencies: The frequency is the rate at which something occurs over a period of time or within a dataset.

Proportion: You can easily calculate the proportion by dividing the frequency by the total number of events.
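To make the frequency and proportion calculations for nominal data concrete, here is a minimal sketch in Python. The `languages` list is made-up illustration data, not taken from any real dataset, and the variable names are just assumptions for this example. It counts how often each category occurs and then divides each count by the total number of observations.

```python
from collections import Counter

# Hypothetical nominal feature: the language reported by each respondent.
languages = ["English", "German", "English", "French", "English", "German"]

# Frequency: how often each category occurs in the dataset.
frequencies = Counter(languages)

# Proportion: frequency divided by the total number of observations.
total = len(languages)
proportions = {category: count / total for category, count in frequencies.items()}

# Print categories from most to least frequent.
for category, count in frequencies.most_common():
    print(f"{category}: frequency={count}, proportion={proportions[category]:.2f}")
```

Because the categories have no order, sorting by frequency here is only a display choice. The proportions always sum to 1, which is a quick sanity check for this kind of summary.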