The Two Types of Structured Data: Numeric and Categorical
In an effort to spread the love on data science, I’m going to try to tackle some of the most common concepts and keywords and re-explain them in my own words, without all the boring jargon. Join me.
When you’re starting out with data analysis or data science, you’ll find that data comes in two broad types: structured and unstructured. Within structured data there are two more types: numeric and categorical.
Numeric data
Numeric data is, you guessed it, numbers. These can be continuous or discrete. Think of continuous numeric data as a number that can take any value within a range, which is why it can be visualized as a point on a line. Continuous numeric data is often stored as a decimal, or a float, and how many decimal places you see depends on how precisely you measure. Take, for example, time. Timing something with a stopwatch may give you a measure like 2.0 minutes, but a more advanced timer may tell you it was in fact 1.98768 minutes.
A discrete number is usually an integer and is more like a count. You have 2 apples in the basket or you have 3; there is nothing in between.
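Here is a tiny Python sketch of that difference. The variable names are made up for illustration; the only point is that a measurement usually lands in a float while a count lands in an int.

```python
# Illustrative values only: the time echoes the stopwatch example above.
elapsed_minutes = 1.98768   # continuous: a float, as precise as the timer allows
apples_in_basket = 2        # discrete: an int, a whole-number count

# A less precise stopwatch reports the same duration with fewer decimals.
coarse_reading = round(elapsed_minutes, 1)

print(type(elapsed_minutes).__name__, elapsed_minutes, coarse_reading)
print(type(apples_in_basket).__name__, apples_in_basket)
```

Rounding the reading does not change what was measured, only how finely it is reported. The apple count has no finer version to report.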
Categorical data
On to categorical data. As the name suggests, this data puts information into categories and is often text based. For example, if you had data about trees, you may also have data that tells you the species of each tree. This is commonly referred to as nominal data: the categories are just names, with no built-in order.
Categorical data can sometimes be numbers, if the numbers stand for a group, such as 0 for people who answered no in a survey and 1 for people who answered yes. When there are only two possible categories like this, it is known as binary categorical data.
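To make nominal and binary data a bit more concrete, here is a minimal sketch using only Python's standard library. The species list and survey answers are invented for illustration, and the 0/1 mapping simply follows the no/yes convention described above.

```python
from collections import Counter

# Hypothetical records, invented for illustration.
tree_species = ["oak", "maple", "oak", "birch", "oak"]
survey_answers = ["yes", "no", "yes", "yes"]

# Nominal data: labels with no inherent order, so counting per category
# is the natural summary (an average species would be meaningless).
species_counts = Counter(tree_species)

# Binary categorical data: the numbers are just stand-ins for two groups.
answer_codes = {"no": 0, "yes": 1}
encoded_answers = [answer_codes[answer] for answer in survey_answers]

print(species_counts.most_common())
print(encoded_answers)
```

Even though `encoded_answers` is a list of numbers, each value is still a label for a group rather than a quantity.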
Finally, you could also find ordinal categorical data. This data has a specific order, such as rating foods on a scale from didn’t like to liked, age group buckets, grades and scores, or ability levels. This is probably the most confusing one, as it straddles the line between numeric and categorical. It’s useful to think about how you would plot the data. In this case, you’re likely to group the values into buckets…or categories!
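The age group example can be sketched in a few lines of standard-library Python. The bucket edges, labels, and the `age_bucket` helper are all assumptions chosen for illustration; your own data may call for different cut-offs.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical bucket boundaries and labels, listed in their natural order.
bucket_edges = [18, 35, 65]
bucket_labels = ["under 18", "18-34", "35-64", "65+"]

def age_bucket(age):
    """Return the age-group label that a numeric age falls into."""
    return bucket_labels[bisect_right(bucket_edges, age)]

ages = [12, 18, 34, 35, 70]  # made-up numeric ages
buckets = [age_bucket(age) for age in ages]

# Summarize in category order, which is what makes the data ordinal.
counts = Counter(buckets)
ordered_counts = [(label, counts[label]) for label in bucket_labels]

print(buckets)
print(ordered_counts)
```

The ages start out numeric, but once bucketed they behave like categories that happen to have an order, which is the order you would keep on the axis of a bar chart.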
Knowing what type of data you are working with is the first step in being able to analyze it using the correct tools.
If you enjoyed this and found it helpful (or totally unhelpful) please let me know by leaving a reaction or comment, or find me via LinkedIn.