Two categories of Structured Data: Rectangular and Non-Rectangular
In an effort to spread the love on data science, I’m going to try to tackle some of the most common concepts and keywords and re-explain them in my own words, without all the boring jargon. Join me.
Structured data can typically be categorised as either rectangular or non-rectangular, depending on how it is organised. Traditional “data” often comes in the form of a table or spreadsheet. If you were to draw this, you would probably draw a square or rectangle. Helpfully, this type of data is called rectangular data. It is usually two-dimensional, with rows and columns. In data science and analytics, you may also encounter this as data frames or matrices.
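To make this concrete, here is a minimal Python sketch of rectangular data using only the standard library. The column names and values are made up for illustration, and `is_rectangular` is just a helper name chosen for this example.

```python
# A tiny rectangular dataset: every row has the same columns.
# The column names and values are invented for illustration.
columns = ["name", "age", "city"]
rows = [
    ["Ada", 36, "London"],
    ["Grace", 45, "New York"],
    ["Linus", 28, "Helsinki"],
]

def is_rectangular(rows, n_columns):
    """Return True if every row has exactly n_columns values."""
    return all(len(row) == n_columns for row in rows)

# Shape is (number of rows, number of columns), like a spreadsheet.
shape = (len(rows), len(columns))
print(shape, is_rectangular(rows, len(columns)))
```

If even one row had a missing or extra value, the check would fail, and that is the first hint that your data may not be rectangular after all.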
The less intuitive relative of rectangular data is non-rectangular data. The most obvious examples are graphs, which represent things like roads, transport networks, or social media connections. You may encounter these if you start learning about recommender systems. Another type is JSON, which is common if you are doing web scraping or working with APIs. JSON has more of a hierarchical or nested structure. A final example is data from devices, or the Internet of Things. This data can be complex because it often records location and time alongside many different data types that won’t fit into a nice two-dimensional structure.
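Here is a rough sketch of what two of these look like in Python, again using only the standard library. The social network, the JSON string, and the `depth` helper are all invented for this example, so treat it as an illustration rather than a recipe.

```python
import json

# A graph stored as an adjacency list: each person maps to their connections.
# People have different numbers of connections, so there is no fixed row width.
friends = {
    "ada": ["grace", "linus"],
    "grace": ["ada"],
    "linus": ["ada"],
}

# A nested JSON document, like one an API might return (made-up example).
raw = '{"user": "ada", "orders": [{"id": 1, "items": ["pen", "ink"]}, {"id": 2, "items": ["paper"]}]}'
record = json.loads(raw)

def depth(value):
    """Count how many levels of nesting a parsed JSON value has."""
    if isinstance(value, dict):
        return 1 + max((depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((depth(v) for v in value), default=0)
    return 0

print(len(friends["ada"]), depth(record))
```

Neither structure maps cleanly onto rows and columns: the graph has a varying number of connections per person, and the JSON record has lists inside dictionaries inside lists.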
Understanding and handling both rectangular and non-rectangular data are crucial skills for extracting meaningful insights, since the line between the two can become blurred. You may find your data is organised both ways, and it’s your job to combine the information to find these insights.
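One common way of combining the two is to flatten nested data into rows. The sketch below assumes a made-up record with a `user` and a list of `orders`; the field names and the `flatten_orders` helper are hypothetical, and real data will usually need more care.

```python
# Made-up nested record: one user with a varying number of orders.
record = {
    "user": "ada",
    "orders": [
        {"id": 1, "total": 12.5},
        {"id": 2, "total": 3.0},
    ],
}

def flatten_orders(record):
    """Turn one nested record into flat rows: one row per order."""
    return [
        {"user": record["user"], "order_id": order["id"], "total": order["total"]}
        for order in record["orders"]
    ]

rows = flatten_orders(record)
for row in rows:
    print(row)
```

Each resulting row has the same three fields, so the nested record has become a small rectangular table, at the cost of repeating the user on every row.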
If you enjoyed this and found it helpful (or totally unhelpful) please let me know by leaving a reaction or comment, or find me via LinkedIn.