Module 1: Background You’ll Need 1

  • Describe what data is and what it isn’t

What is Data? 

data

Data is factual information that has been collected and can be used for analysis, calculation, or discussion.

Data often has little meaning when looked at “raw,” that is, before analysis. On its own, it is just a list of numbers or isolated pieces of information.

Data is NOT a news headline. Data is NOT a graph or a value with no context of how it was created and collected.

A person points towards many different data tabs including profiles and graphs.
Figure 1. Data becomes meaningful when it’s collected with purpose and used in context—like tracking how students spend their time.

Imagine you recorded the number of hours of TV watched by your class peers each day this week. That is data!

The hours would not all be the same; they would change from person to person and even from day to day. Depending on how much other information you gathered, you might be able to identify some factors that affected the amount of TV watched by each student each day. (These could include the number of kids they have, the hours they work at a job, their dislike of TV, their favorite type of show, etc.) The data would vary from student to student, and it’s possible that no two students would have spent exactly the same amount of time watching TV.
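To make the classroom example concrete, here is a minimal Python sketch of what such a record might look like and how it varies from student to student. The student names, the hour values, and the `summarize` helper are all made up for illustration; they are not real measurements.

```python
from statistics import mean

# Hypothetical hours of TV watched on each of the 7 days this week.
# All names and values are invented purely for illustration.
tv_hours = {
    "Student A": [1.0, 2.5, 0.0, 3.0, 1.5, 4.0, 2.0],
    "Student B": [0.0, 0.0, 0.5, 0.0, 1.0, 2.0, 0.0],
    "Student C": [2.0, 2.0, 4.0, 1.0, 3.5, 5.0, 3.5],
}


def summarize(hours_by_student):
    """Return each student's weekly total and daily average (hypothetical helper)."""
    return {
        name: {"total": sum(hours), "daily_mean": mean(hours)}
        for name, hours in hours_by_student.items()
    }


summary = summarize(tv_hours)
for name, stats in summary.items():
    print(f"{name}: total = {stats['total']:.1f} h, daily mean = {stats['daily_mean']:.2f} h")
```

The raw lists are the data. The totals and averages are one simple way of starting to describe how the hours differ between students.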

Data are the smallest units of decision-making: the smallest units of factual information that can be used for calculation, reasoning, or discussion. Data can range from abstract ideas to concrete measurements, including, but not limited to, statistics.

The word “data” is plural! This can sound funny to us because outside of academic or scientific settings, the word “data” is usually treated as singular. So when referring to the data in a study, we will say “the data are ___.”

In the pursuit of knowledge, data are a collection of discrete values that convey information, describing quantity, quality, facts, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpreted.

Data are commonly used in scientific research, finance, and in virtually every other form of human organizational activity.

Examples of data sets include stock prices, crime rates related to the number of fast food restaurants in a ZIP code, scientific observations (like daily high temperatures), starting salaries by career field, literacy rates by county, and census data. In this context, data are the raw facts and figures from which useful information can be extracted.

Data are collected using techniques such as measurement, observation, query, or analysis, and are typically represented as numbers or characters, which may be further processed.

Data are analyzed using techniques such as calculation, reasoning, discussion, presentation, visualization, or other forms of post-analysis.
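As a small illustration of analysis by calculation, the sketch below takes a week of daily high temperatures and computes a few summary values. The temperatures and the variable names are assumptions invented for this example, not real observations.

```python
from statistics import mean, median

# Hypothetical daily high temperatures (degrees Fahrenheit) for one week.
# These values are invented for illustration only.
daily_highs_f = [71, 68, 75, 80, 77, 69, 73]

# Calculation turns the raw list into values that are easier to discuss.
temperature_summary = {
    "mean": mean(daily_highs_f),
    "median": median(daily_highs_f),
    "range": max(daily_highs_f) - min(daily_highs_f),
}

for label, value in temperature_summary.items():
    print(f"{label}: {value:.1f}")
```

The list itself is the data. The mean, median, and range are the results of analyzing it.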

You may have heard the term “Big Data.” Big Data usually refers to very large quantities of data compiled through computing technology. Working with such large (and growing) data sets is difficult, and sometimes impossible, with traditional tools. Data Science uses methods that allow for efficient analysis of Big Data.

This article makes a headline statement based on data, then describes the data in words, numbers, and graphs.