Making Sense of it All – Decluttering Data Science


In a technical field like data science, it is common for terminology and technical jargon to be thrown around. Based on my reading and my work as an intern in a Data Intelligence department during my first-year summer break, I have tried to make some sense of it. As always, your feedback and suggestions are welcome.

The five main components of data science are as follows:

  • Data Collection
  • Data Cleaning
  • Data Mining
  • Data Analysis
  • Data Visualization

To clarify what these components mean and do, here is a brief overview of each.

DATA COLLECTION

This step pertains to the collection of raw data, both structured and unstructured. It is also sometimes referred to as data extraction.
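As a rough illustration of collecting structured data, here is a minimal Python sketch that reads a small CSV export into records. The CSV text, the column names, and the agent names are all made up for the example; in practice the raw data would come from a file, a database, or an API.

```python
import csv
import io

# Hypothetical raw export, invented for illustration only.
RAW_CSV = """agent,customers_visited,sales
Asha,12,30
Ben,8,21
Chen,15,41
"""


def collect_rows(raw_text):
    """Parse CSV text into a list of dictionaries, one per record."""
    reader = csv.DictReader(io.StringIO(raw_text))
    return [dict(row) for row in reader]


rows = collect_rows(RAW_CSV)
print(len(rows), "records collected")
print(rows[0])
```

Note that every value arrives as text at this stage. Converting and checking the values belongs to the cleaning step.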

DATA CLEANING

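This step involves taking the raw data and preparing it in usable formats, which includes looking for duplicated records and any errors that may be embedded in the data. It is sometimes also called data cleansing or data wrangling; the cleaned data is often then loaded into a data warehouse.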
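To make the idea concrete, the sketch below cleans a handful of made-up records: it trims stray whitespace, converts numeric fields, skips records that cannot be parsed, and drops duplicates. The field names and the cleaning rules are assumptions chosen for the example, not a standard recipe.

```python
def clean_rows(raw_rows):
    """Trim whitespace, convert numbers, and drop duplicate or invalid records."""
    seen = set()
    cleaned = []
    for row in raw_rows:
        agent = row["agent"].strip().title()
        try:
            visits = int(row["customers_visited"])
            sales = float(row["sales"])
        except ValueError:
            continue  # skip records whose numeric fields cannot be parsed
        key = (agent, visits, sales)
        if key in seen:
            continue  # skip exact duplicates of a record already kept
        seen.add(key)
        cleaned.append(
            {"agent": agent, "customers_visited": visits, "sales": sales}
        )
    return cleaned


# Hypothetical raw records with typical problems.
raw_rows = [
    {"agent": " asha ", "customers_visited": "12", "sales": "30"},
    {"agent": "Asha", "customers_visited": "12", "sales": "30"},  # duplicate
    {"agent": "Ben", "customers_visited": "n/a", "sales": "21"},  # bad value
    {"agent": "Chen", "customers_visited": "15", "sales": "41"},
]

cleaned = clean_rows(raw_rows)
for record in cleaned:
    print(record)
```

Silently skipping bad records keeps the example short. In real work it is usually better to log or count what was dropped so that the errors can be investigated.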

DATA MINING

Once the data is in a usable format, it is examined further to see whether it is useful for analyzing or answering a business question. For example: does the data help in determining the sales outcome based on the number of customers a sales agent visited?
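One simple way to probe the sales question above is to check how strongly the two quantities move together. The sketch below computes a Pearson correlation coefficient by hand in Python. The numbers are invented for illustration, and correlation is only one of many possible checks; a value close to 1 suggests the data may be useful for the question, not that visits cause sales.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical data: customers visited per agent and the resulting sales.
visits = [8, 10, 12, 15, 20]
sales = [21, 24, 30, 41, 50]

r = pearson(visits, sales)
print(f"Correlation between visits and sales: {r:.2f}")
```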

DATA ANALYSIS

As the name suggests, the data is used for various analyses, including predictive analytics, where regression models and machine learning techniques are used to identify trends or patterns in the data.
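As a small example of a regression model, the sketch below fits a straight line to made-up visit and sales figures using ordinary least squares, then uses the line to predict sales for a new number of visits. The data and the function names are assumptions for illustration; real projects would typically rely on a statistics or machine learning library rather than hand-written formulas.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / spread_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predict y for a new x using the fitted line."""
    return slope * x + intercept


# Hypothetical data: customers visited per agent and the resulting sales.
visits = [8, 10, 12, 15, 20]
sales = [21, 24, 30, 41, 50]

slope, intercept = fit_line(visits, sales)
estimate = predict(slope, intercept, 18)
print(f"Each extra visit adds about {slope:.2f} sales")
print(f"Predicted sales for 18 visits: {estimate:.1f}")
```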

DATA VISUALIZATION

This step involves communicating the outcome of the preceding analysis to management in easily readable formats (reports, graphs, various charts, etc.). It is pretty much a part of Business Intelligence.
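Even a very basic chart can make results easier to read than a table of numbers. The sketch below draws a plain-text bar chart in Python. The sales figures and agent names are invented, and in practice a charting library or a BI tool would be used instead.

```python
def text_bar_chart(values, width=30):
    """Render a dict of label -> number as a horizontal text bar chart."""
    largest = max(values.values())
    label_width = max(len(label) for label in values)
    lines = []
    for label, value in values.items():
        # Scale each bar so that the largest value fills the full width.
        bar = "#" * round(value / largest * width)
        lines.append(f"{label.ljust(label_width)} | {bar} {value}")
    return "\n".join(lines)


# Hypothetical summary produced by the analysis step.
sales_by_agent = {"Asha": 30, "Ben": 21, "Chen": 41}

chart = text_bar_chart(sales_by_agent)
print(chart)
```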
