BATCH ANALYTICS

INTRODUCTION

This blog mainly consists of two parts:

  1. Batch analytics
  2. Real-time analytics

Each part goes through three phases:

  1. Data collection (source of the data)
  2. Data processing
  3. Data visualization

Since the test data is already provided, data collection takes little time for batch analytics. For real-time analytics, however, it involves some work.

First, we'll look at the Batch analytics section.



BATCH ANALYTICS 

Step 01 - Data Collection (Given Data Set)

Step 02 - Data Preprocessing

This step involves cleaning and pre-processing the data set. Two main Python packages can be used for this: pandas and numpy.
Points to Remember:

  • The pandas package provides high-performance, easy-to-use data structures and data analysis tools.
  • The numpy package provides fast array and numerical operations.
  • The following steps describe the data pre-processing phase, carried out in a Jupyter Notebook.




  1. Importing Python packages and understanding the data set.
  2. Checking the data types of the fields and the null values.
  3. Pre-processing columns and rows which contain null values.
  4. Formatting the Date & Hour.
  5. Checking the data.
  6. Extracting the cleaned data set to a new CSV file.
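
To make the six steps above concrete, here is a minimal sketch in pandas. It is not the notebook used for this post: the sample rows, the column names (Date, Hour, Station, Reading), the choice to fill missing values with "Unknown" and the median, and the output file name cleaned_data.csv are all assumptions made for illustration. The real data set will need its own column names and its own rules for handling nulls.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the provided data set.
# Column names and values are invented for illustration only.
RAW_CSV = """Date,Hour,Station,Reading
2020-01-01,0,A,10.5
2020-01-01,1,A,
2020-01-02,13,,7.0
,5,B,3.2
2020-01-03,23,B,8.1
"""

OUTPUT_PATH = "cleaned_data.csv"  # assumed output file name


def preprocess(raw: pd.DataFrame) -> pd.DataFrame:
    """Handle null values and format the Date and Hour columns."""
    # Step 3: drop rows that have no date, then fill the remaining nulls.
    df = raw.dropna(subset=["Date"]).copy()
    df["Station"] = df["Station"].fillna("Unknown")
    df["Reading"] = df["Reading"].fillna(df["Reading"].median())

    # Step 4: parse the date, make the hour an integer, and combine both
    # into a single timestamp column.
    df["Date"] = pd.to_datetime(df["Date"], format="%Y-%m-%d")
    df["Hour"] = df["Hour"].astype(int)
    df["Timestamp"] = df["Date"] + pd.to_timedelta(df["Hour"], unit="h")
    return df.reset_index(drop=True)


# Step 1: load the data and take a first look.
raw = pd.read_csv(io.StringIO(RAW_CSV))
print(raw.head())

# Step 2: check the data types and count the null values per column.
print(raw.dtypes)
print(raw.isnull().sum())

# Steps 3 and 4: clean and format.
clean = preprocess(raw)

# Step 5: check the result.
clean.info()
print(clean.isnull().sum())

# Step 6: write the cleaned data set to a new CSV file.
clean.to_csv(OUTPUT_PATH, index=False)
```

With a real file, the io.StringIO wrapper would be replaced by the path to the provided CSV. Whether to drop or fill a null value depends on the column, so the rules in preprocess should be treated as placeholders.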


Step 03 - Data Visualization

For this step, I used Tableau, which is freely available software. Tableau helps anyone see and understand their data: connect to almost any database, drag and drop to create visualizations, and share them with a click.

Step 1: Connect to your data.
Step 2: Drag and drop to take a first look, and select the most appropriate attributes for each scenario.
Step 3: Focus your results.
Step 4: Select the most suitable graph type.
Step 5: Add Filters.
Step 6: Build a dashboard to show your insights.



Check out the final Batch Analytics dashboards here.




