Big Data
The volume of data that one has to deal with has exploded to unimaginable levels in the
past decade, and at the same time, the price of data storage has steadily fallen.
Private companies and research institutions capture terabytes of data about their users’
interactions, business, and social media, as well as from sensors in devices such as
mobile phones and automobiles. The challenge of this era is to make sense of this sea of
data. This is where big data analytics comes into the picture.
Big data analytics largely involves collecting data from different sources, munging it so
that it can be consumed by analysts, and finally delivering data products that are useful
to the organization’s business.
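The collect, munge, and deliver steps described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two sources, the field names (`user`, `user_id`, `page`, `screen`), and the records themselves are made up, and a real project would read from files, APIs, or databases rather than in-memory lists.

```python
from collections import defaultdict

# Hypothetical raw records from two different sources (made-up data).
web_events = [
    {"user": "U1", "page": "home"},
    {"user": "u2 ", "page": "cart"},
    {"user": "u1", "page": "cart"},
]
mobile_events = [
    {"user_id": "U2", "screen": "cart"},
    {"user_id": "u3", "screen": "home"},
]


def munge(web, mobile):
    """Bring both sources into one schema with normalized user ids."""
    unified = []
    for event in web:
        unified.append({
            "user": event["user"].strip().lower(),
            "view": event["page"],
            "source": "web",
        })
    for event in mobile:
        unified.append({
            "user": event["user_id"].strip().lower(),
            "view": event["screen"],
            "source": "mobile",
        })
    return unified


def views_per_user(records):
    """A tiny 'data product': number of views per user across sources."""
    counts = defaultdict(int)
    for record in records:
        counts[record["user"]] += 1
    return dict(counts)


records = munge(web_events, mobile_events)
summary = views_per_user(records)
print(summary)
```

Most of the code deals with reconciling the two sources, not with the final count. This is typical of the munging step.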
Big Data Life Cycle
In today’s big data context, traditional data mining methodologies are either incomplete
or suboptimal. For example, the SEMMA methodology completely disregards data collection
and the preprocessing of different data sources. These stages normally constitute most of
the work in a successful big data project.
A big data analytics cycle can be described by the following stages:
Business Problem Definition
Research
Human Resources Assessment
Data Acquisition
Data Munging
Data Storage
Exploratory Data Analysis
Data Preparation for Modeling and Assessment
Modeling
Implementation
Big Data Analytics – Core Deliverable
As mentioned in the big data life cycle, the data products that result from a big data project are, in most cases, some of the following:
Machine learning implementation: This could be a classification algorithm, a regression model or a segmentation model.
Recommender system: The objective is to develop a system that recommends choices based on user behavior. Netflix is the classic example of this data product: based on users’ ratings, other movies are recommended.
Dashboard: Businesses normally need tools to visualize aggregated data. A dashboard is a graphical mechanism to make this data accessible.
Ad-Hoc analysis: Business areas normally have questions, hypotheses or myths that can be answered by doing ad-hoc analysis with data.
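To make the recommender system item above more concrete, here is a minimal sketch of one common approach, user-based collaborative filtering with cosine similarity. It is not a description of how Netflix or any production system works. The ratings, user names, and the helper functions `cosine` and `recommend` are all hypothetical and exist only for this example.

```python
from math import sqrt

# Hypothetical ratings (user -> {movie: rating from 1 to 5}); made-up data.
ratings = {
    "ann": {"Alien": 5, "Heat": 4, "Up": 1},
    "bob": {"Alien": 4, "Heat": 5, "Se7en": 5},
    "cat": {"Up": 5, "Coco": 4, "Alien": 1},
}


def cosine(a, b):
    """Cosine similarity between two users' rating dictionaries.

    Movies a user has not rated count as 0, so only shared movies
    contribute to the dot product.
    """
    dot = sum(a[m] * b[m] for m in set(a) & set(b))
    norm_a = sqrt(sum(r * r for r in a.values()))
    norm_b = sqrt(sum(r * r for r in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(user, ratings, top_n=2):
    """Rank movies the user has not rated by similarity-weighted ratings."""
    seen = ratings[user]
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(seen, other_ratings)
        for movie, rating in other_ratings.items():
            if movie not in seen:
                scores[movie] = scores.get(movie, 0.0) + sim * rating
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_n]


print(recommend("ann", ratings))
```

In this toy data, ann's ratings resemble bob's far more than cat's, so a movie only bob has rated ranks above a movie only cat has rated. A real system would also need to handle rating scale differences between users, sparse data, and new users with no ratings.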