Lifecycle of Data Science

Posted by Deven Raj on November 12th, 2019

  • Phase 1 (Discovery) 

This is the phase where you discover the requirements, prerequisites, priorities and budget for the project. You assess the resources, people, time and data available to support it. This also includes framing a business plan and formulating a hypothesis to test with a model later.

  • Phase 2 (Data Prep) 

In this phase, you need an analytical sandbox in which to perform analytics for the duration of the project. This phase is mostly about exploring, pre-processing and conditioning the data before modeling.

Perform ETLT (Extract, Transform, Load, and Transform) to get data into the sandbox.
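
As a rough illustration of the ETLT flow, here is a minimal Python sketch that uses only the standard library. The sample CSV, the `orders` table and all function names are made up for this example, and an in-memory SQLite database stands in for the sandbox. A real project would use whatever source systems and sandbox it actually has.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would come from a source system.
RAW_CSV = """order_id,region,amount
1,north, 120.50
2,South,80
3,north,
4,south,99.5
"""


def extract(text):
    """Extract: read raw rows from CSV text."""
    return list(csv.DictReader(io.StringIO(text)))


def light_transform(rows):
    """First transform: trim, normalize case, coerce types, drop empty amounts."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip rows with a missing amount
        region = row["region"].strip().lower()
        cleaned.append((int(row["order_id"]), region, float(amount)))
    return cleaned


def load(conn, rows):
    """Load: write the cleaned rows into a sandbox table."""
    conn.execute("CREATE TABLE orders (order_id INTEGER, region TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)


def transform_in_sandbox(conn):
    """Second transform: aggregate inside the sandbox with SQL."""
    cursor = conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    )
    return dict(cursor.fetchall())


def run_etlt(text):
    """Run all four steps against an in-memory database (the stand-in sandbox)."""
    conn = sqlite3.connect(":memory:")
    load(conn, light_transform(extract(text)))
    totals = transform_in_sandbox(conn)
    conn.close()
    return totals


print(run_etlt(RAW_CSV))
```

The second transform runs inside the database after loading, which is what distinguishes ETLT from plain ETL.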

You can use R for data cleaning, transformation and visualization. This helps you spot outliers and establish relationships between variables.
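
One common way to spot outliers is the interquartile range (IQR) rule. The sketch below is written in Python rather than R, purely for illustration. The function name, the multiplier of 1.5 and the sample readings are assumptions for this example, not something the lifecycle prescribes.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values that fall outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # the three quartile cut points
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


# Made-up sample with one obviously extreme reading.
readings = [10, 12, 11, 13, 12, 11, 95, 12, 10, 13]
print(iqr_outliers(readings))
```

A flagged value is only a candidate for review. Whether it is an error or a genuine extreme is a judgment you make with domain knowledge.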


  • Phase 3 (Model Planning)

In this phase, you decide on the methods and techniques for drawing out the relationships between variables. These relationships set the base for the algorithms that you will implement in the next phase. You apply Exploratory Data Analysis (EDA) using statistical formulas and visualization tools.
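
A simple statistical check during EDA is the Pearson correlation coefficient between two variables. Here is a minimal Python sketch. The `ad_spend` and `sales` columns are invented sample data, and Pearson correlation is just one of many measures you might choose.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Invented sample columns, used only to show the call.
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 6, 8, 10]
print(round(pearson(ad_spend, sales), 3))
```

A coefficient near 1 or -1 suggests a strong linear relationship and near 0 suggests little linear relationship. It says nothing about causation, so treat it as a pointer for the next phase rather than a conclusion.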

  • R has a complete set of modeling capabilities and provides a good environment for building interpretive models.
  • SQL Server Analysis Services can perform in-database analytics using common data mining functions and basic predictive models.
  • SAS/ACCESS can be used to access data from Hadoop and to create repeatable and reusable model flow diagrams.

  • Phase 4 (Model Building)

This phase is for developing datasets for training and testing purposes. You consider whether your existing tools will suffice for running the models or whether you need a more robust environment with fast, parallel processing. You also analyze learning techniques such as classification, association and clustering to build the model.
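
To make the idea of training and testing datasets concrete, here is a small Python sketch that splits labeled rows and fits a nearest-centroid classifier. The data, the labels, the 25% test fraction and the choice of classifier are all assumptions for illustration. A real project would normally use an established library and a far larger dataset.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_centroids(train):
    """Nearest-centroid model: the mean feature vector of each label."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Assign the label whose centroid is closest in squared distance."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, test):
    """Fraction of held-out rows that are classified correctly."""
    hits = sum(predict(centroids, features) == label for features, label in test)
    return hits / len(test)


# Made-up, well-separated sample: (features, label) pairs.
data = [
    ((1.0, 1.2), "low"), ((0.8, 1.0), "low"), ((1.1, 0.9), "low"), ((1.3, 1.1), "low"),
    ((5.0, 5.2), "high"), ((4.8, 5.1), "high"), ((5.3, 4.9), "high"), ((5.1, 5.0), "high"),
]

train, test = train_test_split(data)
centroids = fit_centroids(train)
print(accuracy(centroids, test))
```

The model is scored only on rows it never saw during fitting, which is the reason for keeping separate training and testing sets.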

 

  • Phase 5 (Operationalize)

You will deliver final reports, briefings, code and technical documents. Sometimes a pilot project (a side application) is also implemented in a real-time production environment. This gives you a clear picture of performance and related constraints on a small scale before full-scale deployment.

  • Phase 6 (Communicate results)

In this last phase, you identify the key findings, communicate them to the stakeholders and determine whether the project has succeeded or failed against the criteria developed in Phase 1.

Conclusion

In this emerging market for data science, you need to acquire a range of hard and soft skills to build a career. You need to be strong in statistics and mathematics to analyze and visualize data. As machine learning is at the center of data science, you also need a lot of hands-on experience with it. You should have a solid understanding of the domain you work in so that you can comprehend the business problems and come up with solutions. You should be capable of implementing various algorithms, which requires strong coding skills, and you need sound judgment, as you will be making key decisions and presenting them to stakeholders. Excellent communication is therefore a factor that can greatly help you advance in this field. Because of this demand, numerous online data science courses are available through which you can build these skills.

Data scientists are well paid on average, and an individual can earn more by standing out with the complete package of soft and hard skills that data science requires.

This field is growing fast, and with ongoing advancements in storage capacity and data manipulation, data science might become the backbone of predictive and prescriptive analytics.

Data science is not just about modeling and algorithms. Its applications reach almost all new technological advancements in computing and data handling.
