This site is for data science aspirants: people who are passionate about data science and people who want to start a career in data science from the beginning.

Monday, 7 November 2016

Job of a Data Scientist

By 22:52
What do data scientists do? If you have this question, here are the answers. A data scientist uses past data to make predictions and to answer questions that cannot be answered with ordinary techniques.
Data scientists have their own style and steps for solving problems and answering questions. Let us look at what exactly a data scientist does.


Define the question
Every field has problems to solve, and data science is no different. When you have a question to answer, you first need to define it: what challenges you are facing and which part of the problem you have to solve. It is not a good idea to travel without knowing your destination, so define your question before you try to answer it.

Define the ideal data set
After defining the question, it is time to define the ideal data set for answering it. A professional data scientist uses intuition here, because the data you choose plays a major role in the outcome of your results. So decide which data set you are going to work with.

Obtain the data
Once the data set is defined, we have to obtain it. Data is growing larger day by day and is becoming very cheap. It comes from many sources: data created by humans through their actions (on social media, for example), data created by industries, and data created by machines such as ATMs, among others. You should be clear about where you will obtain your ideal data set from.

Clean the data
We cannot start making jewelry from raw gold. We first need to clean it and extract the pure gold, and only then can we start making beautiful jewelry. The same applies to the data you have obtained. You cannot proceed without cleaning it, because raw data contains things you don't need; you have to extract the part you need to process.
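As a rough illustration of the cleaning step, here is a minimal Python sketch. The records, the field names (customer, age, city) and the clean function are all made up for this example; real cleaning depends entirely on your own data.

```python
# Hypothetical raw records; field names and values are invented for illustration.
raw_records = [
    {"customer": " Alice ", "age": "34", "city": "Pune"},
    {"customer": "Bob", "age": "", "city": "Delhi"},        # missing age
    {"customer": " Alice ", "age": "34", "city": "Pune"},   # exact duplicate
    {"customer": "Carol", "age": "abc", "city": "Mumbai"},  # age is not a number
    {"customer": "Dave", "age": "41", "city": " Chennai"},
]


def clean(records):
    """Return tidy records: trimmed text, integer ages, no duplicates."""
    cleaned = []
    seen = set()
    for record in records:
        name = record["customer"].strip()
        city = record["city"].strip()
        age_text = record["age"].strip()
        # Skip rows whose age is missing or is not a whole number.
        if not age_text.isdigit():
            continue
        key = (name, int(age_text), city)
        # Skip rows that have already been kept.
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"customer": name, "age": int(age_text), "city": city})
    return cleaned


cleaned_records = clean(raw_records)
for row in cleaned_records:
    print(row)
```

Only the rows for Alice and Dave survive here: the row with the missing age, the row with the non-numeric age and the duplicate are dropped. Whether to drop such rows or repair them is a judgment call that depends on the question you defined earlier.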

Exploratory data analysis
An exploratory data analysis builds on a descriptive analysis by searching for discoveries, trends, correlations, or relationships between the measurements of multiple variables to generate ideas or hypotheses.
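To make the idea concrete, here is a small Python sketch of one exploratory step: summarising two variables and then checking how strongly they move together. The numbers and the variable names (hours_studied, exam_score) are invented for illustration, and the pearson helper is written by hand only to keep the example self-contained.

```python
from statistics import mean, stdev

# Hypothetical measurements, invented for illustration only.
hours_studied = [1, 2, 3, 4, 5, 6]
exam_score = [52, 55, 61, 64, 70, 74]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = mean(xs), mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Descriptive step: summarise each variable on its own.
print("hours: mean", mean(hours_studied), "stdev", round(stdev(hours_studied), 2))
print("score: mean", round(mean(exam_score), 2), "stdev", round(stdev(exam_score), 2))

# Exploratory step: look for a relationship between the two variables.
r = pearson(hours_studied, exam_score)
print("correlation:", round(r, 3))
```

A correlation close to 1 in this made-up data would suggest a hypothesis worth testing (more study hours go with higher scores); it does not prove that one causes the other.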

Statistical prediction/modeling
Statistical modeling is the traditional way to analyse a problem. You build a statistical model or prediction, that is, an intuition about what is going to happen in the next sample you might take.
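One of the simplest statistical models is a straight line fitted by least squares. The Python sketch below assumes made-up monthly sales figures and a hypothetical fit_line helper; it is only meant to show what "predicting the next sample" can look like, not how any particular data scientist works.

```python
# Hypothetical past observations, invented for illustration only.
months = [1, 2, 3, 4, 5]
sales = [10.0, 12.0, 14.0, 16.0, 18.0]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    slope = numerator / denominator
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(months, sales)

# Use the fitted line to predict the next, not yet observed, sample.
predicted_month_6 = slope * 6 + intercept
print("slope:", slope, "intercept:", intercept)
print("predicted sales for month 6:", predicted_month_6)
```

Real data is rarely this tidy, so in practice you would also check how well the line fits before trusting the prediction.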

Interpret results
Next you interpret your results and challenge them. Then you synthesize them and write them up in a reproducible way so they can be shared with other people.


Data products
Finally, you distribute your results through things like interactive graphics, write-ups and presentations, and interactive apps built on top of R or Python, whichever you are more comfortable with.




Sunday, 6 November 2016

Data Scientist vs Data Analyst: A common question for every beginner

By 18:33
With the immense amount of data created every day from different sources, the data industry is booming, and data science has become an interesting topic. I have seen many people ask how the different roles differ, in particular what the difference is between a data scientist and a data analyst.

Before discussing the differences, let us talk about data. What is data? Data is information or knowledge; this article itself, for example, is data. Next, who generates data? It is mainly created by three sources: humans, industries and machines.
Humans create data through their actions on social media and in many other places, organisations create data, and finally machines create data.

Now it is time to talk about the data scientist and the data analyst. We will start with the data analyst. In the picture below, the data analyst is pointing to some kind of representation.

A data analyst breaks a large problem into small pieces so that it is easier to understand, and presents solutions through visualizations such as bar charts and pie charts, based on what has happened so far. A data scientist, on the other hand, looks at the problem from a business point of view and does predictive analysis to find out what is going to happen in the future.

Let us look at the skill set and team structure of a data scientist. As the following picture shows, a data scientist has a team made up of data analysts, software engineers and domain experts such as Java, R or Scala programmers, whereas a data analyst has a team made up of software engineers and data warehousing specialists. A data analyst and a data scientist have different roles under them, as you can see in the following picture.
Their technical skills are listed below.

So we have discussed the major differences between a data scientist and a data analyst, and the skill sets of each role.






