Want to become a data scientist? Then you should know the prerequisites before you start cranking. Data scientist has been called the sexiest job of the 21st century, so many people want to start a career in the field. You might have heard that a career in data science requires expert skills in many domains, but the truth is that anyone with a good basic knowledge of maths, statistics and programming, plus communication skills, can get started.
It helps to know the basics of mathematical concepts such as linear algebra, calculus, probability and statistics; these skills are a must for learning data science. If you are passionate about data science, the rest becomes much easier to learn.
Once you have the basic knowledge discussed above, you should also have a basic knowledge of the following tools:
Hadoop: It is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Data scientists often have to work with huge data sets that are very difficult to handle with ordinary storage systems, so this is a tool you can hardly avoid learning.
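To get a feel for what a "simple programming model" means here, below is a minimal sketch of the MapReduce idea (the model Hadoop is best known for) as a word count. It is plain Python running on one machine and does not use any Hadoop API; the function names map_phase, shuffle and reduce_phase are my own, chosen only for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all the counts for one word into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence on a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["big data needs big clusters", "data scientists love data"]
    print(word_count(sample))
```

On a real cluster the map and reduce steps would run on many machines at once, each working on the part of the data stored locally; here everything runs in a single process just to show the shape of the model.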
Hive: It allows SQL-like queries (HiveQL) on datasets stored in a Hadoop cluster. Hadoop by itself does not cover everything, so it needs supporting tools like this one.
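The kind of query you would write in Hive looks like ordinary SQL. The sketch below shows a typical GROUP BY aggregation. Note the assumptions: it uses Python's built-in sqlite3 module as a stand-in so that it runs anywhere, not Hive itself, and the sales table and its columns are made up for the example.

```python
import sqlite3


def total_sales_by_region(rows):
    """Load (region, amount) rows into an in-memory table and total them per region."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table; in Hive this would be a table over files in the cluster.
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    query = """
        SELECT region, SUM(amount) AS total
        FROM sales
        GROUP BY region
        ORDER BY region
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    data = [("north", 10.0), ("south", 5.0), ("north", 2.5)]
    print(total_sales_by_region(data))
```

The point is that if you already know SQL, you can ask questions of data in a Hadoop cluster without writing the lower-level processing code yourself.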
Mahout: It aims to provide an environment for quickly creating scalable, performant machine learning applications. Machine learning is a trending technology that is helpful in many industries, and such applications can be created using Mahout. Your knowledge of linear algebra and calculus plays a major role in machine learning.
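To see where the maths comes in, here is a small sketch of fitting a straight line to data by least squares. The formulas for the slope and intercept come from setting the derivatives of the squared error to zero, which is exactly the calculus and linear algebra mentioned above. This is plain Python for illustration, not Mahout code, and fit_line is a name I made up.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of the products of deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Points that lie exactly on the line y = 2x + 1.
    print(fit_line([0, 1, 2, 3], [1, 3, 5, 7]))
```

A library like Mahout does this kind of work for much larger models and data sets, but the underlying ideas are the same.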
Spark: It is a fast and general engine for large-scale data processing. Spark offers over 80 high-level operators that make it easy to build parallel apps. And you can use it interactively from the Scala, Python and R shells.
Storm: It is a free and open source distributed realtime computation system. Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing. Storm is simple, can be used with any programming language, and is a lot of fun to use!
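An "unbounded stream" is simply data that never ends, so you cannot wait for all of it before computing. The sketch below imitates that idea with Python generators: one function produces events forever, another keeps a running count and emits an updated result after every event. It does not use Storm; the names event_source and running_counts and the event types are hypothetical.

```python
import itertools
from collections import Counter


def event_source():
    """Hypothetical endless source of events (it never stops yielding)."""
    for event in itertools.cycle(["click", "view", "click", "purchase"]):
        yield event


def running_counts(events):
    """Consume events one at a time and yield the counts seen so far."""
    counts = Counter()
    for event in events:
        counts[event] += 1
        yield dict(counts)


if __name__ == "__main__":
    # Look at only the first six updates, since the stream itself is endless.
    for snapshot in itertools.islice(running_counts(event_source()), 6):
        print(snapshot)
```

In Storm terms, the source plays roughly the role of a spout and the counter the role of a bolt, with the system spreading that work across many machines.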
So we have discussed all the major prerequisites for a data scientist. If you are passionate about data science, all these tools and skills are easy to learn, and then you can play with huge data sets to make predictions.
Here is my YouTube channel; please follow and subscribe. I'll bring you more video tutorials and articles to help you learn data science on your own from the internet, by showing you the right stuff to learn from.
Thanks, guys. See you in the next article.
