DATA SCIENCE BASICS

In this post, we’ll discuss two fundamental questions that can help you shape your career in data science.
1. What is data science and what are its components?
2. What are the prerequisite skills needed to get a job in data science?
1. Data science is the set of techniques used to extract meaningful insights from large datasets. As more people came online, companies like Facebook, Instagram and Google collected a great deal of data about their users. This led to Big Data, which comprises unstructured datasets. Hence, several methods were developed to work on this data, and they have found wide-scale applications. For example, Netflix collects data from its users to make decisions such as where to put subtitles, how to place the credits and how to transition between episodes of a web series.
So the first step is to collect data and store it. This involves gathering different types of data, such as user-generated data and external data, and storing them.
To ensure a reliable flow of data, data pipelines are built around a standard structure, ETL, which stands for Extract, Transform and Load. Through this, raw data is transformed so that it can be suitably analysed. This task is handled by data engineers.
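To make the three ETL stages concrete, here is a minimal sketch using only the Python standard library. The sample CSV, the `watch` table and the function names are made up for illustration; a real pipeline would read from logs or an API and load into a proper data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would come from logs or an API.
RAW_CSV = """user_id,country,watch_minutes
1,us,42
2,IN,
3,de,17
"""

def extract(text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop incomplete rows, then normalise types and casing."""
    clean = []
    for row in rows:
        if not row["watch_minutes"]:
            continue  # skip rows with a missing value
        clean.append(
            (int(row["user_id"]), row["country"].upper(), int(row["watch_minutes"]))
        )
    return clean

def load(rows, conn):
    """Load: write the cleaned rows into a table that is ready for analysis."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS watch (user_id INTEGER, country TEXT, minutes INTEGER)"
    )
    conn.executemany("INSERT INTO watch VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
print(conn.execute("SELECT * FROM watch").fetchall())
```

Keeping each stage in its own function means a stage can be swapped or tested without touching the others.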
Only after this can the “analysis” of the data be performed. Often this is the only part that gets attention while the essential foundations are ignored, so people mistakenly believe that data science consists only of data analytics.
On top of this, metrics are built to track the data, users are categorised and models can be trained on the data with the help of labels. Before ML models are deployed, an experimentation framework is put in place to estimate the effect of a change before it is rolled out to the whole dataset.
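One common building block of such an experimentation framework is a stable way to expose only a small share of users to a change. The sketch below is one possible approach, not a description of any particular company's system; `assign_group`, the experiment name and the 10% share are all assumptions chosen for illustration.

```python
import hashlib

def assign_group(user_id, experiment, treatment_share=0.1):
    """Deterministically place a user in 'treatment' or 'control'."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    # Map the first 8 hex digits to a number in [0, 1).
    bucket = int(digest[:8], 16) / 16**8
    return "treatment" if bucket < treatment_share else "control"

# Assign 10,000 hypothetical users and check the share that sees the change.
groups = [assign_group(uid, "new_ranking_model") for uid in range(10_000)]
print(groups.count("treatment") / len(groups))  # close to 0.1
```

Hashing the user id instead of drawing a random number means the same user always lands in the same group, so their experience stays consistent across visits.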
2. To put this into practice, you need the following skills to get started in data science.
Programming Language
Irrespective of the role, you need to know a programming language suited to manipulating statistical data, like Python or R. Besides this, you also need to know a database query language like SQL. Python libraries make it simpler to apply machine learning models, so initially you do not need to know exactly how the algorithms work.
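As a small taste of how Python and SQL work together, the sketch below runs an aggregation query from Python using the built-in `sqlite3` module. The `orders` table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ana", 20.0), ("ben", 5.0), ("ana", 30.0), ("ben", 15.0), ("cy", 8.0)],
)

# SQL does the aggregation; Python receives the summarised rows.
query = """
    SELECT customer, COUNT(*) AS n_orders, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    ORDER BY total DESC
"""
rows = conn.execute(query).fetchall()
for customer, n_orders, total in rows:
    print(f"{customer}: {n_orders} orders, {total:.2f} total")
```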

Applied Mathematics
You should have a solid understanding of statistics, as it will be needed for making decisions when evaluating experiments. Knowledge of calculus and linear algebra helps you independently apply the results of a machine learning or statistical implementation to a different case.
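To give a feel for the kind of statistics involved in evaluating an experiment, here is a sketch of a two-proportion z-test, one standard way to compare conversion rates between two groups. The counts are made up, and the normal approximation it relies on assumes reasonably large samples.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Return (z, two-sided p-value) for the difference of two proportions."""
    p_a, p_b = success_a / n_a, success_b / n_b
    # Pooled proportion under the null hypothesis that both rates are equal.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical experiment: 200 of 2000 users convert in group A, 250 of 2000 in group B.
z, p = two_proportion_z_test(success_a=200, n_a=2000, success_b=250, n_b=2000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between the groups is unlikely to be due to chance alone, which is the sort of judgement you will be asked to make.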
Data cleaning and visualisation
Do not assume that data will be readily available to you for processing. Often a great deal of time is spent cleaning the data, filling in missing values and correcting formatting. Without this, the data can’t move on to further stages.
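The sketch below shows what this cleaning can look like on a tiny scale, using plain Python. The records and the choice to fill missing ages with the median are assumptions for illustration; the right strategy for missing values depends on the dataset.

```python
from statistics import median

# Hypothetical raw records with the usual problems: stray whitespace,
# inconsistent casing and missing values.
raw = [
    {"name": " Alice ", "age": "34", "city": "london"},
    {"name": "BOB", "age": "", "city": "Paris "},
    {"name": "carol", "age": "29", "city": None},
]

def clean(records):
    """Trim and normalise text fields; fill missing ages with the median."""
    known_ages = [int(r["age"]) for r in records if r["age"]]
    fallback_age = median(known_ages)
    cleaned = []
    for r in records:
        cleaned.append({
            "name": r["name"].strip().title(),
            "age": int(r["age"]) if r["age"] else fallback_age,
            "city": (r["city"] or "Unknown").strip().title(),
        })
    return cleaned

for row in clean(raw):
    print(row)
```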
You should know how to use visualisation techniques to draw meaningful insights from the data. Matplotlib and ggplot can help with visualisation. Tableau is also a popular tool for rendering data visually.
Software engineering
A strong software engineering background is the most essential requirement. You should have a clear understanding of algorithms, data structures and memory management, which are almost always tested in the first interview rounds for data science roles.