From the daily news with our morning coffee to e-mails, photos, videos, social media and the vast labyrinth of big data, data surrounds us. So we have truckloads of data; now what? How can we unlock valuable insights from our seemingly unrelated, random databases?
The answer is data science. It is the extraction of knowledge from large volumes of structured, semi-structured or unstructured data using mathematical techniques. It can also be described as the process of tunnelling through mountains of information, carried out by exceptional individuals with excellent skills in writing algorithms, who are able to extract glittering insights from the coal mines of data. The domain of data science is the explosion of new data generated from the web, mobile devices, social media and smart devices.
In its early stages, data science was equated with statistics, but it is a deeper form of knowledge discovery through data exploration and interpretation. The value of data science is not limited to scientific queries. It helps in deciding everything from how bridges should be designed, to how frequently a patient should receive radiation, to even mundane things like which genres of movies should be made.
Being interdisciplinary, data science is used to draw scientific conclusions across a wide spectrum of academic subject areas. Clinical data science is integral to providing timely answers about the efficacy of existing therapeutic compounds. It is crucial for determining the safety of novel therapeutic compounds through planning, collecting, transforming, analysing and reporting clinical trial data, and communicating the results obtained. Data science does not stop at clinical trials. In genomics, it is applied to the study of proteins and DNA sequences. The work of studying and analysing DNA structures, viruses and other biological pathogens is made easier by the tools of data science, which also make the procedures repeatable. Nor is it restricted to the medical field: it has a long and rich history in fraud monitoring and security. Although the techniques used are similar to those in other data domains, security data science has a narrow focus on reducing risk and identifying malicious or fraudulent insiders. As it has evolved, security data science has become able to meet the challenges of gaining insights from huge streams of log data and managing them to discover insider threats and prevent fraud.
The exceptional individuals mentioned above, with their excellent coding skills, are known as data scientists. It is said that being a data scientist will be the most appealing job of the 21st century. The most basic and crucial capability of a data scientist is writing code that can swim through heaps of data, giving structure and a backbone to formless data and joining this potentially incomplete data with rich data sources, which yields a clean data set and makes further analysis possible. A data scientist can essentially be a scientist from any field; the domain does not really matter. They could be an ecologist, a librarian, an economist or an astrophysicist. It comes as no surprise that most of the data scientists in the role today were originally trained in maths, economics or computer science. Data scientists rely heavily on natural language processing, machine learning, statistics, optimization, signal processing and text retrieval.
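The "structure, then join" work described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the raw records, the clinic reference table, and the function names `clean_row` and `clean_and_join` are all hypothetical, and dropping rows with a missing age is just one possible way to handle incomplete data.

```python
import csv
import io

# Hypothetical raw export: inconsistent casing, stray whitespace, a missing value.
RAW_VISITS = """patient_id, age ,clinic
 P001,34,north
p002,,South
P003,51, NORTH
"""

# Hypothetical reference table standing in for the "rich data source".
CLINICS = {
    "north": {"city": "Leeds", "beds": 120},
    "south": {"city": "Bristol", "beds": 80},
}


def clean_row(row):
    """Normalise one raw record; return None if it cannot be used."""
    # Strip whitespace from both the column names and the values.
    row = {key.strip(): (value or "").strip() for key, value in row.items()}
    if not row["age"]:
        return None  # incomplete record: no age recorded
    return {
        "patient_id": row["patient_id"].upper(),
        "age": int(row["age"]),
        "clinic": row["clinic"].lower(),
    }


def clean_and_join(raw_text, reference):
    """Parse raw CSV text, drop unusable rows, and attach reference fields."""
    result = []
    for raw in csv.DictReader(io.StringIO(raw_text)):
        record = clean_row(raw)
        if record is None or record["clinic"] not in reference:
            continue  # skip rows that are incomplete or cannot be matched
        result.append({**record, **reference[record["clinic"]]})
    return result


if __name__ == "__main__":
    for record in clean_and_join(RAW_VISITS, CLINICS):
        print(record)
```

In practice the same steps are usually done with a data-frame library on far larger inputs; the standard library is used here only to keep the sketch self-contained.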
Yahoo’s “Hadoop” and LinkedIn’s “People You May Know” are some of the results of data scientists at their best! Data-driven companies like Google, Amazon, eBay and Twitter employ data scientists who are on a constant quest to add to and refine their toolkits. Data science matters to companies because it enables them to plan and operate more skilfully. It is about adding substantial enterprise value by making use of the results obtained.
They say it always has to start with a question. The usefulness of data science depends on the questions it can answer. That is where the science part of the equation comes in. It does not matter whether you have a larger database to search through or whether you can code in Python. What matters is whether you are able to find the answer to a complex question using the data in hand. The combination of large data and the code to sort through it won't necessarily get you the required result, because in a heap of 5 GB of data, perhaps only 4 KB will hold the answer to the question in hand. Another school of thought, however, counters that data science is about applying scientific methods to problems in order to make better decisions, something we could not do before for lack of data and of tools to analyse it. In data science, conclusions are drawn from verifiable data rather than from anecdotes and instincts.
In conclusion, the hype around data science will flame out if it is only about data rather than science. The impact of data science will be measured by the scientific questions we can answer with the data. The operative word in data science is not data; it is science.