Before defining data science, let’s get familiarized with three terms – data, information and insight. These three terms seem similar; however, they are not quite the same.
- DATA is a raw unorganized set of information. What we call data is actually raw data. It contains information; however, the information is not readily available.
- INFORMATION is when you analyze data so that it provides some sort of understanding of the data. The term we use is information extraction (from data).
- INSIGHT is gained by analyzing data and information to understand what is going on with a particular situation. This can be used to make better decisions.
Data science is the field of study that combines domain expertise, programming skills, and knowledge of mathematics and statistics to extract meaningful insights from data. In turn, these systems generate insights which analysts and business users can translate into tangible business value.
More and more companies are coming to realize the importance of data science, AI, and machine learning. Regardless of industry or size, organizations that wish to remain competitive in the age of big data need to efficiently develop and implement data science capabilities or risk being left behind.
- While Data Science focuses on finding meaningful correlations between large datasets, Data Analytics is designed to uncover the specifics of extracted insights.
- In other words, Data Analytics is a branch of Data Science that focuses on more specific answers to the questions that Data Science brings forth.
-
Data analysis works better when it is focused, having questions in mind that need answers based on existing data.
-
Data science produces broader insights that concentrate on which questions should be asked, while big data analytics emphasizes discovering answers to questions being asked
-
Hence, Data Science is a field of analyzing and picking insightful information from raw data.
-
A Data Scientist is one who creates programming code and combines it with statistical knowledge to create insights and analysis on relatively larger data or big data.
- Mathematics
- Statistics
- Data Processing
- Data Analysis
- Computer Science/Programming
- Machine Learning
In the image below you can see a Venn Diagram showing how fields are interlinked.
- INTERNET SEARCH:
- All the Search engines make use of data science algorithms to deliver one of the best results for our searched thing in a fraction of second.
- DIGITAL ADVERTISEMENT:
- Search is the biggest application of data science and machine learning. From displaying advertisements on various websites to the digital bill at the airport, everything is decided by using data science algorithms.
- Digital Ads: Get a lot of higher CTR [Click Through Rate] than traditional ads.
- RECOMMENDER SYSTEM:
- Everyone is familiar with recommendations on YouTube, Amazon, Flipkart, Netflix etc.
- They not only help to search relevant products but also for advertisements that are to make us buy that product.
- Companies use this to promote their products. Most recommendations are based on previous search results.
- IMAGE RECOGNITION:
- We used to upload images with friends in social media and we get suggestions to tag our friends. This automatic tag suggestion uses a face recognition algorithm.
- Similarly, it is asked to scan the QR code when we open WhatsApp in a Web browser.
- SPEECH RECOGNITION:
- Some of the best examples are Google Voice Assistant, Siri, Cortana etc.
- This also uses a Data Science algorithm.
- (Here you can see a video conversation between Cortana and Satya Nadella (Microsoft CEO).
- [Link: https://youtu.be/HdKGo9IgWOA]
- Data Engineer
- Data Analyst
- Data Scientist
| Data Engineer | Data Analyst | Data Scientist |
|---|---|---|
| Focus on improving data consumption techniques | Focus on the present technical analysis of data. | Focuses on a futuristic display of data. |
| Data mining for getting insights from data. | Collecting information from a database with the help of a query. | Manage, mine, and clean unstructured data to prepare it for practical use. |
| Prepare data, construct, test, maintain complete architecture. | Analyze numeric data and help to make decisions. | Analyze and interpret complex data. |
| Scripting, reporting and data visualization. | Spreadsheet knowledge | Decision making soft skills. |
Writing queries on data |
Use basic algorithms in their work like logistic regression, linear regression and so on |
Understand and interpret Big Data analysis |












