
What is the difference between data science and machine learning?

I've been looking into data science careers, and I know the field is closely related to machine learning and big data. I'm confused about what the difference between data science and machine learning is, and also about how big data plays a part in both fields. What exactly are data science and machine learning, and how are they related to each other (and how does big data tie into them)? Any help would be greatly appreciated.

#data-analysis #data-science #computer-science #computer-software #big-data #machine-learning #data-visualization #data #data-mining

9 answers



Kurt’s Answer

"Data science" is the broader category, and "machine learning" is a subset, kind of like how "medicine" is a broad category and "heart surgery" is a smaller subset of that discipline. So all machine learning is a type of data science, but not all data science involves machine learning... ;-)


Benjamin’s Answer

Data science is a discipline that works with machine learning and Big Data, as well as many other things. I work as a Data Scientist, and while I do use machine learning and Big Data in my job, it is not all I do. Also, you need to consider that there are different types of data scientists.


Machine learning is, at its most basic, a way of building a predictive model by feeding data to an algorithm. Let us imagine we have a list of houses that sold recently. We have two columns, one with the square footage of the house and one with the price the house sold for. We could feed this data into a machine learning algorithm and it will build a model for us. Now if I ask the model how much a 2000 sq ft house will sell for, the model will give us a price based on the list of prices we gave it. Obviously we use much more complex data sets with many, many more variables, but at the end of the day machine learning boils down to asking a computer to either classify an object (is the picture a cat or a dog?), provide a numeric value (regression - think of the house price example), or cluster (see how data should best be grouped based on attributes - think of all the students in your high school and how they can be grouped: jocks, drama club kids, nerds, popular crowd).
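To make the house price example concrete, here is a minimal sketch in plain Python. The sales figures are invented for illustration, and fitting a straight line by ordinary least squares is just one simple choice of regression algorithm, not the only one a data scientist would reach for.

```python
# Hypothetical recent sales: (square footage, sale price in dollars).
# These numbers are made up for the example.
sales = [(1000, 150_000), (1500, 200_000), (2500, 300_000), (3000, 350_000)]


def fit_line(points):
    """Ordinary least squares fit of price = slope * sqft + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, sqft):
    """Ask the fitted model for the price of a house of the given size."""
    slope, intercept = model
    return slope * sqft + intercept


model = fit_line(sales)          # "training": the model is built from the data
estimate = predict(model, 2000)  # "prediction": a house size not in the list
print(f"Estimated price for a 2000 sq ft house: ${estimate:,.0f}")
```

The 2000 sq ft house never appears in the list of sales; the estimate comes entirely from the pattern the model found in the other houses.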


Big Data refers to massive, fast-moving data sets. It is a popular term, but not all data science or machine learning involves Big Data. Twitter is a great example of big data, with millions of tweets every few minutes.


In my case, I am what you might call an operational data scientist. I work in financial compliance at Verizon, helping to hunt down people who are "gaming the system" or stealing from us by using loopholes in our policies. The biggest part of my job is finding, gathering, and cleaning data so I can analyze it. Once I have the data, I may run it through a machine learning algorithm to create a predictive model that may help us predict which people we should look at more closely (making the haystack a little smaller, so the needle is easier to find).
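As a rough illustration of that workflow (clean the data first, then use a model to shrink the haystack), here is a small Python sketch. Everything in it is hypothetical: the account fields, the records, and especially `risk_score`, which stands in for a trained model with invented weights. It is not a description of any real compliance system.

```python
# Hypothetical raw account records; None marks a missing value.
raw_accounts = [
    {"id": "A1", "refunds": 1, "plan_changes": 0},
    {"id": "A2", "refunds": 9, "plan_changes": 6},
    {"id": "A3", "refunds": None, "plan_changes": 2},
    {"id": "A4", "refunds": 4, "plan_changes": 1},
    {"id": "A2", "refunds": 9, "plan_changes": 6},  # duplicate row
]


def clean(records):
    """Drop rows with missing values and rows whose id was already seen."""
    seen, cleaned = set(), []
    for row in records:
        if None in row.values() or row["id"] in seen:
            continue
        seen.add(row["id"])
        cleaned.append(row)
    return cleaned


def risk_score(row):
    """Stand-in for a trained model's output; the weights are invented."""
    return 0.6 * row["refunds"] + 0.4 * row["plan_changes"]


def shortlist(records, top_n):
    """Return the ids of the top_n highest-scoring accounts."""
    ranked = sorted(records, key=risk_score, reverse=True)
    return [row["id"] for row in ranked[:top_n]]


accounts = clean(raw_accounts)
to_review = shortlist(accounts, top_n=2)
print(to_review)
```

The point of the shortlist is that a person still does the actual review; the model only decides where to look first.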


A big data example I worked on with another company used the voice recordings of people calling customer service. I was able to determine certain speech patterns that were more likely to be used by someone trying to commit some type of fraud. We were able to use this information to alert the customer care reps about who to be on the lookout for.

Benjamin recommends the following next steps:

Check out Kaggle, great site to learn about data science and machine learning
Check out my website analytics4all.org - It is an introductory website designed to give an overview of everything from databases to machine learning and Big Data

Michelle (Guqian)’s Answer

Data Science is a rather broad field that covers many areas, and machine learning is one of them. Data Science in the industry currently has three major tracks: analytics, generalist, and machine learning.


  • The analytics track requires minimal statistical background, but it does require keen business sense and the ability to break business problems down into different aspects and do deep dives. The major skills needed for this track are data pulling, data processing, and dashboarding.
  • The generalist track requires you to solve a business or product problem end-to-end. You need to understand the real problem, have good business sense, and come up with a solution using a statistical or modeling approach.
  • The machine learning track requires you to understand the problem, figure out which ML techniques and models are suitable, and fine-tune them with reasonable performance evaluation. You would also need to know how to get your model built into the product, how to evaluate its real-time performance, etc. Sometimes it is not simply a matter of building one model; it can become an ML system design problem that involves multiple components.
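To give a feel for what "performance evaluation" means in the machine learning track, here is a minimal Python sketch that scores a classifier's predictions against the true labels. The labels and predictions are made up for illustration, and accuracy, precision, and recall are just three common metrics among many.

```python
# Hypothetical true labels and model predictions (1 = positive, 0 = negative).
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]


def evaluate(actual, predicted):
    """Compute accuracy, precision, and recall for binary predictions."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a == 1 and p == 1)
    false_pos = sum(1 for a, p in pairs if a == 0 and p == 1)
    false_neg = sum(1 for a, p in pairs if a == 1 and p == 0)
    correct = sum(1 for a, p in pairs if a == p)
    return {
        # Share of all predictions that were right.
        "accuracy": correct / len(pairs),
        # Of the items flagged positive, how many really were positive.
        "precision": true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0,
        # Of the truly positive items, how many the model caught.
        "recall": true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0,
    }


metrics = evaluate(actual, predicted)
print(metrics)
```

Which metric matters most depends on the product: missing a positive case and raising a false alarm usually have very different costs.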

Michelle (Guqian) recommends the following next steps:

Read some articles on how Data Science techniques are used in the industry, and get a rough idea of which area you're more interested in.
Machine learning is also a broad area, and you could go really deep with it if you want. You could take Andrew Ng's famous Machine Learning course on Coursera to see whether you like it! https://www.coursera.org/learn/machine-learning

Kulwinder’s Answer

Data science covers the algorithms as well as the methodology for processing the data as a whole.

Machine learning involves implementing different algorithms on the data to get the best output.


Yi’s Answer

Machine learning is a very specific set of algorithms, such as gradient boosting and random forests.
People use the term data science more broadly: it definitely includes machine learning and AI, and it can also include more traditional modeling and statistics.

Alessandro’s Answer

Data Science is a broad container for data storage, processing and modeling encompassing multiple disciplines like mathematics, statistics, and computer science. Machine learning is a specific set of methods and tools to learn from data and infer knowledge.

Every day there are new buzzwords being introduced to refer to the same technology. My advice, if you are interested in learning technology, is to stay away from tech marketing and focus on the fundamentals: mathematics, statistics, computer science!

Bonnie’s Answer

Tagging on to Benjamin’s answer, I recommend Udacity’s free online courses. They have a free Stanford Introduction to Machine Learning course and Intro to Data Science courses.

Bonnie recommends the following next steps:

Visit Udacity.com and choose from a list of over 200 free courses

Mohamed’s Answer

Machine learning
Machine learning creates a useful model or program by autonomously testing many solutions against the available data and finding the best fit for the problem. This means machine learning is great at solving problems that are extremely labor intensive for humans. It can inform decisions and make predictions about complex topics in an efficient and reliable way.
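The idea of "testing many solutions against the available data and finding the best fit" can be sketched in a few lines of Python. This is a deliberately simplified illustration: the data points are invented, the model is a single multiplier `w` in `y = w * x`, and trying every candidate in a fixed grid is only a stand-in for the smarter search methods real algorithms use.

```python
# Hypothetical observations: (input, observed output).
data = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]


def squared_error(w, points):
    """Total squared gap between the candidate model y = w * x and the data."""
    return sum((w * x - y) ** 2 for x, y in points)


# Candidate solutions: w = 0.0, 0.1, 0.2, ..., 4.0
candidates = [i / 10 for i in range(0, 41)]

# Test every candidate against the data and keep the one with the lowest error.
best_w = min(candidates, key=lambda w: squared_error(w, data))
print(f"Best multiplier found: {best_w}")
```

No human picked the winning value; the program scored each candidate against the data and kept the best one, which is the labor-saving part described above.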

These strengths make machine learning useful in a huge number of different industries. The possibilities for machine learning are vast. This technology has the potential to save lives and solve important problems in healthcare, computer security and more. Google, always on the cutting edge, has decided to integrate machine learning into everything they do to stay ahead of the curve.

Data Science Process
The proliferation of smartphones and digitization of so many parts of daily life have created massive amounts of data. At the same time, the continuation of Moore’s Law, the idea that computing would dramatically increase in power and decrease in relative cost over time, has made cheap computing power widely available. Data science exists as the link between these two innovations. By combining these components, data scientists can derive more insight from data than ever before.

The practice of data science requires a unique combination of skills and experience. A good data scientist is fluent in programming languages like R and Python, has knowledge of statistical methods and an understanding of database architecture, and has the experience to apply these skills to real-world problems. A master's in data science may build upon existing knowledge to ensure that you are best prepared for a long career in this ever-growing field.

Data Scientist vs Machine Learning Engineer

Skills Needed for Data Scientists
Statistics
Data mining and cleaning
Data visualization
Unstructured data management techniques
Programming languages such as R and Python
Understand SQL databases
Use big data tools like Hadoop, Hive and Pig

Skills Needed for Machine Learning Engineers
Computer science fundamentals
Statistical modeling
Data evaluation and modeling
Understanding and application of algorithms
Natural language processing
Data architecture design
Text representation techniques

source: https://www.mastersindatascience.org/careers/data-science-vs-machine-learning/

Henry’s Answer

Data science is just an umbrella term that encompasses everything from data analytics to machine learning. Generally speaking, it just means using data (analysis/engineering) to find, analyze, and act on information in order to accomplish an objective. Machine learning is a fairly broad set of techniques as well, but generally refers to algorithms that allow systems to automatically respond to new data inputs by learning from previous experience.