What are the best data science courses/nanodegrees I can take online?
I'm currently a third-year computer science major, and I've recently discovered a passion for data science and machine learning. I've been trying to learn more about it, and I've found many online resources (Udacity's Data Analyst and Machine Learning Nanodegrees, Stanford's Intro to Machine Learning course taught by Andrew Ng, and various courses offered on Udemy). With all of these great resources, which one would be best for someone who is comfortable programming and wants to get more practical experience? Any advice would be greatly appreciated, even more so if you are a current data scientist!
#machine-learning #data-analysis #data #big-data #computer-science #computer-software #data-science #data-mining #data-visualization
The most important thing I've learned in my career is the importance of real-world application. Regardless of whether you obtain data/ML skills from a degree program, an online course, or other means, once you land a job in this field the most important skill you can have is translating real-world use cases into the technology, and vice versa. I recommend finding courses that have real-world examples or taking on projects of your own. During job interviews, most teams want to hear actual stories and examples of how you've applied your learning rather than simply what you know. It can also be helpful to learn some of the industry tools (Splunk, Tableau, etc.) so you can hit the ground running in your future roles.
Logan recommends the following next steps:
Hi Albert, you ask a great question. There are so many choices that narrowing them down can be difficult. I have taken the Intro to Programming Nanodegree from Udacity and, based on my experience, I recommend shortlisting it. Why? Udacity is going to teach you about the subject and then ask you to demonstrate your learning with a practical application. These applications are scaled-down versions of what you might expect to be asked to do in the real world. They will be evaluated, and you will learn how to improve if needed. When you complete the training, you'll have a portfolio of accomplishments that you can demonstrate. More importantly, when you finish and have your certificate, you'll likely be approached by leading technology companies interested in offering you a job. Who backs Udemy's courses versus Udacity's, for example? Look at who is endorsing these courses as a way to narrow them down. Having said that, you can also take a number of the free courses elsewhere and then return to the paid offerings. Yes, investing ~$650 at Udacity can lead to a potential career. :)
Douglas recommends the following next steps:
I've been working as a software engineer for a long time and I'm now making a career change into data science and machine learning. So I've been asking myself the same question.
I've spent a lot of time on courses and they sure have been helpful. However, in retrospect I feel I've spent too much time on courses and theory and too little on solving real-world problems.
Sure, you need to lay some groundwork, but don't overdo it. The problem with studying too much is that for knowledge to really stick it has to be applied. So it's best to alternate between learning and applying.
If I now had to do it all over again this is what I would do:
- Learn the basics at Kaggle Learn. This is the groundwork. An alternative would be Andrew Ng's machine learning course on Coursera. Or have a look at machinelearningmastery, which provides a lot of practical information.
- Start working on a problem in one of the competitions.
- Study top solutions in completed competitions. Top contenders will provide a writeup on their approach and sometimes also the code. It's amazing and humbling...
- If you feel you lack some specific knowledge or find something interesting, then find a course that will help you master that specific subject. But only extract the things from the course that you currently need.
Once you have gained the basic knowledge and experience, go deeper. Maybe you're interested in a specific field like NLP or finance; then dive into that. Explore more models and the math. Maybe read some research papers.
Happy ML journey
Stefan recommends the following next steps:
Hi Albert, you chose a great field to study. I have taken many online courses, including the ones you mentioned, such as the Udacity Machine Learning Nanodegree and Andrew Ng's machine learning course on Coursera. In addition to that, I also took the Deep Learning Nanodegree on Udacity and the Deep Learning Specialization taught by Andrew Ng on Coursera. Overall, they are good-quality courses. However, there is some difference between data science and machine learning.
Data science relates more to mathematics and statistics. You focus more on explanatory analysis. You will analyze lots of data, try to find the underlying relationships in it, and explain them. Explanation plays a huge role for a data scientist. Not only do you need to build a great model, but you also need to explain it in business terms and try your best to find the causal factors.
Machine learning relates more to computer science. You focus more on building models with smaller errors. Sometimes the model you build is a black box, and you won't be able to understand the root cause of what makes it bad or good. In other words, a machine learning engineer favors accuracy over explanation. Because of this, a machine learning engineer will spend quite a large amount of time on parameter tuning and model training.
With that being said, if you want to be a data scientist, you may need to learn more about statistics and math. The data analysis nanodegree on Udacity is a great start. If you want to be a machine learning engineer, you need to be really good at programming. You will most likely be using Python for it. Proficiency in Java or C# would also be a great plus, because eventually you will productionize your models. After you gain proficiency in programming, I recommend the machine learning course taught by Andrew Ng on Coursera as a start. If you want more advanced material, you can take the Deep Learning Specialization taught by Andrew Ng.
Since you are a CS major, I assume you want to be a machine learning engineer. I worked as a data scientist at Airbnb. Here in the San Francisco Bay Area, a machine learning engineer interview will still test you mostly on coding skills rather than modeling. So being good at coding can really help you land a job as an MLE. I hope this helps you :)
Ji recommends the following next steps:
This is a great question that I come across in just about every career fair I go to these days. I agree with the two previous answers. Most courses today will provide you with a relatively comprehensive introduction to practical data science. What I find lacking in most new graduates applying for a job is the experience of applying these concepts to real-world problems (I really want to emphasize real-world here).
I know it is unrealistic to expect real-world experience from a fresh college graduate. Moreover, a lot of academic and course problems are tailored to the topics being taught and are far from representative of what a data scientist may do on the job. In the real world, machine learning and data science problems in general tend to be messy and complex. So once you are done with the course, I highly recommend that you get on Kaggle (Kaggle.com) and start competing. The more of these competitions you participate in, the more familiar you will get with the intricacies that real-world problems bring.
I interview over 30 new graduates with a data science background a year. Almost always, I find that those who have extensive experience participating in Kaggle have a much easier time passing the interview and ramping up on the job, and they go on to grow and have a much greater impact on the team compared to those who learn on the job.
I by no means intend to suggest that one can't go on to succeed by just doing the courses. However, I am assuming you really want an advantage that gets you noticed by an employer and want to succeed in your passion. Kaggle will get you closer to that.
Lastly, even Kaggle problems aren't exactly real-world problems. They are cleaned to ensure sensitive data isn't exposed to the public and they only focus on a small scope of the larger problem so they are solved in a reasonable timeframe.
Yohannes recommends the following next steps:
Hey Albert, awesome to see you're interested in data science. I've taken Udacity courses (the machine learning nanodegree) and a bunch of other classes online. Since you're still in college, I would recommend seeing what classes your college offers for machine learning/data science. If you haven't already, you should take classes in statistics, probability, and linear algebra to shore up your mathematical foundations. It's much easier to do this when you're in college and don't have distractions than trying to re-learn it on your own when you leave.
In terms of Udacity, I'm a huge fan of their programs. The machine learning nanodegree is the more relevant program for you (data analyst is for folks coming from a non-technical and non-mathematical background wanting to do simple descriptive stats and analysis). You can technically take the majority of the courses for free. The nanodegree just adds in some additional applied exercises. As a heads up, all the coding exercises and Jupyter notebooks for the nanodegree are on GitHub (https://github.com/udacity/machine-learning). If you want feedback and grading, you'll have to pay for the courses. Also check out fastai's courses on deep learning and machine learning (http://course.fast.ai/) and (http://forums.fast.ai/t/another-treat-early-access-to-intro-to-machine-learning-videos/6826). Jeremy Howard does a great job of teaching applied data science concepts and provides lots of practical advice on building and deploying data science models.
I am a data scientist who focuses on applying machine learning and deep learning to NLP problems. Here are a few pieces of advice from a practitioner's point of view. Learn how to properly split up your data into train/validation/test sets and how to use cross-validation to improve your models. The free Udacity machine learning course (the Sebastian Thrun one) is fantastic on this point. More advanced data scientists will understand how to develop training data that is actually representative of the real world. Andrew Ng's Machine Learning Yearning (http://www.mlyearning.org/) is a great resource for practical tips and pointers about using machine learning in production. 90 percent of your life will be data processing. So if you haven't already, spend some time getting familiar with Python, pandas, and numpy, and learn how to clean, format, and handle different data formats. Finally, practice a lot. Do competitions on Kaggle and other data science competition sites. Don't worry about doing well; just focus on getting lots of experience handling data and building intuitions.
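To make the train/validation/test and cross-validation advice a bit more concrete, here is a rough sketch in plain Python. It is only an illustration: the function names (`split_indices`, `k_fold_indices`), the 70/15/15 proportions, and the choice of 5 folds are my own assumptions, not something taken from the courses above. In practice you would more likely reach for a library such as scikit-learn, which ships utilities for this (for example `train_test_split` and `KFold`).

```python
import random


def split_indices(n_rows, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle row indices once, then cut them into train/validation/test.

    The fractions are illustrative defaults, not a recommendation.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    n_test = round(n_rows * test_frac)
    n_val = round(n_rows * val_frac)
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test


def k_fold_indices(indices, k=5):
    """Yield (fit_indices, held_out_indices) pairs for k-fold cross-validation.

    Every index lands in the held-out fold exactly once.
    """
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        held_out = folds[i]
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, held_out


train, val, test = split_indices(100)

# Cross-validate on everything except the test set, which stays untouched
# until the very end so it gives an honest estimate of real-world performance.
cv_pool = train + val
folds = list(k_fold_indices(cv_pool, k=5))
```

The point of the design is that the test indices never appear in any fold, so whatever model selection or tuning you do with the folds cannot leak into your final evaluation.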
Feel free to ask more specific questions if you have any. Good luck!