
How should I build my Data Science Portfolio to find a job?

I have a Bachelor's degree in CS and 4 years of work experience, and I am currently pursuing an MS in Business Analytics at The University of Texas at Dallas. I want to build a strong portfolio for data science jobs. I am seeking a mentor to help me work on some good projects or provide guidance.
#data-science #machine-learning #data-analytics #job


5 answers



Matt’s Answer

After interviewing over 1000 data scientists and engineers, I've found the strongest portfolios consist of highlighted blog posts that demonstrate the cool things you've worked on. Even better is when the blog posts link to public notebooks (e.g., Jupyter notebooks on GitHub). As a hiring manager, I always appreciated being able to briefly read someone's technical writing to get a feel for their professional skills, including their aptitude for communicating complex technical work.


These next steps highlight a strategy that I think could help a student build such a portfolio. Let me know what you think and if I could better clarify anything.


Wish you the best in building your portfolio!

Matt recommends the following next steps:

Find an interesting data problem. E.g., a Kaggle competition or an analysis using Reddit data from files.pushshift.io. Focus on something you find interesting so your excitement will show when you share your results with other professionals.
Start small. Spend 2-4 hours working with the data and see what simple things you can learn. Put together a small notebook (e.g., a Jupyter notebook) that highlights your initial work and what you learned, and share it on GitHub or Kaggle. Lastly, write a short blog post sharing your initial work. Aim for something people can read in less than 5 minutes.
Iterate. Extend your work to apply more sophisticated methods and learn other cool things about the data. Aim to share 2-5 more notebooks and blog posts to build a portfolio of work around this interesting problem space.
Summarize. Write a blog post that introduces the problem space and discusses what you learned. Discuss methods only at a high level and link to your earlier blog posts for the details. Aim to write something that a professional less experienced in data science could appreciate, while learning a bit about your problem space and the methods.
Optional: Share with our community and get feedback. Share your work on platforms like Kaggle, Reddit, Hacker News, etc. Look for feedback, both positive and constructive. Also recognize that your work may help teach less experienced professionals.
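To make the "start small" step concrete, here is a minimal sketch of what a first pass over a dataset might look like. The inline CSV, its column names (`subreddit`, `score`, `num_comments`), and the `summarize` helper are all hypothetical stand-ins for whatever data you actually pick; the idea is only to show how little code a first notebook cell needs.

```python
import csv
import io
import statistics
from collections import Counter

# Hypothetical sample standing in for a real dataset (e.g., a downloaded CSV).
SAMPLE_CSV = """subreddit,score,num_comments
python,120,14
python,35,3
datascience,410,52
datascience,8,0
machinelearning,77,9
"""


def summarize(csv_text):
    """Return the row count, per-group counts, and basic score statistics."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    scores = [int(row["score"]) for row in rows]
    return {
        "n_rows": len(rows),
        "posts_per_subreddit": dict(Counter(row["subreddit"] for row in rows)),
        "mean_score": statistics.mean(scores),
        "median_score": statistics.median(scores),
    }


summary = summarize(SAMPLE_CSV)
for key, value in summary.items():
    print(f"{key}: {value}")
```

A few simple numbers like these are often enough to seed the first short blog post; more sophisticated methods can wait for the "iterate" step.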

David’s Answer

I think Kaggle is an excellent place to learn about data science, build your skills, and showcase them. I did a Kaggle contest a few years ago and it was _very_ exciting, and it gave me something very _concrete_ to point to as proof of what I could do.

David recommends the following next steps:

Take a look at Kaggle contests and educational resources. Pick a contest and make a baseline entry in it.
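As one possible reading of "a baseline entry" (the answer does not say which kind), here is a sketch of a majority-class baseline: always predict the most common label seen in training. The labels below are made up for illustration, and `fit_majority_baseline` and `accuracy` are hypothetical helper names, not part of any Kaggle tooling.

```python
from collections import Counter


def fit_majority_baseline(labels):
    """Return the most common label in the training data."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(predictions, truth):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(predictions, truth)) / len(truth)


# Hypothetical labels; in a real contest these come from the provided files.
train_labels = [0, 0, 1, 0, 1, 0]
valid_labels = [0, 1, 0, 0]

majority = fit_majority_baseline(train_labels)
predictions = [majority] * len(valid_labels)
baseline_accuracy = accuracy(predictions, valid_labels)
print(f"majority class: {majority}, validation accuracy: {baseline_accuracy:.2f}")
```

A baseline like this gives you a number that every later model has to beat, which also makes for a natural first notebook.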

Chhavi’s Answer

I would also recommend checking with the Econ and Econometrics groups at your university to see whether they have projects or data problems that need solving; these would give you some hands-on practice. Given the current scenario, you may have to look for data problems yourself; some resources could be the SAS and GitHub websites. LinkedIn Learning and YouTube videos can provide additional training material. What would really help, though, is practicing on an actual problem or dataset once you feel you have learned a few steps.
When it comes to visualization, there is a book called "Say It with Charts" that will help you present data visually using tools like Tableau and QlikView.

Sahar’s Answer

You'll first want to create a GitHub repository. Keep your projects there, and make sure they are well documented, with good READMEs that explain the scope, purpose, and outcomes of each project. A blog post (per previous suggestions) is nice but not necessary, because you can relay the same information in your GitHub README. Sometimes maintaining a blog in addition to the code base is an extra step that keeps folks from doing anything. It's better to just start doing projects instead of worrying about how to make everything perfect (in other words, having an imperfect SOMETHING is better than having nothing).

To find actual projects, you can search on data science/machine learning contest sites like kaggle.com or just do general internet searches for things like "machine learning project".