How to become a data scientist?
I am going to a liberal arts college in a fringe town next school year. The school does not have a data science major, but I am highly interested in data science. (I will be double majoring in Computer Science and Statistics.) What steps should I take to become a data scientist? Is it hard to find a data science internship in the summer? What are some tips for my four years of college?
Hi Yingyi, this is a great question! A double major in Comp Sci and Stats is a fantastic foundation for this area. The great news is that there are many different ways to start working in data science, and the field draws on many different skill sets. One suggestion is to join a group like WiDS - there is a Women in Data Science (WiDS) group that holds regional events - one was just in March. Check out the speakers to get an idea of how broad the possibilities and talent really are: http://www.widsconference.org/speakers.html
You will start to see more and more job opportunities in this field - your majors are a great springboard for big data, data science, or any of the many related data/STEM careers. There are also some great online programs at https://www.coursera.org (see https://www.coursera.org/specializations/jhu-data-science).
You can also watch the WiDS videos: http://www.widsconference.org/videos.html (take notes on all the projects you like and start narrowing in on the areas you would like to pursue). One thing to consider is starting a Data Science group/club at your school and watching the videos together as a club activity. This is what we did at our campus - we watched some of the conference together and added some local speakers and a networking event. It was so much fun, and we met some great new peers on our own campus through this activity. It doesn't have to cost a lot either: get some chips and cookies and fire up the laptops. Find a faculty advisor to be a champion, and include other areas/departments. Data science has so many applications in almost every industry.
Victoria recommends the following next steps:
Great question, and I'm excited to see you're interested in Data Science. Data Science is not really a unified field and the term is a bit buzzwordy. Data scientists come from all sorts of backgrounds (computer science, political science, physics, statistics, even Creative Writing [me!]) and work on all sorts of problems. On one hand you may have a data scientist working with advertising data, trying to predict target audiences and click-through rates, and on the other a data scientist using deep learning to analyze MRI scans for cancerous tumors. And there are all sorts of things in between. In my role as a Data Scientist at Talla, I research machine learning and deep learning techniques to solve natural language problems like teaching a computer to read text, answer questions, and classify large corpora of data.
You shouldn't worry about not having a Data Science major; your double major covers the majority of the baseline knowledge you'll need. Here are a few classes you should take to build skills fundamental to most data science work:
- Intro to Probability and Statistics
- Linear Algebra
- Data Structures and Algorithms
- Data Visualization
- Introduction to Causal Inference
Advanced (choose whichever look the most interesting):
- Machine Learning
- Deep Learning with Neural Networks
- Natural Language Processing
- Computer Vision
The best way to get a sense of what it's like to be a data scientist is to do data science. Great resources for this are Kaggle and DrivenData. Both host open competitions and challenges that anyone can participate in. You should look through the challenges, see which look interesting, download the data, and start building models. Kaggle also has a great set of tutorials that will teach you the basic skills of working with data, looking for patterns, and building models. The great thing is you don't need a college degree, just jump in. The more hands-on experience you get working with data, the better.
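To make "download the data and start building models" a bit more concrete, here is a minimal sketch of what a first pass might look like, using only the Python standard library. Everything specific in it is an assumption for illustration: the tiny inline dataset, the column names `sqft` and `price`, and the choice of a one-feature least-squares line are all made up, and a real competition file would be much larger and messier. The idea is just to load the data, hold some of it out, fit something simple, and check whether it beats a naive guess.

```python
import csv
import io
import statistics

# Hypothetical stand-in for a downloaded competition file. In practice you
# would read the real file, e.g. with open("train.csv", newline="").
SAMPLE_CSV = """sqft,price
800,160
1000,195
1200,240
1500,310
1800,355
2000,405
"""

def load_rows(text):
    """Parse CSV text into a list of (feature, target) float pairs."""
    reader = csv.DictReader(io.StringIO(text))
    return [(float(r["sqft"]), float(r["price"])) for r in reader]

def fit_line(pairs):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_abs_error(pairs, predict):
    """Average absolute gap between predictions and true targets."""
    return statistics.fmean(abs(predict(x) - y) for x, y in pairs)

rows = load_rows(SAMPLE_CSV)
train, test = rows[:4], rows[4:]  # simple holdout split: fit on 4, check on 2

slope, intercept = fit_line(train)
baseline = statistics.fmean(y for _, y in train)  # always guess the training mean

model_mae = mean_abs_error(test, lambda x: slope * x + intercept)
baseline_mae = mean_abs_error(test, lambda x: baseline)
print(f"model MAE: {model_mae:.1f}, baseline MAE: {baseline_mae:.1f}")
```

Comparing against a "guess the average" baseline is a useful habit: if a model can't beat that on held-out data, it probably hasn't found a real pattern. Once this kind of loop feels comfortable, the Kaggle tutorials show how to do the same thing with libraries built for it.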
Closely related to that point: document and share the data science work you do. It will be very useful to show potential employers a portfolio of your work where you can point to specific examples of the tools and techniques you used and the analysis you did. You can store code samples on GitHub and write blog posts on Medium to highlight cool findings.
On internships, it honestly depends on where you are located. I'd recommend talking to your school's career center for help here.
Another piece of advice: go on LinkedIn, look for data scientists in your area, and reach out to them. Say you are a student, and ask if they have time for a phone call or coffee to answer some questions. Professionals in general like providing advice and helping out students, so feel free to ask about what they do, things they've learned, and any advice they have. At the end of the conversation you might also want to ask if there are any internship opportunities and if they can recommend you.
And finally, in terms of general advice, study abroad if you have the chance!
Take some humanities and social science classes, especially ethics, philosophy, and social or political science. Our technologies have far outpaced social norms and policies. As a data scientist you may find yourself working on human data (customers, patients, and other categories of people), and your models will have repercussions on real lives. It's important to the general welfare of society to be cognizant of the potential unintended consequences and implications of your work.
Hope this was helpful. Feel free to reach out to me directly if you have questions about Data Science or post here for more advice. Good luck!
Dhairya recommends the following next steps:
My consistent advice is that for any given role, there are a lot of related jobs around it on the spectrum that require slightly different skills or personalities. For example, most people think of the stereotypical doctor when they pursue that career. Then there are specializations like ER doctors, surgeons, pediatricians, etc. But then there are even more "shades", like radiology, where you may not interact with patients at all, or administration, where you're focused on the operations of the hospital. These all take a grounding in medicine, but the personality types are vastly different for every one of them.
In summary, once you're exposed to a field, your ability to blend your personality/interests into that field can help you find that area where you'll be most successful in your chosen career.
Many job postings list advanced degrees as requirements for data-related positions. Sometimes that’s non-negotiable, but as demand outstrips supply, and given the often specialized, highly technical nature of the work, the proof is increasingly in the pudding. That is, data skills often outweigh mere credentialism. What’s most important to hiring managers is an ability to demonstrate mastery of the subject in some way, and it’s increasingly understood that this demonstration doesn’t have to follow traditional channels.
In the end, there’s no single path toward becoming a Data Analyst, and that’s good news if you’re hoping to land a data analysis role. Because Data Analysts can work across many different industries, may be generalists or highly specialized, and often play an interdisciplinary role in a company, even job titles in data analysis can be quite varied.