7 answers
433 views
What do you recommend learning/doing as a beginner in data science?
I'm looking for free resources where I can actually learn by doing something meaningful. Codedex blocks you after a couple of lessons, so it is not my speed. I want to learn TensorFlow, PyTorch, SQL, etc., and make projects. Any recs?
Laila’s Answer
Hi Priya! Here are some great free resources to get you started:
freeCodeCamp, Kaggle Learn, and Google's Machine Learning Crash Course offer free, project-based learning with no hidden fees.
For learning SQL, check out SQLZoo and Mode Analytics SQL Tutorial. If you're interested in TensorFlow and PyTorch, their official tutorials are beginner-friendly and a great place to start.
Once you grasp the basics, dive into building projects on Kaggle using real datasets. That's where you'll really learn and grow. You've got this!
Srinivasa’s Answer
Here are a few insights:
• SQL (Structured Query Language) works only with structured data organized into tables of rows and columns
• SQL basics are essential for data analysis; focus especially on the data functions listed below
• Aggregate Functions: Summarize data using COUNT(), SUM(), AVG(), MAX(), and MIN().
• Also focus on CASE statements, all join types, logical operators, and string and date functions
• You should know the basic selection commands: SELECT, filtering with WHERE conditions, and GROUP BY
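To make the list above concrete, here is a minimal sketch that runs those SQL features through Python's built-in sqlite3 module, so there is nothing to install. The `customers` and `orders` tables and every value in them are made up for illustration, and other databases spell some functions differently (`strftime` is SQLite's date formatter).

```python
import sqlite3

# Hypothetical sample data, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER,
                     amount REAL, order_date TEXT);
INSERT INTO customers VALUES (1, 'Ana', 'East'), (2, 'Ben', 'West'), (3, 'Cy', 'East');
INSERT INTO orders VALUES
    (1, 1, 120.0, '2024-01-05'),
    (2, 1,  80.0, '2024-02-10'),
    (3, 2, 300.0, '2024-01-20'),
    (4, 3,  40.0, '2024-03-02');
""")

# SELECT + JOIN + WHERE (with a logical operator) + GROUP BY + aggregates + CASE.
rows = conn.execute("""
SELECT c.region,
       COUNT(*)      AS n_orders,
       SUM(o.amount) AS total,
       AVG(o.amount) AS average,
       MAX(o.amount) AS largest,
       CASE WHEN SUM(o.amount) >= 250 THEN 'high' ELSE 'low' END AS tier
FROM orders AS o
JOIN customers AS c ON c.id = o.customer_id
WHERE o.order_date >= '2024-01-01' AND o.amount > 50
GROUP BY c.region
ORDER BY c.region
""").fetchall()

# A string function (UPPER) and a date function (strftime) on the same tables.
monthly = conn.execute("""
SELECT strftime('%Y-%m', o.order_date) AS month, UPPER(c.name) AS customer, o.amount
FROM orders AS o
JOIN customers AS c ON c.id = o.customer_id
ORDER BY o.order_date
""").fetchall()

for row in rows:
    print(row)
for row in monthly:
    print(row)
```

Try changing the WHERE condition or grouping by customer name instead of region to see how the aggregates change.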
Best of luck!!
Liam’s Answer
If you need a Codex replacement, try Ollama and Claude Code instead. I think Claude Code and qwen-coder on a low-power computer work as an entry-level AI code assistant. Look for open-source solutions for your self-learning, not only because they are free but because most companies use them anyway. Set up a small homelab, start dumping data into it, and figure out what you want to learn from there.
check out:
ollama
openwebui
claude code (on ollama)
qwen-coder
phi
granite
chromaDB
qdrant
Try it on a decent desktop PC and you basically have a low-powered version of Codex. Build the homelab and start from the bottom up, and you will learn more than you would with the higher-powered commercial tools.
https://www.youtube.com/live/TsyAsrnYnhQ
https://youtu.be/yoze1IxdBdM
https://docs.ollama.com/integrations/claude-code - a 7B-parameter model will run on a lot of dedicated GPUs
https://ollama.com/library/granite4, https://ollama.com/library/phi4-mini-reasoning
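Once Ollama is running locally, you can talk to a model from Python using only the standard library. This is a rough sketch, not an official client: it assumes the server is listening on Ollama's default address (localhost:11434) and posts to its /api/generate endpoint. The helper names are my own, and the model tag in the usage note is only an example, so swap in whatever you have pulled.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for the /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        # With "stream": False the server replies with a single JSON object
        # whose "response" field holds the full completion.
        return json.loads(resp.read())["response"]
```

With the server up and a model pulled, something like `ask_local_model("qwen2.5-coder:7b", "Write a SQL query that counts rows per region.")` should return the model's answer as a string.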
Kristen’s Answer
Hi! YouTube is a great place to find free courses on topics like Python, Power BI, and other analytical tools. LinkedIn also offers free courses, and some even come with a certificate. You can search online for more free courses or use AI tools like ChatGPT to help you find them. If a course offers a certificate, it's a good idea to choose it.
Maximus’s Answer
I wouldn’t jump straight into TensorFlow or PyTorch right away. It’s way easier if you build a solid foundation first. Start with Python, get comfortable working with data, and learn some SQL—then move into machine learning.
If you follow a good order, things make a lot more sense. Begin with Python (that’s non-negotiable), then learn data analysis with Pandas and NumPy. After that, spend some time on SQL since it’s super important for real jobs. Once you’ve got that, pick up the basics of statistics and probability, then move into machine learning. Only after that should you dive into TensorFlow or PyTorch.
A lot of people get stuck because they rush into deep learning too early—it just ends up being confusing without the fundamentals.
For free resources, there are some really solid ones. Kaggle is probably the best overall because you can learn and practice at the same time with real datasets and projects. Google Colab is great too since you don’t have to set anything up and you can run everything in your browser, even with free GPU access. freeCodeCamp has full courses on Python and machine learning, Khan Academy is excellent for understanding stats, and fast.ai is great once you’re ready for more practical deep learning.
If you look at what people actually recommend in the real world, it’s usually something like: use Kaggle to practice data work, learn SQL properly, and only then move into machine learning frameworks. That path works.
The biggest thing, though, is to actually build stuff early. Try making a simple movie recommender, do a sales or dashboard project using SQL and Python, or build a basic model that predicts something. Use Kaggle datasets and put your projects on GitHub—those matter way more than certificates.
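To give a sense of how small a first project can be, here is a rough sketch of a movie recommender in plain Python. It uses one common beginner approach (user-to-user cosine similarity with a similarity-weighted average), not the only way to do it. The ratings and names are invented for illustration; a real project would load a Kaggle dataset instead.

```python
from math import sqrt

# Hypothetical ratings on a 1-5 scale, invented for illustration.
ratings = {
    "ana": {"Alien": 5, "Heat": 4, "Up": 1},
    "ben": {"Alien": 4, "Heat": 5, "Up": 2, "Jaws": 5},
    "cy": {"Alien": 1, "Up": 5, "Jaws": 2, "Coco": 3},
}


def cosine(a: dict, b: dict) -> float:
    """Cosine similarity computed over the movies both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = sqrt(sum(a[m] ** 2 for m in shared))
    norm_b = sqrt(sum(b[m] ** 2 for m in shared))
    return dot / (norm_a * norm_b)


def recommend(user: str, k: int = 3) -> list:
    """Rank movies the user has not seen by similarity-weighted average rating."""
    seen = ratings[user]
    totals, weights = {}, {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = cosine(seen, theirs)
        if sim <= 0:
            continue  # nothing in common, so this user tells us nothing
        for movie, rating in theirs.items():
            if movie not in seen:
                totals[movie] = totals.get(movie, 0.0) + sim * rating
                weights[movie] = weights.get(movie, 0.0) + sim
    ranked = sorted(totals, key=lambda m: totals[m] / weights[m], reverse=True)
    return ranked[:k]


print(recommend("ana"))
```

Here "ben" has tastes close to "ana", so his high rating for Jaws counts for more than the low rating from "cy". Swapping the toy dictionary for a real ratings file is a natural next step.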
A simple timeline could look like this: spend the first couple of weeks on Python, the next couple on Pandas and SQL, then move into machine learning in your second month, and by the third month start exploring TensorFlow or PyTorch while building projects.
That approach will take you a lot further than trying to learn everything at once.
Chen’s Answer
The answers above cover most of the good resources for learning data science topics. I just want to add a few things:
- As one of the other answers mentioned, once you have some basic knowledge and coding skills, start working on projects to put them into practice. For that, I recommend checking out Kaggle competitions.
- Besides Python and PySpark, make sure you know how to use agentic AI to help with coding. I have been a data scientist for some years, and today I write less code myself. At work we have GitHub Copilot, which improves productivity significantly. These days a lot of coding work can be done better by Copilot, Claude Code, etc. Make sure your knowledge of these tools and your skills with them stay up to date.
- As for online learning platforms, I personally like taking courses from Coursera, but they are not all free.
Good luck!
Sandeep’s Answer
Hello Priya,
As a beginner in data science, start with the fundamentals: Python, basic statistics, and SQL. Once you are comfortable with them, you can move on to advanced tools like TensorFlow and PyTorch and apply what you have learned in real projects.
Good free resources include:
- Khan Academy for statistics
- freeCodeCamp for Python and SQL
- Coursera and edX for beginner data science courses (many offer free audit options)
Start small: build projects like data visualizations, simple models, or SQL dashboards to gain practical experience.
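As an example of what a "simple model" can look like, here is a minimal sketch that fits a straight line by ordinary least squares using only Python's standard library. The hours-versus-score numbers are invented for illustration, and in practice you would likely reach for NumPy or scikit-learn once you are comfortable with the idea.

```python
from statistics import mean

# Hypothetical data: hours studied vs. exam score, invented for illustration.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 65, 70, 78]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


slope, intercept = fit_line(hours, scores)
predicted = slope * 6 + intercept  # predict the score for 6 hours of study
print(f"score = {slope:.1f} * hours + {intercept:.1f}; 6 hours -> {predicted:.1f}")
```

Writing the formula out by hand once makes it much clearer what the library versions are doing for you later.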