
23. 🏭Data Science for Beginners

· 5 min read
Bethany Jepchumba


🗓️ Day 23 of #30DaysOfAzureAI

Foundations of Data Science: Workshops for Beginners

Yesterday we learned about the Azure MLOps (v2) Solution Accelerator. Today is for anyone starting out on their Data Science journey. The Data Science for Beginners curriculum is a 10-week, 20-lesson course, so dive in!

🎯 What we'll cover

  • Data Science for Beginners curriculum.
  • Data science principles, including ethics, preparation and visualization.



🚌 Introducing Data Science for Beginners

Today's article introduces the free Microsoft Data Science for Beginners curriculum, a 10-week, 20-lesson course. Each lesson includes pre-lesson and post-lesson quizzes, written instructions, a solution, and an assignment. The course is designed for beginners and assumes no prior knowledge of data science.

The course covers basic principles of data science, including ethical concepts, data preparation, data visualization, data analysis, real-world use cases of data science, and more.

Data Science for Beginners Curriculum

According to Wikipedia, Data Science is a field that uses scientific methods to extract knowledge and insights from structured and unstructured data, and to apply those insights across a broad range of application domains. Data surrounds us in everything we do, from the clicks we make on websites to the physical notes we keep. Using Data Science, you can extract insights from this data and turn them into actionable steps, for example in decision making.

The Data Science for Beginners Curriculum is a gentle introduction to the world of data and how you can manipulate it to extract insights. The curriculum spans 10 weeks (about two and a half months) and covers 20 topics, including Data Science ethics, Data Science in the Cloud, and data analysis, preparation, and visualization.

In addition to hands-on projects, the curriculum includes quizzes and a postscript on real-world applications of Data Science. In this post, I give a brief overview of ethics in Data Science and of preparing and visualizing your data, linking each concept back to specific curriculum lessons for self-study.

Data Science Principles Explained

Ethics in Data Science

Data surrounds us in everything we do. In our online interactions, most of our actions and activities are recorded. How the collected data is used is key to ensuring that potential harms and unintended consequences do not arise from data-driven actions.

Under ethics, we consider Microsoft’s six ethics principles: accountability, transparency, fairness, inclusiveness, reliability & safety, and privacy & security. For example, when designing solutions, how do we adapt them to meet a broad range of human needs and capabilities? Read more on data ethics in our curriculum.

Data Preparation

The Data Science process begins with getting your data. Real-world data is often messy. You might encounter missing values, duplicated data, or data that is not in the right format. For example, if someone fills in the same form twice, how do you handle the duplicate entry? In the data preparation lesson, you will learn how to import your data, which libraries you need, how to install those libraries locally, and finally how to use them.
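
To make the duplicate-form example concrete, here is a minimal sketch using the pandas library. The sample data, the column names, and the `prepare` helper are all invented for illustration and are not taken from the curriculum; filling a missing age with the median is just one possible choice, not the only correct one.

```python
import io

import pandas as pd

# Hypothetical form submissions: Amina submitted the same form twice,
# and Brian left the age field empty.
RAW_CSV = """name,email,age
Amina,amina@example.com,29
Amina,amina@example.com,29
Brian,brian@example.com,
Chen,chen@example.com,41
"""


def prepare(csv_text: str) -> pd.DataFrame:
    """Load the CSV text, drop exact duplicate rows, fill missing ages."""
    df = pd.read_csv(io.StringIO(csv_text))
    # Keep only the first copy of rows that are identical in every column.
    df = df.drop_duplicates().reset_index(drop=True)
    # Replace missing ages with the median of the ages that are present.
    df["age"] = df["age"].fillna(df["age"].median())
    return df


clean = prepare(RAW_CSV)
print(clean)
```

With your own data you would typically pass a file path to `pd.read_csv` instead of an in-memory string, and you would decide case by case whether a missing value should be filled, flagged, or dropped.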

Data Visualization

After analyzing your data, you might want to share your findings with the world. The first step is understanding which charts to use where and how to create meaningful charts. For example, using either the seaborn or Matplotlib libraries, you can create bar graphs and pie charts to show how various categories in your data relate. Alternatively, you can create a line chart to show how data changes over time. To create meaningful charts, you need to factor in readability in terms of the colours used and ensure your charts are free of deception. Learn more on data visualization by using Minnesota’s birds dataset.
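
As a rough illustration of the two chart types mentioned above, the sketch below uses Matplotlib to draw a bar chart for categories and a line chart for change over time. The numbers and labels are invented for illustration and do not come from the Minnesota birds dataset. Starting the bar chart's y-axis at zero is one simple way to avoid exaggerating differences between categories.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical values, invented for illustration only.
categories = ["Ducks", "Owls", "Sparrows"]
counts = [12, 5, 20]
months = ["Jan", "Feb", "Mar", "Apr"]
sightings = [3, 7, 6, 10]

fig, (ax_bar, ax_line) = plt.subplots(1, 2, figsize=(9, 3.5))

# Bar chart: compare categories against each other.
ax_bar.bar(categories, counts, color="tab:blue")
ax_bar.set_title("Birds by category")
ax_bar.set_ylabel("Count")
ax_bar.set_ylim(bottom=0)  # a truncated axis can exaggerate differences

# Line chart: show how a value changes over time.
ax_line.plot(months, sightings, marker="o", color="tab:orange")
ax_line.set_title("Sightings over time")
ax_line.set_xlabel("Month")
ax_line.set_ylabel("Sightings")

fig.tight_layout()
fig.savefig("charts.png")  # in a notebook, plt.show() works instead
```

Titles and axis labels do much of the work for readability, so it is worth adding them even to quick exploratory charts.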

Conclusion

For a deep dive into the different Data Science concepts and to build hands-on projects, head over to the Data Science for Beginners curriculum. At the end of the lessons, you will understand how to clean, prepare, and visualize your data so that it is ready for modelling. Tomorrow’s article covers Machine Learning, the next step after understanding your data.

👓 View today's article

Today's article.

🙋🏾‍♂️ Questions?

You can ask questions about this post on GitHub Discussions

📍 30 days roadmap

What's next? View the #30DaysOfAzureAI Roadmap
