Navigating the World of Data Engineering


2 to 3 p.m., Jan. 29, 2024
2 to 3 p.m., Feb. 5, 2024
2 to 3 p.m., Feb. 12, 2024
2 to 3 p.m., Feb. 19, 2024
2 to 3 p.m., Feb. 26, 2024
2 to 3 p.m., March 4, 2024 (Spring Break, no class)
2 to 3 p.m., March 11, 2024
2 to 3 p.m., March 18, 2024
2 to 3 p.m., March 25, 2024

How can you master handling massive datasets and transform raw data into insightful, actionable information?

Join our workshops, designed for graduate students, to dive deep into advanced data management and analysis techniques. Learn efficient database management, work through ETL (Extract, Transform, Load) processes, and get hands-on experience with big data technologies.

Are you ready to elevate your data engineering skills and stand out in the rapidly evolving field of data science?

Are you curious about how to kickstart your journey in data engineering with user-friendly tools before diving deep into the core complexities of the field?

We begin the workshop series with an accessible introduction to Streamlit and Gradio, building interactive web applications to visualize and manipulate data. That is just the beginning. As the weeks progress, we move into the heart of data engineering: SQL and NoSQL databases, ETL processes, and big data tools such as Hadoop and Spark. This gradual progression builds a solid foundation for mastering advanced data engineering techniques with confidence. Are you ready to evolve from creating engaging data-driven applications to mastering data extraction, transformation, and loading?
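For a first taste of what an ETL process involves, here is a minimal sketch in Python using only the standard library. It is an illustration, not workshop material: the sample data, the `readings` table, and the function names `extract`, `transform`, `load`, and `run_pipeline` are all assumptions made up for this example.

```python
import csv
import io
import sqlite3

# Hypothetical raw input, standing in for a CSV file exported from some source.
RAW_CSV = """city,temp_f
Tucson,98.6
Phoenix,104.0
Flagstaff,
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: skip rows with a missing reading, convert Fahrenheit to Celsius."""
    cleaned = []
    for row in rows:
        if not row["temp_f"]:
            continue  # incomplete record, leave it out
        temp_c = round((float(row["temp_f"]) - 32) * 5 / 9, 1)
        cleaned.append((row["city"], temp_c))
    return cleaned


def load(records, conn):
    """Load: write the cleaned records into a SQLite table."""
    conn.execute("CREATE TABLE IF NOT EXISTS readings (city TEXT, temp_c REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)", records)
    conn.commit()


def run_pipeline(text=RAW_CSV):
    """Run extract -> transform -> load against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(text)), conn)
    return conn.execute(
        "SELECT city, temp_c FROM readings ORDER BY city"
    ).fetchall()


if __name__ == "__main__":
    print(run_pipeline())
```

Real pipelines typically swap each stage for something heavier (an API or file system for extract, pandas or Spark for transform, a production database for load), but the three-step shape stays the same.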

We meet in the Weaver Science and Engineering Library, Room 212. You can also join us via Zoom.

Resources and Notes:

Date / Topic YouTube link
01/29/24 Building Python-based web apps with Streamlit and Gradio
02/05/24 Deploying ML models and creating demos with Streamlit and Gradio
02/12/24 Introduction to SQL Part-1 (Basic Commands and Joins)
02/19/24 Introduction to SQL Part-2 (Functions, Subqueries and Nested Selects)
02/26/24 Introduction to NoSQL Part-1 (Types of NoSQL Databases and MongoDB basics)
03/04/24 Spring Break (No Class)
03/11/24 Introduction to NoSQL Part-2 (Basics of Cassandra)
03/18/24 Introduction to Spark and Hadoop Part-1 (Hadoop Ecosystem and Hive Tutorial)
03/25/24 Introduction to Spark and Hadoop Part-2 (Intro to Apache Spark and PySpark)



Carlos Lizárraga
Michele Cosi
Jeffrey Gillan