Course curriculum

  • 1

    Introduction to Machine Learning for Data Engineers

    • What Do I Need to Take for the Data Engineering Certification?

    • An Introduction to Machine Learning

    • Modeling in Python

    • Data Cleansing

    • Basic Algorithm (Model) Types

    • The Perceptron

    • Simple Neural Network

  • 2

    The Google Cloud Platform for Data Engineers - Section 1

    • Introduction

    • Is This Course for You?

    • Instructor Q&A

    • Cloud Platform Resource Hierarchy

    • The Exam Case Study

    • Big Data Services

    • Section Summary

    • Data Engineering Cheat Sheet

  • 3

    The Google Cloud Platform for Data Engineers - Section 2

    • Creating an Account on GCP

    • Navigating the GCP Console

    • Creating a Project

    • Billing at a High Level

    • IAM

    • Cloud Shell

    • API Management

    • Installing GCP SDK

    • Section Summary

  • 4

    The Google Cloud Platform for Data Engineers - Section 3

    • Compute Services High Level

    • Working with Compute Engine

    • Cloud Launcher

    • Compute Engine Resources

    • What's Docker?

    • App Engine

    • Deploy Docker Container

    • Summary

  • 5

    The Google Cloud Platform for Data Engineers - Section 4

    • Key Storage Terms

    • Storage Classes

    • Buckets

    • Working with Objects in Our Buckets

    • An Introduction to gsutil

    • Summary

  • 6

    The Google Cloud Platform for Data Engineers - Section 5

    • Cloud SQL Introduction

    • MySQL Client in Cloud Shell

    • Creating and Exporting Schema

    • Cloud SQL Backups

    • Creating a Cloud SQL Instance

    • Summary

  • 7

    The Google Cloud Platform for Data Engineers - Section 6

    • What is BigQuery?

    • What is Bigtable?

    • Cloud Datastore

    • The Pub/Sub Demo

    • What is Cloud Datastore?

    • What is Pub/Sub?

    • BigQuery Demo

    • Cloud Dataproc Demo

    • Bigtable Demo

    • What is TensorFlow?

    • What is a Datalab?

    • Cloud Datalab Demo

    • Section Summary

  • 8

    Managing Big Data on Google's Cloud Platform - Section 1

    • Introduction

    • Is This Course Right for You?

    • The 4 Types of Data

    • Structured Versus Unstructured Data

    • Instructor Q&A

    • Section Summary

  • 9

    Managing Big Data on Google's Cloud Platform - Section 2

    • Why Use Cloud Dataproc?

    • On-Premise Hadoop Buildout

    • Scaling Up or Out

    • Decouple Storage and Compute

    • Regions and Zones

    • Cloud Dataproc Architecture

    • Section Summary

  • 10

    Managing Big Data on Google's Cloud Platform - Section 3

    • The Cluster Creation Screen

    • Create Cluster Using Console

    • Create Cluster with Command Line

    • The 3 Cluster Options

    • Preemptible Worker Nodes

    • How Preemption Works

    • Image Versions

    • Custom Image

    • Custom Dataproc Cluster

    • Install Software on Dataproc Clusters

    • Add Initialization Actions

    • Cluster High Availability

    • Scaling Clusters

    • Section Summary

  • 11

    Managing Big Data on Google's Cloud Platform - Section 4

    • The Submit Jobs Screen

    • Submit Spark Job to Cluster

    • Submit PySpark Job via SSH

    • Hadoop Jobs to GCP in 3 Steps

    • Scala and Python Jobs to GCP

  • 12

    Managing Big Data on Google's Cloud Platform - Section 5

    • WebHD_720p (4)

    • WebHD_720p (3)

    • WebHD_720p (5)

    • WebHD_720p (2)

    • WebHD_720p

    • WebHD_720p (6)

    • WebHD_720p (1)

  • 13

    Managing Big Data on Google's Cloud Platform - Section 6

    • Whiteboarding: On-Premise to Cloud Dataproc

    • Whiteboarding: Moving Jobs to GCP

    • Whiteboarding: Data and Compute in the Same Zones

    • Whiteboarding: Defining Preemption

    • Whiteboarding: On-Premise Jobs to GCP Architecture

    • Whiteboarding: Adding Custom Software to Nodes

    • Section Summary

  • 14

    Streaming Analytics on Google's Cloud Platform - Section 1

    • Introduction

    • Is This Course Right for You?

    • What is Streaming?

    • The Three Vs of Data

    • The Beam Pipeline

    • Section Summary

  • 15

    Streaming Analytics on Google's Cloud Platform - Section 2

    • Definition and History

    • Beam Object Model

    • Pipeline Object Review

    • Pipeline Object Review Answer Key

    • Event Time and Processing Time

    • Windowing

    • Use Case: The Mobile App

    • Handling Data Tensions

    • MapReduce

    • FlumeJava and Batch Patterns

    • MillWheel

    • Event Skew

    • Section Summary

  • 16

    Streaming Analytics on Google's Cloud Platform - Section 3

    • Cloud Dataflow: The SDK and the Runner

    • The 4 Core Questions of Dataflow

    • Lab: Building a Dataflow Pipeline

    • Dataflow Job Monitoring UI

    • Stackdriver and Dataflow

    • Simple Dashboard

    • Lab: Monitoring Dataflow

    • Section Summary

  • 17

    TensorFlow on Google's Cloud Platform for Data Engineers - Section 1

    • Course Introduction

    • Exam Preparation Tip

    • Is This Course Right for You?

    • Exam Tip

    • What's an Array?

    • What's a Tensor?

    • The "FLOW" in TensorFlow

    • Numbers Moving Through a Graph

    • Hello World in TensorFlow

    • Section Summary

  • 18

    TensorFlow on Google's Cloud Platform for Data Engineers - Section 2

    • Creating a Jupyter Notebook on GCP

    • Reconnect Datalab to Virtual Machine

    • Download and Upload Notebooks to Datalab

    • Up and Running with Cloud Datalab

    • Summary

  • 19

    TensorFlow on Google's Cloud Platform for Data Engineers - Section 3

    • The TensorFlow Code Base

    • Forward Feeding Graphs

    • Handling Iteration in TensorFlow Graphs

    • Steps in Every TensorFlow Program

    • Modeling Larger Computational Graphs

    • Resizing After High Utilization Warning

    • Simple End-to-End Example

    • Tensor Dimensions

    • Placeholders

    • Sessions

    • Node Life Cycle

    • Properties of a Tensor

    • Convert to Tensors

    • Enabling Logging with TensorFlow

    • Lab: Hello World in TensorFlow

    • Section Summary

  • 20

    TensorFlow on Google's Cloud Platform for Data Engineers - Section 4

    • NumPy vs TensorFlow

    • Data Scrubbing

    • Data Import and Exploration

    • Linear Regression in TensorFlow

    • The Mandelbrot Set

    • Overfitting and How to Correct It

    • Packaging Up Our Model

    • Creating a Serving Input Function

    • Lab: Linear Regression in TensorFlow

    • Linear Regression Lab Walk Through

    • Cloud Machine Learning at Scale

    • Section Summary