Next Level Python in Data Science (Intermediate) | NumPy, Pandas, Spark, TensorFlow & More

Next Level Python for Data Science and/or Machine Learning covers the essentials of using Python as a tool for data scientists to perform exploratory data analysis, complex visualizations, and large-scale distributed processing on “Big Data”. The course covers essential mathematical, statistical, and machine learning libraries such as NumPy, Pandas, SciPy, scikit-learn, and TensorFlow, as well as visualization and imaging tools such as Matplotlib, PIL, and Seaborn. This is an intermediate-level course: it assumes that attendees have a solid data analytics and data science background and basic Python knowledge. Each topic is introduced from the ground up but covered in depth, at a pace geared toward experienced students.

Retail Price: $2,595.00

Next Date: 05/20/2024

Course Days: 5




At Course Completion

This course is approximately 50% hands-on, combining expert lecture, real-world demonstrations, and group discussions with machine-based practical labs and exercises. Our engaging instructors and mentors are highly experienced practitioners who bring years of current "on-the-job" experience into every classroom. Throughout the course, students will learn to apply core Python scripting skills to data science using current, efficient techniques. Students will learn:

  • How to work with Python in a Data Science Context
  • How to use NumPy, Pandas, and Matplotlib
  • How to create and process images with PIL
  • How to visualize with Seaborn
  • Key features of SciPy and scikit-learn

Audience Profile

This course is geared toward experienced data analysts, developers, engineers, or anyone tasked with using Python for data analytics or eventual machine learning work.

 

Prerequisites

Attending students are required to have a background in basic Python.

Take Before: Students should have attended or have incoming skills equivalent to those in the following courses:

  • TTPS4873 JumpStart to Python for Data Science (3 days)
  • TTPS4874 Applied Python for Data Science and Engineering (4 days)


Outline

1. Python Review (Optional)

  • Why Python?
  • Python syntax compared to other programming languages
  • Python interpreter
  • Strings
  • Understanding lists
  • Tuples and Sets
  • Dictionaries
  • Parsing command-line arguments
  • Decision making
  • Loops
  • Iterators
  • Generators
  • Functions & Modules
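
As a rough illustration of the iterator and generator topics in this review module, here is a minimal sketch. It is not taken from the course labs; the function name `running_mean` and the sample readings are hypothetical. It shows a generator function producing values lazily as it walks any iterable.

```python
from typing import Iterable, Iterator


def running_mean(values: Iterable[float]) -> Iterator[float]:
    """Yield the mean of all values seen so far, one item at a time."""
    total = 0.0
    for count, value in enumerate(values, start=1):
        total += value
        yield total / count


readings = [2.0, 4.0, 6.0]  # hypothetical sample data
means = list(running_mean(readings))
print(means)  # [2.0, 3.0, 4.0]
```

Because the generator only computes a value when asked for one, the same function works on a list, a file of numbers, or any other iterable without loading everything into memory first.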

2. NumPy Arrays and Vectorized Computation

  • NumPy arrays
  • Array functions
  • Data processing using arrays
  • Linear algebra with NumPy
  • NumPy random numbers
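
To give a sense of what this module covers, the sketch below touches on vectorized computation, linear algebra, and random numbers in NumPy. It is an illustration only, not course lab material; the array values and variable names are assumptions made up for the example.

```python
import numpy as np

# Hypothetical data: 4 samples, 3 features.
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0],
                 [7.0, 8.0, 9.0],
                 [10.0, 11.0, 12.0]])

# Vectorized computation: center each column without an explicit loop.
# Broadcasting stretches the (3,) mean vector across all 4 rows.
col_means = data.mean(axis=0)
centered = data - col_means

# Linear algebra: solve the 2x2 system A @ x = b.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)

# Random numbers: a seeded Generator gives reproducible draws.
rng = np.random.default_rng(seed=0)
noise = rng.normal(loc=0.0, scale=1.0, size=5)

print(col_means)
print(x)
```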

3. SciPy

  • Cluster
  • Constants
  • FFTpack
  • Integrate
  • Interpolate
  • Linalg
  • Ndimage
  • Spatial

4. Introducing Pandas

  • Data in the 21st century
  • Introducing pandas
  • A tour of pandas
  • Summary

5. The DataFrame Object

  • Overview of a DataFrame
  • Similarities between Series and DataFrames
  • Sorting by index
  • Setting a new index
  • Selecting columns from a DataFrame
  • Selecting rows from a DataFrame
  • Extracting values from Series
  • Renaming columns or rows
  • Resetting an index

6. Filtering a DataFrame

  • Optimizing a data set for memory use
  • Filtering by a single condition
  • Filtering by multiple conditions
  • Filtering by condition
  • Dealing with duplicates
  • Coding challenge
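
The filtering topics in this module can be sketched in a few lines of pandas. This is an illustrative example rather than course lab material; the employee records and column names are hypothetical.

```python
import pandas as pd

# Hypothetical employee records, invented for illustration.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy", "Dee", "Ben"],
    "team": ["ops", "dev", "dev", "ops", "dev"],
    "salary": [50000, 72000, 65000, 58000, 72000],
})

# Memory optimization: a low-cardinality string column fits well as a category.
df["team"] = df["team"].astype("category")

# Single condition: a Boolean mask selects the matching rows.
devs = df[df["team"] == "dev"]

# Multiple conditions: combine masks with & (and) or | (or).
# Each comparison needs its own parentheses.
well_paid_devs = df[(df["team"] == "dev") & (df["salary"] > 70000)]

# Duplicates: drop rows that repeat an earlier row in every column.
unique_rows = df.drop_duplicates()

print(len(devs), len(well_paid_devs), len(unique_rows))  # 3 2 4
```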

7. Merging, Joining and Concatenating

  • Introducing the data sets
  • Concatenating data sets
  • Missing values in concatenated DataFrames
  • Left joins
  • Inner joins
  • Outer joins
  • Merging on index labels
  • Coding challenge
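
As a rough preview of how the join types in this module differ, the sketch below concatenates and merges two small tables. The `customers` and `orders` data sets are hypothetical and are not the data sets used in the course.

```python
import pandas as pd

# Hypothetical tables, invented for illustration.
customers = pd.DataFrame({"cust_id": [1, 2, 3], "name": ["Ana", "Ben", "Cy"]})
orders = pd.DataFrame({"cust_id": [1, 1, 4], "amount": [20.0, 35.0, 15.0]})

# Concatenation stacks rows; columns missing from one frame are filled with NaN.
stacked = pd.concat([customers, orders], ignore_index=True)

# Left join: keep every customer, whether or not an order matches.
left = customers.merge(orders, on="cust_id", how="left")

# Inner join: keep only cust_id values present in both tables.
inner = customers.merge(orders, on="cust_id", how="inner")

# Outer join: keep every cust_id from either table.
outer = customers.merge(orders, on="cust_id", how="outer")

print(len(left), len(inner), len(outer))  # 4 2 5
```

Customer 1 has two orders, so that customer appears twice in each joined result; customer 4 exists only in `orders` and therefore shows up only in the outer join.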

8. Visualization Using Matplotlib

  • A crash course in Matplotlib
  • Covariance and correlation
  • Conditional probability
  • Bayes' theorem

9. Using PIL/Pillow

  • Overview
  • How to Install Pillow
  • How to Load and Display Images
  • How to Convert Images to NumPy Arrays and Back
  • How to Save Images to File
  • How to Resize Images
  • How to Flip, Rotate, and Crop Images
  • Extensions

10. Visualization Using Seaborn

  • Introduction
  • Handling Data with pandas DataFrame
  • Plotting with pandas and seaborn
  • Tweaking Plot Parameters

11. Machine Learning with scikit-learn

  • An overview of machine learning models
  • The scikit-learn modules for different models
  • Data representation in scikit-learn
  • Supervised learning – classification and regression
  • Unsupervised learning – clustering and dimensionality reduction
  • Measuring prediction performance

 

Bonus Content / Time Permitting

12. TensorFlow Overview

  • Introduction
  • What are Neural Networks?
  • Why Do Neural Networks Work So Well?
  • Configuring a Deep Learning Environment
  • Exploring a Trained Neural Network

Course Dates | Course Times (EST) | Delivery Mode
5/20/2024 - 5/24/2024 | 10:00 AM - 6:00 PM | Virtual
7/8/2024 - 7/12/2024 | 10:00 AM - 6:00 PM | Virtual
8/19/2024 - 8/23/2024 | 10:00 AM - 6:00 PM | Virtual
10/14/2024 - 10/18/2024 | 10:00 AM - 6:00 PM | Virtual
12/9/2024 - 12/13/2024 | 10:00 AM - 6:00 PM | Virtual