Working with Apache Hive | Apache Hive Essentials

Apache Hive is the de facto standard for data warehousing on Hadoop. This course starts with standard Hive setup and operations, continues into advanced Hive use, discusses performance and execution engines, and ends with a practical workshop.

Retail Price: $1,895.00

Next Date: 07/18/2021

Course Days: 2




Course Objectives

This course is intended for data scientists and software engineers. It gives them a practical level of experience through a combination of about 50% lecture and 50% lab work.

 

Course Prerequisites

This introductory-level course is geared toward experienced data scientists and engineers seeking a quick start to working with Hive. Attendees should have some familiarity with basic SQL, be able to navigate the Linux command line, and have basic knowledge of Linux editors (such as vi or nano) for editing code.

This course is a core component of our Big Data, AI & Machine Learning Skills Path, designed to train participants of all skill levels in modern data science across the enterprise. We offer courses in next-level Hadoop, Hive, Analytics, Kafka and more. Please contact us for details and next-step recommendations based on your specific roles and goals.


Course Outline

Hive Basics

  • Defining Hive Tables
  • SQL Queries over Structured Data
  • Filtering / Search
  • Aggregations / Ordering
  • Partitions
  • Joins
  • Text Analytics (Semi-Structured Data)

Hive Advanced

  • Transformation, Aggregation
  • Working with Dates, Timestamps, and Arrays
  • Converting Strings to Date, Time, and Numbers
  • Creating New Attributes, Mathematical Calculations, Windowing Functions
  • Using Character and String Functions
  • Binning and Smoothing
  • Processing JSON Data
  • Execution Engines (Tez, MR, Spark)

Impala (for Cloudera track)

  • Architecture
  • Impala joins and other SQL specifics

Bonus Project

  • Students will work in teams to do this end-to-end workshop
  • Set up a data warehouse with Hive
  • Query and analyze data with Hive and Spark

Course Dates               Course Times (EST)    Delivery Mode    GTR
7/18/2021 - 7/19/2021      10:00 AM - 6:00 PM    Virtual
9/2/2021 - 9/3/2021        10:00 AM - 6:00 PM    Virtual          Guaranteed to run
10/18/2021 - 10/19/2021    10:00 AM - 6:00 PM    Virtual
11/29/2021 - 11/30/2021    10:00 AM - 6:00 PM    Virtual
12/16/2021 - 12/17/2021    10:00 AM - 6:00 PM    Virtual