Working with Apache Hive

Name: training4it.com
Address: 9913 Shelbyville Rd #200, Louisville, KY, 40223
Telephone: 502.265.3057

Hive is the de facto standard for data warehousing on Hadoop. This course starts with Hive setup and operations and continues into advanced Hive usage. It also covers performance and execution engines, and ends with a practical workshop.

Retail Price: $1,795.00

Next Date: Request Date

Course Days: 2

WHAT YOU'LL LEARN

Join an engaging hands-on learning environment, where you’ll learn:

Hive basics and features
How to process, transform, and manage data
Processing and performance management
How to set up a data warehouse with Hive
Data query and analysis

WHO SHOULD ATTEND?

Data Scientists, Software Engineers, Developers, and Administrators

PREREQUISITES

Before attending this course, you should:

Be familiar with SQL
Be able to navigate the Linux command line
Have basic knowledge of Linux command-line editors (vi/nano)

COURSE OUTLINE

Hive Basics

Defining Hive Tables
SQL Queries over Structured Data
Filtering / Search
Aggregations / Ordering
Partitions
Joins
Text Analytics (Semi-Structured Data)

Hive Advanced

Transformation, Aggregation
Working with Dates, Timestamps, and Arrays
Converting Strings to Date, Time, and Numbers
Creating New Attributes, Mathematical Calculations, Windowing Functions
Using Character and String Functions
Binning and Smoothing
Processing JSON Data
Execution Engines (Tez, MR, and Spark)

Impala (for Cloudera track)

Architecture
Impala joins and other SQL specifics

Bonus Project

Students work in teams to complete this end-to-end workshop
Set up a data warehouse with Hive
Query and analyze data with Hive and Spark

Dates for this class have not been scheduled yet. To request one, contact us at 502.265.3057 or email info@training4it.com.
