Introduction to Enterprise Scala in Spark
Course Objectives
This course provides practical instruction in the family of technologies at the leading edge of data science development.
Working in a hands-on learning environment led by our expert practitioner, attending students will learn:
- Essential Scala programming, leveraging your existing OO development experience
- How to write essential Spark programs and perform exploratory data analysis in Scala and the Spark shell
- How to work with Spark Core
- How to work with NoSQL
- How to write programs for Spark Streaming in Scala
Course Prerequisites
Attending students are required to have a background in programming with basic OO development skills. Students should have the following incoming skills or knowledge:
- Experience in developing object-oriented enterprise applications to at least a basic level
- Familiarity with Eclipse
- Comfort with the Linux/Unix command line, including editing text files
Course Agenda
Please note that this list of topics is based on our standard course offering, evolved from typical industry uses and trends. We’ll work with you to tune this course and level of coverage to target the skills you need most. Topics, agenda and labs are subject to change, and may be adjusted during live delivery based on audience needs and skill level.
Session: Functional Programming in Scala
- Functional Programming
- Scala Overview
- Scala vs. Python vs. Java vs. R
- REPL in Scala
- Installing Scala
- Hello, Scala
Session: Introduction to Scala
- Classes and Objects
- Traits
- Mixins
- Higher-Order Functions
- Types and Inference
- Lists
- Annotations
- Collections
- Pattern Matching
- Using Java in Scala
- Futures, Promises, and Parallel Collections (Concurrency)
- Functional Programming Overview
Session: Spark Core
- Hadoop and Spark Overview
- File I/O with HDFS
- DataFrames and Resilient Distributed Datasets
- Spark SQL
- In-memory lookups
- Essential AI with MLlib
- Using Web Notebooks (Optional)
Session: Working with NoSQL
- Not Only SQL
- Relational Data
- Sqoop
- Columnar Databases
- Cassandra
- Document Databases
- Key/Value Databases
- Graph Databases
- Neo4j
- GraphX
- Hive in Spark
Session: Spark Streaming
- Spark Streaming Model
- Streaming with Kafka
Session: MLlib
- Machine Learning Essentials
- Spark ML/MLlib
- MLlib and Streaming
- MLlib, Streaming, and Kafka
Session: Enterprise Integration
- Enterprise Service and Message Buses
- Lambda Architecture