Data Science Overview
Retail Price: $895.00
Next Date: 05/09/2024
Course Days: 1
Course Objectives
This course provides a high-level view of core, current data science-related technologies, strategies, skill sets, initiatives, and supporting tools common in business enterprise practice. The list covers a general range of topics current at the time of course distribution. We will collaborate with your team to refine the depth of coverage, identify the areas of greatest importance to your team, decide where you would like to add demos, and so on.
Throughout the session you'll explore:
- Foundations: Grids & Virtualization; SOA, ESB / EMB, The Cloud
- The Hadoop Ecosystem: HDFS; Resource Negotiators, MapReduce, Spark, Distributions
- Big Data, NOSQL, and ETL
- ETL: Extract, Transform, Load
- Handling Data & a Survey of Useful Tools
- Enterprise Integration Patterns and Message Buses
- Developing in the Hadoop Ecosystem: R, Python, Java, Scala, Pig, and BPMN
- Artificial Intelligence and Business Systems
- Who’s on the Team? Evolving Roles and Functions in Data Science
- Growing your Infrastructure
Course Prerequisites
This introductory-level primer is an overview intended for Business Analysts, Data Analysts, Data Architects, DBAs, Network (Grid) Administrators, Developers, or anyone else in the data science realm who needs a baseline understanding of the core areas of modern data science technologies, practices, and available tools.
Attendees should have prior exposure to Enterprise Information Technology, as well as familiarity with Relational Databases.
Outline
Please note that this list of topics is based on our standard course offering, evolved from typical industry uses and trends. We will work with you to tune this course and level of coverage to target the skills you need most.
Foundations
- Grids and Virtualization
- Service-Oriented Architecture
- Enterprise Service Bus
- Enterprise Message Bus
- The Cloud
The Hadoop Ecosystem
- HDFS: Hadoop Distributed File System
- Resource Negotiators: YARN, Mesos, and Spark; ZooKeeper
- Hadoop MapReduce
- Spark
- Hadoop Ecosystem Distributions: Cloudera, Hortonworks, Open Source
Big Data, NOSQL, and ETL
- Big Data vs. RDBMS
- NOSQL: Not Only SQL
- Relational Databases: Oracle, MariaDB, Db2, SQL Server, PostgreSQL
- Key/Value Databases: JBoss Infinispan, Terracotta, Dynamo, Voldemort
- Columnar Databases: Cassandra, HBase, BigTable
- Document Databases: MongoDB, CouchDB/Couchbase
- Graph Databases: Giraph, Neo4j, GraphX
- Apache Hive
- Common Data Formats
- Leveraging SQL and SQL variants
ETL: Extract, Transform, Load
- Data Ingestion, Transformation, and Loading
- Exporting Data
- Sqoop, Flume, Informatica, and other tools
Enterprise Integration Patterns and Message Buses
- Enterprise Integration Patterns: Apache Camel and Spring Integration
- Enterprise Message Buses: Apache Kafka, ActiveMQ, and other tools
An Overview of Developing in the Hadoop Ecosystem
- Languages: R, Python, Java, Scala, Pig, and BPMN
- Libraries and Frameworks
- Development, Testing, and Deployment
Exploring Artificial Intelligence and Business Systems
- Artificial Intelligence: Myths, Legends, and Reality
- The Math
- Statistics
- Probability
- Clustering Algorithms, Mahout, MLlib, scikit-learn, and MADlib
- Business Rule Systems: Drools, JRules, Pegasus
The Modern Data Team
- Agile Data Science
- NOSQL Data Architects and Administrators
- Developers
- Grid Administrators
- Business and Data Analysts
- Management
- Evolving your Team
- Growing your Infrastructure
Course Dates | Course Times (EST) | Delivery Mode
---|---|---
5/9/2024 - 5/9/2024 | 10:00 AM - 6:00 PM | Virtual
6/20/2024 - 6/20/2024 | 10:00 AM - 6:00 PM | Virtual
8/1/2024 - 8/1/2024 | 10:00 AM - 6:00 PM | Virtual
9/12/2024 - 9/12/2024 | 10:00 AM - 6:00 PM | Virtual
10/24/2024 - 10/24/2024 | 10:00 AM - 6:00 PM | Virtual
12/5/2024 - 12/5/2024 | 10:00 AM - 6:00 PM | Virtual