Overview
Apache Spark Basics introduces participants to the Spark environment. This course is designed as a fast-paced, two-day training that covers Spark's benefits, key features, and common uses and tools. Throughout the course, participants work in a dynamic, hands-on learning environment.
What You'll Learn
- Where Spark fits into the Big Data ecosystem
- How to use core Spark features for critical data analysis
- Key Spark technologies such as the Spark shell for interactive data analysis, Spark internals, RDDs, DataFrames, and Spark SQL
Curriculum
- Background and history
- Spark and Hadoop
- Spark concepts and architecture
- Spark ecosystem (Core, Spark SQL, MLlib, Streaming)
- Spark in local mode
- Spark web UI
- Spark shell
- Analyzing a dataset – part 1
- Inspecting RDDs
- Partitions
- RDD operations and transformations
- RDD types
- MapReduce on RDD
- Caching and persistence
- Sharing cached RDDs
- DataFrames
- DataFrame DDL
- Spark SQL
- Defining tables and importing datasets
- Queries
Who should attend
This is an introductory-level course geared for developers and architects seeking to become proficient in Spark tools and technologies. Participants should be experienced developers who are comfortable programming in Java, Scala, or Python. Participants should also be able to navigate the Linux command line and have basic knowledge of a Linux editor (such as vi or nano) for editing code.
This course is highly recommended for:
- Lead data scientists
- Spark developers
- Software developers
- Big Data scientists
- Software architects
- Java developers
- Application developers
- Full stack developers
- Python developers