Overview
Cassandra (C*) is a massively scalable NoSQL database that provides high availability and fault tolerance, as well as linear scalability when adding new nodes to a cluster. This course provides an in-depth introduction to working with Cassandra and using it to create effective data models, with a focus on the practical aspects of working with C*. It covers the internal architecture needed to make sound design decisions, CQL (Cassandra Query Language), and the Java APIs for writing Cassandra clients.
What You'll Learn
- Understand the needs addressed by C*
 - Be familiar with the operation and structure of C*
 - Be able to install and set up a C* database
 - Use the C* tools, including cqlsh, nodetool and CCM (Cassandra Cluster Manager)
 - Become familiar with C* architecture and how a C* cluster is structured
 - Understand how data is distributed and replicated in a C* cluster
 - Understand core C* data modelling concepts and use them to create well-structured data models
 - Use data replication and eventual consistency intelligently
 - Understand and use CQL to create tables and query for data
 - Know and use the CQL data types (numerical, textual, uuid, etc.)
 - Understand the various kinds of primary keys available (simple, compound and composite primary keys)
 - Use more advanced capabilities like collections, counters, secondary indexes, CAS (Compare and Set), static columns and batches
 - Become familiar with the Java client API
 - Use the Java client API to write client programs that work with C*
 - Build and use dynamic queries with QueryBuilder
 - Understand and use asynchronous queries with the Java API
 
Curriculum
- Why we need Cassandra
 - High level Cassandra overview
 - Cassandra features
 - Basic Cassandra installation and configuration
 
- Cassandra architecture overview
 - Cassandra clusters and rings
 - Data replication in Cassandra
 - Cassandra consistency/eventual consistency
 - Introduction to CQL
 - Defining tables with a single primary key
 - Using cqlsh for interactive querying
 - Selecting and inserting/upserting data with CQL
 - Data replication and distribution
 - Basic data types (including uuid, timeuuid)
 
- Defining a compound primary key
  - CQL for compound primary keys
  - Partition keys and data distribution
  - Clustering columns
  - Overview of internal data organization
 
- Additional querying capabilities
  - Result ordering – ORDER BY and CLUSTERING ORDER BY
  - UPDATE and DELETE queries
  - Result filtering, ALLOW FILTERING
  - Batch queries
 
- Data modelling guidelines
  - Denormalization
  - Data modelling workflow
  - Data modelling principles
  - Primary key considerations
 
- Composite partition keys
  - Defining with CQL
  - Data distribution with composite partition keys
  - Overview of internal data organization
 
 
- Indexing
  - Primary/partition keys and pagination with token()
  - Secondary indexes and usage guidelines
 
- Cassandra counters
  - Counter structure and definition
  - Using counters
  - Counter limitations
- Cassandra collections
  - Collection structure and uses
  - Defining collections (set, list, and map)
  - Querying collections (including INSERT, UPDATE, and DELETE)
  - Limitations
  - Overview of internal storage organization
 
 - Static column – overview and usage
 - Static column guidelines
 - Materialized view: Overview and usage
 - Materialized view guidelines
 
- Overview of consistency in Cassandra
 - CAP theorem
 - Eventual (tunable) consistency in C* – One, Quorum, All
 - Choosing CL One
 - Choosing CL Quorum
 - Achieving immediate consistency
 - Using other consistency levels
 - Internal repair mechanisms (Read repair, hinted handoff)
 
- Overview of lightweight transactions
 - Using LWT, the [applied] column
 - IF EXISTS, IF NOT EXISTS, other IF conditions
 - Basic CAS internals
 - Overhead and guidelines
 
- Dealing with write failure
  - Unavailable node and node failure
  - Requirements for write operations
 
- Key and row caches
  - Cache overview
  - Usage guidelines
 
- Multi-data center support
  - Overview
  - Replication factor configuration
  - Additional Consistency Levels – LOCAL/EACH QUORUM
 
- Deletes
  - CQL for Deletion
  - Tombstones
  - Usage Guidelines
 
 
- API Overview
  - Introduction
  - Architecture and Features
 
- Connecting to a Cluster
  - Cluster and Cluster.Builder
  - Contact Points, Connecting to a Cluster
  - Session Overview and API
  - Working with Sessions
 
- The Query API
  - Overview
  - Dynamic Queries, Statement, SimpleStatement
  - Processing Query Results, ResultSet, Row
  - PreparedStatement, BoundStatement
  - Binding Values and Querying with PreparedStatements
  - CQL to Java Type Mapping
  - Working with UUIDs
  - Working with Time/Date Values
  - Working with Batches of SimpleStatement and PreparedStatement
 
- Dynamic Queries and QueryBuilder
  - QueryBuilder Overview and API
  - Building SELECT, DELETE, INSERT, and UPDATE Queries
  - Creating WHERE Clauses
  - Other Query Examples
 
- Configuring Query Behavior
  - Setting LIMIT and TTL
  - Working with Consistency
  - Using LWT
  - Working with Driver Policies
  - Load Balancing Policies – RoundRobinPolicy, DCAwareRoundRobinPolicy
  - Retry Policies – DefaultRetryPolicy, DowngradingConsistencyRetryPolicy, Other Policies
  - Reconnection Policies
 
- Asynchronous Querying Overview
  - Synchronous vs. Asynchronous Querying
  - Executing Asynchronous Queries
  - java.util.concurrent.Future
  - Cassandra ResultSetFuture
 
 
Who should attend
The course is highly recommended for:
- Java developers
 - Database administrators
 - Spring developers
 - Architects
 - Full stack developers/engineers
 - DevOps developers/engineers
 



