Day 6: Cassandra Lab Practice – Indexes and Performance


Day 6: Secondary Indexes & Basic Performance Considerations

🎯 Objective:

Learn how to create and use secondary indexes in Cassandra, and understand when and why they should (or shouldn't) be used.


🧠 1. Why Secondary Indexes?

  • Used when you want to query by a non-primary key column
  • Works best when:

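    • Column has moderate cardinality (neither nearly unique nor only a handful of distinct values)
    • Queries are selective, ideally also restricted to a single partition key
  • Avoid on very low-cardinality fields (e.g., status = 'active') and on nearly unique fields (e.g., a unique ID)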
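
If the employee table from the earlier labs is not at hand, a minimal setup along these lines can be used to follow the rest of this lab. The schema below (the columns emp_id, name and department, the sample rows, and the SimpleStrategy replication settings) is an assumption made for illustration, not the lab's official definition.

```cql
-- Assumed setup for a single-node practice cluster; adjust to your own schema.
CREATE KEYSPACE IF NOT EXISTS test_lab
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

CREATE TABLE IF NOT EXISTS test_lab.employee (
  emp_id     int PRIMARY KEY,   -- hypothetical primary key column
  name       text,
  department text
);

-- Hypothetical sample rows.
INSERT INTO test_lab.employee (emp_id, name, department) VALUES (1, 'Asha', 'HR');
INSERT INTO test_lab.employee (emp_id, name, department) VALUES (2, 'Ben', 'IT');

-- Lookup by primary key: always allowed.
SELECT * FROM test_lab.employee WHERE emp_id = 1;

-- Lookup by a non-primary-key column: rejected until an index exists
-- (or ALLOW FILTERING is added), so it is left commented out here.
-- SELECT * FROM test_lab.employee WHERE department = 'HR';
```

If test_lab.employee already exists with different columns, IF NOT EXISTS leaves it untouched and the INSERT statements need to be adapted.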

🏗️ 2. Create Secondary Index

➤ Syntax:

CREATE INDEX index_name ON keyspace.table (column_name);

Example:

CREATE INDEX emp_dept_idx ON test_lab.employee (department);

🔍 3. Query Using the Index

SELECT * FROM test_lab.employee
WHERE department = 'HR';

⚠️ Without a secondary index on department, Cassandra rejects this query unless ALLOW FILTERING is added, which forces a scan of the whole table.
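
For comparison, the same lookup can be forced through without an index by adding ALLOW FILTERING, and cqlsh can print a trace of the work a query causes. The sketch below assumes you are running the statements in cqlsh (TRACING is a cqlsh command, not part of CQL itself) and that the second SELECT is run while emp_dept_idx exists.

```cql
-- Works even without an index, but every node scans and filters its rows.
SELECT * FROM test_lab.employee
WHERE department = 'HR'
ALLOW FILTERING;

-- cqlsh only: show a trace for the queries that follow.
TRACING ON;

-- Served by the secondary index (requires emp_dept_idx to exist).
SELECT * FROM test_lab.employee
WHERE department = 'HR';

TRACING OFF;
```

On a small practice table both versions return quickly; the difference only becomes visible in the trace output or on larger data sets.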


🔧 4. Drop Index

DROP INDEX IF EXISTS test_lab.emp_dept_idx;

📌 5. Index on Collection Columns

You can also index elements inside collections.

Example – Indexing map entries (key-value pairs):

CREATE TABLE test_lab.student_marks (
  student_id uuid PRIMARY KEY,
  marks map<text, int>
);

CREATE INDEX marks_index ON test_lab.student_marks (ENTRIES(marks));

Then:

SELECT * FROM test_lab.student_marks 
WHERE marks['math'] = 90;
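
ENTRIES is one of three ways to index a map. As a sketch, the statements below add a sample row and then index the map keys and the map values separately. The index names marks_keys_idx and marks_values_idx and the sample marks are made up for this example, and keeping several indexes on the same map column assumes Cassandra 3.0 or later.

```cql
-- Hypothetical sample row so the queries have something to return.
INSERT INTO test_lab.student_marks (student_id, marks)
VALUES (uuid(), {'math': 90, 'physics': 75});

-- Index the map keys: find students who have any mark recorded for 'math'.
CREATE INDEX IF NOT EXISTS marks_keys_idx
  ON test_lab.student_marks (KEYS(marks));

SELECT * FROM test_lab.student_marks
WHERE marks CONTAINS KEY 'math';

-- Index the map values: find students who scored 90 in any subject.
CREATE INDEX IF NOT EXISTS marks_values_idx
  ON test_lab.student_marks (VALUES(marks));

SELECT * FROM test_lab.student_marks
WHERE marks CONTAINS 90;
```

Which form to use depends on the question being asked: KEYS for "has this subject", VALUES for "has this score", ENTRIES for "has this score in this subject".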

🚫 Limitations of Indexes

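  • May not scale well in high-volume systems: each node indexes only its own data, so a query that does not also restrict the partition key has to contact many nodes
  • Better to model your data around queries than rely on indexes
  • Avoid restricting several indexed columns in one query; Cassandra uses one index and filters on the rest, which requires ALLOW FILTERING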
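
One way to model the data around the query, instead of indexing department, is a second table keyed by department. This is only a sketch: the table name employee_by_department and the columns emp_id and name are assumptions, not part of the lab schema.

```cql
-- Hypothetical query table: one partition per department,
-- with employees as rows inside that partition.
CREATE TABLE IF NOT EXISTS test_lab.employee_by_department (
  department text,
  emp_id     int,
  name       text,
  PRIMARY KEY ((department), emp_id)
);

-- The application writes each employee to this table as well.
INSERT INTO test_lab.employee_by_department (department, emp_id, name)
VALUES ('HR', 1, 'Asha');

-- The department lookup is now a single-partition read; no index needed.
SELECT * FROM test_lab.employee_by_department
WHERE department = 'HR';
```

The trade-off is that every write has to go to both tables, and a department with very many employees becomes one large partition, so this design suits cases where each partition stays reasonably small.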

🧪 Day 6 Lab Tasks

  1. Add a secondary index on department in employee
  2. Query using that column (WHERE department = 'HR')
  3. Drop the index
  4. Create a new table with a map collection and index it
  5. Try a query on a map entry (e.g., marks['math'] = 90) using the ENTRIES() index

Checklist

Task                                        Done
Created and used secondary index            [ ]
Queried by non-primary column using index   [ ]
Dropped the index                           [ ]
Created index on map collection column      [ ]

⚙️ Bonus Tip – Check Index Info (in cqlsh, while the index still exists):

DESCRIBE INDEX test_lab.emp_dept_idx;
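
DESCRIBE is a cqlsh command. As an alternative that works from any client, the index metadata can be read from the schema tables. This sketch assumes Cassandra 3.0 or later, where that metadata lives in system_schema.indexes.

```cql
-- List every index defined on test_lab.employee, with its type and options.
SELECT index_name, kind, options
FROM system_schema.indexes
WHERE keyspace_name = 'test_lab'
  AND table_name = 'employee';
```

An empty result means no index is currently defined on the table, for example after the DROP INDEX step.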