Day 6: Cassandra Lab Practice – Indexes and Performance

✅ Day 6: Secondary Indexes & Basic Performance Considerations

🎯 Objective:

Learn how to create and use secondary indexes in Cassandra, and understand when and why they should (or shouldn't) be used.

🧠 1. Why Secondary Indexes?

Used when you want to query by a non-primary key column
Works best when:
- Column has high cardinality (many unique values)
- Queries are selective
Avoid on low-cardinality fields (e.g., status = 'active')

🏗️ 2. Create Secondary Index

➤ Syntax:

CREATE INDEX index_name ON keyspace.table (column_name);

✅ Example:

CREATE INDEX emp_dept_idx ON test_lab.employee (department);

🔍 3. Query Using the Index

SELECT * FROM test_lab.employee
WHERE department = 'HR';

⚠️ This query only works if a secondary index exists on department.

🔧 4. Drop Index

DROP INDEX IF EXISTS test_lab.emp_dept_idx;

📌 5. Index on Collection Columns

You can also index elements inside collections.

✅ Example – Indexing a map value:

CREATE TABLE test_lab.student_marks (
  student_id uuid PRIMARY KEY,
  marks map<text, int>
);

CREATE INDEX marks_index ON test_lab.student_marks (ENTRIES(marks));

Then:

SELECT * FROM test_lab.student_marks 
WHERE marks['math'] = 90;

🚫 Limitations of Indexes

May not scale well in high-volume systems
Better to model your data around queries than rely on indexes
Don't use multiple indexes in the same query

🧪 Day 6 Lab Tasks

Add a secondary index on department in employee
Query using that column (WHERE department = 'HR')
Drop the index
Create a new table with a map collection and index it
Try a query on the map key using ENTRIES()

✅ Checklist

Task	Done
Created and used secondary index	✅
Queried by non-primary column using index	✅
Dropped the index	✅
Created index on map collection column	✅

⚙️ Bonus Tip – Check Index Info:

DESCRIBE INDEX emp_dept_idx;