✅ Day 6: Secondary Indexes & Basic Performance Considerations
🎯 Objective:
Learn how to create and use secondary indexes in Cassandra, and understand when and why they should (or shouldn't) be used.
🧠 1. Why Secondary Indexes?
- Used when you want to query by a non-primary key column
Works best when:
- Column has high cardinality (many unique values)
- Queries are selective
- Avoid on low-cardinality fields (e.g.,
status = 'active'
)
🏗️ 2. Create Secondary Index
➤ Syntax:
CREATE INDEX index_name ON keyspace.table (column_name);
✅ Example:
CREATE INDEX emp_dept_idx ON test_lab.employee (department);
🔍 3. Query Using the Index
SELECT * FROM test_lab.employee
WHERE department = 'HR';
⚠️ This query only works if a secondary index exists on department
.
🔧 4. Drop Index
DROP INDEX IF EXISTS test_lab.emp_dept_idx;
📌 5. Index on Collection Columns
You can also index elements inside collections.
✅ Example – Indexing a map
value:
CREATE TABLE test_lab.student_marks (
student_id uuid PRIMARY KEY,
marks map<text, int>
);
CREATE INDEX marks_index ON test_lab.student_marks (ENTRIES(marks));
Then:
SELECT * FROM test_lab.student_marks
WHERE marks['math'] = 90;
🚫 Limitations of Indexes
- May not scale well in high-volume systems
- Better to model your data around queries than rely on indexes
- Don't use multiple indexes in the same query
🧪 Day 6 Lab Tasks
- Add a secondary index on
department
inemployee
- Query using that column (
WHERE department = 'HR'
) - Drop the index
- Create a new table with a
map
collection and index it - Try a query on the map key using
ENTRIES()
✅ Checklist
Task | Done |
---|---|
Created and used secondary index | ✅ |
Queried by non-primary column using index | ✅ |
Dropped the index | ✅ |
Created index on map collection column | ✅ |
⚙️ Bonus Tip – Check Index Info:
DESCRIBE INDEX emp_dept_idx;