Cassandra Fundamentals


What Is Cassandra

Apache Cassandra is an open-source distributed NoSQL database that originated at Facebook and is now maintained by the Apache Software Foundation. It is written in Java and was designed by combining the distributed storage design of Amazon Dynamo with the data model of Google Bigtable, in order to provide large-scale data processing, high availability, and scalability without a single point of failure (SPOF).

Cassandra Features

  • Its masterless design allows nodes to be added or removed without stopping the cluster, so it can be scaled out or in horizontally.
  • Data is distributed and replicated across multiple nodes.
  • It uses CQL (Cassandra Query Language), which resembles SQL but does not support complex operations such as JOIN.
  • Because it uses the Column Family data model, complex queries are not supported (for example, WHERE clauses are generally limited to key columns), but it is well suited to retrieving large volumes of data with simple search conditions.

Cassandra Architecture

Cassandra forms a cluster (also called a ring) using a peer-to-peer architecture, in which every node can communicate with every other node, and a masterless design, in which every node plays the same role. Because all nodes are equal, the cluster is easy to scale horizontally by adding nodes.

Data stored in Cassandra is distributed evenly across the whole cluster, and each node can handle reads and writes independently.

  • Cassandra Cluster = Cassandra Ring
  • Data Center = a logical grouping of racks
  • Rack = a logical grouping of nodes, used so that data replicas are spread across different logical racks
  • Node = a server instance hosting Cassandra; nodes communicate with each other through the Gossip Protocol
  • Keyspace = plays the role of a database in an RDBMS and contains one or more Column Families
  • Column Family = plays the role of a table in an RDBMS; each row can have different columns
  • Column = stored as a key-value pair; the key (column name) can be defined statically or created dynamically

Cassandra Data Model

Data in Cassandra is stored in Column Families, where each row can hold multiple columns, each of which is a key-value pair. Unlike in an RDBMS, a row does not need to have every column.
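As a rough mental model of the sparse rows described above, the following Python sketch treats a Column Family as a mapping from row key to a dictionary of columns. The names (`users`, `columns_of`, `get_column`) and the sample data are hypothetical, and this is not how Cassandra lays data out on disk.

```python
# Toy model of a column family: row key -> {column name: value}.
# Hypothetical data; each row stores only the columns it actually has.
users = {
    "1234": {"item_count": 5, "last_update_timestamp": "2025-03-26"},
    "9876": {"item_count": 2},  # this row has no last_update_timestamp column
}


def columns_of(column_family, row_key):
    """Return the sorted column names stored for one row."""
    return sorted(column_family[row_key])


def get_column(column_family, row_key, name):
    """Return a column value, or None when the row does not store it."""
    return column_family[row_key].get(name)
```

Looking up a column that a row does not store simply yields nothing, which is roughly what a null column looks like from the query side.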

A Cassandra primary key consists of one or more partition key (row key) columns and zero or more clustering key columns. To distribute and replicate data, Cassandra hashes the partition key to produce a token and stores the row on the node responsible for that token. The clustering key (sort key) determines ordering: rows within a partition are sorted by it when they are written.
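The two roles of the primary key can be sketched in Python as follows. This is a simplified illustration under stated assumptions: MD5 stands in for Cassandra's default Murmur3 partitioner, each node owns a single token, and `token_for`, `Ring`, and `Partition` are hypothetical names that do not come from any Cassandra API.

```python
import bisect
import hashlib


def token_for(partition_key):
    """Hash a partition key to a 64-bit token (MD5 is only a stand-in here)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Ring:
    """Maps tokens to nodes; each node owns the range ending at its token."""

    def __init__(self, node_tokens):
        # node_tokens: {node name: token}, hypothetical values chosen by the caller
        self._entries = sorted((tok, node) for node, tok in node_tokens.items())
        self._tokens = [tok for tok, _ in self._entries]

    def owner_of_token(self, token):
        # First node whose token is >= the given token, wrapping past the end.
        idx = bisect.bisect_left(self._tokens, token)
        return self._entries[idx % len(self._entries)][1]

    def owner(self, partition_key):
        return self.owner_of_token(token_for(partition_key))


class Partition:
    """Rows sharing one partition key, kept sorted by clustering key."""

    def __init__(self):
        self._keys = []
        self._values = []

    def upsert(self, clustering_key, value):
        idx = bisect.bisect_left(self._keys, clustering_key)
        if idx < len(self._keys) and self._keys[idx] == clustering_key:
            self._values[idx] = value  # same primary key: overwrite
        else:
            self._keys.insert(idx, clustering_key)
            self._values.insert(idx, value)

    def rows(self):
        return list(zip(self._keys, self._values))
```

The partition key alone decides which node receives a row, while the clustering key only affects the order of rows inside that partition, so reading a partition in clustering order needs no extra sort.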

Installing Cassandra with Docker


Creating Two Cassandra Nodes

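# Pull the image and create a user-defined network so the nodes can reach each other by name
docker pull cassandra:latest
docker network create cassandra
# First node (acts as the seed)
docker run \
  --name cassandra-node1 \
  --network cassandra \
  --rm -d cassandra:latest
# Second node joins through the seed; start it once cassandra-node1 has finished starting up
docker run \
  --name cassandra-node2 \
  --network cassandra \
  -e CASSANDRA_SEEDS=cassandra-node1 \
  --rm -d cassandra:latest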
$ docker ps
CONTAINER ID   IMAGE              COMMAND                   CREATED         STATUS         PORTS                                         NAMES
999d64b6b046   cassandra:latest   "docker-entrypoint.s…"   6 seconds ago   Up 6 seconds   7000-7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp   cassandra-node2
bc4d85c6c785   cassandra:latest   "docker-entrypoint.s…"   2 minutes ago   Up 2 minutes   7000-7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp   cassandra-node1

$ docker exec -it cassandra-node1 bash
root@bc4d85c6c785:/# nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load        Tokens  Owns (effective)  Host ID                               Rack
UN  172.20.0.3  119.68 KiB  16      100.0%            8c056774-d514-4991-bf03-0d8e79fd74e0  rack1
UN  172.20.0.2  119.81 KiB  16      100.0%            3a905cff-d969-4f67-a37c-6ec3c7dc171b  rack1

root@bc4d85c6c785:/# cqlsh
Connected to Test Cluster at 127.0.0.1:9042
[cqlsh 6.2.0 | Cassandra 5.0.3 | CQL spec 3.4.7 | Native protocol v5]
Use HELP for help.

Creating a Keyspace and Storing Data

-- Create a keyspace
CREATE KEYSPACE IF NOT EXISTS test WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : '2' };

-- Switch to the keyspace so the table is created inside it
USE test;

-- Create a table
CREATE TABLE IF NOT EXISTS test_table (
  userid text PRIMARY KEY,
  item_count int,
  last_update_timestamp timestamp
);

-- Insert some data
INSERT INTO test_table
(userid, item_count, last_update_timestamp)
VALUES ('9876', 2, toTimeStamp(now()));
INSERT INTO test_table
(userid, item_count, last_update_timestamp)
VALUES ('1234', 5, toTimeStamp(now()));
root@bc4d85c6c785:/# nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load        Tokens  Owns (effective)  Host ID                               Rack
UN  172.20.0.3  112.97 KiB  16      100.0%            8c056774-d514-4991-bf03-0d8e79fd74e0  rack1
UN  172.20.0.2  95.8 KiB    16      100.0%            3a905cff-d969-4f67-a37c-6ec3c7dc171b  rack1
  • Because 'replication_factor' : '2' is set, the data is stored on every node
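Why a replication factor of 2 on a two-node cluster puts every row on both nodes can be sketched in Python. This is a simplified take on SimpleStrategy-style placement (walk the ring clockwise from the key's token and collect distinct nodes); the function name `replicas_for` and the token values are hypothetical.

```python
import bisect


def replicas_for(key_token, ring, replication_factor):
    """Pick replica nodes by walking the ring clockwise from the key's token.

    ring: list of (token, node) pairs sorted by token; a node may appear
    several times when it owns several tokens.
    """
    tokens = [tok for tok, _ in ring]
    start = bisect.bisect_left(tokens, key_token)
    replicas = []
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)][1]
        if node not in replicas:
            replicas.append(node)
        if len(replicas) == replication_factor:
            break
    return replicas


# Two nodes with two tokens each (hypothetical values).
two_node_ring = [(10, "node1"), (20, "node2"), (30, "node1"), (40, "node2")]
```

With only two nodes, any starting point on the ring reaches both of them before two replicas have been collected, so every key ends up on both nodes.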
$ docker ps
CONTAINER ID   IMAGE              COMMAND                   CREATED         STATUS         PORTS                                         NAMES
999d64b6b046   cassandra:latest   "docker-entrypoint.s…"   5 minutes ago   Up 5 minutes   7000-7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp   cassandra-node2
bc4d85c6c785   cassandra:latest   "docker-entrypoint.s…"   7 minutes ago   Up 7 minutes   7000-7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp   cassandra-node1

$ docker stop 99
99

$ docker ps
CONTAINER ID   IMAGE              COMMAND                   CREATED         STATUS         PORTS                                         NAMES
bc4d85c6c785   cassandra:latest   "docker-entrypoint.s…"   7 minutes ago   Up 7 minutes   7000-7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp   cassandra-node1

$ docker exec -it cassandra-node1 bash
root@bc4d85c6c785:/# nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load        Tokens  Owns (effective)  Host ID                               Rack
DN  172.20.0.3  112.97 KiB  16      100.0%            8c056774-d514-4991-bf03-0d8e79fd74e0  rack1
UN  172.20.0.2  95.8 KiB    16      100.0%            3a905cff-d969-4f67-a37c-6ec3c7dc171b  rack1

root@bc4d85c6c785:/# cqlsh
Connected to Test Cluster at 127.0.0.1:9042
[cqlsh 6.2.0 | Cassandra 5.0.3 | CQL spec 3.4.7 | Native protocol v5]
Use HELP for help.
cqlsh> use test;

cqlsh:test> select * from test_table;

 userid | item_count | last_update_timestamp
--------+------------+---------------------------------
   1234 |          5 | 2025-03-26 15:38:56.057000+0000
   9876 |          2 | 2025-03-26 15:38:55.600000+0000

(2 rows)
  • Node2 is down, but the data can still be queried
    • UN (Up Normal): the node is operating normally
    • DN (Down Normal): the node is a normal member of the topology but is currently down
    • UJ (Up Joining): a new node is joining the cluster
    • UL (Up Leaving): the node is leaving the cluster
    • UM (Up Moving): the node is moving its tokens
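The reason the query still succeeds can be sketched in Python. The sketch decodes the two-letter codes from the `nodetool status` output above and compares the number of live replicas with what a consistency level requires. It assumes the query ran at consistency level ONE (cqlsh's default unless changed with its CONSISTENCY command); the helper names are hypothetical, and the parser is a rough one for this output format only.

```python
STATUS = {"U": "Up", "D": "Down"}
STATE = {"N": "Normal", "L": "Leaving", "J": "Joining", "M": "Moving"}

# Node rows copied from the nodetool status output above.
SAMPLE = """\
DN  172.20.0.3  112.97 KiB  16      100.0%            8c056774-d514-4991-bf03-0d8e79fd74e0  rack1
UN  172.20.0.2  95.8 KiB    16      100.0%            3a905cff-d969-4f67-a37c-6ec3c7dc171b  rack1
"""


def decode(code):
    """Expand a two-letter code such as 'DN' into 'Down/Normal'."""
    return f"{STATUS[code[0]]}/{STATE[code[1]]}"


def parse_status(output):
    """Return {address: code} for the node rows of nodetool status output."""
    nodes = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        code = fields[0]
        if len(code) == 2 and code[0] in STATUS and code[1] in STATE:
            nodes[fields[1]] = code
    return nodes


def required_replicas(consistency, replication_factor):
    """Replicas that must respond for a given consistency level."""
    if consistency == "ONE":
        return 1
    if consistency == "QUORUM":
        return replication_factor // 2 + 1
    if consistency == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {consistency}")


def can_read(output, consistency, replication_factor):
    """Assumes every node holds a replica, as in the two-node cluster here."""
    up = sum(1 for code in parse_status(output).values() if code[0] == "U")
    return up >= required_replicas(consistency, replication_factor)
```

Under these assumptions one live replica is enough for ONE, whereas QUORUM with a replication factor of 2 would need both nodes to be up.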
