Ceph deploy via cephadm
This post describes how to deploy Ceph via cephadm on Ubuntu 20.04 across 3 nodes.
REQUIREMENTS
- 3 nodes, each with an additional drive (in this example /dev/sdb)
- Ubuntu/Debian
- systemd
- Podman or Docker for running containers (see the install sketch after this list)
- Time synchronization (such as chrony or NTP)
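If Docker is not already installed, a minimal install on Ubuntu 20.04 could look like the following (docker.io from the Ubuntu repositories is an assumption; Podman or Docker CE work just as well):
sudo apt-get update
sudo apt-get install -y docker.io   # container runtime used by cephadm
sudo systemctl enable --now docker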
PREPARE
Edit /etc/hosts
192.168.56.101 node1 node1.linuxdemo.local
192.168.56.102 node2 node2.linuxdemo.local
192.168.56.103 node3 node3.linuxdemo.local
Set hostnames (run the matching command on each node)
sudo hostnamectl set-hostname node1
sudo hostnamectl set-hostname node2
sudo hostnamectl set-hostname node3
Hostnames must resolve to the nodes' IPv4 addresses, not to the localhost loopback!
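A quick way to verify this is getent, which consults /etc/hosts the same way the resolver does; each line should show a 192.168.56.x address, not 127.0.x.x:
for h in node1 node2 node3; do getent hosts $h; done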
Install time synchronization
sudo apt-get install ntp ntpdate -y
sudo timedatectl set-ntp on
Tune /etc/sysctl.conf by appending:
kernel.pid_max = 4194303
fs.aio-max-nr = 1048576
and apply the changes:
sudo sysctl -p
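To confirm the new values are active:
sudo sysctl kernel.pid_max fs.aio-max-nr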
Add the ceph-admin user
sudo useradd -d /home/ceph-admin -m ceph-admin -s /bin/bash
sudo passwd ceph-admin
echo "ceph-admin ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/ceph-admin
sudo chmod 0440 /etc/sudoers.d/ceph-admin
Configure SSH
Create an SSH config for the user
su ceph-admin
mkdir -p ~/.ssh && chmod 700 ~/.ssh
tee -a ~/.ssh/config <<EOF
Host *
    UserKnownHostsFile /dev/null
    StrictHostKeyChecking no
    IdentitiesOnly yes
    ConnectTimeout 0
    ServerAliveInterval 300
Host node1
    Hostname node1
    User ceph-admin
    UserKnownHostsFile /dev/null
    StrictHostKeyChecking no
    IdentitiesOnly yes
    ConnectTimeout 0
    ServerAliveInterval 300
Host node2
    Hostname node2
    User ceph-admin
    UserKnownHostsFile /dev/null
    StrictHostKeyChecking no
    IdentitiesOnly yes
    ConnectTimeout 0
    ServerAliveInterval 300
Host node3
    Hostname node3
    User ceph-admin
    UserKnownHostsFile /dev/null
    StrictHostKeyChecking no
    IdentitiesOnly yes
    ConnectTimeout 0
    ServerAliveInterval 300
EOF
Generate a key and copy it to all nodes
su ceph-admin
ssh-keygen
ssh-copy-id ceph-admin@node1 && ssh-copy-id ceph-admin@node2 && ssh-copy-id ceph-admin@node3
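To check that passwordless login works from node1 (each command should print the remote hostname without prompting for a password):
for h in node1 node2 node3; do ssh $h hostname; done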
Ceph initial deploy
From node1 (192.168.56.101)
su ceph-admin
sudo curl --silent --location https://github.com/ceph/ceph/raw/octopus/src/cephadm/cephadm -o /usr/local/bin/cephadm
sudo chmod +x /usr/local/bin/cephadm
Remove any stale Ceph repo files (if present):
sudo rm -f /etc/apt/sources.list.d/ceph.list
sudo rm -f /etc/apt/trusted.gpg.d/ceph.release.gpg
sudo /usr/local/bin/cephadm add-repo --release octopus
sudo /usr/local/bin/cephadm install
if [[ ! -d "/etc/ceph" ]]; then sudo mkdir -p /etc/ceph;fi
sudo /usr/local/bin/cephadm bootstrap --mon-ip 192.168.56.101 --initial-dashboard-user itclife --initial-dashboard-password itclife --dashboard-password-noupdate
The output will look like this:
INFO:cephadm:Ceph Dashboard is now available at:
URL: https://vm-1:8443/
User: admin
Password: fv8b98gzeb
INFO:cephadm:You can access the Ceph CLI with:
sudo /usr/local/bin/cephadm shell --fsid a3f8eb22-edbc-11ea-a98b-09a89b8358a4 -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring
INFO:cephadm:Please consider enabling telemetry to help improve Ceph:
ceph telemetry on
For more information see:
https://docs.ceph.com/docs/master/mgr/telemetry/
INFO:cephadm:Bootstrap complete.
Log in to the cephadm shell (this drops us into the Ceph container)
sudo /usr/local/bin/cephadm shell --fsid a3f8eb22-edbc-11ea-a98b-09a89b8358a4 -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring
Exit the container
exit
Install the Ceph packages on the other cluster machines so they can mount Ceph (node2, node3)
sudo cephadm install ceph-common
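To confirm the CLI is available after the install:
ceph -v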
Copy the cluster's public key to the root user on the other nodes
ssh-copy-id -f -i /etc/ceph/ceph.pub root@node2
ssh-copy-id -f -i /etc/ceph/ceph.pub root@node3
or
cat /etc/ceph/ceph.pub
and copy the key into /root/.ssh/authorized_keys on node2 and node3
Add Ceph nodes via orch (orchestrator)
sudo ceph orch host add node2
sudo ceph orch host add node3
Set the public network for our Ceph cluster (it must match the node addresses above)
sudo ceph config set mon public_network 192.168.56.0/24
Deploy monitors on a specific set of hosts:
sudo ceph orch apply mon "node1,node2,node3"
or, to adjust the default number of monitors (5):
sudo ceph orch apply mon 5
Add the mon label to the hosts
sudo ceph orch host label add node1 mon
sudo ceph orch host label add node2 mon
sudo ceph orch host label add node3 mon
List hosts
ceph orch host ls
Ceph create osd
Ceph list osds
sudo ceph osd tree | column -t -s $'\t'
Run on each node to wipe the drives (optional)
sudo wipefs --all /dev/sdb
Add OSDs on the hosts
sudo ceph orch daemon add osd node1:/dev/sdb
sudo ceph orch daemon add osd node2:/dev/sdb
sudo ceph orch daemon add osd node3:/dev/sdb
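The orchestrator can also list which devices it considers available, and the new OSDs should show up in the cluster status shortly afterwards:
sudo ceph orch device ls
sudo ceph -s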
Ceph create fs and mds, and set placement groups (version 1)
Create fs with name volume1
sudo ceph fs volume create volume1
Create 3 MDS daemons for the fs named volume1
sudo ceph orch apply mds volume1 3
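To check that the MDS daemons were scheduled by the orchestrator:
sudo ceph orch ps | grep mds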
When we create the fs named volume1, its pools are created automatically
List pools
sudo ceph osd lspools
output
1 device_health_metrics
2 cephfs.volume1.meta
3 cephfs.volume1.data
SET THE NUMBER OF PLACEMENT GROUPS
Get the current values
ceph osd pool get cephfs.volume1.data pg_num
ceph osd pool get cephfs.volume1.meta pgp_num
and set them
sudo ceph osd pool set cephfs.volume1.data pg_num 128
sudo ceph osd pool set cephfs.volume1.data pgp_num 128
You must choose the pg_num value yourself because it cannot be calculated automatically. Here are a few commonly used values:
Less than 5 OSDs: set pg_num to 128
Between 5 and 10 OSDs: set pg_num to 512
Between 10 and 50 OSDs: set pg_num to 4096
If you have more than 50 OSDs, you need to understand the tradeoffs and calculate the pg_num value yourself.
For calculating pg_num yourself, the pgcalc tool can help; a rough rule-of-thumb calculation is sketched below.
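A worked example of the usual rule of thumb (about 100 PGs per OSD, divided by the replica count, rounded up to a power of two) for this 3-OSD, 3-replica cluster:
osds=3; replicas=3
target=$(( osds * 100 / replicas ))                    # (3 * 100) / 3 = 100
pg=1; while [ $pg -lt $target ]; do pg=$(( pg * 2 )); done
echo "suggested pg_num: $pg"                           # prints 128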
List fs
sudo ceph mds stat
sudo ceph fs ls
Ceph create fs and mds (version 2)
Create fs with name cephfs
sudo ceph osd pool create cephfs_data 64 64
sudo ceph osd pool create cephfs_metadata 64 64
sudo ceph fs new cephfs cephfs_metadata cephfs_data
sudo ceph orch apply mds cephfs 3
Where:
- {pool-name} is the name of the Ceph pool you are creating.
- pg_num is the total number of placement groups for the pool. It is recommended to set pg_num to 128 when using fewer than 5 OSDs in your Ceph cluster.
- pgp_num is the total number of placement groups for placement purposes; it should be equal to pg_num.
# By default, Ceph makes 3 replicas of objects. If you want to make four
# copies of an object (a primary copy and three replica copies), reset the
# default value as shown in 'osd pool default size'.
# If you want to allow Ceph to write a lesser number of copies in a degraded
# state, set 'osd pool default min size' to a number less than the
# 'osd pool default size' value.
osd pool default size = 4 # Write an object 4 times.
osd pool default min size = 1 # Allow writing one copy in a degraded state.
# Ensure you have a realistic number of placement groups. We recommend
# approximately 100 per OSD. E.g., total number of OSDs multiplied by 100
# divided by the number of replicas (i.e., osd pool default size). So for
# 10 OSDs and osd pool default size = 4, we'd recommend approximately
# (100 * 10) / 4 = 250.
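These are ceph.conf defaults; to adjust replication on the pools created above at runtime instead, the per-pool settings can be changed directly (the values here are illustrative):
sudo ceph osd pool set cephfs_data size 3
sudo ceph osd pool set cephfs_data min_size 2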
Ceph mount cephfs
Create credentials
For the volume named volume1
sudo ceph fs authorize volume1 client.user / rw | sudo tee /etc/ceph/ceph.client.user.keyring
sudo chmod 600 /etc/ceph/ceph.client.user.keyring
sudo cat /etc/ceph/ceph.client.user.keyring
Output
[client.user]
key = AQDSxDNfzpmkIhAAGyKz2EIqZsaVrXWP0ZNRAQ==
Mount via shell
sudo mkdir /mnt/cephfs
sudo mount -t ceph node3:/ /mnt/cephfs -o name=user,secret=AQDSxDNfzpmkIhAAGyKz2EIqZsaVrXWP0ZNRAQ==
cd /mnt/cephfs
mount via fstab
192.168.56.101,192.168.56.102,192.168.56.103:/ /mnt/cephfs ceph name=user,secret=AQDSxDNfzpmkIhAAGyKz2EIqZsaVrXWP0ZNRAQ==,noatime,_netdev 0 2
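To keep the key out of /etc/fstab, mount.ceph also supports a secretfile option; a variant of the line above (the /etc/ceph/user.secret path is an assumption):
sudo sh -c 'echo AQDSxDNfzpmkIhAAGyKz2EIqZsaVrXWP0ZNRAQ== > /etc/ceph/user.secret && chmod 600 /etc/ceph/user.secret'
192.168.56.101,192.168.56.102,192.168.56.103:/ /mnt/cephfs ceph name=user,secretfile=/etc/ceph/user.secret,noatime,_netdev 0 2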
FAQ
Deploy a Ceph client (using the legacy ceph-deploy tool)
ceph-deploy install ceph-client
Ceph test
cd /mnt/cephfs
sudo dd if=/dev/zero of=./test3.img bs=20M count=30
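Afterwards, cluster usage can be checked with:
sudo ceph df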
Ceph repair
sudo ceph osd crush rule dump # Show the current CRUSH rules
sudo ceph osd getcrushmap -o comp_crush_map.cm # Get the crush map
crushtool -d comp_crush_map.cm -o crush_map.cm # Decompile the map
vim crush_map.cm # Make any changes you need (host -> osd)
crushtool -c crush_map.cm -o new_crush_map.cm # Compile map
sudo ceph osd setcrushmap -i new_crush_map.cm # Load the new map
Clean all Ceph packages, wipe data, and remove the fstab entry
cd /etc/systemd/system
rm -rf $(ls | grep ceph)
systemctl daemon-reload
docker rm -f $(docker ps | grep ceph | awk '{print $NF}')
apt-get purge ceph-* -y -f; apt-get autoremove -y
rm -rf /var/lib/ceph/*
sudo rm /etc/apt/sources.list.d/ceph.list
sudo rm /etc/apt/trusted.gpg.d/ceph.release.gpg
sudo rm /etc/apt/sources.list.d/docker-ce.list
sudo apt-get update
lvremove $(lvdisplay | grep Path | awk '{print $NF}' | grep ceph ) --force --verbose
vgremove $(vgdisplay | grep Name | awk '{print $NF}' | grep ceph ) --force --verbose
wipefs --all /dev/sdb
Delete record from fstab
sed -i '/^.*ceph.*$/d' /etc/fstab ; cat /etc/fstab
Ceph commands
sudo ceph status
sudo ceph orch <start|stop|restart|redeploy|reconfig> <service_name>
sudo ceph orch restart osd
sudo ceph orch resume # CEPHADM_PAUSED: cephadm background work is paused
get pools
sudo ceph osd lspools
output
1 device_health_metrics
2 cephfs_data
3 cephfs_metadata
Delete a pool (and its fs)
systemctl stop ceph-mds.target
killall ceph-mds
ceph mds cluster_down
ceph mds fail 0
ceph fs rm cephfs --yes-i-really-mean-it
ceph osd pool delete cephfs_data cephfs_data --yes-i-really-really-mean-it
ceph osd pool delete cephfs_metadata cephfs_metadata --yes-i-really-really-mean-it
rm -rf "/var/lib/ceph/mds/<cluster name>-<mds id>"
ceph auth del mds."$hostname"
Mon configure
list mon
ceph mon dump
rm mon from node3
ceph mon rm node3
Osd configure
Set min_size for the data pool
ceph osd pool set cephfs_data min_size 1
Osd tree
ceph osd tree | column -t -s $'\t'
Remove osd
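Before running the removal commands below, it is usually safer to mark the OSD out and stop its daemon; with cephadm the daemon is managed by the orchestrator (osd.0 / id 0 are taken from the example below):
ceph osd out 0
ceph orch daemon stop osd.0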
ceph osd crush remove osd.0
ceph auth del osd.0
ceph osd rm 0
Ceph stop cluster
Unmount the share on all nodes, then run on every node
sudo systemctl stop ceph\*.service ceph\*.target
Fix Ceph after a crash
On all nodes
systemctl stop ceph\*.service ceph\*.target
And then restart the cluster target
systemctl restart ceph-11d4341a-dde8-11ea-9cb6-063fcf8afd33.target
Mans
https://docs.ceph.com/docs/master/cephadm/install/
https://medium.com/@balderscape/setting-up-a-virtual-single-node-ceph-storage-cluster-d86d6a6c658e
https://github.com/ceph/ceph/pull/35587/files
https://docs.ceph.com/docs/giant/rados/operations/control/
http://onreader.mdl.ru/LearningCeph/content/Ch02.html#From_zero
https://programming.vip/docs/ceph-delete-add-osd.html
https://docs.ceph.com/docs/mimic/cephfs/fstab/
https://docs.ceph.com/docs/master/cephfs/fs-volumes/