Ceph deploy via cephadm
This post describes how to deploy Ceph via cephadm on Ubuntu 20.04 across 3 nodes.
REQUIREMENTS
- 3 nodes, each with an additional drive (in this example /dev/sdb)
- Ubuntu/Debian
- systemd
- Podman or Docker for running containers (see the install sketch after this list)
- Time synchronization (such as chrony or NTP)
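If Docker is not already installed, a minimal install on Ubuntu 20.04 could look like the following (docker.io from the Ubuntu repositories is an assumption; Podman or Docker CE work just as well):
sudo apt-get update
sudo apt-get install -y docker.io   # container runtime used by cephadm
sudo systemctl enable --now docker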
PREPARE
Edit /etc/hosts
192.168.56.101 node1 node1.linuxdemo.local
192.168.56.102 node2 node2.linuxdemo.local
192.168.56.103 node3 node3.linuxdemo.local
Set hostnames (run the matching command on each node)
sudo hostnamectl set-hostname node1
sudo hostnamectl set-hostname node2
sudo hostnamectl set-hostname node3
Hostnames must resolve to the nodes' IPv4 addresses, not to the localhost loopback!
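A quick way to verify this is getent, which consults /etc/hosts the same way the resolver does; each line should show a 192.168.56.x address, not 127.0.x.x:
for h in node1 node2 node3; do getent hosts $h; done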
Install time synchronization
sudo apt-get install ntp ntpdate -y
sudo timedatectl set-ntp on
Tune /etc/sysctl.conf by appending:
kernel.pid_max = 4194303
fs.aio-max-nr = 1048576
and apply the changes:
sudo sysctl -p
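To confirm the new values are active:
sudo sysctl kernel.pid_max fs.aio-max-nr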
Add the ceph-admin user
sudo useradd -d /home/ceph-admin -m ceph-admin -s /bin/bash
sudo passwd ceph-admin
echo "ceph-admin ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/ceph-admin
sudo chmod 0440 /etc/sudoers.d/ceph-admin
Configure SSH
Create an SSH config for the user
su ceph-admin
mkdir -p ~/.ssh && chmod 700 ~/.ssh
tee -a ~/.ssh/config <<EOF
Host *
    UserKnownHostsFile /dev/null
    StrictHostKeyChecking no
    IdentitiesOnly yes
    ConnectTimeout 0
    ServerAliveInterval 300
Host node1
    Hostname node1
    User ceph-admin
    UserKnownHostsFile /dev/null
    StrictHostKeyChecking no
    IdentitiesOnly yes
    ConnectTimeout 0
    ServerAliveInterval 300
Host node2
    Hostname node2
    User ceph-admin
    UserKnownHostsFile /dev/null
    StrictHostKeyChecking no
    IdentitiesOnly yes
    ConnectTimeout 0
    ServerAliveInterval 300
Host node3
    Hostname node3
    User ceph-admin
    UserKnownHostsFile /dev/null
    StrictHostKeyChecking no
    IdentitiesOnly yes
    ConnectTimeout 0
    ServerAliveInterval 300
EOF
Generate a key and copy it to all nodes
su ceph-admin
ssh-keygen
ssh-copy-id ceph-admin@node1 && ssh-copy-id ceph-admin@node2 && ssh-copy-id ceph-admin@node3
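To check that passwordless login works from node1 (each command should print the remote hostname without prompting for a password):
for h in node1 node2 node3; do ssh $h hostname; done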
Ceph initial deploy
From node1 (192.168.56.101)
su ceph-admin
sudo curl --silent --location https://github.com/ceph/ceph/raw/octopus/src/cephadm/cephadm -o /usr/local/bin/cephadm
sudo chmod +x /usr/local/bin/cephadm
Remove any stale Ceph repo files (if present):
sudo rm -f /etc/apt/sources.list.d/ceph.list
sudo rm -f /etc/apt/trusted.gpg.d/ceph.release.gpg
sudo /usr/local/bin/cephadm add-repo --release octopus
sudo /usr/local/bin/cephadm install
if [[ ! -d "/etc/ceph" ]]; then sudo mkdir -p /etc/ceph;fi
sudo /usr/local/bin/cephadm bootstrap --mon-ip 192.168.56.101 --initial-dashboard-user itclife --initial-dashboard-password itclife --dashboard-password-noupdate
The output will look like this:
INFO:cephadm:Ceph Dashboard is now available at:
URL: https://vm-1:8443/
User: admin
Password: fv8b98gzeb
INFO:cephadm:You can access the Ceph CLI with:
sudo /usr/local/bin/cephadm shell --fsid a3f8eb22-edbc-11ea-a98b-09a89b8358a4 -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring
INFO:cephadm:Please consider enabling telemetry to help improve Ceph:
ceph telemetry on
For more information see:
https://docs.ceph.com/docs/master/mgr/telemetry/
INFO:cephadm:Bootstrap complete.
Log in to the cephadm shell (this drops us into the Ceph container)
sudo /usr/local/bin/cephadm shell --fsid a3f8eb22-edbc-11ea-a98b-09a89b8358a4 -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring
Exit the container
exit
Install the Ceph packages on the other cluster machines so they can mount Ceph (node2, node3)
sudo cephadm install ceph-common
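To confirm the CLI is available after the install:
ceph -v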
Copy the cluster's public key to the root user on the other nodes
ssh-copy-id -f -i /etc/ceph/ceph.pub root@node2
ssh-copy-id -f -i /etc/ceph/ceph.pub root@node3
or
cat /etc/ceph/ceph.pub
and copy the key into /root/.ssh/authorized_keys on node2 and node3
Add Ceph nodes via orch (orchestrator)
sudo ceph orch host add node2
sudo ceph orch host add node3
Set the public network for our Ceph cluster (it must match the node addresses above)
sudo ceph config set mon public_network 192.168.56.0/24
Deploy monitors on a specific set of hosts:
sudo ceph orch apply mon "node1,node2,node3"
or, to adjust the default number of monitors (5):
sudo ceph orch apply mon 5
Add the mon label to the hosts
sudo ceph orch host label add node1 mon
sudo ceph orch host label add node2 mon
sudo ceph orch host label add node3 mon
List hosts
ceph orch host ls
Ceph create osd
Ceph list osds
sudo ceph osd tree | column -t -s $'\t'
Run on each node to wipe the drives (optional)
sudo wipefs --all /dev/sdb
Add OSDs on the hosts
sudo ceph orch daemon add osd node1:/dev/sdb
sudo ceph orch daemon add osd node2:/dev/sdb
sudo ceph orch daemon add osd node3:/dev/sdb
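The orchestrator can also list which devices it considers available, and the new OSDs should show up in the cluster status shortly afterwards:
sudo ceph orch device ls
sudo ceph -s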
Ceph create fs and mds, and set placement groups (version 1)
Create fs with name volume1
sudo ceph fs volume create volume1
Create 3 MDS daemons for the fs named volume1
sudo ceph orch apply mds volume1 3
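To check that the MDS daemons were scheduled by the orchestrator:
sudo ceph orch ps | grep mds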
When we create the fs named volume1, its pools are created automatically
List pools
sudo ceph osd lspools
output
1 device_health_metrics
2 cephfs.volume1.meta
3 cephfs.volume1.data
SET THE NUMBER OF PLACEMENT GROUPS
Get the current values
ceph osd pool get cephfs.volume1.data pg_num
ceph osd pool get cephfs.volume1.meta pgp_num
and set them
sudo ceph osd pool set cephfs.volume1.data pg_num 128
sudo ceph osd pool set cephfs.volume1.data pgp_num 128
You must choose the pg_num value yourself because it cannot be calculated automatically. Here are a few commonly used values:
Less than 5 OSDs: set pg_num to 128
Between 5 and 10 OSDs: set pg_num to 512
Between 10 and 50 OSDs: set pg_num to 4096
If you have more than 50 OSDs, you need to understand the tradeoffs and calculate the pg_num value yourself.
For calculating pg_num yourself, the pgcalc tool can help; a rough rule-of-thumb calculation is sketched below.
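A worked example of the usual rule of thumb (about 100 PGs per OSD, divided by the replica count, rounded up to a power of two) for this 3-OSD, 3-replica cluster:
osds=3; replicas=3
target=$(( osds * 100 / replicas ))                    # (3 * 100) / 3 = 100
pg=1; while [ $pg -lt $target ]; do pg=$(( pg * 2 )); done
echo "suggested pg_num: $pg"                           # prints 128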
List fs
sudo ceph mds stat
sudo ceph fs ls
Ceph create fs and mds (version 2)
Create fs with name cephfs
sudo ceph osd pool create cephfs_data 64 64
sudo ceph osd pool create cephfs_metadata 64 64
sudo ceph fs new cephfs cephfs_metadata cephfs_data
sudo ceph orch apply mds cephfs 3
Where:
- {pool-name} is the name of the Ceph pool you are creating.
- pg_num is the total number of placement groups for the pool. It is recommended to set pg_num to 128 when using fewer than 5 OSDs in your Ceph cluster.
- pgp_num is the total number of placement groups for placement purposes; it should be equal to pg_num.
# By default, Ceph makes 3 replicas of objects. If you want to make four
# copies of an object (a primary copy and three replica copies), reset the
# default value as shown in 'osd pool default size'.
# If you want to allow Ceph to write a lesser number of copies in a degraded
# state, set 'osd pool default min size' to a number less than the
# 'osd pool default size' value.
osd pool default size = 4 # Write an object 4 times.
osd pool default min size = 1 # Allow writing one copy in a degraded state.
# Ensure you have a realistic number of placement groups. We recommend
# approximately 100 per OSD. E.g., total number of OSDs multiplied by 100
# divided by the number of replicas (i.e., osd pool default size). So for
# 10 OSDs and osd pool default size = 4, we'd recommend approximately
# (100 * 10) / 4 = 250.
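These are ceph.conf defaults; to adjust replication on the pools created above at runtime instead, the per-pool settings can be changed directly (the values here are illustrative):
sudo ceph osd pool set cephfs_data size 3
sudo ceph osd pool set cephfs_data min_size 2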
Ceph mount cephfs
Create credentials
For the volume named volume1
sudo ceph fs authorize volume1 client.user / rw | sudo tee /etc/ceph/ceph.client.user.keyring
sudo chmod 600 /etc/ceph/ceph.client.user.keyring
sudo cat /etc/ceph/ceph.client.user.keyring
Output
[client.user]
key = AQDSxDNfzpmkIhAAGyKz2EIqZsaVrXWP0ZNRAQ==
Mount via shell
sudo mkdir /mnt/cephfs
sudo mount -t ceph node3:/ /mnt/cephfs -o name=user,secret=AQDSxDNfzpmkIhAAGyKz2EIqZsaVrXWP0ZNRAQ==
cd /mnt/cephfs
mount via fstab
192.168.56.101,192.168.56.102,192.168.56.103:/ /mnt/cephfs ceph name=user,secret=AQDSxDNfzpmkIhAAGyKz2EIqZsaVrXWP0ZNRAQ==,noatime,_netdev 0 2
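To keep the key out of /etc/fstab, mount.ceph also supports a secretfile option; a variant of the line above (the /etc/ceph/user.secret path is an assumption):
sudo sh -c 'echo AQDSxDNfzpmkIhAAGyKz2EIqZsaVrXWP0ZNRAQ== > /etc/ceph/user.secret && chmod 600 /etc/ceph/user.secret'
192.168.56.101,192.168.56.102,192.168.56.103:/ /mnt/cephfs ceph name=user,secretfile=/etc/ceph/user.secret,noatime,_netdev 0 2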
FAQ
Deploy a Ceph client (using the legacy ceph-deploy tool)
ceph-deploy install ceph-client
Ceph test
cd /mnt/cephfs
sudo dd if=/dev/zero of=./test3.img bs=20M count=30
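Afterwards, cluster usage can be checked with:
sudo ceph df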
Ceph repair
sudo ceph osd crush rule dump # Show the current CRUSH rules
sudo ceph osd getcrushmap -o comp_crush_map.cm # Get the crush map
crushtool -d comp_crush_map.cm -o crush_map.cm # Decompile the map
vim crush_map.cm # Make any changes you need (host -> osd)
crushtool -c crush_map.cm -o new_crush_map.cm # Compile map
sudo ceph osd setcrushmap -i new_crush_map.cm # Load the new map
Clean all Ceph packages, wipe data, and remove the fstab entry
cd /etc/systemd/system
rm -rf $(ls | grep ceph)
systemctl daemon-reload
docker rm -f $(docker ps | grep ceph | awk '{print $NF}')
apt-get purge ceph-* -y -f; apt-get autoremove -y
rm -rf /var/lib/ceph/*
sudo rm /etc/apt/sources.list.d/ceph.list
sudo rm /etc/apt/trusted.gpg.d/ceph.release.gpg
sudo rm /etc/apt/sources.list.d/docker-ce.list
sudo apt-get update
lvremove $(lvdisplay | grep Path | awk '{print $NF}' | grep ceph ) --force --verbose
vgremove $(vgdisplay | grep Name | awk '{print $NF}' | grep ceph ) --force --verbose
wipefs --all /dev/sdb
Delete record from fstab
sed -i '/^.*ceph.*$/d' /etc/fstab ; cat /etc/fstab
Ceph commands
sudo ceph status
sudo ceph orch <start|stop|restart|redeploy|reconfig> <service_name>
sudo ceph orch restart osd
sudo ceph orch resume # CEPHADM_PAUSED: cephadm background work is paused
get pools
sudo ceph osd lspools
output
1 device_health_metrics
2 cephfs_data
3 cephfs_metadata
Delete a pool (and its fs)
systemctl stop ceph-mds.target
killall ceph-mds
ceph mds cluster_down
ceph mds fail 0
ceph fs rm cephfs --yes-i-really-mean-it
ceph osd pool delete cephfs_data cephfs_data --yes-i-really-really-mean-it
ceph osd pool delete cephfs_metadata cephfs_metadata --yes-i-really-really-mean-it
rm -rf "/var/lib/ceph/mds/<cluster name>-<mds id>"
ceph auth del mds."$hostname"
Mon configure
list mon
ceph mon dump
rm mon from node3
ceph mon rm node3
Osd configure
Set min_size for the data pool
ceph osd pool set cephfs_data min_size 1
Osd tree
ceph osd tree | column -t -s $'\t'
Remove osd
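Before running the removal commands below, it is usually safer to mark the OSD out and stop its daemon; with cephadm the daemon is managed by the orchestrator (osd.0 / id 0 are taken from the example below):
ceph osd out 0
ceph orch daemon stop osd.0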
ceph osd crush remove osd.0
ceph auth del osd.0
ceph osd rm 0
Ceph stop cluster
Unmount the share on all nodes, then run on every node
sudo systemctl stop ceph\*.service ceph\*.target
Fix Ceph after a crash
On all nodes
systemctl stop ceph\*.service ceph\*.target
And then restart the cluster target
systemctl restart ceph-11d4341a-dde8-11ea-9cb6-063fcf8afd33.target
Mans
https://docs.ceph.com/docs/master/cephadm/install/
https://medium.com/@balderscape/setting-up-a-virtual-single-node-ceph-storage-cluster-d86d6a6c658e
https://github.com/ceph/ceph/pull/35587/files
https://docs.ceph.com/docs/giant/rados/operations/control/
http://onreader.mdl.ru/LearningCeph/content/Ch02.html#From_zero
https://programming.vip/docs/ceph-delete-add-osd.html
https://docs.ceph.com/docs/mimic/cephfs/fstab/
https://docs.ceph.com/docs/master/cephfs/fs-volumes/