高效实施ClickHouse数据备份
本文介绍ClickHouse数据备份,包括手动实施,利用第三方工具(clickhouse-backup)实现。主要包括创建用户并授权,手动备份、自动备份。
创建用户
在进行备份之前,先确认用户是否拥有相应权限。下面首先看如何创建只读用户:
ClickHouse的系统设置策略(settings profile)是一组设置的集合。缺省提供了readonly策略用于查询,下面是/etc/clickhouse-server/users.xml中定义readonly的内容:
<yandex>
<!-- Profiles of settings. -->
<profiles>
...
<!-- Profile that allows only read queries. -->
<readonly>
<readonly>1</readonly>
</readonly>
</profiles>
...
所以创建只读用户bkuser01,仅需要执行下面SQL:
CREATE USER bkuser01 IDENTIFIED WITH plaintext_password BY 'password0' SETTINGS PROFILE 'readonly';
GRANT SHOW TABLES, SELECT ON database.* TO bkuser01;
// GRANT ALL ON backup.* TO bkuser01 ; 全部权限
使用default用执行上述SQL ,可能会提供没有权限错误。因为默认default用户没有访问管理权限。
增加default管理权限
修改 /etc/clickhouse-server/users.xml , 去掉 default的访问权限注释:
<access_management>1</access_management>
无需重启ClickHouse服务,再次执行上面授权SQL,应该提示成功。后面自动备份示例需要用户拥有特定数据库全部权限。
手动实现
ClickHouse内置通过ALTER TABLE ... FREEZE
特性支持瞬时备份功能。该方法在shadow目录为所有分区创建硬链接,我们知道硬链接只是增加原文件的别名,从而避免双倍占用空间。
执行手动备份之后,需要移动到其他位置,为了以后可以进行恢复:
- 从其他位置恢复备份
- 为每个表拷贝分区备份至detached目录
- 使用
ALTER TABLE ... ATTACH PARTITION
语句恢复每个分区
自动备份
利用第三方工具clickhouse-backup实现自动备份。假设您已经安装了该工具并配置了远程存储。它提供了一些命令,可以轻松实现数据备份,官方文档内容参考:
NAME:
clickhouse-backup - Tool for easy backup of ClickHouse with cloud support
USAGE:
clickhouse-backup [-t, --tables=.
VERSION:
2.2.0
DESCRIPTION:
Run as ‘root’ or ‘clickhouse’ user
COMMANDS:
tables List list of tables, exclude skip_tables
create Create new backup
create_remote Create and upload
upload Upload backup to remote storage
list List list of backups
download Download backup from remote storage
restore Create schema and restore data from backup
restore_remote Download and restore
delete Delete specific backup
default-config List default config
print-config List current config
clean Remove data in ‘shadow’ folder from all path
folders available from system.disks
clean_remote_broken Remove all broken remote backups
watch Run infinite loop which create full + incremental backup sequence to allow efficient backup sequences
server Run API server
help, h Shows a list of commands or help for one command
GLOBAL OPTIONS:
–config FILE, -c FILE Config FILE name. (default: “/etc/clickhouse-backup/config.yml”) [$CLICKHOUSE_BACKUP_CONFIG]
–help, -h show help
–version, -v print the version
为了演示,创建backup数据库、test表:
:) CREATE DATABASE IF NOT EXISTS backup
:) CREATE TABLE IF NOT EXISTS backup.test
(
`date` Datetime,
`user_id` String,
`pageviews` Int32
)
ENGINE = MergeTree()
PARTITION BY toStartOfHour(date)
ORDER BY (date)
插入示例数据:
INSERT INTO backup.test VALUES
('2021-01-13 07:00:00', 'user 1', 7),
('2021-01-13 08:00:00', 'user 2', 3),
('2021-01-13 09:00:00', 'user 3', 1),
('2021-01-13 10:00:00', 'user 4', 12)
利用clickhouse-backup工具查看备份情况:
# clickhouse-backup list
Local backups:
no backups found
Remote backups:
no backups found
查看数据表:
# clickhouse-backup tables
backup.test
创建本地备份:
# clickhouse-backup create
2023/03/23 13:14:43 Create backup '2023-03-23T13-12-43'
2023/03/23 13:14:43 Freeze 'backup.test'
2023/03/23 13:14:43 Copy part hashes
2023/03/23 13:14:43 Writing part hashes
2023/03/23 13:14:43 Copy metadata
2023/03/23 13:14:43 Done.
2023/03/23 13:14:43 Move shadow
2023/03/23 13:14:43 Done.
# clickhouse-backup list
Local backups:
- '2023/03/23T13-12-43' (created at 23-03-2023 13:12:43)
Remote backups:
no backups found
更多内容参考:https://dev.to/jv/backup-and-restore-with-clickhouse-backup-dn9
clickhouse-backup地址:https://github.com/AlexAkulov/clickhouse-backup