oceanbase / ob-operator

Kubernetes operator for OceanBase

Home Page: https://oceanbase.github.io/ob-operator/

License: Other

Languages: Go 61.42%, TypeScript 35.41%, MDX 1.12%, Shell 0.63%, Makefile 0.54%, Less 0.29%, CSS 0.19%, Smarty 0.19%, JavaScript 0.11%, HCL 0.06%, Dockerfile 0.03%, HTML 0.01%
Topics: distributed-database, kubernetes, kubernetes-operator, oceanbase, cloudnative, mysql, sql

ob-operator's Introduction

OceanBase Logo


Join Slack Stack Overflow

English | 中文版

OceanBase Database is a distributed relational database developed entirely by Ant Group. It is built on clusters of commodity servers. Based on the Paxos protocol and its distributed architecture, OceanBase Database provides high availability and linear scalability, and it does not depend on any specific hardware architecture.

Key features

  • Transparent Scalability: a single cluster can scale to 1,500 nodes, petabytes of data, and a trillion rows of records.
  • Ultra-fast Performance: TPC-C 707 million tpmC and TPC-H 15.26 million QphH @30,000GB.
  • Cost Efficiency: saves 70%–90% of storage costs.
  • Real-time Analytics: supports HTAP without additional cost.
  • Continuous Availability: RPO = 0 (zero data loss) and RTO < 8 s (recovery time).
  • MySQL Compatible: easy migration from MySQL databases.

See also key features for more details.

Quick start

See also Quick experience or Quick Start (Simplified Chinese) for more details.

🔥 Start with all-in-one

You can quickly deploy a standalone OceanBase Database for a hands-on trial with the following commands:

Note: Linux Only

# download and install all-in-one package (internet connection is required)
bash -c "$(curl -s https://obbusiness-private.oss-cn-shanghai.aliyuncs.com/download-center/opensource/oceanbase-all-in-one/installer.sh)"
source ~/.oceanbase-all-in-one/bin/env.sh

# quickly deploy OceanBase database
obd demo
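
After the demo cluster is up, you can connect to it with the obclient shipped in the all-in-one package (a quick sketch; the port 2881 and the root@sys user follow the obd demo defaults and may differ in your deployment):

# connect to the sys tenant of the demo cluster
obclient -h127.0.0.1 -P2881 -uroot@sys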

🐳 Start with docker

Note: We provide images on Docker Hub, quay.io and ghcr.io. If you have problems pulling images from Docker Hub, please try the other two registries.

  1. Start an OceanBase Database instance:

    # Deploy a mini standalone instance.
    docker run -p 2881:2881 --name oceanbase-ce -e MODE=mini -d oceanbase/oceanbase-ce
    
    # Deploy a mini standalone instance using image from quay.io.
    # docker run -p 2881:2881 --name oceanbase-ce -e MODE=mini -d quay.io/oceanbase/oceanbase-ce
    
    # Deploy a mini standalone instance using image from ghcr.io.
    # docker run -p 2881:2881 --name oceanbase-ce -e MODE=mini -d ghcr.io/oceanbase/oceanbase-ce
  2. Connect to the OceanBase Database instance:

    docker exec -it oceanbase-ce obclient -h127.0.0.1 -P2881 -uroot # Connect to the root user of the sys tenant.

See also Docker Readme for more details.

☸️ Start with Kubernetes

You can quickly deploy and manage OceanBase Database instances in a Kubernetes cluster with ob-operator. Refer to the document Quick Start for ob-operator for details.
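
For reference, a minimal installation sketch (the Helm repository URL, chart name, and namespace below follow the ob-operator Quick Start at the time of writing and may change; treat them as assumptions and prefer the linked document):

# install ob-operator with Helm
helm repo add ob-operator https://oceanbase.github.io/ob-operator/
helm repo update
helm install ob-operator ob-operator/ob-operator --namespace=oceanbase-system --create-namespace

# then create an OBCluster resource from a manifest
# kubectl apply -f obcluster.yaml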

👨‍💻 Start developing

See OceanBase Developer Document to learn how to compile and deploy a manually compiled observer.

Roadmap

For future plans, see Product Iteration Progress. See also OceanBase Roadmap for more details.

Case study

OceanBase has served more than 1,000 customers across industries, including financial services, telecom, retail, and the Internet, helping them upgrade their databases.

See also success stories and Who is using OceanBase for more details.

System architecture

Introduction to system architecture

Contributing

Contributions are highly appreciated. Read the development guide to get started.

License

OceanBase Database is licensed under the Mulan Public License, Version 2. See the LICENSE file for more info.

Community

Join the OceanBase community via:

ob-operator's People

Contributors

70data, chengjoey, chris-sun-star, gidi233, github-actions[bot], hzhovo, liuxinbot, lizzy-0323, ob-operator-bot, pan-ziyue, powerfooi, skytreedelivery, whhe, yang1666204


ob-operator's Issues

[Feature]: Support table-level data restore

Describe your use case

OceanBase v4.2.1 and later supports table-level data restore. ob-operator should support it as well.

Describe the solution you'd like

Add a new operation type, such as RestoreTables, to the OBTenantOperation resource, and let the resourceManager of OBTenantOperation handle all the related tasks, as sketched below.
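
A purely hypothetical manifest sketch (this operation type does not exist yet; the type value and the restoreTables fields are illustrative assumptions, not the actual API):

apiVersion: oceanbase.oceanbase.com/v1alpha1
kind: OBTenantOperation
metadata:
  name: op-restore-tables
  namespace: oceanbase
spec:
  type: RestoreTables            # hypothetical new operation type
  restoreTables:                 # hypothetical field describing what to restore
    tenant: t1
    tables:
      - db1.table1
      - db1.table2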

Describe alternatives you've considered

No response

Additional context

No response

[Enhancement]: Make PVCs bound to an OBServer pod independent

Description

PVCs that are bound to an OBServer pod are owned by the pod, which leads to cascading deletion when the pod is deleted. However, the PVs created from these PVCs usually contain important non-volatile user data that must not be deleted along with the pod. There should be a way to avoid this cascading deletion when the pod is deleted; in other words, the PVCs should be made independent.
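
As a manual workaround sketch (not the proposed fix), the owner reference that triggers the cascading deletion can be stripped from a PVC with a JSON patch; the names are placeholders:

# detach the PVC from its owning pod so that it survives pod deletion
kubectl -n <namespace> patch pvc <pvc-name> --type=json -p '[{"op": "remove", "path": "/metadata/ownerReferences"}]'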

After manually deleting the primary node's pod (whose name starts with sapp) with kubectl delete pod, the pod cannot recover automatically

After manually deleting the pod, I went into the container and checked observer.log; it turns out that the observer fails to start. Is this a problem caused by an incorrect configuration?
[2022-05-22 21:00:27.356380] INFO [SERVER] main.cpp:508 [51][0][Y0-0000000000000000] [lt=44] [dc=0] observer is exit(observer_version="OceanBase CE 3.1.3")

The ERROR-level logs are as follows:
[2022-05-22 21:00:24.009866] ERROR [SERVER] init_storage (ob_server.cpp:1692) [52][0][Y0-0000000000000000] [lt=2] [dc=0] init partition service fail(ret=-4009, storage_env_={data_dir:"/home/admin/oceanbase/store/", default_block_size:2097152, disk_avail_space:25769803776, datafile_disk_percentage:90, redundancy_level:1, log_spec:{log_dir:"/home/admin/oceanbase/store//slog", max_log_size:268435456, log_sync_type:0}, clog_dir:"/home/admin/oceanbase/store//clog", ilog_dir:"/home/admin/oceanbase/store//ilog", clog_shm_path:"/home/admin/oceanbase/store//clog_shm", ilog_shm_path:"/home/admin/oceanbase/store//ilog_shm", index_cache_priority:10, user_block_cache_priority:1, user_row_cache_priority:1, fuse_row_cache_priority:1, bf_cache_priority:1, clog_cache_priority:1, index_clog_cache_priority:1, bf_cache_miss_count_threshold:100, ethernet_speed:1310720000}) BACKTRACE:0x99bf8ce 0x97535e1 0x22af46f 0x22af0bb 0x22aee82 0x8056e7b 0x93345be 0x932a007 0x2270dd5 0x7f3a6a0ff493 0x226faae
[2022-05-22 21:00:25.282121] ERROR [SERVER] main (main.cpp:494) [52][0][Y0-0000000000000000] [lt=4] [dc=0] observer init fail(ret=-4009) BACKTRACE:0x99bf8ce 0x97535e1 0x2274784 0x227426b 0x2273fd1 0x2272c68 0x22710da 0x7f3a6a0ff493 0x226faae

The WARN-level logs are as follows:
$ grep WARN observer.log.20220522
[2022-05-22 21:00:23.162562] WARN [STORAGE.TRANS] getClock (ob_clock_generator.h:65) [52][0][Y0-0000000000000000] [lt=21] [dc=0] clock generator not inited
[2022-05-22 21:00:23.173137] WARN [COMMON] load_from_file (ob_io_benchmark.cpp:240) [52][0][Y0-0000000000000000] [lt=20] [dc=0] the io bench result file not exist!
[2022-05-22 21:00:23.173159] WARN [COMMON] init (ob_io_benchmark.cpp:782) [52][0][Y0-0000000000000000] [lt=19] [dc=0] Fail to load io benchmark result, (ret=-4027)
[2022-05-22 21:00:23.173166] WARN [SERVER] init_io (ob_server.cpp:1088) [52][0][Y0-0000000000000000] [lt=6] [dc=0] init io benchmark fail, (ret=-4027)
[2022-05-22 21:00:23.173221] WARN [SERVER.OMT] init_cgroup_root_dir_ (ob_cgroup_ctrl.cpp:540) [52][0][Y0-0000000000000000] [lt=5] [dc=0] no cgroup directory found. disable cgroup support(cgroup_path="cgroup", ret=-4027)
[2022-05-22 21:00:23.173246] WARN [SERVER.OMT] init (ob_cgroup_ctrl.cpp:52) [52][0][Y0-0000000000000000] [lt=20] [dc=0] init cgroup dir failed(ret=-4027, root_cgroup_="cgroup")
[2022-05-22 21:00:23.173256] WARN [COMMON] get_instance (memory_dump.cpp:91) [52][0][Y0-0000000000000000] [lt=6] [dc=0] memory dump not init
[2022-05-22 21:00:23.446586] WARN [LIB] init (ob_tsc_timestamp.cpp:42) [52][0][Y0-0000000000000000] [lt=14] [dc=0] invariant TSC not support(ret=-4007)
[2022-05-22 21:00:23.518720] WARN [COMMON] get_all_tenant_id (ob_tenant_mgr.cpp:582) [101][96][Y0-0000000000000000] [lt=14] [dc=0] tenant manager not init(ret=-4006)
[2022-05-22 21:00:23.518749] WARN [COMMON] compute_tenant_wash_size (ob_kvcache_store.cpp:659) [101][96][Y0-0000000000000000] [lt=27] [dc=0] Fail to get all tenant ids, (ret=-4006)
[2022-05-22 21:00:23.532934] WARN [SERVER] get_network_speed_from_config_file (ob_server.cpp:1795) [52][0][Y0-0000000000000000] [lt=3] [dc=0] NIC Config file doesn't exist, auto detecting(nic_rate_path="etc/nic.rate.config", ret=-4027)
[2022-05-22 21:00:23.587680] WARN [STORAGE] inner_get_super_block_version (ob_local_file_system.cpp:877) [52][0][Y0-0000000000000000] [lt=19] [dc=0] read superblock error.(ret=-4009, offset=0, read_size=0, errno=2, super_block_buf_holder={buf:0x7f3a0d405000, len:65536}, fd={fd:290, disk_id:{disk_idx:0, install_seq:0}}, errmsg="No such file or directory")
[2022-05-22 21:00:23.587709] WARN [STORAGE] get_super_block_version (ob_local_file_system.cpp:851) [52][0][Y0-0000000000000000] [lt=5] [dc=0] fail to get super block version from master(ret=-4009)
[2022-05-22 21:00:23.587912] WARN [STORAGE] inner_get_super_block_version (ob_local_file_system.cpp:877) [52][0][Y0-0000000000000000] [lt=4] [dc=0] read superblock error.(ret=-4009, offset=2097152, read_size=0, errno=2, super_block_buf_holder={buf:0x7f3a0d405000, len:65536}, fd={fd:290, disk_id:{disk_idx:0, install_seq:0}}, errmsg="No such file or directory")
[2022-05-22 21:00:23.587930] WARN [STORAGE] get_super_block_version (ob_local_file_system.cpp:853) [52][0][Y0-0000000000000000] [lt=3] [dc=0] fail to get super block version from backup(ret=-4009)
[2022-05-22 21:00:23.587934] WARN [STORAGE] init (ob_local_file_system.cpp:296) [52][0][Y0-0000000000000000] [lt=3] [dc=0] fail to get super block version(ret=-4009)
[2022-05-22 21:00:23.587966] WARN [STORAGE] init (ob_store_file_system.cpp:830) [52][0][Y0-0000000000000000] [lt=14] [dc=0] fail to init store file system(ret=-4009, storage_env={data_dir:"/home/admin/oceanbase/store/", default_block_size:2097152, disk_avail_space:25769803776, datafile_disk_percentage:90, redundancy_level:1, log_spec:{log_dir:"/home/admin/oceanbase/store//slog", max_log_size:268435456, log_sync_type:0}, clog_dir:"/home/admin/oceanbase/store//clog", ilog_dir:"/home/admin/oceanbase/store//ilog", clog_shm_path:"/home/admin/oceanbase/store//clog_shm", ilog_shm_path:"/home/admin/oceanbase/store//ilog_shm", index_cache_priority:10, user_block_cache_priority:1, user_row_cache_priority:1, fuse_row_cache_priority:1, bf_cache_priority:1, clog_cache_priority:1, index_clog_cache_priority:1, bf_cache_miss_count_threshold:100, ethernet_speed:1310720000})
[2022-05-22 21:00:23.587999] WARN [STORAGE] init (ob_partition_service.cpp:300) [52][0][Y0-0000000000000000] [lt=9] [dc=0] init store file system failed.(ret=-4009, env={data_dir:"/home/admin/oceanbase/store/", default_block_size:2097152, disk_avail_space:25769803776, datafile_disk_percentage:90, redundancy_level:1, log_spec:{log_dir:"/home/admin/oceanbase/store//slog", max_log_size:268435456, log_sync_type:0}, clog_dir:"/home/admin/oceanbase/store//clog", ilog_dir:"/home/admin/oceanbase/store//ilog", clog_shm_path:"/home/admin/oceanbase/store//clog_shm", ilog_shm_path:"/home/admin/oceanbase/store//ilog_shm", index_cache_priority:10, user_block_cache_priority:1, user_row_cache_priority:1, fuse_row_cache_priority:1, bf_cache_priority:1, clog_cache_priority:1, index_clog_cache_priority:1, bf_cache_miss_count_threshold:100, ethernet_speed:1310720000})
[2022-05-22 21:00:23.588272] WARN begin (ob_hashtable.h:686) [52][0][Y0-0000000000000000] [lt=9] [dc=0] hashtable not init, backtrace=0x99bf8ce 0x85488c4 0x850a488 0x8495166 0x7deea8b 0x7dde4cd 0x9334391 0x932a007 0x2270dd5 0x7f3a6a0ff493 0x226faae
[2022-05-22 21:00:23.588318] WARN begin (ob_hashtable.h:686) [52][0][Y0-0000000000000000] [lt=10] [dc=0] hashtable not init, backtrace=0x99bf8ce 0x8541f1c 0x8486b95 0x8495180 0x7deea8b 0x7dde4cd 0x9334391 0x932a007 0x2270dd5 0x7f3a6a0ff493 0x226faae
[2022-05-22 21:00:23.735353] WARN [COMMON] get_all_tenant_id (ob_tenant_mgr.cpp:582) [101][96][Y0-0000000000000000] [lt=22] [dc=0] tenant manager not init(ret=-4006)
[2022-05-22 21:00:23.735379] WARN [COMMON] compute_tenant_wash_size (ob_kvcache_store.cpp:659) [101][96][Y0-0000000000000000] [lt=24] [dc=0] Fail to get all tenant ids, (ret=-4006)
[2022-05-22 21:00:23.951567] WARN [COMMON] get_all_tenant_id (ob_tenant_mgr.cpp:582) [101][96][Y0-0000000000000000] [lt=4] [dc=0] tenant manager not init(ret=-4006)
[2022-05-22 21:00:23.951590] WARN [COMMON] compute_tenant_wash_size (ob_kvcache_store.cpp:659) [101][96][Y0-0000000000000000] [lt=21] [dc=0] Fail to get all tenant ids, (ret=-4006)
[2022-05-22 21:00:24.009931] WARN [SERVER] init (ob_server.cpp:296) [52][0][Y0-0000000000000000] [lt=62] [dc=0] init storage fail(ret=-4009)
[2022-05-22 21:00:24.009953] WARN [SERVER] destroy (ob_server.cpp:387) [52][0][Y0-0000000000000000] [lt=4] [dc=0] destroy observer begin
[2022-05-22 21:00:24.009984] WARN [SERVER] destroy (ob_server.cpp:390) [52][0][Y0-0000000000000000] [lt=11] [dc=0] backup info destroyed
[2022-05-22 21:00:24.009995] WARN [SERVER] destroy (ob_server.cpp:392) [52][0][Y0-0000000000000000] [lt=3] [dc=0] ObBackupDestDetector destroyed
[2022-05-22 21:00:24.010000] WARN [SERVER] destroy (ob_server.cpp:394) [52][0][Y0-0000000000000000] [lt=3] [dc=0] backup file lock mgr detroy
[2022-05-22 21:00:24.011399] WARN [SERVER] destroy (ob_server.cpp:398) [52][0][Y0-0000000000000000] [lt=9] [dc=0] timer destroyed
[2022-05-22 21:00:24.012528] WARN [SERVER] destroy (ob_server.cpp:400) [52][0][Y0-0000000000000000] [lt=5] [dc=0] freeze timer destroyed
[2022-05-22 21:00:24.013629] WARN [SERVER] destroy (ob_server.cpp:402) [52][0][Y0-0000000000000000] [lt=8] [dc=0] sql memory manager timer destroyed
[2022-05-22 21:00:24.014561] WARN [SERVER] destroy (ob_server.cpp:404) [52][0][Y0-0000000000000000] [lt=6] [dc=0] server trace timer destroyed
[2022-05-22 21:00:24.094082] WARN [SHARE] add_event (ob_event_history_table_operator.h:433) [52][0][Y0-0000000000000000] [lt=3] [dc=0] not init(ret=-4006)
[2022-05-22 21:00:24.094103] WARN [SHARE] stop (ob_reentrant_thread.cpp:106) [52][0][Y0-0000000000000000] [lt=3] [dc=0] not init(ret=-4006)
[2022-05-22 21:00:24.094111] WARN [SHARE] stop (ob_reentrant_thread.cpp:106) [52][0][Y0-0000000000000000] [lt=2] [dc=0] not init(ret=-4006)
[2022-05-22 21:00:24.094116] WARN [SHARE] stop (ob_reentrant_thread.cpp:106) [52][0][Y0-0000000000000000] [lt=1] [dc=0] not init(ret=-4006)
[2022-05-22 21:00:24.094121] WARN [SHARE] stop (ob_reentrant_thread.cpp:106) [52][0][Y0-0000000000000000] [lt=2] [dc=0] not init(ret=-4006)
[2022-05-22 21:00:24.094127] WARN [SHARE] stop (ob_reentrant_thread.cpp:106) [52][0][Y0-0000000000000000] [lt=1] [dc=0] not init(ret=-4006)
[2022-05-22 21:00:24.094132] WARN [SHARE] stop (ob_reentrant_thread.cpp:106) [52][0][Y0-0000000000000000] [lt=2] [dc=0] not init(ret=-4006)
[2022-05-22 21:00:24.094137] WARN [SHARE] stop (ob_reentrant_thread.cpp:106) [52][0][Y0-0000000000000000] [lt=2] [dc=0] not init(ret=-4006)
[2022-05-22 21:00:24.094180] WARN [SHARE] wait (ob_reentrant_thread.cpp:123) [52][0][Y0-0000000000000000] [lt=4] [dc=0] not init(ret=-4006)
[2022-05-22 21:00:24.094202] WARN [SHARE] wait (ob_reentrant_thread.cpp:123) [52][0][Y0-0000000000000000] [lt=4] [dc=0] not init(ret=-4006)
[2022-05-22 21:00:24.094205] WARN [SHARE] stop (ob_reentrant_thread.cpp:106) [165][224][Y0-0000000000000000] [lt=5] [dc=0] not init(ret=-4006)
[2022-05-22 21:00:24.094211] WARN [SHARE] wait (ob_reentrant_thread.cpp:123) [52][0][Y0-0000000000000000] [lt=2] [dc=0] not init(ret=-4006)
[2022-05-22 21:00:24.094212] WARN [SHARE] stop (ob_reentrant_thread.cpp:106) [165][224][Y0-0000000000000000] [lt=2] [dc=0] not init(ret=-4006)
[2022-05-22 21:00:24.094218] WARN [SHARE] stop (ob_reentrant_thread.cpp:106) [165][224][Y0-0000000000000000] [lt=2] [dc=0] not init(ret=-4006)
[2022-05-22 21:00:24.094220] WARN [SHARE] wait (ob_reentrant_thread.cpp:123) [52][0][Y0-0000000000000000] [lt=3] [dc=0] not init(ret=-4006)
[2022-05-22 21:00:24.094224] WARN [SHARE] stop (ob_reentrant_thread.cpp:106) [165][224][Y0-0000000000000000] [lt=2] [dc=0] not init(ret=-4006)
[2022-05-22 21:00:24.094228] WARN [SHARE] wait (ob_reentrant_thread.cpp:123) [52][0][Y0-0000000000000000] [lt=3] [dc=0] not init(ret=-4006)
[2022-05-22 21:00:24.094229] WARN [SHARE] stop (ob_reentrant_thread.cpp:106) [165][224][Y0-0000000000000000] [lt=2] [dc=0] not init(ret=-4006)
[2022-05-22 21:00:24.094235] WARN [SHARE] stop (ob_reentrant_thread.cpp:106) [165][224][Y0-0000000000000000] [lt=2] [dc=0] not init(ret=-4006)
[2022-05-22 21:00:24.094236] WARN [SHARE] wait (ob_reentrant_thread.cpp:123) [52][0][Y0-0000000000000000] [lt=2] [dc=0] not init(ret=-4006)
[2022-05-22 21:00:24.094240] WARN [SHARE] stop (ob_reentrant_thread.cpp:106) [165][224][Y0-0000000000000000] [lt=3] [dc=0] not init(ret=-4006)
[2022-05-22 21:00:24.094245] WARN [SHARE] wait (ob_reentrant_thread.cpp:123) [52][0][Y0-0000000000000000] [lt=2] [dc=0] not init(ret=-4006)
[2022-05-22 21:00:24.094250] WARN [SHARE] wait (ob_reentrant_thread.cpp:123) [165][224][Y0-0000000000000000] [lt=2] [dc=0] not init(ret=-4006)
[2022-05-22 21:00:24.094410] WARN [STORAGE.TRANS] getClock (ob_clock_generator.h:65) [165][224][Y0-0000000000000000] [lt=3] [dc=0] clock generator not inited
[2022-05-22 21:00:24.095010] WARN [STORAGE.TRANS] getClock (ob_clock_generator.h:65) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=20] [dc=0] clock generator not inited
[2022-05-22 21:00:24.095033] WARN [STORAGE.TRANS] getClock (ob_clock_generator.h:65) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=20] [dc=0] clock generator not inited
[2022-05-22 21:00:24.096257] WARN [SQL.RESV] resolve_table_relation_recursively (ob_dml_resolver.cpp:6639) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=2] [dc=0] synonym not exist(tenant_id=1, database_id=1099511627777, table_name=__all_tenant_backup_info, ret=-5019)
[2022-05-22 21:00:24.096265] WARN [SQL.RESV] resolve_table_relation_factor_normal (ob_dml_resolver.cpp:6515) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=6] [dc=0] fail to resolve table relation recursively(tenant_id=1, ret=-5019)
[2022-05-22 21:00:24.096275] WARN [SQL.RESV] resolve_table_relation_factor (ob_dml_resolver.cpp:6295) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=9] [dc=0] resolve table relation factor failed(ret=-5019)
[2022-05-22 21:00:24.096285] WARN [SHARE.SCHEMA] get_dblink_user (ob_schema_getter_guard.cpp:6554) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=4] [dc=0] dblink user name is empty(tenant_id=1, dblink_name=__all_tenant_backup_info)
[2022-05-22 21:00:24.096295] WARN [SQL.RESV] get_dblink_user (ob_schema_checker.cpp:1303) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=7] [dc=0] failed to get dblink id(ret=-4016, tenant_id=1, dblink_name=all_tenant_backup_info)
[2022-05-22 21:00:24.096299] WARN [SQL.RESV] resolve_dblink_with_synonym (ob_dml_resolver.cpp:6339) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=3] [dc=0] fail to exec schema_checker
->get_dblink_user(tenant_id, tmp_table_name, tmp_db_name, *allocator
)(ret=-4016)
[2022-05-22 21:00:24.096302] WARN [SQL.RESV] resolve_table_relation_factor (ob_dml_resolver.cpp:6300) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=3] [dc=0] try synonym with dblink failed(ret=-4016)
[2022-05-22 21:00:24.096308] WARN [SQL.RESV] inner_resolve_sys_view (ob_dml_resolver.cpp:950) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=3] [dc=0] fail to resolve table(ret=-5019)
[2022-05-22 21:00:24.096312] WARN [SQL.RESV] resolve_table_relation_factor_wrapper (ob_dml_resolver.cpp:983) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=3] [dc=0] fail to resolve sys view(ret=-5019)
[2022-05-22 21:00:24.096334] WARN resolve_basic_table (ob_dml_resolver.cpp:1076) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=3] [dc=0] Table 'oceanbase._all_tenant_backup_info' doesn't exist
[2022-05-22 21:00:24.096346] WARN [SQL.RESV] resolve_table (ob_dml_resolver.cpp:1301) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=10] [dc=0] resolve basic table failed(ret=-5019)
[2022-05-22 21:00:24.096351] WARN [SQL.RESV] resolve_table_list (ob_update_resolver.cpp:392) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=4] [dc=0] failed to resolve table(ret=-5019)
[2022-05-22 21:00:24.096354] WARN [SQL.RESV] resolve (ob_update_resolver.cpp:63) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=3] [dc=0] resolve table failed(ret=-5019)
[2022-05-22 21:00:24.096367] WARN [SQL.RESV] stmt_resolver_func (ob_resolver.cpp:126) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=4] [dc=0] execute stmt_resolver failed(ret=-5019, parse_tree.type
=3036)
[2022-05-22 21:00:24.096394] WARN [SQL] generate_stmt (ob_sql.cpp:1440) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=8] [dc=0] failed to resolve(ret=-5019)
[2022-05-22 21:00:24.096410] WARN [SQL] generate_physical_plan (ob_sql.cpp:1528) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=13] [dc=0] Failed to generate stmt(ret=-5019, result.get_exec_context().need_disconnect()=false)
[2022-05-22 21:00:24.096416] WARN [SQL] handle_physical_plan (ob_sql.cpp:3218) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=4] [dc=0] Failed to generate plan(ret=-5019, result.get_exec_context().need_disconnect()=false)
[2022-05-22 21:00:24.096420] WARN [SQL] handle_text_query (ob_sql.cpp:1209) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=4] [dc=0] fail to handle physical plan(ret=-5019)
[2022-05-22 21:00:24.096427] WARN [SQL] stmt_query (ob_sql.cpp:171) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=3] [dc=0] fail to handle text query(stmt=update __all_tenant_backup_info set value = '' where name = 'backup_scheduler_leader' and value='172.26.159.74:2882', ret=-5019)
[2022-05-22 21:00:24.096431] WARN [SERVER] do_query (ob_inner_sql_connection.cpp:665) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=3] [dc=0] executor execute failed(ret=-5019)
[2022-05-22 21:00:24.096439] WARN [SERVER] query (ob_inner_sql_connection.cpp:815) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=3] [dc=0] execute failed(ret=-5019, executor={ObIExecutor:, sql:"update __all_tenant_backup_info set value = '' where name = 'backup_scheduler_leader' and value='172.26.159.74:2882'"}, retry_cnt=0)
[2022-05-22 21:00:24.096452] WARN [SERVER] query (ob_inner_sql_connection.cpp:818) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=9] [dc=0] failed to process retry(tmp_ret=-5019, ret=-5019, executor={ObIExecutor:, sql:"update __all_tenant_backup_info set value = '' where name = 'backup_scheduler_leader' and value='172.26.159.74:2882'"}, retry_cnt=0)
[2022-05-22 21:00:24.096468] WARN [SERVER] inner_close (ob_inner_sql_result.cpp:152) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=4] [dc=0] result set close failed(ret=-5019, need_retry=false)
[2022-05-22 21:00:24.096479] WARN [SERVER] force_close (ob_inner_sql_result.cpp:136) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=9] [dc=0] result set close failed(ret=-5019)
[2022-05-22 21:00:24.096483] WARN [SERVER] query (ob_inner_sql_connection.cpp:824) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=3] [dc=0] failed to close result(close_ret=-5019, ret=-5019)
[2022-05-22 21:00:24.096512] WARN [SERVER.OMT] get_tenant_ctx_with_tenant_lock (ob_multi_tenant.cpp:64) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=3] [dc=0] get tenant from omt failed(ret=-5150, tenant_id=1)
[2022-05-22 21:00:24.096526] WARN [SHARE] ObTenantSpaceFetcher (ob_context.cpp:176) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=12] [dc=0] get tenant ctx failed(ret=-5150, tenant_id=1)
[2022-05-22 21:00:24.096534] WARN [SERVER] query (ob_inner_sql_connection.cpp:852) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=5] [dc=0] failed to process record(executor={ObIExecutor:, sql:"update __all_tenant_backup_info set value = '' where name = 'backup_scheduler_leader' and value='172.26.159.74:2882'"}, record_ret=-5019, ret=-5019)
[2022-05-22 21:00:24.096539] WARN [SERVER] query (ob_inner_sql_connection.cpp:868) [165][224][YB42AC1A9F4A-0005DF994E790E88] [lt=3] [dc=0] failed to process final(executor={ObIExecutor:, sql:"update __all_tenant_backup_info set value = '' where name = 'backup_scheduler_leader' and value='172.26.159.74:2882'"}, aret=-5019, ret=-5019)
[2022-05-22 21:00:24.096546] WARN [SERVER] execute_write (ob_inner_sql_connection.cpp:1284) [165][224][Y0-0000000000000000] [lt=3] [dc=0] execute sql failed(ret=-5019, tenant_id=1, sql="update __all_tenant_backup_info set value = '' where name = 'backup_scheduler_leader' and value='172.26.159.74:2882'")
[2022-05-22 21:00:24.096637] WARN [COMMON.MYSQLP] write (ob_mysql_proxy.cpp:104) [165][224][Y0-0000000000000000] [lt=4] [dc=0] execute sql failed(ret=-5019, conn=0x7f3a564d5850, start=1653224424094285, sql="update __all_tenant_backup_info set value = '' where name = 'backup_scheduler_leader' and value='172.26.159.74:2882'")
[2022-05-22 21:00:24.096699] WARN [SERVER] clean_backup_scheduler_leader (ob_backup_operator.cpp:1519) [165][224][Y0-0000000000000000] [lt=14] [dc=0] execute sql failed(ret=-5019, sql=update _all_tenant_backup_info set value = '' where name = 'backup_scheduler_leader' and value='172.26.159.74:2882')
[2022-05-22 21:00:24.096712] WARN [SERVER] clean_backup_scheduler_leader (ob_backup_manager.cpp:895) [165][224][Y0-0000000000000000] [lt=11] [dc=0] failed to clean backup scheduler leader(ret=-5019)
[2022-05-22 21:00:24.097652] WARN [SERVER] destroy (ob_server.cpp:406) [52][0][Y0-0000000000000000] [lt=3] [dc=0] root service destroyed
[2022-05-22 21:00:24.167561] WARN [COMMON] get_all_tenant_id (ob_tenant_mgr.cpp:582) [101][96][Y0-0000000000000000] [lt=4] [dc=0] tenant manager not init(ret=-4006)
[2022-05-22 21:00:24.167583] WARN [COMMON] compute_tenant_wash_size (ob_kvcache_store.cpp:659) [101][96][Y0-0000000000000000] [lt=20] [dc=0] Fail to get all tenant ids, (ret=-4006)
[2022-05-22 21:00:24.214599] WARN [SERVER] destroy (ob_server.cpp:408) [52][0][Y0-0000000000000000] [lt=18] [dc=0] ob service destroyed
[2022-05-22 21:00:24.220559] WARN [SERVER] destroy (ob_server.cpp:410) [52][0][Y0-0000000000000000] [lt=11] [dc=0] sql engine destroyed
[2022-05-22 21:00:24.247763] WARN [SERVER] destroy (ob_server.cpp:412) [52][0][Y0-0000000000000000] [lt=18] [dc=0] ob dag scheduler destroyed
[2022-05-22 21:00:24.247848] WARN begin (ob_hashtable.h:686) [52][0][Y0-0000000000000000] [lt=3] [dc=0] hashtable not init, backtrace=0x99bf8ce 0x7b51f64 0x7afa378 0x7ab87d2 0x7dee9eb 0x9328d8d 0x932a6ab 0x2270dd5 0x7f3a6a0ff493 0x226faae
[2022-05-22 21:00:24.247910] WARN begin (ob_hashtable.h:686) [52][0][Y0-0000000000000000] [lt=9] [dc=0] hashtable not init, backtrace=0x99bf8ce 0x85488c4 0x850a488 0x8495166 0x7deea8b 0x9328d8d 0x932a6ab 0x2270dd5 0x7f3a6a0ff493 0x226faae
[2022-05-22 21:00:24.247932] WARN begin (ob_hashtable.h:686) [52][0][Y0-0000000000000000] [lt=7] [dc=0] hashtable not init, backtrace=0x99bf8ce 0x8541f1c 0x8486b95 0x8495180 0x7deea8b 0x9328d8d 0x932a6ab 0x2270dd5 0x7f3a6a0ff493 0x226faae
[2022-05-22 21:00:24.248412] WARN [SERVER] destroy (ob_server.cpp:414) [52][0][Y0-0000000000000000] [lt=3] [dc=0] partition service destroyed
[2022-05-22 21:00:24.248431] WARN [SERVER] destroy (ob_server.cpp:416) [52][0][Y0-0000000000000000] [lt=5] [dc=0] gc partition adapter destroyed
[2022-05-22 21:00:24.248448] WARN [SERVER] destroy (ob_server.cpp:418) [52][0][Y0-0000000000000000] [lt=2] [dc=0] location cache destroyed
[2022-05-22 21:00:24.248464] WARN [SERVER] destroy (ob_server.cpp:420) [52][0][Y0-0000000000000000] [lt=2] [dc=0] weak read service destroyed
[2022-05-22 21:00:24.248590] WARN [SERVER] destroy (ob_server.cpp:422) [52][0][Y0-0000000000000000] [lt=2] [dc=0] net frame destroyed
[2022-05-22 21:00:24.349352] WARN [COMMON] get_warning_disks (ob_io_disk.cpp:930) [71][36][Y0-0000000000000000] [lt=0] [dc=0] not init(ret=-4006)
[2022-05-22 21:00:24.349374] WARN [COMMON] check_disk_error (ob_io_manager.cpp:699) [71][36][Y0-0000000000000000] [lt=19] [dc=0] fail to get warning disks(ret=-4006)
[2022-05-22 21:00:24.383524] WARN [COMMON] get_all_tenant_id (ob_tenant_mgr.cpp:582) [101][96][Y0-0000000000000000] [lt=19] [dc=0] tenant manager not init(ret=-4006)
[2022-05-22 21:00:24.383547] WARN [COMMON] compute_tenant_wash_size (ob_kvcache_store.cpp:659) [101][96][Y0-0000000000000000] [lt=21] [dc=0] Fail to get all tenant ids, (ret=-4006)
[2022-05-22 21:00:24.537646] WARN [SHARE] blacklist_loop
(ob_server_blacklist.cpp:278) [181][256][Y0-0000000000000000] [lt=17] [dc=0] ObServerBlacklist is not inited
[2022-05-22 21:00:24.599574] WARN [COMMON] get_all_tenant_id (ob_tenant_mgr.cpp:582) [101][96][Y0-0000000000000000] [lt=4] [dc=0] tenant manager not init(ret=-4006)
[2022-05-22 21:00:24.599609] WARN [COMMON] compute_tenant_wash_size (ob_kvcache_store.cpp:659) [101][96][Y0-0000000000000000] [lt=34] [dc=0] Fail to get all tenant ids, (ret=-4006)
[2022-05-22 21:00:24.815640] WARN [COMMON] get_all_tenant_id (ob_tenant_mgr.cpp:582) [101][96][Y0-0000000000000000] [lt=4] [dc=0] tenant manager not init(ret=-4006)
[2022-05-22 21:00:24.815675] WARN [COMMON] compute_tenant_wash_size (ob_kvcache_store.cpp:659) [101][96][Y0-0000000000000000] [lt=33] [dc=0] Fail to get all tenant ids, (ret=-4006)
[2022-05-22 21:00:25.031542] WARN [COMMON] get_all_tenant_id (ob_tenant_mgr.cpp:582) [101][96][Y0-0000000000000000] [lt=4] [dc=0] tenant manager not init(ret=-4006)
[2022-05-22 21:00:25.031577] WARN [COMMON] compute_tenant_wash_size (ob_kvcache_store.cpp:659) [101][96][Y0-0000000000000000] [lt=33] [dc=0] Fail to get all tenant ids, (ret=-4006)
[2022-05-22 21:00:25.198148] WARN [SERVER] destroy (ob_server.cpp:428) [52][0][Y0-0000000000000000] [lt=48] [dc=0] io manager destroyed
[2022-05-22 21:00:25.247591] WARN [COMMON] get_all_tenant_id (ob_tenant_mgr.cpp:582) [101][96][Y0-0000000000000000] [lt=7] [dc=0] tenant manager not init(ret=-4006)
[2022-05-22 21:00:25.247628] WARN [COMMON] compute_tenant_wash_size (ob_kvcache_store.cpp:659) [101][96][Y0-0000000000000000] [lt=34] [dc=0] Fail to get all tenant ids, (ret=-4006)
[2022-05-22 21:00:25.280214] WARN [SERVER] destroy (ob_server.cpp:430) [52][0][Y0-0000000000000000] [lt=7] [dc=0] memory dump destroyed
[2022-05-22 21:00:25.280250] WARN [SERVER] destroy (ob_server.cpp:432) [52][0][Y0-0000000000000000] [lt=21] [dc=0] tenant timezone manager destroyed
[2022-05-22 21:00:25.282098] WARN [SERVER] destroy (ob_server.cpp:434) [52][0][Y0-0000000000000000] [lt=10] [dc=0] log compressor destroyed
[2022-05-22 21:00:25.282114] WARN [SERVER] destroy (ob_server.cpp:435) [52][0][Y0-0000000000000000] [lt=15] [dc=0] destroy observer end

[Doc]: add developer guide

Check Before Asking

  • Please check the issue list and confirm that this issue has not been reported before.

Description

Add developer guide documentation

Documentation Links

No response

Are you willing to submit a pull request?

  • Yes I am willing to submit a pull request.

[Feature]: Support more CNIs from cloud providers

Describe your use case

Currently, ob-operator only supports Calico for assigning a fixed IP address to a pod, but other CNIs also provide this ability. Consider adding support for the CNIs provided by cloud providers.

Describe the solution you'd like

Add support for more CNIs.
If you have this need, please comment below.
If anyone wants to contribute, we would really appreciate it.

Describe alternatives you've considered

No response

Additional context

No response

[Feat.]: OBCluster upgrade

Check Before Asking

  • Please check the issue list and confirm that this feature has not been requested before.
  • Please write the full text in English and attach a precise description.

Description

On branch 2.0.x_dev: implement OBCluster upgrade.

Hints for development:

  • Code related to OceanBase operations is placed in the pkg/oceanbase directory.
  • The code logic is very similar to the other controllers: create a manager and a coordinator, and implement the interfaces of ResourceManager.
  • Verify the feature by changing the image.
  • Note that the image tag is not guaranteed to reflect the real observer version.
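
Since the image tag may not match the observer actually running, the real version can be checked directly via SQL after an upgrade (a sketch; the host and credentials are placeholders):

# query the running observer version through obclient
obclient -h<observer-ip> -P2881 -uroot@sys -e "SELECT version();"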

Other Information

No response

[Feat.]: implement scale up and scale down

Check Before Asking

  • Please check the issue list and confirm that this feature has not been requested before.
  • Please write the full text in English and attach a precise description.

Description

On branch 2.0.x_dev: implement OBCluster scale up and scale down.

Hints for development:

  • Code related to OceanBase operations is placed in the pkg/oceanbase directory.
  • The code logic is very similar to the other controllers: create a manager and a coordinator, and implement the interfaces of ResourceManager.
  • Verify the feature by changing the topology of the OBCluster, as sketched below.
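
For example, scaling zone1 from one to two observers is expressed as a topology change in the OBCluster spec (a sketch based on the v1 manifests shown elsewhere on this page; the field layout of the 2.0.x_dev API may differ):

spec:
  topology:
    - cluster: cn
      zone:
      - name: zone1
        region: region1
        replicas: 2    # scaled up from 1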

Other Information

No response

[Doc]: add architecture doc

Check Before Asking

  • Please check the issue list and confirm that this issue has not been reported before.

Description

Add a doc to describe the architecture of this project

Documentation Links

No response

Are you willing to submit a pull request?

  • Yes I am willing to submit a pull request.

[Bug]: <create obcluster failed>

Describe the bug

Failed to create an obcluster.

My YAML:

apiVersion: cloud.oceanbase.com/v1
kind: OBCluster
metadata:
  name: ob-test
  namespace: obcluster
spec:
  imageRepo: xxx:5000/oceanbase-cloud-native
  tag: 4.1.0.0-100000192023032010
  imageObagent: xxx:5000/obagent:1.2.0
  clusterID: 1
  topology:
    - cluster: cn
      zone:
      - name: zone1
        region: region1
        nodeSelector:
          ob.zone: zone1
        replicas: 1
      - name: zone2
        region: region1
        nodeSelector:
          ob.zone: zone2
        replicas: 1
      - name: zone3
        region: region1
        nodeSelector:
          ob.zone: zone3
        replicas: 1
      parameters:
        - name: log_disk_size
          value: "250G"
  resources:
    cpu: 8
    memory: 64Gi
    storage:
      - name: data-file
        storageClassName: "local-path-hdd"
        size: 1024Gi
      - name: data-log
        storageClassName: "local-path-hdd"
        size: 300Gi
      - name: log
        storageClassName: "local-path-hdd"
        size: 30Gi
      - name: obagent-conf-file
        storageClassName: "local-path-hdd"
        size: 1Gi
    #volume:
    #    name: backup
    #    nfs:
    #      server: ${nfs_server_address}
    #      path: /opt/nfs

Disk: HDD

The pod keeps restarting.

Operator manager logs:

I0620 15:16:59.777510       1 log.go:26] update status StatefulApp sapp-ob-test {"cluster":"cn","clusterStatus":"Prepareing","subsets":[{"name":"zone1","region":"region1","expectedReplicas":1,"availableReplicas":0,"pods":[{"name":"sapp-ob-test-cn-zone1-0","index":0,"podPhase":"Pending","podIP":"","nodeIP":"","pvcs":[{"name":"sapp-ob-test-cn-zone1-0-data-file","phase":"Pending"},{"name":"sapp-ob-test-cn-zone1-0-data-log","phase":"Pending"},{"name":"sapp-ob-test-cn-zone1-0-log","phase":"Pending"},{"name":"sapp-ob-test-cn-zone1-0-obagent-conf-file","phase":"Pending"}]}]},{"name":"zone2","region":"region1","expectedReplicas":1,"availableReplicas":0,"pods":[{"name":"sapp-ob-test-cn-zone2-0","index":0,"podPhase":"Pending","podIP":"","nodeIP":"","pvcs":[{"name":"sapp-ob-test-cn-zone2-0-data-file","phase":"Pending"},{"name":"sapp-ob-test-cn-zone2-0-data-log","phase":"Pending"},{"name":"sapp-ob-test-cn-zone2-0-log","phase":"Pending"},{"name":"sapp-ob-test-cn-zone2-0-obagent-conf-file","phase":"Pending"}]}]},{"name":"zone3","region":"region1","expectedReplicas":1,"availableReplicas":0,"pods":[{"name":"sapp-ob-test-cn-zone3-0","index":0,"podPhase":"Pending","podIP":"","nodeIP":"","pvcs":[{"name":"sapp-ob-test-cn-zone3-0-data-file","phase":"Pending"},{"name":"sapp-ob-test-cn-zone3-0-data-log","phase":"Pending"},{"name":"sapp-ob-test-cn-zone3-0-log","phase":"Pending"},{"name":"sapp-ob-test-cn-zone3-0-obagent-conf-file","phase":"Pending"}]}]}]}
I0620 15:17:04.805994       1 log.go:26] update status StatefulApp sapp-ob-test {"cluster":"cn","clusterStatus":"Prepareing","subsets":[{"name":"zone1","region":"region1","expectedReplicas":1,"availableReplicas":0,"pods":[{"name":"sapp-ob-test-cn-zone1-0","index":0,"podPhase":"Pending","podIP":"","nodeIP":"","pvcs":[{"name":"sapp-ob-test-cn-zone1-0-data-file","phase":"Pending"},{"name":"sapp-ob-test-cn-zone1-0-data-log","phase":"Pending"},{"name":"sapp-ob-test-cn-zone1-0-log","phase":"Pending"},{"name":"sapp-ob-test-cn-zone1-0-obagent-conf-file","phase":"Bound"}]}]},{"name":"zone2","region":"region1","expectedReplicas":1,"availableReplicas":0,"pods":[{"name":"sapp-ob-test-cn-zone2-0","index":0,"podPhase":"Pending","podIP":"","nodeIP":"","pvcs":[{"name":"sapp-ob-test-cn-zone2-0-data-file","phase":"Pending"},{"name":"sapp-ob-test-cn-zone2-0-data-log","phase":"Pending"},{"name":"sapp-ob-test-cn-zone2-0-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone2-0-obagent-conf-file","phase":"Bound"}]}]},{"name":"zone3","region":"region1","expectedReplicas":1,"availableReplicas":0,"pods":[{"name":"sapp-ob-test-cn-zone3-0","index":0,"podPhase":"Pending","podIP":"","nodeIP":"","pvcs":[{"name":"sapp-ob-test-cn-zone3-0-data-file","phase":"Pending"},{"name":"sapp-ob-test-cn-zone3-0-data-log","phase":"Pending"},{"name":"sapp-ob-test-cn-zone3-0-log","phase":"Pending"},{"name":"sapp-ob-test-cn-zone3-0-obagent-conf-file","phase":"Pending"}]}]}]}
I0620 15:17:14.861043       1 log.go:26] update status StatefulApp sapp-ob-test {"cluster":"cn","clusterStatus":"Prepareing","subsets":[{"name":"zone1","region":"region1","expectedReplicas":1,"availableReplicas":0,"pods":[{"name":"sapp-ob-test-cn-zone1-0","index":0,"podPhase":"Pending","podIP":"","nodeIP":"","pvcs":[{"name":"sapp-ob-test-cn-zone1-0-data-file","phase":"Pending"},{"name":"sapp-ob-test-cn-zone1-0-data-log","phase":"Pending"},{"name":"sapp-ob-test-cn-zone1-0-log","phase":"Pending"},{"name":"sapp-ob-test-cn-zone1-0-obagent-conf-file","phase":"Bound"}]}]},{"name":"zone2","region":"region1","expectedReplicas":1,"availableReplicas":0,"pods":[{"name":"sapp-ob-test-cn-zone2-0","index":0,"podPhase":"Pending","podIP":"","nodeIP":"","pvcs":[{"name":"sapp-ob-test-cn-zone2-0-data-file","phase":"Pending"},{"name":"sapp-ob-test-cn-zone2-0-data-log","phase":"Pending"},{"name":"sapp-ob-test-cn-zone2-0-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone2-0-obagent-conf-file","phase":"Bound"}]}]},{"name":"zone3","region":"region1","expectedReplicas":1,"availableReplicas":0,"pods":[{"name":"sapp-ob-test-cn-zone3-0","index":0,"podPhase":"Pending","podIP":"","nodeIP":"","pvcs":[{"name":"sapp-ob-test-cn-zone3-0-data-file","phase":"Pending"},{"name":"sapp-ob-test-cn-zone3-0-data-log","phase":"Pending"},{"name":"sapp-ob-test-cn-zone3-0-log","phase":"Pending"},{"name":"sapp-ob-test-cn-zone3-0-obagent-conf-file","phase":"Bound"}]}]}]}
I0620 15:17:24.916263       1 log.go:26] update status StatefulApp sapp-ob-test {"cluster":"cn","clusterStatus":"Prepareing","subsets":[{"name":"zone1","region":"region1","expectedReplicas":1,"availableReplicas":0,"pods":[{"name":"sapp-ob-test-cn-zone1-0","index":0,"podPhase":"Pending","podIP":"","nodeIP":"","pvcs":[{"name":"sapp-ob-test-cn-zone1-0-data-file","phase":"Pending"},{"name":"sapp-ob-test-cn-zone1-0-data-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone1-0-log","phase":"Pending"},{"name":"sapp-ob-test-cn-zone1-0-obagent-conf-file","phase":"Bound"}]}]},{"name":"zone2","region":"region1","expectedReplicas":1,"availableReplicas":0,"pods":[{"name":"sapp-ob-test-cn-zone2-0","index":0,"podPhase":"Pending","podIP":"","nodeIP":"","pvcs":[{"name":"sapp-ob-test-cn-zone2-0-data-file","phase":"Pending"},{"name":"sapp-ob-test-cn-zone2-0-data-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone2-0-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone2-0-obagent-conf-file","phase":"Bound"}]}]},{"name":"zone3","region":"region1","expectedReplicas":1,"availableReplicas":0,"pods":[{"name":"sapp-ob-test-cn-zone3-0","index":0,"podPhase":"Pending","podIP":"","nodeIP":"","pvcs":[{"name":"sapp-ob-test-cn-zone3-0-data-file","phase":"Pending"},{"name":"sapp-ob-test-cn-zone3-0-data-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone3-0-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone3-0-obagent-conf-file","phase":"Bound"}]}]}]}
I0620 15:17:29.947631       1 log.go:26] update status StatefulApp sapp-ob-test {"cluster":"cn","clusterStatus":"Prepareing","subsets":[{"name":"zone1","region":"region1","expectedReplicas":1,"availableReplicas":0,"pods":[{"name":"sapp-ob-test-cn-zone1-0","index":0,"podPhase":"Pending","podIP":"","nodeIP":"172.16.44.133","pvcs":[{"name":"sapp-ob-test-cn-zone1-0-data-file","phase":"Bound"},{"name":"sapp-ob-test-cn-zone1-0-data-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone1-0-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone1-0-obagent-conf-file","phase":"Bound"}]}]},{"name":"zone2","region":"region1","expectedReplicas":1,"availableReplicas":0,"pods":[{"name":"sapp-ob-test-cn-zone2-0","index":0,"podPhase":"Pending","podIP":"","nodeIP":"","pvcs":[{"name":"sapp-ob-test-cn-zone2-0-data-file","phase":"Pending"},{"name":"sapp-ob-test-cn-zone2-0-data-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone2-0-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone2-0-obagent-conf-file","phase":"Bound"}]}]},{"name":"zone3","region":"region1","expectedReplicas":1,"availableReplicas":0,"pods":[{"name":"sapp-ob-test-cn-zone3-0","index":0,"podPhase":"Pending","podIP":"","nodeIP":"172.16.44.135","pvcs":[{"name":"sapp-ob-test-cn-zone3-0-data-file","phase":"Bound"},{"name":"sapp-ob-test-cn-zone3-0-data-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone3-0-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone3-0-obagent-conf-file","phase":"Bound"}]}]}]}
I0620 15:17:34.972000       1 log.go:26] update status StatefulApp sapp-ob-test {"cluster":"cn","clusterStatus":"Prepareing","subsets":[{"name":"zone1","region":"region1","expectedReplicas":1,"availableReplicas":1,"pods":[{"name":"sapp-ob-test-cn-zone1-0","index":0,"podPhase":"Running","podIP":"172.18.251.65","nodeIP":"172.16.44.133","pvcs":[{"name":"sapp-ob-test-cn-zone1-0-data-file","phase":"Bound"},{"name":"sapp-ob-test-cn-zone1-0-data-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone1-0-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone1-0-obagent-conf-file","phase":"Bound"}]}]},{"name":"zone2","region":"region1","expectedReplicas":1,"availableReplicas":0,"pods":[{"name":"sapp-ob-test-cn-zone2-0","index":0,"podPhase":"Pending","podIP":"","nodeIP":"172.16.44.134","pvcs":[{"name":"sapp-ob-test-cn-zone2-0-data-file","phase":"Bound"},{"name":"sapp-ob-test-cn-zone2-0-data-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone2-0-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone2-0-obagent-conf-file","phase":"Bound"}]}]},{"name":"zone3","region":"region1","expectedReplicas":1,"availableReplicas":1,"pods":[{"name":"sapp-ob-test-cn-zone3-0","index":0,"podPhase":"Running","podIP":"172.18.75.172","nodeIP":"172.16.44.135","pvcs":[{"name":"sapp-ob-test-cn-zone3-0-data-file","phase":"Bound"},{"name":"sapp-ob-test-cn-zone3-0-data-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone3-0-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone3-0-obagent-conf-file","phase":"Bound"}]}]}]}
I0620 15:17:34.980342       1 log.go:26] update status StatefulApp sapp-ob-test {"cluster":"cn","clusterStatus":"Prepareing","subsets":[{"name":"zone1","region":"region1","expectedReplicas":1,"availableReplicas":1,"pods":[{"name":"sapp-ob-test-cn-zone1-0","index":0,"podPhase":"Running","podIP":"172.18.251.65","nodeIP":"172.16.44.133","pvcs":[{"name":"sapp-ob-test-cn-zone1-0-data-file","phase":"Bound"},{"name":"sapp-ob-test-cn-zone1-0-data-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone1-0-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone1-0-obagent-conf-file","phase":"Bound"}]}]},{"name":"zone2","region":"region1","expectedReplicas":1,"availableReplicas":1,"pods":[{"name":"sapp-ob-test-cn-zone2-0","index":0,"podPhase":"Running","podIP":"172.18.11.50","nodeIP":"172.16.44.134","pvcs":[{"name":"sapp-ob-test-cn-zone2-0-data-file","phase":"Bound"},{"name":"sapp-ob-test-cn-zone2-0-data-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone2-0-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone2-0-obagent-conf-file","phase":"Bound"}]}]},{"name":"zone3","region":"region1","expectedReplicas":1,"availableReplicas":1,"pods":[{"name":"sapp-ob-test-cn-zone3-0","index":0,"podPhase":"Running","podIP":"172.18.75.172","nodeIP":"172.16.44.135","pvcs":[{"name":"sapp-ob-test-cn-zone3-0-data-file","phase":"Bound"},{"name":"sapp-ob-test-cn-zone3-0-data-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone3-0-log","phase":"Bound"},{"name":"sapp-ob-test-cn-zone3-0-obagent-conf-file","phase":"Bound"}]}]}]}
E0620 15:17:35.289786       1 service.go:51] Service "svc-ob-test" not found
E0620 15:17:35.289808       1 service.go:51] Service "svc-ob-test" not found
I0620 15:17:35.297205       1 log.go:28] core.(*OBClusterCtrl).ResourcePrepareingEffectorForBootstrap update OBCluster ob-test to Resource Ready Zone  Status 
E0620 15:17:35.778523       1 service.go:51] Service "svc-ob-test" not found
E0620 15:17:35.778543       1 service.go:51] Service "svc-ob-test" not found
E0620 15:17:35.779054       1 cable_executer.go:49] OBCluster  observer 172.18.251.65 starting not ready
I0620 15:17:35.780327       1 cable_executer.go:41] start observer 172.18.251.65 succeed {"clusterId":1,"clusterName":"ob-test","cpuLimit":8,"customParameters":[{"name":"log_disk_size","value":"250G"}],"memoryLimit":64,"rsList":"172.18.251.65:2882:2881;172.18.11.50:2882:2881;172.18.75.172:2882:2881","version":"4.1.0.0","zoneName":"zone1"}
E0620 15:17:35.781005       1 cable_executer.go:49] OBCluster  observer 172.18.11.50 starting not ready
I0620 15:17:35.782391       1 cable_executer.go:41] start observer 172.18.11.50 succeed {"clusterId":1,"clusterName":"ob-test","cpuLimit":8,"customParameters":[{"name":"log_disk_size","value":"250G"}],"memoryLimit":64,"rsList":"172.18.251.65:2882:2881;172.18.11.50:2882:2881;172.18.75.172:2882:2881","version":"4.1.0.0","zoneName":"zone2"}
E0620 15:17:35.782917       1 cable_executer.go:49] OBCluster  observer 172.18.75.172 starting not ready
I0620 15:17:35.783946       1 cable_executer.go:41] start observer 172.18.75.172 succeed {"clusterId":1,"clusterName":"ob-test","cpuLimit":8,"customParameters":[{"name":"log_disk_size","value":"250G"}],"memoryLimit":64,"rsList":"172.18.251.65:2882:2881;172.18.11.50:2882:2881;172.18.75.172:2882:2881","version":"4.1.0.0","zoneName":"zone3"}
I0620 15:17:35.787938       1 log.go:28] core.(*OBClusterCtrl).ResourceReadyEffectorForBootstrap update OBCluster ob-test to OBServer Prepareing Zone  Status 
E0620 15:17:35.788656       1 cable_executer.go:49] OBCluster ob-test observer 172.18.251.65 starting not ready
E0620 15:17:40.116602       1 cable_executer.go:49] OBCluster ob-test observer 172.18.251.65 starting not ready
E0620 15:17:44.942632       1 cable_executer.go:49] OBCluster ob-test observer 172.18.251.65 starting not ready
E0620 15:17:49.768326       1 cable_executer.go:49] OBCluster ob-test observer 172.18.251.65 starting not ready
E0620 15:17:54.595742       1 service.go:51] Service "svc-ob-test" not found
E0620 15:17:54.595760       1 service.go:51] Service "svc-ob-test" not found
I0620 15:17:54.601343       1 log.go:28] core.(*OBClusterCtrl).OBServerPrepareingEffectorForBootstrap update OBCluster ob-test to OBServer Ready Zone  Status 
I0620 15:17:54.601531       1 cable_converter.go:104] OBCluster bootstrap args ALTER SYSTEM BOOTSTRAP REGION 'region1' ZONE 'zone1' SERVER '172.18.251.65:2882', REGION 'region1' ZONE 'zone2' SERVER '172.18.11.50:2882', REGION 'region1' ZONE 'zone3' SERVER '172.18.75.172:2882'
E0620 15:17:54.601597       1 service.go:51] Service "svc-ob-test" not found
E0620 15:17:54.601617       1 service.go:51] Service "svc-ob-test" not found
I0620 15:17:54.609150       1 log.go:28] core.(*OBClusterCtrl).OBServerReadyEffectorForBootstrap update OBCluster ob-test to OBCluster Bootstraping Zone  Status 
E0620 15:17:54.609261       1 secret.go:51] Secret "secret-ob-test-sys-admin" not found
E0620 15:17:54.974262       1 conn.go:38] 1049 Unknown database 'oceanbase'
I0620 15:17:54.974289       1 pod_controller.go:84] try empty database
E0620 15:18:05.031688       1 sql_operator.go:49] execute sql: ALTER SYSTEM BOOTSTRAP REGION 'region1' ZONE 'zone1' SERVER '172.18.251.65:2882', REGION 'region1' ZONE 'zone2' SERVER '172.18.11.50:2882', REGION 'region1' ZONE 'zone3' SERVER '172.18.75.172:2882' failed
E0620 15:18:05.031703       1 sql_operator.go:50] 4012 Timeout

(/workspace/pkg/controllers/observer/sql/sql_operator.go:46) 
[2023-06-20 15:18:05]  Error 4012: Timeout 

Environment

  • OS Version and CPU Arch(uname -a):
    CentOS 7.2 - 3.10.0-1160.90.1.el7.x86_64
  • Component Version:

Fast Reproduce Steps(Required)

Steps to reproduce the behavior:

Expected behavior

Actual Behavior

Additional context

[Feat.]: obcluster backup support k8s volume

Check Before Asking

  • Please check the issue list and confirm that this feature has not been requested before.
  • Please write the full text in English and attach a precise description.

Description

Support using a Kubernetes volume as the backup destination for an obcluster.

Other Information

No response

[Enhancement]: optimize task manager

On branch 2.0.x_dev.
Every time a task is submitted, the task manager creates a new goroutine to execute it. It may be better to use a goroutine (worker) pool and to add a timeout for task execution, as sketched below.
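
A minimal sketch of the suggested direction (illustrative only, not the actual task manager code): a fixed-size worker pool that executes submitted tasks with a per-task timeout.

package main

import (
	"context"
	"fmt"
	"time"
)

// Task is a unit of work executed by the pool.
type Task func(ctx context.Context) error

// Pool runs tasks on a fixed number of workers, bounding each task with a timeout.
type Pool struct {
	tasks   chan Task
	timeout time.Duration
}

func NewPool(workers, queueSize int, timeout time.Duration) *Pool {
	p := &Pool{tasks: make(chan Task, queueSize), timeout: timeout}
	for i := 0; i < workers; i++ {
		go p.worker()
	}
	return p
}

func (p *Pool) worker() {
	for task := range p.tasks {
		ctx, cancel := context.WithTimeout(context.Background(), p.timeout)
		if err := task(ctx); err != nil {
			fmt.Println("task failed:", err)
		}
		cancel()
	}
}

// Submit enqueues a task instead of spawning a new goroutine per task.
func (p *Pool) Submit(t Task) { p.tasks <- t }

func main() {
	pool := NewPool(4, 16, 2*time.Second)
	pool.Submit(func(ctx context.Context) error {
		select {
		case <-time.After(time.Second):
			fmt.Println("task done")
			return nil
		case <-ctx.Done():
			return ctx.Err() // task exceeded the pool's timeout
		}
	})
	time.Sleep(3 * time.Second) // let the worker finish in this demo
}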

[Feature]: Support startup parameters of observer

Describe your use case

According to the official documents Deploy OceanBase in Terminal and Cluster-level Parameters, observer supports setting startup parameters with the -o argument. It would be better if ob-operator also supported setting startup parameters in some way.

Describe the solution you'd like

  1. Add a type field to the parameters in OBCluster.Spec.Parameters to mark the parameters that need to be placed in the startup options.
  2. Add a separate startupParameters field to OBCluster.Spec, as sketched below.
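
A hypothetical sketch of the second option (the startupParameters field does not exist today; the field name and the parameters shown are illustrative assumptions):

spec:
  startupParameters:        # hypothetical field, passed to observer via -o at startup
    - name: datafile_size
      value: "100G"
    - name: system_memory
      value: "10G"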

Describe alternatives you've considered

No response

Additional context

No response

[Feature]: A powerful CLI tool

Describe your use case

A powerful CLI tool to make it easier for users to create and maintain resources.

Describe the solution you'd like

Add a CLI tool that implements all the features provided by ob-operator.

Describe alternatives you've considered

No response

Additional context

No response

[Feature]: Migrate existing obcluster to ob-operator

Describe your use case

Migrate an obcluster from a local deployment to Kubernetes, using OceanBase's dynamic scaling ability to make the procedure transparent to users.

Describe the solution you'd like

  1. Add a mode that creates an obcluster from an existing local obcluster by simply adding the observers running in Kubernetes to that obcluster.
  2. Delete the local observers.
  3. Wait until the local observers are deleted.
  4. Done. (A sketch of the underlying SQL is shown below.)
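
Under the hood this roughly maps to OceanBase's server management SQL (a sketch with placeholder addresses; the operator would drive these steps automatically):

-- 1. add the Kubernetes-hosted observers to the existing cluster
ALTER SYSTEM ADD SERVER '10.0.0.11:2882' ZONE 'zone1';
-- 2. remove the local observer; its replicas are migrated away automatically
ALTER SYSTEM DELETE SERVER '10.0.0.1:2882' ZONE 'zone1';
-- 3. wait until the local server no longer appears before tearing it down
SELECT svr_ip, svr_port, status FROM oceanbase.DBA_OB_SERVERS;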

Describe alternatives you've considered

No response

Additional context

No response

[Feat.]: ob-operator upgrade

Check Before Asking

  • Please check the issue list and confirm that this feature has not been requested before.
  • Please write the full text in English and attach a precise description.

Description

Define an upgrade policy for ob-operator.
Since the API definitions differ significantly between v1 and v2alpha1, the upgrade path should be considered.

Other Information

No response

[Bug]: <ERROR 2013 (HY000): Lost connection to MySQL server during query>

Describe the bug

The connection is sometimes lost:

#  obclient -hhadoop002 -P30083 -uroot@sys oceanbase -A -c

obclient [oceanbase]> SET SESSION ob_query_timeout=200000000;
ERROR 2013 (HY000): Lost connection to MySQL server during query
obclient [oceanbase]> SET SESSION ob_query_timeout=200000000;
ERROR 2006 (HY000): OceanBase server has gone away
No connection. Trying to reconnect...
Connection id:    10
Current database: oceanbase

Query OK, 0 rows affected (0.063 sec)

obclient [oceanbase]> set global ob_query_timeout=200000000;
ERROR 2013 (HY000): Lost connection to MySQL server during query
obclient [oceanbase]> set global ob_query_timeout=200000000;
ERROR 2006 (HY000): OceanBase server has gone away
No connection. Trying to reconnect...
Connection id:    13
Current database: oceanbase

Query OK, 0 rows affected (0.294 sec)

obclient [oceanbase]> 

Environment

  • OS Version and CPU Arch(uname -a):

  • Component Version:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: obproxy
  namespace: obcluster
spec:
  selector:
    matchLabels:
      app: obproxy
  replicas: 1
  template:
    metadata:
      labels:
        app: obproxy
    spec:
      containers:
        - name: obproxy
          image: xxxx:5000/oceanbasedev/obproxy-ce:4.1.0.0-7
          ports:
            - containerPort: 2883
              name: "sql"
            - containerPort: 2884
              name: "prometheus"
          env:
            - name: APP_NAME
              value: helloworld
            - name: OB_CLUSTER
              value: ob-test
            - name: RS_LIST
              value: $(SVC_OB_TEST_SERVICE_HOST):$(SVC_OB_TEST_SERVICE_PORT)
          resources:
            limits:
              memory: 16Gi
              cpu: "4"

apiVersion: cloud.oceanbase.com/v1
kind: OBCluster
metadata:
  name: ob-test
  namespace: obcluster
spec:
  imageRepo: hadoop002:5000/oceanbase-cloud-native
  tag: 4.1.0.0-100000192023032010
  imageObagent: hadoop002:5000/obagent:1.2.0
  clusterID: 1
  topology:
    - cluster: cn
      zone:
      - name: zone1
        region: region1
        nodeSelector:
          ob.zone: zone1
        replicas: 1
      - name: zone2
        region: region1
        nodeSelector:
          ob.zone: zone2
        replicas: 1
      - name: zone3
        region: region1
        nodeSelector:
          ob.zone: zone3
        replicas: 1
      parameters:
        - name: log_disk_size
          value: "180G"
  resources:
    cpu: 8
    memory: 64Gi
    storage:
      - name: data-file
        storageClassName: "local-path-hdd"
        size: 200Gi
      - name: data-log
        storageClassName: "local-path-hdd"
        size: 200Gi
      - name: log
        storageClassName: "local-path-hdd"
        size: 30Gi
      - name: obagent-conf-file
        storageClassName: "local-path-hdd"
        size: 1Gi
    #volume:
    #    name: backup
    #    nfs:
    #      server: ${nfs_server_address}
    #      path: /opt/nfs
    #      readOnly: false
apiVersion: v1
kind: Service
metadata:
  name: obproxy-service
  namespace: obcluster
spec:
  type: NodePort
  selector:
    app: obproxy
  ports:
    - name: "sql"
      port: 2883
      targetPort: 2883
      nodePort: 30083
    - name: "prometheus"
      port: 2884
      targetPort: 2884
      nodePort: 30084

k8s : v1.23.12
Fast Reproduce Steps(Required)

Steps to reproduce the behavior:

Expected behavior

Actual Behavior

Additional context

[Feat.]: implement parameter management

Check Before Asking

  • Please check the issue list and confirm that this feature has not been requested before.
  • Please write the full text in English and attach a precise description.

Description

On branch 2.0.x_dev: implement a parameter controller that watches the parameters specified in the spec and in the obcluster, and keeps them consistent.

Hints for development:

  • Code related to OceanBase operations is placed in the pkg/oceanbase directory.
  • The code logic is very similar to the other controllers: create a manager and a coordinator, and implement the interfaces of ResourceManager.
  • Parameters are defined within OBCluster; when an OBCluster with parameter definitions is created, the controller should create OBParameter resources with the OBCluster as their owner reference (see the sketch below).
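
A minimal sketch of creating an OBParameter owned by its OBCluster (illustrative only; the module path and the OBParameterSpec fields are assumptions, and controllerutil comes from sigs.k8s.io/controller-runtime):

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"

	v1alpha1 "github.com/oceanbase/ob-operator/api/v1alpha1" // assumed module path
)

// createOBParameter creates an OBParameter that is garbage-collected together with its OBCluster.
func createOBParameter(ctx context.Context, c client.Client, scheme *runtime.Scheme,
	cluster *v1alpha1.OBCluster, name, value string) error {
	param := &v1alpha1.OBParameter{
		ObjectMeta: metav1.ObjectMeta{
			Name:      cluster.Name + "-" + name,
			Namespace: cluster.Namespace,
		},
		// field names below are illustrative assumptions
		Spec: v1alpha1.OBParameterSpec{Name: name, Value: value},
	}
	// record the OBCluster as the controlling owner reference
	if err := controllerutil.SetControllerReference(cluster, param, scheme); err != nil {
		return err
	}
	return c.Create(ctx, param)
}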

Other Information

No response

[Question]: After upgrading the CentOS 7 kernel to 5.4.257, the pod keeps returning 500. What are the next troubleshooting steps?

Description

env:
[root@centos7-10-8-22-231 ~]# uname -a
Linux centos7-10-8-22-231 5.4.257-1.el7.elrepo.x86_64 #1 SMP Sat Sep 23 07:34:32 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux

kubectl logs sapp-ob-cluster-cn-zone1-0 -n publicservice

2023-10-07T09:56:19.52717+08:00 INFO [7,] caller=server/http.go:101:func1: request: from 10.8.22.233, method GET, path /api/ob/readiness, response: status code 500, latency 25.469µs, comment
2023-10-07T09:56:21.52761+08:00 INFO [7,] caller=server/http.go:101:func1: request: from 10.8.22.233, method GET, path /api/ob/readiness, response: status code 500, latency 27.7µs, comment
2023-10-07T09:56:23.52702+08:00 INFO [7,] caller=server/http.go:101:func1: request: from 10.8.22.233, method GET, path /api/ob/readiness, response: status code 500, latency 28.884µs, comment
2023/10/07 09:56:24 observer 73 [sleep]
2023-10-07T09:56:25.52778+08:00 INFO [7,] caller=server/http.go:101:func1: request: from 10.8.22.233, method GET, path /api/ob/readiness, response: status code 500, latency 27.367µs, comment
2023-10-07T09:56:27.52799+08:00 INFO [7,] caller=server/http.go:101:func1: request: from 10.8.22.233, method GET, path /api/ob/readiness, response: status code 500, latency 24.907µs, comment
2023/10/07 09:56:29 observer 73 [sleep]
2023-10-07T09:56:29.52717+08:00 INFO [7,] caller=server/http.go:101:func1: request: from 10.8.22.233, method GET, path /api/ob/readiness, response: status code 500, latency 27.811µs, comment
2023-10-07T09:56:31.5279+08:00 INFO [7,] caller=server/http.go:101:func1: request: from 10.8.22.233, method GET, path /api/ob/readiness, response: status code 500, latency 26.812µs, comment
2023-10-07T09:56:33.52797+08:00 INFO [7,] caller=server/http.go:101:func1: request: from 10.8.22.233, method GET, path /api/ob/readiness, response: status code 500, latency 26.41µs, comment
2023/10/07 09:56:34 observer 73 [sleep]
2023-10-07T09:56:35.52766+08:00 INFO [7,] caller=server/http.go:101:func1: request: from 10.8.22.233, method GET, path /api/ob/readiness, response: status code 500, latency 26.133µs, comment
2023-10-07T09:56:37.52806+08:00 INFO [7,] caller=server/http.go:101:func1: request: from 10.8.22.233, method GET, path /api/ob/readiness, response: status code 500, latency 25.275µs, comment
2023/10/07 09:56:39 observer 73 [sleep]
2023-10-07T09:56:39.52743+08:00 INFO [7,] caller=server/http.go:101:func1: request: from 10.8.22.233, method GET, path /api/ob/readiness, response: status code 500, latency 25.028µs, comment
2023-10-07T09:56:41.52699+08:00 INFO [7,] caller=server/http.go:101:func1: request: from 10.8.22.233, method GET, path /api/ob/readiness, response: status code 500, latency 33.168µs, comment
2023-10-07T09:56:43.52729+08:00 INFO [7,] caller=server/http.go:101:func1: request: from 10.8.22.233, method GET, path /api/ob/readiness, response: status code 500, latency 31.36µs, comment
2023/10/07 09:56:44 observer 73 [sleep]
2023-10-07T09:56:45.52828+08:00 INFO [7,] caller=server/http.go:101:func1: request: from 10.8.22.233, method GET, path /api/ob/readiness, response: status code 500, latency 28.016µs, comment
2023-10-07T09:56:47.527+08:00 INFO [7,] caller=server/http.go:101:func1: request: from 10.8.22.233, method GET, path /api/ob/readiness, response: status code 500, latency 29.062µs, comment

[Question]: Can the observer startup command use only local_ip instead of devname?

Description

ob-operator: 2.0
Our environment cannot use calico to keep pod IPs unchanged, and we also need to be able to restore data from the data directory if the whole k8s cluster goes down, so we are trying HostNetwork mode.

func (m *OBServerManager) SupportStaticIp() bool {
	if obcluster.Spec.HostNetwork {
		m.Logger.Info("current network is HostNetwork")
		return true
	}
	// ... (remaining checks omitted in the original snippet)
	return false
}

The current problem is that the observer startup command hard-codes the NIC name as eth0; on hosts with multiple NICs and no eth0, the server fails to start:
cmd := fmt.Sprintf("cd %s && %s/bin/observer --nodaemon --appname %s --cluster_id %s --zone %s --devname %s
Would changing devname to local_ip affect other functionality of the operator?

[Enhancement]: optimize oceanbase connector manager cache

on branch 2.0.x_dev
Optimize the OceanBase connector's cache implementation. Currently it is as simple as a sync.Map with no size limit and no data expiry policy; it could be replaced with a cache library or extended with an eviction policy, for example something along the lines of the sketch below.
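
A minimal sketch of one possible direction, assuming a small TTL wrapper kept in-process instead of the bare sync.Map (all names are illustrative; a real fix might instead use an LRU library with a size limit):

package cache

import (
	"sync"
	"time"
)

// entry wraps a cached connector together with its expiry time.
type entry struct {
	value    interface{}
	expireAt time.Time
}

// TTLCache is a tiny expiring cache sketch for connector reuse.
type TTLCache struct {
	mu   sync.Mutex
	data map[string]entry
	ttl  time.Duration
}

func NewTTLCache(ttl time.Duration) *TTLCache {
	return &TTLCache{data: make(map[string]entry), ttl: ttl}
}

// Get returns the cached value if present and not expired; expired entries are dropped.
func (c *TTLCache) Get(key string) (interface{}, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.data[key]
	if !ok || time.Now().After(e.expireAt) {
		delete(c.data, key)
		return nil, false
	}
	return e.value, true
}

// Put stores a value with a fresh expiry timestamp.
func (c *TTLCache) Put(key string, value interface{}) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data[key] = entry{value: value, expireAt: time.Now().Add(c.ttl)}
}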

Expose the Service using NodePort

Check Before Asking

  • Please check the issue list and confirm this feature has not been requested before.
  • Please write the full text in English and attach a precise description.

Description

The OceanBase cluster brought up by the operator is exposed via NodePort.

Other Information

(screenshot)

Is it necessary to use NodePort? It does not look very secure.

[Question]: The information of the deleted pod still remains in the cluster.

Description

image version: oceanbasedev/ob-operator:1.2.0

apiVersion: cloud.oceanbase.com/v1
kind: OBCluster
metadata:
  name: ob-test
  namespace: test
spec:
  imageRepo: oceanbasedev/oceanbase-cn
  tag: v4.1.0.0-100000192023032010
  imageObagent: oceanbase/obagent:1.2.0
  clusterID: 1
  topology:
    - cluster: cn
      zone:
      - name: zone1
        region: region1
        nodeSelector:
          ob.zone: zone1
        replicas: 3
      parameters:
        - name: log_disk_size
          value: "24G"
        - name: system_memory
          value: "1G"
        - name: memory_limit
          value: "9G"
  resources:
    cpu: 4
    memory: 10Gi
    storage:
      - name: data-file
        storageClassName: "local-path"
        size: 50Gi
      - name: data-log
        storageClassName: "local-path"
        size: 50Gi
      - name: log
        storageClassName: "local-path"
        size: 30Gi
      - name: obagent-conf-file
        storageClassName: "local-path"
        size: 1Gi

When testing ob-operator, it was observed that after deleting a pod, the server still appears in the cluster information with 'INACTIVE' status, and the operator continues to perform health checks on it. Is this considered normal? In a scenario with three zones, each having one node, removing a node from one zone recovers normally; however, when deleting nodes from the other zones, new pods fail to start properly.
(screenshots)

[Feature]: Find a better way to maintain observer's ip address

Describe your use case

Find a better way to maintain the observer's IP address instead of relying on the CNI.

Describe the solution you'd like

Find a better way to maintain the observer's IP address instead of relying on the CNI; one option may be to add a service in front of each pod.

Describe alternatives you've considered

No response

Additional context

No response

[Bug]: Deploying ob-operator with Helm on Kubernetes v1.27.2 fails with: User "system:serviceaccount:publicservice:ob-operator-controller-manager" cannot list resource "backups" in API group "cloud.oceanbase.com" at the cluster scope

Describe the bug

ob-operator was deployed with Helm; the manager reports the following errors:


E0825 10:05:47.016362 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Tenant: failed to list *v1.Tenant: tenants.cloud.oceanbase.com is forbidden: User "system:serviceaccount:publicservice:ob-operator-controller-manager" cannot list resource "tenants" in API group "cloud.oceanbase.com" at the cluster scope
E0825 10:05:47.047356 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Restore: failed to list *v1.Restore: restores.cloud.oceanbase.com is forbidden: User "system:serviceaccount:publicservice:ob-operator-controller-manager" cannot list resource "restores" in API group "cloud.oceanbase.com" at the cluster scope
E0825 10:05:47.201811 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Backup: failed to list *v1.Backup: backups.cloud.oceanbase.com is forbidden: User "system:serviceaccount:publicservice:ob-operator-controller-manager" cannot list resource "backups" in API group "cloud.oceanbase.com" at the cluster scope
E0825 10:05:48.393312 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.TenantBackup: failed to list *v1.TenantBackup: tenantbackups.cloud.oceanbase.com is forbidden: User "system:serviceaccount:publicservice:ob-operator-controller-manager" cannot list resource "tenantbackups" in API group "cloud.oceanbase.com" at the cluster scope
E0825 10:05:48.707764 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Restore: failed to list *v1.Restore: restores.cloud.oceanbase.com is forbidden: User "system:serviceaccount:publicservice:ob-operator-controller-manager" cannot list resource "restores" in API group "cloud.oceanbase.com" at the cluster scope
E0825 10:05:49.984304 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Tenant: failed to list *v1.Tenant: tenants.cloud.oceanbase.com is forbidden: User "system:serviceaccount:publicservice:ob-operator-controller-manager" cannot list resource "tenants" in API group "cloud.oceanbase.com" at the cluster scope
E0825 10:05:50.342153 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Backup: failed to list *v1.Backup: backups.cloud.oceanbase.com is forbidden: User "system:serviceaccount:publicservice:ob-operator-controller-manager" cannot list resource "backups" in API group "cloud.oceanbase.com" at the cluster scope
E0825 10:05:53.729373 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Restore: failed to list *v1.Restore: restores.cloud.oceanbase.com is forbidden: User "system:serviceaccount:publicservice:ob-operator-controller-manager" cannot list resource "restores" in API group "cloud.oceanbase.com" at the cluster scope
E0825 10:05:53.821711 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.TenantBackup: failed to list *v1.TenantBackup: tenantbackups.cloud.oceanbase.com is forbidden: User "system:serviceaccount:publicservice:ob-operator-controller-manager" cannot list resource "tenantbackups" in API group "cloud.oceanbase.com" at the cluster scope
E0825 10:05:55.396156 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Tenant: failed to list *v1.Tenant: tenants.cloud.oceanbase.com is forbidden: User "system:serviceaccount:publicservice:ob-operator-controller-manager" cannot list resource "tenants" in API group "cloud.oceanbase.com" at the cluster scope
E0825 10:05:56.695090 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Backup: failed to list *v1.Backup: backups.cloud.oceanbase.com is forbidden: User "system:serviceaccount:publicservice:ob-operator-controller-manager" cannot list resource "backups" in API group "cloud.oceanbase.com" at the cluster scope
E0825 10:06:04.554326 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.TenantBackup: failed to list *v1.TenantBackup: tenantbackups.cloud.oceanbase.com is forbidden: User "system:serviceaccount:publicservice:ob-operator-controller-manager" cannot list resource "tenantbackups" in API group "cloud.oceanbase.com" at the cluster scope
E0825 10:06:05.043140 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Backup: failed to list *v1.Backup: backups.cloud.oceanbase.com is forbidden: User "system:serviceaccount:publicservice:ob-operator-controller-manager" cannot list resource "backups" in API group "cloud.oceanbase.com" at the cluster scope
E0825 10:06:05.963808 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Restore: failed to list *v1.Restore: restores.cloud.oceanbase.com is forbidden: User "system:serviceaccount:publicservice:ob-operator-controller-manager" cannot list resource "restores" in API group "cloud.oceanbase.com" at the cluster scope
E0825 10:06:06.751226 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Tenant: failed to list *v1.Tenant: tenants.cloud.oceanbase.com is forbidden: User "system:serviceaccount:publicservice:ob-operator-controller-manager" cannot list resource "tenants" in API group "cloud.oceanbase.com" at the cluster scope
E0825 10:06:20.786552 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Restore: failed to list *v1.Restore: restores.cloud.oceanbase.com is forbidden: User "system:serviceaccount:publicservice:ob-operator-controller-manager" cannot list resource "restores" in API group "cloud.oceanbase.com" at the cluster scope
E0825 10:06:24.288853 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Tenant: failed to list *v1.Tenant: tenants.cloud.oceanbase.com is forbidden: User "system:serviceaccount:publicservice:ob-operator-controller-manager" cannot list resource "tenants" in API group "cloud.oceanbase.com" at the cluster scope
E0825 10:06:26.521183 1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.Backup: failed to list *v1.Backup: backups.cloud.oceanbase.com is forbidden: User "system:serviceaccount:publicservice:ob-operator-controller-manager" cannot list resource "backups" in API group "cloud.oceanbase.com" at the cluster scope

Helm install command:

oceanbase

helm install ob-operator oceanbase/ob-operator \
--set image=******/dockerhub/oceanbasedev/ob-operator:1.2.0 \
--namespace publicservice \
--create-namespace

Environment

uname -a
(screenshot of uname -a output)

kubernetes: v1.27.2

(screenshot)

Fast reproduce steps

Helm install command:

oceanbase

helm install ob-operator oceanbase/ob-operator \
--set image=******/dockerhub/oceanbasedev/ob-operator:1.2.0 \
--namespace publicservice \
--create-namespace

The logged errors are the same RBAC "forbidden" errors shown above.



Expected behavior

No response

Actual behavior

No response

Additional context

No response

The pod never becomes ready after deploying to a k8s cluster

Software version:
ob-operator uses version v3.1.3-10100042022051821.
Problem description:
I followed the steps in the official documentation to deploy (documentation: https://open.oceanbase.com/docs/observer-cn/V3.1.3/0000000000160093).
Because the default resource requirements are relatively large, I reduced the default 2-core / 10G configuration. The pod now deploys successfully but never becomes ready, as shown in the screenshot:
(screenshot)
The logs inside the pod container show a probe error:
(screenshot)
Is this problem related to the resources, or is it a bug in the image?

[Feature]: Support scale up

Describe your use case

Support scaling up resources. Although OceanBase is a distributed database system, scaling up is still the best choice in some scenarios, especially when the initial resources are too small.

Describe the solution you'd like

Support resource scale-up.

Describe alternatives you've considered

No response

Additional context

No response

[Feat.]: implement tenant management

Check Before Asking

  • Please check the issue list and confirm this feature has not been requested before.
  • Please write the full text in English and attach a precise description.

Description

on branch 2.0.x_dev
Implement a tenant controller to handle OBTenant.

Hints for development:
Code related to OceanBase operations is placed in the pkg/oceanbase directory.
The code logic is similar to the other controllers: create a manager and coordinator and implement the ResourceManager interfaces.
OBTenant is defined separately from OBCluster, so owner reference maintenance should be considered: when an OBCluster is deleted, the related OBTenant resources should be deleted as well.
OBTenant has OBUnit as its sub-resource, so the unit controller should be implemented at the same time; see the sketch below.
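
A minimal sketch of how the tenant controller's setup might look with controller-runtime, watching OBUnit as an owned sub-resource so unit changes requeue the owning OBTenant (type names and the import path are assumptions, not the actual operator API):

package controllers

import (
	"context"

	ctrl "sigs.k8s.io/controller-runtime"

	v1alpha1 "github.com/oceanbase/ob-operator/api/v1alpha1" // assumed import path
)

// OBTenantReconciler is a hypothetical reconciler skeleton for OBTenant.
type OBTenantReconciler struct{}

// Reconcile would hold the manager/coordinator logic described above.
func (r *OBTenantReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	return ctrl.Result{}, nil
}

// SetupWithManager reconciles OBTenant objects and also watches OBUnit objects
// created with an OBTenant owner reference, so unit changes trigger a reconcile.
func (r *OBTenantReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&v1alpha1.OBTenant{}).
		Owns(&v1alpha1.OBUnit{}).
		Complete(r)
}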

Other Information

No response

[Feature]: Support standalone mode

Describe your use case

OceanBase relies on multiple replicas to achieve high availability, but this consumes a lot of resources and users may have limited resources. The current implementation relies on CNI plugins to pin the pod's IP address; otherwise the cluster will crash when the only observer restarts with a different IP address. So standalone mode needs to be implemented properly.

Describe the solution you'd like

  1. define an annotation to specify the standalone mode
  2. start the obcluster with one obzone and one observer using the IP address '127.0.0.1'
  3. (optional) use one storage volume for all the data, clog and log; if the storage class supports snapshots, any savepoint can be flashed back to
  4. create a service for this obcluster in case a standby tenant is needed

Describe alternatives you've considered

No response

Additional context

No response

[Enhancement]: Isolate OB pods from business-cluster pods

Enhancement
Background: the OB cluster pods run in the same K8s cluster as many pods of business application systems.
Problem: sometimes many business application pods are scheduled onto the machines selected by OB's nodeSelector, and the OB cluster pods then cannot start because of insufficient CPU and memory (K8s reports that host resources are insufficient and refuses to schedule them).
Requirement: isolate OB pods from business-cluster pods.
Currently nodeSelector only decides which machines OB pods are scheduled to, and we have no permission to change the business application pods so that they are scheduled onto non-OB machines. Our idea is to taint the OB hosts so that business pods cannot be scheduled onto them, while OB tolerates the taint, thereby isolating OB pods from business pods.
However, as far as I know, OB currently has no "tolerations" capability. Could this feature be added to the release plan? Or is there another way to isolate OB pods from business-cluster pods? (A sketch of the idea follows.)
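
A minimal sketch of what such support could look like if the operator copied a tolerations field from the CR spec into the observer pod template (the spec field and the taint key are assumptions, not an existing API):

package builder

import (
	corev1 "k8s.io/api/core/v1"
)

// applyTolerations copies user-supplied tolerations into the observer pod spec,
// so OB pods can be scheduled onto nodes tainted against business workloads.
func applyTolerations(podSpec *corev1.PodSpec, tolerations []corev1.Toleration) {
	podSpec.Tolerations = append(podSpec.Tolerations, tolerations...)
}

// Example: tolerate a dedicated taint with effect NoSchedule on OB nodes.
var exampleToleration = corev1.Toleration{
	Key:      "oceanbase/dedicated", // hypothetical taint key
	Operator: corev1.TolerationOpEqual,
	Value:    "true",
	Effect:   corev1.TaintEffectNoSchedule,
}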

[Feature]: Deploy OceanBase across multiple K8s clusters

Describe your use case

Deploy OceanBase across multiple K8s clusters.

Describe the solution you'd like

  1. deploy the cluster across multiple K8s clusters and create a tenant and a standby tenant
  2. deploy multiple zones across multiple K8s clusters (optional)

Describe alternatives you've considered

No response

Additional context

No response

[Bug]: Specify memory limit when creating the observer

Describe the bug

The current implementation relies on the observer to calculate the memory limit automatically, but in some environments the observer reads the memory size of the node instead of the pod and derives a wrong memory_limit. This causes an OOM kill when the observer reaches the memory limit of the pod; a sketch of the proposed fix follows.
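
A minimal sketch of the proposed fix, assuming the operator derives memory_limit from the pod's memory limit and passes it explicitly at startup (memory_limit is a real OceanBase parameter; the helper and the 90% margin are illustrative):

package observer

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

// memoryLimitOpt turns the pod memory limit into an explicit observer
// memory_limit startup option, instead of letting the observer auto-detect
// the node's memory. The 90% ratio is an illustrative safety margin.
func memoryLimitOpt(podMemory resource.Quantity) string {
	bytes := podMemory.Value()
	limitGB := bytes * 90 / 100 / (1 << 30)
	if limitGB < 1 {
		limitGB = 1
	}
	return fmt.Sprintf("memory_limit=%dG", limitGB)
}

// Example: a pod with a 10Gi memory limit would yield "memory_limit=9G",
// which could be appended to the observer's -o startup options.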

Environment

cri: containerd

Fast reproduce steps

Create an obcluster with a memory size smaller than the node's memory size,
then check memory_limit in the obcluster.

Expected behavior

No response

Actual behavior

No response

Additional context

No response

[Feature]: Support cluster operation resource like tenant operation

Describe your use case

It is common for users to modify the topology and parameters of an OBCluster, perhaps even more so than what they do to an OBTenant. Small single jobs such as updating parameters, changing topology, or upgrading the cluster should be supported with a single auxiliary resource like OBClusterOperation.

Describe the solution you'd like

Create a new CRD named OBClusterOperation, which covers parameter modification, topology modification, cluster upgrading and any other operational tasks, making OBCluster itself thinner. A possible shape is sketched below.
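
A minimal sketch of what such a CRD's Go types might look like (field names and operation types here are assumptions, not a finalized API):

package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// OBClusterOperationSpec describes one operational task against an OBCluster.
type OBClusterOperationSpec struct {
	// Name of the target OBCluster in the same namespace.
	OBCluster string `json:"obcluster"`
	// Type of operation, e.g. "ModifyParameters", "ChangeTopology", "Upgrade"
	// (illustrative values).
	Type string `json:"type"`
	// Parameters to modify when Type is "ModifyParameters".
	Parameters []Parameter `json:"parameters,omitempty"`
	// Image to upgrade to when Type is "Upgrade".
	TargetImage string `json:"targetImage,omitempty"`
}

// Parameter is a single OceanBase parameter name/value pair.
type Parameter struct {
	Name  string `json:"name"`
	Value string `json:"value"`
}

// OBClusterOperation is the auxiliary resource proposed in this issue.
type OBClusterOperation struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`
	Spec              OBClusterOperationSpec `json:"spec,omitempty"`
}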

Describe alternatives you've considered

No response

Additional context

No response

[Feat.]: add proper events to show key operations and errors

Check Before Asking

  • Please check the issue list and confirm this feature has not been requested before.
  • Please write the full text in English and attach a precise description.

Description

Add proper Kubernetes events to show key operations and errors; a sketch follows.
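
A minimal sketch of how a reconciler could emit such events with the standard client-go event recorder (resource, reason and message names are illustrative):

package controllers

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/tools/record"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// eventExample holds a recorder, typically obtained from
// mgr.GetEventRecorderFor("obcluster-controller").
type eventExample struct {
	Recorder record.EventRecorder
}

// emitExamples shows a normal event for a key operation and a warning for an error.
func (e *eventExample) emitExamples(obj client.Object, err error) {
	e.Recorder.Event(obj, corev1.EventTypeNormal, "Bootstrap", "cluster bootstrap started")
	if err != nil {
		e.Recorder.Eventf(obj, corev1.EventTypeWarning, "BootstrapFailed", "bootstrap failed: %v", err)
	}
}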

Other Information

No response

[Question]: <How to stop OBCluster>

If the question is concise and probably has a short answer, consider asking it in the community Slack.

Question
Hello,
I would like to know how to stop an OBCluster without deleting it; I want to keep the PVC data. Thank you!

Environment

  • OS Version and CPU Arch(uname -a):

  • Component Version:

Weekly Meeting

What is ob-operator

The ob-operator is a Kubernetes operator that simplifies the deployment and management of OceanBase clusters on Kubernetes.

Current status of ob-operator

v1

Covers all fundamental operations of OceanBase and related resources, including

  • Cluster creation
  • Metric collection ability
  • Scale up and scale down of obzone and observer
  • Cluster upgrade
  • Tenant management
  • Backup and restore

v2alpha1

  • Code refactoring
  • Successfully bootstrapped obcluster

What's new in v2alpha1

  • API (CRD) definitions conform more closely to OceanBase concepts
  • The whole process of operating a resource is split into smaller tasks executed within a unified processing flow
  • Common utility logic is separated into independent modules
  • Support for new features
  • Various enhancements

Why we need you

  • Real production scenarios and requirements using K8s
  • More feedback
  • Collaboration

Issues

https://github.com/oceanbase/ob-operator/issues?q=is%3Aopen+is%3Aissue

scale down should delete the observer in the obcluster first

ob-operator currently deletes the observer pod first when scaling down, before deleting the observer in the obcluster; this may cause data loss when the majority of replicas of a unit are on the deleted pod.

Suggestion: delete the observer in the obcluster first, and then delete the pod, as sketched below.
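
A minimal sketch of the suggested ordering, assuming a SQL connection to the sys tenant; ALTER SYSTEM DELETE SERVER is a real OceanBase statement, while the helper and the pod-deletion call shape here are illustrative:

package resource

import (
	"context"
	"database/sql"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// scaleDownObserver first removes the server from the OceanBase cluster, then
// deletes the backing pod, so data can be migrated before the pod disappears.
func scaleDownObserver(ctx context.Context, db *sql.DB, c client.Client, pod *corev1.Pod, serverAddr string) error {
	// Step 1: remove the observer from the obcluster (triggers unit migration).
	if _, err := db.ExecContext(ctx, fmt.Sprintf("ALTER SYSTEM DELETE SERVER '%s'", serverAddr)); err != nil {
		return err
	}
	// Step 2 (omitted): wait until the server no longer appears in the server list.
	// Step 3: only then delete the pod.
	return c.Delete(ctx, pod)
}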

[Feat.]: implement backup and restore management

Check Before Asking

  • Please check the issue list and confirm this feature has not been requested before.
  • Please write the full text in English and attach a precise description.

Description

on branch 2.0.x_dev
Implement backup and restore controllers to handle backup and restore.

Hints for development:
Code related to OceanBase operations is placed in the pkg/oceanbase directory.
The code logic is similar to the other controllers: create a manager and coordinator and implement the ResourceManager interfaces.
Backup and restore differ significantly between versions 3.x and 4.x, so define different CRDs to describe each version.

Other Information

No response

[Feat.]: disaster recovery use existing data when possible

Check Before Asking

  • Please check the issue list and confirm this feature has not been requested before.
  • Please write the full text in English and attach a precise description.

Description

Currently, the disaster recovery logic creates a new pod as a new observer, scales it into the OBCluster, and deletes the unhealthy one; this may cause a full data copy.

Is it possible to reuse the data when the PVC's reclaim policy is Retain?
Reusing the data requires the new pod to have the same IP address as the former one.

Other Information

No response
