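Preface:
Because of the nature of the business, plus the poor execution efficiency of the CRMEB PHP code, the server raised a CPU alarm at 7 p.m. sharp every evening. Even with three Swarm workers I sometimes still had to scale out by hand, so I simply moved everything to k8s. I'm not sure how I went from developer to part-time ops, but fortunately I was already interested in tinkering with this.
This migration uses a brand-new k8s cluster, so it effectively starts from zero. It uses the Alibaba Cloud ACK basic edition (which comes with a dashboard and Prometheus, and whose control-plane nodes are maintenance-free). It can be upgraded seamlessly to Pro; the basic edition supports at most 10 nodes.
Although the Alibaba Cloud console is smart enough that resource provisioning and binding can mostly be done by clicking through it, I still recommend managing resources with a local kubectl plus YAML files (convenient for rebuilding services, and for making a quick exit).
Additional cost after migration: NAT gateway 5 yuan + SLB 4 yuan + public IP 0.4 yuan, plus cloud disks and ingress traffic, which keeps the added cost under 20 yuan per day.
The main components installed are Loki + Alloy + Grafana for logging, and ingress + cert-manager for certificate issuance.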
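ACK service creation and network connectivity
Creating the ACK cluster
You can choose to bind an EIP (for API calls) and an SLB (created by default and must not be deleted; if it is deleted by mistake, the cluster can only be rebuilt).
Network connectivity
Choose servers in the same region as the original services to make network connectivity easier. Choose a VPC and use a VPC peering connection (vpcpeers) to connect the networks of the original Swarm and the k8s cluster. This way the existing database service does not need to switch vSwitches, which would interrupt service, and pods in the new cluster can still reach the database normally.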
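Node pool configuration
This mainly involves the kubelet settings (automatic cleanup of logs and images) and the containerd settings (a proxy for getting past the firewall).
Config file locations on the node: /var/lib/kubelet/config.yaml and /etc/containerd/config.toml
You can add these directly through the node pool configuration, so that new nodes come up with them already in place.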
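As a rough sketch of what the kubelet side of this might look like, the fragment below shows the log-rotation and image garbage-collection fields of /var/lib/kubelet/config.yaml. The field names are standard KubeletConfiguration fields, but every value here is an illustrative assumption, not the setting used in this cluster.

```yaml
# Fragment of /var/lib/kubelet/config.yaml (all values are assumptions for illustration)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Rotate a container's log file once it reaches this size
containerLogMaxSize: 50Mi
# Keep at most this many log files per container
containerLogMaxFiles: 3
# Start image garbage collection when disk usage exceeds this percentage
imageGCHighThresholdPercent: 80
# Garbage collection removes unused images until disk usage falls to this percentage
imageGCLowThresholdPercent: 70
```

On ACK the same fields would normally be entered through the node pool's kubelet settings rather than edited by hand on each node.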
Logging setup
Resulting log view:
Installing Loki to store logs
# Install Loki
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm install loki grafana/loki -f values.yaml
# Equivalent long form of the install command:
# helm install --values values.yaml loki grafana/loki
The configuration uses a minimal installation:
loki:
  replicaCount: 1  # set the replica count to 1
  commonConfig:
    replication_factor: 1
  schemaConfig:
    configs:
      - from: "2024-04-01"
        store: tsdb
        object_store: alibabacloud
        schema: v13
        index:
          prefix: loki_index_
          period: 24h
  auth_enabled: false
  storage:
    type: s3
    bucketNames:
      chunks: k8s-xxx-oss
      ruler: k8s-xxx-oss
  storage_config:
    alibabacloud:
      bucket: k8s-xxxx-oss
      endpoint: oss-cn-hangzhou-internal.aliyuncs.com
      access_key_id: xxxxx
      secret_access_key: xxxx
  table_manager:
    retention_deletes_enabled: true  # enable automatic deletion of expired logs
    retention_period: 336h  # log retention period (must be a multiple of the index table period)
minio:
  enabled: false
# Send logs to Loki directly; do not create the canary container
lokiCanary:
  enabled: false
test:
  enabled: false
deploymentMode: SingleBinary
singleBinary:
  replicas: 1
chunksCache:
  enabled: false
backend:
  replicas: 0
read:
  replicas: 0
write:
  replicas: 0
ingester:
  replicas: 0
querier:
  replicas: 0
queryFrontend:
  replicas: 0
queryScheduler:
  replicas: 0
distributor:
  replicas: 0
compactor:
  replicas: 0
indexGateway:
  replicas: 0
bloomCompactor:
  replicas: 0
bloomGateway:
  replicas: 0
Installing Alloy to collect logs
kubectl create configmap --namespace monitoring alloy-config "--from-file=config.alloy=./config.alloy"
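The configuration is below. Note that a dynamic label can only be added after the stage.regex block that extracts it has been declared, whereas static labels have no such requirement. This is a gotcha.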
alloy:
  configMap:
    content: |-
      local.file_match "local_files" {
        path_targets = [{"__path__" = "/var/log/apps/*/*.log"}]
        sync_period = "1s"
      }

      loki.source.file "log_scrape" {
        targets = local.file_match.local_files.targets
        forward_to = [loki.process.filter_logs.receiver]
        tail_from_end = true
      }

      loki.process "filter_logs" {
        // stage.drop {
        //   source = ""
        //   expression = ".*Connection closed by authenticating user root"
        //   drop_counter_reason = "noisy"
        // }
        forward_to = [loki.process.add_labels.receiver]
      }

      loki.process "add_labels" {
        forward_to = [loki.relabel.relable.receiver]

        // Extract the app directory name from the log file path
        stage.regex {
          expression = "/var/log/apps/(?P<appname>[^/]+)/[^/]+\\.[^/]+"
          source = "filename"
        }

        stage.static_labels {
          values = {
            "env" = "prod",
          }
        }

        // Promote the extracted appname value to a label
        stage.labels {
          values = {
            "appname" = "appname",
          }
        }
      }

      loki.relabel "relable" {
        forward_to = [loki.write.grafana_loki.receiver]
      }

      loki.write "grafana_loki" {
        endpoint {
          url = "http://loki-gateway.loki.svc.cluster.local/loki/api/v1/push"
        }
      }
  mounts:
    # -- Mount /var/log from the host into the container for log collection.
    varlog: false
Installing Grafana
helm install grafana grafana/grafana --namespace monitoring --create-namespace
All that remains is to add Loki as a data source in Grafana.
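One way to do this without clicking through the UI is to provision the data source through the grafana/grafana chart values. The fragment below is a minimal sketch, not the author's file: the URL reuses the loki-gateway host from the Alloy config above, and the data source name Loki is an arbitrary choice.

```yaml
# values.yaml fragment for the grafana/grafana chart (a sketch under the assumptions above)
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Loki        # assumed display name
        type: loki
        access: proxy     # Grafana's backend queries Loki on behalf of the browser
        # Gateway Service of the Loki release, same host Alloy pushes to
        url: http://loki-gateway.loki.svc.cluster.local
```

Because Loki runs with auth_enabled: false, no tenant header should be needed here. The fragment can be passed with -f when installing or upgrading the Grafana release.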
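Automatic certificate issuance
I recommend reading this article directly: https://zhuanlan.zhihu.com/p/691940896
Only a ClusterIssuer demo is given here. While debugging, commands such as the following show the issuance status: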
kubectl get certificate
kubectl get certificaterequests
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: zerossl-production
spec:
  acme:
    # ZeroSSL ACME server
    server: https://acme.zerossl.com/v2/DV90
    email: cyanprober@gmail.com
    # name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: zerossl-prod
    # for each cert-manager new EAB credentials are required
    externalAccountBinding:
      keyID: xxxxx
      keySecretRef:
        name: zero-ssl-eabsecret
        key: secret
    solvers:
      - dns01:
          webhook:
            groupName: acme.cyanprobe.com
            solverName: alidns
            config:
              region: "cn-hangzhou"
              accessKeyIdRef:
                name: alidns-secret
                key: access-key-id
              accessKeySecretRef:
                name: alidns-secret
                key: access-key-secret
        selector:
          dnsZones:
            - 'xxxx1.com'
      - dns01:
          webhook:
            groupName: acme.cyanprobe.com
            solverName: alidns
            config:
              region: "cn-hangzhou"
              accessKeyIdRef:
                name: alidns-secret-k
                key: access-key-id
              accessKeySecretRef:
                name: alidns-secret-k
                key: access-key-secret
        selector:
          dnsZones:
            - 'xxx.top'
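To show how a ClusterIssuer like this is typically consumed, here is a minimal Ingress sketch. The cert-manager.io/cluster-issuer annotation asks cert-manager to create a Certificate for the hosts listed under tls. The host app.xxxx1.com, the Service my-app, the Secret name my-app-tls and the nginx ingress class are all hypothetical placeholders, not values from this cluster.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app                      # hypothetical name
  annotations:
    # Must match the name of the ClusterIssuer above
    cert-manager.io/cluster-issuer: zerossl-production
spec:
  ingressClassName: nginx           # assumption: an NGINX ingress controller is installed
  tls:
    - hosts:
        - app.xxxx1.com             # should fall under a dnsZones entry of one of the solvers
      secretName: my-app-tls        # cert-manager stores the issued certificate in this Secret
  rules:
    - host: app.xxxx1.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app        # hypothetical backend Service
                port:
                  number: 80
```

After applying it, a Certificate for this Ingress should appear in the output of kubectl get certificate, and kubectl get certificaterequests shows the progress of the order.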