[Category:] Technology
-
Kubernetes Node Disk Pressure Threshold Adjustment (evictionHard)
When node disk usage exceeds a certain threshold, the DiskPressure condition is triggered and pods are evicted. Default hard thresholds (Linux-based): In other words, by default, DiskPressure occurs when root (nodefs) reaches 90% usage or imagefs reaches 85% usage. Adjustment Result (usage basis): Threshold Key Descriptions 📌 Summary: Disk usage can now reach up to…
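The kubelet expresses these thresholds as free space (`nodefs.available`, `imagefs.available`) rather than as usage, so the 90% and 85% figures come from subtracting the threshold from 100. A minimal Python sketch of that arithmetic follows. The helper names `usage_limit_percent` and `under_disk_pressure` are hypothetical and are not part of any Kubernetes API.

```python
# Default Linux hard-eviction thresholds, as "available" percentages.
DEFAULT_EVICTION_HARD = {
    "nodefs.available": 10,   # root filesystem
    "imagefs.available": 15,  # container image filesystem
}


def usage_limit_percent(available_threshold_percent: int) -> int:
    """Usage percentage at which an 'available < X%' threshold is crossed."""
    return 100 - available_threshold_percent


def under_disk_pressure(capacity_bytes: int, available_bytes: int,
                        available_threshold_percent: int) -> bool:
    """True when free space has dropped below the threshold (hypothetical helper)."""
    # Integer comparison avoids floating-point rounding at the boundary.
    return available_bytes * 100 < available_threshold_percent * capacity_bytes


for signal, threshold in DEFAULT_EVICTION_HARD.items():
    print(f"{signal}<{threshold}% fires at {usage_limit_percent(threshold)}% usage")
```

Read this way, lowering an `available` threshold raises the usage level a node can reach before pods are evicted, which shrinks the free-space buffer accordingly.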
-
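Adjusting the Kubernetes Node Disk Pressure Threshold (evictionHard)
When node disk usage exceeds the threshold, the DiskPressure condition is raised and pods are evicted. Default hard thresholds (Linux): In other words, by default DiskPressure occurs when root (nodefs) reaches 90% usage or image (imagefs) reaches 85% usage. Adjustment 📌 Summary: thresholds raised so that DiskPressure is not triggered even when disk usage climbs to 95% → a setting that greatly reduces the free-space buffer compared with the default. ⚠️ Risk: log writes, updates, and temporary file writes may fail…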
-
Resolving Installation Conflicts Caused by Undeleted Resources in Kubernetes
When operating a Kubernetes cluster, you may encounter situations where deploying a new add-on or application fails because resources from a previous installation were not completely removed. A common culprit is leftover Webhook resources such as MutatingWebhookConfiguration or ValidatingWebhookConfiguration. Since these components intercept API requests to validate or mutate resources, their presence can cause unexpected…
-
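Solving Installation Conflicts Caused by Undeleted Resources in Kubernetes
When deploying add-ons or new applications to a production Kubernetes cluster, existing resources are sometimes not fully deleted, and the leftovers cause conflicts. A typical case is when webhook resources such as MutatingWebhookConfiguration or ValidatingWebhookConfiguration are left behind. Because these resources sit in the API request path and validate resource creation and modification, they can cause unexpected errors. Symptoms 1. Leftover webhook…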
-
Clean Removal & Re-Addition of Kubernetes Worker Nodes (with Kubespray)
Scope: Worker nodes only (excluding control plane and etcd). Assumption: Running cluster in production, minimize downtime. ✅ Checklist 0. Pre-check: kubectl drain respects PodDisruptionBudgets (PDBs). If drain is blocked by a PDB, scale out/in temporarily or relax the PDB before proceeding. 📌 PodDisruptionBudget (PDB): If there are 3 pods with label app=my-api, drain will only remove…
-
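Deleting & Re-Adding Worker Nodes in Kubespray-Based K8s
Scope: worker nodes built and operated with Kubespray (excluding control-plane and etcd). Assumption: cluster in production, with the goal of minimizing downtime. Checklist 0. Pre-change checks 1. Drain the worker node & remove K8s objects. Basic procedure: drain → delete node. Clean up leftover CNI (e.g., Calico) resources (optional): In rare cases, the CNI's node/IP allocations remain even after the node object is deleted. When using Calico: ⚠️ Caution:…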
-
An Examination of Monitoring Metrics: Part 5 MongoDB
MongoDB is more than a simple document database. It is widely used as a session store, log analytics engine, and even a messaging backbone. To ensure stable operations, monitoring must cover availability, performance, resources, cursors & connections, and network usage. The following are the core metrics that should always be part of a MongoDB monitoring strategy. 1.…
-
An Examination of Monitoring Metrics: Part 4 Elasticsearch
1. Cluster Health Metrics cluster health: Overall cluster status. unassigned shards: Number of shards not assigned to any node. 2. Resource Metrics Total size of all file stores / Total available size to JVM in all file stores ⚠️ Problem points when Available decreases rapidly Summary jvm_heap_usage_percent node uptime 3. Performance Metrics query latency: Search query response time.…
-
An Examination of Monitoring Metrics: Part 3 Redis
1. Memory Metrics used_memory mem_fragmentation_ratio evicted_keys 2. Performance Metrics instantaneous_ops_per_sec slowlog 3. Connection Metrics blocked_clients connected_clients rejected_connections 4. Network Metrics total_net_input_bytes / total_net_output_bytes 5. Persistence Metrics (Persistence: the property of data being safely preserved beyond memory to disk) rdb_last_bgsave_status aof_last_bgrewrite_status rdb_changes_since_last_save 6. Cache Efficiency Metrics keyspace_hits / keyspace_misses ⚠ Note: The default Redis template for…
-
An Examination of Monitoring Metrics: Part 2 Kafka
In the previous article, we looked at MySQL metrics. This time, we turn to Kafka. In production environments, Kafka has grown beyond being just a simple message queue to become a critical data streaming platform. Therefore, closely monitoring the state of Kafka brokers and clusters is essential for preventing incidents and ensuring stable performance. In this article,…