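#!/bin/bash
# IoT base station service health check script

# Failed checks are handled by the if/else branches below. Commands that run
# outside a condition carry a fallback so that `set -e` does not abort the report.
set -e

# Succeed only if the URL responds without an HTTP error status within 5 seconds.
# -f makes curl fail on HTTP status >= 400, which plain `curl -s` would ignore.
http_ok() {
    curl -sf --max-time 5 "$1" > /dev/null 2>&1
}

echo "=== IoT Base Station Service Health Check ==="

# Check PostgreSQL
echo "Checking PostgreSQL..."
if docker-compose exec -T postgres pg_isready -U postgres > /dev/null 2>&1; then
    echo "✓ PostgreSQL: healthy"

    # Check that the application database accepts queries
    if docker-compose exec -T postgres psql -U postgres -d iot_base_station -c "SELECT 1;" > /dev/null 2>&1; then
        echo "  - Database connection: OK"
    else
        echo "  - Database connection: FAILED"
    fi
else
    echo "✗ PostgreSQL: unavailable"
fi

# Check InfluxDB
echo "Checking InfluxDB..."
if http_ok http://localhost:8086/health; then
    echo "✓ InfluxDB: healthy"

    # Check that the organization exists
    if curl -sf --max-time 5 -H "Authorization: Token influxdb-token" "http://localhost:8086/api/v2/orgs" | grep -q "iot-org"; then
        echo "  - Organization config: OK"
    else
        echo "  - Organization config: FAILED"
    fi
else
    echo "✗ InfluxDB: unavailable"
fi

# Check Redis (container first, then a local instance)
echo "Checking Redis..."
if docker-compose exec -T redis redis-cli ping > /dev/null 2>&1; then
    echo "✓ Redis: healthy"

    # Check memory usage
    MEMORY=$(docker-compose exec -T redis redis-cli info memory | grep '^used_memory_human:' | cut -d: -f2 | tr -d '\r')
    echo "  - Memory usage: $MEMORY"
elif redis-cli ping > /dev/null 2>&1; then
    echo "✓ Redis (local): healthy"
    MEMORY=$(redis-cli info memory | grep '^used_memory_human:' | cut -d: -f2 | tr -d '\r')
    echo "  - Memory usage: $MEMORY"
else
    echo "✗ Redis: unavailable"
fi

# Check NATS
echo "Checking NATS..."
if http_ok http://localhost:8222/varz; then
    echo "✓ NATS: healthy"

    # Check the connection count (requires jq; falls back to "unknown" without it)
    CONNECTIONS=$(curl -sf --max-time 5 http://localhost:8222/varz | jq -r '.connections // 0' 2>/dev/null || echo "unknown")
    echo "  - Current connections: $CONNECTIONS"
else
    echo "✗ NATS: unavailable"
fi

# Check MQTT (container first, then a local broker)
echo "Checking MQTT..."
if docker-compose exec -T mqtt mosquitto_pub -h localhost -t '$SYS/broker/version' -m 'test' > /dev/null 2>&1; then
    echo "✓ MQTT: healthy"

    # Check the connected client count. Mosquitto publishes it on
    # $SYS/broker/clients/connected; -W 5 keeps mosquitto_sub from blocking
    # forever if no message arrives.
    CONNECTIONS=$(docker-compose exec -T mqtt mosquitto_sub -h localhost -t '$SYS/broker/clients/connected' -C 1 -W 5 2>/dev/null | grep -o '[0-9]\+' || echo "0")
    echo "  - Current connections: $CONNECTIONS"
elif mosquitto_pub -h localhost -t '$SYS/broker/version' -m 'test' > /dev/null 2>&1; then
    echo "✓ MQTT (local): healthy"
else
    echo "✗ MQTT: unavailable"
fi

# Check Grafana
echo "Checking Grafana..."
if http_ok http://localhost:3000/api/health; then
    echo "✓ Grafana: healthy"
else
    echo "✗ Grafana: unavailable"
fi

# Check Prometheus
echo "Checking Prometheus..."
if http_ok http://localhost:9090/-/healthy; then
    echo "✓ Prometheus: healthy"

    # Check the number of active targets (requires jq)
    TARGETS=$(curl -sf --max-time 5 http://localhost:9090/api/v1/targets | jq -r '.data.activeTargets | length' 2>/dev/null || echo "unknown")
    echo "  - Active targets: $TARGETS"
else
    echo "✗ Prometheus: unavailable"
fi

# Check application services
echo "Checking application services..."
if http_ok http://localhost:8080/health; then
    echo "✓ Main server: healthy"
else
    echo "✗ Main server: unavailable"
fi

if http_ok http://localhost:8081/health; then
    echo "✓ Data gateway: healthy"
else
    echo "✗ Data gateway: unavailable"
fi

if http_ok http://localhost:8082/health; then
    echo "✓ Monitoring service: healthy"
else
    echo "✗ Monitoring service: unavailable"
fi

echo ""
echo "=== Health Check Complete ==="

# Show container resource usage
echo ""
echo "=== Resource Usage ==="
docker stats --no-stream --format "table {{.Container}}\t{{.CPUPerc}}\t{{.MemUsage}}" || echo "Unable to get container stats"

# Show disk usage
echo ""
echo "=== Disk Usage ==="
df -h | grep -E "(Filesystem|/dev/)" || echo "Unable to get disk usage"

# Show listening sockets on the service ports. The original pattern also matched
# every LISTEN line, which made the port list redundant.
echo ""
echo "=== Network Connections ==="
netstat -tuln 2>/dev/null | grep -E ":(5432|8086|6379|4222|1883|3000|9090|8080|8081|8082)[[:space:]]" || echo "Unable to get network connection info"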