command by lancejames221b
Hv Status
Monitor hAIveMind collective health, agent status, and system performance
Installs: 0
Used in: 1 repos
Updated: 1d ago
$ npx ai-builder add command lancejames221b/hv-status
Installs to .claude/commands/hv-status.md
# hv-status - Collective Health Monitor

## Purpose

Comprehensive monitoring of hAIveMind collective health including agent availability, memory utilization, network connectivity, and system performance metrics.

## When to Use

- **Daily Health Checks**: Monitor collective operational status
- **Troubleshooting Issues**: Diagnose connectivity or performance problems
- **Capacity Planning**: Monitor memory usage and agent workload
- **Network Monitoring**: Check Tailscale connectivity between agents
- **Before Delegating**: Verify target agents are available and responsive
- **Performance Analysis**: Understand collective resource utilization

## Syntax

```
hv-status [options]
```

## Parameters

- **options** (optional): Display filtering and formatting
  - `--detailed`: Show comprehensive information for all sections
  - `--agents`: Show only agent roster and availability
  - `--memory`: Show only memory statistics and usage
  - `--network`: Show only network connectivity status
  - `--json`: Output in JSON format for programmatic use
  - `--quiet`: Show only critical issues, minimal output

## Status Information Sections

### Agent Roster and Availability

- **Active Agents**: Currently online and responding
- **Agent Capabilities**: Skills and expertise each agent provides
- **Response Times**: Average response latency for each agent
- **Workload Status**: Current task queue and availability
- **Last Seen**: When each agent was last active

### Memory Statistics

- **Storage Utilization**: ChromaDB and Redis usage metrics
- **Memory Categories**: Distribution across infrastructure, incidents, etc.
- **Growth Trends**: Memory usage over time
- **Cache Performance**: Redis hit rates and efficiency
- **Cleanup Status**: Old memory removal and optimization

### Network Health

- **Tailscale Connectivity**: Connection status to each machine
- **API Endpoints**: Health check status for MCP servers
- **Sync Performance**: Inter-machine synchronization latency
- **Certificate Status**: SSL/TLS certificate validity
- **Firewall Status**: Port accessibility between nodes

## Real-World Examples

### Quick Health Check

```
hv-status
```

**Result**: Overview of collective health with key metrics and any issues highlighted

### Detailed System Analysis

```
hv-status --detailed
```

**Result**: Comprehensive report suitable for troubleshooting or performance analysis

### Agent Availability Check

```
hv-status --agents
```

**Result**: Focus on which agents are available for task delegation

### Memory Usage Analysis

```
hv-status --memory
```

**Result**: Storage metrics for capacity planning and cleanup decisions

### Programmatic Monitoring

```
hv-status --json --quiet
```

**Result**: JSON output for automated monitoring scripts, errors only

## Expected Output

### Standard Status Overview

```
hAIveMind Collective Status - 2025-01-24 14:30:00

🎯 Collective Health: OPERATIONAL
  ↳ 12 of 14 agents responding (85.7%)
  ↳ 2 agents offline: tony-dev, mike-dev (non-critical)
  ↳ Average response time: 245ms
  ↳ No critical issues detected

🤖 Agent Roster (Top 5 by Activity):
┌─────────────────────┬───────────────┬────────┬──────────┬──────────┐
│ Agent Name          │ Capabilities  │ Status │ Response │ Workload │
├─────────────────────┼───────────────┼────────┼──────────┼──────────┤
│ elastic1-specialist │ elasticsearch │ Online │ 180ms    │ (bar)    │
│ lance-dev-agent     │ coordination  │ Online │ 120ms    │ (bar)    │
│ security-analyst    │ security      │ Online │ 290ms    │ (bar)    │
│ mysql-specialist    │ database_ops  │ Online │ 340ms    │ (bar)    │
│ monitoring-agent    │ monitoring    │ Online │ 205ms    │ (bar)    │
└─────────────────────┴───────────────┴────────┴──────────┴──────────┘

💾 Memory Statistics:
  ↳ Total Memories: 8,742 (↑ 127 today)
  ↳ Storage Usage: 2.3 GB / 50 GB (4.6%)
  ↳ Categories: Infrastructure 35%, Incidents 28%, Security 18%, Other 19%
  ↳ Redis Cache: 89% hit rate, 512 MB used

Network Health:
  ↳ Tailscale: Connected to 11 nodes
  ↳ MCP Servers: All endpoints responding
  ↳ Sync Status: Last sync 14 minutes ago
  ↳ Certificate: Valid until 2025-06-15

Recent Activity (Last 24h):
  ↳ Broadcasts: 23 (change of 8 from yesterday)
  ↳ Delegations: 45 (change of 12 from yesterday)
  ↳ Queries: 156 (change of 3 from yesterday)
  ↳ Memory Stores: 89 (change of 15 from yesterday)

⚠️ Warnings:
  ↳ elastic2 response time increased 40% (480ms avg)
  ↳ Memory growth rate above normal (↑ 18% this week)

💡 Recommendations:
  ↳ Consider restarting elastic2 agent to improve response time
  ↳ Schedule memory cleanup for memories older than 6 months
  ↳ Monitor tony-dev and mike-dev connectivity issues
```

### Agents-Only View

```
🤖 hAIveMind Agent Roster - 2025-01-24 14:30:00

12 Active Agents | 2 Offline | 14 Total Registered

🟢 ONLINE AGENTS:
┌─────────────────────┬──────────────────────────────┬──────────┬──────────┬─────────────┐
│ Agent               │ Capabilities                 │ Response │ Workload │ Last Task   │
├─────────────────────┼──────────────────────────────┼──────────┼──────────┼─────────────┤
│ lance-dev-agent     │ coordination, infrastructure │ 120ms    │ (bar)    │ 12 min ago  │
│ elastic1-specialist │ elasticsearch_ops, cluster   │ 180ms    │ (bar)    │ 45 min ago  │
│ security-analyst    │ security, incident_response  │ 290ms    │ (bar)    │ 2 hours ago │
│ mysql-specialist    │ database_ops, optimization   │ 340ms    │ (bar)    │ 30 min ago  │
│ monitoring-agent    │ monitoring, alerting         │ 205ms    │ (bar)    │ 8 min ago   │
│ proxy1-agent        │ scraping, data_collection    │ 410ms    │ (bar)    │ 3 min ago   │
│ auth-specialist     │ security, authentication     │ 198ms    │ (bar)    │ 1 hour ago  │
│ grafana-agent       │ monitoring, visualization    │ 234ms    │ (bar)    │ 25 min ago  │
│ elastic3-specialist │ elasticsearch_ops            │ 267ms    │ (bar)    │ 18 min ago  │
│ dev-coordinator     │ development, code_review     │ 156ms    │ (bar)    │ 1.5 hr ago  │
│ kafka-specialist    │ data_processing, streaming   │ 445ms    │ (bar)    │ 22 min ago  │
│ redis-specialist    │ caching, performance         │ 189ms    │ (bar)    │ 38 min ago  │
└─────────────────────┴──────────────────────────────┴──────────┴──────────┴─────────────┘

🔴 OFFLINE AGENTS:
  ↳ tony-dev (development) - Last seen: 6 hours ago
  ↳ mike-dev (development) - Last seen: 2 days ago

🎯 CAPABILITY DISTRIBUTION:
  ↳ Development: 3 agents (2 offline)
  ↳ Infrastructure: 4 agents
  ↳ Database: 3 agents
  ↳ Security: 2 agents
  ↳ Monitoring: 2 agents
  ↳ Data Processing: 2 agents

✨ TOP PERFORMERS (Last 24h):
  1. lance-dev-agent: 23 tasks completed
  2. monitoring-agent: 18 tasks completed
  3. proxy1-agent: 15 tasks completed
```

### Memory Statistics Detail

```
💾 hAIveMind Memory Statistics - 2025-01-24 14:30:00

STORAGE OVERVIEW:
  ↳ Total Memories: 8,742 items
  ↳ Storage Size: 2.3 GB (compressed)
  ↳ Growth Rate: +127 memories today (+18% this week)
  ↳ Oldest Memory: 2024-06-15 (223 days ago)

CATEGORY BREAKDOWN:
┌────────────────┬───────┬───────────┬──────────┬─────────────────┐
│ Category       │ Count │ Size (MB) │ Avg Size │ Growth (7 days) │
├────────────────┼───────┼───────────┼──────────┼─────────────────┤
│ infrastructure │ 3,059 │ 847       │ 284 KB   │ +156 (+5.4%)    │
│ incidents      │ 2,448 │ 623       │ 261 KB   │ +89 (+3.8%)     │
│ security       │ 1,573 │ 412       │ 269 KB   │ +45 (+2.9%)     │
│ deployments    │ 874   │ 198       │ 232 KB   │ +23 (+2.7%)     │
│ monitoring     │ 523   │ 134       │ 263 KB   │ +34 (+7.0%)     │
│ runbooks       │ 265   │ 89        │ 344 KB   │ +12 (+4.7%)     │
└────────────────┴───────┴───────────┴──────────┴─────────────────┘

PERFORMANCE METRICS:
  ↳ Search Latency: 287ms average (change of 15ms from last week)
  ↳ Insert Rate: 43.2 memories/hour
  ↳ Redis Hit Rate: 89.3% (excellent)
  ↳ Vector Index: 94.1% efficiency

🧹 CLEANUP STATUS:
  ↳ Last Cleanup: 2025-01-20 02:00:00
  ↳ Eligible for Cleanup: 234 memories (older than 180 days)
  ↳ Estimated Space Recovery: 67 MB
  ↳ Next Scheduled Cleanup: 2025-01-27 02:00:00

TRENDING TOPICS (Last 7 days):
  1. elasticsearch performance (47 memories)
  2. security vulnerability patches (31 memories)
  3. database optimization (28 memories)
  4. network connectivity issues (22 memories)
  5. deployment automation (19 memories)
```

## Performance Metrics and Thresholds

### Agent Response Time Classifications

- **Excellent**: < 200ms (immediate response)
- **Good**: 200-400ms (normal operation)
- **Slow**: 400-800ms (potential issues)
- **Critical**: > 800ms (needs investigation)

### Memory Usage Thresholds

- **Normal**: < 60% of allocated storage
- **Warning**: 60-80% of allocated storage
- **Critical**: > 80% of allocated storage
- **Emergency**: > 95% of allocated storage

### Network Health Indicators

- **All Green**: > 90% agents responsive
- **Warning**: 70-90% agents responsive
- **Degraded**: 50-70% agents responsive
- **Critical**: < 50% agents responsive

## Common Status Issues and Solutions

### Offline Agents

```
🔴 OFFLINE: elastic2-specialist (Last seen: 2 hours ago)

💡 Troubleshooting Steps:
1. Check machine connectivity: ping elastic2
2. Verify MCP server: curl http://elastic2:8900/health
3. Check system resources: ssh elastic2 'top -bn1'
4. Restart services: ssh elastic2 'sudo systemctl restart memory-mcp-server'
```

### High Memory Usage

```
⚠️ Memory usage at 78% (Warning threshold)

💡 Recommended Actions:
1. Run memory cleanup: hv-sync clean --memory
2. Archive old memories: memories older than 6 months
3. Review memory retention policies
4. Consider storage expansion if growth continues
```

### Network Connectivity Issues

```
Tailscale connectivity degraded (67% nodes reachable)

💡 Diagnostic Steps:
1. Check Tailscale status: tailscale status
2. Restart Tailscale: sudo systemctl restart tailscaled
3. Verify routing: tailscale ping elastic1
4. Check firewall rules on affected machines
```

### Poor Performance

```
Average response time: 847ms (Above normal threshold)

💡 Performance Optimization:
1. Check system resources on slow agents
2. Review network latency between machines
3. Consider Redis cache optimization
4. Restart high-latency agents
```

## Best Practices for Status Monitoring

- **Daily Checks**: Run `hv-status` as part of daily routine
- **Performance Baselines**: Track response times and memory growth trends
- **Proactive Maintenance**: Address warnings before they become critical
- **Automation**: Use `--json` output for automated monitoring scripts
- **Documentation**: Record recurring issues and solutions in collective memory

## Related Commands

- **After finding issues**: Use `hv-delegate` to assign resolution tasks
- **For connectivity issues**: Use `hv-sync` to refresh configurations
- **Performance problems**: Use `hv-query` to find similar past incidents
- **Share findings**: Use `hv-broadcast` to inform collective about status changes

## Troubleshooting Status Command Issues

### Command Not Responding

1. Check local MCP server: `curl http://localhost:8900/health`
2. Verify Redis connectivity: `redis-cli ping`
3. Check system resources: `top`, `df -h`, `free -m`
4. Restart local services if needed

### Incomplete Data

1. Some agents may be temporarily unreachable (normal)
2. Network partitions can affect data collection
3. Check Tailscale connectivity to affected machines
4. Wait 1-2 minutes and retry for transient issues

### Outdated Information

1. Status data cached for 60 seconds for performance
2. Use `--detailed` to force fresh data collection
3. Check last sync timestamp in output
4. Network delays may affect data freshness

---

This command provides comprehensive health monitoring for the hAIveMind collective, helping you maintain optimal performance and quickly identify issues requiring attention.
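
As a rough illustration of the classifications listed under Performance Metrics and Thresholds, the bands can be expressed as small lookup functions. This is only a sketch: the function names are hypothetical and not part of hv-status, and the handling of exact boundary values (400ms, 800ms, 60%, 80%, 70%, 50%) is an assumption, since the documented ranges overlap at their edges.

```python
# Sketch of the documented threshold bands. Function names are hypothetical;
# boundary handling (e.g. exactly 400ms counts as "Good") is an assumption.

def classify_response_time(ms: float) -> str:
    """Map an agent response time in milliseconds to its classification."""
    if ms < 200:
        return "Excellent"
    if ms <= 400:
        return "Good"
    if ms <= 800:
        return "Slow"
    return "Critical"


def classify_memory_usage(percent: float) -> str:
    """Map storage utilization (percent of allocated storage) to a level."""
    if percent > 95:
        return "Emergency"
    if percent > 80:
        return "Critical"
    if percent >= 60:
        return "Warning"
    return "Normal"


def classify_network_health(responsive: int, total: int) -> str:
    """Map the share of responsive agents to a network health indicator."""
    if total <= 0:
        raise ValueError("total must be a positive number of agents")
    percent = 100.0 * responsive / total
    if percent > 90:
        return "All Green"
    if percent >= 70:
        return "Warning"
    if percent >= 50:
        return "Degraded"
    return "Critical"


if __name__ == "__main__":
    # Values taken from the sample outputs in this document.
    print(classify_response_time(847))
    print(classify_memory_usage(78))
    print(classify_network_health(67, 100))
```

A monitoring script could apply functions like these to values it extracts from `hv-status --json`; the JSON field names are not documented here, so that mapping is left to the reader.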
Details
- Type: command
- Author: lancejames221b
- Slug: lancejames221b/hv-status
- Created: 4d ago