Server Monitoring with Grafana and Prometheus: A Complete Guide for 2025

Learn how to set up comprehensive server monitoring with Grafana and Prometheus. A step-by-step tutorial for IT infrastructure professionals.

Irfan Haris

15 January 2025 | 1 min read

Introduction

Monitoring IT infrastructure is crucial for maintaining the performance and availability of enterprise systems. As an IT professional and technology enthusiast with more than 14 years of experience, I will share a complete guide to setting up monitoring with Grafana and Prometheus.

In this tutorial, we will:

  • Set up Prometheus for metrics collection
  • Install and configure Grafana
  • Build a comprehensive monitoring dashboard
  • Set up alerting for proactive monitoring

Prerequisites

Before you begin, make sure you have:

  • Ubuntu Server 20.04 LTS or newer
  • Root access or sudo privileges
  • Basic knowledge of the Linux command line
  • An understanding of basic networking concepts

Step 1: Install Prometheus

Download and Install Prometheus

# Update system package
sudo apt update && sudo apt upgrade -y

# Create prometheus user
sudo useradd --no-create-home --shell /bin/false prometheus

# Create directories
sudo mkdir /etc/prometheus
sudo mkdir /var/lib/prometheus

# Set ownership
sudo chown prometheus:prometheus /etc/prometheus
sudo chown prometheus:prometheus /var/lib/prometheus

# Download Prometheus
cd /tmp
wget https://github.com/prometheus/prometheus/releases/download/v2.40.0/prometheus-2.40.0.linux-amd64.tar.gz

# Extract
tar xvf prometheus-2.40.0.linux-amd64.tar.gz
cd prometheus-2.40.0.linux-amd64

# Copy binaries
sudo cp prometheus /usr/local/bin/
sudo cp promtool /usr/local/bin/

# Set ownership
sudo chown prometheus:prometheus /usr/local/bin/prometheus
sudo chown prometheus:prometheus /usr/local/bin/promtool
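
It is worth verifying a release tarball before its binaries are copied into /usr/local/bin. The following is a minimal sketch, assuming the sha256sums.txt file published with the same Prometheus release has been downloaded next to the tarball. verify_sha256 is a hypothetical helper name, not part of Prometheus.

```shell
# verify_sha256 FILE EXPECTED_HEX
# Hypothetical helper: succeeds only if FILE's SHA-256 digest equals EXPECTED_HEX.
verify_sha256() {
  local file="$1" expected="$2" actual
  actual="$(sha256sum "$file" | awk '{print $1}')"
  [ "$actual" = "$expected" ]
}

# Example (run in /tmp after the wget step; assumes sha256sums.txt was fetched
# from the same release page):
#   expected="$(awk '$2 == "prometheus-2.40.0.linux-amd64.tar.gz" {print $1}' sha256sums.txt)"
#   verify_sha256 prometheus-2.40.0.linux-amd64.tar.gz "$expected" && echo "checksum OK"
```

If the check fails, download the tarball again rather than continuing with the install.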

Configure Prometheus

# Copy config files
sudo cp -r consoles /etc/prometheus
sudo cp -r console_libraries /etc/prometheus

# Set ownership
sudo chown -R prometheus:prometheus /etc/prometheus/consoles
sudo chown -R prometheus:prometheus /etc/prometheus/console_libraries

Create the Prometheus configuration file:

# /etc/prometheus/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'node-exporter'
    static_configs:
      - targets: ['localhost:9100']

Setup Prometheus Service

# Create systemd service file
sudo tee /etc/systemd/system/prometheus.service > /dev/null <<EOF
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \\
    --config.file /etc/prometheus/prometheus.yml \\
    --storage.tsdb.path /var/lib/prometheus/ \\
    --web.console.templates=/etc/prometheus/consoles \\
    --web.console.libraries=/etc/prometheus/console_libraries \\
    --web.listen-address=0.0.0.0:9090 \\
    --web.enable-lifecycle

[Install]
WantedBy=multi-user.target
EOF

# Reload systemd and start prometheus
sudo systemctl daemon-reload
sudo systemctl enable prometheus
sudo systemctl start prometheus

# Check status
sudo systemctl status prometheus
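
systemctl status only confirms that the process is running. Prometheus also exposes a readiness endpoint at /-/ready, so a short polling loop can confirm that it is actually serving requests. This is a sketch only: wait_for_ready, the default of 10 attempts, and the 2-second delay are illustrative choices.

```shell
# wait_for_ready URL [ATTEMPTS] [DELAY_SECONDS]
# Hypothetical helper: polls URL until it returns a successful HTTP status.
wait_for_ready() {
  local url="$1" attempts="${2:-10}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors such as 503 while Prometheus is starting
    if curl -fsS --max-time 2 -o /dev/null "$url" 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example:
#   wait_for_ready http://localhost:9090/-/ready && echo "Prometheus is ready"
```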

Step 2: Install Node Exporter

Node Exporter is needed to collect metrics from the server itself (CPU, memory, disk, network):

# Download Node Exporter
cd /tmp
wget https://github.com/prometheus/node_exporter/releases/download/v1.5.0/node_exporter-1.5.0.linux-amd64.tar.gz

# Extract
tar xvf node_exporter-1.5.0.linux-amd64.tar.gz

# Copy binary
sudo cp node_exporter-1.5.0.linux-amd64/node_exporter /usr/local/bin

# Create user
sudo useradd --no-create-home --shell /bin/false node_exporter

# Set ownership
sudo chown node_exporter:node_exporter /usr/local/bin/node_exporter

Setup Node Exporter Service

# Create systemd service
sudo tee /etc/systemd/system/node_exporter.service > /dev/null <<EOF
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target

[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter

[Install]
WantedBy=multi-user.target
EOF

# Start service
sudo systemctl daemon-reload
sudo systemctl enable node_exporter
sudo systemctl start node_exporter
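
To confirm that Node Exporter is producing data, a single value can be read from its /metrics endpoint. The sketch below assumes the plain-text exposition format that Node Exporter serves. metric_value is a hypothetical helper and only handles samples without labels, such as node_load1.

```shell
# metric_value NAME
# Reads Prometheus text exposition format on stdin and prints the value of the
# first unlabeled sample called NAME. Comment lines (# HELP, # TYPE) never match
# because their first field is "#".
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# Example:
#   curl -s http://localhost:9100/metrics | metric_value node_load1
```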

Step 3: Install Grafana

Add Grafana Repository

# Install required packages
sudo apt-get install -y software-properties-common

# Add GPG key
sudo wget -q -O /usr/share/keyrings/grafana.key https://apt.grafana.com/gpg.key

# Add repository
echo "deb [signed-by=/usr/share/keyrings/grafana.key] https://apt.grafana.com stable main" | sudo tee -a /etc/apt/sources.list.d/grafana.list

# Update package list
sudo apt-get update

# Install Grafana
sudo apt-get install -y grafana

Setup Grafana Service

# Enable and start Grafana
sudo systemctl enable grafana-server
sudo systemctl start grafana-server

# Check status
sudo systemctl status grafana-server

Step 4: Configure Grafana

Access the Grafana Web Interface

  1. Open a browser and go to: http://your-server-ip:3000
  2. Log in with the default credentials:
    • Username: admin
    • Password: admin
  3. You will be prompted to change the default password

Add Prometheus Data Source

  1. Click Configuration > Data Sources (in recent Grafana releases this menu is Connections > Data sources)
  2. Click Add data source
  3. Select Prometheus
  4. Set URL: http://localhost:9090
  5. Click Save & Test
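
The same data source can also be defined as a file, which is easier to repeat across servers than clicking through the UI. This is a sketch that assumes the default provisioning directory of the Grafana package install (/etc/grafana/provisioning/datasources/). The function name and the file name prometheus.yaml are my own choices.

```shell
# write_prometheus_datasource DEST [PROMETHEUS_URL]
# Hypothetical helper: writes a Grafana data source provisioning file to DEST.
write_prometheus_datasource() {
  local dest="$1" url="${2:-http://localhost:9090}"
  cat > "$dest" <<EOF
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: ${url}
    isDefault: true
EOF
}

# Example:
#   write_prometheus_datasource /tmp/prometheus.yaml
#   sudo cp /tmp/prometheus.yaml /etc/grafana/provisioning/datasources/prometheus.yaml
#   sudo systemctl restart grafana-server
```

Grafana reads provisioning files at startup, which is why the example restarts the service.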

Step 5: Create a Dashboard

Import Dashboard Template

Grafana provides ready-made dashboard templates:

  1. Click + > Import
  2. Enter dashboard ID: 1860 (Node Exporter Full)
  3. Click Load
  4. Select the Prometheus data source
  5. Click Import

Custom Dashboard

For specific needs, you can create a custom dashboard:

{
  "dashboard": {
    "title": "Server Infrastructure Monitoring",
    "panels": [
      {
        "title": "CPU Usage",
        "type": "stat",
        "targets": [
          {
            "expr": "100 - (avg by (instance) (irate(node_cpu_seconds_total{mode=\"idle\"}[5m])) * 100)"
          }
        ]
      }
    ]
  }
}
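
The expression in the panel above may look opaque at first. The rate of node_cpu_seconds_total{mode="idle"} is the fraction of each second a core spent idle, so averaging it per instance and subtracting from 100 gives the busy percentage. The arithmetic can be sketched as follows; cpu_usage_percent is a hypothetical helper for illustration, not something Grafana or Prometheus provides.

```shell
# cpu_usage_percent IDLE_RATE
# IDLE_RATE is the per-second idle rate averaged over the cores of one
# instance, a value between 0 and 1.
cpu_usage_percent() {
  awk -v idle="$1" 'BEGIN { printf "%.1f\n", 100 - idle * 100 }'
}

cpu_usage_percent 0.85   # cores idle 85% of the time, prints 15.0
```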

Step 6: Setup Alerting

Konfigurasi Alert Rules

Create the alert rules file:

# /etc/prometheus/alert_rules.yml
groups:
- name: server_alerts
  rules:
  - alert: ServerDown
    expr: up == 0
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Server {{ $labels.instance }} is down"
      description: "{{ $labels.instance }} has been down for more than 1 minute."

  - alert: HighCPUUsage
    # rate() is used here instead of irate(): it gives a smoother signal, which suits alerting
    expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "High CPU usage on {{ $labels.instance }}"
      description: "CPU usage is above 80% for more than 5 minutes."

Update the Prometheus config to include the alert rules:

# In /etc/prometheus/prometheus.yml, replace the existing rule_files section with:
rule_files:
  - "alert_rules.yml"
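
Prometheus does not pick up the new rules until it reloads its configuration. A cautious sequence is to validate the rules with promtool first and only then trigger a reload through the /-/reload endpoint, which is available because the service was started with --web.enable-lifecycle. The wrapper below is a sketch: check_and_reload and the PROMTOOL override variable are hypothetical names introduced here.

```shell
# check_and_reload RULES_FILE [PROMETHEUS_BASE_URL]
# Validates RULES_FILE and asks Prometheus to reload only if validation passes.
# PROMTOOL is a hypothetical override for the promtool binary path.
check_and_reload() {
  local rules="$1" base_url="${2:-http://localhost:9090}"
  if ! "${PROMTOOL:-promtool}" check rules "$rules"; then
    echo "rules are invalid, not reloading" >&2
    return 1
  fi
  curl -fsS -X POST "${base_url}/-/reload"
}

# Example:
#   check_and_reload /etc/prometheus/alert_rules.yml
```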

Setup Notification Channels

In Grafana:

  1. Go to Alerting > Notification channels (in Grafana 9 and later, the equivalent is Alerting > Contact points)
  2. Add channel
  3. Choose a type (Email, Slack, Teams, etc.)
  4. Configure it as needed

Best Practices

1. Security

# Setup firewall
sudo ufw allow 9090/tcp  # Prometheus
sudo ufw allow 3000/tcp  # Grafana
sudo ufw allow 9100/tcp  # Node Exporter

# Or restrict access to specific IPs
sudo ufw allow from YOUR_IP to any port 9090
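
When several ports need the same source restriction, it helps to print the rules for review before applying them. This is a small sketch: ufw_allow_from is a hypothetical helper that only prints commands, and 203.0.113.10 is a placeholder address from the documentation range.

```shell
# ufw_allow_from SOURCE_IP PORT...
# Prints (does not run) one ufw command per port so the rules can be reviewed.
ufw_allow_from() {
  local src="$1" port
  shift
  for port in "$@"; do
    printf 'sudo ufw allow from %s to any port %s proto tcp\n' "$src" "$port"
  done
}

ufw_allow_from 203.0.113.10 9090 3000 9100
```

Once the printed commands look right, they can be pasted into the terminal or piped to sh.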

2. Performance Tuning

# Scrape and evaluation intervals (global section of prometheus.yml)
global:
  scrape_interval: 15s
  evaluation_interval: 15s

# Storage retention (flags added to ExecStart in prometheus.service)
--storage.tsdb.retention.time=30d
--storage.tsdb.retention.size=10GB
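
To choose sensible retention values, the Prometheus documentation gives a rough sizing rule: disk needed = retention time in seconds × ingested samples per second × bytes per sample, with 1 to 2 bytes per sample being typical. A sketch of that arithmetic follows. The function name and the figure of 10,000 samples per second are assumptions for illustration.

```shell
# estimate_tsdb_bytes RETENTION_DAYS SAMPLES_PER_SECOND [BYTES_PER_SAMPLE]
# Rough disk estimate; BYTES_PER_SAMPLE defaults to the conservative value 2.
estimate_tsdb_bytes() {
  local days="$1" rate="$2" bytes_per_sample="${3:-2}"
  echo $(( days * 86400 * rate * bytes_per_sample ))
}

estimate_tsdb_bytes 30 10000   # prints 51840000000 (about 51.8 GB)
```

In this example the 10GB size limit would be reached well before 30 days, and Prometheus applies whichever limit triggers first. The real ingestion rate can be read from Prometheus itself with rate(prometheus_tsdb_head_samples_appended_total[5m]).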

3. Backup Strategy

# Backup Grafana database
sudo cp /var/lib/grafana/grafana.db /backup/grafana-$(date +%Y%m%d).db

# Backup Prometheus data
sudo rsync -av /var/lib/prometheus/ /backup/prometheus-$(date +%Y%m%d)/
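
Daily copies like these accumulate quickly, so a pruning step is a natural companion. The sketch below covers only the Grafana database copies and assumes GNU find, as shipped with Ubuntu. prune_backups and the 14-day window in the example are illustrative.

```shell
# prune_backups DIR DAYS
# Deletes grafana-*.db files directly inside DIR that were last modified
# more than DAYS days ago. Other files are left untouched.
prune_backups() {
  local dir="$1" days="$2"
  find "$dir" -maxdepth 1 -type f -name 'grafana-*.db' -mtime +"$days" -delete
}

# Example:
#   sudo mkdir -p /backup
#   prune_backups /backup 14
```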

Experience Monitoring 700+ Servers

Based on my experience managing monitoring for 700+ servers at Kawan Lama Group:

1. Scaling Considerations

  • Use Prometheus federation for multiple clusters
  • Implement service discovery for automatic target registration
  • Set up load balancing for Grafana
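
One simple form of service discovery is Prometheus's file-based discovery, where targets live in JSON files that Prometheus re-reads when they change. The generator below is a sketch under assumptions: render_file_sd is a hypothetical helper, the env label is my own choice, and web01/web02 are placeholder host names.

```shell
# render_file_sd ENV_LABEL TARGET...
# Prints a file_sd JSON document with one target group labelled env=ENV_LABEL.
render_file_sd() {
  local label="$1" targets="" t
  shift
  for t in "$@"; do
    targets="${targets:+$targets, }\"$t\""
  done
  printf '[\n  {\n    "targets": [%s],\n    "labels": { "env": "%s" }\n  }\n]\n' "$targets" "$label"
}

render_file_sd production web01:9100 web02:9100

# A scrape job would then reference the generated file, for example:
#   - job_name: 'node-exporter'
#     file_sd_configs:
#       - files: ['/etc/prometheus/targets/*.json']
```

With this in place, adding a server means regenerating a file rather than editing prometheus.yml and reloading.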

2. Alert Management

  • Group alerts by severity and team responsibility
  • Implement escalation policies
  • Use alert correlation to reduce noise

3. Dashboard Organization

  • Create dashboards per service or team
  • Use a consistent naming convention
  • Use template variables for flexibility

Troubleshooting Common Issues

Issue 1: Prometheus Target Down

# Check service status
sudo systemctl status node_exporter

# Check connectivity
curl http://localhost:9100/metrics

# Check firewall
sudo ufw status

Issue 2: Grafana Dashboard Does Not Load

# Check Grafana logs
sudo journalctl -u grafana-server -f

# Verify data source connection
# Check Prometheus query syntax

Issue 3: High Memory Usage

# Monitor Prometheus memory
ps aux | grep prometheus

# Adjust retention settings
# Implement recording rules for complex queries
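
The ps output above includes the grep process itself and reports memory in KiB, which is awkward to read. A more direct reading comes from /proc. This sketch is Linux-only, and rss_mib is a hypothetical helper name.

```shell
# rss_mib PID
# Prints the resident set size of PID in MiB (rounded down), read from the
# VmRSS line of /proc/PID/status, which the kernel reports in kB.
rss_mib() {
  awk '/^VmRSS:/ { print int($2 / 1024) }' "/proc/$1/status"
}

# Example:
#   rss_mib "$(systemctl show --property MainPID --value prometheus)"
```

Prometheus also reports the same figure about itself as the process_resident_memory_bytes metric, which can be graphed in Grafana over time.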

Conclusion

By following this tutorial, you have set up a robust monitoring infrastructure with Grafana and Prometheus. This setup can:

  • Monitor multiple servers in real time
  • Provide comprehensive dashboards
  • Send alerts for proactive monitoring
  • Scale for enterprise environments

Next Steps

  1. Expand Monitoring Coverage
    • Add database monitoring (MySQL, PostgreSQL)
    • Implement application monitoring
    • Set up log aggregation with the ELK Stack
  2. Advanced Features
    • Set up Prometheus clustering
    • Implement custom metrics
    • Add business metrics monitoring
  3. Integration
    • Connect with ITSM tools
    • Set up automated remediation
    • Implement capacity planning

💡 Pro Tip: Monitoring is a long-term investment in operational excellence. Start simple, but build for scale!

Tags: #Monitoring #Grafana #Prometheus #Infrastructure #DevOps #ServerManagement