SeaweedFS Telemetry System

A privacy-respecting telemetry system for SeaweedFS that collects cluster-level usage statistics and provides visualization through Prometheus and Grafana.

Features

  • Privacy-First Design: Uses in-memory cluster IDs (regenerated on restart), no personal data collection
  • Prometheus Integration: Native Prometheus metrics for monitoring and alerting
  • Grafana Dashboards: Pre-built dashboards for data visualization
  • Protocol Buffers: Efficient binary data transmission for optimal performance
  • Opt-in Only: Disabled by default, requires explicit configuration
  • Docker Compose: Complete monitoring stack deployment
  • Automatic Cleanup: Configurable data retention policies

Architecture

SeaweedFS Cluster → Telemetry Client → Telemetry Server → Prometheus → Grafana
                       (protobuf)         (metrics)      (queries)

Data Transmission

The telemetry system uses Protocol Buffers exclusively for efficient binary data transmission:

  • Compact Format: Typically 30-50% smaller than the equivalent JSON
  • Fast Serialization: Better performance than text-based formats
  • Type Safety: Strong typing with generated Go structs
  • Schema Evolution: Numbered fields let the schema grow without breaking older clients or servers

Protobuf Schema

message TelemetryData {
  string cluster_id = 1;           // In-memory generated UUID
  string version = 2;              // SeaweedFS version
  string os = 3;                   // Operating system
  // Field 4 reserved (was features)
  // Field 5 reserved (was deployment)
  int32 volume_server_count = 6;   // Number of volume servers
  uint64 total_disk_bytes = 7;     // Total disk usage
  int32 total_volume_count = 8;    // Total volume count
  int32 filer_count = 9;           // Number of filer servers
  int32 broker_count = 10;         // Number of broker servers
  int64 timestamp = 11;            // Collection timestamp
}

Privacy Approach

  • No Personal Data: No hostnames, IP addresses, or user information
  • In-Memory IDs: Cluster IDs are generated in-memory and change on restart
  • Aggregated Data: Only cluster-level statistics, no individual file/user data
  • Opt-in Only: Telemetry is disabled by default
  • Transparent: Open source implementation, clear data collection policy
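
The in-memory ID idea can be sketched as follows. This is an illustration under assumptions, not the SeaweedFS implementation: the `newUUIDv4` and `ClusterID` names are hypothetical, and the sketch simply generates a random version 4 UUID once per process and never persists it, so every restart yields a different value.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"sync"
)

var (
	clusterIDOnce sync.Once
	clusterID     string
)

// newUUIDv4 returns a random RFC 4122 version 4 UUID string.
func newUUIDv4() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// ClusterID returns the process-wide ID, generating it on first use.
// The value lives only in memory, so it changes when the process restarts.
func ClusterID() string {
	clusterIDOnce.Do(func() {
		id, err := newUUIDv4()
		if err != nil {
			panic(err)
		}
		clusterID = id
	})
	return clusterID
}

func main() {
	fmt.Println("cluster id:", ClusterID())
	fmt.Println("stable within this process:", ClusterID() == ClusterID())
}
```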

Collected Data

Field                Description                          Example
cluster_id           In-memory UUID (changes on restart)  a1b2c3d4-...
version              SeaweedFS version                    3.45
os                   Operating system and architecture    linux/amd64
volume_server_count  Number of volume servers             5
total_disk_bytes     Total disk usage across cluster      1073741824
total_volume_count   Total number of volumes              120
filer_count          Number of filer servers              2
broker_count         Number of broker servers             1
timestamp            When data was collected              1640995200

Quick Start

1. Deploy Telemetry Server

# Clone and start the complete monitoring stack
git clone https://github.com/seaweedfs/seaweedfs.git
cd seaweedfs/telemetry
docker-compose up -d

# Or run the server directly
cd server
go run . -port=8080 -dashboard=true

2. Configure SeaweedFS

# Enable telemetry in SeaweedFS master (uses default telemetry.seaweedfs.com)
weed master -telemetry=true

# Or in server mode
weed server -telemetry=true

# Or specify custom telemetry server
weed master -telemetry=true -telemetry.url=http://localhost:8080/api/collect

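3. Access Dashboards

With the Docker Compose stack from step 1 running, the services listen on the host ports mapped in docker-compose.yml:

  • Telemetry Server: http://localhost:8080 (built-in dashboard when -dashboard=true)
  • Prometheus: http://localhost:9090
  • Grafana: http://localhost:3000 (user admin, password admin as set by GF_SECURITY_ADMIN_PASSWORD)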

Configuration

SeaweedFS Master/Server

# Enable telemetry
-telemetry=true

# Set custom telemetry server URL (optional, defaults to telemetry.seaweedfs.com)
-telemetry.url=http://your-telemetry-server:8080/api/collect

Telemetry Server

# Server configuration
-port=8080                    # Server port
-dashboard=true               # Enable built-in dashboard
-cleanup=24h                  # Cleanup interval
-max-age=720h                 # Maximum data retention (30 days)

# Example
./telemetry-server -port=8080 -dashboard=true -cleanup=24h -max-age=720h

Prometheus Metrics

The telemetry server exposes these Prometheus metrics:

Cluster Metrics

  • seaweedfs_telemetry_total_clusters: Total unique clusters seen in the last 30 days
  • seaweedfs_telemetry_active_clusters: Clusters that reported in the last 7 days

Per-Cluster Metrics

  • seaweedfs_telemetry_volume_servers{cluster_id, version, os}: Volume servers per cluster
  • seaweedfs_telemetry_disk_bytes{cluster_id, version, os}: Disk usage per cluster
  • seaweedfs_telemetry_volume_count{cluster_id, version, os}: Volume count per cluster
  • seaweedfs_telemetry_filer_count{cluster_id, version, os}: Filer servers per cluster
  • seaweedfs_telemetry_broker_count{cluster_id, version, os}: Broker servers per cluster
  • seaweedfs_telemetry_cluster_info{cluster_id, version, os}: Cluster metadata

Server Metrics

  • seaweedfs_telemetry_reports_received_total: Total telemetry reports received
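
For orientation, the sketch below renders one per-cluster sample in the Prometheus text exposition format, which is what a scrape of the metrics endpoint returns line by line. The `gaugeLine` helper is hypothetical and written with the standard library only; a real server would normally rely on a Prometheus client library instead.

```go
package main

import (
	"fmt"
	"strings"
)

// labelEscaper escapes backslash, double quote and newline, the characters
// that must be escaped inside label values in the text exposition format.
var labelEscaper = strings.NewReplacer(`\`, `\\`, `"`, `\"`, "\n", `\n`)

// gaugeLine renders one sample with the cluster_id, version and os labels.
func gaugeLine(name, clusterID, version, os string, value float64) string {
	return fmt.Sprintf(`%s{cluster_id="%s",version="%s",os="%s"} %g`,
		name,
		labelEscaper.Replace(clusterID),
		labelEscaper.Replace(version),
		labelEscaper.Replace(os),
		value)
}

func main() {
	fmt.Println(gaugeLine("seaweedfs_telemetry_volume_servers", "a1b2c3d4", "3.45", "linux/amd64", 5))
}
```

Because cluster_id is a label and cluster IDs are regenerated on restart, each restart of a cluster shows up as a new time series. Queries that aggregate by version or os are unaffected by this.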

API Endpoints

Data Collection

# Submit telemetry data (protobuf only)
POST /api/collect
Content-Type: application/x-protobuf
[TelemetryRequest protobuf data]

Statistics (JSON for dashboard/debugging)

# Get aggregated statistics
GET /api/stats

# Get recent cluster instances
GET /api/instances?limit=100

# Get metrics over time
GET /api/metrics?days=30
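
The statistics endpoints can also be read programmatically. The sketch below builds the three URLs shown above and decodes each response into an untyped value, because the JSON schema is not documented in this README. The `endpointURL` and `fetchJSON` helpers and the `http://localhost:8080` base address are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"strconv"
	"time"
)

// endpointURL joins base with path and optional integer query parameters.
func endpointURL(base, path string, params map[string]int) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = path
	q := url.Values{}
	for k, v := range params {
		q.Set(k, strconv.Itoa(v))
	}
	u.RawQuery = q.Encode()
	return u.String(), nil
}

// fetchJSON GETs rawURL and decodes the body into an untyped value.
func fetchJSON(client *http.Client, rawURL string) (any, error) {
	resp, err := client.Get(rawURL)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("GET %s: %s", rawURL, resp.Status)
	}
	var v any
	if err := json.NewDecoder(resp.Body).Decode(&v); err != nil {
		return nil, err
	}
	return v, nil
}

func main() {
	client := &http.Client{Timeout: 5 * time.Second}
	endpoints := []struct {
		path   string
		params map[string]int
	}{
		{"/api/stats", nil},
		{"/api/instances", map[string]int{"limit": 100}},
		{"/api/metrics", map[string]int{"days": 30}},
	}
	for _, e := range endpoints {
		u, err := endpointURL("http://localhost:8080", e.path, e.params)
		if err != nil {
			fmt.Println("bad URL:", err)
			continue
		}
		v, err := fetchJSON(client, u)
		fmt.Println(u, "->", v, err)
	}
}
```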

Monitoring

# Prometheus metrics
GET /metrics

Docker Deployment

# docker-compose.yml
version: '3.8'
services:
  telemetry-server:
    build: ./server
    ports:
      - "8080:8080"
    command: ["-port=8080", "-dashboard=true", "-cleanup=24h"]
    
  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      - ./grafana-provisioning:/etc/grafana/provisioning
      - ./grafana-dashboard.json:/var/lib/grafana/dashboards/seaweedfs.json

# Deploy the stack
docker-compose up -d

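# Scale telemetry server if needed. The fixed "8080:8080" host port mapping
# must be removed or fronted by a load balancer first, otherwise the extra
# replicas cannot bind the same host port.
docker-compose up -d --scale telemetry-server=3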

Server Only

# Build and run telemetry server
cd server
docker build -t seaweedfs-telemetry .
docker run -p 8080:8080 seaweedfs-telemetry -port=8080 -dashboard=true

Development

Protocol Buffer Development

# Generate protobuf code
cd telemetry
protoc --go_out=. --go_opt=paths=source_relative proto/telemetry.proto

# The generated code is already included in the repository

Build from Source

# Build telemetry server
cd telemetry/server
go build -o telemetry-server .

# Build SeaweedFS with telemetry support
cd ../..
go build -o weed ./weed

Testing

# Test telemetry server
cd telemetry/server
go test ./...

# Test protobuf communication (requires protobuf tools)
# See telemetry client code for examples

Grafana Dashboard

The included Grafana dashboard provides:

  • Overview: Total and active clusters, version distribution
  • Resource Usage: Volume servers and disk usage over time
  • Infrastructure: Operating system distribution and server counts
  • Growth Trends: Historical growth patterns

Custom Queries

# Total active clusters
seaweedfs_telemetry_active_clusters

# Disk usage by version
sum by (version) (seaweedfs_telemetry_disk_bytes)

# Volume servers by operating system
sum by (os) (seaweedfs_telemetry_volume_servers)

# Filer servers by version
sum by (version) (seaweedfs_telemetry_filer_count)

# Broker servers across all clusters
sum(seaweedfs_telemetry_broker_count)

# Growth rate (weekly); delta() suits a gauge, increase() is meant for counters
delta(seaweedfs_telemetry_total_clusters[7d])

Security Considerations

  • Network Security: Use HTTPS in production environments
  • Access Control: Implement authentication for Grafana and Prometheus
  • Data Retention: Configure appropriate retention policies
  • Monitoring: Monitor the telemetry infrastructure itself

Troubleshooting

Common Issues

SeaweedFS not sending data:

# Check telemetry configuration
weed master -h 2>&1 | grep telemetry   # help text may be written to stderr

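# Verify connectivity (this sends a GET; the endpoint expects POST, so an
# error status may still confirm that the server is reachable)
curl -v http://your-telemetry-server:8080/api/collect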

Server not receiving data:

# Check server logs
docker-compose logs telemetry-server

# Verify metrics endpoint
curl http://localhost:8080/metrics

Prometheus not scraping:

# Check Prometheus targets
curl http://localhost:9090/api/v1/targets

# Verify configuration
docker-compose logs prometheus
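
The targets response is verbose, so it can help to filter it. The sketch below parses the JSON returned by the Prometheus `/api/v1/targets` endpoint and lists active targets whose health is not "up" together with their last scrape error. The `unhealthyTargets` helper is hypothetical, and the body in `main` is a hand-written sample that assumes a scrape target named telemetry-server.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// targetsResponse models the parts of the targets response used here.
type targetsResponse struct {
	Status string `json:"status"`
	Data   struct {
		ActiveTargets []struct {
			ScrapeURL string `json:"scrapeUrl"`
			Health    string `json:"health"`
			LastError string `json:"lastError"`
		} `json:"activeTargets"`
	} `json:"data"`
}

// unhealthyTargets returns "url: error" for every target that is not up.
func unhealthyTargets(body []byte) ([]string, error) {
	var r targetsResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return nil, err
	}
	if r.Status != "success" {
		return nil, fmt.Errorf("targets status %q", r.Status)
	}
	var out []string
	for _, t := range r.Data.ActiveTargets {
		if t.Health != "up" {
			out = append(out, t.ScrapeURL+": "+t.LastError)
		}
	}
	return out, nil
}

func main() {
	// Hand-written sample response, not real data.
	sample := []byte(`{"status":"success","data":{"activeTargets":[
		{"scrapeUrl":"http://telemetry-server:8080/metrics","health":"down","lastError":"connection refused"},
		{"scrapeUrl":"http://localhost:9090/metrics","health":"up","lastError":""}]}}`)
	bad, err := unhealthyTargets(sample)
	fmt.Println(bad, err)
}
```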

Debugging

# Enable verbose logging in SeaweedFS (-v is a global flag and goes before the subcommand)
weed -v=2 master -telemetry=true

# Check telemetry server metrics
curl http://localhost:8080/metrics | grep seaweedfs_telemetry

# Test data flow
curl http://localhost:8080/api/stats

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

License

This telemetry system is part of SeaweedFS and follows the same Apache 2.0 license.