Troubleshooting — Elasticsearch

Overview

Elasticsearch is the search and analytics engine at the core of the Diskover platform. Every file and directory crawled by Diskover is stored as a document in Elasticsearch — metadata such as file name, path, size, owner, timestamps, and tags are all indexed here. Diskover's search, analytics dashboards, and reporting all query Elasticsearch directly.

Diskover uses the Python elasticsearch client library (7.13.4) and is compatible with Elasticsearch 8.x. By default, each crawl creates a new index named diskover-<path>-<timestamp> (e.g., diskover-home-prod-240318143022), though index names are configurable — the only requirement is the diskover- prefix.

Service Management

Elasticsearch is commonly run as a systemd service. If the service file has been installed:

RHEL / CentOS / Rocky Linux / Ubuntu / Debian

# Start
sudo systemctl start elasticsearch

# Stop
sudo systemctl stop elasticsearch

# Restart
sudo systemctl restart elasticsearch

# Status
sudo systemctl status elasticsearch

# Enable on boot
sudo systemctl enable elasticsearch

Note: If Elasticsearch was installed via RPM (RHEL/CentOS/Rocky) or DEB (Ubuntu/Debian), the systemd service name is always elasticsearch. The service file is located at /usr/lib/systemd/system/elasticsearch.service.

Configuration

The primary Elasticsearch configuration file is /etc/elasticsearch/elasticsearch.yml.

Key Settings

# Cluster and node identity
cluster.name: diskover
node.name: node-1

# Data and log paths
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch

# Network binding
network.host: 0.0.0.0        # or specific IP; use localhost for single-node
http.port: 9200

# Single-node discovery (standalone installs)
discovery.type: single-node

# Disable X-Pack security (if not using authentication)
xpack.security.enabled: false

Note: Elasticsearch must be restarted for changes to elasticsearch.yml to take effect.

JVM Heap Size

Heap is configured in /etc/elasticsearch/jvm.options (or /etc/elasticsearch/jvm.options.d/).

Rule of thumb: Set heap to 50% of available RAM, with a maximum of 31GB.

# Example: 8GB heap for a 16GB RAM server
-Xms8g
-Xmx8g

Heap values for -Xms (initial) and -Xmx (max) should always be equal. Restart Elasticsearch after changing heap settings.
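
The sizing rule above is simple arithmetic, sketched here in Python. The function names are illustrative only and are not part of Diskover or Elasticsearch; the sketch assumes heap is expressed in whole gigabytes.

```python
def recommended_heap_gb(total_ram_gb, cap_gb=31):
    """Half of RAM in whole GB, capped at cap_gb, never below 1."""
    half = int(total_ram_gb // 2)
    return max(1, min(half, cap_gb))


def jvm_heap_options(total_ram_gb):
    """Build matching -Xms/-Xmx lines; both values are kept equal."""
    heap = recommended_heap_gb(total_ram_gb)
    return [f"-Xms{heap}g", f"-Xmx{heap}g"]
```

For a 16GB server, jvm_heap_options(16) yields the same two lines as the example above; a 128GB server is capped at 31GB.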

Multi-Node Clusters

Node Roles

In larger deployments, Elasticsearch nodes are assigned dedicated roles rather than running all roles on every node. Common production topologies for Diskover:

Role	Recommended count	Responsibility
`master`	3 (odd number)	Cluster coordination and state management
`data`	2+	Storing and querying index data
`ingest`	Optional	Pre-processing pipelines (rarely needed for Diskover)

For smaller installs a single node or two-node setup with all roles is common and fully supported.
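
The odd-number recommendation for master-eligible nodes follows from majority voting: a master can only be elected while more than half of the voting nodes are reachable. The Python sketch below shows that arithmetic in simplified form (function names are illustrative, and Elasticsearch's own voting-configuration management is more involved than this).

```python
def quorum(master_eligible):
    """Smallest number of master-eligible nodes that forms a majority."""
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible):
    """How many master-eligible nodes can be lost while a majority remains."""
    return master_eligible - quorum(master_eligible)
```

Under this arithmetic, two master-eligible nodes tolerate no failures, three tolerate one, and a fourth adds nothing over three, which is why 3 is the usual choice.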

elasticsearch.yml for Multi-Node

Each node needs its own elasticsearch.yml. Key settings that differ from single-node:

# On all nodes
cluster.name: diskover
node.name: node-1                    # unique per node

network.host: 0.0.0.0
http.port: 9200
transport.port: 9300

# Seed hosts — list all nodes in the cluster
discovery.seed_hosts:
  - "node1.example.com"
  - "node2.example.com"
  - "node3.example.com"

# Initial master nodes — only used on first cluster formation, list master-eligible nodes
cluster.initial_master_nodes:
  - "node1"
  - "node2"
  - "node3"

# Node roles (omit for all-roles node, or set explicitly)
node.roles: [master, data]

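Note: cluster.initial_master_nodes is only needed the first time the cluster starts. Remove it from every node's configuration once the cluster has formed, and do not set it when restarting nodes or adding nodes to an existing cluster; leaving it in place risks a node bootstrapping a separate cluster by mistake.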

Note: Elasticsearch must be restarted on each node for configuration changes to take effect.

Checking Cluster Nodes

# List all nodes with roles, RAM, heap, and load
curl -s "http://localhost:9200/_cat/nodes?v&h=name,ip,node.role,master,heap.percent,ram.percent,load_1m"

# Check shard distribution across nodes
curl -s "http://localhost:9200/_cat/shards/diskover-*?v&h=index,shard,prirep,state,node"

# Check allocation status (node disk usage and shard counts)
curl -s "http://localhost:9200/_cat/allocation?v"

Configuring Diskover for Multiple Nodes

Diskover maintains two separate Elasticsearch connection configurations — one for the web UI and one for the indexing workers:

Web UI connection: Diskover Admin > Configuration > Elasticsearch (Web)
Indexer/worker connection: Diskover Admin > Configuration > Diskover > Elasticsearch

Within each configuration, use the + Add Item button under Hosts to add each node's hostname or IP. All nodes should share the same port, auth, and SSL settings within a config.

Cluster sniffing (auto-discover all nodes) can be enabled in Diskover Admin > Configuration > Diskover > Elasticsearch by toggling on Sniff Cluster. When enabled, Diskover queries the cluster for available nodes on startup and automatically updates its connection pool if nodes are added or removed. This is useful for larger clusters but should not be used with load balancers.

Tip: For clusters with nodes in different network locations (e.g., remote data centres), enable HTTP Compress in Diskover Admin > Configuration > Diskover > Configurations > {ConfigName} > Elasticsearch Overrides to reduce bandwidth during bulk indexing.

Cross-Cluster Search (Enterprise)

Diskover supports searching across multiple Elasticsearch clusters from a single Diskover-Web instance. This is an Enterprise edition feature and requires configuring remote clusters in Elasticsearch first.

Once remote clusters are set up in Elasticsearch, add them in Diskover Admin > Configuration > Elasticsearch (Web) under Remote Clusters. Each remote cluster requires:

Cluster Alias — must match the alias configured in Elasticsearch
Hosts / Port / Auth — connection details for that cluster
Offline toggle — disable a remote cluster without removing its config

Security & HTTPS

Enabling X-Pack Security

Elasticsearch 8.x enables security by default on new installs, while 7.x leaves it disabled by default. To require authentication and enable TLS explicitly:

# /etc/elasticsearch/elasticsearch.yml
xpack.security.enabled: true

# TLS for HTTP (client-to-node traffic — used by Diskover)
xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.keystore.path: certs/http.p12

# TLS for transport (node-to-node traffic — required for multi-node clusters)
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: certs/elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: certs/elastic-certificates.p12

Note: Elasticsearch must be restarted for security changes to take effect.

Generating Certificates

Elasticsearch includes elasticsearch-certutil for generating certificates.

Single-node or internal CA (self-signed):

# Generate a CA
sudo /usr/share/elasticsearch/bin/elasticsearch-certutil ca

# Generate node certificate signed by that CA
sudo /usr/share/elasticsearch/bin/elasticsearch-certutil cert --ca elastic-stack-ca.p12

# For HTTP layer specifically
sudo /usr/share/elasticsearch/bin/elasticsearch-certutil http

Move the generated .p12 files to /etc/elasticsearch/certs/ and ensure they are owned by elasticsearch:

sudo mkdir -p /etc/elasticsearch/certs
sudo mv elastic-certificates.p12 /etc/elasticsearch/certs/
sudo chown -R elasticsearch:elasticsearch /etc/elasticsearch/certs
sudo chmod 660 /etc/elasticsearch/certs/*.p12

Set the keystore password (if you set one during cert generation):

sudo /usr/share/elasticsearch/bin/elasticsearch-keystore add xpack.security.transport.ssl.keystore.secure_password
sudo /usr/share/elasticsearch/bin/elasticsearch-keystore add xpack.security.http.ssl.keystore.secure_password

Setting Up Passwords

After enabling security, set built-in user passwords:

sudo /usr/share/elasticsearch/bin/elasticsearch-setup-passwords interactive

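This sets passwords for elastic, kibana_system, logstash_system, and others. In Elasticsearch 8.x, elasticsearch-setup-passwords is deprecated; reset an individual built-in user's password with elasticsearch-reset-password instead (for example, sudo /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic -i). The elastic superuser account is typically what Diskover connects with, though a dedicated role/user is recommended for production.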

Configuring Diskover to Use HTTPS

Both the web UI and indexer connection configs have independent HTTPS settings.

Web UI: Diskover Admin > Configuration > Elasticsearch (Web)
Indexer: Diskover Admin > Configuration > Diskover > Elasticsearch

In each config:

Field	Description
HTTPS	Toggle on to use encrypted HTTPS instead of HTTP
Username / Password	HTTP Basic Auth credentials (leave blank if not using auth)
Verify Certificate	Toggle off to skip SSL verification (useful for self-signed certs in dev; not recommended for production)
CA Certificates Path	Absolute path to a CA bundle file on the server running Diskover (e.g., `/etc/elasticsearch/certs/ca.crt`). Leave blank to use the system default CA bundle.

For API Key authentication (ES 8+), set the API key in Diskover Admin > Configuration > Diskover > Elasticsearch under the Advanced section:

API Key ID — leave blank if using ES 8 secret-only format
API Key Secret — the full API key value
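
For reference, the value Elasticsearch expects after "ApiKey" in the Authorization header is the base64 encoding of the key ID and key secret joined by a colon. The Python sketch below builds that header; the function name is illustrative, and how Diskover assembles the header internally is not shown here.

```python
import base64


def api_key_header(key_id, api_key):
    """Return an Authorization header for Elasticsearch API key auth."""
    raw = f"{key_id}:{api_key}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"ApiKey {token}"}
```

The token produced this way is the already-encoded form that can be passed directly in an Authorization: ApiKey header.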

curl Commands with HTTPS

Update all curl commands in this document by replacing http:// with https:// and adding the appropriate flags:

With certificate verification (CA-signed cert):

curl -s --cacert /etc/elasticsearch/certs/ca.crt \
  -u elastic:yourpassword \
  "https://localhost:9200/_cluster/health?pretty"

Skipping certificate verification (self-signed, dev/test only):

curl -sk -u elastic:yourpassword "https://localhost:9200/_cluster/health?pretty"

With API key:

curl -s --cacert /etc/elasticsearch/certs/ca.crt \
  -H "Authorization: ApiKey <base64-encoded-key>" \
  "https://localhost:9200/_cluster/health?pretty"

Log Locations

Log	Path
Main application log	`/var/log/elasticsearch/<cluster_name>.log`
Slow search log	`/var/log/elasticsearch/<cluster_name>_index_search_slowlog.log`
Deprecation log	`/var/log/elasticsearch/<cluster_name>_deprecation.log`
GC log	`/var/log/elasticsearch/gc.log`

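Log file names are derived from cluster.name. Elasticsearch's own default cluster name is elasticsearch; with cluster.name: diskover set as in the example configuration above, the main log is diskover.log.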

Tail the main log:

sudo tail -f /var/log/elasticsearch/diskover.log

Common Operations

All operations below use the Elasticsearch REST API via curl. Replace localhost:9200 with your Elasticsearch host/port if remote. See Security & HTTPS for how to adapt these commands for authenticated or TLS-secured clusters.

Check Cluster Health

curl -s "http://localhost:9200/_cluster/health?pretty"

Response status will be one of:

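green: All shards assigned and healthy
yellow: All primary shards assigned, but one or more replicas are unassigned (expected on single-node installs when an index has number_of_replicas set to 1 or higher, because a replica cannot be placed on the same node as its primary; with 0 replicas a healthy single node reports green)
red: One or more primary shards are unassigned; data may be unavailable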
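
The same check can be scripted. The following Python sketch fetches the health document and turns the status into a one-line summary; it assumes an unauthenticated HTTP endpoint, and the function names are illustrative.

```python
import json
from urllib.request import urlopen


def describe_health(health):
    """Summarise a _cluster/health response (already parsed into a dict)."""
    status = health.get("status")
    unassigned = health.get("unassigned_shards", 0)
    if status == "green":
        return "green: all shards assigned"
    if status == "yellow":
        return f"yellow: {unassigned} replica shard(s) unassigned"
    if status == "red":
        return "red: at least one primary shard unassigned"
    return f"unknown status: {status!r}"


def fetch_health(base_url="http://localhost:9200", timeout=10.0):
    """GET /_cluster/health and return the parsed JSON body."""
    with urlopen(f"{base_url}/_cluster/health", timeout=timeout) as resp:
        return json.load(resp)
```

Calling describe_health(fetch_health()) against a running node prints the summary; describe_health can also be fed saved output from the curl command above.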

List Diskover Indices

Diskover surfaces index management in the UI under Diskover Admin > Indices. To query directly:

curl -s "http://localhost:9200/_cat/indices/diskover-*?v&s=creation.date:desc"

This lists all Diskover indices sorted by creation date (newest first), along with size, doc count, and health status.

Check Disk Usage by Index

curl -s "http://localhost:9200/_cat/indices/diskover-*?v&h=index,store.size,docs.count&s=store.size:desc"

Delete an Index

Indices can be deleted from Diskover Admin > Indices. To delete via the API directly:

curl -X DELETE "http://localhost:9200/diskover-<indexname>"

Replace <indexname> with the full index name. Deleted indices cannot be recovered.

Clear Disk Watermark Issues

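Elasticsearch stops accepting writes when disk usage on a node exceeds the flood-stage watermark (default 95%): affected indices are marked read-only via index.blocks.read_only_allow_delete. Free disk space first (for example, by deleting old indices). Elasticsearch 7.4 and later release the block automatically once usage drops below the high watermark; the steps below check disk usage, clear the block manually, and temporarily raise the watermarks.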

Check current disk usage:

curl -s "http://localhost:9200/_cat/allocation?v"

Remove read-only block from all indices:

curl -X PUT "http://localhost:9200/_all/_settings" \
  -H 'Content-Type: application/json' \
  -d '{"index.blocks.read_only_allow_delete": null}'

Temporarily raise watermarks (a stopgap while you free disk space):

curl -X PUT "http://localhost:9200/_cluster/settings" \
  -H 'Content-Type: application/json' \
  -d '{
    "transient": {
      "cluster.routing.allocation.disk.watermark.low": "90%",
      "cluster.routing.allocation.disk.watermark.high": "92%",
      "cluster.routing.allocation.disk.watermark.flood_stage": "95%"
    }
  }'

After freeing disk space, reset these to the defaults by setting each value to null.
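
To see at a glance which nodes are close to a watermark, the allocation output can be requested as JSON (_cat/allocation?format=json) and classified. The Python sketch below assumes the Elasticsearch default thresholds of 85%, 90%, and 95%; the function names are illustrative, and the comparison is an approximation because _cat reports whole percentages.

```python
def watermark_level(disk_percent, low=85.0, high=90.0, flood=95.0):
    """Classify a disk usage percentage against the three watermarks."""
    if disk_percent >= flood:
        return "flood_stage"
    if disk_percent >= high:
        return "high"
    if disk_percent >= low:
        return "low"
    return "ok"


def classify_nodes(rows):
    """Map node name to watermark level for _cat/allocation JSON rows."""
    result = {}
    for row in rows:
        pct = row.get("disk.percent")
        if pct is None:
            # The UNASSIGNED pseudo-row carries no disk figures.
            continue
        result[row["node"]] = watermark_level(float(pct))
    return result
```

If the watermarks have been changed in cluster settings, pass the adjusted thresholds to watermark_level instead of relying on the defaults.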

Force Merge an Index (Reclaim Disk Space)

After deleting documents or finishing a large crawl, force-merging reduces segment count and reclaims disk space:

curl -X POST "http://localhost:9200/diskover-<indexname>/_forcemerge?max_num_segments=1"

This is a resource-intensive operation. Run during off-peak hours.

Manually Refresh an Index

Diskover sets refresh_interval during crawls for performance (see Performance Tuning During Crawls). To make newly indexed documents immediately searchable outside of the normal refresh cycle:

curl -X POST "http://localhost:9200/diskover-<indexname>/_refresh"

Check Index Settings (Shards, Replicas, Refresh Interval)

curl -s "http://localhost:9200/diskover-<indexname>/_settings?pretty"

Update Index Replicas

Diskover defaults to 0 replicas (appropriate for single-node). Replica count can be set per scan configuration in Diskover Admin > Configuration > Diskover > Configurations > {ConfigName} under Elasticsearch Overrides. To update an existing index directly:

curl -X PUT "http://localhost:9200/diskover-<indexname>/_settings" \
  -H 'Content-Type: application/json' \
  -d '{"index": {"number_of_replicas": 1}}'

Diskover-Specific Behaviour

Index Naming

All Diskover indices must be prefixed with diskover-. The default naming pattern is:

diskover-<crawl_path>-<YYMMDDHHmmss>

Index names are configurable per scan — the only hard requirement enforced by Diskover is the diskover- prefix. Diskover will refuse to write to or delete indices that do not match it.
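
As an illustration of the naming pattern, the Python sketch below builds a default-style name and checks the prefix. How Diskover actually converts a crawl path into the <crawl_path> segment is not specified here, so the slug rule (lowercase, with non-alphanumeric runs collapsed to a hyphen) is an assumption, and both function names are hypothetical.

```python
import re
from datetime import datetime

PREFIX = "diskover-"


def default_index_name(crawl_path, when):
    """Build diskover-<crawl_path>-<YYMMDDHHmmss> for a path and a datetime."""
    # Assumed normalisation: Diskover's real rule may differ.
    slug = re.sub(r"[^a-z0-9]+", "-", crawl_path.lower()).strip("-")
    return f"{PREFIX}{slug}-{when.strftime('%y%m%d%H%M%S')}"


def is_diskover_index(name):
    """True when the name carries the required diskover- prefix."""
    return name.startswith(PREFIX)
```

With these assumptions, a crawl of /home/prod at 2024-03-18 14:30:22 produces diskover-home-prod-240318143022.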

Performance Tuning During Crawls

Diskover automatically adjusts index settings at the start and end of each crawl to maximise indexing throughput:

Setting	During crawl	After crawl
`refresh_interval`	`30s` (configurable)	`1s`
`number_of_replicas`	`0` (if `disable_replicas=true`)	configured value
`translog.durability`	`async`	`request`
`translog.flush_threshold_size`	`1gb`	`512mb`

Per-scan overrides (shards, replicas, chunk size, max connections, HTTP compression) are configured per scan in Diskover Admin > Configuration > Diskover > Configurations > {ConfigName}, at the bottom of the config under Elasticsearch Overrides. Multiple named configurations can be saved and selected per scan.

Global settings (refresh interval, translog size, translog sync interval) apply across all scans and are set in Diskover Admin > Configuration > Diskover > Elasticsearch.
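
To make the table concrete, the Python sketch below builds index-settings bodies that correspond to the "During crawl" and "After crawl" columns, in the shape accepted by PUT /<index>/_settings. It is only an illustration of the values involved, not Diskover's own code, and the function names are hypothetical.

```python
def crawl_settings(refresh_interval="30s", disable_replicas=True):
    """Settings body matching the 'During crawl' column."""
    index = {
        "refresh_interval": refresh_interval,
        "translog": {"durability": "async", "flush_threshold_size": "1gb"},
    }
    if disable_replicas:
        index["number_of_replicas"] = 0
    return {"index": index}


def post_crawl_settings(replicas):
    """Settings body matching the 'After crawl' column."""
    return {
        "index": {
            "refresh_interval": "1s",
            "number_of_replicas": replicas,
            "translog": {"durability": "request", "flush_threshold_size": "512mb"},
        }
    }
```

Comparing either body with the output of the Check Index Settings command shows whether an index was left in crawl mode, for example after an interrupted crawl.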

Special Document Types

Within each Diskover index, two document types track crawl metadata:

indexinfo — Records crawl start/end time, hostname, Diskover version, and task info
spaceinfo — Records disk space totals (total, used, free) for each scanned mount point

These are used by Diskover's UI for crawl history and storage reporting. Do not manually delete documents of these types.

Troubleshooting

Elasticsearch Won't Start

Check the service status and recent journal entries:

sudo systemctl status elasticsearch
sudo journalctl -u elasticsearch -n 100 --no-pager

Common causes:

Heap too large: JVM heap exceeds available RAM. Reduce -Xmx in jvm.options.
Port conflict: Another process is using port 9200. Check with sudo ss -tlnp | grep 9200.
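File descriptor limits: Elasticsearch requires at least 65535 open files. For a systemd-managed service the limit comes from LimitNOFILE in the unit file (the RPM and DEB packages set it), not from /etc/security/limits.conf, which only applies to login sessions.
```
# Show the open-file limit systemd applies to the service
systemctl show elasticsearch --property=LimitNOFILE
```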
Permissions on data directory: The elasticsearch user must own /var/lib/elasticsearch.
```
sudo chown -R elasticsearch:elasticsearch /var/lib/elasticsearch
```

Diskover Cannot Connect to Elasticsearch

Test connectivity from the Diskover server:

curl -s http://<es-host>:9200/_cluster/health

Things to check:

Firewall rules: port 9200 (HTTP) must be open between the Diskover and Elasticsearch hosts. Port 9300 (transport) only needs to be open between Elasticsearch nodes in a multi-node cluster.
network.host in elasticsearch.yml — must not be localhost if connecting remotely

Credentials: if xpack.security.enabled: true, verify the username/password configured in Diskover

RHEL/Rocky — check firewall:

sudo firewall-cmd --list-all
sudo firewall-cmd --add-port=9200/tcp --permanent
sudo firewall-cmd --reload

Ubuntu — check ufw:

sudo ufw status
sudo ufw allow 9200/tcp
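
When curl is not available on the Diskover host, a plain TCP check can tell a firewall or binding problem apart from an authentication problem. This is a minimal Python sketch with an illustrative function name; it only confirms that the port accepts connections, not that Elasticsearch is healthy.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

A False result for port 9200 points at the firewall or network.host; a True result with failing requests points at credentials or TLS settings.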

Cluster Status is Red

A red cluster means one or more primary shards are unassigned. This is usually caused by a node going down.

Identify unassigned shards:

curl -s "http://localhost:9200/_cat/shards?v&h=index,shard,prirep,state,unassigned.reason" | grep UNASSIGNED

Check allocation explanation:

curl -s "http://localhost:9200/_cluster/allocation/explain?pretty"

On single-node installs, a red status after restart often resolves on its own once the node finishes loading shards. Wait 1–2 minutes and re-check health.
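
On clusters with many indices, the grep output can be long. The Python sketch below filters the JSON form of the same listing (_cat/shards?format=json&h=index,shard,prirep,state,unassigned.reason) down to unassigned primaries, which are the shards that make the cluster red. The function name is illustrative.

```python
def unassigned_primaries(rows):
    """Return (index, shard, reason) for each unassigned primary shard."""
    return [
        (row["index"], row["shard"], row.get("unassigned.reason"))
        for row in rows
        if row.get("state") == "UNASSIGNED" and row.get("prirep") == "p"
    ]
```

Unassigned replicas (prirep "r") are deliberately ignored, since they only make the cluster yellow.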

Crawl Fails with Bulk Upload Errors

Symptoms: Diskover logs show BulkIndexError or TransportError during a crawl.

Common causes and fixes:

Disk watermark exceeded: See Clear Disk Watermark Issues above.
Index is read-only: Remove the read-only block (see above).
Timeout during large bulk operations: Increase the connection timeout (default 60s) in Diskover Admin > Configuration > Diskover > Elasticsearch.
Mapping conflicts: A field was indexed with an incompatible type. Delete the index and re-crawl.

High Memory Usage / GC Pressure

Check JVM stats:

curl -s "http://localhost:9200/_nodes/stats/jvm?pretty" | grep -A5 heap

If heap usage is consistently above 75%, consider one or more of the following:

Increase heap in jvm.options (up to 31GB max)
Delete old Diskover indices to reduce the number of shards in the cluster
Increase available RAM on the host
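
The 75% check can be automated against the parsed _nodes/stats/jvm response, where each node reports jvm.mem.heap_used_percent. The Python sketch below uses an illustrative function name and evaluates a single sample; "consistently above" still has to be judged by sampling over time.

```python
def nodes_over_heap_threshold(stats, threshold=75):
    """Map node name to heap_used_percent for nodes above the threshold."""
    over = {}
    for node in stats.get("nodes", {}).values():
        pct = node["jvm"]["mem"]["heap_used_percent"]
        if pct > threshold:
            over[node["name"]] = pct
    return over
```

An empty result means no node exceeded the threshold at the moment the stats were taken.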

SSL / Certificate Errors

Symptom: Diskover logs or Admin UI shows SSL: CERTIFICATE_VERIFY_FAILED or ConnectionError when HTTPS is enabled.

Steps:

Verify the certificate path is correct and readable by the Diskover process:
```
ls -la /etc/elasticsearch/certs/ca.crt
```

Test the connection manually from the Diskover server:

curl -s --cacert /path/to/ca.crt -u user:pass https://<es-host>:9200/_cluster/health

If using a self-signed cert, either supply the CA path under CA Certificates Path in the relevant Elasticsearch connection config (Web or Diskover; see Configuring Diskover to Use HTTPS), or disable Verify Certificate (dev/test only).

Check certificate expiry:

openssl x509 -enddate -noout -in /etc/elasticsearch/certs/ca.crt
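
The notAfter line printed by the openssl command above can be turned into a days-remaining figure for monitoring. This Python sketch uses the standard library's ssl.cert_time_to_seconds to parse the date; the function name and the optional now parameter are illustrative.

```python
import ssl
import time


def days_until_expiry(enddate_line, now=None):
    """Days left before the date in an openssl 'notAfter=...' line."""
    value = enddate_line.strip().split("=", 1)[1]
    expires = ssl.cert_time_to_seconds(value)
    current = time.time() if now is None else now
    return (expires - current) / 86400
```

A negative result means the certificate has already expired.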

Node Left the Cluster / Split Brain

Symptom: Cluster health is red or yellow unexpectedly; _cat/nodes shows fewer nodes than expected.

Check which nodes are in the cluster:

curl -s "http://localhost:9200/_cat/nodes?v&h=name,ip,node.role,master"

Check cluster state and master:

curl -s "http://localhost:9200/_cluster/state/master_node?pretty"