Dell - PowerScale Attributes
License: PRO+ (Professional Edition or higher)
Plugin Type: Post-Index Plugin
Author: Diskover Data, Inc.
Overview
The Dell PowerScale Attributes plugin enriches your Diskover indices with comprehensive storage metadata from Dell PowerScale (Isilon) clusters. By querying a custom PowerScale API service, this plugin provides deep visibility into physical and logical storage consumption, data protection levels, SSD tier placement, compression status, and file attributes—all searchable directly within Diskover.
This plugin is ideal for organizations running Dell PowerScale storage who need granular insights into how their data is stored, protected, and distributed across storage tiers and node pools.
Key Capabilities
Storage Metrics: Physical size, logical size, protection overhead, and compression ratios
Data Protection Analysis: Current and target protection policies, pool assignments
SSD Optimization: SSD status, strategy tracking, and tier placement verification
File Properties: Compression, deduplication, SmartLink (CloudPools), sparse files
Permission Auditing: Unix permissions, ACLs, owner/group information
Cost Analysis: Analyze storage costs across different node pools and tiers
Sample data from a PowerScale Attributes execution:
The full list of metadata returned by the PowerScale Attributes plugin is shown here.
Use Cases
Storage Tiering Analysis
Identify files by their storage pool placement and SSD strategy to optimize tier utilization. Understand which data resides on which node pools and make informed decisions about data placement policies.
Cost Analysis Across Node Pools
Analyze storage costs by examining how data is distributed across different node pools. Identify opportunities to move data to more cost-effective tiers based on access patterns and business requirements.
SSD Usage Optimization
Track SSD utilization and identify large files that should or should not be on SSD tiers. Optimize your high-performance storage by ensuring only appropriate data consumes expensive SSD capacity.
Access Pattern Analysis
Review file access patterns to make informed decisions about data placement and tiering. Use access time metadata to identify cold data candidates for archival tiers.
Permission Auditing
Audit file permissions including Unix mode, ACLs, and ownership for security compliance. Quickly identify files with overly permissive access or ownership anomalies.
Data Protection Compliance
Verify that files have appropriate protection levels applied and identify any mismatches between current and target protection policies.
Prerequisites
Important Deployment Requirement
This plugin does not use the standard Dell OneFS Platform API. It requires a custom Python library that runs directly on your PowerScale node(s). Deployment of this component requires completing an RPQ (Request for Product Qualification) process with Dell before the Diskover team can deploy this solution in your environment.
Please contact your Dell representative and the Diskover support team to initiate this process before proceeding with installation.
System Requirements
| Component | Requirement |
|---|---|
| Python | 3.9 or higher |
| Diskover | Core installation with plugin support |
| Elasticsearch | 7.x or 8.x (as supported by Diskover) |
| Dell PowerScale Cluster | Running OneFS with custom Diskover API service installed |
| Network Connectivity | HTTPS access to PowerScale API service (default port 8080) |
| Dell RPQ Process | Completed with Dell for custom API deployment |
Python Dependencies
The following Python modules are required and typically included with the Diskover installation:
requests
urllib3
Verification
Linux:
# Verify Python dependencies
python3 -c "import requests; print('requests version:', requests.__version__)"
python3 -c "import urllib3; print('urllib3 version:', urllib3.__version__)"
# Verify Elasticsearch connectivity
python3 -c "from diskover_elasticsearch import elasticsearch_connection; print(elasticsearch_connection().info())"
# Verify plugin version
python3 /opt/diskover/plugins_postindex/diskover_powerscale/diskover_powerscale.py --version
Windows:
# Verify Python dependencies
python -c "import requests; print('requests version:', requests.__version__)"
python -c "import urllib3; print('urllib3 version:', urllib3.__version__)"
# Verify Elasticsearch connectivity
python -c "from diskover_elasticsearch import elasticsearch_connection; print(elasticsearch_connection().info())"
# Verify plugin version
python "C:\Program Files\Diskover\plugins_postindex\diskover_powerscale\diskover_powerscale.py" --version
PowerScale API Server (ps_scan_api_server.py)
The PowerScale Attributes plugin requires a custom API server (ps_scan_api_server.py) to be deployed and running on your Dell PowerScale (OneFS) cluster. This API server was developed by the Dell CAE (Customer Advisory Engineering) Team and provides the interface between Diskover and your PowerScale storage metadata.
Important: The deployment of ps_scan_api_server.py on your OneFS nodes requires an RPQ (Request for Product Qualification) process to be completed with Dell. Please reach out to Diskover Support to initiate this process!
Source Code Location
The most up-to-date version of ps_scan_api_server.py and its supporting libraries is maintained in the DiskoverData GitHub repository:
GitHub Repository: https://github.com/diskoverdata/diskover-solution-development/tree/main/Dell
The repository contains:
ps_scan_api_server.py: The main API server script
/helpers directory: Helper modules required by the API server
/libs directory: Python libraries required by the API server
Network Requirements
The Diskover worker nodes and the OneFS system must be able to communicate over the network. Specifically, the Diskover worker node must be able to reach the OneFS system on Port 8080 (or your configured API port).
Installing on OneFS
After completing the Dell RPQ process and receiving approval, follow these steps to deploy the API server on your PowerScale cluster:
Upload the API server package to your OneFS host:
# Copy the API server package to the OneFS host
scp /path/to/ps_scan_api_server.tgz root@onefshost:/tmp/
SSH to your OneFS host and create a directory for the API server:
# SSH to OneFS host
ssh root@onefshost
# Create a run directory for the ps_scan code
cd /
mkdir ps_scan_api_server
# Move and extract the code
mv /tmp/ps_scan_api_server.tgz /ps_scan_api_server/
cd /ps_scan_api_server
tar -xvzf ps_scan_api_server.tgz
Start the API server:
Since OneFS does not use systemd or init.d, we recommend running the API server in a screen session so it can run persistently in the background:
# Start a screen session
screen -S diskover
# Navigate to the API server directory and execute
cd /ps_scan_api_server/
python3 ps_scan_api_server.py
# Detach from screen session (the API server continues running)
# Press: CTRL+A, then release and press: d
Helpful screen commands:
screen -ls: List all screen sessions
screen -r diskover: Reattach to the diskover screen session
screen -S diskover -X quit: Terminate the diskover screen session
For a complete reference of screen commands, see: Screen Quick Reference
Note: The API server does not output anything to the console during normal operation—it simply runs and waits for connections. As long as the process is running, Diskover will be able to connect to it.
Multi-Node Clusters
For PowerScale clusters with multiple nodes, you may need to deploy and run the API server on each node. Configure the Diskover plugin with multiple hosts to enable load balancing across all API server instances.
Installation
The PowerScale Attributes plugin is included with Diskover Professional+ editions. After completing the Dell RPQ process and having the custom API service deployed on your PowerScale cluster:
Verify the plugin files exist in your Diskover installation:
Linux:
ls -la /opt/diskover/plugins_postindex/diskover_powerscale/
Windows:
dir "C:\Program Files\Diskover\plugins_postindex\diskover_powerscale\"
Confirm the custom PowerScale API service is accessible from your Diskover server:
Linux/Windows:
curl -v "https://powerscale-node:8080/ps_stat/single?path=/ifs"
Configure the plugin through the Diskover Admin UI (see Configuration section below).
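The connectivity check in the previous step can also be scripted. The following is a minimal sketch and not part of the plugin: the helper names `build_stat_url` and `fetch_stat` are hypothetical, the `/ps_stat/single` endpoint and `path` parameter come from the curl example, and HTTP basic authentication and a JSON response body are assumptions.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_stat_url(host: str, path: str) -> str:
    """Build the single-path stat URL used in the curl check (hypothetical helper)."""
    # Keep "/" readable as in the curl example; other special characters are percent-encoded.
    query = urllib.parse.urlencode({"path": path}, safe="/", quote_via=urllib.parse.quote)
    return f"{host.rstrip('/')}/ps_stat/single?{query}"


def fetch_stat(host: str, path: str, user: str, password: str, timeout: float = 10.0):
    """Request metadata for one path. Assumes basic auth and a JSON response body."""
    request = urllib.request.Request(build_stat_url(host, path))
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())


# Building the URL needs no network access.
print(build_stat_url("https://powerscale-node:8080", "/ifs"))
```

Calling `fetch_stat("https://powerscale-node:8080", "/ifs", user, password)` from the Diskover server would then exercise the same path as the curl command, provided the host certificate is trusted by Python.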
Configuration
Configuration is managed through the Diskover Admin Panel. Navigate to Plugins → Post Index → PowerScale to access the settings.
Sample Configuration in Diskover Admin:
This shows the beginning of a sample configuration. The PowerScale plugin has many other settings, covered in detail below.
Configuration Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| maxthreads | int | 4 | Maximum threads for API requests. Match this to the number of API hosts for optimal load distribution. |
| hosts | list | | PowerScale API host URLs (supports multiple hosts for load balancing) |
| | string | | PowerScale API username |
| | SecretStr | | PowerScale API password |
| | bool | False | Include ACL information in API requests (increases API overhead) |
| | bool | False | Translate UIDs/GIDs to human-readable names |
| | int | 5000 | API request limit parameter |
| pscale_fields | list | [] | Specific fields to retrieve (empty = all fields) |
| | list | | Path translation rules between indexed paths and PowerScale paths |
Host Configuration
Configure the PowerScale API server host(s) in the hosts parameter. The port should match where your ps_scan_api_server.py is listening (default is 8080):
hosts:
  - https://powerscale-node1.example.com:8080
User/Password Configuration
Use the same credentials you would use to log into the OneFS Management Console.
Path Mapping Configuration
Path mapping is critical for proper operation. Diskover indexes files from mounted paths (e.g., /mnt/isilon/data) while the PowerScale API expects native OneFS paths (e.g., /ifs/data). You must configure mappings to translate between these representations.
Example Path Mappings:
| Indexed Path (from_path) | PowerScale Path (to_path) |
|---|---|
| /mnt/isilon/data | /ifs/data |
How Mapping Works:
Plugin retrieves document path from Elasticsearch (e.g., /mnt/isilon/data/project/file.txt)
Path is checked against from_path patterns in order
First matching pattern replaces with to_path
Translated path is sent to PowerScale API (e.g., /ifs/data/project/file.txt)
Tip: Place more specific path mappings before general ones. The plugin uses the first matching pattern.
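The first-match behavior described above can be sketched as follows. This is an illustration only, not the plugin's code: the function name `translate_path` and the sample mappings are hypothetical, and simple prefix matching on whole path components is an assumption about how from_path patterns are compared.

```python
from typing import List, Optional, Tuple

# Hypothetical (from_path, to_path) pairs, most specific first.
PATH_MAPPINGS: List[Tuple[str, str]] = [
    ("/mnt/isilon/data/projects", "/ifs/projects"),
    ("/mnt/isilon/data", "/ifs/data"),
]


def translate_path(indexed_path: str, mappings: List[Tuple[str, str]]) -> Optional[str]:
    """Return the OneFS path for an indexed path, or None if no mapping matches."""
    for from_path, to_path in mappings:
        prefix = from_path.rstrip("/")
        # Match whole path components so /mnt/isilon/data does not match /mnt/isilon/database.
        if indexed_path == prefix or indexed_path.startswith(prefix + "/"):
            return to_path.rstrip("/") + indexed_path[len(prefix):]
    return None


print(translate_path("/mnt/isilon/data/project/file.txt", PATH_MAPPINGS))
print(translate_path("/mnt/isilon/data/projects/a.mov", PATH_MAPPINGS))
```

If the two mappings were listed in the opposite order, the general /mnt/isilon/data rule would match first and the projects path would be translated to /ifs/data/projects/a.mov, which is why ordering matters.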
Selective Field Configuration
To reduce API overhead and index only the fields you need, specify them in the pscale_fields parameter:
size_physical
size_logical
size_protection
protection_current
protection_target
ssd_status
ssd_strategy_name
pool_target_data_name
file_is_compressed
file_is_deduped
All Available Fields:
atime, atime_date, btime, btime_date, ctime, ctime_date, file_access_pattern, file_compression_ratio, file_is_ads, file_is_compressed, file_is_dedupe_disabled, file_is_deduped, file_is_inlined, file_is_packed, file_is_smartlinked, file_is_sparse, inode_mirror_count, inode_parent, inode_revision, mtime, mtime_date, perms_acl_aces, perms_acl_group, perms_acl_user, perms_group, perms_unix_bitmask, perms_unix_gid, perms_unix_uid, perms_user, pool_target_data, pool_target_data_name, pool_target_metadata, pool_target_metadata_name, protection_current, protection_target, size, size_logical, size_physical, size_physical_data, size_protection, ssd_status, ssd_status_name, ssd_strategy, ssd_strategy_name
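Because a misspelled field name is easy to miss in a long list, it can help to check a planned pscale_fields value against the list above before saving it. The following is a small sketch under that assumption; `validate_fields` is a hypothetical helper, not something the plugin provides.

```python
# Field names copied from the "All Available Fields" list above.
AVAILABLE_FIELDS = frozenset(
    """
    atime atime_date btime btime_date ctime ctime_date
    file_access_pattern file_compression_ratio file_is_ads file_is_compressed
    file_is_dedupe_disabled file_is_deduped file_is_inlined file_is_packed
    file_is_smartlinked file_is_sparse
    inode_mirror_count inode_parent inode_revision
    mtime mtime_date
    perms_acl_aces perms_acl_group perms_acl_user perms_group
    perms_unix_bitmask perms_unix_gid perms_unix_uid perms_user
    pool_target_data pool_target_data_name pool_target_metadata pool_target_metadata_name
    protection_current protection_target
    size size_logical size_physical size_physical_data size_protection
    ssd_status ssd_status_name ssd_strategy ssd_strategy_name
    """.split()
)


def validate_fields(requested):
    """Split requested names into (valid, unknown), preserving the input order."""
    valid = [name for name in requested if name in AVAILABLE_FIELDS]
    unknown = [name for name in requested if name not in AVAILABLE_FIELDS]
    return valid, unknown


valid, unknown = validate_fields(["size_physical", "ssd_staus", "file_is_deduped"])
print("valid:", valid)
print("unknown:", unknown)
```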
Multi-Host Load Balancing
For optimal performance with large indices, configure multiple API hosts. The plugin distributes requests across hosts using round-robin load balancing:
hosts:
  - https://powerscale-node1.example.com:8080
  - https://powerscale-node2.example.com:8080
  - https://powerscale-node3.example.com:8080
Set maxthreads to match the number of API hosts for optimal distribution.
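As a general illustration of round-robin distribution (the plugin's internal implementation is not shown in this article), the sketch below cycles through the configured hosts in order. The function name `assign_hosts` and the sample paths are hypothetical.

```python
import itertools
from collections import Counter

HOSTS = [
    "https://powerscale-node1.example.com:8080",
    "https://powerscale-node2.example.com:8080",
    "https://powerscale-node3.example.com:8080",
]


def assign_hosts(paths, hosts):
    """Pair each path with a host, cycling through the hosts in order."""
    rotation = itertools.cycle(hosts)
    return [(path, next(rotation)) for path in paths]


# Hypothetical sample paths, used only to show the distribution.
paths = [f"/ifs/data/file{i}.bin" for i in range(7)]
assignments = assign_hosts(paths, HOSTS)
for path, host in assignments:
    print(path, "->", host)
print(Counter(host for _, host in assignments))
```

With seven requests and three hosts, the first host receives three requests and the others two each, so the load differs by at most one request per host.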
Diskover Web Extra Field Configuration
The PowerScale plugin creates metadata fields nested under a top-level field called pscale. To visualize these fields in Diskover Web, you must add an extra field configuration:
Navigate to Diskover Admin → Web → General → Expose Extra Fields from Index and Post-Index Plugins
Add a new entry:
Display Name: PowerScale
Name in Index: pscale
Execution
The PowerScale Attributes plugin can be executed manually via command line or automatically as part of your indexing workflow.
Manual Execution
Linux:
# Basic execution on a specific index
python3 /opt/diskover/plugins_postindex/diskover_powerscale/diskover_powerscale.py diskover-isilon-2026.01.15
# Use a named configuration
python3 /opt/diskover/plugins_postindex/diskover_powerscale/diskover_powerscale.py -c production-cluster diskover-prod-data
# Auto-find latest index by top path with verbose output
python3 /opt/diskover/plugins_postindex/diskover_powerscale/diskover_powerscale.py -l /mnt/isilon/data -v
# Filter to specific file types
python3 /opt/diskover/plugins_postindex/diskover_powerscale/diskover_powerscale.py -q "extension:(mp4 OR mov OR mxf)" diskover-media
# Filter to large files only (>10GB)
python3 /opt/diskover/plugins_postindex/diskover_powerscale/diskover_powerscale.py -q "size:>10737418240" diskover-archive
Windows:
# Basic execution on a specific index
python "C:\Program Files\Diskover\plugins_postindex\diskover_powerscale\diskover_powerscale.py" diskover-isilon-2026.01.15
# Use a named configuration
python "C:\Program Files\Diskover\plugins_postindex\diskover_powerscale\diskover_powerscale.py" -c production-cluster diskover-prod-data
# Auto-find latest index by top path with verbose output
python "C:\Program Files\Diskover\plugins_postindex\diskover_powerscale\diskover_powerscale.py" -l /mnt/isilon/data -v
# Filter to specific file types
python "C:\Program Files\Diskover\plugins_postindex\diskover_powerscale\diskover_powerscale.py" -q "extension:(mp4 OR mov OR mxf)" diskover-media
Command-Line Options
| Option | Long Form | Description |
|---|---|---|
| -c | | Use a named configuration from Diskover Web Admin |
| -q | | Elasticsearch query to filter which files to process |
| -l | | Auto-find latest index by top path |
| -v | | Enable verbose logging |
| | --version | Print version and exit |
Automated Execution
Using Custom Tasks
Schedule the PowerScale plugin to run on a recurring basis using Diskover's Custom Tasks feature.
Sample Custom Task Configuration:
The Run Command and arguments needed for the Custom Task are shown here. A Custom Task does not create an index, so the {indexname} variable is not available; use the -l (top path) CLI option and pass in your top path instead.
Using Post-Crawl Commands
Configure the plugin to run automatically after each index crawl completes by adding it as a Post-Crawl Command in your Index Task.
Linux Example:
| Field | Value |
|---|---|
| Post-Crawl Command | |
| Post-Crawl Command Args | |
Windows Example:
| Field | Value |
|---|---|
| Post-Crawl Command | |
| Post-Crawl Command Args | |
Available Index Task Tokens:
{indexname}— The name of the index that was just created
Sample Post-Crawl Command configuration for PowerScale executing with an Index Task:
Replace ConfigurationName above with a named configuration that you created at Diskover Admin → Plugins → Post-Index → PowerScale. If you are using the Default configuration rather than a custom one, the -c flag and the ConfigurationName are not required.
Query Filtering Examples
Process only relevant files to reduce execution time and API calls:
# Process only files modified in the last 7 days
-q "mtime:[now-7d TO now]"
# Process only specific file extensions
-q "extension:(mxf OR mov OR mp4)"
# Process only large files (>1GB)
-q "size:>1073741824"
# Combined filters
-q "type:file AND size:>1073741824 AND extension:mxf"
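The size thresholds in these filters are byte counts using binary units (1 GB here means 1024^3 bytes). A short sketch of the arithmetic follows; the helper names `gib_to_bytes` and `size_filter` are hypothetical and exist only to show where the numbers come from.

```python
def gib_to_bytes(gib: float) -> int:
    """Convert a size in binary gigabytes (GiB) to bytes."""
    return int(gib * 1024 ** 3)


def size_filter(min_gib: float, field: str = "size") -> str:
    """Build a 'field:>bytes' filter string for use with the -q option."""
    return f"{field}:>{gib_to_bytes(min_gib)}"


print(size_filter(1))   # size:>1073741824
print(size_filter(10))  # size:>10737418240
print(size_filter(10, field="pscale.size_physical"))
```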
Reviewing the Output
Execution Progress
When running with verbose mode (-v), the plugin displays detailed progress:
INFO - Connecting to Dell PowerScale API...
INFO - Starting PowerScale metadata collection...
INFO - Starting 4 threads
INFO - Searching for all docs in index diskover-isilon-2026.01.15...
INFO - Found 1,250,000 matching docs
INFO - Queued 10,000 of 1,250,000 docs, 1,240,000 remaining
INFO - thread 0 updated 250 docs
INFO - thread 1 updated 248 docs
...
INFO - --- es search time 45.123456s ---
INFO - --- es update time 892.654321s ---
INFO - --- pscale api request time 1245.789012s ---
Verifying Results
After execution, verify that PowerScale metadata has been added to your documents:
Using Diskover Web UI:
Navigate to the processed index
Search for any file
View the file details. You should see a pscale section with PowerScale metadata
Quick Search Test:
Run this search in Diskover to find all files with PowerScale metadata:
pscale:*
Using Elasticsearch directly:
curl -s "localhost:9200/diskover-isilon-2026.01.15/_search?size=1" | jq '.hits.hits[]._source.pscale'
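To see how many documents were enriched rather than inspecting a single hit, a count request can be sent to Elasticsearch. The sketch below is an assumption-laden example: it uses the standard Elasticsearch `_count` API with an `exists` query on the pscale field, and assumes an unauthenticated HTTP endpoint on localhost:9200 as in the curl command. The function names are hypothetical.

```python
import json
import urllib.request


def build_count_request(index: str, es_url: str = "http://localhost:9200"):
    """Return the URL and JSON body for counting documents that have a pscale field."""
    body = {"query": {"exists": {"field": "pscale"}}}
    return f"{es_url.rstrip('/')}/{index}/_count", json.dumps(body).encode()


def count_enriched_docs(index: str, es_url: str = "http://localhost:9200") -> int:
    """Send the count request (assumes no authentication is required)."""
    url, data = build_count_request(index, es_url)
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read())["count"]


# Building the request needs no network access.
url, data = build_count_request("diskover-isilon-2026.01.15")
print(url)
print(data.decode())
```

Comparing the result of `count_enriched_docs(...)` with the "Found N matching docs" line from the plugin log gives a quick indication of how many documents received metadata.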
Expected Output Structure
Documents are updated with a pscale object containing the requested metadata:
{
"pscale": {
"size_physical": 1234567890,
"size_logical": 1000000000,
"size_protection": 234567890,
"protection_current": "+2d:1n",
"protection_target": "+2d:1n",
"pool_target_data_name": "hot_tier",
"ssd_status_name": "complete",
"ssd_strategy_name": "metadata",
"file_is_compressed": true,
"file_is_deduped": false,
"perms_unix_bitmask": "0644"
}
}
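In the sample values above, size_physical equals size_logical plus size_protection. Assuming that relationship, which is an observation from this sample and not a documented guarantee, protection overhead can be expressed as a percentage of logical size. The function name `protection_overhead_pct` is hypothetical.

```python
# Values copied from the sample pscale object above.
sample = {
    "size_physical": 1234567890,
    "size_logical": 1000000000,
    "size_protection": 234567890,
}


def protection_overhead_pct(pscale: dict) -> float:
    """Protection bytes as a percentage of logical bytes (0.0 for empty files)."""
    logical = pscale["size_logical"]
    if logical == 0:
        return 0.0
    return 100.0 * pscale["size_protection"] / logical


# Check the relationship observed in the sample, then report the overhead.
print(sample["size_logical"] + sample["size_protection"] == sample["size_physical"])
print(round(protection_overhead_pct(sample), 2))
```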
Searching in Diskover
The PowerScale plugin adds extensive metadata to your indexed files, all searchable using the pscale. field prefix in Diskover's search bar.
Size and Capacity Queries
Find files by physical storage size (>10GB):
pscale.size_physical:>10737418240
Find files with high protection overhead (>1GB):
pscale.size_protection:>1073741824
Find large uncompressed files (compression candidates):
pscale.size_logical:>10737418240 AND pscale.file_is_compressed:false
Storage Pool and Tier Queries
Find files on a specific storage pool:
pscale.pool_target_data_name:archive_pool
Cost analysis - find large files on expensive tiers:
pscale.pool_target_data_name:hot_tier AND size:>10737418240
Find files by SSD strategy:
pscale.ssd_strategy_name:metadata
File Properties Queries
Find compressed files:
pscale.file_is_compressed:true
Find deduplicated files:
pscale.file_is_deduped:true
Find SmartLinked files (CloudPools):
pscale.file_is_smartlinked:true
Data Protection Queries
Find files by current protection level:
pscale.protection_current:"+2d:1n"
Find files where protection doesn't match target (shown here for the +1d:1n level):
pscale.protection_current:"+1d:1n" AND NOT pscale.protection_target:"+1d:1n"
Permission Queries
Find files with fully open permissions (mode 0777, a security concern):
pscale.perms_unix_bitmask:0777
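An exact match on 0777 finds only files whose mode is exactly 0777; other modes such as 0666 also allow writes by any user. A minimal sketch for checking the "other write" bit on exported values follows, assuming perms_unix_bitmask is stored as an octal string as in the sample output. The function name `is_world_writable` is hypothetical.

```python
def is_world_writable(bitmask: str) -> bool:
    """True when the 'other' write bit is set in an octal mode string such as '0646'."""
    return bool(int(bitmask, 8) & 0o002)


for mode in ["0644", "0666", "0755", "0777"]:
    print(mode, is_world_writable(mode))
```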
Find files by owner UID:
pscale.perms_unix_uid:1001
Combined Analysis Queries
Cost optimization - large files on expensive pools that are rarely accessed:
pscale.pool_target_data_name:hot_tier AND size:>10737418240 AND atime:<now-180d
Deduplication candidates in archive pool:
pscale.file_is_deduped:false AND pscale.pool_target_data_name:archive_pool
Troubleshooting
API Connection Failures
Symptom: Error messages about connection refused or timeout.
Possible Causes:
PowerScale API service (ps_scan_api_server.py) not running on the cluster
Network connectivity issues between Diskover server and PowerScale
Firewall blocking the API port (default 8080)
Resolution:
Verify the API server is running on your OneFS host:
# Reattach to the screen session
screen -r diskover
# Check if the process is still running
Test API connectivity directly:
curl -v "https://powerscale-host:8080/ps_stat/single?path=/ifs"
Verify network connectivity:
nc -zv powerscale-api.example.com 8080
Check that firewall rules allow traffic to the API port
Authentication Errors
Symptom: HTTP 401 or 403 errors in logs.
Resolution:
Verify username and password in configuration match your OneFS Management Console credentials
Confirm API user has read permissions for target paths
Check if password contains special characters that need escaping
Verify API user account is not locked or expired
No Metadata Returned
Symptom: Plugin runs successfully but documents have no pscale field.
Common Causes:
Incorrect path mappings
Files don't exist at the translated paths
API user lacks permissions to access the files
Resolution:
Run with verbose logging to see detailed output:
python3 diskover_powerscale.py -v diskover-myindex 2>&1 | head -100
Verify your path mappings translate correctly
Test the API directly with a translated path:
curl -u user:pass "https://api:8080/ps_stat/single?path=/ifs/data/test/file.txt"
Path Mapping Issues
Symptom: "No pscale metadata returned from api" warnings for all files.
Resolution:
Check indexed paths in Elasticsearch:
curl -s "localhost:9200/diskover-myindex/_search?size=1" | jq '.hits.hits[]._source.parent_path'
Ensure from_path matches the exact indexed path prefix
Verify to_path is the correct PowerScale native path (starts with /ifs/)
Place more specific mappings before general ones
Slow Performance
Symptom: Plugin takes much longer than expected to complete.
Resolution:
Add more API hosts and increase thread count
Use query filtering to process fewer files
Configure selective fields to reduce API response size
Monitor API server load and add capacity if needed
Debug Logging
Enable verbose logging to diagnose issues:
Linux:
python3 /opt/diskover/plugins_postindex/diskover_powerscale/diskover_powerscale.py -v diskover-myindex 2>&1 | tee powerscale_debug.log
Windows:
python "C:\Program Files\Diskover\plugins_postindex\diskover_powerscale\diskover_powerscale.py" -v diskover-myindex 2>&1 | Tee-Object -FilePath powerscale_debug.log
Support
Last Updated: April 2026