Dell PowerScale
License: PRO+ (Professional Edition or higher)
Module Type: Alternate Scanner
Author: Diskover Data, Inc.
Overview / Use Cases
The Dell PowerScale alternate scanner lets you index files directly from Dell PowerScale (Isilon) clusters via a REST API — capturing rich storage metadata like physical and logical sizing, data protection levels, SSD strategies, compression status, and file properties all in a single crawl pass. Instead of scanning the filesystem and then enriching metadata afterward, this scanner pulls everything at once, giving you a complete picture of your PowerScale storage right from the start.
All of this metadata is indexed into Elasticsearch and fully searchable through the Diskover Web UI, appearing alongside standard file information like name, path, size, and timestamps.
Who Benefits and How
Storage Administrators — Analyze storage tier utilization across your cluster. See which files live on hot, warm, or cold tiers, verify that SmartPools policies are placing data correctly, and identify misallocated files that are driving up storage costs.
Capacity Planners — Understand the real cost of data protection overhead. Compare physical vs. logical sizes across file populations to see how much storage is consumed by protection policies, and use compression ratios to identify space-saving opportunities.
Compliance & Security Teams — Audit file permissions (Unix mode, ACLs, ownership) across the entire cluster from a single search interface. Identify files with overly permissive access, incorrect ownership, or protection levels that don't meet policy requirements.
Media & Content Professionals — Track SSD placement for performance-sensitive media files. Verify that frequently accessed project files are on the right storage tier and that completed projects have been properly migrated to archive tiers.
Understanding Dell PowerScale
Dell PowerScale (formerly Isilon) is a scale-out NAS platform that presents a single filesystem (/ifs) across a cluster of nodes. Understanding a few key PowerScale concepts will help you make the most of the metadata this scanner collects.
Storage Tiers and Pools
PowerScale organizes storage into node pools (groups of similar hardware) and tiers (logical groupings of pools). SmartPools policies control where files are placed based on criteria like age, size, or access patterns.
| Concept | What It Means | Why It Matters in Diskover |
|---|---|---|
| Node Pool | A group of nodes with similar hardware (e.g., all-flash, hybrid, archive) | The |
| Storage Tier | A logical grouping of pools (hot, warm, cold) | Search by pool name to analyze tier distribution |
| SmartPools Policy | Rules that automatically move files between tiers | Verify policy compliance by searching for files on unexpected tiers |
Data Protection Levels
PowerScale expresses data protection in a notation such as +2d:1n, which means the cluster can survive 2 simultaneous drive failures or 1 node failure. The scanner captures both the current protection level and the target level; if the two differ, the file is in the process of being re-protected.
| Protection Field | What It Tells You |
|---|---|
| | The protection level actually applied right now |
| | The protection level the system is working toward |
| | How many bytes are consumed by protection overhead |
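The protection notation and the current-versus-target comparison described above can be sketched in a few lines of Python. This is an illustrative helper, not part of the scanner: the function names are hypothetical, the parser only recognizes the `+<drives>d:<nodes>n` form shown in this section, and the `protection_current` / `protection_target` keys are taken from the example document later in this article.

```python
import re

# Matches only the "+<drives>d:<nodes>n" form shown in this article (e.g. "+2d:1n").
# Other protection notations are deliberately not handled in this sketch.
_PROTECTION_RE = re.compile(r"^\+(\d+)d:(\d+)n$")


def parse_protection(level):
    """Return (drive_failures, node_failures), or None if the form is not recognized."""
    match = _PROTECTION_RE.match(level.strip())
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_reprotecting(pscale):
    """True when the current and target protection levels differ."""
    return pscale.get("protection_current") != pscale.get("protection_target")


if __name__ == "__main__":
    print(parse_protection("+2d:1n"))
    print(is_reprotecting({"protection_current": "+2d:1n",
                           "protection_target": "+3d:1n"}))
```

A mismatch only indicates that the two recorded levels differ at crawl time; it says nothing about how far along the re-protection job is.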
SSD Strategies
PowerScale uses SSD storage strategically. The SSD strategy determines what data is placed on SSDs:
| Strategy | Meaning |
|---|---|
| | Only file metadata is stored on SSD (most common) |
| | Metadata on SSD, with write acceleration |
| | Both file data and metadata reside on SSD |
| | File avoids SSD entirely |
File Properties
The scanner captures several boolean properties that reveal how PowerScale is handling each file:
| Property | What It Means |
|---|---|
| Compressed | File data is compressed on-disk, reducing physical storage |
| Deduplicated | Duplicate blocks are stored only once |
| SmartLinked | File has been tiered to cloud storage via CloudPools |
| Sparse | File contains unallocated regions (common for databases, VMs) |
| Inlined | Small file data stored directly in the inode for performance |
Requirements
System Requirements
| Component | Requirement |
|---|---|
| Python | 3.9 or higher |
| Diskover | Core installation with alternate scanner support |
| Dell PowerScale | Cluster running OneFS with the Dell |
| Network | HTTP/HTTPS connectivity from the Diskover server to the |
Python Dependencies
All required Python packages (requests, urllib3) are included with the standard Diskover installation. No additional package installation is needed.
Dell PowerScale API Requirements
This scanner communicates with a custom REST API utility called ps_scan_server, developed by Dell Engineering specifically for high-performance metadata retrieval. This utility must be installed and running on your PowerScale cluster before the scanner can operate.
Important: Access to the ps_scan_server utility requires a Request for Product Qualification (RPQ) process with Dell. The source code is maintained at https://github.com/diskoverdata/diskover-solution-development/tree/main/Dell. Please contact Diskover Support to initiate the RPQ process with Dell.
| Requirement | Description |
|---|---|
| | Dell-provided REST API server running on the PowerScale cluster |
| Network Access | HTTP/HTTPS connectivity from the Diskover server to each API host on port 4242 (or your configured port) |
| API Credentials | Username and password for API authentication (if authentication is enabled) |
PowerScale API Server (ps_scan_server)
The PowerScale alternate scanner requires a custom API server (ps_scan_server) to be deployed and running on your Dell PowerScale (OneFS) cluster. This API server was developed by the Dell CAE (Customer Advisory Engineering) Team and provides the high-performance interface between Diskover and your PowerScale storage metadata.
Important: The deployment of ps_scan_server on your OneFS nodes requires an RPQ (Request for Product Qualification) process to be completed with Dell. Please reach out to Diskover Support to initiate this process.
Source Code Location
The most up-to-date version of ps_scan_server and its supporting libraries is maintained in the DiskoverData GitHub repository:
GitHub Repository: https://github.com/diskoverdata/diskover-solution-development/tree/main/Dell
The repository contains:
ps_scan_api_server.py: The main API server script
/helpers directory: Helper modules required by the API server
/libs directory: Python libraries required by the API server
Network Requirements
The Diskover worker nodes and the OneFS system must be able to communicate over the network. Specifically, the Diskover worker node must be able to reach the OneFS system on Port 4242 (or your configured API port).
Installing on OneFS
After completing the Dell RPQ process and receiving approval, follow these steps to deploy the API server on your PowerScale cluster:
1. Upload the API server package to your OneFS host:
# Copy the API server package to the OneFS host
scp /path/to/ps_scan_api_server.tgz root@onefshost:/tmp/
2. SSH to your OneFS host and create a directory for the API server:
# SSH to OneFS host
ssh root@onefshost
# Create a run directory for the ps_scan code
cd /
mkdir ps_scan_api_server
# Move and extract the code
mv /tmp/ps_scan_api_server.tgz /ps_scan_api_server/
cd /ps_scan_api_server
tar -xvzf ps_scan_api_server.tgz
3. Start the API server:
Since OneFS does not use systemd or init.d, we recommend running the API server in a screen session so it can run persistently in the background:
# Start a screen session
screen -S diskover
# Navigate to the API server directory and execute
cd /ps_scan_api_server/
python3 ps_scan_api_server.py
# Detach from screen session (the API server continues running)
# Press: CTRL+A, then release and press: d
Helpful screen commands:
screen -ls: List all screen sessions
screen -r diskover: Reattach to the diskover screen session
screen -S diskover -X quit: Terminate the diskover screen session
For a complete reference of screen commands, see: Screen Quick Reference
Note: The API server does not output anything to the console during normal operation—it simply runs and waits for connections. As long as the process is running, Diskover will be able to connect to it.
Multi-Node Clusters
For PowerScale clusters with multiple nodes, you may need to deploy and run the API server on each node. Configure the Diskover scanner with multiple hosts to enable load balancing across all API server instances:
hosts:
  - http://powerscale-node1:4242
  - http://powerscale-node2:4242
  - http://powerscale-node3:4242
The scanner uses round-robin load balancing to distribute requests across all configured hosts, improving throughput for large crawls.
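As a rough illustration of what round-robin distribution means here, the following Python sketch hands out configured hosts in rotation. It is not the scanner's actual implementation; the class name and method are hypothetical, and the host URLs are the placeholders from the example above.

```python
from itertools import cycle


class RoundRobinHosts:
    """Hand out API hosts in a fixed rotation (hypothetical helper)."""

    def __init__(self, hosts):
        hosts = list(hosts)
        if not hosts:
            raise ValueError("at least one API host is required")
        # cycle() repeats the list endlessly in its original order.
        self._rotation = cycle(hosts)

    def next_host(self):
        """Return the host that should receive the next request."""
        return next(self._rotation)


if __name__ == "__main__":
    pool = RoundRobinHosts([
        "http://powerscale-node1:4242",
        "http://powerscale-node2:4242",
        "http://powerscale-node3:4242",
    ])
    for _ in range(4):
        print(pool.next_host())
```

With three hosts, every third request lands on the same node, so each API server instance sees roughly one third of the request volume.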
Installation
Step 1: Install Scanner Package
Linux:
dnf install diskover-scanner-powerscale
Windows:
The scanner files are included with the Diskover Windows installation. No separate installation step is required.
Install locations:
Linux: /opt/diskover/scanners/scandir_powerscale/
Windows: C:\Program Files\Diskover\scanners\scandir_powerscale\
Step 2: Verify Installation
Confirm the required Python packages are available:
python3 -c "import requests; print('requests version:', requests.__version__)"
python3 -c "import urllib3; print('urllib3 version:', urllib3.__version__)"
Both packages should report their version numbers without errors.
Step 3: Verify API Connectivity
Test that your Diskover server can reach the PowerScale API:
curl "http://your-powerscale-api-host:4242/cluster_storage_stats"
A successful response returns JSON containing cluster storage statistics (ifs.bytes.total, ifs.bytes.free, ifs.bytes.avail). If this fails, check your network configuration, firewall rules, and confirm the ps_scan_server utility is running on the PowerScale cluster.
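The same check can be scripted. The sketch below fetches the endpoint with the Python standard library and derives a percent-used figure. It assumes, without confirmation from the API documentation, that the response is a flat JSON object keyed by the names listed above (`ifs.bytes.total`, `ifs.bytes.free`); adjust the key lookups if your server nests them differently. Both function names are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_cluster_stats(base_url, timeout=10):
    """GET <base_url>/cluster_storage_stats and decode the JSON body."""
    url = base_url.rstrip("/") + "/cluster_storage_stats"
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def percent_used(stats):
    """Percent of /ifs capacity in use.

    Assumes a flat dict with "ifs.bytes.total" and "ifs.bytes.free" keys;
    that layout is an assumption, not something this article confirms.
    """
    total = stats["ifs.bytes.total"]
    free = stats["ifs.bytes.free"]
    if total <= 0:
        raise ValueError("ifs.bytes.total must be positive")
    return 100.0 * (total - free) / total


if __name__ == "__main__":
    # Sample values for illustration only; no network call is made here.
    sample = {"ifs.bytes.total": 1000, "ifs.bytes.free": 250, "ifs.bytes.avail": 200}
    print(percent_used(sample))
```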
Configuration
Configuration is managed through the Diskover Admin UI at Settings > Alternate Scanners > PowerScale.
Configuration Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| | list | | List of PowerScale API host URLs. Multiple hosts enable round-robin load balancing across API endpoints. |
| | string | | Username for PowerScale API authentication |
| | secret | (empty) | Password for PowerScale API authentication (stored securely) |
| | bool | | When enabled, includes ACL (Access Control List) information in API requests. Adds |
| | bool | | When enabled, translates numeric UIDs/GIDs to human-readable usernames and group names. Adds |
| | int | | Number of entries returned per API request (pagination batch size). Higher values reduce API round trips but increase memory usage per request. |
| | list | | Specific PowerScale fields to retrieve. An empty list retrieves all available fields. Specifying a subset reduces API response size and index storage. |
| | list | (see below) | Rules for translating between mount paths on the Diskover server and native OneFS paths on PowerScale. |
Path Mapping Parameters
Path mappings translate the paths your Diskover server sees (mount points) into the corresponding native OneFS paths that the PowerScale API expects. Each entry contains:
| Parameter | Type | Description |
|---|---|---|
| | string | The mount point path as seen by the Diskover server (e.g., |
| | string | The corresponding native OneFS path on PowerScale (e.g., |
For example, if your PowerScale /ifs/data share is mounted at /mnt/isilon/data on your Diskover server, you would configure:
from_path: /mnt/isilon/data
to_path: /ifs/data
When the scanner encounters a file at /mnt/isilon/data/projects/report.pdf, it translates the path to /ifs/data/projects/report.pdf for the API request.
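The translation above amounts to a prefix substitution. A minimal sketch follows, assuming mappings are checked in configuration order with the first match winning (which is why more specific mappings should be listed first) and that unmatched paths pass through unchanged. Both behaviors are assumptions for illustration, and `translate_path` is a hypothetical name, not the scanner's function.

```python
def translate_path(path, mappings):
    """Translate a Diskover-side mount path into a native OneFS path.

    mappings is a list of {"from_path": ..., "to_path": ...} dicts,
    checked in order; the first matching prefix wins.
    """
    for mapping in mappings:
        prefix = mapping["from_path"].rstrip("/")
        # Match on whole path components so /mnt/isilon/data does not
        # accidentally match /mnt/isilon/database.
        if path == prefix or path.startswith(prefix + "/"):
            return mapping["to_path"].rstrip("/") + path[len(prefix):]
    return path  # assumption: unmatched paths are returned as-is


if __name__ == "__main__":
    mappings = [{"from_path": "/mnt/isilon/data", "to_path": "/ifs/data"}]
    print(translate_path("/mnt/isilon/data/projects/report.pdf", mappings))
```

Matching on a trailing path separator rather than a bare string prefix avoids rewriting sibling directories that merely share a name prefix with the mount point.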
YAML Configuration Example
# Diskover Admin Configuration
Diskover:
  Alternate Scanners:
    Powerscale:
      Default:
        hosts:
          - https://powerscale-api-1.example.com:4242
          - https://powerscale-api-2.example.com:4242
        user: diskover_api_user
        password: your_secure_password
        include_acls: false
        translate_names: false
        limit: 5000
        pscale_fields: []
        path_mappings:
          - from_path: /mnt/isilon/data
            to_path: /ifs/data
          - from_path: /mnt/isilon/archive
            to_path: /ifs/archive
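Before saving a configuration like the one above, it can help to sanity-check the values. The sketch below validates a plain Python dict holding the keys under `Default`; it is a hypothetical pre-flight check rather than anything Diskover runs, and the specific rules (http/https URLs, positive integer `limit`, both keys present in each mapping) are reasonable assumptions, not documented constraints.

```python
from urllib.parse import urlparse


def validate_scanner_config(cfg):
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []

    hosts = cfg.get("hosts") or []
    if not hosts:
        problems.append("hosts: at least one API host URL is required")
    for host in hosts:
        parsed = urlparse(host)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"hosts: {host!r} is not an http(s) URL")

    limit = cfg.get("limit")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(limit, bool) or not isinstance(limit, int) or limit <= 0:
        problems.append("limit: must be a positive integer")

    for index, mapping in enumerate(cfg.get("path_mappings") or []):
        for key in ("from_path", "to_path"):
            if not mapping.get(key):
                problems.append(f"path_mappings[{index}]: missing {key}")

    return problems


if __name__ == "__main__":
    config = {
        "hosts": ["https://powerscale-api-1.example.com:4242"],
        "limit": 5000,
        "path_mappings": [{"from_path": "/mnt/isilon/data", "to_path": "/ifs/data"}],
    }
    print(validate_scanner_config(config))
```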
Configuration via Diskover Admin
1. Navigate to Settings > Alternate Scanners > PowerScale
2. Enter your API host URLs in the hosts list
3. Provide API credentials (user and password)
4. Configure path mappings for each mount point
5. Optionally enable include_acls or translate_names if needed
6. Optionally configure pscale_fields to limit which fields are collected
7. Save the configuration
Sample Configuration in Diskover Admin:
Configuration Examples
Standard Configuration — Full Metadata Collection
This is the recommended starting point. It collects all available PowerScale metadata using a single API host:
Diskover:
  Alternate Scanners:
    Powerscale:
      Default:
        hosts:
          - https://powerscale-api.example.com:4242
        user: diskover_svc
        password: secure_password
        include_acls: false
        translate_names: false
        limit: 5000
        pscale_fields: []
        path_mappings:
          - from_path: /mnt/isilon/data
            to_path: /ifs/data
High-Performance Configuration — Selective Fields with Load Balancing
For large environments where you want to minimize API overhead and index size, retrieve only the fields you need and distribute requests across multiple API hosts:
Diskover:
  Alternate Scanners:
    Powerscale:
      Default:
        hosts:
          - https://pscale-node1.example.com:4242
          - https://pscale-node2.example.com:4242
          - https://pscale-node3.example.com:4242
        user: diskover_svc
        password: secure_password
        include_acls: false
        translate_names: false
        limit: 10000
        pscale_fields:
          - size_physical
          - size_logical
          - size_protection
          - protection_current
          - protection_target
          - pool_target_data_name
          - ssd_strategy_name
          - file_is_compressed
          - file_is_deduped
          - file_compression_ratio
        path_mappings:
          - from_path: /mnt/isilon/data
            to_path: /ifs/data
          - from_path: /mnt/isilon/archive
            to_path: /ifs/archive
Usage / Execution
The PowerScale scanner is a standard alternate scanner that integrates with diskover.py using the --altscanner flag.
Basic Usage
Linux:
cd /opt/diskover
python3 diskover.py --altscanner scandir_powerscale /mnt/isilon/data
Windows:
cd "C:\Program Files\Diskover"
python diskover.py --altscanner scandir_powerscale /mnt/isilon/data
Path Format Reference
| Path Format | Description | Example |
|---|---|---|
| | Standard NFS/SMB mount point | |
| | Subdirectory within a mount | |
Note: The path you provide should be the mount point path as seen by the Diskover server, not the native OneFS path. The scanner translates paths automatically using your configured path mappings.
Advanced Usage Examples
Custom index name:
python3 diskover.py -i diskover-powerscale-data --altscanner scandir_powerscale /mnt/isilon/data
Debug logging for troubleshooting:
python3 diskover.py --altscanner scandir_powerscale --loglevel DEBUG /mnt/isilon/data
Using a named configuration:
python3 diskover.py --altscanner scandir_powerscale -c production-cluster /mnt/isilon/production
Integration with Index Tasks
To run the PowerScale scanner as part of a scheduled Index Task, configure the task with:
| Field | Value |
|---|---|
| Alternate Scanner | |
| Top Path | Your mount point path (e.g., |
Sample Alternate Scanner configuration for a Diskover Scan Task:
Performance Tips
Add multiple API hosts: The scanner uses round-robin load balancing to distribute requests across all configured hosts. Adding more hosts improves throughput for large crawls.
Increase the limit parameter: For directories with many files, increasing from the default 5000 to 10000 reduces the number of API round trips. Reduce to 1000 if memory is a constraint.
Use selective field collection: If you only need specific metadata (e.g., just tiering and compression info), configure pscale_fields to retrieve only those fields. This reduces API response sizes and Elasticsearch index storage.
Check cluster load: The scanner generates API traffic on your PowerScale cluster. Schedule scans during off-peak hours for large environments or distribute the load with multiple API hosts.
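To make the effect of the limit setting concrete, the following sketch estimates how many paginated requests a single directory listing needs. It assumes one API request per page of `limit` entries and at least one request per directory; the real request count may differ, and `api_round_trips` is a hypothetical helper.

```python
def api_round_trips(entry_count, limit):
    """Estimate paginated requests needed to list entry_count entries."""
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    # Ceiling division using integers only; even an empty directory
    # is assumed to cost one request.
    return max(1, (entry_count + limit - 1) // limit)


if __name__ == "__main__":
    entries = 1_000_000  # illustrative directory size
    for limit in (1000, 5000, 10000):
        print(limit, api_round_trips(entries, limit))
```

For a directory of one million entries, this estimate gives 1000 requests at a limit of 1000, 200 at 5000, and 100 at 10000, which is the trade-off against per-request memory described above.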
Metadata Fields / Elasticsearch Mappings
The PowerScale scanner adds all of its metadata under a pscale object field in each Elasticsearch document. These fields appear alongside the standard Diskover file metadata (name, path, size, timestamps, etc.) and are fully searchable.
Size Metrics
| Field Path | ES Type | Description |
|---|---|---|
| | long | File size in bytes |
| | long | Logical file size in bytes |
| | long | Actual physical storage consumed (includes protection overhead) |
| | long | Physical data size excluding protection overhead |
| | long | Storage consumed by data protection in bytes |
| | float | Compression ratio (for compressed files) |
Data Protection
| Field Path | ES Type | Description |
|---|---|---|
| | keyword | Current protection policy (e.g., |
| | keyword | Target protection policy |
Storage Pools
| Field Path | ES Type | Description |
|---|---|---|
| | integer | Target data pool ID |
| | keyword | Target data pool name (e.g., |
| | integer | Target metadata pool ID |
| | keyword | Target metadata pool name |
SSD Status
| Field Path | ES Type | Description |
|---|---|---|
| | integer | SSD status code |
| | keyword | SSD status name (e.g., |
| | integer | SSD strategy code |
| | keyword | SSD strategy name (e.g., |
File Properties
| Field Path | ES Type | Description |
|---|---|---|
| | boolean | File data is compressed on-disk |
| | boolean | File blocks are deduplicated |
| | boolean | Deduplication is disabled for this file |
| | boolean | File is tiered to cloud storage via CloudPools |
| | boolean | File contains unallocated regions |
| | boolean | File data is stored directly in the inode |
| | boolean | File is packed |
| | boolean | File is an Alternate Data Stream |
| | keyword | Access pattern hint (e.g., streaming, random) |
Inode Information
| Field Path | ES Type | Description |
|---|---|---|
| | integer | Number of inode mirrors |
| | keyword | Parent inode identifier |
| | integer | Inode revision number |
Timestamps
| Field Path | ES Type | Description |
|---|---|---|
| | long / keyword | Access time (Unix timestamp and formatted date) |
| | long / keyword | Birth/creation time (Unix timestamp and formatted date) |
| | long / keyword | Change time (Unix timestamp and formatted date) |
| | long / keyword | Modification time (Unix timestamp and formatted date) |
Permissions
| Field Path | ES Type | Description |
|---|---|---|
| | keyword | Unix permission bitmask (e.g., |
| | integer | Unix owner UID |
| | integer | Unix group GID |
| | keyword | Owner username (requires |
| | keyword | Group name (requires |
| | keyword | ACL user (requires |
| | keyword | ACL group (requires |
| | keyword | ACL access control entries (requires |
Elasticsearch Mapping Definition
{
  "mappings": {
    "properties": {
      "pscale": {
        "type": "object"
      }
    }
  }
}
Example Indexed Document
{
  "name": "project_data.mxf",
  "path": "/mnt/isilon/data/projects/project_data.mxf",
  "extension": "mxf",
  "size": 10737418240,
  "type": "file",
  "pscale": {
    "size_physical": 11811160064,
    "size_logical": 10737418240,
    "size_protection": 1073741824,
    "protection_current": "+2d:1n",
    "protection_target": "+2d:1n",
    "pool_target_data_name": "hot_tier",
    "pool_target_metadata_name": "ssd_pool",
    "ssd_status_name": "complete",
    "ssd_strategy_name": "metadata",
    "file_is_compressed": false,
    "file_is_deduped": false,
    "file_is_smartlinked": false,
    "perms_unix_bitmask": "0644",
    "perms_unix_uid": 1000,
    "perms_unix_gid": 1000
  }
}
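The size fields in the example document can be combined to quantify protection overhead. The sketch below is illustrative only: `protection_overhead` is a hypothetical helper, and it relies on the relationship visible in this one example (physical size equals logical size plus protection size), which may not hold for compressed, deduplicated, or sparse files.

```python
def protection_overhead(pscale):
    """Return protection bytes as a fraction of logical bytes."""
    logical = pscale["size_logical"]
    if logical <= 0:
        raise ValueError("size_logical must be positive")
    return pscale["size_protection"] / logical


if __name__ == "__main__":
    # Values copied from the example indexed document above.
    doc = {
        "size_physical": 11811160064,
        "size_logical": 10737418240,
        "size_protection": 1073741824,
    }
    overhead = protection_overhead(doc)
    print(f"protection overhead: {overhead:.1%}")
    # In this example the three fields are additive.
    print(doc["size_logical"] + doc["size_protection"] == doc["size_physical"])
```

For the example file, 1 GiB of protection data on 10 GiB of logical data works out to a 10% overhead.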
Searching in Diskover
All PowerScale metadata fields are searchable using Diskover's standard query syntax. Fields are prefixed with pscale. in search queries.
Storage Tiering & Pool Queries
| Query | Description |
|---|---|
| | Files targeted to the hot storage tier |
| | Files targeted to the archive pool |
| | Files with both data and metadata on SSD |
| | Files with only metadata on SSD |
| | Files tiered to cloud via CloudPools |
Compression & Deduplication Queries
| Query | Description |
|---|---|
| | All compressed files |
| | All deduplicated files |
| | Non-deduplicated files on archive tier (candidates for dedup) |
| | Compressed files larger than 1 GB |
| | Files with a compression ratio of 2x or higher |
Data Protection Queries
| Query | Description |
|---|---|
| | Files with a specific protection level |
| | Files with more than 500 MB of protection overhead |
File Property Queries
| Query | Description |
|---|---|
| | Sparse files (common for databases and VMs) |
| | Small files stored inline in the inode |
| | Alternate Data Stream files |
Permission & Security Queries
| Query | Description |
|---|---|
| | Files with world-readable/writable permissions |
| | Files owned by root |
Combined Queries
| Query | Description |
|---|---|
| | Large compressed files on SSD |
| | Non-deduplicated files on archive (dedup candidates) |
| | Files on hot tier not modified in 90+ days (tiering candidates) |
Troubleshooting
Common Issues
| Issue | Cause | Solution |
|---|---|---|
| | Diskover server cannot reach the API host | Verify network connectivity, firewall rules, and that |
| | Path mapping misconfiguration: the translated OneFS path doesn't exist | Check your |
| | API host is slow or unreachable | Check cluster load, add more API hosts for load balancing, or increase network timeout |
| No | | Clear the |
| Missing ACL or username fields | | Set |
| Scan is very slow | Large directories with default pagination, or single API host bottleneck | Increase |
| High memory usage during scan | Very large directories combined with high | Reduce |
Verifying the API Server is Running
If you're experiencing connection issues, first verify the API server is running on your OneFS host:
# Reattach to the screen session
screen -r diskover
# If the session doesn't exist, check for running processes
ps aux | grep ps_scan
# If not running, restart the API server
cd /ps_scan_api_server/
screen -S diskover
python3 ps_scan_api_server.py
# Press CTRL+A then d to detach
Debug Logging
Enable debug-level logging to see detailed API request and response information:
python3 /opt/diskover/diskover.py --altscanner scandir_powerscale --loglevel DEBUG /mnt/isilon/data
Log File Locations
Linux: /var/log/diskover/diskover.log
Windows: Check the Diskover service logs or configured log location
Diagnosing API Connectivity
Step 1 — Test basic connectivity:
nc -zv powerscale-api.example.com 4242
Step 2 — Test the cluster stats endpoint:
curl "http://powerscale-api.example.com:4242/cluster_storage_stats"
Step 3 — Test a directory listing:
curl "http://powerscale-api.example.com:4242/ps_stat/list?path=/ifs/data&type=diskover&limit=10"
Step 4 — Test a single file stat:
curl "http://powerscale-api.example.com:4242/ps_stat/single?path=/ifs/data/testfile.txt&type=diskover"
If any of these fail, check DNS resolution (nslookup/dig), firewall rules, and confirm the ps_scan_server service is running.
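When the path being tested contains spaces or other special characters, hand-typed curl URLs are easy to get wrong. The sketch below builds correctly encoded URLs for the endpoints used in the steps above. The endpoint names and query parameters are the ones shown in this section; `build_api_url` itself is a hypothetical helper, and whether the server accepts `+` for spaces is an assumption to verify against your deployment.

```python
from urllib.parse import urlencode


def build_api_url(base_url, endpoint, **params):
    """Join base URL, endpoint, and URL-encoded query parameters."""
    url = base_url.rstrip("/") + "/" + endpoint.lstrip("/")
    # safe="/" keeps path separators readable; spaces are encoded as "+".
    query = urlencode(params, safe="/")
    return url + "?" + query if query else url


if __name__ == "__main__":
    base = "http://powerscale-api.example.com:4242"
    print(build_api_url(base, "cluster_storage_stats"))
    print(build_api_url(base, "ps_stat/list", path="/ifs/data", type="diskover", limit=10))
    print(build_api_url(base, "ps_stat/single", path="/ifs/data/my file.txt", type="diskover"))
```

The printed URLs can be pasted directly into the curl commands from Steps 2 through 4.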
Diagnosing Path Mapping Issues
If files are scanned but PowerScale metadata is missing or you're seeing 404 errors, the most likely cause is a path mapping mismatch. To verify:
1. Check what path the scanner is sending to the API by running with --loglevel DEBUG
2. Manually test the translated path against the API:
curl "http://powerscale-api:4242/ps_stat/single?path=/ifs/data/your/file.txt&type=diskover"
3. Confirm the file exists at that path on the PowerScale cluster
4. Ensure more specific path mappings appear before broader ones in the configuration
Support
Last Updated: April 2026
Diskover Data, Inc.