ElasticSearch Field Copier
Overview
The ElasticSearch Field Copier plugin migrates custom field data from one Diskover index to another. This is essential when you need to preserve metadata that was enriched by other plugins, workflows, or manual processes after rebuilding or re-indexing your storage.
When Diskover creates a new index during a scheduled scan, any custom fields that were added to the previous index—such as cost calculations, project codes, or classification data—don't automatically transfer to the new index. The ES Field Copier solves this by copying those field values from your source index to your target index, matching documents by their file path and name.
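The path-and-name matching described above can be pictured with a small in-memory sketch. This is only an illustration, not the plugin's actual implementation: it assumes documents are plain dictionaries carrying `parent_path` and `name` keys, and the helper names `doc_key` and `copy_fields_by_path` are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Doc = Dict[str, object]


def doc_key(doc: Doc) -> Tuple[str, str]:
    """Identify a document by its parent path and name (assumed field names)."""
    return (str(doc["parent_path"]), str(doc["name"]))


def copy_fields_by_path(source_docs: Iterable[Doc],
                        target_docs: List[Doc],
                        fields: List[str]) -> int:
    """Copy the listed fields onto target docs that share a path and name.

    Returns the number of target documents that received at least one field.
    """
    source_by_key = {doc_key(d): d for d in source_docs}
    updated = 0
    for target in target_docs:
        source = source_by_key.get(doc_key(target))
        if source is None:
            continue  # no document with this path and name in the source
        copied = False
        for field in fields:
            if field in source:
                target[field] = source[field]
                copied = True
        if copied:
            updated += 1
    return updated


# Hypothetical sample data: one file exists in both indices, one only in each.
old_index = [
    {"parent_path": "/mnt/projects", "name": "a.mov", "cost": 1.25},
    {"parent_path": "/mnt/projects", "name": "gone.mov", "cost": 0.5},
]
new_index = [
    {"parent_path": "/mnt/projects", "name": "a.mov"},
    {"parent_path": "/mnt/projects", "name": "new.mov"},
]
updated_count = copy_fields_by_path(old_index, new_index, ["cost"])
print(updated_count, new_index[0])
```

Only `a.mov` is present in both lists, so it is the only document that receives the `cost` value; files that appear in just one index are left untouched.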
Use Cases
Preserving Data After Re-Indexing
When you run scheduled re-indexing (weekly, monthly, or after storage changes), any custom metadata added to files since the last index would be lost. ES Field Copier preserves this data by copying fields from the previous index to the newly created one.
Copying Custom Fields from Workflow Processes
If you have external workflows that inject custom fields into your Diskover indices—such as project management integrations, data classification systems, or compliance tagging—ES Field Copier ensures those fields persist across index rebuilds.
Migrating Plugin-Enriched Metadata
After running plugins like Costs, MediaInfo, or custom integrations that add fields to your index, ES Field Copier lets you preserve that enriched metadata when you need to rebuild an index due to mapping changes or optimization.
Field Copier vs. Tag Copier
Diskover offers two plugins for preserving data across indices:
| Plugin | What It Copies | When to Use |
|---|---|---|
| ES Field Copier | Any custom Elasticsearch fields (costs, project codes, classifications, plugin data) | When preserving enriched metadata beyond tags |
| Tag Copier | Tag arrays specifically | When preserving workflow and organizational tags only |
Use ES Field Copier when you need to preserve plugin-generated fields or custom metadata. Use Tag Copier when you only need to preserve tags.
Installation
Prerequisites
Before installing the ES Field Copier plugin, ensure your environment meets these requirements:
| Component | Requirement |
|---|---|
| Diskover License | Professional Edition (PRO) or higher |
| Python | 3.9 or higher |
| Elasticsearch | 7.x or 8.x (as supported by your Diskover installation) |
The plugin has no external Python dependencies beyond what Diskover already requires.
Installation Steps
The ES Field Copier plugin is included with Diskover Professional Edition and higher. The plugin files are located in the post-index plugins directory:
Linux:
/opt/diskover/plugins_postindex/diskover_esfieldcopier/
Windows:
C:\Program Files\Diskover\plugins_postindex\diskover_esfieldcopier\
To verify the plugin is installed correctly, run the version check:
Linux:
python3 /opt/diskover/plugins_postindex/diskover_esfieldcopier/diskover_esfieldcopier.py --version
Windows:
python "C:\Program Files\Diskover\plugins_postindex\diskover_esfieldcopier\diskover_esfieldcopier.py" --version
You should see output displaying the plugin version number.
Configuration
Configuration is managed through the Diskover Admin Panel. Navigate to Settings > Plugins > Post Index > ES Field Copier to access the configuration options.
Sample Configuration in Diskover Admin:
This is the beginning of the sample configuration. The ES Field Copier plugin has many other configuration options, which are covered in detail below.
Configuration Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| maxthreads | Integer | 0 | Maximum processing threads. Set to |
| file_fields | Boolean | true | When enabled, copies fields from file documents. |
| directory_fields | Boolean | true | When enabled, copies fields from directory documents. |
| fields_to_copy | List | | List of field names to copy. Can be overridden on the command line with -f. |
| overwrite_existing | Boolean | true | When enabled, overwrites existing field values in the target index. |
Configuration Examples
Basic Cost Field Migration
This configuration copies cost-related fields that were added by the Costs plugin:
maxthreads: 0
file_fields: true
directory_fields: true
fields_to_copy:
  - cost
  - cost_per_gb
overwrite_existing: true
Project and Department Fields
Copy organizational metadata fields for project tracking:
maxthreads: 0
file_fields: true
directory_fields: true
fields_to_copy:
  - project_code
  - department
  - business_unit
overwrite_existing: true
Files-Only Configuration
When your custom fields only apply to files (not directories), disable directory field copying to improve performance:
maxthreads: 4
file_fields: true
directory_fields: false
fields_to_copy:
  - checksum
  - mediainfo
overwrite_existing: true
Preserve Existing Values (No Overwrite)
When you want to copy fields only if the target document doesn't already have a value, disable overwriting. This is useful when merging data from multiple sources:
maxthreads: 0
file_fields: true
directory_fields: true
fields_to_copy:
  - custom_classification
  - retention_policy
overwrite_existing: false
Understanding the Overwrite Setting
The overwrite_existing parameter determines what happens when the target document already has a value for a field being copied:
| Setting | Behavior |
|---|---|
| true | Always copies the source value, replacing any existing value in the target |
| false | Only copies the source value if the target field is empty or missing |
Set overwrite_existing to false when you want to:
Preserve manually-edited values in the target index
Avoid overwriting values set by another process
Merge data from multiple sources without losing existing values
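The two overwrite behaviors can be sketched as a small pure function. This is a minimal illustration under stated assumptions rather than the plugin's code: the function name `apply_fields` is hypothetical, and "empty" is assumed to mean a missing key, `None`, an empty string, or an empty list.

```python
from typing import Dict, List


def apply_fields(source: Dict[str, object],
                 target: Dict[str, object],
                 fields: List[str],
                 overwrite_existing: bool) -> Dict[str, object]:
    """Return a copy of target with the requested fields copied from source."""
    result = dict(target)
    for field in fields:
        if field not in source:
            continue  # nothing to copy for this field
        # Assumption: "empty" means missing, None, "" or [] in the target.
        target_is_empty = result.get(field) in (None, "", [])
        if overwrite_existing or target_is_empty:
            result[field] = source[field]
    return result


source_doc = {"retention_policy": "7y", "custom_classification": "internal"}
target_doc = {"retention_policy": "10y"}
fields = ["retention_policy", "custom_classification"]

kept = apply_fields(source_doc, target_doc, fields, overwrite_existing=False)
replaced = apply_fields(source_doc, target_doc, fields, overwrite_existing=True)
print(kept)
print(replaced)
```

With overwriting disabled, the target keeps its existing `retention_policy` and only gains the missing `custom_classification`; with overwriting enabled, both values come from the source.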
Execution
The ES Field Copier can be run manually from the command line or automatically as part of your indexing workflow.
Manual Execution
Basic Syntax
diskover_esfieldcopier.py [OPTIONS] <source_index> <target_index>
Command-Line Options
| Option | Long Form | Description |
|---|---|---|
| -a | | Automatically find the source index based on the target index's top paths |
| -c | | Use a named configuration defined in Diskover Admin |
| -f | | Specify field(s) to copy (can use multiple times); overrides configuration |
| -v | | Enable verbose logging for detailed output |
| | --version | Display the plugin version and exit |
Examples
Copy a single field between explicitly named indices:
Linux:
python3 /opt/diskover/plugins_postindex/diskover_esfieldcopier/diskover_esfieldcopier.py -f cost diskover-2024.12.01 diskover-2025.01.15
Windows:
python "C:\Program Files\Diskover\plugins_postindex\diskover_esfieldcopier\diskover_esfieldcopier.py" -f cost diskover-2024.12.01 diskover-2025.01.15
Copy multiple fields:
Linux:
python3 /opt/diskover/plugins_postindex/diskover_esfieldcopier/diskover_esfieldcopier.py -f cost -f project_code -f department diskover-old diskover-new
Windows:
python "C:\Program Files\Diskover\plugins_postindex\diskover_esfieldcopier\diskover_esfieldcopier.py" -f cost -f project_code -f department diskover-old diskover-new
Auto-discover the source index:
When using the -a flag, the plugin automatically finds the previous index that shares the same top paths as your target index. You only need to specify the target index:
Linux:
python3 /opt/diskover/plugins_postindex/diskover_esfieldcopier/diskover_esfieldcopier.py -a -f cost diskover-2025.01.15
Windows:
python "C:\Program Files\Diskover\plugins_postindex\diskover_esfieldcopier\diskover_esfieldcopier.py" -a -f cost diskover-2025.01.15
Run with verbose output for troubleshooting:
Linux:
python3 /opt/diskover/plugins_postindex/diskover_esfieldcopier/diskover_esfieldcopier.py -v -f cost -f project_code diskover-old diskover-new
Windows:
python "C:\Program Files\Diskover\plugins_postindex\diskover_esfieldcopier\diskover_esfieldcopier.py" -v -f cost -f project_code diskover-old diskover-new
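When these commands are launched from a wrapper script, it can help to assemble the argument list programmatically. The sketch below is one possible approach, not part of Diskover: `build_command` is a hypothetical helper, it assumes the Linux install path shown above, and it uses only the `-a`, `-f`, and `-v` flags documented in this article.

```python
from typing import List, Optional, Sequence

# Linux install path from this article; adjust for Windows installs.
PLUGIN = ("/opt/diskover/plugins_postindex/"
          "diskover_esfieldcopier/diskover_esfieldcopier.py")


def build_command(target_index: str,
                  source_index: Optional[str] = None,
                  fields: Sequence[str] = (),
                  verbose: bool = False,
                  python: str = "python3") -> List[str]:
    """Build the argv list for one ES Field Copier run.

    When no source index is given, -a is added so the plugin
    auto-discovers the previous index.
    """
    cmd = [python, PLUGIN]
    if verbose:
        cmd.append("-v")
    if source_index is None:
        cmd.append("-a")
    for field in fields:
        cmd += ["-f", field]  # -f may be repeated, once per field
    if source_index is not None:
        cmd.append(source_index)
    cmd.append(target_index)
    return cmd


print(build_command("diskover-2025.01.15", fields=["cost"]))
print(build_command("diskover-new", "diskover-old",
                    fields=["cost", "project_code"], verbose=True))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; passing a list rather than a single string avoids shell quoting problems with paths that contain spaces.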
Automated Execution
For production environments, you'll typically want to run ES Field Copier automatically after each indexing operation. Diskover provides two methods for automated execution.
Method 1: Post-Crawl Command (Recommended)
Configure ES Field Copier to run automatically after an Index Task completes by adding it as a Post-Crawl Command.
In the Diskover Admin Panel, navigate to your Index Task configuration and add the following:
Linux Example:
Field | Value |
|---|---|
Post-Crawl Command |
|
Post-Crawl Command Args |
|
Windows Example:
Field | Value |
|---|---|
Post-Crawl Command |
|
Post-Crawl Command Args |
|
The {indexname} token is automatically replaced with the name of the index that was just created.
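Conceptually, the token replacement amounts to a string substitution performed before the command is run. The following sketch illustrates the idea only; how Diskover performs the substitution internally is not described here, `expand_args` is a hypothetical helper, and the argument template is an example built from the flags documented in this article.

```python
import shlex
from typing import List


def expand_args(args_template: str, index_name: str) -> List[str]:
    """Replace the {indexname} token, then split the result into arguments."""
    expanded = args_template.replace("{indexname}", index_name)
    return shlex.split(expanded)


# Hypothetical argument template and index name, for illustration only.
args = expand_args("-a -f cost {indexname}", "diskover-2025.01.15")
print(args)
```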
Sample Post-Crawl Command configuration for ES Field Copier executing with an Index Task:
On your system, replace the ConfigurationName shown above with a named configuration that you created at Diskover Admin → Plugins → Post-Index → ES Field Copier. If you are using only the Default configuration rather than a custom one, the -c flag and the ConfigurationName are not required.
Method 2: Custom Task
Create a standalone Custom Task that can be scheduled or triggered manually.
In the Diskover Admin Panel, navigate to Task Panel > Custom Tasks and create a new task:
Linux Example:
Field | Value |
|---|---|
Command |
|
Arguments |
|
Windows Example:
Field | Value |
|---|---|
Command |
|
Arguments |
|
Replace <target_index_name> with your actual target index name, or use the -a flag with index naming patterns that allow auto-discovery.
Sample Custom Task Configuration:
This shows the Run Command and arguments needed for the Custom Task. Because a Custom Task does not create an index, the {indexname} variable is not available here; use the -l (toppath) CLI option and pass in your top path instead.
What to Expect During Execution
When the ES Field Copier runs, it performs these steps:
Validates indices — Confirms both source and target indices exist
Migrates mappings — Checks for missing field mappings in the target index and adds them
Queries source index — Searches for documents that have values in the specified fields
Processes in batches — Groups documents into batches of 500 for efficient processing
Updates target index — Finds matching documents in the target index and copies field values
With verbose logging enabled (-v), you'll see progress messages showing thread activity and document counts.
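The batching step can be sketched with a small generator. This is an illustrative sketch, not the plugin's source: the `batched` helper is hypothetical, and only the batch size of 500 comes from the steps above.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")
BATCH_SIZE = 500  # batch size stated in the steps above


def batched(items: Iterable[T], size: int = BATCH_SIZE) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# 1234 hypothetical document IDs split into batches.
doc_ids = range(1234)
batch_sizes = [len(batch) for batch in batched(doc_ids)]
print(batch_sizes)  # [500, 500, 234]
```

Working in fixed-size batches keeps each update request bounded regardless of how many documents carry the requested fields.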
Reviewing the Output
Successful Execution
When ES Field Copier completes successfully, you'll see a summary message indicating how many documents were updated:
Finished updating 15432 docs in 2m 34s
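If you capture the plugin output in a scheduler or monitoring script, the summary line can be parsed to record how many documents were updated. The sketch below assumes the line always follows the pattern of the example above (an optional minutes part followed by seconds); other duration formats, such as hours, are not covered, and `parse_summary` is a hypothetical helper.

```python
import re
from typing import Optional, Tuple

# Pattern derived from the single example line shown above.
SUMMARY = re.compile(r"Finished updating (\d+) docs in (?:(\d+)m )?(\d+)s")


def parse_summary(line: str) -> Optional[Tuple[int, int]]:
    """Return (docs_updated, elapsed_seconds), or None if the line does not match."""
    match = SUMMARY.search(line)
    if match is None:
        return None
    docs, minutes, seconds = match.groups()
    return int(docs), int(minutes or 0) * 60 + int(seconds)


result = parse_summary("Finished updating 15432 docs in 2m 34s")
print(result)  # (15432, 154)
```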
Log Messages
Log Message | Meaning |
|---|---|
| Plugin initialized successfully |
| Using |
| Auto-discovery succeeded |
| Main processing started |
| Processing completed successfully |
Verbose Output
With the -v flag, you'll see additional detail about the processing:
Log Message | Meaning |
|---|---|
| Elasticsearch query being executed |
| Worker thread processing a batch |
| Target index search completed |
| Batch update completed |
Verifying Field Values
After running ES Field Copier, you can verify that fields were copied by searching for documents with those field values in Diskover's search interface or by querying Elasticsearch directly:
curl -s "localhost:9200/<target_index>/_search" -H 'Content-Type: application/json' -d'
{
"size": 5,
"query": { "exists": { "field": "cost" } },
"_source": ["parent_path", "name", "cost"]
}'
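The same check can be scripted in Python using only the standard library. This sketch mirrors the curl request above and assumes an unauthenticated Elasticsearch at `http://localhost:9200`; clusters with TLS or authentication need additional setup. The function names are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict, List


def build_exists_query(field: str, size: int = 5) -> Dict[str, Any]:
    """Build the search body used above: documents where `field` exists."""
    return {
        "size": size,
        "query": {"exists": {"field": field}},
        "_source": ["parent_path", "name", field],
    }


def sample_docs(index: str, field: str,
                host: str = "http://localhost:9200") -> List[Dict[str, Any]]:
    """Return the _source of a few documents in `index` that have `field`."""
    request = urllib.request.Request(
        f"{host}/{index}/_search",
        data=json.dumps(build_exists_query(field)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return [hit["_source"] for hit in body["hits"]["hits"]]


print(json.dumps(build_exists_query("cost"), indent=2))
```

Calling `sample_docs("<target_index>", "cost")` against a live cluster returns up to five matching documents; an empty list suggests the field was not copied.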
Troubleshooting
No Documents Updated
Symptom: The plugin runs successfully but reports 0 documents updated.
Possible Causes and Solutions:
Source index has no documents with the specified fields
Verify the field name is spelled correctly (field names are case-sensitive)
Check that documents in the source index actually have values in those fields
Documents don't exist in both indices
Files must exist in both the source and target indices with matching paths
If files were added or removed between scans, they won't have matching documents
Document type mismatch
If your fields are only on files but file_fields is disabled, no documents will match
Verify your file_fields and directory_fields settings match your data
Index Not Found Error
Symptom: Error message: <index_name> no such index!
Solutions:
Verify the exact index name spelling (case-sensitive)
Ensure the target index has completed indexing before running ES Field Copier
Use the -a flag to let the plugin auto-discover the source index
Auto-Index Find Fails
Symptom: No previous index found! when using the -a flag
Solutions:
Ensure a previous index exists that scanned the same top-level paths
If the previous index was deleted, you'll need to manually specify both indices
Verify the index naming follows a pattern that allows auto-discovery
Fields Not Overwriting
Symptom: Existing field values in the target index are not being updated.
Solutions:
Verify overwrite_existing is set to true in your configuration
If using a named configuration (-c), confirm that configuration has the correct setting
Enabling Debug Logging
For detailed troubleshooting, run with the verbose flag:
Linux:
python3 /opt/diskover/plugins_postindex/diskover_esfieldcopier/diskover_esfieldcopier.py -v -f cost diskover-source diskover-target 2>&1 | tee esfieldcopier_debug.log
Windows:
python "C:\Program Files\Diskover\plugins_postindex\diskover_esfieldcopier\diskover_esfieldcopier.py" -v -f cost diskover-source diskover-target > esfieldcopier_debug.log 2>&1
This captures all output to a log file for review.
Support
For additional assistance with the ES Field Copier plugin:
Resource | Link |
|---|---|
Diskover Documentation | |
Diskover Support Portal |
Last Updated: April 2026