How to Copy Data from One Elasticsearch Server to Another with Reindexing
Migrating data between Elasticsearch servers is a common task when scaling, upgrading, or relocating clusters. This document covers one of the primary methods: reindexing data directly between clusters.
Prerequisites
- Access to both source and destination Elasticsearch clusters
- Elasticsearch 7.x or later (recommended for compatibility)
Migrating Using Reindex from Remote
Step 1: Enable Remote Reindexing on Destination Cluster
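Configure the destination cluster to accept remote reindex requests by adding the source host to the reindex.remote.whitelist setting. This is a static node setting, so it cannot be changed through PUT /_cluster/settings. Add it to elasticsearch.yml on the destination nodes that will coordinate the reindex, then restart those nodes:
reindex.remote.whitelist: ["source_host:9200"]
Important: always specify the port. Without it, validation will fail.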
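The port requirement is easy to miss when the whitelist is maintained by hand. The following is a minimal sketch of a pre-flight check, not part of Elasticsearch itself: the helper name missing_port is hypothetical, and it assumes entries of the form host:port, where the port may also be the * wildcard.

```python
def missing_port(entries):
    """Return the whitelist entries that lack an explicit ':<port>' suffix."""
    bad = []
    for entry in entries:
        host, sep, port = entry.rpartition(":")
        # A usable entry has a non-empty host and a numeric or wildcard port.
        if not sep or not host or not (port.isdigit() or port == "*"):
            bad.append(entry)
    return bad


print(missing_port(["source_host:9200", "source_host"]))  # prints ['source_host']
```

The check is purely syntactic. It does not confirm that the host is reachable or that the destination nodes have loaded the setting.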
Step 2: Run Reindex Command
Run the reindex request on the destination cluster:
POST /_reindex?wait_for_completion=false&pretty
{
  "source": {
    "remote": {
      "host": "http://source_host:9200",
      "username": "",
      "password": "",
      "socket_timeout": "2m",
      "connect_timeout": "30s"
    },
    "index": "source_index",
    "size": 1000
  },
  "dest": {
    "index": "destination_index",
    "op_type": "create"
  },
  "conflicts": "proceed"
}
The example uses a batch size of 1000 documents (the size field), which is a steady starting point. If CPU and heap usage on the nodes stay stable, you can increase it later.
The response contains the task ID:
{
  "task": "task_id"
}
This method streams data directly between clusters.
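The same request can be scripted. The sketch below uses only the Python standard library and assumes an unauthenticated destination cluster. The function names build_reindex_body and start_reindex are hypothetical, and the host and index names are the placeholders from this post. Credentials for the remote source are left out and would go in the remote block.

```python
import json
import urllib.request


def build_reindex_body(source_host, source_index, dest_index,
                       batch_size=1000, query=None):
    """Build the JSON body for a reindex-from-remote request."""
    source = {
        "remote": {
            "host": source_host,
            "socket_timeout": "2m",
            "connect_timeout": "30s",
        },
        "index": source_index,
        "size": batch_size,  # documents fetched per batch
    }
    if query is not None:
        source["query"] = query  # optional filter to copy only a subset
    return {
        "source": source,
        "dest": {"index": dest_index, "op_type": "create"},
        "conflicts": "proceed",
    }


def start_reindex(dest_url, body):
    """Submit the reindex to the destination cluster and return the task ID."""
    request = urllib.request.Request(
        f"{dest_url}/_reindex?wait_for_completion=false",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["task"]
```

A call could look like start_reindex("http://destination_host:9200", build_reindex_body("http://source_host:9200", "source_index", "destination_index")), where destination_host is an assumed name for the destination cluster.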
Step 3: Monitor Progress and Health
A reindex task can run for a long time. You can monitor it with the following call:
GET /_tasks/<task_id>?pretty
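Polling the task by hand gets tedious for long migrations. The sketch below estimates progress from the task status by adding the created, updated, and deleted counters and comparing the sum with total. The function names summarize_task and wait_for_task are hypothetical, and the code assumes an unauthenticated destination cluster.

```python
import json
import time
import urllib.request


def summarize_task(response):
    """Reduce a GET /_tasks/<task_id> response to a small progress summary."""
    status = response["task"]["status"]
    total = status.get("total", 0)
    processed = (status.get("created", 0)
                 + status.get("updated", 0)
                 + status.get("deleted", 0))
    percent = round(100.0 * processed / total, 1) if total else 0.0
    return {
        "completed": response.get("completed", False),
        "processed": processed,
        "total": total,
        "percent": percent,
    }


def wait_for_task(dest_url, task_id, interval_seconds=30):
    """Poll the task API until the task reports completion."""
    while True:
        with urllib.request.urlopen(f"{dest_url}/_tasks/{task_id}") as response:
            summary = summarize_task(json.load(response))
        print(f"{summary['processed']}/{summary['total']} docs ({summary['percent']}%)")
        if summary["completed"]:
            return summary
        time.sleep(interval_seconds)
```

With conflicts set to proceed, skipped documents are not counted as processed, so the percentage can stay below 100 even after the task completes.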
Additional Tips
- Always test migration procedures on a staging environment.
- Use a query in the source block to copy only the documents you need from a large data set.
- Verify data integrity after transfer.
- Consider pausing indexing and other write operations on the source during migration to prevent data inconsistency.
- Monitor cluster health throughout the process.
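One simple way to verify data integrity after the transfer is to compare document counts on both sides. The sketch below uses the _count endpoint. The function names fetch_count and count_report are hypothetical, and the code assumes both clusters are reachable without authentication.

```python
import json
import urllib.request


def fetch_count(base_url, index):
    """Return the document count reported by GET /<index>/_count."""
    with urllib.request.urlopen(f"{base_url}/{index}/_count") as response:
        return json.load(response)["count"]


def count_report(source_count, dest_count):
    """Compare source and destination document counts."""
    missing = source_count - dest_count
    return {
        "source": source_count,
        "destination": dest_count,
        "missing": missing,
        "match": missing == 0,
    }
```

Matching counts are a necessary check, not a sufficient one. They do not prove that document contents are identical, and a filtered reindex is expected to produce a lower count on the destination.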
Other Methods
For whole-index moves, snapshot and restore from the source to the destination cluster is usually faster than remote reindex and puts less load on both clusters.
Conclusion
Migrating Elasticsearch data can be accomplished efficiently using reindexing. Choose the method based on your data size, infrastructure, and downtime constraints. Proper planning ensures a smooth transition with minimal disruption.