Elasticsearch - GCS Repository Configuration

Updated: May 31


Introduction

This article walks you through the steps required to configure snapshot and restore using a Google Cloud Storage bucket as the repository. I have kept the steps as simple as possible by omitting most of the detail found in the official documentation. This guide helps users who want to get snapshots working without going in depth.


Pre-requisites

If you already have access to Elasticsearch and Kibana, navigate to 'Stack Management' > 'Snapshot and Restore' and look at the options available. Here is a step-by-step outline of what needs to be done.

  1. Create a bucket in Google Cloud Storage.

  2. Create a service account and assign it the Storage Admin role (roles/storage.admin).

  3. Download the JSON key for the service account.

  4. Install the GCS repository plugin in Elasticsearch.

  5. Add the service account JSON to the keystore.

  6. Add the repository from the Kibana Stack Management console.

  7. Create a snapshot and verify that it succeeded.

Create a bucket in Google Cloud Storage

  • Access the Google Cloud console.

  • Go to 'Storage' > 'Cloud Storage' > 'Browser'.

  • Click on 'Create Bucket' and fill in the required fields down to the bottom of the form. For a first test, we will name it 'elasticbucket'. Bucket names are globally unique, so you may need to pick a different name.
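
If you prefer the command line to the console, the same bucket could probably be created with the gcloud CLI. This is only a minimal sketch: it assumes the Google Cloud SDK is installed and authenticated, the location 'us-central1' is a placeholder of mine, and the create_bucket helper is a hypothetical name used for illustration.

```shell
# Assumed values: adjust them to your own environment.
BUCKET="elasticbucket"      # bucket name used throughout this article
LOCATION="us-central1"      # assumption: choose a region close to your cluster

# Hypothetical helper wrapping the gcloud call; nothing runs until it is called.
create_bucket() {
  gcloud storage buckets create "gs://${BUCKET}" --location="${LOCATION}"
}
```

Call create_bucket once to create the bucket.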


Create a service account and assign the Storage Admin role

  • Go to 'IAM & Admin' > 'Service Accounts'.

  • Click on 'Create Service Account' and fill in the required fields down to the bottom of the form.

  • Assign the Storage Admin role to the service account. Alternatively, you could grant access at the bucket or object level if you want narrower permissions.
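
The same two steps can be sketched with the gcloud CLI. The project ID 'my-project', the account name 'elastic-snapshots' and the create_service_account helper are all assumptions of mine for illustration, not values from this setup.

```shell
PROJECT_ID="my-project"          # assumption: your Google Cloud project ID
SA_NAME="elastic-snapshots"      # assumption: hypothetical service account name
SA_EMAIL="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"

# Hypothetical helper: creates the account, then binds the Storage Admin role.
create_service_account() {
  gcloud iam service-accounts create "${SA_NAME}" \
    --project="${PROJECT_ID}" \
    --display-name="Elasticsearch snapshots"
  gcloud projects add-iam-policy-binding "${PROJECT_ID}" \
    --member="serviceAccount:${SA_EMAIL}" \
    --role="roles/storage.admin"
}
```

The role is bound at project level here, which matches the console steps above. A bucket-level binding would be the narrower option.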

Download the JSON key from the service account

  • Once the service account is created, you are returned to the Service accounts page.

  • Click the three dots next to the account and select 'Manage Keys'.

  • Click on 'Add Key' and then 'Create new key'.

  • Select JSON as the key type and click on 'Create'.

  • Once you click 'Create', a JSON file is downloaded to your system. Keep it safe, since it contains sensitive information and could be misused if stolen.

  • Copy the key to any location on the Elasticsearch host's file system. For example, on CentOS 7, paste the content of the downloaded file into a file under /tmp. Elasticsearch reads the service account details it needs from this JSON, such as the project ID, account type and private key.

cd /tmp                                            #Navigate to directory
vi elastic-gcs-key.json                            #Create JSON file
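
As an alternative to downloading and pasting the file, the key could be generated directly on the host with gcloud. This sketch assumes gcloud is installed and authenticated on that host. The service account email is a made-up example and create_key is a hypothetical helper name.

```shell
KEY_FILE="/tmp/elastic-gcs-key.json"                              # path used in this article
SA_EMAIL="elastic-snapshots@my-project.iam.gserviceaccount.com"   # assumption: example account

# Hypothetical helper: writes a new JSON key and restricts who can read it.
create_key() {
  gcloud iam service-accounts keys create "${KEY_FILE}" --iam-account="${SA_EMAIL}"
  chmod 600 "${KEY_FILE}"    # only the file owner may read the private key
}
```

Either way, the result should be a JSON file at the path that the keystore step reads from.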

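Install the Google Cloud Storage repository plugin in Elasticsearch

  • Connect to the Elasticsearch host and install the GCS repository plugin. On Elasticsearch 8.0 and later, the GCS repository type ships with Elasticsearch, so this step is only needed on older versions.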

cd /usr/share/elasticsearch/bin                    #Navigate to directory
./elasticsearch-plugin install repository-gcs      #Install plugin
systemctl restart elasticsearch                    #Restart Elasticsearch
systemctl status elasticsearch                     #Check status

  • Verify the plugin installation by running the command below.

cd /usr/share/elasticsearch/bin                    #Navigate to directory
./elasticsearch-plugin list                        #Plugin Listing

Add the service account JSON to the keystore

  • Assuming the JSON file has been copied to a location on the local disk, add it to the Elasticsearch keystore with the command below so that the GCS client can read the service account credentials. The keystore is local to each node, so run this on every node in the cluster.

cd /usr/share/elasticsearch/bin                     #Navigate to directory
./elasticsearch-keystore add-file gcs.client.default.credentials_file /tmp/elastic-gcs-key.json     #Add credentials
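
To confirm that the entry was stored and to make a running node pick it up, something like the following could be used. It is a sketch that assumes Elasticsearch listens on http://localhost:9200 without authentication and that the keystore has no password. The helper names are hypothetical. If the reload call does not apply in your version, restarting Elasticsearch has the same effect.

```shell
ES_BIN="/usr/share/elasticsearch/bin"   # RPM install location used in this article
ES_URL="http://localhost:9200"          # assumption: local node, no security enabled

# Hypothetical helper: lists the setting names stored in the keystore.
list_keystore() {
  "${ES_BIN}/elasticsearch-keystore" list
}

# Hypothetical helper: asks the nodes to re-read their secure settings.
reload_secure_settings() {
  curl -s -X POST "${ES_URL}/_nodes/reload_secure_settings?pretty"
}
```

The output of list_keystore should include gcs.client.default.credentials_file.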

Add the repository from Kibana stack management console

  • Log in to Kibana and go to 'Stack Management' > 'Snapshot and Restore'.

  • Click on the 'Repositories' tab and then on 'Register repository'.

  • Add a Repository Name of your choice, for example 'gcsrepository1'.

  • Select Repository Type : Google Cloud Storage

  • Optionally enable 'Source-only snapshots'.

A source-only snapshot can reduce storage consumption by up to 50%, because it stores only index metadata and stored fields. After restoring such a snapshot, the data is not searchable until you reindex it into a new index.

  • Click 'Next'. The settings on the next page are the crucial ones, so fill them in carefully.

  • Fill in the required information. I have tried to simplify each field below.

Client: The name of the Google Cloud Storage client. Leave it at the default, since the credentials were added to the keystore under the 'default' client name.

Bucket: The exact name of the bucket you created, which is 'elasticbucket' in this case.

Base Path: The name of the folder within the bucket where snapshots are kept. You can type any name here and Elasticsearch will create it for you. For example, if we enter 'elasticsnapshot', the folder appears automatically once the registration is finished.

Bucket and Base Path are easier to understand by looking at the bucket's gsutil URI. Here gs://bucket/basepath is gs://elasticbucket/elasticsnapshot

Compress Snapshots: Compresses the snapshot metadata files, such as index mappings and settings. Index data files are not affected by this option.

Chunk Size: Splits large files into chunks of this size while taking the snapshot, if needed.

Max snapshot bytes per second: Limits the snapshot rate per node. Set it if you want to keep snapshots from affecting other network traffic.

Max restore bytes per second: The same limit, applied while restoring a snapshot.

Read-Only: If you have only one cluster, do not enable it. If more than one cluster uses the same repository, only one cluster should have write access and the rest should register it as read-only.

  • Click on 'Register'

  • A pane should appear on the right side of the screen. Click on 'Verify repository'.

  • The verification status should be 'Connected'.



Alternatively, you could register the repository from Kibana Dev Tools.

PUT _snapshot/gcsrepository1
{
    "type": "gcs",
    "settings": {
        "bucket": "elasticbucket",
        "base_path": "elasticsnapshot"
    }
}

  • The response should be "acknowledged": true
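
The same registration can also be sketched with curl, this time including the optional settings described earlier, followed by a verification call. The URL assumes a local node without authentication, the 40mb throttle values are illustrative only, and the helper names are hypothetical.

```shell
ES_URL="http://localhost:9200"   # assumption: local node, no security enabled
REPO="gcsrepository1"            # repository name used in this article

# Hypothetical helper: registers the repository with the optional settings spelled out.
register_repository() {
  curl -s -X PUT "${ES_URL}/_snapshot/${REPO}?pretty" \
    -H 'Content-Type: application/json' \
    -d '{
      "type": "gcs",
      "settings": {
        "client": "default",
        "bucket": "elasticbucket",
        "base_path": "elasticsnapshot",
        "compress": true,
        "max_snapshot_bytes_per_sec": "40mb",
        "max_restore_bytes_per_sec": "40mb",
        "readonly": false
      }
    }'
}

# Hypothetical helper: checks that the nodes can reach the bucket.
verify_repository() {
  curl -s -X POST "${ES_URL}/_snapshot/${REPO}/_verify?pretty"
}
```

Run register_repository first, then verify_repository. The verify call does the same job as the 'Verify repository' button in Kibana.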

Verify the snapshot

  • Once a snapshot has been taken, it should be listed under the 'Snapshots' tab.

  • Go to the Cloud Storage bucket. The snapshot files and indices should be listed there as well.
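
If no snapshot exists yet, one could be created and checked from the command line along these lines. This is a sketch under the same assumptions as before: a local node without authentication and the gsutil tool installed. The snapshot name 'snapshot_1' and the helper names are hypothetical.

```shell
ES_URL="http://localhost:9200"   # assumption: local node, no security enabled
REPO="gcsrepository1"            # repository name used in this article
SNAPSHOT="snapshot_1"            # assumption: hypothetical snapshot name

# Hypothetical helper: takes a snapshot of all indices and waits until it finishes.
create_snapshot() {
  curl -s -X PUT "${ES_URL}/_snapshot/${REPO}/${SNAPSHOT}?wait_for_completion=true&pretty"
}

# Hypothetical helper: lists the snapshots stored in the repository.
list_snapshots() {
  curl -s "${ES_URL}/_cat/snapshots/${REPO}?v"
}

# Hypothetical helper: shows what Elasticsearch wrote to the bucket.
list_bucket() {
  gsutil ls "gs://elasticbucket/elasticsnapshot/"
}
```

A successful run of create_snapshot should report a state of SUCCESS in its response, and list_bucket should then show files under the base path.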


Point to consider

I assume that you have deployed Elasticsearch on a Linux distribution and installed it from the RPM package. On a different distribution or install method, the Linux commands and file system paths used here may need small changes.


Thank you for reading. If you encounter any errors while following the steps, I will be happy to help!

Please let me know how I could make it even better :)




