Elasticsearch with gluster-block

This is an updated version of my previous post about using Gluster block storage with Elasticsearch.

In this post, I introduce the gluster-block utility and demonstrate how simple it is to use gluster block storage with the Elasticsearch engine.

Introduction to gluster block

gluster-block is a block device management framework that aims to make creating and maintaining gluster-backed block storage as simple as possible. gluster-block provisions block devices and exports them using iSCSI. More details follow as you work through this post.

Read More about gluster-block here

Note: I used four Fedora 25 machines to create this how-to.
Setup at a glance:

  1. We use Node1, Node2 and Node3 to create the gluster volume, and the same nodes export the block storage target backed by that volume.
  2. On Node4, we enable and configure multipath, then discover and log in to the individual target portals exported from Node1, Node2 and Node3. Finally, we configure and run Elasticsearch.

Gluster block storage setup

Pre-requisites

  • A gluster volume in a trusted storage pool (3 nodes; the same nodes also export the block device)

Creating a block device

Install gluster-block
# dnf config-manager --add-repo https://copr.fedorainfracloud.org/coprs/pkalever/gluster-block/repo/fedora-25/pkalever-gluster-block-fedora-25.repo 
# dnf install gluster-block

# systemctl start gluster-blockd
# systemctl status gluster-blockd

Create a 40GiB block device (exported from the same nodes that host the gluster volume)
# gluster-block create sampleVol/elasticBlock ha 3 10.70.35.109,10.70.35.104,10.70.35.51 40GiB
IQN: iqn.2016-12.org.gluster-block:c1029cc3-7c40-48a0-94bf-16c1b4fad254
PORTAL(S): 10.70.35.109:3260 10.70.35.104:3260 10.70.35.51:3260
RESULT: SUCCESS

# gluster-block list sampleVol
elasticBlock 

# gluster-block info sampleVol/elasticBlock 
NAME: elasticBlock
VOLUME: sampleVol
GBID: c1029cc3-7c40-48a0-94bf-16c1b4fad254
SIZE: 42949672960
HA: 3
BLOCK CONFIG NODE(S): 10.70.35.109 10.70.35.51 10.70.35.104

# gluster-block help
gluster-block (0.1)
usage:
 gluster-block <command> <volname[/blockname]> [<args>]

commands:
 create <volname/blockname> [ha <count>] <host1[,host2,...]> <size>
 create block device.

 list <volname>
 list available block devices.

 info <volname/blockname>
 details about block device.

 delete <volname/blockname>
 delete block device.

 help
 show this message and exit.

 version
 show version info and exit.

Initiator side setup  (on Elasticsearch node) (NODE 4)

# dnf install iscsi-initiator-utils

Multipathing to achieve high availability
# mpathconf 
multipath is enabled
find_multipaths is enabled
user_friendly_names is enabled
dm_multipath module is not loaded
multipathd is not running

# modprobe dm_multipath
# lsmod | grep dm_multipath
dm_multipath 24576 0

# mpathconf --enable

# mpathconf 
multipath is enabled
find_multipaths is enabled
user_friendly_names is enabled
dm_multipath module is loaded
multipathd is running

# cat >> /etc/multipath.conf
# LIO iSCSI
devices {
        device {
                vendor "LIO-ORG"
                user_friendly_names "yes" # names like mpatha
                path_grouping_policy "failover" # one path per group
                path_selector "round-robin 0"
                path_checker "tur"
                prio "const"
                rr_weight "uniform"
        }
}
^D   (Ctrl+D ends the input and returns to the prompt)
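
The same stanza can also be appended non-interactively with a here-document, which avoids typing into cat by hand. This is only a minimal sketch: the append_lio_device helper name and the CONF variable are made up for illustration, and the demo writes to a scratch file unless CONF is set.

```shell
#!/bin/sh
# Hypothetical helper: append the LIO-ORG device stanza to the file given as $1.
append_lio_device() {
    cat >> "$1" <<'EOF'
# LIO iSCSI
devices {
        device {
                vendor "LIO-ORG"
                user_friendly_names "yes" # names like mpatha
                path_grouping_policy "failover" # one path per group
                path_selector "round-robin 0"
                path_checker "tur"
                prio "const"
                rr_weight "uniform"
        }
}
EOF
}

# CONF defaults to a scratch file so the sketch is safe to try;
# run with CONF=/etc/multipath.conf (as root) to change the real config.
CONF=${CONF:-$(mktemp)}
append_lio_device "$CONF"
echo "stanza appended to $CONF"
```

Because the function appends, running it twice against the same file duplicates the stanza, so it is meant to be run once per host.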

# systemctl restart multipathd

Check existing block devices
# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 252:0 0 40G 0 disk 
├─vda2 252:2 0 39G 0 part 
│ ├─fedora-swap 253:1 0 4G 0 lvm [SWAP]
│ └─fedora-root 253:0 0 15G 0 lvm /
└─vda1 252:1 0 1G 0 part /boot

Discovery and login to target
# iscsiadm --mode discovery --type sendtargets --portal 10.70.35.109 -l
# iscsiadm --mode discovery --type sendtargets --portal 10.70.35.104 -l
# iscsiadm --mode discovery --type sendtargets --portal 10.70.35.51 -l
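
With more portals, the three commands above are easier to generate from a list. The sketch below only prints the iscsiadm command lines (a dry run) so they can be reviewed first; the discovery_cmds helper name and the PORTALS variable are illustrative, while the iscsiadm options are exactly the ones used above.

```shell
#!/bin/sh
# Portal addresses as reported by "gluster-block create".
PORTALS="10.70.35.109 10.70.35.104 10.70.35.51"

# Hypothetical helper: print one discovery-and-login command per portal argument.
discovery_cmds() {
    for portal in "$@"; do
        printf 'iscsiadm --mode discovery --type sendtargets --portal %s -l\n' "$portal"
    done
}

# $PORTALS is left unquoted on purpose so it splits into one argument per address.
discovery_cmds $PORTALS
```

Piping the output to sh as root (for example `sh portals.sh | sh`, where portals.sh is whatever name the sketch was saved under) would run the printed commands.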

# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdb 8:16 0 40G 0 disk 
└─mpatha 253:2 0 40G 0 mpath 
sdc 8:32 0 40G 0 disk 
└─mpatha 253:2 0 40G 0 mpath 
sda 8:0 0 40G 0 disk 
└─mpatha 253:2 0 40G 0 mpath 
[...]
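
Since the block was created with ha 3, mpatha should sit on top of three disks, one per portal. A rough way to check this from a script is to count the mpath lines in the lsblk output. The count_paths helper is an illustration only, and it assumes the column layout shown above (TYPE in the sixth column).

```shell
#!/bin/sh
# Hypothetical helper: count lsblk lines (read from stdin) whose TYPE column is
# "mpath" and whose NAME contains the given map name (substring match).
count_paths() {
    awk -v m="$1" '$6 == "mpath" && index($1, m) > 0 { n++ } END { print n + 0 }'
}

# Sample copied from the lsblk output above; on a live system use:
#   lsblk | count_paths mpatha
count_paths mpatha <<'EOF'
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdb 8:16 0 40G 0 disk
└─mpatha 253:2 0 40G 0 mpath
sdc 8:32 0 40G 0 disk
└─mpatha 253:2 0 40G 0 mpath
sda 8:0 0 40G 0 disk
└─mpatha 253:2 0 40G 0 mpath
EOF
```

For the sample this prints 3, matching the HA count of the block.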

# mkfs.xfs /dev/mapper/mpatha 
meta-data=/dev/mapper/mpatha isize=512  agcount=4, agsize=2621440 blks
         =                   sectsz=512 attr=2, projid32bit=1
         =                   crc=1      finobt=1, sparse=0
data     =                   bsize=4096 blocks=10485760, imaxpct=25
         =                   sunit=0    swidth=0 blks
naming   =version 2          bsize=4096 ascii-ci=0 ftype=1
log      =internal log       bsize=4096 blocks=5120, version=2
         = sectsz=512        sunit=0    blks, lazy-count=1
realtime =none               extsz=4096 blocks=0, rtextents=0
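
As a quick sanity check, the sizes reported so far agree with each other: 40GiB is 42949672960 bytes (the SIZE shown by gluster-block info), and the XFS data section reports 10485760 blocks of 4096 bytes. The arithmetic can be reproduced in the shell; the gib_to_bytes helper name is only for illustration.

```shell
#!/bin/sh
# Convert a size in GiB to bytes (1 GiB = 1024 * 1024 * 1024 bytes).
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

requested=$(gib_to_bytes 40)        # size passed to "gluster-block create"
xfs_bytes=$(( 10485760 * 4096 ))    # blocks * bsize from the mkfs.xfs data section

echo "requested: $requested"
echo "xfs data : $xfs_bytes"
```

Both lines print 42949672960, so the file system spans the whole block device.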

# mount /dev/mapper/mpatha /mnt/

# df -Th
Filesystem Type Size Used Avail Use% Mounted on
[...]
/dev/mapper/mpatha xfs 40G 33M 40G 1% /mnt

Elasticsearch configuration (Node 4)

Get the Elasticsearch release (5.0.2 is used here)
# dnf install https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.0.2.rpm

# dnf install jq

# /usr/share/elasticsearch/bin/elasticsearch-plugin install analysis-icu

Configure Elasticsearch to store its data and logs on the gluster block mount.
Uncomment and edit the parameters below to suit your setup:
# vi /etc/elasticsearch/elasticsearch.yml
cluster.name: gluster-block
node.name: blocktest-node
path.data: /mnt/data
path.logs: /mnt/logs

# mkdir /mnt/data /mnt/logs
# chown -R elasticsearch:elasticsearch /mnt/
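
If you prefer not to edit elasticsearch.yml by hand, the four settings can be written from a script. The sketch below is one possible approach, not something shipped with Elasticsearch: set_es_option is a hypothetical helper that drops any existing line for a key (commented out or not) and appends the new value at the end of the file. The demo works on a scratch file unless CONF is set.

```shell
#!/bin/sh
# Hypothetical helper: set "key: value" in a flat YAML settings file.
# Usage: set_es_option <file> <key> <value>
set_es_option() {
    file=$1 key=$2 value=$3
    tmp=$(mktemp)
    # Keep every line except an existing (possibly commented-out) entry for this key.
    # grep exits non-zero when nothing is left to print, hence "|| true".
    grep -v "^#* *${key}:" "$file" > "$tmp" || true
    printf '%s: %s\n' "$key" "$value" >> "$tmp"
    # Write back through the original file so its owner and mode are kept.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# CONF defaults to a scratch file; run with
# CONF=/etc/elasticsearch/elasticsearch.yml (as root) to change the real config.
CONF=${CONF:-$(mktemp)}
set_es_option "$CONF" cluster.name gluster-block
set_es_option "$CONF" node.name blocktest-node
set_es_option "$CONF" path.data /mnt/data
set_es_option "$CONF" path.logs /mnt/logs
cat "$CONF"
```

The helper moves a setting to the end of the file instead of editing it in place, which keeps the sketch short at the cost of separating the value from its explanatory comment in the stock file.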

# systemctl start elasticsearch.service 

Check the status
# systemctl status elasticsearch.service 


List the Indices
# curl -XGET http://localhost:9200/_cat/indices?v
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size

Now let’s create an index named "bank"
# curl -XPUT http://localhost:9200/bank?pretty 
{
 "acknowledged" : true,
 "shards_acknowledged" : true
}

Note that docs.count is 0 for the new index:

# curl -XGET http://localhost:9200/_cat/indices?v
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open bank hM-25KP6RvWrPJa2oG1o2g 5 1 0 0 650b 650b

Let’s now put something into our bank index.
In order to index a document, we must tell Elasticsearch which type in the index it should go to.
Let’s index a simple document into the bank index, "account" type, with an ID of 1 as follows:
# curl -XPUT http://localhost:9200/bank/account/1?pretty -d '
> {
> "account_number": "999120999",
> "name": "pkalever"
> }'
{
 "_index" : "bank",
 "_type" : "account",
 "_id" : "1",
 "_version" : 1,
 "result" : "created",
 "_shards" : {
 "total" : 2,
 "successful" : 1,
 "failed" : 0
 },
 "created" : true
}

The response shows that a new document was created in the bank index. Listing the indices again shows that docs.count is now 1:
# curl -XGET http://localhost:9200/_cat/indices?v
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open bank hM-25KP6RvWrPJa2oG1o2g 5 1 1 0 4.3kb 4.3kb
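
To read docs.count from a script instead of by eye, the listing can be filtered with awk. A small sketch, assuming the column order shown above (index name in column 3, docs.count in column 7); the docs_count helper name is illustrative.

```shell
#!/bin/sh
# Hypothetical helper: print docs.count for the named index,
# reading "_cat/indices?v" output from stdin.
docs_count() {
    awk -v idx="$1" '$3 == idx { print $7 }'
}

# Sample copied from the listing above; against a live node use:
#   curl -s 'http://localhost:9200/_cat/indices?v' | docs_count bank
docs_count bank <<'EOF'
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open bank hM-25KP6RvWrPJa2oG1o2g 5 1 1 0 4.3kb 4.3kb
EOF
```

For the sample this prints 1, the document indexed in the previous step.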

Query a document
# curl -XGET http://localhost:9200/bank/account/1?pretty
{
 "_index" : "bank",
 "_type" : "account",
 "_id" : "1",
 "_version" : 1,
 "found" : true,
 "_source" : {
 "account_number" : "999120999",
 "name" : "pkalever"
 }
}

Delete the entry
# curl -XDELETE http://localhost:9200/bank/account/1?pretty
{
 "found" : true,
 "_index" : "bank",
 "_type" : "account",
 "_id" : "1",
 "_version" : 2,
 "result" : "deleted",
 "_shards" : {
 "total" : 2,
 "successful" : 1,
 "failed" : 0
 }
}

If we study the above commands carefully, we can see a pattern in how we access data in Elasticsearch.
That pattern can be summarized as follows:
 <REST Verb> /<Index>/<Type>/<ID>
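
Each command used above pairs a REST verb with a path made of the index, the type and the document ID. The sketch below builds that path and prints the matching curl lines for the bank/account/1 document; es_url and ES_HOST are names chosen for this illustration.

```shell
#!/bin/sh
# Base address of the node; localhost:9200 is what this post uses.
ES_HOST=${ES_HOST:-http://localhost:9200}

# Hypothetical helper: build a document URL from index, type and ID.
es_url() {
    printf '%s/%s/%s/%s\n' "$ES_HOST" "$1" "$2" "$3"
}

# Only the verb changes between operations on the same document.
# (PUT additionally needs the document body passed with -d, as shown earlier.)
for verb in PUT GET DELETE; do
    echo "curl -X$verb '$(es_url bank account 1)?pretty'"
done
```

The sketch only prints the commands; nothing is sent to Elasticsearch.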

Also read about loading Wikipedia’s search index (first link under References).

Conclusion

This post shows how simple block storage becomes with the gluster-block utility. More details will follow in later posts.

References

https://www.elastic.co/blog/loading-wikipedia

https://www.elastic.co/guide/en/elasticsearch/reference/current/_basic_concepts.html

https://www.elastic.co/guide/en/elasticsearch/reference/current/_create_an_index.html

Previous posts on gluster block storage

https://pkalever.wordpress.com/2016/11/18/elasticsearch-with-gluster-block-storage/

https://pkalever.wordpress.com/2016/06/23/gluster-solution-for-non-shared-persistent-storage-in-docker-container/

https://pkalever.wordpress.com/2016/06/29/non-shared-persistent-gluster-storage-with-kubernetes/

https://pkalever.wordpress.com/2016/08/16/read-write-once-persistent-storage-for-openshift-origin-using-gluster/

https://pkalever.wordpress.com/2016/11/04/gluster-as-block-storage-with-qemu-tcmu/

3 thoughts on “Elasticsearch with gluster-block”

  1. The environment tested here is Fedora,
    but I am using RHEL-based Oracle Linux. Is gluster-block compatible with RHEL as well? What do I need to make it work?

    • Thanks Jason.
      Yes. gluster-block depends on tcmu-runner to export the backend file in a gluster volume as an iSCSI block device, and tcmu-runner needs the TCMU kernel module (target_core_user), which lets the backing storage be served from user space via the tcmu-runner service.

      I have not tested this on CentOS yet. But if tcmu-runner and targetcli >= 2.1.fb43-2 are available, gluster-block should work on CentOS as well. Otherwise you will need to bring in those packages or build them from source.

      If you plan to work on this, please post the result here 🙂
