Elasticsearch Tutorial

Introduction to Elasticsearch

Elasticsearch is a search server based on Apache Lucene. It was developed by Shay Banon and first released in 2010, and it is now maintained by Elasticsearch BV. At the time of writing, the latest version is 7.0.0.

Elasticsearch is a real-time, distributed, open-source full-text search and analytics engine. It is accessed through a RESTful web service interface and stores data as schema-less JSON (JavaScript Object Notation) documents. It is built on the Java programming language, so it runs on many different platforms. It lets users search very large volumes of data at high speed.
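As a rough illustration of the RESTful JSON interface described above, the sketch below builds (but does not send) two HTTP requests with the Python standard library: one that stores a document and one that searches for it. The address `http://localhost:9200`, the index name `tweets`, and the field names are assumptions made for the example, and the `_doc` endpoint follows the 7.x URL style.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # assumption: a node listening on the default local address


def build_index_request(index, doc_id, document):
    """Build a PUT request that stores one JSON document under the given ID."""
    body = json.dumps(document).encode("utf-8")
    return Request(
        url=f"{BASE_URL}/{index}/_doc/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def build_search_request(index, field, text):
    """Build a POST request that runs a full-text match query on one field."""
    query = {"query": {"match": {field: text}}}
    return Request(
        url=f"{BASE_URL}/{index}/_search",
        data=json.dumps(query).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical document; no schema has to be declared beforehand.
index_request = build_index_request("tweets", 1, {"user": "alice", "message": "hello world"})
search_request = build_search_request("tweets", "message", "hello")
```

Passing either request to `urllib.request.urlopen` would send it to a running node; that step is left out so the sketch runs without a server.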

Features

The general features of Elasticsearch are as follows:

Elasticsearch can scale up to petabytes of structured and unstructured data.
Elasticsearch can be used as a replacement for document stores such as MongoDB and RavenDB.
Elasticsearch uses denormalization to improve search performance.
Elasticsearch is one of the popular enterprise search engines and is currently used by many large organizations such as Wikipedia, The Guardian, Stack Overflow, and GitHub.
Elasticsearch is an open-source project released under the Apache License, version 2.0.
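To make the denormalization point in the list above concrete, here is a minimal sketch that copies author details into every post before indexing, so a search never needs a join. The `posts` and `users_by_id` structures and all field names are hypothetical and are not taken from any Elasticsearch API.

```python
def denormalize(posts, users_by_id):
    """Embed each author's fields in the post so one document answers the query."""
    documents = []
    for post in posts:
        author = users_by_id[post["user_id"]]
        documents.append({
            "title": post["title"],
            "body": post["body"],
            # The author data is duplicated in every post on purpose.
            "author": {"name": author["name"], "city": author["city"]},
        })
    return documents


# Hypothetical relational-style input: posts refer to users by ID.
users_by_id = {7: {"name": "Alice", "city": "Berlin"}}
posts = [
    {"user_id": 7, "title": "Hello", "body": "First post"},
    {"user_id": 7, "title": "Again", "body": "Second post"},
]
documents = denormalize(posts, users_by_id)
```

The trade-off is extra storage and the need to rewrite every post when an author's details change, in exchange for faster reads.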

Key Concepts

The key concepts of Elasticsearch are as follows:

Node

It refers to a single running instance of Elasticsearch. A single physical or virtual server can host multiple nodes, depending on its resources such as RAM, storage, and processing power.

Cluster

It is a collection of one or more nodes. A cluster provides collective indexing and search functionality across all nodes for all data.

Index

It is a collection of documents of different types and their properties. An index also uses the concept of shards to improve performance. For example, one index may hold the set of documents that contains the data of a social networking application.

Document

It is a collection of fields defined in a specific manner in JSON format. Every document belongs to a type and resides inside an index. Every document is associated with a unique identifier called the UID.

Shard

Indexes are subdivided horizontally into shards. Each shard therefore contains all the properties of a document but holds fewer JSON objects than the whole index. Horizontal partitioning makes each shard an independent unit that can be stored on any node. A primary shard is an original horizontal partition of the index, and primary shards are then copied into replica shards.
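The horizontal split can be pictured as a routing rule: hash a document's routing key and take the result modulo the number of primary shards. The sketch below shows that idea only; Elasticsearch uses its own hash function internally, so `zlib.crc32` is a stand-in, and the function names are hypothetical.

```python
import zlib


def pick_shard(routing_key: str, num_primary_shards: int) -> int:
    """Map a routing key (for example a document ID) to one primary shard."""
    if num_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    # CRC32 is only a stand-in for the hash Elasticsearch really uses.
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


def partition(doc_ids, num_primary_shards):
    """Group document IDs by the shard they would be routed to."""
    shards = {number: [] for number in range(num_primary_shards)}
    for doc_id in doc_ids:
        shards[pick_shard(doc_id, num_primary_shards)].append(doc_id)
    return shards


doc_ids = [f"doc-{n}" for n in range(10)]
shards = partition(doc_ids, 3)
```

Because the shard number depends on the shard count, a rule like this also suggests why the number of primary shards is normally fixed when an index is created.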

Replicas

Elasticsearch allows users to create replicas of their indexes and shards. Replication not only improves data availability when a failure occurs but also improves search performance, because search operations can run in parallel on the replicas.
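A small sketch of how primaries and replicas add up. `number_of_shards` and `number_of_replicas` are the index settings that control these counts; the round-robin `place_copies` function is a toy model written for this example and is not how the real shard allocator works. It only keeps the property that copies of the same shard sit on different nodes.

```python
def index_settings(primaries: int, replicas: int) -> dict:
    """Request body for creating an index with the given shard layout."""
    return {"settings": {"number_of_shards": primaries, "number_of_replicas": replicas}}


def total_shard_copies(primaries: int, replicas: int) -> int:
    """Every primary shard exists once, plus once per replica."""
    return primaries * (1 + replicas)


def place_copies(primaries: int, replicas: int, nodes: list) -> dict:
    """Toy placement: copy 0 of each shard is the primary, the rest are replicas."""
    if replicas + 1 > len(nodes):
        raise ValueError("not enough nodes to keep all copies of a shard apart")
    placement = {}
    for shard in range(primaries):
        # Consecutive offsets modulo len(nodes) never repeat within one shard.
        placement[shard] = [nodes[(shard + copy) % len(nodes)] for copy in range(replicas + 1)]
    return placement


nodes = ["node-a", "node-b", "node-c"]  # hypothetical node names
placement = place_copies(primaries=3, replicas=1, nodes=nodes)
```

If a node holding a primary fails, a replica on another node still has the same data, and searches can be spread over both copies while all nodes are up.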

Advantages

Elasticsearch is developed in Java, which makes it compatible with almost all platforms.
Elasticsearch is real-time; in other words, a document becomes searchable about one second after it is added.
Elasticsearch is distributed, making it easy to scale and integrate in any large organization.
Creating full backups is easy thanks to the gateway concept, which is built into Elasticsearch.
Compared to Apache Solr, it is very easy to handle multi-tenancy in Elasticsearch.
Elasticsearch returns JSON objects as responses, which makes it possible to call the Elasticsearch server from a large number of different programming languages.
Elasticsearch supports almost all document types except for those that do not support text rendering.
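Because responses are plain JSON, any language with a JSON parser can consume them. The sketch below parses a hand-written sample shaped like a 7.x search response; the index name, IDs, scores, and field values are invented for illustration and are not real output.

```python
import json

# Hand-written sample, not output captured from a real cluster.
SAMPLE_RESPONSE = """
{
  "took": 3,
  "hits": {
    "total": {"value": 2, "relation": "eq"},
    "hits": [
      {"_index": "tweets", "_id": "1", "_score": 1.2,
       "_source": {"user": "alice", "message": "hello world"}},
      {"_index": "tweets", "_id": "2", "_score": 0.4,
       "_source": {"user": "bob", "message": "hello again"}}
    ]
  }
}
"""


def extract_sources(response_text: str) -> list:
    """Return the original documents (the _source of every hit)."""
    response = json.loads(response_text)
    return [hit["_source"] for hit in response["hits"]["hits"]]


sources = extract_sources(SAMPLE_RESPONSE)
```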

Disadvantages

Elasticsearch does not support multiple formats for request and response data (only JSON is available), unlike Apache Solr, which can handle CSV, XML, and JSON.
Elasticsearch can occasionally run into the split-brain problem.
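Split brain happens when a network partition lets two groups of nodes each elect their own master. A common safeguard is to require a majority (quorum) of master-eligible nodes before a master may be elected; releases before 7.0 exposed this threshold as the `discovery.zen.minimum_master_nodes` setting. The sketch below only illustrates the arithmetic, and its function names are hypothetical.

```python
def quorum(master_eligible_nodes: int) -> int:
    """Smallest group size that is a strict majority of the eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def may_elect_master(visible_nodes: int, master_eligible_nodes: int) -> bool:
    """A partition may elect a master only if it can see a majority."""
    return visible_nodes >= quorum(master_eligible_nodes)


# Three master-eligible nodes split by a network fault into groups of 2 and 1.
larger_side = may_elect_master(2, 3)   # this side keeps a master
smaller_side = may_elect_master(1, 3)  # this side must wait
```

With only two master-eligible nodes the quorum is also two, so neither half of a 1/1 split can elect a master, which is why an odd number of master-eligible nodes is usually preferred.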

Comparison between Elasticsearch and RDBMS

In Elasticsearch, an index is similar to a table in RDBMS (Relational Database Management System). Each table is a collection of rows, just as each index is a collection of documents in Elasticsearch.

The following table provides a direct comparison of these terms:

Elasticsearch	Relational Database Management System (RDBMS)
Cluster	Database
Shard	Shard
Index	Table
Field	Column
Document	Row
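The mapping in the table can be sketched by turning the rows of a relational table into JSON-style documents, with column names becoming field names. The `users` table and its columns are hypothetical.

```python
def rows_to_documents(columns, rows):
    """Turn each table row into a document whose fields are the column names."""
    return [dict(zip(columns, row)) for row in rows]


# Hypothetical "users" table: the table becomes an index, each row a document.
columns = ["id", "name", "city"]
rows = [
    (1, "Alice", "Berlin"),
    (2, "Bob", "Lisbon"),
]
user_documents = rows_to_documents(columns, rows)
```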
