Elasticsearch is a search server based on Apache Lucene. It was developed by Shay Banon and first published in 2010. It is now maintained by Elasticsearch BV. Its latest version is 7.0.0.
Elasticsearch is a real-time, distributed, open-source full-text search and analytics engine. It is accessed through a RESTful web service interface and stores data as schemaless JSON (JavaScript Object Notation) documents. It is built on the Java programming language, so it can run on different platforms. It lets users search and explore very large amounts of data at high speed.
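To make the REST and JSON description above concrete, here is a minimal sketch that builds an HTTP request storing one JSON document. It only constructs the request and does not send it, so it runs without a server. The host `localhost:9200`, the index name `posts`, the helper name `build_index_request`, and the document fields are assumptions chosen for illustration.

```python
import json
from urllib.request import Request

# Assumed local endpoint; adjust to wherever your server listens.
BASE_URL = "http://localhost:9200"


def build_index_request(index, doc_id, document):
    """Build (but do not send) a PUT request that stores one JSON document."""
    url = f"{BASE_URL}/{index}/_doc/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical document: no schema is declared beforehand.
request = build_index_request("posts", 1, {"user": "alice", "message": "hello"})
print(request.get_method(), request.full_url)

# To send it against a running server:
#   from urllib.request import urlopen
#   print(urlopen(request).read().decode("utf-8"))
```

The response from a live server would itself be a JSON object, which is why any language with an HTTP client and a JSON parser can talk to Elasticsearch.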
The common features of Elasticsearch are as follows:
Elasticsearch can scale up to petabytes of structured and unstructured data.
Elasticsearch can replace document storage systems such as MongoDB and RavenDB.
Elasticsearch uses denormalization to improve search performance.
Elasticsearch is one of the popular enterprise search engines, currently used by many large organizations such as Wikipedia, The Guardian, StackOverflow, GitHub, and more.
Elasticsearch is an open-source project available under the Apache License, version 2.0.
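Denormalization, mentioned in the feature list above, means storing related data together in one document instead of joining tables at query time. The following sketch shows the idea with plain Python data. The `users` and `posts` structures and the `denormalize` helper are hypothetical and are not part of any Elasticsearch API.

```python
# Normalized, relational-style data: a post only references its author by id.
users = {7: {"name": "alice", "city": "Berlin"}}
posts = [{"id": 1, "user_id": 7, "text": "hello"}]


def denormalize(post, users):
    """Copy the author's fields into the post so one document answers the query."""
    author = users[post["user_id"]]
    return {
        "id": post["id"],
        "text": post["text"],
        "author": {"name": author["name"], "city": author["city"]},
    }


# Each resulting document is self-contained and can be searched without a join.
documents = [denormalize(post, users) for post in posts]
print(documents)
```

The trade-off is duplicated data: if an author's city changes, every document that embeds it has to be updated.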
The key concepts of Elasticsearch are as follows:
Node: A node is a single running instance of Elasticsearch. A single physical or virtual server can host multiple nodes, depending on its resources such as RAM, storage, and processing power.
Cluster: A cluster is a collection of one or more nodes. It provides collective indexing and search capabilities across all nodes for the entire data set.
Index: An index is a collection of documents of different types and their properties. An index also uses shards to improve performance. For example, one index might hold the data of a social networking application.
Document: A document is a collection of fields defined in a specific way in JSON format. Every document belongs to a type and resides inside an index. Every document is associated with a unique identifier called the UID.
Shard: Indexes are horizontally subdivided into shards. Each shard contains all the properties of a document, but holds fewer JSON objects than the whole index. Horizontal partitioning makes each shard an independent unit that can be stored on any node. A primary shard is an original horizontal part of an index, and primary shards are then copied into replica shards.
Replicas: Elasticsearch allows users to create replicas of their indexes and shards. Replication not only improves data availability in case of failure but also improves search performance, because searches can run in parallel on the replicas.
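The relationship between an index, its primary shards, and its replicas can be sketched as follows. The `number_of_shards` and `number_of_replicas` settings are real index settings. The helper names are hypothetical, and `zlib.crc32` is only a stand-in hash for illustration, since Elasticsearch uses its own hash function internally.

```python
import zlib


def index_settings(primary_shards, replicas_per_primary):
    """Request body for creating an index with the given shard layout."""
    return {
        "settings": {
            "number_of_shards": primary_shards,
            "number_of_replicas": replicas_per_primary,
        }
    }


def pick_shard(routing_value, primary_shards):
    """Map a document's routing value (by default its id) to a primary shard.

    Illustrative only: crc32 stands in for the hash Elasticsearch really uses.
    """
    return zlib.crc32(routing_value.encode("utf-8")) % primary_shards


def total_shard_copies(primary_shards, replicas_per_primary):
    """Every primary shard exists once, plus once per replica."""
    return primary_shards * (1 + replicas_per_primary)


print(index_settings(3, 1))
print(pick_shard("doc-1", 3))
print(total_shard_copies(3, 1))
```

Because the target shard is derived from the number of primary shards, that number is normally fixed when the index is created, while the number of replicas can be changed later.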
The advantages of Elasticsearch are as follows:
Elasticsearch is developed in Java, which makes it compatible with almost every platform.
Elasticsearch is near real-time: with default settings, a newly added document becomes searchable within about one second.
Elasticsearch is distributed, which makes it easy to scale and to integrate into any large organization.
Creating full backups is simple using the gateway concept, which is built into Elasticsearch.
Handling multi-tenancy is very easy in Elasticsearch compared to Apache Solr.
Elasticsearch returns JSON objects as responses, so the server can be called from a large number of programming languages.
Elasticsearch supports almost every document type except those that do not support text rendering.
The disadvantages of Elasticsearch are as follows:
Elasticsearch does not support multiple data formats for request and response data (it works with JSON), unlike Apache Solr, which can handle CSV, XML, and JSON.
Elasticsearch can occasionally run into the split-brain problem, where a cluster divides into groups of nodes that each elect their own master.
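The usual guard against the split-brain problem is to require a majority (quorum) of master-eligible nodes before a master can be elected. In versions before 7.0 this was configured by hand through the `discovery.zen.minimum_master_nodes` setting, while from 7.0 onward the cluster manages the quorum itself. The sketch below only illustrates the arithmetic, and the function names are hypothetical.

```python
def quorum(master_eligible_nodes):
    """Smallest number of nodes that forms a strict majority."""
    if master_eligible_nodes < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def can_elect_master(visible_nodes, master_eligible_nodes):
    """A partition may elect a master only if it still sees a majority."""
    return visible_nodes >= quorum(master_eligible_nodes)


# A 3-node cluster split by a network fault into groups of 2 and 1:
print(can_elect_master(2, 3))  # the larger side keeps working
print(can_elect_master(1, 3))  # the isolated node must not elect itself
```

Since two disjoint groups can never both hold a majority, at most one side of a partition can have a master at any time.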
In Elasticsearch, an index is similar to a table in RDBMS (Relational Database Management System). Each table is a collection of rows, just as each index is a collection of documents in Elasticsearch.
The following table provides a direct comparison of these terms:
| Elasticsearch | Relational Database Management System (RDBMS) |
|---|---|
| Cluster | Database |
| Shard | Shard |
| Index | Table |
| Field | Column |
| Document | Row |
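
As a rough illustration of the terminology mapping, the sketch below reads rows from an in-memory SQLite table and reshapes each row into a document, with the table name becoming the index name and each column becoming a field. The table, its columns, and the `row_to_document` helper are hypothetical, and nothing is sent to an Elasticsearch server.

```python
import sqlite3

# A tiny relational table to convert.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read columns by name
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, user TEXT, message TEXT)")
conn.execute("INSERT INTO posts VALUES (1, 'alice', 'hello')")


def row_to_document(table, row):
    """Table -> index, row -> document, column -> field."""
    fields = {column: row[column] for column in row.keys() if column != "id"}
    return {"index": table, "id": row["id"], "document": fields}


actions = [row_to_document("posts", row) for row in conn.execute("SELECT * FROM posts")]
print(actions)
```

The analogy is only approximate: unlike rows in a table, documents in the same index do not need to share an identical set of fields.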