    Introduction to Elasticsearch

    Last updated 5 months ago

    Overview

    Elasticsearch is an open-source, distributed search and analytics engine designed for horizontal scalability, reliability, and near-real-time search. It is the core component of the ELK Stack (Elasticsearch, Logstash, and Kibana), where it is commonly used for logging and observability, and it also supports use cases such as site search and business intelligence.

    Key Features

    • Distributed and Scalable: Scales horizontally by adding nodes; large clusters can hold petabytes of data.

    • Full-Text Search: Powerful built-in capabilities for indexing and querying text with advanced features like stemming, synonyms, and relevance scoring.

    • Near Real-Time: Newly indexed documents become searchable after the next index refresh, typically within about one second (the default refresh interval).

    • RESTful API: Interact with Elasticsearch via simple HTTP/JSON APIs.

    • Schema-Free JSON Documents: Documents are JSON, and fields without an explicit mapping are typed automatically through dynamic mapping, which suits semi-structured or unstructured data.

    • Aggregation Framework: Enables complex analytics such as grouping, filtering, and statistical calculations.
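
    The full-text and aggregation features above can be combined in a single request. The following is a minimal sketch, not a reference query: the `users` index, the `message` field, the `user.keyword` sub-field (which default dynamic mapping normally creates for string values), and the `ES_URL` variable are all assumptions made for illustration.

```shell
# Assumed cluster address; override ES_URL for your environment.
ES_URL="${ES_URL:-http://localhost:9200}"

# Full-text match on "message", plus a terms aggregation that counts the
# matching documents per user. "size": 0 skips the hits and returns only
# the aggregation buckets. Index and field names are illustrative.
SEARCH_BODY='{
  "size": 0,
  "query": { "match": { "message": "hello" } },
  "aggs": {
    "messages_per_user": {
      "terms": { "field": "user.keyword", "size": 5 }
    }
  }
}'

# Send the search to the (hypothetical) users index.
search_with_aggs() {
  curl -s -X POST "$ES_URL/users/_search?pretty" \
    -H 'Content-Type: application/json' \
    -d "$SEARCH_BODY"
}
```

    Calling `search_with_aggs` against a running cluster should return an `aggregations.messages_per_user.buckets` array with one entry per user.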

    Common Use Cases

    • Log and event data analysis (via the ELK or Elastic Stack)

    • E-commerce product search

    • Site search engines

    • Security information and event management (SIEM)

    • Application performance monitoring (APM)

    • Business intelligence dashboards

    Architecture Overview

    Elasticsearch is built around a cluster of nodes, which store data in shards and replicas:

    • Cluster: A collection of one or more nodes working together.

    • Node: A single Elasticsearch instance.

    • Index: A collection of related documents.

    • Shard: A basic unit of storage and search; each index is split into shards.

    • Replica: A copy of a primary shard, kept on a different node, that provides high availability and can also serve searches.
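
    To make the shard and replica terms concrete, here is a small sketch that creates an index with explicit settings and lists where its shards land. The shard counts and the `ES_URL` variable are assumptions chosen for the example, not recommendations.

```shell
# Assumed cluster address; override ES_URL for your environment.
ES_URL="${ES_URL:-http://localhost:9200}"

# Illustrative values: 3 primary shards, each with 1 replica.
PRIMARIES=3
REPLICAS=1

# Total shard copies the cluster must place: primaries * (1 + replicas).
TOTAL_COPIES=$((PRIMARIES * (1 + REPLICAS)))

INDEX_SETTINGS="{
  \"settings\": {
    \"number_of_shards\": $PRIMARIES,
    \"number_of_replicas\": $REPLICAS
  }
}"

# Create the index named by the first argument with the settings above.
create_index() {
  curl -s -X PUT "$ES_URL/$1?pretty" \
    -H 'Content-Type: application/json' \
    -d "$INDEX_SETTINGS"
}

# Show each shard of the index, whether it is a primary (p) or
# replica (r), its state, and the node that holds it.
list_shards() {
  curl -s "$ES_URL/_cat/shards/$1?v"
}

echo "Shard copies to allocate: $TOTAL_COPIES"
```

    For example, `create_index users` followed by `list_shards users`. On a single-node cluster the replica copies stay unassigned, because a replica is never placed on the same node as its primary.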

    Getting Started

    1. Installation

    2. Basic Query Example

      bash

      # Search all indices for documents whose "user" field matches "john"
      curl -X GET "localhost:9200/_search?q=user:john&pretty"

    3. Indexing a Document

      bash

      # Index (create or overwrite) document 1 in the "users" index
      curl -X POST "localhost:9200/users/_doc/1" -H 'Content-Type: application/json' -d '{ "user": "john", "message": "Hello, Elasticsearch!" }'
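
    As a follow-up to the two commands above, the sketch below reads the document back by ID and runs a request-body search that is roughly equivalent to the `q=user:john` query string. The helper names and the `ES_URL` variable are hypothetical; the `users` index and document ID come from the indexing example.

```shell
# Assumed cluster address; override ES_URL for your environment.
ES_URL="${ES_URL:-http://localhost:9200}"

# Fetch a document by ID. GET by ID is real-time, so it returns the
# document even before the next index refresh makes it searchable.
get_user_doc() {
  curl -s "$ES_URL/users/_doc/$1?pretty"
}

# Request-body search limited to the users index; a match query on
# "user" is roughly what q=user:john expresses in the URL form.
MATCH_QUERY='{ "query": { "match": { "user": "john" } } }'

search_users() {
  curl -s -X POST "$ES_URL/users/_search?pretty" \
    -H 'Content-Type: application/json' \
    -d "$MATCH_QUERY"
}
```

    Running `get_user_doc 1` and then `search_users` should return the same document, the second time inside the `hits.hits` array.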

    Security and Authentication

    The Elastic Stack includes built-in security features:

    • Role-based access control (RBAC)

    • TLS encryption

    • API key support

    • Integration with LDAP, SAML, and OIDC

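    These features ship with Elastic’s default distribution, not the older OSS-only build. RBAC, TLS, and API keys are available on the free Basic license, while LDAP, SAML, and OIDC integration require a paid subscription tier.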
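
    When security is enabled (the default since Elasticsearch 8.0), requests need TLS and credentials, so the plain `localhost:9200` commands in this article only work as written on a cluster with security turned off. The sketch below shows one way to call a secured cluster with an API key. The key ID, the secret, and the CA certificate path are placeholders, and the helper names are hypothetical.

```shell
# Assumed HTTPS address and CA certificate path; both are placeholders.
ES_URL="${ES_URL:-https://localhost:9200}"
CA_CERT="${CA_CERT:-http_ca.crt}"

# Placeholder API key parts, as returned in the "id" and "api_key"
# fields when a key is created via POST /_security/api_key.
API_KEY_ID="${API_KEY_ID:-abc}"
API_KEY_SECRET="${API_KEY_SECRET:-def}"

# Build the header: "ApiKey " followed by base64("id:api_key").
api_key_header() {
  printf 'Authorization: ApiKey %s' \
    "$(printf '%s:%s' "$API_KEY_ID" "$API_KEY_SECRET" | base64 | tr -d '\n')"
}

# GET a path on the cluster over TLS, authenticating with the API key.
authed_get() {
  curl -s --cacert "$CA_CERT" -H "$(api_key_header)" "$ES_URL/$1"
}
```

    For example, `authed_get "_cluster/health?pretty"` checks cluster health with the key's permissions. Keeping the secret in an environment variable rather than in the script avoids committing it to version control.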

    Monitoring and Maintenance

    • Use Kibana to visualize data and monitor cluster health.

    • Regularly review:

      • Disk usage and node health

      • Index lifecycle management (ILM) policies

      • Slow logs for performance bottlenecks
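
    The review items above map onto a handful of REST endpoints. The helpers below are a sketch under the assumption of an unsecured development cluster at `ES_URL`; the function names and the 2-second slow-log threshold are illustrative choices, not recommended values.

```shell
# Assumed cluster address; override ES_URL for your environment.
ES_URL="${ES_URL:-http://localhost:9200}"

# Overall status (green, yellow, or red) and shard counts.
cluster_health() {
  curl -s "$ES_URL/_cluster/health?pretty"
}

# Per-node heap and disk usage as a small table.
node_usage() {
  curl -s "$ES_URL/_cat/nodes?v&h=name,heap.percent,disk.used_percent"
}

# Current ILM phase and step for the index given as the first argument.
ilm_status() {
  curl -s "$ES_URL/$1/_ilm/explain?pretty"
}

# Log searches slower than the threshold to the search slow log.
SLOWLOG_SETTINGS='{ "index.search.slowlog.threshold.query.warn": "2s" }'

enable_search_slowlog() {
  curl -s -X PUT "$ES_URL/$1/_settings?pretty" \
    -H 'Content-Type: application/json' \
    -d "$SLOWLOG_SETTINGS"
}
```

    A periodic check might run `cluster_health` and `node_usage`, then `ilm_status users` for any index whose rollover looks stuck.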

    Troubleshooting Tips

    Symptom              Possible Cause               Solution
    Cluster status red   Missing primary shards       Check node logs, use _cat/shards
    High heap usage      Memory pressure              Tune JVM settings or add nodes
    Slow queries         Poor mapping or no indexing  Use correct field types, review mapping
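
    For the red-cluster case, a first diagnostic pass could look like the sketch below: list the shards, keep only the unassigned ones, and ask the cluster why a shard is not allocated. The helper names and the `ES_URL` variable are hypothetical, and the example assumes an unsecured development cluster.

```shell
# Assumed cluster address; override ES_URL for your environment.
ES_URL="${ES_URL:-http://localhost:9200}"

# One row per shard: index, shard number, primary/replica flag, state,
# and (for unassigned shards) the reason it became unassigned.
shard_report() {
  curl -s "$ES_URL/_cat/shards?h=index,shard,prirep,state,unassigned.reason"
}

# Keep only the rows whose 4th column (state) is UNASSIGNED.
unassigned_only() {
  awk '$4 == "UNASSIGNED"'
}

# Ask the cluster to explain the first unassigned shard it finds.
# The API returns an error if every shard is assigned.
explain_allocation() {
  curl -s "$ES_URL/_cluster/allocation/explain?pretty"
}
```

    Typical use is `shard_report | unassigned_only`, followed by `explain_allocation` if any rows come back. An unassigned row with `p` in the third column is a missing primary, which is what turns the cluster status red.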