ElasticSearch

Distributed Full Text Search

Evan Borden / @evanborden

What is it?

ElasticSearch is...

  • Distributed
  • Full Text Search Engine
  • Built On Apache Lucene - JVM
  • JSON everything
  • RESTful
  • Colocation

What isn't it?

"Schemaless"

Colocation

Like entities in close proximity

"Orange foxes are running wild"

to

"orang fox are run wild"
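
The transformation above is the kind of work an analyzer does at index time. A minimal sketch for inspecting it yourself, assuming a node on localhost:9200 and a 0.90/1.x-era release whose `_analyze` endpoint takes the analyzer name as a query parameter. `ES_URL`, `analyze_url` and `analyze_text` are hypothetical helper names, not part of Elasticsearch:

```shell
# Assumption: a local node, as in the other examples in this deck.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the _analyze URL for a named analyzer (e.g. snowball, standard).
analyze_url() {
    printf '%s/_analyze?analyzer=%s' "$ES_URL" "$1"
}

# Send raw text to the analyzer; the response lists the resulting tokens.
analyze_text() {
    curl -s -XGET "$(analyze_url "$1")" -d "$2"
}

# Usage (requires a running node):
#   analyze_text snowball 'Orange foxes are running wild'
```

The exact tokens depend on the analyzer chosen. The snowball analyzer also applies a stop word filter, so its output may differ slightly from the slide.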

Structure

5 Deep Russian Doll

  • Cluster
    • Node
      • Index
        • Type
          • Document

Cluster

Group Of Nodes

Node

Instance Of ES

Index

Can approximate a database

Type

Can simulate a table

Document

Like a row

"Metaphors"

  • Cluster = Group of Instances
  • Node = Instance of ES
  • Index = Database
  • Type = Table
  • Document = Row

Built to Scale

Replacement for Apache Solr

Shards and Replicas

Auto balanced


$ curl -XPUT 'http://localhost:9200/twitter/' -d '{
    "settings" : {
        "index" : {
            "number_of_shards" : 3,
            "number_of_replicas" : 2
        }
    }
}'
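
A quick sketch of what those two settings add up to. The arithmetic only assumes that every primary shard gets `number_of_replicas` extra copies; the variable names mirror the settings above:

```shell
# Values taken from the index settings above.
number_of_shards=3
number_of_replicas=2

# Each primary shard exists once, plus number_of_replicas copies of it.
total_shard_copies=$(( number_of_shards * (1 + number_of_replicas) ))

echo "primaries: $number_of_shards, total shard copies: $total_shard_copies"
# prints: primaries: 3, total shard copies: 9
```

A replica is never allocated on the same node as its primary, so at least three nodes are needed before all nine copies can be assigned.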
                        

Auto Document Routing

Also configurable for perf


$ curl -XPOST 'http://localhost:9200/store/order?routing=user123' -d '
{
   "productName":"sample",
   "customerID":"user123"
}'
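
Routing boils down to hashing the routing value (the document ID by default) and taking it modulo the number of primary shards. A toy sketch of that idea, using POSIX `cksum` as a stand-in hash. The real hash function inside Elasticsearch is different, and `shard_for` is a hypothetical helper:

```shell
# Toy stand-in for the internal routing hash: POSIX cksum (a CRC).
# Assumption: only the "hash modulo primary shard count" step is the point.
shard_for() {
    routing_value=$1
    primary_shards=$2
    crc=$(printf '%s' "$routing_value" | cksum | cut -d ' ' -f 1)
    echo $(( crc % primary_shards ))
}

shard_for user123 3   # the same routing value always lands on the same shard
shard_for user123 4   # a different shard count can remap the same value
```

Because the shard count is part of the formula, changing it would move existing documents, which is why the number of primary shards is fixed at index creation.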
                        

Replication for Redundancy

  • Replicas are used when a primary shard is unreachable.
  • A replica is promoted to primary and shards are rebalanced on node failure.

Near Realtime Updates

Synchronous update scheme

Async configurable

No Transactions

Why use it?

Fast, Flexible & Simple To Configure

Extremely robust features

Features

ES Knows JSON

Document


$ curl -XPOST 'http://localhost:9200/store/order' -d '
{
    "items": ["foo", "bar", "baz"],
    "customid": 1234,
    "approved": true,
    "coupon": {
        "type": "buy1get1"
    }
}'
                        

Query


$ curl -XPOST 'http://localhost:9200/store/order/_search' -d '
{
    "query": {
        "term": {
            "coupon.type": "buy1get1"
        }
    }
}'
                        

Robust Analyzers

Natural Language Processing

  • Stemming
  • Ngrams
  • Spelling Correction
  • Stop Words
  • Shingles
  • ...and more
  • +Build your own

First Class Geodata

Geo points and Geo shapes
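
A minimal sketch of a geo point in practice, assuming a 1.x-era node on localhost:9200 where the `filtered` query and `geo_distance` filter are available. The `location` field, the coordinates, and the helper names are made up for illustration:

```shell
# Assumption: a local node, as in the other examples in this deck.
ES_URL="${ES_URL:-http://localhost:9200}"

# Mapping fragment: "location" is a hypothetical field on the order type.
geo_mapping='{
    "order" : {
        "properties" : {
            "location" : { "type" : "geo_point" }
        }
    }
}'

# Find orders within 10km of a point (coordinates are invented).
geo_query='{
    "query" : {
        "filtered" : {
            "query" : { "match_all" : {} },
            "filter" : {
                "geo_distance" : {
                    "distance" : "10km",
                    "location" : { "lat" : 40.0, "lon" : -105.0 }
                }
            }
        }
    }
}'

put_geo_mapping() {
    curl -s -XPUT "$ES_URL/store/order/_mapping" -d "$geo_mapping"
}

search_nearby() {
    curl -s -XPOST "$ES_URL/store/order/_search" -d "$geo_query"
}
```

Call `put_geo_mapping` before indexing documents, then `search_nearby` once some orders carry a `location`.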

Multi Field Types

Store, Search and Retrieve multiple formats.

/*...*/
"properties" : {
    "name" : {
        "type" : "multi_field",
        "fields" : {
            "name" : {
                "type" : "string",
                "index" : "analyzed"
            },
            "stemmed" : {
                "type" : "string",
                "index" : "analyzed",
                "analyzer" : "snowball"
            },
            "untouched" : {
                "type" : "string",
                "index" : "not_analyzed"
            }
        }
    }
/*...*/

Facets

For Free!


  • Terms
  • Ranges
  • Histograms
  • Statistics
  • Term Stats
  • Geo Distances
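
As a rough illustration of the first item, a terms facet over the order documents from earlier might look like this. It assumes a release that still ships the facets API (0.90/1.x); `coupon_types` is an arbitrary facet name and `search_with_facets` is a hypothetical helper:

```shell
# Assumption: a local node, as in the other examples in this deck.
ES_URL="${ES_URL:-http://localhost:9200}"

# Count orders per coupon type alongside a normal search.
facet_query='{
    "query" : { "match_all" : {} },
    "facets" : {
        "coupon_types" : {
            "terms" : { "field" : "coupon.type" }
        }
    }
}'

search_with_facets() {
    curl -s -XPOST "$ES_URL/store/order/_search" -d "$facet_query"
}
```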

Aggregations

Functionally composable DSL
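
"Composable" here means aggregations nest inside each other. A small sketch, assuming Elasticsearch 1.0 or later (where aggregations exist) and a hypothetical numeric `total` field on each order; the aggregation names are arbitrary:

```shell
# Assumption: a local node, as in the other examples in this deck.
ES_URL="${ES_URL:-http://localhost:9200}"

# Bucket orders by coupon type, then compute stats inside each bucket.
# "total" is a hypothetical numeric field, not one from the slides.
agg_query='{
    "size" : 0,
    "aggs" : {
        "by_coupon" : {
            "terms" : { "field" : "coupon.type" },
            "aggs" : {
                "order_totals" : {
                    "stats" : { "field" : "total" }
                }
            }
        }
    }
}'

search_with_aggs() {
    curl -s -XPOST "$ES_URL/store/order/_search" -d "$agg_query"
}
```

The inner `aggs` block runs once per bucket produced by the outer one, which a flat facet cannot express.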

Suggesters

  • term suggestion
  • phrase suggestion
  • auto complete
  • spelling correction
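
A minimal term suggestion sketch, assuming a release where a `suggest` section is accepted in the search body. The misspelled input `fooo`, the suggestion name `item_suggestion`, and the helper name are illustrative only:

```shell
# Assumption: a local node, as in the other examples in this deck.
ES_URL="${ES_URL:-http://localhost:9200}"

# Ask for indexed terms in the "items" field that are close to the input text.
suggest_query='{
    "suggest" : {
        "item_suggestion" : {
            "text" : "fooo",
            "term" : { "field" : "items" }
        }
    }
}'

suggest_items() {
    curl -s -XPOST "$ES_URL/store/order/_search" -d "$suggest_query"
}
```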

More Like This

Find more documents like yours
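
One way this can look, sketched under the assumption of a 0.90/1.x-era `more_like_this` query that takes `like_text`. The field comes from the routing example earlier in the deck, the low frequency thresholds are chosen only so a tiny test index returns something, and `find_similar` is a hypothetical helper:

```shell
# Assumption: a local node, as in the other examples in this deck.
ES_URL="${ES_URL:-http://localhost:9200}"

# Find orders whose productName resembles the given text.
mlt_query='{
    "query" : {
        "more_like_this" : {
            "fields" : ["productName"],
            "like_text" : "sample",
            "min_term_freq" : 1,
            "min_doc_freq" : 1
        }
    }
}'

find_similar() {
    curl -s -XPOST "$ES_URL/store/order/_search" -d "$mlt_query"
}
```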

Percolation

Reverse search: index the queries, then match documents against them
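
A sketch of the two steps, assuming the 1.x percolator layout (queries stored under `.percolator` in the index, documents sent to `_percolate`); earlier releases used a different endpoint. The query ID `bogo-orders` and the helper names are hypothetical:

```shell
# Assumption: a local node running a 1.x release.
ES_URL="${ES_URL:-http://localhost:9200}"

# Step 1: store a query. "bogo-orders" is an arbitrary ID.
stored_query='{
    "query" : {
        "term" : { "coupon.type" : "buy1get1" }
    }
}'

# Step 2: hand over a document and ask which stored queries match it.
candidate_doc='{
    "doc" : {
        "coupon" : { "type" : "buy1get1" }
    }
}'

register_query() {
    curl -s -XPUT "$ES_URL/store/.percolator/bogo-orders" -d "$stored_query"
}

percolate_doc() {
    curl -s -XGET "$ES_URL/store/order/_percolate" -d "$candidate_doc"
}
```

The candidate document is not indexed by the percolate call; it is only matched against the registered queries.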

Flexibility

No Boundaries

Auto Node Discovery

Easy to Configure

Works Out of the Box

Defaults Are Performant

Documentation Is Robust

Shay Is Available

Gotchas

Static Shards

Plan ahead

Requires Forethought

Know your data

Field Type Errors

Stay consistent
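
A sketch of how this gotcha typically shows up, assuming dynamic mapping is left on and a node is running on localhost:9200. `index_order` is a hypothetical helper, and the exact error text varies by version:

```shell
# Assumption: a local node, as in the other examples in this deck.
ES_URL="${ES_URL:-http://localhost:9200}"

# The first document makes dynamic mapping treat "customid" as a number.
first_doc='{ "customid" : 1234 }'

# The second sends a non-numeric string for the same field.
second_doc='{ "customid" : "not-a-number" }'

index_order() {
    curl -s -XPOST "$ES_URL/store/order" -d "$1"
}

# Usage (requires a running node):
#   index_order "$first_doc"    # accepted, field mapped as numeric
#   index_order "$second_doc"   # expected to be rejected with a mapping error
```

Defining the mapping explicitly before indexing avoids depending on whichever document happens to arrive first.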

Latest JVM

Still backwards compatible

Memory Hungry

Isolate it

No Security

Firewall required

Go Give it a Try

Questions?