tags: Elasticsearch Elastic search engine lucene
In my previous article article:
Elasticsearch: Useful ElasticSearch query example
Start using Elasticsearch (2)
I have listed a lot of examples about Elasticsearch queries. Holding a lot of good ideas, in today's article, I will bring more examples to everyone. I hope everyone has more understanding of Elasticsearch.
ElasticSearch provides a set of powerful options to query the documentation of various use cases, so it is useful to apply which query applied to a specific case. The following is a hands-on tutorial that helps you use the most important queries provided by Elasticsearch.
In this guide, you will learn many popular query examples with detailed explanation. Each query covered here will be divided into two types:
Please note: In this article, I will use the latest Elastic Stack 7.16.3 publishing. It is more than the article, so divided into two parts:
Let's first create a new index using some sample data so that you can operate according to each search example. Create an index called "Employees":
PUT employees
To define mappings (modios) to include fields (such as Date_OF_BIRTH):
PUT employees/_mapping
{
"properties": {
"date_of_birth": {
"type": "date",
"format": "dd/MM/yyyy"
}
}
}
As shown above, the date format of our document is DD / MM / YYYY.
Let us now take some documents into our newly created index, as shown in the example below, using Elasticsearch_bulk API:
POST _bulk
{"index":{"_index":"employees","_id":"1"}}
{"id":1,"name":"Huntlee Dargavel","email":"[email protected]","gender":"male","ip_address":"58.11.89.193","date_of_birth":"11/09/1990","company":"Talane","position":"Research Associate","experience":7,"country":"China","phrase":"Multi-channelled coherent leverage","salary":180025}
{"index":{"_index":"employees","_id":"2"}}
{"id":2,"name":"Othilia Cathel","email":"[email protected]","gender":"female","ip_address":"3.164.153.228","date_of_birth":"22/07/1987","company":"Edgepulse","position":"Structural Engineer","experience":11,"country":"China","phrase":"Grass-roots heuristic help-desk","salary":193530}
{"index":{"_index":"employees","_id":"3"}}
{"id":3,"name":"Winston Waren","email":"[email protected]","gender":"male","ip_address":"202.37.210.94","date_of_birth":"10/11/1985","company":"Yozio","position":"Human Resources Manager","experience":12,"country":"China","phrase":"Versatile object-oriented emulation","salary":50616}
{"index":{"_index":"employees","_id":"4"}}
{"id":4,"name":"Alan Thomas","email":"[email protected]","gender":"male","ip_address":"200.47.210.95","date_of_birth":"11/12/1985","company":"Yamaha","position":"Resources Manager","experience":12,"country":"China","phrase":"Emulation of roots heuristic coherent systems","salary":300000}
Now we already have an index containing the document and a specified mapping, we have prepared to start using the sample search. For everyone to read, I will list the fields of one of the documents:
GET employees/_doc/1?filter_path=_source
{
"_source" : {
"id" : 1,
"name" : "Huntlee Dargavel",
"email" : "[email protected]",
"gender" : "male",
"ip_address" : "58.11.89.193",
"date_of_birth" : "11/09/1990",
"company" : "Talane",
"position" : "Research Associate",
"experience" : 7,
"country" : "China",
"phrase" : "Multi-channelled coherent leverage",
"salary" : 180025
}
}
From the back data from above, we can see the fields included in each document. Whole index us contains only 4 documents, but it is enough to let us know each search. Less data sets can make us see more clearly.
"Match" query It is one of the most basic and most common inquiry in Elasticsearch, and it has played a role of full text. We can use this query to search for text, numbers or boolean values.
Let's search for words "Heuristic" included in the field named "phrase" in the documentation previously extracted.
POST employees/_search
{
"query": {
"match": {
"phrase": {
"query": "heuristic"
}
}
}
}
In the four documents in our index, only the word "heuristic" in the "phrase" field in 2 documents:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 0.6785375,
"hits" : [
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.6785375,
"_source" : {
"id" : 2,
"name" : "Othilia Cathel",
"email" : "[email protected]",
"gender" : "female",
"ip_address" : "3.164.153.228",
"date_of_birth" : "22/07/1987",
"company" : "Edgepulse",
"position" : "Structural Engineer",
"experience" : 11,
"country" : "China",
"phrase" : "Grass-roots heuristic help-desk",
"salary" : 193530
}
},
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "4",
"_score" : 0.62577873,
"_source" : {
"id" : 4,
"name" : "Alan Thomas",
"email" : "[email protected]",
"gender" : "male",
"ip_address" : "200.47.210.95",
"date_of_birth" : "11/12/1985",
"company" : "Yamaha",
"position" : "Resources Manager",
"experience" : 12,
"country" : "China",
"phrase" : "Emulation of roots heuristic coherent systems",
"salary" : 300000
}
}
]
}
}
What happens if we want to search multiple words? Let us search for "Heuristic Roots Help" using the same query we just implemented.
POST employees/_search
{
"query": {
"match": {
"phrase": {
"query": "heuristic roots help"
}
}
}
}
This will return the same document as previously, because by default, Elasticsearch uses the OR operator to process each word in the search query. In our example, the query will match any documents that contain "heuristic" or "roots" or "Help".
The OR operator applied to a preparation is the default behavior of Match, but we can use the "Operator" parameter that is passed with the "Match" query. We can use the "OR" or "AND" value to specify the Operator parameter.
Let us see what will happen when we provide operator parameters "and" in the same query before.
POST employees/_search
{
"query": {
"match": {
"phrase": {
"query": "heuristic roots help",
"operator": "AND"
}
}
}
}
Now the results will return only one document (document ID = 2) because this is the only document that contains all three search keywords in the "phrase" field.
Further, we can set a threshold for the minimum match that must be included in the document. For example, if we set this parameter to 1, the query will check at least one post-matching document.
Now, if we set the "minium_should_match" parameter to 3, all three words must appear in the document to be classified as a match.
In our example, the following query will return only one document (id = 2) because it is the only document that meets our conditions:
POST employees/_search
{
"query": {
"match": {
"phrase": {
"query" : "heuristic roots help",
"minimum_should_match": 3
}
}
}
}
So far, we have been dealing with matching items on a single field - that is, we search for keywords in a single field named "phrase". However, if we need to search for the keyword in multiple fields of the document? This is more matching (multi-match) Where the query is placed.
Let us try to search for examples of "Research" in the "Position" and "Phrase" fields included in the document.
POST employees/_search
{
"query": {
"multi_match": {
"query": "research help",
"fields": [
"position",
"phrase"
]
}
}
}
This will result in the following response:
{
"took" : 24,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.2613049,
"hits" : [
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.2613049,
"_source" : {
"id" : 1,
"name" : "Huntlee Dargavel",
"email" : "[email protected]",
"gender" : "male",
"ip_address" : "58.11.89.193",
"date_of_birth" : "11/09/1990",
"company" : "Talane",
"position" : "Research Associate",
"experience" : 7,
"country" : "China",
"phrase" : "Multi-channelled coherent leverage",
"salary" : 180025
}
},
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.1785964,
"_source" : {
"id" : 2,
"name" : "Othilia Cathel",
"email" : "[email protected]",
"gender" : "female",
"ip_address" : "3.164.153.228",
"date_of_birth" : "22/07/1987",
"company" : "Edgepulse",
"position" : "Structural Engineer",
"experience" : 11,
"country" : "China",
"phrase" : "Grass-roots heuristic help-desk",
"salary" : 193530
}
}
]
}
}
From the above results, we can see that any documents containing research or phrase fields that contain research or HELP will be searched.
Match_phrase It is another commonly used query, as indicated by its name, it matches the phrase in the field.
If we need to search phrase "roots heuristic coherent" in the "Phrase" field of employee index, we can use "match_phrase" query:
GET employees/_search
{
"query": {
"match_phrase": {
"phrase": {
"query": "roots heuristic coherent"
}
}
}
}
This will return a document with the exact phrase "roots heuristic coherent", including the order of words. In our example, we only have the results that meet the above conditions, as shown in the following response:
{
"took" : 23,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.877336,
"hits" : [
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "4",
"_score" : 1.877336,
"_source" : {
"id" : 4,
"name" : "Alan Thomas",
"email" : "[email protected]",
"gender" : "male",
"ip_address" : "200.47.210.95",
"date_of_birth" : "11/12/1985",
"company" : "Yamaha",
"position" : "Resources Manager",
"experience" : 12,
"country" : "China",
"phrase" : "Emulation of roots heuristic coherent systems",
"salary" : 300000
}
}
]
}
}
One useful feature we can use in the match_phrase query is "SLOP" parameter, which allows us to create a more flexible search.
Suppose we use the match_phrase query to search "roots coherent". We will not receive any documents returned from the employee index. This is because you want to match Match_phrase, these terms need to be arranged in an accurate order.
Now let's use the SLOP parameters and see what will happen:
GET employees/_search
{
"query": {
"match_phrase": {
"phrase": {
"query": "roots coherent",
"slop": 1
}
}
}
}
When Slop = 1, the query indicates that a word can be moved, so we will receive the following response. In the following response, you can see "Roots Coherent" matches the "Roots Heuristic Coherent" document. This is because the SLOP parameter allows you to skip 1 terms.
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.7873249,
"hits" : [
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "4",
"_score" : 0.7873249,
"_source" : {
"id" : 4,
"name" : "Alan Thomas",
"email" : "[email protected]",
"gender" : "male",
"ip_address" : "200.47.210.95",
"date_of_birth" : "11/12/1985",
"company" : "Yamaha",
"position" : "Resources Manager",
"experience" : 12,
"country" : "China",
"phrase" : "Emulation of roots heuristic coherent systems",
"salary" : 300000
}
}
]
}
}
match_phrase_prefix The query is similar to the match_phrase query, but the last word of the search key is treated here as a prefix, which is used to match any words starting with the prefix word.
First, let's insert a document in an index to better understand the match_phrase_prefix query:
PUT employees/_doc/5
{
"id": 4,
"name": "Jennifer Lawrence",
"email": "[email protected]",
"gender": "female",
"ip_address": "100.37.110.59",
"date_of_birth": "17/05/1995",
"company": "Monsnto",
"position": "Resources Manager",
"experience": 10,
"country": "Germany",
"phrase": "Emulation of roots heuristic complete systems",
"salary": 300000
}
Let us apply Match_phrase_prefix now:
GET employees/_search
{
"_source": [
"phrase"
],
"query": {
"match_phrase_prefix": {
"phrase": {
"query": "roots heuristic co"
}
}
}
}
In the following results, we can see a document with CoHERENT and Complete matches the query. We can also use the SLOP parameters in the "Match_phrase" query.
{
"took" : 61,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 3.0871696,
"hits" : [
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "4",
"_score" : 3.0871696,
"_source" : {
"phrase" : "Emulation of roots heuristic coherent systems"
}
},
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "5",
"_score" : 3.0871696,
"_source" : {
"phrase" : "Emulation of roots heuristic complete systems"
}
}
]
}
}
Note: "Match_phrase_query" attempts to match 50 extensions (CO in our example) (in our example), that is to say, 50 results containing CO. This can increase or decrease by specifying the "max_expansions" parameter.
GET employees/_search
{
"_source": [
"phrase"
],
"query": {
"match_phrase_prefix": {
"phrase": {
"query": "roots heuristic co",
"max_expansions": 1
}
}
}
}
For example, the above query will only return a document:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.805721,
"hits" : [
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "4",
"_score" : 1.805721,
"_source" : {
"phrase" : "Emulation of roots heuristic coherent systems"
}
}
]
}
}
Due to this prefix property and the easy-to-set attribute of the Match_phrase_prefix query, it is usually used to automatically complete the function.
Let us delete the document that just added ID = 5:
DELETE employees/_doc/5
The term query is used to query structured data, which is usually precise value.
This is the simplest term query. This query is full match for the field search search keyword in the document.
For example, if we use the term "gender" query to search "MALE", it will search for this word, even if there is case in case.
This can be proved by two queries:
GET employees/_search
{
"query": {
"term": {
"gender": {
"value": "female"
}
}
}
}
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 0.87546873,
"hits" : [
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.87546873,
"_source" : {
"id" : 2,
"name" : "Othilia Cathel",
"email" : "[email protected]",
"gender" : "female",
"ip_address" : "3.164.153.228",
"date_of_birth" : "22/07/1987",
"company" : "Edgepulse",
"position" : "Structural Engineer",
"experience" : 11,
"country" : "China",
"phrase" : "Grass-roots heuristic help-desk",
"salary" : 193530
}
},
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "5",
"_score" : 0.87546873,
"_source" : {
"id" : 4,
"name" : "Jennifer Lawrence",
"email" : "[email protected]",
"gender" : "female",
"ip_address" : "100.37.110.59",
"date_of_birth" : "17/05/1995",
"company" : "Monsnto",
"position" : "Resources Manager",
"experience" : 10,
"country" : "Germany",
"phrase" : "Emulation of roots heuristic complete systems",
"salary" : 300000
}
}
]
}
}
There are two results in the above search. If we do the following query:
GET employees/_search
{
"query": {
"term": {
"gender": {
"value": "Female"
}
}
}
}
On the top, we modified the female to Female, then we could not search any documents. In the above case, the only difference between the two queries is different from the case of search keywords. Case 1 is all lowercase, which is matched because the value of this field is saved by lowercase. But for the case 2, the search did not get any results, because there is no such token with the "Gender" field with uppercase "F".
We can also useterms queryPass the terminology that you want to search on the same field. Let's search for "female" and "Male" in the gender field. To do this, we can use the following Terms Query:
POST employees/_search
{
"query": {
"terms": {
"gender": [
"female",
"male"
]
}
}
}
The above query will return all four documents.
Sometimes the field does not have an index value, or there is no field in the document. In this case, it helps to identify such files and analyze the impact.
For example, let's index the following document to the "Employee" index
PUT employees/_doc/5
{
"id": 5,
"name": "Michael Bordon",
"email": "[email protected]",
"gender": "male",
"ip_address": "10.47.210.65",
"date_of_birth": "12/12/1995",
"position": "Resources Manager",
"experience": 12,
"country": null,
"phrase": "Emulation of roots heuristic coherent systems",
"salary": 300000
}
This document is not named "Company" field, and the value of the "Country" field is NULL.
Now, if we want to find a document for "Company", we can use the following exist:
GET employees/_search
{
"query": {
"exists": {
"field": "company"
}
}
}
The above query will list all documents with the "Company" field. The result of the above query will return 4 documents, and they all contain the Company field. The document of ID = 5 is not searched.
Perhaps the more useful solution is to list all documents that do not have the "Company" field. This can also be implemented by using the following queries:
GET employees/_search
{
"query": {
"bool": {
"must_not": [
{
"exists": {
"field": "company"
}
}
]
}
}
}
The BOOL query will be described in detail in the following sections. The above query will only return the document of ID = 5:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.0,
"hits" : [
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "5",
"_score" : 0.0,
"_source" : {
"id" : 5,
"name" : "Michael Bordon",
"email" : "[email protected]",
"gender" : "male",
"ip_address" : "10.47.210.65",
"date_of_birth" : "12/12/1995",
"position" : "Resources Manager",
"experience" : 12,
"country" : null,
"phrase" : "Emulation of roots heuristic coherent systems",
"salary" : 300000
}
}
]
}
}
Let us remove the inserted documentation from the index, for convenience and unity, by typing the following request:
DELETE employees/_doc/5
Another most common query in the world of Elasticsearch is a range query. The range query allows us to get a document containing terms within a specified range. The range query is a query of the term level (indicating structured data), which can be used in numerical fields, date fields, and more.
For example, in the data set we created, if we need to filter out the expenence level between 5 and 10 years, we can apply the following range query:
POST employees/_search
{
"query": {
"range": {
"experience": {
"gte": 5,
"lte": 10
}
}
}
}
What is GTE, GT, LT and LT?
The result of the above query is:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"id" : 1,
"name" : "Huntlee Dargavel",
"email" : "[email protected]",
"gender" : "male",
"ip_address" : "58.11.89.193",
"date_of_birth" : "11/09/1990",
"company" : "Talane",
"position" : "Research Associate",
"experience" : 7,
"country" : "China",
"phrase" : "Multi-channelled coherent leverage",
"salary" : 180025
}
}
]
}
}
That is, only one document between Experience is between 5 and 10.
Similarly, the range query can also be applied to the date field. If we need to find someone born after 1986, we can send a query as shown below:
GET employees/_search
{
"query": {
"range" : {
"date_of_birth" : {
"gte" : "01/01/1986"
}
}
}
}
This will get a document with the Date_Of_birth field only after 1986.
IDS query is a relatively few query, but it is one of the most useful queries, so it is eligible to in this list. Sometimes we need to retrieve documents according to the ID of the document. This can be implemented using a single GET request, as follows:
GET indexname/_doc/documentId
If an ID can only get a document, this may be a good solution, but what if we have more documents?
This is where IDS queries are very convenient. With IDS queries, we can do this in a single request.
In the following example, we get the documentation of 1 and 4 from the Employees index through a single request.
POST employees/_search
{
"query": {
"ids" : {
"values" : ["1", "4"]
}
}
}
The above query will return:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"id" : 1,
"name" : "Huntlee Dargavel",
"email" : "[email protected]",
"gender" : "male",
"ip_address" : "58.11.89.193",
"date_of_birth" : "11/09/1990",
"company" : "Talane",
"position" : "Research Associate",
"experience" : 7,
"country" : "China",
"phrase" : "Multi-channelled coherent leverage",
"salary" : 180025
}
},
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "4",
"_score" : 1.0,
"_source" : {
"id" : 4,
"name" : "Alan Thomas",
"email" : "[email protected]",
"gender" : "male",
"ip_address" : "200.47.210.95",
"date_of_birth" : "11/12/1985",
"company" : "Yamaha",
"position" : "Resources Manager",
"experience" : 12,
"country" : "China",
"phrase" : "Emulation of roots heuristic coherent systems",
"salary" : 300000
}
}
]
}
}
Prefix query (Prefix query) Used to obtain documentation containing a given search string as a specified field prefix.
Suppose we need to get all documents containing "Al" as a prefix in the "Name" field, then we can use the prefix query as follows:
GET employees/_search
{
"query": {
"prefix": {
"name": "al"
}
}
}
This will result in the following response:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "4",
"_score" : 1.0,
"_source" : {
"id" : 4,
"name" : "Alan Thomas",
"email" : "[email protected]",
"gender" : "male",
"ip_address" : "200.47.210.95",
"date_of_birth" : "11/12/1985",
"company" : "Yamaha",
"position" : "Resources Manager",
"experience" : 12,
"country" : "China",
"phrase" : "Emulation of roots heuristic coherent systems",
"salary" : 300000
}
}
]
}
}
Since the prefix query is a term query, it will pass the search string as it is. That is to search for "al" and "al" are different. If we search for "Al" in the example above, we will get 0 results because there is no token that starts with "Al" in the reverse index of the "Name" field. However, if we query the "name.keyword" field, use "Al" we will get the above results, in which case "Al" will cause zero.
This is also called wildcard query (wildcard query. A document with a term that matches the given wildcard mode is obtained. For example, let's search "C * A" on the field "country" to search "C * A" as shown below:
GET employees/_search
{
"query": {
"wildcard": {
"country": {
"value": "c*a"
}
}
}
}
The above query will get a document that begins with "C" and ending with "A" (for example: China, Canada, Cambodia, etc.).
Here * operators can match zero or more characters.
this isRegular query. This is similar to the "wildcard" query we see above, but will accept the regular expression as an input and acquire a matching document.
GET employees/_search
{
"query": {
"regexp": {
"position": "res[a-z]*"
}
}
}
The above query will result in a document that matches the word of the regular expression RES [A-Z] *.
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"id" : 1,
"name" : "Huntlee Dargavel",
"email" : "[email protected]",
"gender" : "male",
"ip_address" : "58.11.89.193",
"date_of_birth" : "11/09/1990",
"company" : "Talane",
"position" : "Research Associate",
"experience" : 7,
"country" : "China",
"phrase" : "Multi-channelled coherent leverage",
"salary" : 180025
}
},
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"id" : 3,
"name" : "Winston Waren",
"email" : "[email protected]",
"gender" : "male",
"ip_address" : "202.37.210.94",
"date_of_birth" : "10/11/1985",
"company" : "Yozio",
"position" : "Human Resources Manager",
"experience" : 12,
"country" : "China",
"phrase" : "Versatile object-oriented emulation",
"salary" : 50616
}
},
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "4",
"_score" : 1.0,
"_source" : {
"id" : 4,
"name" : "Alan Thomas",
"email" : "[email protected]",
"gender" : "male",
"ip_address" : "200.47.210.95",
"date_of_birth" : "11/12/1985",
"company" : "Yamaha",
"position" : "Resources Manager",
"experience" : 12,
"country" : "China",
"phrase" : "Emulation of roots heuristic coherent systems",
"salary" : 300000
}
}
]
}
}
Fuzzy queries can be used to return documents that contain words similar to search terms. This is especially useful when dealing with spelling errors. Even if we use a fuzzy query to search "chnia" instead of "China", we can also get results.
Let us look at an example:
GET employees/_search
{
"query": {
"fuzzy": {
"country": {
"value": "Chnia",
"fuzziness": "2"
}
}
}
}
The ambiguity here is the maximum editing distance that matches allowed. We can use the parameters such as "Max_expansions" as seen in the "Match_phrase" query. More related documents can behereturn up
Fuzzy queries can also appear together with the "Match" query type. The following example shows the ambiguitability used in the Multi_Match query:
POST employees/_search
{
"query": {
"multi_match": {
"query": "heursitic reserch",
"fields": [
"phrase",
"position"
],
"fuzziness": 2
}
},
"size": 10
}
Although there is a spelling error in the query, the above query will still return a document that matches "Heuristic" or "Research".
At the time of the query, first get a more popular result will usually help. The easiest way to do this is called Boosting in Elasticsearch. This will send it in a manner when we query multiple fields. For example, consider the following query:
POST employees/_search
{
"query": {
"multi_match" : {
"query" : "versatile Engineer",
"fields": ["position^3", "phrase"]
}
}
}
This will return a document that matches the "Position" field located at the top of the response instead of the document that matches the "phrase" field.

ElasticSearch returns a document according to the descending order of the "_score" field when the search request is specified. This "_score" is calculated based on the degree of query matching of the default score method using Elasticsearch. In all examples we discussed above, you can see the same behavior in the results.
Only when we use the "Filter" context, the score will not be calculated to return the result faster.

ElasticSearch provides us with a field-based option. For example, let us need to sort employees according to the employee's experience. We can use the following query that enabled sort options:
GET employees/_search
{
"_source": [
"name",
"experience",
"salary"
],
"sort": [
{
"experience": {
"order": "desc"
}
}
]
}
The results of the above query are as follows:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "3",
"_score" : null,
"_source" : {
"name" : "Winston Waren",
"experience" : 12,
"salary" : 50616
},
"sort" : [
12
]
},
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "4",
"_score" : null,
"_source" : {
"name" : "Alan Thomas",
"experience" : 12,
"salary" : 300000
},
"sort" : [
12
]
},
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "2",
"_score" : null,
"_source" : {
"name" : "Othilia Cathel",
"experience" : 11,
"salary" : 193530
},
"sort" : [
11
]
},
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "1",
"_score" : null,
"_source" : {
"name" : "Huntlee Dargavel",
"experience" : 7,
"salary" : 180025
},
"sort" : [
7
]
}
]
}
}
As can be seen from the above response, the result is based on the descending order of the employee experience. In addition, two employees have the same level of empirical levels.
In the above example, we see that there are two empirical levels of employees, all 12, but we need to sort it again according to the order of salary. We can also provide multiple fields to sort, as shown in the following query:
GET employees/_search?filter_path=**.hits
{
"_source": [
"name",
"experience",
"salary"
],
"sort": [
{
"experience": {
"order": "desc"
}
},
{
"salary": {
"order": "desc"
}
}
]
}
Now we get the following results:
{
"hits" : {
"hits" : [
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "4",
"_score" : null,
"_source" : {
"name" : "Alan Thomas",
"experience" : 12,
"salary" : 300000
},
"sort" : [
12,
300000
]
},
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "3",
"_score" : null,
"_source" : {
"name" : "Winston Waren",
"experience" : 12,
"salary" : 50616
},
"sort" : [
12,
50616
]
},
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "2",
"_score" : null,
"_source" : {
"name" : "Othilia Cathel",
"experience" : 11,
"salary" : 193530
},
"sort" : [
11,
193530
]
},
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "1",
"_score" : null,
"_source" : {
"name" : "Huntlee Dargavel",
"experience" : 7,
"salary" : 180025
},
"sort" : [
7,
180025
]
}
]
}
}
In the above results, you can see that in the employee with the same experience level, the highest salary is in advance (Alan and Winston's experience, but the previous search results are different, here Alan Rank is enhanced because of his higher salary).
Note: If we change the order of the sort parameters in the sort array, let's keep the "Salary" parameter, then retain the "experience" parameters, then the search results will change. The result will first sort according to the salary parameters, and the empirical parameters will be considered without affecting the sort based on salary.
Let us reverse the order of the above queries, that is, first keep "Saliff", then "Experience", as shown below:
GET employees/_search?filter_path=**.hits
{
"_source": [
"name",
"experience",
"salary"
],
"sort": [
{
"salary": {
"order": "desc"
}
},
{
"experience": {
"order": "desc"
}
}
]
}
The results are as follows:
{
"hits" : {
"hits" : [
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "4",
"_score" : null,
"_source" : {
"name" : "Alan Thomas",
"experience" : 12,
"salary" : 300000
},
"sort" : [
300000,
12
]
},
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "2",
"_score" : null,
"_source" : {
"name" : "Othilia Cathel",
"experience" : 11,
"salary" : 193530
},
"sort" : [
193530,
11
]
},
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "1",
"_score" : null,
"_source" : {
"name" : "Huntlee Dargavel",
"experience" : 7,
"salary" : 180025
},
"sort" : [
180025,
7
]
},
{
"_index" : "employees",
"_type" : "_doc",
"_id" : "3",
"_score" : null,
"_source" : {
"name" : "Winston Waren",
"experience" : 12,
"salary" : 50616
},
"sort" : [
50616,
12
]
}
]
}
}
You can see the candidate of the experience value 12 below the experience value of 7, because the salary of the latter is higher than the former.
continue reading "Elasticsearch: Elasticsearch Query Sample - Hand practice (2)”
Foreword ES as a search engine, one of the important features is the blur search of the content, we often see the words searched by web search, the results of the searched words, which is basically th...
Python operation Elasticsearch insert Inquire Traverside renew New field update Add IS_DOWNLOAD field...
demand Based on elasticsearch-based relevance queries, the returned results are often not what we want. The accuracy rate does not meet the requirements. How can I make the search engine return the re...
Article Directory Build index library Add data (document) Update data (document) Delete data (document) Get paging data Search by keywords Build index library interface address: Request content: Add d...
Prepare data Simple query Combined query...
As the core code...
First, first knowledge database The database is to save a large amount of data, which can be efficiently accessed by computer processing. This data set is called database (Database, DB).Computer syste...
This is the button control provided by Win operating system Common design attributes: text and available state Common events: Click Common code attributes: ...
Xpath study notes Introduction to xpath Xpath is used to navigate through elements and attributes in an XML document Xpath uses path expressions to pick nodes or sets of nodes in XML or HTML. Xpath no...