Index Data from a Relational Database into Elasticsearch

In my previous article, we showed a straight forward way to index data from the relational database into Elasticsearch. In this article we will show you:

Create mappings for the index
Understand the SQL syntax for creating nested objects and nested object arrays in the elasticsearch

Create mappings for the index

We would better know of the data and its structure than Elasticsearch. It’s good to start with the mapping structure created by Elasticsearch. But over time when we understand the index structure, we can create our own mapping and specify it while creating the index.

The complete index definition for the world index can be found here. It includes the index properties and the mappings structure of the data to be indexed. The syntax for the index definition is:

{
    "aliases": {},
    "mappings": {},
    "settings": {},
    "warmers": {}
}

where,

aliases – the aliases created for this index
mappings – the structure of the data in the index
settings – the settings for the index like the number of shards and the number of replicas for the index.
warmers – any index warmers. This has been removed in Elasticsearch 5.3

Let us focus on mappings. There are three nested properties in the mapping: capital, language and cities. The cities attribute is an array of objects. We differentiate its type by defining it as nested.

We will create a new index world_v2 by sending a POST request with the index definition available here as the request body.

Why do we need to declare it’s type as nested?

This is because declaring it as nested will allow us to filter the data on these nested attributes. Elasticsearch will flatten these nested objects while indexing and there by allowing us to filter on the same.

SQL syntax for creating nested object

In the SQL query we have created column alias in the query of the form: “object.property_name” and “object[property_name]“. For example “capital.name“, “cities[name]“. The “.” notation is used for nested object and the “[]” notation is used for array of nested object.

This is the syntax supported by the elasitcsearch-jdbc importer we are using.

Indexing data into new index

After creating the index, the index structure as shown by: http://localhost:9200/world_v2 looks like:

Update world-importer-config.json index property to world_v2 or the index name you have provided. And then run world-importer.bat or world-importer.sh

In our subsequent articles, we will see writing simple queries, aggregate queries, filtering on nested objects and lot more on this indexed data.

	Book Review: Digital… on Book Review: Deep Work
	Colin Bowern on The Great Indian Developer Sum…
	Martin Goodnews on Book Review: Atomic Habits by…
	narendra enamala on How to show links in ADF Messa…
	Emadul islam on Book Review: Atomic Habits by…

Index Data from a Relational Database into Elasticsearch – 2

Create mappings for the index

Why do we need to declare it’s type as nested?

SQL syntax for creating nested object

Like this:

Related

3 thoughts on “Index Data from a Relational Database into Elasticsearch – 2”

Leave a ReplyCancel reply

Create mappings for the index

Why do we need to declare it’s type as nested?

SQL syntax for creating nested object

Share this:

Like this:

Related

3 thoughts on “Index Data from a Relational Database into Elasticsearch – 2”

Leave a ReplyCancel reply

Discover more from Experiences Unlimited