HEX
Server: Apache/2.4.41 (Ubuntu)
System: Linux ip-172-31-42-149 5.15.0-1084-aws #91~20.04.1-Ubuntu SMP Fri May 2 07:00:04 UTC 2025 aarch64
User: ubuntu (1000)
PHP: 7.4.33
Disabled: pcntl_alarm,pcntl_fork,pcntl_waitpid,pcntl_wait,pcntl_wifexited,pcntl_wifstopped,pcntl_wifsignaled,pcntl_wifcontinued,pcntl_wexitstatus,pcntl_wtermsig,pcntl_wstopsig,pcntl_signal,pcntl_signal_get_handler,pcntl_signal_dispatch,pcntl_get_last_error,pcntl_strerror,pcntl_sigprocmask,pcntl_sigwaitinfo,pcntl_sigtimedwait,pcntl_exec,pcntl_getpriority,pcntl_setpriority,pcntl_async_signals,pcntl_unshare,
Upload Files
File: /var/www/vhost/disk-apps/magento.bikenow.co/vendor/elasticsearch/elasticsearch/docs/crud.asciidoc
[[indexing_documents]]
== Indexing documents

IMPORTANT: Please note that mapping types will disappear from {es}, read more 
{ref-7x}/removal-of-types.html[here]. If you migrated types from {es} 6 to 7, 
you can address these with the `type` param.

When you add documents to {es}, you index JSON documents. This maps naturally to 
PHP associative arrays, since they can easily be encoded in JSON. Therefore, in 
Elasticsearch-PHP you create and pass associative arrays to the client for 
indexing. There are several methods of ingesting data into {es} which we cover 
here.


=== Single document indexing

When indexing a document, you can either provide an ID or let {es} generate one 
for you.

{zwsp} +

.Providing an ID value
[source,php]
----
$params = [
    'index' => 'my_index',
    'id'    => 'my_id',
    'body'  => [ 'testField' => 'abc']
];

// Document will be indexed to my_index/_doc/my_id
$response = $client->index($params);
----
{zwsp} +

.Omitting an ID value
[source,php]
----
$params = [
    'index' => 'my_index',
    'body'  => [ 'testField' => 'abc']
];

// Document will be indexed to my_index/_doc/<autogenerated ID>
$response = $client->index($params);
----
{zwsp} +

If you need to set other parameters, such as a `routing` value, you specify 
those in the array alongside the `index`, and others. For example, let's set the 
routing and timestamp of this new document:

.Additional parameters
[source,php]
----
$params = [
    'index'     => 'my_index',
    'id'        => 'my_id',
    'routing'   => 'company_xyz',
    'timestamp' => strtotime("-1d"),
    'body'      => [ 'testField' => 'abc']
];


$response = $client->index($params);
----
{zwsp} +


=== Bulk Indexing

{es} also supports bulk indexing of documents. The bulk API expects JSON 
action/metadata pairs, separated by newlines. When constructing your documents 
in PHP, the process is similar. You first create an action array object (for 
example, an `index` object), then you create a document body object. This 
process repeats for all your documents.

A simple example might look like this:

.Bulk indexing with PHP arrays
[source,php]
----
for($i = 0; $i < 100; $i++) {
    $params['body'][] = [
        'index' => [
            '_index' => 'my_index',
	    ]
    ];

    $params['body'][] = [
        'my_field'     => 'my_value',
        'second_field' => 'some more values'
    ];
}

$responses = $client->bulk($params);
----

In practice, you'll likely have more documents than you want to send in a single 
bulk request. In that case, you need to batch up the requests and periodically 
send them:

.Bulk indexing with batches
[source,php]
----
$params = ['body' => []];

for ($i = 1; $i <= 1234567; $i++) {
    $params['body'][] = [
        'index' => [
            '_index' => 'my_index',
            '_id'    => $i
        ]
    ];

    $params['body'][] = [
        'my_field'     => 'my_value',
        'second_field' => 'some more values'
    ];

    // Every 1000 documents stop and send the bulk request
    if ($i % 1000 == 0) {
        $responses = $client->bulk($params);

        // erase the old bulk request
        $params = ['body' => []];

        // unset the bulk response when you are done to save memory
        unset($responses);
    }
}

// Send the last batch if it exists
if (!empty($params['body'])) {
    $responses = $client->bulk($params);
}
----

[[getting_documents]]
== Getting documents

{es} provides realtime GETs of documents. This means that as soon as the 
document is indexed and your client receives an acknowledgement, you can 
immediately retrieve the document from any shard. Get operations are performed 
by requesting a document by its full `index/type/id` path:

[source,php]
----
$params = [
    'index' => 'my_index',
    'id'    => 'my_id'
];

// Get doc at /my_index/_doc/my_id
$response = $client->get($params);
----
{zwsp} +

[[updating_documents]]
== Updating documents

Updating a document allows you to either completely replace the contents of the 
existing document, or perform a partial update to just some fields (either 
changing an existing field or adding new fields).

=== Partial document update

If you want to partially update a document (for example, change an existing 
field or add a new one) you can do so by specifying the `doc` in the `body` 
parameter. This merges the fields in `doc` with the existing document.


[source,php]
----
$params = [
    'index' => 'my_index',
    'id'    => 'my_id',
    'body'  => [
        'doc' => [
            'new_field' => 'abc'
        ]
    ]
];

// Update doc at /my_index/_doc/my_id
$response = $client->update($params);
----
{zwsp} +


=== Scripted document update

Sometimes you need to perform a scripted update, such as incrementing a counter 
or appending a new value to an array. To perform a scripted update, you need to 
provide a script and usually a set of parameters:

[source,php]
----
$params = [
    'index' => 'my_index',
    'id'    => 'my_id',
    'body'  => [
        'script' => 'ctx._source.counter += count',
        'params' => [
            'count' => 4
        ]
    ]
];

$response = $client->update($params);
----
{zwsp} +


=== Upserts

Upserts are "Update or Insert" operations. This means an upsert attempts to run 
your update script, but if the document does not exist (or the field you are 
trying to update doesn't exist), default values are inserted instead.

[source,php]
----
$params = [
    'index' => 'my_index',
    'id'    => 'my_id',
    'body'  => [
        'script' => [
            'source' => 'ctx._source.counter += params.count',
            'params' => [
                'count' => 4
            ],
        ],
        'upsert' => [
            'counter' => 1
        ],
    ]
];

$response = $client->update($params);
----
{zwsp} +


[[deleting_documents]]
== Deleting documents

Finally, you can delete documents by specifying their full `/index/_doc_/id` 
path:

[source,php]
----
$params = [
    'index' => 'my_index',
    'id'    => 'my_id'
];

// Delete doc at /my_index/_doc_/my_id
$response = $client->delete($params);
----
{zwsp} +