4. Getting started with Elasticsearch: Index some documents

产品与码农 2020-04-04

  • Install the cluster

  • Once you have a cluster up and running, you're ready to index some data. Elasticsearch offers a variety of ingest options, but in the end they all do the same thing: put JSON documents into an Elasticsearch index.

  • You can do this directly with a simple PUT request that specifies the index you want to add the document to, a unique document ID, and one or more "field": "value" pairs in the request body:

PUT /customer/_doc/1
{
  "name": "zhang san"
}

  • This request automatically creates the customer index if it doesn't already exist, adds a new document with an ID of 1, and stores and indexes the name field.

  • Because this is a new document, the index response reports that version 1 of the document was created. Retrieving the document afterwards with GET /customer/_doc/1 returns it along with its metadata:

{
  "_index" : "customer",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 26,
  "_primary_term" : 4,
  "found" : true,
  "_source" : {
    "name": "zhang san"
  }
}
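
For readers who prefer to issue the PUT request from a script, here is a minimal Python sketch using only the standard library. It assumes an unsecured cluster at http://localhost:9200 (the host and port match the curl command later in this article; the http scheme and the absence of authentication are assumptions). build_index_request is a hypothetical helper name, not part of any Elasticsearch client. The request is only built here; the commented lines show how it could be sent.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc_id, document):
    """Build (but do not send) a PUT request that indexes one document."""
    url = f"{base_url}/{index}/_doc/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Assumed address of a local, unsecured cluster.
request = build_index_request(
    "http://localhost:9200", "customer", 1, {"name": "zhang san"}
)
print(request.get_method(), request.full_url)

# To send it against a running cluster:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes it easy to inspect the URL and body before anything touches the cluster.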

Indexing documents in bulk
  • If you have a lot of documents to index, you can submit them in batches with the bulk API. Batching document operations with bulk is significantly faster than submitting requests individually because it minimizes network round trips.
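
The bulk API expects a newline-delimited JSON body in which each document is preceded by an action line, and the body must end with a newline. The sketch below shows one way to assemble such a body in Python. build_bulk_body and its id_field parameter are hypothetical names chosen for illustration, and the second sample record is made up.

```python
import json


def build_bulk_body(documents, id_field=None):
    """Return an NDJSON bulk body: one action line plus one source line per document."""
    lines = []
    for document in documents:
        action = {"index": {}}
        if id_field is not None:
            # Reuse a field of the document as its ID (an illustrative choice).
            action["index"]["_id"] = str(document[id_field])
        lines.append(json.dumps(action))
        lines.append(json.dumps(document))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


accounts = [
    {"account_number": 0, "firstname": "Bradshaw", "lastname": "Mckenzie"},
    {"account_number": 1, "firstname": "Example", "lastname": "User"},  # made-up record
]
bulk_body = build_bulk_body(accounts, id_field="account_number")
print(bulk_body, end="")
```

The resulting string can be posted to an index's _bulk endpoint, in the same way the curl command in this article posts the contents of accounts.json.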

  • The optimal batch size depends on a number of factors: document size and complexity, the indexing and search load, and the resources available to your cluster. A good starting point is batches of 1,000 to 5,000 documents with a total payload between 5MB and 15MB. From there, you can experiment to find the sweet spot.
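
One way to apply both limits at once is to split the documents so that each batch stays under a document count and a payload size. The following Python sketch is an illustration under stated assumptions: iter_batches is a hypothetical helper, the defaults (1,000 documents, 5MB) are simply the low end of the ranges above, and the size estimate counts only the serialized documents, not the bulk action lines.

```python
import json


def iter_batches(documents, max_docs=1000, max_bytes=5 * 1024 * 1024):
    """Yield lists of documents bounded by a document count and a serialized size."""
    batch, batch_bytes = [], 0
    for document in documents:
        doc_bytes = len(json.dumps(document).encode("utf-8"))
        # Start a new batch if adding this document would exceed either limit.
        if batch and (len(batch) >= max_docs or batch_bytes + doc_bytes > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(document)
        batch_bytes += doc_bytes
    if batch:
        yield batch


# Synthetic documents, used only to show the splitting behavior.
documents = [{"account_number": n, "balance": 1000 + n} for n in range(2500)]
batch_sizes = [len(batch) for batch in iter_batches(documents, max_docs=1000)]
print(batch_sizes)  # [1000, 1000, 500]
```

A single document larger than max_bytes still gets a batch of its own, so nothing is dropped. Tuning max_docs and max_bytes against your own cluster is the experiment the paragraph above recommends.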

  • To get some data into Elasticsearch that you can start searching and analyzing:

    • Download the accounts.json sample data set. The documents in this randomly generated data set represent user accounts with the following information:

    {
      "account_number": 0,
      "balance": 16623,
      "firstname": "Bradshaw",
      "lastname": "Mckenzie",
      "age": 29,
      "gender": "F",
      "address": "244 Columbus Place",
      "employer": "Euron",
      "email": "bradshawmckenzie@euron.com",
      "city": "Hobucken",
      "state": "CO"
    }

    • Index the account data into the bank index with the following _bulk request, then list the indices to check the result:

    curl -H "Content-Type: application/json" -XPOST "localhost:9200/bank/_bulk?pretty&refresh" --data-binary "@accounts.json"
    curl "localhost:9200/_cat/indices?v"

    The response indicates that 1,000 documents were indexed successfully.

    health status index uuid                   pri rep docs.count docs.deleted store.size pri.store.size
    yellow open   bank  l7sSYV2cQXmu6_4rJWVIww   5   1       1000            0    128.6kb        128.6kb
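
To check the document count from a script rather than by eye, the tabular output of the _cat/indices?v request can be parsed. Below is a minimal Python sketch, assuming the output has a header row and no empty cells, as in the sample from this article. parse_cat_table is a hypothetical helper; for anything beyond a quick check, requesting JSON from the cat API with format=json is likely more robust.

```python
def parse_cat_table(text):
    """Parse whitespace-separated `_cat` output with a header row into dicts."""
    rows = [line.split() for line in text.strip().splitlines() if line.strip()]
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


# Sample output copied from this article.
cat_output = """
health status index uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   bank  l7sSYV2cQXmu6_4rJWVIww   5   1       1000            0    128.6kb        128.6kb
"""

indices = {row["index"]: row for row in parse_cat_table(cat_output)}
print(indices["bank"]["docs.count"])  # 1000
```

All values come back as strings, so convert with int() before comparing counts.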


