
Elasticsearch 06: Scripting and Write Internals

大数据全家桶 2021-09-17

Script

-- Using Painless scripts



Decrement a value by one as a safe operation (the example decrements price)

    POST produce/_update/1
    {
    "script": {
    "source": "ctx._source.price-=1"
    }
    }




    # Decrement by one, safe (shorthand form)
    POST produce/_update/1
    {"script": "ctx._source.price-=1"}




Query all prices

      GET produce/_search
      {
      "script_fields": {
      "test_field": {
      "script": {
      "lang": "painless",
      "source": "doc['price'].value"
      }
      }
      }
      }



Append an element to an array

        POST product/_update/2
        {
        "script": {
        "lang": "painless",
        "source": "ctx._source.tags.add('无线充电')"
        }
        }



Append an element to an array, passing the value as a parameter

          POST product/_update/2
          {
          "script": {
          "lang": "painless",
          "source": "ctx._source.tags.add(params.tag_name)",
          "params": {
          "tag_name": "新增"
          }
          }
          }



Delete a document

            POST product/_update/3
            {
            "script": {
            "lang": "painless",
            "source": "ctx.op='delete'"
            }
            }



Upsert: if the document exists, run a partial update; if it does not exist, create it

              POST produce/_update/2
              {
              "script": {
              "source": "ctx._source.price += params.param1",
              "lang": "painless",
              "params": {
              "param1": 100
              }
              },
              "upsert": {
              "name": "xiaoke",
              "price": 2000
              }
              }
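
The upsert behaviour described above can be sketched outside Elasticsearch. This is a toy model under stated assumptions, not the server's implementation: `scripted_upsert`, `add_100`, and the `store` dictionary are hypothetical names introduced here for illustration. It assumes the default behaviour, where the script is skipped when the document is missing and the `upsert` body is indexed as-is.

```python
def scripted_upsert(store, doc_id, script, upsert):
    """Toy model of an _update request carrying both a script and an upsert body."""
    if doc_id in store:
        # The document exists: run the script against its _source.
        script(store[doc_id])
        return "updated"
    # The document is missing: index the upsert body unchanged; the script is not run.
    store[doc_id] = dict(upsert)
    return "created"


def add_100(source):
    # Stand-in for "ctx._source.price += params.param1" with param1 = 100.
    source["price"] += 100


store = {}
first = scripted_upsert(store, "2", add_100, {"name": "xiaoke", "price": 2000})
second = scripted_upsert(store, "2", add_100, {"name": "xiaoke", "price": 2000})
print(first, second, store["2"]["price"])
```

The first call creates the document with price 2000, and only the second call applies the increment.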



Updating with _bulk

                POST _bulk
                { "update" : { "_id" : "5", "_index" : "produce", "retry_on_conflict" : 3} }
                { "script" : { "source": "ctx._source.price += params.param1", "lang" : "painless", "params" : {"param1" : 100}}, "upsert" : {"price" : 1999}}
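
When the same request is sent from a program rather than the console, the body must be newline-delimited JSON: one action line, one payload line, and a trailing newline. A minimal sketch of building that body with the standard library follows. `build_bulk_body` is a hypothetical helper, and sending the request is left out on purpose.

```python
import json


def build_bulk_body(actions):
    """Serialize (action, payload) pairs as newline-delimited JSON for _bulk."""
    lines = []
    for action, payload in actions:
        lines.append(json.dumps(action))
        lines.append(json.dumps(payload))
    # The _bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


actions = [
    (
        {"update": {"_id": "5", "_index": "produce", "retry_on_conflict": 3}},
        {
            "script": {
                "source": "ctx._source.price += params.param1",
                "lang": "painless",
                "params": {"param1": 100},
            },
            "upsert": {"price": 1999},
        },
    )
]
body = build_bulk_body(actions)
print(body, end="")
```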






Price at a 20% discount (80% of the original)

                  GET produce/_search
                  {
                  "script_fields": {
                  "discount_price": {
                  "script": {
                  "lang": "painless",
                  "source": "doc['price'].value * params.discount",
                  "params": {
                  "discount" : 0.8
                  }
                  }
                  }
                  }
                  }

The same query, using dot notation for the field (doc.price instead of doc['price'])




                  GET produce/_search
                  {
                  "script_fields": {
                  "discount_price": {
                  "script": {
                  "lang": "painless",
                  "source": "doc.price.value * params.discount",
                  "params": {
                  "discount": 0.8
                  }
                  }
                  }
                  }
                  }



Original price plus several discounted prices

                    GET produce/_search
                    {
                    "script_fields": {
                    "init_price": {
                    "script": {
                    "lang": "painless",
                    "source": "doc['price'].value"
                    }
                    },
                    "discount_price":{
                    "script":{
                    "lang": "painless",
                    "source": "[doc['price'].value * params.discount_8,doc['price'].value * params.discount_7,doc['price'].value * params.discount_6,doc['price'].value * params.discount_5]",
                    "params": {
                    "discount_8": 0.8,
                    "discount_7": 0.7,
                    "discount_6": 0.6,
                    "discount_5": 0.5
                    }
                    }
                    }
                    }
                    }



Querying with stored (cached) scripts

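                      # _scripts/{id}: a stored script, similar to a stored procedure; this one computes a discount. Stored scripts are visible to the whole cluster.
                      # Compiled scripts are cached. The cache limit is a number of scripts (100 by default in older 7.x releases) and entries do not expire by default. script.cache.expire sets an expiry time and script.cache.max_size sets the cache size. A script's source is limited to 65,535 bytes by default (script.max_size_in_bytes).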






                      # Create
                      POST _scripts/calculate-discount
                      {
                      "script": {
                      "lang": "painless",
                      "source": "doc['price'].value * params.discount"
                      }
                      }


                      # View
                      GET _scripts/calculate-discount


                      # Use
                      GET produce/_search
                      {
                      "script_fields": {
                      "discount_price": {
                      "script": {
                      "id":"calculate-discount",
                      "params": {
                      "discount": 0.8
                      }
                      }
                      }
                      }
                      }




                      # Delete
                      DELETE _scripts/calculate-discount



                      Date

                        PUT one/_doc/2
                        {
                        "name": "xiaoke",
                        "createtime": "2021-09-06"
                        }




                        # Available accessors include year, month, dayOfMonth, dayOfWeek, dayOfYear, hour, minute, second, nano
                        GET one/_search
                        {
                        "script_fields": {
                        "test_year": {
                        "script": {
                        "source": "doc.createtime.value.year"
                        }
                        }
                        }
                        }



A more complex operation: updating several fields at once

                          POST produce/_update/1
                          {
                          "script": {
                          "lang": "painless",
                          "source": """
                          ctx._source.name += params.name;
                          ctx._source.price -= params.price;
                          """,
                          "params": {
                          "name": "无线充电",
                          "price": 1
                          }
                          }
                          }




Regex matching; enable it first with script.painless.regex.enabled: true (set in elasticsearch.yml)

                            POST produce/_update/1
                            {
                            "script": {
                            "lang": "painless",
                            "source": """
                            if (ctx._source.name =~ /[\s\S]*phone[\s\S]*/) {
                            ctx._source.name += "***|";
                            } else {
                            ctx.op = "noop";
                            }
                            """
                            }
                            }




                            POST one/_update/1
                            {
                            "script": {
                            "lang": "painless",
                            "source": """
                            if (ctx._source.createtime ==~ /[0-9]{4}-[0-9]{2}-[0-9]{2}/) {
                            ctx._source.name += "|***";
                            } else {
                            ctx.op = "noop";
                            }
                            """
                            }
                            }
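
The two scripts above use different operators: `=~` is the Painless find operator (the pattern may occur anywhere in the string) and `==~` is the match operator (the whole string must match). A rough Python analogue is sketched here. `find_op` and `match_op` are hypothetical helper names, and Python's `re` module stands in for the Java regex engine that Painless uses, so edge cases may differ.

```python
import re


def find_op(text, pattern):
    """Rough analogue of Painless `=~`: true if the pattern occurs anywhere."""
    return re.search(pattern, text) is not None


def match_op(text, pattern):
    """Rough analogue of Painless `==~`: true only if the whole string matches."""
    return re.fullmatch(pattern, text) is not None


date_pattern = r"[0-9]{4}-[0-9]{2}-[0-9]{2}"
found_in_sentence = find_op("created 2021-09-06", date_pattern)
sentence_matches = match_op("created 2021-09-06", date_pattern)
date_matches = match_op("2021-09-06", date_pattern)
print(found_in_sentence, sentence_matches, date_matches)
```

Because `=~` already searches anywhere in the string, the `[\s\S]*` wrappers around `phone` are probably not needed with that operator.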



Count the total number of tags across all products priced below 1000, without deduplication

                              GET produce/_search
                              {
                              "query": {
                              "bool": {
                              "filter": [
                              {
                              "range": {
                              "price": {
                              "lt": 1000
                              }
                              }
                              }
                              ]
                              }
                              },
                              "aggs": {
                              "tag_agg_group": {
                              "sum": {
                              "script": {
                              "lang": "painless",
                              "source": """
                              int total = 0;
                              for (int i = 0; i < doc['tags'].length; i++)
                              {
                              total++;
                              }
                              return total;
                              """
                              }
                              }
                              }
                              },
                              "size": 0
                              }



Exercise

                                # Test data
                                PUT test_index1/_bulk?refresh
                                {"index":{"_id":1}}
                                {"ajbh": "12345","ajmc": "立案案件","lasj": "2020/05/21 13:25:23","jsbax_sjjh2_xz_ryjbxx_cleaning": [{"XM": "张三","NL": "30","SF": "男"},{"XM": "李四","NL": "31","SF": "男"},{"XM": "王五","NL": "30","SF": "女"},{"XM": "赵六","NL": 23,"SF": "男"}]}
                                {"index":{"_id":2}}
                                {"ajbh": "563245","ajmc": "结案案件","lasj": "2020/05/21 13:25:23","jsbax_sjjh2_xz_ryjbxx_cleaning": [{"XM": "张三2","NL": "30","SF": "男"},{"XM": "李四2","NL": "31","SF": "男"},{"XM": "王五2","NL": "30","SF": "女"},{"XM": "赵六2","NL": 23,"SF": "女"}]}
                                {"index":{"_id":3}}
                                {"ajbh": "12345","ajmc": "立案案件","lasj": "2020/05/21 13:25:23","jsbax_sjjh2_xz_ryjbxx_cleaning": [{"XM": "张三3","NL": "30","SF": "男"},{"XM": "李四3","NL": "31","SF": "男"},{"XM": "王五3","NL": "30","SF": "女"},{"XM": "赵六3","NL": 23,"SF": "男"}]}


                                # Task: count the number of males across all cases
                                GET test_index1/_search
                                {
                                "aggs": {
                                "sum_person": {
                                "sum": {
                                "script": {
                                "lang": "painless",
                                "source": """
                                int total = 0;
                                for (int i = 0; i < params['_source']['jsbax_sjjh2_xz_ryjbxx_cleaning'].length; i++)
                                {
                                if (params['_source']['jsbax_sjjh2_xz_ryjbxx_cleaning'][i]['SF'] == '男') {
                                total += 1;
                                }
                                }
                                return total;
                                """
                                }
                                }
                                }
                                },
                                "size": 0
                                }
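
As a cross-check on the aggregation, the expected answer can be worked out by hand from the test data. The sketch below repeats the script's loop in Python over a reduced copy of the three documents, keeping only the `SF` field. `count_males` is a hypothetical helper name.

```python
FIELD = "jsbax_sjjh2_xz_ryjbxx_cleaning"

# Reduced copy of the three test documents: only the SF (gender) values are kept.
cases = [
    {FIELD: [{"SF": "男"}, {"SF": "男"}, {"SF": "女"}, {"SF": "男"}]},
    {FIELD: [{"SF": "男"}, {"SF": "男"}, {"SF": "女"}, {"SF": "女"}]},
    {FIELD: [{"SF": "男"}, {"SF": "男"}, {"SF": "女"}, {"SF": "男"}]},
]


def count_males(source):
    """Mirror of the Painless loop: count nested entries whose SF is '男'."""
    total = 0
    for person in source[FIELD]:
        if person["SF"] == "男":
            total += 1
    return total


per_doc = [count_males(case) for case in cases]
total_males = sum(per_doc)
print(per_doc, total_males)
```

The per-document counts are 3, 2, and 3, so the `sum` aggregation should report 8 for this data.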





How Elasticsearch Writes Data


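Write steps

1. The client sends a write request.
2. The document is (a) added to the in-memory buffer and (b) persisted to the translog file. The buffer only takes writes; a refresh soon moves its contents into an index segment.
3. Index segment: the cached structure that serves create, read, update, and delete operations. A flush (a) pushes segments to the OS cache and (b) triggers a commit.
4. OS cache: an fsync forces the cached data to disk.
5. Disk.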

                                 

                                 

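refresh

1. Moves the data in the in-memory buffer into a segment.
2. Automatic refresh, at a configurable interval (1s by default):

    PUT index
    {"settings":{"refresh_interval": "10s"}}

3. Manual refresh:

    POST index/_refresh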





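Flow

A refresh immediately writes the buffered documents into a segment, but at that point the new segment lives only in the filesystem cache. If the machine loses power or fails in a similar way, that data is lost. ES therefore runs a flush periodically: it writes all cached segments to disk, confirms that the write succeeded, and creates a commit point. Together these steps make up one complete commit.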




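flush

1. Manual:

    POST /index/_flush

2. Automatic: a flush is triggered when the translog file reaches a size threshold or after a set interval (30 minutes by default). It then:
   a. writes the data to the OS cache
   b. creates a commit point at the same time
   c. calls fsync
   d. clears the translog
   e. if the node shuts down before step d, the in-memory data is lost and is reloaded from the translog file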
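
The interplay of buffer, translog, refresh, and flush can be made concrete with a toy model. This is a sketch under simplifying assumptions, not how Lucene or Elasticsearch is implemented: `ToyShard` and its methods are hypothetical, segments are plain lists, and the translog is assumed to be safely on disk after every write.

```python
class ToyShard:
    """Toy model of one shard's write path."""

    def __init__(self):
        self.buffer = []           # in-memory buffer: not searchable, lost on crash
        self.translog = []         # append-only log: assumed to survive a crash
        self.cached_segments = []  # segments in the OS cache: searchable, not durable
        self.disk_segments = []    # segments fsynced as part of a commit

    def index(self, doc):
        # A write goes to both the buffer and the translog.
        self.buffer.append(doc)
        self.translog.append(doc)

    def refresh(self):
        # Turn the buffer into a new searchable segment in the OS cache.
        if self.buffer:
            self.cached_segments.append(list(self.buffer))
            self.buffer.clear()

    def flush(self):
        # Refresh, fsync cached segments to disk, commit, then clear the translog.
        self.refresh()
        self.disk_segments.extend(self.cached_segments)
        self.cached_segments.clear()
        self.translog.clear()

    def search(self):
        segments = self.disk_segments + self.cached_segments
        return [doc for segment in segments for doc in segment]

    def crash_and_recover(self):
        # Everything held only in memory or in the OS cache is gone ...
        self.buffer.clear()
        self.cached_segments.clear()
        # ... and operations since the last commit are replayed from the translog.
        self.buffer.extend(self.translog)
        self.refresh()


shard = ToyShard()
shard.index("doc-1")
before_refresh = shard.search()
shard.refresh()
after_refresh = shard.search()
shard.flush()
shard.index("doc-2")
shard.refresh()
shard.crash_and_recover()
after_recovery = shard.search()
print(before_refresh, after_refresh, after_recovery)
```

In this model `doc-1` becomes searchable only after the refresh, and `doc-2` survives the crash because it is still in the translog when the node restarts.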


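Segment deletion steps
1. Select similar segments and merge them
2. Run a flush
3. Create a new commit point that records the new segment and drops the old ones
4. Open the new segment for search
5. Delete the old segment files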
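
The merge at the heart of these steps can be sketched as follows. The model is deliberately simplified and all names are hypothetical: segments are dictionaries keyed by document id, and `deleted_ids` stands in for the per-segment delete markers. It illustrates one commonly described property of merging, namely that documents marked as deleted are left out of the new segment.

```python
def merge_segments(segments, deleted_ids):
    """Merge several segments into one, skipping documents marked as deleted."""
    merged = {}
    for segment in segments:
        for doc_id, doc in segment.items():
            if doc_id not in deleted_ids:
                merged[doc_id] = doc
    return merged


live_segments = [
    {"1": {"price": 999}, "2": {"price": 1999}},
    {"3": {"price": 2999}},
    {"4": {"price": 3999}},
]
deleted_ids = {"2"}

# Step 1: pick the segments to merge (here simply the first two).
to_merge = live_segments[:2]
new_segment = merge_segments(to_merge, deleted_ids)

# Steps 3 to 5: the new segment replaces the old ones in the searchable set.
live_segments = [new_segment] + live_segments[2:]
print(live_segments)
```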





