Preface
The project's Elasticsearch (ES) was upgraded from version 1.4.5 to version 5.2.0.
Installing Elasticsearch
    # Download
    wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.2.0.tar.gz
    # Extract
    tar zxvf elasticsearch-5.2.0.tar.gz
Editing elasticsearch.yml
    $ES_HOME/config/elasticsearch.yml
The individual settings in elasticsearch.yml are not covered in detail here; a link to them is attached.
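Configuring external access to ES
elasticsearch-head is a handy visual front end that lets you call ES through a graphical interface. It is hosted on GitHub at https://github.com/mobz/elast... and is simple to install. Because it is a front-end tool, node and npm must be installed beforehand; then run the following steps:
    git clone https://github.com/mobz/elasticsearch-head.git
    cd elasticsearch-head
    npm install
Once installation finishes, start it with npm run start.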
Fixing compatibility issues with deprecated APIs
1. setting
org.elasticsearch.common.settings.ImmutableSettings from 1.4.5 is deprecated. Build the settings object this way instead:
    Settings settings = Settings.builder()
            .put("cluster.name", clusterName)
            .put("client.transport.sniff", true)
            .build();
2. InetSocketTransportAddress
The constructor org.elasticsearch.common.transport.InetSocketTransportAddress#InetSocketTransportAddress(java.lang.String, int) is deprecated. Register cluster addresses this way instead:
    clusterNodeAddressList.add(new InetSocketTransportAddress(InetAddress.getByName(host), 9300));
3. TransportClient
The constructor org.elasticsearch.client.transport.TransportClient#TransportClient(org.elasticsearch.common.settings.Settings) is deprecated. Create the TransportClient instance this way instead:
    transportClient = new PreBuiltTransportClient(settings);
4. ClusterHealthStatus
The class org.elasticsearch.action.admin.cluster.health.ClusterHealthStatus is deprecated; org.elasticsearch.cluster.health.ClusterHealthStatus provides the same functionality.
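5. ScriptSortBuilder changes
Original form:
    Map
Changed to:
    Map
6. FilterBuilder changes
The class org.elasticsearch.index.query.FilterBuilder is deprecated. Filters have essentially been deprecated since 2.x (apart from the filter clause inside a bool query), so every FilterBuilder has to be replaced with the corresponding QueryBuilder subclass: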
1. org.elasticsearch.index.query.BoolFilterBuilder
    BoolFilterBuilder boolFilterBuilder = FilterBuilders.boolFilter();
Changed to:
    BoolQueryBuilder boolFilterBuilder = new BoolQueryBuilder();
2. org.elasticsearch.index.query.NestedFilterBuilder
    filterBuilder = FilterBuilders.nestedFilter(param.getPath(), boolFilterBuilder);
Changed to:
    filterBuilder = new NestedQueryBuilder(param.getPath(), boolFilterBuilder, ScoreMode.None);
3. org.elasticsearch.index.query.MissingFilterBuilder
In 5.x the missing keyword is deprecated; its behavior is obtained by negating its inverse, exists.
    MissingFilterBuilder missingFilterBuilder = FilterBuilders.missingFilter(paramName);
    if (param.getNvlType() == QueryFieldType.EXISTS) {
        filterBuilder = FilterBuilders.boolFilter().mustNot(missingFilterBuilder);
    }
    if (param.getNvlType() == QueryFieldType.MISSING) {
        filterBuilder = FilterBuilders.boolFilter().must(missingFilterBuilder);
    }
Changed to:
    ExistsQueryBuilder existsQueryBuilder = new ExistsQueryBuilder(paramName);
    if (param.getNvlType() == QueryFieldType.EXISTS) {
        filterBuilder = new BoolQueryBuilder().must(existsQueryBuilder);
    }
    if (param.getNvlType() == QueryFieldType.MISSING) {
        filterBuilder = new BoolQueryBuilder().mustNot(existsQueryBuilder);
    }
4. org.elasticsearch.index.query.TermFilterBuilder
    filterBuilder = FilterBuilders.termFilter(paramName, param.getEqValue());
Changed to:
    filterBuilder = new TermQueryBuilder(paramName, param.getEqValue());
5. org.elasticsearch.index.query.TermsFilterBuilder
    filterBuilder = FilterBuilders.inFilter(paramName, param.getInValues());
Changed to:
    filterBuilder = new TermsQueryBuilder(paramName, param.getInValues());
6. org.elasticsearch.index.query.RangeFilterBuilder
    // gte
    if (null != param.getGteValue()) {
        filterBuilder = FilterBuilders.rangeFilter(paramName).gte(param.getGteValue());
    }
    // gt
    if (null != param.getGtValue()) {
        filterBuilder = FilterBuilders.rangeFilter(paramName).gt(param.getGtValue());
    }
    // lte
    if (null != param.getLteValue()) {
        filterBuilder = FilterBuilders.rangeFilter(paramName).lte(param.getLteValue());
    }
    // lt
    if (null != param.getLtValue()) {
        filterBuilder = FilterBuilders.rangeFilter(paramName).lt(param.getLtValue());
    }
Changed to:
    // gte
    if (null != param.getGteValue()) {
        filterBuilder = new RangeQueryBuilder(paramName).gte(param.getGteValue());
    }
    // gt
    if (null != param.getGtValue()) {
        filterBuilder = new RangeQueryBuilder(paramName).gt(param.getGtValue());
    }
    // lte
    if (null != param.getLteValue()) {
        filterBuilder = new RangeQueryBuilder(paramName).lte(param.getLteValue());
    }
    // lt
    if (null != param.getLtValue()) {
        filterBuilder = new RangeQueryBuilder(paramName).lt(param.getLtValue());
    }
7. search_type=count
Counting documents used to require search_type=count. 5.0 removed that API; set size to 0 instead:
    GET /my_index/_search?search_type=count
    {
      "aggs": {
        "my_terms": {
          "terms": { "field": "foo" }
        }
      }
    }
Changed to:
    # 5.0 and later
    GET /my_index/_search
    {
      "size": 0,
      "aggs": {
        "my_terms": {
          "terms": { "field": "foo" }
        }
      }
    }
8. RangeBuilder
org.elasticsearch.search.aggregations.bucket.range.RangeBuilder is deprecated; org.elasticsearch.search.aggregations.bucket.range.RangeAggregationBuilder provides the same functionality and is a drop-in replacement.
9. TopHitsAggregationBuilder
org.elasticsearch.search.aggregations.metrics.tophits.TopHitsBuilder is deprecated; org.elasticsearch.search.aggregations.metrics.tophits.TopHitsAggregationBuilder provides the same functionality and is a drop-in replacement.
10. FiltersAggregationBuilder
The way org.elasticsearch.search.aggregations.bucket.filters.FiltersAggregationBuilder is constructed has changed. Original form:
    FiltersAggregationBuilder filtersAggregationBuilder = AggregationBuilders.filters(aggregationField.getAggName());
    LufaxSearchConditionBuilder tmpConditionBuilder = new LufaxSearchConditionBuilder();
    for (String key : aggregationField.getFiltersMap().keySet()) {
        LufaxFilterCondition tmpLufaxFilterCondition = aggregationField.getFiltersMap().get(key);
        FilterBuilder tmpFilterBuilder = tmpConditionBuilder.constructFilterBuilder(
                tmpLufaxFilterCondition.getAndParams(),
                tmpLufaxFilterCondition.getOrParams(),
                tmpLufaxFilterCondition.getNotParams());
        filtersAggregationBuilder.filter(key, tmpFilterBuilder);
    }
Changed to:
    List<FiltersAggregator.KeyedFilter> keyedFilters = new LinkedList<>();
    LufaxSearchConditionBuilder tmpConditionBuilder = new LufaxSearchConditionBuilder();
    for (String key : aggregationField.getFiltersMap().keySet()) {
        LufaxFilterCondition tmpLufaxFilterCondition = aggregationField.getFiltersMap().get(key);
        QueryBuilder tmpFilterBuilder = tmpConditionBuilder.constructFilterBuilder(
                tmpLufaxFilterCondition.getAndParams(),
                tmpLufaxFilterCondition.getOrParams(),
                tmpLufaxFilterCondition.getNotParams());
        keyedFilters.add(new FiltersAggregator.KeyedFilter(key, tmpFilterBuilder));
    }
    FiltersAggregationBuilder filtersAggregationBuilder = AggregationBuilders.filters(
            aggregationField.getAggName(),
            keyedFilters.toArray(new FiltersAggregator.KeyedFilter[]{}));
11. HighlightBuilder
org.elasticsearch.search.highlight.HighlightBuilder is deprecated; org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder provides the same functionality.
12. OptimizeRequestBuilder
org.elasticsearch.action.admin.indices.optimize.OptimizeRequestBuilder is deprecated; merging index segments is now done with org.elasticsearch.action.admin.indices.forcemerge.ForceMergeRequestBuilder.
13. IndicesAliasesRequestBuilder
1. newAddAliasAction
The new version removed the newAddAliasAction method of the AliasAction class, so adding an alias through IndicesAliasesRequestBuilder changes from:
    requestBuilder.addAliasAction(AliasAction.newAddAliasAction(toIndex, indexAlias));
to:
    requestBuilder.addAliasAction(IndicesAliasesRequest.AliasActions.add().index(toIndex).alias(indexAlias));
2. newRemoveAliasAction
The new version removed the newRemoveAliasAction method of the AliasAction class, so removing an alias through IndicesAliasesRequestBuilder changes from:
    requestBuilder.addAliasAction(AliasAction.newRemoveAliasAction(fromIdx, indexAlias));
to:
    requestBuilder.addAliasAction(IndicesAliasesRequest.AliasActions.remove().index(fromIdx).alias(indexAlias));
14. Changes to AbstractAggregationBuilder subclasses
1. org.elasticsearch.search.aggregations.bucket.terms.TermsBuilder
org.elasticsearch.search.aggregations.bucket.terms.TermsBuilder was renamed to org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder.
org.elasticsearch.search.aggregations.bucket.range.date.DateRangeBuilder was renamed to org.elasticsearch.search.aggregations.bucket.range.date.DateRangeAggregationBuilder.
org.elasticsearch.search.aggregations.metrics.tophits.TopHitsBuilder was renamed to org.elasticsearch.search.aggregations.metrics.tophits.TopHitsAggregationBuilder.
15. SearchHit
The method org.elasticsearch.search.SearchHit#isSourceEmpty was replaced by org.elasticsearch.search.SearchHit#hasSource. The meaning is inverted, so replace isSourceEmpty() with !hasSource().
16. DeleteByQueryResponse
org.elasticsearch.action.deletebyquery.DeleteByQueryResponse is deprecated.
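Adjusting keywords and other structural issues
1. The string data type is deprecated
ES 2.x has no keyword or text type for string data, only string. ES 5 dropped the string data type and replaced it with keyword and text. The differences are:
text: the value is analyzed. Before indexing, the text is tokenized into terms, and those terms are what ES indexes and searches. By default a text field cannot be used for sorting or aggregations, because fielddata is disabled on it.
keyword: the value is treated as an exact string and is not tokenized. It can be used for filtering, sorting, and aggregations. A keyword field can only be matched by its exact value.
When no type is defined explicitly, ES defaults to text (dynamic mapping in 5.x also adds a keyword sub-field).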
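The rule above can be sketched as a small helper that rewrites one legacy field definition. This is a minimal sketch, not an Elasticsearch API: the class StringMappingMigrator and its migrate method are hypothetical names, and it assumes the field definition is held as a plain Map in which "index": "not_analyzed" marks an exact-value field.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Elasticsearch): converts one legacy
// "string" field definition into a 5.x "text" or "keyword" definition.
class StringMappingMigrator {

    static Map<String, Object> migrate(Map<String, Object> legacyField) {
        Map<String, Object> migrated = new LinkedHashMap<>(legacyField);
        if (!"string".equals(legacyField.get("type"))) {
            return migrated; // only string fields are affected
        }
        Object index = legacyField.get("index");
        if ("not_analyzed".equals(index)) {
            // Exact-value field: becomes keyword, the old index flag is dropped.
            migrated.put("type", "keyword");
            migrated.remove("index");
        } else {
            // Analyzed (the old default): becomes text.
            migrated.put("type", "text");
            if ("analyzed".equals(index)) {
                migrated.remove("index");
            }
        }
        return migrated;
    }
}
```

Any other value of "index" is left untouched here, so it would still need a manual decision.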
2. The multi_field keyword is deprecated
Change the relevant mapping to:
    # For the field in question, add "fields" after the "type" attribute.
    # "raw" is a user-chosen name; think of it as a second copy of city.
    PUT /my_index
    {
      "mappings": {
        "my_type": {
          "properties": {
            "city": {
              "type": "text",
              "fields": {
                "raw": { "type": "keyword" }
              }
            }
          }
        }
      }
    }
To query the raw sub-field, refer to it as city.raw.
3. analyzer
1. After the upgrade, if search_analyzer is set, analyzer must be set as well; otherwise ES reports:
    analyzer on field [name] must be set when search_analyzer is set
2. After the upgrade, the index_analyzer setting is deprecated; if it is set, ES reports:
    MapperParsingException[Mapping definition for [fields] has unsupported parameters: [index_analyzer : ik_max_word]];
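Both errors can be caught before a mapping is sent to the cluster. The following is a minimal sketch under stated assumptions: AnalyzerMappingCheck is a hypothetical class, not an Elasticsearch API, and each field definition is assumed to be available as a plain Map. It only mirrors the two rules described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical pre-flight check (not an Elasticsearch API) for one field definition.
class AnalyzerMappingCheck {

    static List<String> check(String fieldName, Map<String, Object> field) {
        List<String> problems = new ArrayList<>();
        // Rule 2: index_analyzer is no longer accepted; analyzer takes its place.
        if (field.containsKey("index_analyzer")) {
            problems.add("[" + fieldName + "] uses index_analyzer; use analyzer instead");
        }
        // Rule 1: search_analyzer requires analyzer to be set as well.
        if (field.containsKey("search_analyzer") && !field.containsKey("analyzer")) {
            problems.add("[" + fieldName + "] sets search_analyzer without analyzer");
        }
        return problems;
    }
}
```

An empty result means the field passes both checks; it says nothing about other mapping rules.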
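As background: in the old version, index_analyzer defined the analyzer used at index time, and search_analyzer defined the analyzer used at search time.
The full lookup order for the analyzer at index time was:
the index_analyzer defined in the field mapping
the analyzer defined in the field mapping
the analyzer named in the document's _analyzer field
the default index_analyzer of the type
the default analyzer of the type
the analyzer named default_index in the index settings
the analyzer named default in the index settings
the analyzer named default_index on the node
the analyzer named default on the node
the standard analyzer
The full lookup order at search time was:
the analyzer defined directly in the query
the search_analyzer defined in the field mapping
the analyzer defined in the field mapping
the default search_analyzer of the type
the default analyzer of the type
the analyzer named default_search in the index settings
the analyzer named default in the index settings
the analyzer named default_search on the node
the analyzer named default on the node
the standard analyzer
The new version removes index_analyzer. Its role is taken over by the analyzer keyword, which applies at both index time and search time (unless search_analyzer is explicitly defined).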
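The new behavior can be modeled in a few lines. This is a deliberately simplified sketch: AnalyzerResolver is a hypothetical class, it only looks at the query and the field mapping, and it skips the index-level defaults, falling straight back to standard.

```java
import java.util.Map;

// Simplified, hypothetical model of analyzer lookup after the upgrade.
// Index-level defaults are intentionally left out.
class AnalyzerResolver {

    // Index time: the field's analyzer, else standard.
    static String indexTime(Map<String, String> field) {
        return field.getOrDefault("analyzer", "standard");
    }

    // Search time: query-level analyzer, then search_analyzer, then the
    // same analyzer that is used at index time.
    static String searchTime(Map<String, String> field, String queryAnalyzer) {
        if (queryAnalyzer != null) {
            return queryAnalyzer;
        }
        String searchAnalyzer = field.get("search_analyzer");
        if (searchAnalyzer != null) {
            return searchAnalyzer;
        }
        return indexTime(field);
    }
}
```

With only analyzer set to ik_max_word, both methods return ik_max_word; adding search_analyzer changes the search-time result only.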
3. _timestamp
_timestamp was deprecated in 2.0. The official recommendation is to define a custom field and assign the timestamp value to it yourself.
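One way to follow that recommendation is to stamp the document source on the client before indexing it. This is a minimal sketch: TimestampedSource is a hypothetical helper, and the field name "timestamp" is an illustrative choice, not something ES requires.

```java
import java.time.Clock;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: adds a client-side timestamp to a document source.
class TimestampedSource {

    // Returns a copy of the source with "timestamp" set to epoch milliseconds.
    // The Clock is passed in so that the value can be controlled in tests.
    static Map<String, Object> withTimestamp(Map<String, Object> source, Clock clock) {
        Map<String, Object> copy = new LinkedHashMap<>(source);
        copy.put("timestamp", clock.millis());
        return copy;
    }
}
```

In production code, Clock.systemUTC() would be the usual argument.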
4. Field name change when sorting on nested fields
For the following data:
    PUT /my_index/blogpost/2
    {
      "title": "Investment secrets",
      "body": "What they don't tell you ...",
      "tags": [ "shares", "equities" ],
      "comments": [
        { "name": "Mary Brown", "comment": "Lies, lies, lies", "age": 42, "stars": 1, "date": "2014-10-18" },
        { "name": "John Smith", "comment": "You're making it up!", "age": 28, "stars": 2, "date": "2014-10-16" }
      ]
    }
In the old version, sorting on the stars field could be written directly as:
    "sort" : [ { "stars" : { "order" : "desc", "mode" : "min", "nested_path" : "comments" } } ]
In the new version, that request fails with:
    No mapping found for [stars] in order to sort on
It has to be changed to:
    "sort" : [ { "comments.stars" : { "order" : "desc", "mode" : "min" } } ]
5. Parameter name change in _script
In the old version, _script could be defined like this:
    "sort" : [ {
      "_script" : {
        "script" : {
          "inline" : "paramsMap.containsKey(doc['id'].value) ? paramsMap.get(doc['id'].value) : paramsMap.get('other')",
          "lang" : "painless",
          "params" : { "paramsMap" : { "1" : 1, "2" : 1, "3" : 2, "other" : 3 } }
        },
        "type" : "number",
        "order" : "asc"
      }
    } ]
In the new version, the paramsMap entry under params must be referenced as params.paramsMap:
    "sort" : [ {
      "_script" : {
        "script" : {
          "inline" : "params.paramsMap.containsKey(doc['productCategory'].value) ? params.paramsMap.get(doc['productCategory'].value) : params.paramsMap.get('other')",
          "lang" : "painless",
          "params" : { "paramsMap" : { "901" : 1, "902" : 1, "701" : 2, "other" : 3 } }
        },
        "type" : "number",
        "order" : "asc"
      }
    } ]
Note: ES 5.2.0 disables dynamic scripting languages by default, so a lang other than painless fails under the default configuration with:
    ScriptException[scripts of type [inline], operation [update] and lang [groovy] are disabled];
To enable one, add the following to the yml file (groovy shown here):
    script.engine.groovy.inline: true
    script.engine.groovy.stored.search: true
    script.engine.groovy.stored.aggs: true
6. Returning specific fields
In the old version, specific fields of a document could be returned with fields (renamed stored_fields in 5.x, where it only returns fields that are explicitly stored):
    {
      "from" : 0,
      "size" : 1,
      "query" : {},
      "fields" : "timestamp",
      "sort" : [ { "timestamp" : { "order" : "desc" } } ]
    }
In the new version, use the more flexible _source filtering instead:
    {
      "from" : 0,
      "size" : 1,
      "query" : {},
      "_source" : "timestamp",
      "sort" : [ { "timestamp" : { "order" : "desc" } } ]
    }
or:
    {
      "from" : 0,
      "size" : 1,
      "query" : {},
      "_source" : { "includes" : [ "timestamp" ], "excludes" : [ "" ] },
      "sort" : [ { "timestamp" : { "order" : "desc" } } ]
    }
In the Java API this is done mainly through the setFetchSource method of SearchRequestBuilder.
7. format definition for date fields
After the upgrade, it is best to define the format of date fields in the mapping, so that requests do not fail on format conversion with errors such as:
    ElasticsearchParseException[failed to parse date field [Thu Jun 18 00:00:00 CST 2015] with format [strict_date_optional_time||epoch_millis]]; nested: IllegalArgumentException[Parse failure at index [0] of [Thu Jun 18 00:00:00 CST 2015]];
[strict_date_optional_time||epoch_millis] is the default format ES uses to parse date fields.
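The value in that error message, Thu Jun 18 00:00:00 CST 2015, looks like the output of java.util.Date#toString, which matches neither default format. One way to avoid it on the client side, sketched here with a hypothetical EsDateFormat helper, is to convert the Date to an ISO-8601 string or to epoch milliseconds before it goes into the document.

```java
import java.time.format.DateTimeFormatter;
import java.util.Date;

// Hypothetical helper: renders java.util.Date in forms that the default
// date format [strict_date_optional_time||epoch_millis] accepts.
class EsDateFormat {

    // ISO-8601 instant in UTC, for example 2015-06-17T16:00:00Z.
    static String toIsoString(Date date) {
        return DateTimeFormatter.ISO_INSTANT.format(date.toInstant());
    }

    // Milliseconds since the epoch, matching epoch_millis.
    static long toEpochMillis(Date date) {
        return date.getTime();
    }
}
```

Either form can be used; the string keeps the document readable, while the number avoids time zone questions altogether.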
8. UncategorizedExecutionException
Before the upgrade, the transport client serialized the fields from the Java code into JSON before sending the request. From 5.x on, ES uses its internal transport protocol instead. If a field has a type ES does not support, BigDecimal for example, the type mismatch raises an error:
    UncategorizedExecutionException[Failed execution]; nested: IOException[can not write type [class java.math.BigDecimal]];
The types ES supports are as follows:
    static {
        Map<Class<?>, Writer> writers = new HashMap<>();
        writers.put(String.class, (o, v) -> { o.writeByte((byte) 0); o.writeString((String) v); });
        writers.put(Integer.class, (o, v) -> { o.writeByte((byte) 1); o.writeInt((Integer) v); });
        writers.put(Long.class, (o, v) -> { o.writeByte((byte) 2); o.writeLong((Long) v); });
        writers.put(Float.class, (o, v) -> { o.writeByte((byte) 3); o.writeFloat((float) v); });
        writers.put(Double.class, (o, v) -> { o.writeByte((byte) 4); o.writeDouble((double) v); });
        writers.put(Boolean.class, (o, v) -> { o.writeByte((byte) 5); o.writeBoolean((boolean) v); });
        writers.put(byte[].class, (o, v) -> {
            o.writeByte((byte) 6);
            final byte[] bytes = (byte[]) v;
            o.writeVInt(bytes.length);
            o.writeBytes(bytes);
        });
        writers.put(List.class, (o, v) -> {
            o.writeByte((byte) 7);
            final List list = (List) v;
            o.writeVInt(list.size());
            for (Object item : list) {
                o.writeGenericValue(item);
            }
        });
        writers.put(Object[].class, (o, v) -> {
            o.writeByte((byte) 8);
            final Object[] list = (Object[]) v;
            o.writeVInt(list.length);
            for (Object item : list) {
                o.writeGenericValue(item);
            }
        });
        writers.put(Map.class, (o, v) -> {
            if (v instanceof LinkedHashMap) {
                o.writeByte((byte) 9);
            } else {
                o.writeByte((byte) 10);
            }
            @SuppressWarnings("unchecked")
            final Map<String, Object> map = (Map<String, Object>) v;
            o.writeVInt(map.size());
            for (Map.Entry<String, Object> entry : map.entrySet()) {
                o.writeString(entry.getKey());
                o.writeGenericValue(entry.getValue());
            }
        });
        writers.put(Byte.class, (o, v) -> { o.writeByte((byte) 11); o.writeByte((Byte) v); });
        writers.put(Date.class, (o, v) -> { o.writeByte((byte) 12); o.writeLong(((Date) v).getTime()); });
        writers.put(ReadableInstant.class, (o, v) -> {
            o.writeByte((byte) 13);
            final ReadableInstant instant = (ReadableInstant) v;
            o.writeString(instant.getZone().getID());
            o.writeLong(instant.getMillis());
        });
        writers.put(BytesReference.class, (o, v) -> { o.writeByte((byte) 14); o.writeBytesReference((BytesReference) v); });
        writers.put(Text.class, (o, v) -> { o.writeByte((byte) 15); o.writeText((Text) v); });
        writers.put(Short.class, (o, v) -> { o.writeByte((byte) 16); o.writeShort((Short) v); });
        writers.put(int[].class, (o, v) -> { o.writeByte((byte) 17); o.writeIntArray((int[]) v); });
        writers.put(long[].class, (o, v) -> { o.writeByte((byte) 18); o.writeLongArray((long[]) v); });
        writers.put(float[].class, (o, v) -> { o.writeByte((byte) 19); o.writeFloatArray((float[]) v); });
        writers.put(double[].class, (o, v) -> { o.writeByte((byte) 20); o.writeDoubleArray((double[]) v); });
        writers.put(BytesRef.class, (o, v) -> { o.writeByte((byte) 21); o.writeBytesRef((BytesRef) v); });
        writers.put(GeoPoint.class, (o, v) -> { o.writeByte((byte) 22); o.writeGeoPoint((GeoPoint) v); });
        WRITERS = Collections.unmodifiableMap(writers);
    }
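A possible workaround on the client side is to convert unsupported values before the document is handed to the client. The sketch below is an assumption about how one might do that, not something the ES client provides: SourceSanitizer is a hypothetical class that walks a source map and replaces every BigDecimal with a Double.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: replaces BigDecimal values, which the writer table
// above does not cover, with Double, recursing into maps and lists.
class SourceSanitizer {

    static Object sanitize(Object value) {
        if (value instanceof BigDecimal) {
            // Assumption: double precision is acceptable for this field.
            return ((BigDecimal) value).doubleValue();
        }
        if (value instanceof Map) {
            Map<String, Object> cleaned = new LinkedHashMap<>();
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                cleaned.put(String.valueOf(entry.getKey()), sanitize(entry.getValue()));
            }
            return cleaned;
        }
        if (value instanceof List) {
            List<Object> cleaned = new ArrayList<>();
            for (Object item : (List<?>) value) {
                cleaned.add(sanitize(item));
            }
            return cleaned;
        }
        return value; // everything else is passed through unchanged
    }
}
```

Converting to double can lose precision. Where exact values matter, for amounts of money for instance, BigDecimal#toPlainString with a keyword field may be the safer choice.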