The official documentation mostly uses the RESTful API in its examples. This article instead introduces the ES Java API; the ES version used in the lab environment is 5.5.1.
Client
TransportClient
The TransportClient connects remotely to an ES cluster through the transport module. It does not join the cluster; it only obtains one or more initial transport addresses to communicate with.
Client API
// Initialization
public static TransportClient getClient() {
    if (client != null) {
        return client;
    }
    // Set the cluster name
    Settings settings = Settings.builder().put("cluster.name", "ubuntu").build();
    try {
        client = new PreBuiltTransportClient(settings)
                .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("ubuntu1"), 9300))
                .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("ubuntu2"), 9300))
                .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("ubuntu3"), 9300))
                .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("ubuntu4"), 9300));
    } catch (UnknownHostException e) {
        e.printStackTrace();
    }
    return client;
}
Index API
// Index a document at index=twitter, type=tweet, id=1
public void add() throws IOException {
    IndexResponse response = getClient().prepareIndex("twitter", "tweet", "1")
            .setSource(XContentFactory.jsonBuilder()
                    .startObject()
                    .field("user", "kimchy")
                    .field("postDate", new Date())
                    .field("message", "trying out Elasticsearch")
                    .endObject())
            .get();
    // Index name
    String _index = response.getIndex();
    // Type name
    String _type = response.getType();
    // Document ID (generated or not)
    String _id = response.getId();
    // Version (if it's the first time you index this document, you will get: 1)
    long _version = response.getVersion();
    // Result status of the operation (e.g. CREATED)
    RestStatus status = response.status();
}
Get API
// Get a document by id
public void getIndex() {
    GetResponse response = getClient().prepareGet("twitter", "tweet", "1").get();
    for (Entry<String, Object> entry : response.getSource().entrySet()) {
        System.out.println(entry.getKey() + "====" + entry.getValue());
    }
}
Delete API
// Delete by query (synchronous)
public void deleteByDocument() {
    BulkByScrollResponse response =
            DeleteByQueryAction.INSTANCE.newRequestBuilder(getClient())
                    // query
                    .filter(QueryBuilders.matchQuery("user", "kimchy"))
                    // index
                    .source("twitter")
                    // execute the operation
                    .get();
    // number of deleted documents
    long deleted = response.getDeleted();
    System.out.println(deleted);
}
// Delete by query (asynchronous)
public void deleteByDocumentInAsy() {
    DeleteByQueryAction.INSTANCE.newRequestBuilder(getClient())
            .filter(QueryBuilders.matchQuery("user", "kimchy"))
            // index
            .source("twitter")
            // listener invoked when the operation completes
            .execute(new ActionListener<BulkByScrollResponse>() {
                @Override
                public void onResponse(BulkByScrollResponse response) {
                    // number of deleted documents
                    long deleted = response.getDeleted();
                }
                @Override
                public void onFailure(Exception e) {
                }
            });
}
Update API
// Update a document; new fields can also be added under the given id this way
public void update() {
    try {
        UpdateResponse response = getClient().prepareUpdate("twitter", "tweet", "1")
                .setDoc(XContentFactory.jsonBuilder()
                        .startObject()
                        .field("gender", "male")
                        .endObject())
                .get();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
// upsert: if the document does not exist yet, the indexRequest is executed to create it;
// if it does exist, only the fields in the updateRequest's doc are applied
IndexRequest indexRequest = new IndexRequest("index", "type", "1")
        .source(jsonBuilder()
                .startObject()
                .field("name", "Joe Smith")
                .field("gender", "male")
                .endObject());
UpdateRequest updateRequest = new UpdateRequest("index", "type", "1")
        .doc(jsonBuilder()
                .startObject()
                .field("gender", "male")
                .endObject())
        .upsert(indexRequest);
client.update(updateRequest).get();
Bulk API
The Bulk API executes multiple index and delete operations in a single request.
BulkRequestBuilder bulkRequest = client.prepareBulk();
// either use client#prepare, or use Requests# to directly build index/delete requests
bulkRequest.add(client.prepareIndex("twitter", "tweet", "1")
        .setSource(jsonBuilder()
                .startObject()
                .field("user", "kimchy")
                .field("postDate", new Date())
                .field("message", "trying out Elasticsearch")
                .endObject()
        )
);
bulkRequest.add(client.prepareIndex("twitter", "tweet", "2")
        .setSource(jsonBuilder()
                .startObject()
                .field("user", "kimchy")
                .field("postDate", new Date())
                .field("message", "another post")
                .endObject()
        )
);
BulkResponse bulkResponse = bulkRequest.get();
Notes on index mappings
ES limits the number of fields in a mapping: by default an index may contain at most 1000 fields.
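The limit is controlled by the index setting index.mapping.total_fields.limit, so it can be raised per index if a mapping genuinely needs more fields (a sketch; the index name el is taken from the example URLs in this article):

```
PUT el/_settings
{
  "index.mapping.total_fields.limit": 2000
}
```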
Query via request parameters
http://ubuntu1:9200/el/28s/_search?q=*&pretty
A big "gotcha" with termQuery
QueryBuilders.termQuery(column, value)
When searching English values, even if the stored data is e.g. "Benz", a termQuery with value "Benz" returns no results; only the lowercase "benz" finds the document.
Reason: the String type is split into text and keyword. text is analyzed, while keyword behaves like not_analyzed in ES 2.3. String defaults to text, and the default analyzer is the standard analyzer. "Quick Brown Fox!" is broken into [quick, brown, fox] before being written to the inverted index, so a term containing uppercase letters matches nothing.
Workaround: for Chinese text the ik analyzer is needed for word segmentation, but the upper/lowercase issue is not solved by the ik analyzer either.
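The effect of the standard analyzer can be sketched in plain Java (a simplified simulation for illustration only, not Lucene's actual implementation): tokens are lowercased before entering the inverted index, while termQuery looks up the query value verbatim.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AnalyzerSketch {
    // Simplified stand-in for the standard analyzer:
    // split on non-letter characters and lowercase every token.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^\\p{L}]+")) {
            if (!t.isEmpty()) tokens.add(t.toLowerCase());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "Quick Brown Fox! Benz" is indexed as [quick, brown, fox, benz]
        Set<String> invertedIndexTerms = new HashSet<>(analyze("Quick Brown Fox! Benz"));
        // A termQuery does no analysis, so the raw uppercase value misses:
        System.out.println(invertedIndexTerms.contains("Benz")); // false
        System.out.println(invertedIndexTerms.contains("benz")); // true
    }
}
```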
Inspecting analysis results
http://ubuntu1:9200/_analyze?&pretty=true&text=内容
Segmenting Chinese text with the ik analyzer
http://ubuntu1:9200/_analyze?analyzer=ik_smart&pretty=true&text=内容
View field mappings
http://ubuntu1:9200/el/28s/_mapping
View the index's analyzer settings
GET kg/_settings
Search a type within an index
GET kg/28s/_search
Difference between matchQuery and termQuery
Both queries run against the tokens produced at index time, but termQuery takes the query value literally and requires an exact token match, while matchQuery analyzes the query string first, so it matches regardless of case.
Has Child Query and Has Parent Query (Join Query)
These join two types within one index.
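A sketch of the corresponding query DSL (the index name kg comes from the examples above; the type names child_type and parent_type are hypothetical, and in 5.x the parent/child relation must first be declared in the child type's mapping via the _parent field):

```
GET kg/_search
{
  "query": {
    "has_child": {
      "type": "child_type",
      "query": { "match_all": {} }
    }
  }
}

GET kg/_search
{
  "query": {
    "has_parent": {
      "parent_type": "parent_type",
      "query": { "match_all": {} }
    }
  }
}
```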