node-mongodb-es-connector

A Node.js-based tool for real-time data synchronization between MongoDB and Elasticsearch (attachment synchronization is supported).

Supports both one-to-one and one-to-many data transfer.

English Documentation

My current environment versions

elasticsearch: v6.1.2
mongodb: v3.6.2
Nodejs: v8.9.3

What this tool does

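node-mongodb-es-connector keeps your MongoDB collections and your Elasticsearch index synchronized in real time. It uses the MongoDB oplog to detect changes to your data, so every insert, update, and delete is promptly reflected in your Elasticsearch index. Before using this tool, make sure your MongoDB deployment runs as a replica set; if it does not, configure it correctly first. (Attachment synchronization is supported.)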
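To make the oplog-driven flow concrete, here is a minimal sketch of how a single oplog entry could be classified into the Elasticsearch write it implies. It relies only on the standard oplog fields (`op`, `o`, `o2`); the function name `describeOplogEntry` and its return shape are hypothetical and are not taken from the connector's source.

```javascript
// Sketch only: classifies a MongoDB oplog entry by the Elasticsearch write it
// would imply. The function name and return shape are hypothetical.
function describeOplogEntry(entry) {
  switch (entry.op) {
    case 'i': // insert: `o` holds the full new document
      return { action: 'index', id: String(entry.o._id), doc: entry.o };
    case 'u': // update: `o2` holds the _id, `o` holds the modifiers or replacement
      return { action: 'update', id: String(entry.o2._id), changes: entry.o };
    case 'd': // delete: `o` holds only the _id
      return { action: 'delete', id: String(entry.o._id) };
    default: // commands ('c'), no-ops ('n'), and anything else are ignored here
      return null;
  }
}

// Example with a hand-written delete entry for the myTest.books collection.
const sampleEntry = { op: 'd', ns: 'myTest.books', o: { _id: '5b51a1f0c2a4' } };
console.log(describeOplogEntry(sampleEntry));
```

Reads never appear in the oplog, so only inserts, updates, and deletes can be propagated this way.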

How to use

npm install es-mongodb-sync

Or download it from GitHub.

A simple example

Create a JSON file in the crawlerData directory. Name it after your Elasticsearch index (ElasticSearchIndexName.json), or use any other name ending in .json.

If you need more configuration files, create them in the crawlerData directory as well.

Example:

mybooks.json

{
  "mongodb": {
    "m_database": "myTest",
    "m_collectionname": "books",
    "m_filterfilds": {
      "version": "2.0"
    },
    "m_returnfilds": {
      "bName": 1,
      "bPrice": 1,
      "bImgSrc": 1
    },
    "m_extendfilds": {
      "bA": "this is a extend fild bA",
      "bB": "this is a extend fild bB"
    },
    "m_extendinit": {
      "m_comparefild": "_id",
      "m_comparefildType": "ObjectId",
      "m_startFrom": "2018-07-20 13:44:00",
      "m_endTo": "2018-07-20 13:46:59"
    },
    "m_connection": {
      "m_servers": [
        "localhost:29031",
        "localhost:29032",
        "localhost:29033"
      ],
      "m_authentication": {
        "username": "UserAdmin",
        "password": "pass1234",
        "authsource": "admin",
        "replicaset": "my_replica",
        "ssl": false
      }
    },
    "m_documentsinbatch": 5000,
    "m_delaytime": 1000,
    "max_attachment_size": 5242880
  },
  "elasticsearch": {
    "e_index": "mybooks",
    "e_type": "books",
    "e_connection": {
      "e_server": "http://localhost1:9200,http://localhost2:9200,http://localhost3:9200",
      "e_httpauth": {
        "username": "EsAdmin",
        "password": "pass1234"
      }
    },
    "e_pipeline": "mypipeline",
    "e_iscontainattachment": true
  }
}
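As a rough illustration of what the connection settings describe, the sketch below turns the `m_connection` block into a standard MongoDB connection string and splits the comma-separated `e_server` value into a host list. The helpers `buildMongoUri` and `splitEsServers` are hypothetical names introduced here; how the connector itself consumes these fields is an assumption, not something this README states.

```javascript
const { URLSearchParams } = require('url');

// Hypothetical helper: builds a MongoDB connection string from the "mongodb"
// section of a config file. Assumes the field meanings suggested by their names.
function buildMongoUri(mongodb) {
  const conn = mongodb.m_connection;
  const auth = conn.m_authentication || {};
  // Percent-encode credentials so special characters stay valid in the URI.
  const credentials = auth.username
    ? `${encodeURIComponent(auth.username)}:${encodeURIComponent(auth.password)}@`
    : '';
  const params = new URLSearchParams();
  if (auth.replicaset) params.set('replicaSet', auth.replicaset);
  if (auth.authsource) params.set('authSource', auth.authsource);
  if (typeof auth.ssl === 'boolean') params.set('ssl', String(auth.ssl));
  const query = params.toString();
  const hosts = conn.m_servers.join(',');
  return `mongodb://${credentials}${hosts}/${mongodb.m_database}${query ? `?${query}` : ''}`;
}

// Hypothetical helper: splits the comma-separated e_server string into hosts.
function splitEsServers(elasticsearch) {
  return elasticsearch.e_connection.e_server
    .split(',')
    .map((host) => host.trim())
    .filter(Boolean);
}

// Values copied from mybooks.json above.
const exampleConfig = {
  mongodb: {
    m_database: 'myTest',
    m_connection: {
      m_servers: ['localhost:29031', 'localhost:29032', 'localhost:29033'],
      m_authentication: {
        username: 'UserAdmin',
        password: 'pass1234',
        authsource: 'admin',
        replicaset: 'my_replica',
        ssl: false,
      },
    },
  },
  elasticsearch: {
    e_connection: {
      e_server: 'http://localhost1:9200,http://localhost2:9200,http://localhost3:9200',
    },
  },
};

console.log(buildMongoUri(exampleConfig.mongodb));
console.log(splitEsServers(exampleConfig.elasticsearch));
```

Listing all three replica set members in `m_servers` lets a driver discover the current primary even when one member is down.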

How to start

node app.js


Extension API

index.js (used only to create, read, update, and delete configuration files)

Examples

1. start() - must be called before any of the other APIs.


2. addWatcher() - adds a configuration file.

Parameters:

Name Type
fileName string
obj jsonObject

Returns: true or false


3. updateWatcher() - updates a configuration file.

Parameters:

Name Type
fileName string
obj jsonObject

Returns: true or false


4. deleteWatcher() - deletes a configuration file.

Parameters:

Name Type
fileName string

Returns: true or false


5. isExistWatcher() - checks whether the given configuration file exists.

Parameters:

Name Type
fileName string

Returns: true or false


6. getInfoArray() - gets the current status of each configuration file (waiting/initialling/running/stoped).
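The calls listed above can be combined into an "add or update" helper, sketched below. It takes the connector object as a parameter so that it does not assume how the module is exported. The name `upsertWatcher` is hypothetical, and `await` is used so the sketch works whether the API returns plain booleans (as the return values above suggest) or promises. Whether `fileName` should include the `.json` extension is not stated here, so the example value is only a guess.

```javascript
// Hypothetical helper: adds the configuration if it is new, updates it otherwise.
// `connector` is any object exposing isExistWatcher, addWatcher and updateWatcher.
async function upsertWatcher(connector, fileName, config) {
  // `await` accepts both plain values and promises.
  const exists = await connector.isExistWatcher(fileName);
  return exists
    ? connector.updateWatcher(fileName, config)
    : connector.addWatcher(fileName, config);
}

// Stand-in connector used only to demonstrate the call order.
const demoCalls = [];
const demoConnector = {
  isExistWatcher: (name) => name === 'mybooks.json',
  addWatcher: (name) => { demoCalls.push(`add:${name}`); return true; },
  updateWatcher: (name) => { demoCalls.push(`update:${name}`); return true; },
};

upsertWatcher(demoConnector, 'mybooks.json', {})
  .then(() => upsertWatcher(demoConnector, 'mymovies.json', {}))
  .then(() => console.log(demoCalls));
```

With the real connector, start() would need to be called first, as noted in item 1.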


Changelog

How to use an Elasticsearch pipeline

PUT _ingest/pipeline/mypipeline
{
  "description": "Extract attachment information from arrays",
  "processors": [
    {
      "foreach": {
        "field": "attachments",
        "processor": {
          "attachment": {
            "target_field": "_ingest._value.attachment",
            "field": "_ingest._value.data"
          }
        }
      }
    }
  ]
}
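The pipeline above iterates over an `attachments` array and reads the `data` field of each element, which the Elasticsearch ingest attachment processor expects to be base64-encoded. The sketch below shows one way to build such an array from in-memory buffers. The helper name `toAttachments` and the `filename` field are hypothetical, and skipping files larger than `max_attachment_size` is an assumption about how that setting might be applied.

```javascript
// Hypothetical helper: converts raw files into the `attachments` array shape
// that the pipeline reads. Each element carries base64 content in `data`.
function toAttachments(files, maxSize) {
  return files
    // Assumption: files larger than maxSize bytes are skipped.
    .filter((file) => file.content.length <= maxSize)
    .map((file) => ({
      filename: file.name, // illustrative extra field, not required by the pipeline
      data: file.content.toString('base64'),
    }));
}

const MAX_ATTACHMENT_SIZE = 5242880; // 5 MiB, the value used in mybooks.json

const bookDoc = {
  bName: 'Example book',
  attachments: toAttachments(
    [{ name: 'notes.txt', content: Buffer.from('hello') }],
    MAX_ATTACHMENT_SIZE
  ),
};

console.log(JSON.stringify(bookDoc, null, 2));
```

After ingestion, each array element gains an `attachment` object next to `data`, as set by `target_field` in the pipeline.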

Results

mongodb

elasticsearch

Tests

test

License

The MIT License (MIT). Please see LICENSE for more information.