MongoDB and Elasticsearch sync module for Node.js (supports attachment sync)
Supports one-to-one and one-to-many relationships.
Chinese Documentation - 中文文档
elasticsearch: v6.1.2
mongodb: v3.6.2
Nodejs: v8.9.3
The node-mongodb-es-connector package keeps your MongoDB collections and your Elasticsearch cluster in sync. It does so by tailing the MongoDB oplog and replicating every CRUD operation into the Elasticsearch cluster with minimal overhead. Note that a replica set is required for the package to tail MongoDB. Attachment sync is supported.
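To make the oplog-tailing idea concrete, here is a minimal sketch of how a single oplog entry could be translated into Elasticsearch bulk-API lines. It is not taken from the package's source: the function name oplogToBulk is hypothetical, and only inserts, `$set` updates, full-document replacements, and deletes are handled.

```javascript
// Sketch only: translate one MongoDB oplog entry into Elasticsearch bulk lines.
// oplogToBulk is a hypothetical helper, not part of the package's API.
function oplogToBulk(entry, esIndex, esType) {
  const meta = (id) => ({ _index: esIndex, _type: esType, _id: String(id) });
  // Elasticsearch rejects _id inside the document body, so strip it.
  const withoutId = (doc) => {
    const { _id, ...rest } = doc;
    return rest;
  };

  switch (entry.op) {
    case 'i': // insert: `o` holds the full document
      return [{ index: meta(entry.o._id) }, withoutId(entry.o)];
    case 'u': { // update: `o2` holds the _id, `o` the update spec or replacement
      const id = entry.o2._id;
      if (entry.o.$set) {
        // Partial update. Dotted paths inside $set are not expanded here.
        return [{ update: meta(id) }, { doc: entry.o.$set }];
      }
      const hasOperators = Object.keys(entry.o).some((k) => k.startsWith('$'));
      // No operators means a full replacement: re-index the document.
      return hasOperators ? [] : [{ index: meta(id) }, withoutId(entry.o)];
    }
    case 'd': // delete: `o` holds only the _id
      return [{ delete: meta(entry.o._id) }];
    default: // no-ops and commands are ignored in this sketch
      return [];
  }
}

const insertLines = oplogToBulk(
  { op: 'i', ns: 'myTest.books', o: { _id: 'a1', bName: 'Dune' } },
  'mybooks',
  'books'
);
console.log(JSON.stringify(insertLines));
```

Each returned array can be appended to the body of one bulk request, so many oplog entries are shipped in a single round trip.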
npm install es-mongodb-sync
or download it from GitHub.
Create a configuration file in the crawlerData folder. The naming convention is ElasticSearchIndexName.json, but any name ending in .json works. Additional configurations can be added as further files in the crawlerData folder.
For example:
mybooks.json
{
  "mongodb": {
    "m_database": "myTest",
    "m_collectionname": "books",
    "m_filterfilds": {
      "version": "2.0"
    },
    "m_returnfilds": {
      "bName": 1,
      "bPrice": 1,
      "bImgSrc": 1
    },
    "m_extendfilds": {
      "bA": "this is a extend fild bA",
      "bB": "this is a extend fild bB"
    },
    "m_extendinit": {
      "m_comparefild": "_id",
      "m_comparefildType": "ObjectId",
      "m_startFrom": "2018-07-20 13:44:00",
      "m_endTo": "2018-07-20 13:46:59"
    },
    "m_connection": {
      "m_servers": [
        "localhost:29031",
        "localhost:29032",
        "localhost:29033"
      ],
      "m_authentication": {
        "username": "UserAdmin",
        "password": "pass1234",
        "authsource": "admin",
        "replicaset": "my_replica",
        "ssl": false
      }
    },
    "m_documentsinbatch": 5000,
    "m_delaytime": 1000,
    "max_attachment_size": 5242880
  },
  "elasticsearch": {
    "e_index": "mybooks",
    "e_type": "books",
    "e_connection": {
      "e_server": "http://localhost1:9200,http://localhost2:9200,http://localhost3:9200",
      "e_httpauth": {
        "username": "EsAdmin",
        "password": "pass1234"
      }
    },
    "e_pipeline": "mypipeline",
    "e_iscontainattachment": true
  }
}
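As an illustration of how the m_returnfilds and m_extendfilds sections of this config might shape what ends up in Elasticsearch, here is a minimal sketch. It assumes m_returnfilds acts like a MongoDB projection and m_extendfilds holds static fields merged into every document; buildEsDocument is a hypothetical helper, not a function exported by the package.

```javascript
// Sketch only: shape a MongoDB document for indexing.
// Assumptions: m_returnfilds works like a projection ({ field: 1 }),
// and m_extendfilds are static values added to every indexed document.
function buildEsDocument(mongoDoc, mongoConfig) {
  const returnFields = mongoConfig.m_returnfilds || null;
  const extendFields = mongoConfig.m_extendfilds || {};
  const picked = {};
  for (const key of Object.keys(mongoDoc)) {
    // _id is assumed to become the Elasticsearch document id, not a body field.
    if (key === '_id') continue;
    // With no projection configured, every field is kept.
    if (returnFields === null || returnFields[key] === 1) {
      picked[key] = mongoDoc[key];
    }
  }
  return Object.assign(picked, extendFields);
}

const exampleConfig = {
  m_returnfilds: { bName: 1, bPrice: 1, bImgSrc: 1 },
  m_extendfilds: { bA: 'this is a extend fild bA', bB: 'this is a extend fild bB' },
};
const shaped = buildEsDocument(
  { _id: 'a1', bName: 'Dune', bPrice: 12, internalNote: 'not indexed' },
  exampleConfig
);
console.log(shaped);
```

Under the same assumptions, m_filterfilds would be passed to MongoDB as the query filter, so documents that do not match it never reach this step.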
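Notes on field values and defaults:
m_comparefild - _id or another field. (optional)
m_comparefildType - ObjectId or DateTime. (optional)
ssl - defaults to false. (optional)
An alternative to m_connection can be supplied instead (either-or). (optional)
m_delaytime - defaults to 1000 ms. (required)
max_attachment_size - defaults to 5242880 bytes. (optional)
e_iscontainattachment - defaults to false. (optional)
Several other fields default to null.
To start the connector, run:
node app.js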
index.js (CRUD operations on config JSON only)
1.start() - must be called before any other API.
2.addWatcher() - add a config json.
Parameters:
Name | Type |
---|---|
fileName | string |
obj | jsonObject |
return: true or false
3.updateWatcher() - update a config json.
Parameters:
Name | Type |
---|---|
fileName | string |
obj | jsonObject |
return: true or false
4.deleteWatcher() - delete a config json.
Parameters:
Name | Type |
---|---|
fileName | string |
return: true or false
5.isExistWatcher() - check whether a config JSON exists.
Parameters:
Name | Type |
---|---|
fileName | string |
return: true or false
6.getInfoArray() - get the status of every config (waiting/initialling/running/stoped).
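The call order implied by the list above can be sketched with an in-memory stand-in. This is not the package's implementation: createStandIn is hypothetical, the shape of the entries returned by getInfoArray() is an assumption, and the real methods may be asynchronous. Only the method names, parameters, and true/false return values come from the list.

```javascript
// In-memory stand-in that mimics the documented method names and return types.
// It exists only to illustrate the call order; it talks to no database.
function createStandIn() {
  const watchers = new Map();
  let started = false;
  return {
    start() { started = true; },
    addWatcher(fileName, obj) {
      // Adding before start() or re-adding an existing name fails.
      if (!started || watchers.has(fileName)) return false;
      watchers.set(fileName, { config: obj, status: 'waiting' });
      return true;
    },
    updateWatcher(fileName, obj) {
      if (!watchers.has(fileName)) return false;
      watchers.get(fileName).config = obj;
      return true;
    },
    deleteWatcher(fileName) { return watchers.delete(fileName); },
    isExistWatcher(fileName) { return watchers.has(fileName); },
    getInfoArray() {
      // Entry shape ({ fileName, status }) is an assumption.
      return Array.from(watchers, ([fileName, w]) => ({ fileName, status: w.status }));
    },
  };
}

const connector = createStandIn();
connector.start(); // must come first
const added = connector.addWatcher('mybooks.json', { mongodb: {}, elasticsearch: {} });
const exists = connector.isExistWatcher('mybooks.json');
const statuses = connector.getInfoArray();
const removed = connector.deleteWatcher('mybooks.json');
console.log({ added, exists, statuses, removed });
```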
Install the Ingest Attachment Processor Plugin:
https://www.elastic.co/guide/en/elasticsearch/plugins/6.3/ingest-attachment.html
For more background on Elasticsearch pipelines, see https://hacpai.com/article/1512990272091
Then create a pipeline in Elasticsearch:
PUT _ingest/pipeline/mypipeline
{
  "description": "Extract attachment information from arrays",
  "processors": [
    {
      "foreach": {
        "field": "attachments",
        "processor": {
          "attachment": {
            "target_field": "_ingest._value.attachment",
            "field": "_ingest._value.data"
          }
        }
      }
    }
  ]
}
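The pipeline above reads each element of an attachments array, takes its base64-encoded data field, and writes the extracted content to an attachment field on the same element. A document that fits this pipeline could be assembled roughly as follows. This is a sketch under assumptions: buildAttachmentDoc and the filename key are hypothetical, and skipping files larger than max_attachment_size is one possible policy, not necessarily the package's.

```javascript
// Matches max_attachment_size in the example config (5 MiB).
const MAX_ATTACHMENT_SIZE = 5242880;

// Sketch only: attach files to a document in the shape the pipeline expects.
// `files` is an array of { filename, content } where content is a Buffer.
function buildAttachmentDoc(baseDoc, files, maxSize = MAX_ATTACHMENT_SIZE) {
  const attachments = files
    // Assumption: files above the size limit are skipped rather than truncated.
    .filter((f) => f.content.length <= maxSize)
    // The attachment processor requires the source field to be base64 encoded.
    .map((f) => ({ filename: f.filename, data: f.content.toString('base64') }));
  return Object.assign({}, baseDoc, { attachments });
}

const docWithFiles = buildAttachmentDoc(
  { bName: 'Dune' },
  [
    { filename: 'note.txt', content: Buffer.from('hello') },
    { filename: 'huge.bin', content: Buffer.alloc(16) },
  ],
  8 // tiny limit so the second file is skipped in this example
);
console.log(JSON.stringify(docWithFiles));
```

A document built this way would be indexed with the pipeline named in e_pipeline so that the foreach processor can populate attachment for each element.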
The MIT License (MIT). Please see LICENSE for more information.