{"_id":"spider","_rev":"9-37747e6ef7437c7e281d225efdb589c2","name":"spider","description":"Programmable spidering of web sites with node.js and jQuery","dist-tags":{"latest":"0.1.0"},"versions":{"0.0.2":{"name":"spider","description":"Programmable spidering of web sites with node.js and jQuery","tags":["dom","javascript","crawling","jquery","spider","spidering"],"version":"0.0.2","author":{"name":"Mikeal Rogers","email":"mikeal.rogers@gmail.com"},"repository":{"type":"git","url":"git://github.com/mikeal/spider.git"},"bugs":{"url":"http://github.com/mikeal/spider/issues"},"engines":["node >= 0.4.4"],"main":"./main","dependencies":{"request":">= 1.9.3","jsdom":">= 0.2.0","routes":">= 0.1.0","cookiejar":">= 1.2.0"},"_id":"spider@0.0.2","_engineSupported":true,"_npmVersion":"0.3.18","_nodeVersion":"v0.4.7-pre","directories":{},"files":[""],"_defaultsLoaded":true,"dist":{"shasum":"337cbec0e884d47beb1483363fd9d6531a0e3827","tarball":"https://registry.npmjs.org/spider/-/spider-0.0.2.tgz","integrity":"sha512-VFPjaKimVGQs37UwfoRklBXYCzOmCHwh5RHBAOtGYBRRL+n8RQbSxuiuHew8b2YlSS8t+m4dYE4jUOVzpR4ovg==","signatures":[{"keyid":"SHA256:jl3bwswu80PjjokCgh0o2w5c2U4LhQAE57gj9cz1kzA","sig":"MEUCIAT41Ut3VOW90Yl1+XJlv675KI3wGQubGyLv9kKWqEqnAiEAmB2uYMR2OM/I7g8cmAghxwIBkzvb5AA5jp6OTWM7S6w="}]}},"0.1.0":{"name":"spider","description":"Programmable spidering of web sites with node.js and jQuery","tags":["dom","javascript","crawling","jquery","spider","spidering"],"version":"0.1.0","author":{"name":"Mikeal Rogers","email":"mikeal.rogers@gmail.com"},"repository":{"type":"git","url":"git://github.com/mikeal/spider.git"},"bugs":{"url":"http://github.com/mikeal/spider/issues"},"engines":["node >= 0.6.4"],"main":"./main","dependencies":{"request":">= 1.9.3","jsdom":">= 0.2.13","routes":">= 0.1.0","cookiejar":">= 
1.3.0"},"_npmUser":{"name":"mikeal","email":"mikeal.rogers@gmail.com"},"_id":"spider@0.1.0","devDependencies":{},"_engineSupported":true,"_npmVersion":"1.1.0-beta-10","_nodeVersion":"v0.6.8-pre","_defaultsLoaded":true,"dist":{"shasum":"2b83e6b82df2f424fb0a9ab635cb9d74fd9a63da","tarball":"https://registry.npmjs.org/spider/-/spider-0.1.0.tgz","integrity":"sha512-pBj5nSuUmktUpTq7aLmfcRx/GAyEPid6MqOky6GwGJ0XzyX36SaZu0oCJGhqGmrRH7RLrohyHf2ZJBfsp1brHg==","signatures":[{"keyid":"SHA256:jl3bwswu80PjjokCgh0o2w5c2U4LhQAE57gj9cz1kzA","sig":"MEQCIB6U9BwTw5FGELQgaa9Y9h9JdP0y1xQW7L7XVDY2U5nlAiAdOAxi2+Dy+oJR0G2RHH6d28OS+UOfNMGJ+GU/15fE0g=="}]},"maintainers":[{"name":"mikeal","email":"mikeal.rogers@gmail.com"}]}},"maintainers":[{"name":"mikeal","email":"mikeal.rogers@gmail.com"}],"time":{"modified":"2022-06-26T22:50:58.572Z","created":"2011-04-21T03:17:48.283Z","0.0.2":"2011-04-21T03:17:48.653Z","0.1.0":"2012-03-10T23:13:40.407Z"},"author":{"name":"Mikeal Rogers","email":"mikeal.rogers@gmail.com"},"repository":{"type":"git","url":"git://github.com/mikeal/spider.git"},"readme":"# Spider -- Programmable spidering of web sites with node.js and jQuery\n\n## Install\n\nFrom source:\n\n<pre>\n  git clone git://github.com/mikeal/spider.git\n  cd spider\n  npm link .\n</pre>\n\n## (How to use the) API\n\n### Creating a Spider\n<pre>\n  var spider = require('spider');\n  var s = spider();\n</pre>\n\n#### spider(options)\n\nThe `options` object can have the following fields:\n\n* `maxSockets` - Integer containing the maximum number of sockets in the pool. Defaults to `4`.\n* `userAgent` - The user agent string sent to the remote server along with each request. Defaults to `Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_4; en-US) AppleWebKit/534.7 (KHTML, like Gecko) Chrome/7.0.517.41 Safari/534.7` (a Chrome user agent string).\n* `cache` - The Cache object to be used as the cache. 
Defaults to NoCache; see the code for details on implementing a new Cache object.\n* `pool` - A hash object containing the agents for the requests. If omitted, the requests will use the global pool, which is limited to `maxSockets`.\n\n### Adding a Route Handler\n\n#### spider.route(hosts, pattern, cb)\nThe parameters are the following:\n\n* `hosts` - A string -- or an array of strings -- representing the `host` part of the targeted URL(s).\n* `pattern` - The pattern against which spider tries to match the remainder (`pathname` + `search` + `hash`) of the URL(s).\n* `cb` - A function of the form `function(window, $)` where\n  * `this` - References the `Routes.match` return value, with some other goodies added by spider. For more info see https://github.com/aaronblohowiak/routes.js\n  * `window` - References the document's window.\n  * `$` - References the jQuery object.\n\n### Queuing a URL for spider to fetch\n\n`spider.get(url)` where `url` is the URL to fetch.\n\n### Extending / Replacing the MemoryCache\n\nCurrently the MemoryCache must provide the following methods:\n\n* `get(url, cb)` - Returns `url`'s `body` field via the `cb` callback/continuation if it exists. Returns `null` otherwise.\n  * `cb` - Must be of the form `function(retval) {...}`\n* `getHeaders(url, cb)` - Returns `url`'s `headers` field via the `cb` callback/continuation if it exists. Returns `null` otherwise.\n  * `cb` - Must be of the form `function(retval) {...}`\n* `set(url, headers, body)` - Sets/saves `url`'s `headers` and `body` in the cache.\n\n### Setting the verbose/log level\n`spider.log(level)` - Where `level` is a string that can be any of `\"debug\"`, `\"info\"`, or `\"error\"`.\n","bugs":{"url":"http://github.com/mikeal/spider/issues"},"readmeFilename":"","users":{"ninozhang":true}}
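The README above specifies the cache interface as three methods: `get(url, cb)`, `getHeaders(url, cb)`, and `set(url, headers, body)`. A minimal sketch of a replacement cache satisfying that interface is below; this is an illustrative implementation written from the README's description, not the package's actual MemoryCache code.

```javascript
// Hypothetical in-memory cache matching the interface the README describes:
// get(url, cb), getHeaders(url, cb), set(url, headers, body).
function MemoryCache() {
  this.store = {}; // url -> { headers: ..., body: ... }
}

// Passes url's cached body to cb, or null if the url is not cached.
MemoryCache.prototype.get = function (url, cb) {
  var entry = this.store[url];
  cb(entry ? entry.body : null);
};

// Passes url's cached headers to cb, or null if the url is not cached.
MemoryCache.prototype.getHeaders = function (url, cb) {
  var entry = this.store[url];
  cb(entry ? entry.headers : null);
};

// Saves url's headers and body in the cache.
MemoryCache.prototype.set = function (url, headers, body) {
  this.store[url] = { headers: headers, body: body };
};
```

Assuming the spider accepts any object with these three methods, it could be supplied via the `cache` option, e.g. `spider({ cache: new MemoryCache() })`.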