{"_id":"baoyu-fetch","_rev":"3-d26b67a2541e45a9d08c67e5a9240288","name":"baoyu-fetch","dist-tags":{"latest":"0.1.2"},"versions":{"0.1.1":{"name":"baoyu-fetch","version":"0.1.1","_id":"baoyu-fetch@0.1.1","maintainers":[{"name":"jimliu","email":"junminliu@gmail.com"}],"homepage":"https://github.com/JimLiu/baoyu-skills/tree/main/packages/baoyu-fetch#readme","bugs":{"url":"https://github.com/JimLiu/baoyu-skills/issues"},"bin":{"baoyu-fetch":"src/cli.ts"},"dist":{"shasum":"0309a256e0fbbdfee045c3a70ec2c4d09f8bdb70","tarball":"https://registry.npmjs.org/baoyu-fetch/-/baoyu-fetch-0.1.1.tgz","fileCount":44,"integrity":"sha512-502y9LhfPxI5SRpBJTspbYxPWY9hrAgM0gpCsGdW/B8w0doDfQYdqEGk6eHuLJ6ZxWiXnMmFn+/Oo70nNu9TBQ==","signatures":[{"sig":"MEUCIQC8L8YRnDlq8YXtrlNjjpRi/P+jihginad/iFcnRv3tKQIgGXwXrVlYSKo9NJpngSfKG8VhSsTA/uTJWCUy7Ctou+g=","keyid":"SHA256:DhQ8wR5APBvFHLF/+Tc+AYvPOdTpcIDqOhxsBHRwC7U"}],"unpackedSize":234611},"type":"module","engines":{"bun":">=1.2.0"},"gitHead":"28e0deae20df9d2413aadfc56b40bdc4861e73fc","scripts":{"dev":"bun run ./src/cli.ts","test":"bun test","build":"rm -rf dist && bun build ./src/cli.ts --target bun --outfile ./dist/cli.js && chmod +x ./dist/cli.js","check":"tsc --noEmit","release":"changeset publish","version-packages":"changeset version"},"_npmUser":{"name":"jimliu","email":"junminliu@gmail.com"},"repository":{"url":"git+https://github.com/JimLiu/baoyu-skills.git","type":"git","directory":"packages/baoyu-fetch"},"_npmVersion":"11.8.0","description":"Read URLs into high-quality Markdown or JSON with Chrome CDP and site adapters.","directories":{},"_nodeVersion":"24.13.1","dependencies":{"ws":"^8.18.3","jsdom":"^26.0.0","unified":"^11.0.5","defuddle":"^0.14.0","turndown":"^7.2.0","remark-gfm":"^4.0.1","remark-parse":"^11.0.0","chrome-launcher":"^1.2.1","turndown-plugin-gfm":"^1.0.2","@mozilla/readability":"^0.6.0"},"publishConfig":{"access":"public"},"_hasShrinkwrap":false,"devDependencies":{"@types/ws":"^8.18.1","@types/bun":"^1.2.23","typescript":"^5.9.2","@types/jsdom":"^21.1.7","@changesets/cli":"^2.30.0"},"_npmOperationalInternal":{"tmp":"tmp/baoyu-fetch_0.1.1_1776563221996_0.553722615653595","host":"s3://npm-registry-packages-npm-production"},"deprecated":"Broken CLI bin metadata; use baoyu-fetch@0.1.2 or later."},"0.1.2":{"name":"baoyu-fetch","version":"0.1.2","_id":"baoyu-fetch@0.1.2","maintainers":[{"name":"jimliu","email":"junminliu@gmail.com"}],"homepage":"https://github.com/JimLiu/baoyu-skills/tree/main/packages/baoyu-fetch#readme","bugs":{"url":"https://github.com/JimLiu/baoyu-skills/issues"},"bin":{"baoyu-fetch":"dist/cli.js"},"dist":{"shasum":"4103fbcfda7eeea940c01a943faeefa52edd023c","tarball":"https://registry.npmjs.org/baoyu-fetch/-/baoyu-fetch-0.1.2.tgz","fileCount":45,"integrity":"sha512-VB4CEtIcoiJo3m80RrwBuYOdC257VI6kptoI1kiRcZV78VWEicWdznRoByrjddgyHTq0WWEK5i/7L67mHcVPvQ==","signatures":[{"sig":"MEUCIEVzbpzg9Y5H1cex/p9/92aj9WdcE4/j4QKBRtNiNfqMAiEAmdmHxphDh2vSQ9N4rFIcTkjpUDGhoTEb9FiGjIdSs8E=","keyid":"SHA256:DhQ8wR5APBvFHLF/+Tc+AYvPOdTpcIDqOhxsBHRwC7U"}],"unpackedSize":7619440},"type":"module","engines":{"bun":">=1.2.0"},"gitHead":"28e0deae20df9d2413aadfc56b40bdc4861e73fc","scripts":{"dev":"bun run ./src/cli.ts","test":"bun test","build":"rm -rf dist && bun build ./src/cli.ts --target bun --outfile ./dist/cli.js && chmod +x ./dist/cli.js","check":"tsc --noEmit","prepack":"bun run build","release":"changeset publish","version-packages":"changeset version"},"_npmUser":{"name":"jimliu","email":"junminliu@gmail.com"},"repository":{"url":"git+https://github.com/JimLiu/baoyu-skills.git","type":"git","directory":"packages/baoyu-fetch"},"_npmVersion":"11.8.0","description":"Read URLs into high-quality Markdown or JSON with Chrome CDP and site adapters.","directories":{},"_nodeVersion":"24.13.1","dependencies":{"ws":"^8.18.3","jsdom":"^26.0.0","unified":"^11.0.5","defuddle":"^0.14.0","turndown":"^7.2.0","remark-gfm":"^4.0.1","remark-parse":"^11.0.0","chrome-launcher":"^1.2.1","turndown-plugin-gfm":"^1.0.2","@mozilla/readability":"^0.6.0"},"publishConfig":{"access":"public"},"_hasShrinkwrap":false,"devDependencies":{"@types/ws":"^8.18.1","@types/bun":"^1.2.23","typescript":"^5.9.2","@types/jsdom":"^21.1.7","@changesets/cli":"^2.30.0"},"_npmOperationalInternal":{"tmp":"tmp/baoyu-fetch_0.1.2_1776563444419_0.42486343826675066","host":"s3://npm-registry-packages-npm-production"}}},"time":{"created":"2026-04-19T01:47:01.902Z","modified":"2026-04-19T01:50:57.121Z","0.1.1":"2026-04-19T01:47:02.167Z","0.1.2":"2026-04-19T01:50:44.659Z"},"bugs":{"url":"https://github.com/JimLiu/baoyu-skills/issues"},"homepage":"https://github.com/JimLiu/baoyu-skills/tree/main/packages/baoyu-fetch#readme","repository":{"url":"git+https://github.com/JimLiu/baoyu-skills.git","type":"git","directory":"packages/baoyu-fetch"},"description":"Read URLs into high-quality Markdown or JSON with Chrome CDP and site adapters.","maintainers":[{"name":"jimliu","email":"junminliu@gmail.com"}],"readme":"# baoyu-fetch\n\n[English](./README.md) | 简体中文 | [更新日志](./CHANGELOG.zh-CN.md) | [English Changelog](./CHANGELOG.md)\n\n`baoyu-fetch` 是一个基于 Chrome CDP 的 Bun CLI。输入 URL，它会输出高质量\n`markdown` 或 `json`；命中站点 adapter 时优先消费 API 返回或页面内结构化\n数据，未命中时回退到通用 HTML 提取。\n\n## 当前能力\n\n- 通过 Chrome CDP 抓取渲染后的页面内容\n- 监听网络请求与响应，按需拉取响应体\n- adapter registry，支持按 URL 自动命中站点处理器\n- 内置 `x`、`youtube`、`hn` adapters\n- 通用 fallback：Defuddle 优先，Readability + HTML to Markdown 回退；`--format markdown` 时会再尝试 `defuddle.md` 兜底\n- `stdout` 或 `--output` 输出 `markdown` / `json`\n- 可选下载提取出的图片/视频并重写 Markdown 链接\n- 提供登录/验证场景下的交互等待模式\n- Chrome profile 默认对齐 `baoyu-skills/chrome-profile`\n\n## 安装\n\n```bash\nbun install\n```\n\n作为包使用时，推荐直接这样运行：\n\n```bash\nbunx baoyu-fetch https://example.com\n```\n\n也可以全局安装：\n\n```bash\nnpm install -g baoyu-fetch\n```\n\nnpm 包发布的是 TypeScript 源码入口，不包含预编译的 `dist`，所以运行时需要\nBun。\n\n## 用法\n\n```bash\nbun run src/cli.ts https://example.com\nbunx baoyu-fetch https://example.com\nbaoyu-fetch https://example.com\nbaoyu-fetch https://example.com --format markdown --output article.md\nbaoyu-fetch https://example.com --format markdown --output article.md --download-media\nbaoyu-fetch https://x.com/jack/status/20 --format json --output article.json\nbaoyu-fetch https://x.com/jack/status/20 --json\nbaoyu-fetch https://x.com/jack/status/20 --wait-for interaction\nbaoyu-fetch https://x.com/jack/status/20 --wait-for force\nbaoyu-fetch https://x.com/jack/status/20 --chrome-profile-dir ~/Library/Application\\\\ Support/baoyu-skills/chrome-profile\n```\n\n## 主要参数\n\n```bash\nbaoyu-fetch <url> [options]\n\nOptions:\n  --output <file>       保存输出内容到文件\n  --format <type>       输出格式：markdown | json\n  --json                `--format json` 的兼容别名\n  --adapter <name>      强制使用指定 adapter（如 x / hn / generic）\n  --download-media      下载 adapter 返回的媒体到 ./imgs 和 ./videos，并重写 markdown 链接\n  --media-dir <dir>     指定媒体下载根目录；默认使用输出文件所在目录\n  --debug-dir <dir>     导出调试信息（html、document.json、network.json）\n  --cdp-url <url>       连接现有 Chrome 调试地址\n  --browser-path <path> 指定 Chrome 可执行文件\n  --chrome-profile-dir <path>\n                        指定 Chrome profile 目录。默认使用 BAOYU_CHROME_PROFILE_DIR，\n                        否则回退到 baoyu-skills/chrome-profile\n  --headless            启动临时 headless Chrome（未连现有实例时）\n  --wait-for <mode>     等待模式：interaction | force\n  --wait-for-interaction\n                        `--wait-for interaction` 的别名\n  --wait-for-login      `--wait-for interaction` 的别名\n  --interaction-timeout <ms>\n                        手动交互等待超时，默认 600000\n  --interaction-poll-interval <ms>\n                        等待期间的轮询间隔，默认 1500\n  --login-timeout <ms>  `--interaction-timeout` 的别名\n  --login-poll-interval <ms>\n                        `--interaction-poll-interval` 的别名\n  --timeout <ms>        页面加载超时，默认 30000\n  --help                显示帮助\n```\n\n## 设计\n\n核心链路：\n\n1. CLI 解析 URL 和选项\n2. 建立 CDP 会话并创建受控 tab\n3. 启动 `NetworkJournal` 收集所有请求/响应\n4. 由 adapter registry 匹配站点 adapter\n5. adapter 返回结构化 `ExtractedDocument`\n6. 没命中则走通用 HTML 提取\n7. 按请求输出 Markdown，或输出包含 `document` 和 `markdown` 的 JSON\n\n## 开发\n\n```bash\nbun run check\nbun run test\nbun run build\n```\n\n## 发版\n\n新增用户可见改动后，先添加一个 changeset：\n\n```bash\nbunx changeset\n```\n\n把生成的 `.changeset/*.md` 一起合并到 `main` 后，GitHub Actions 会自动创建或\n更新 release PR；合并 release PR 之后，会自动发布到 npm。\n\n发布流程不会编译 `dist`，而是直接把 `src/*.ts` 发布出去供 Bun 执行。\n","readmeFilename":"README.zh-CN.md"}