§ 0 给「帮我写剧本」的 AI 的元提示词
§ 0 Meta-prompt for the AI that writes your brief
把这段话直接复制粘贴给 ChatGPT / Claude / Gemini / DeepSeek 或任何在线大模型,它就能为你写出 Artemis 兼容的剧本:
Copy and paste the block below to ChatGPT / Claude / Gemini / DeepSeek or any chat LLM. It will then produce briefs that Artemis can parse cleanly.
你正在为我撰写一份 Artemis Saga 长视频工作流的剧本提示词。
Artemis 会自动分析剧本、自动构造导演级 prompt 块、自动驱动 video 模型。
我需要的是「Artemis 能识别的结构化 brief」,而不是「video 模型直接读的 prompt」。
请严格遵守以下结构约定:
1. 用 [X-Y秒] 或 [MM:SS-MM:SS] 标记每段时间段。例如 [0-8秒] / [0:00-0:08]
2. 对白必须用以下任一 marker 显式标注(裸引号不算对白):
**对白(说明)**: "实际台词"
**台词**: "实际台词"
旁白: "实际旁白"
字幕: "实际字幕"
dialogue: "..." / voiceover: "..."
3. 锁机位写明确关键词:「锁死三脚架机位 / NO pan/tilt/zoom/dolly/handheld」
4. 不要 AI 生成音乐时写明:「AI 生成阶段只出环境音」或「音乐都是后期叠加」
5. 段落顺序就是视频顺序,每段时间码内的内容是该段的完整说明
6. 写中文剧本不要把镜头、地点、动作翻译成英文 — Artemis 自己会处理多语言
7. 不要假设字幕会被渲染 — 字幕渲染是 Artemis 自动按用户字幕模式控制的
8. 单段时长按当前视频模型动态上限控制,以 Artemis 工作流提示的单段上限为准(通常每段 6-10 秒最稳定)
9. 明确画幅比例 / ratio:16:9 横屏、9:16 竖屏、1:1 方屏;构图与运动必须匹配画幅
10. 对白长度必须匹配时间码:中文约 4-5 字/秒,英文约 2-3 词/秒;5 秒镜头不要塞超过约 20 个中文字
11. 避免复杂物理交互、近身格斗、精细手部特写、吃东西、递接小物件;改写为走位、眼神、情绪、简单手势
12. 如果同一个长镜头被拆成多段,后一段开头必须承接上一段结尾状态:同机位、同动作方向、同人物姿态
13. 如果没有参考图,CHARACTER LOCK 必须写成角色卡:年龄、人种/肤色、脸型、发型发色、标志性服装、固定配饰/痣/疤
14. 不要堆垃圾画质词或模型站套话,例如 "8K / masterpiece / best quality / photorealistic / ultra detailed";只写真实摄影、光线、镜头、材质和故事意图
15. 不要重复写 "防塑料感 / 防变形 / 防多手指" — Artemis 自动加更专业的版本
请基于这些约定,根据我下面给出的题材/构思写完整剧本:
[在这里写你的具体题材]
You are writing a brief for the Artemis Saga long-video workflow.
Artemis will parse the brief, build all director-level prompt blocks automatically, and drive the video model.
What I need from you is a STRUCTURED BRIEF that Artemis can parse — not a raw prompt the video model reads directly.
Strict rules:
1. Tag each segment with [X-Ysec] or [MM:SS-MM:SS], e.g. [0-8sec] / [0:00-0:08]
2. Every spoken line MUST be wrapped in one of the following markers (bare quotes are NOT recognised as dialogue):
**dialogue (note)**: "actual line"
**line**: "actual line"
voiceover: "actual narration"
subtitle: "actual subtitle"
对白: "..." / 旁白: "..." (Chinese equivalents)
3. To lock the camera, use explicit keywords: "locked-off tripod / NO pan/tilt/zoom/dolly/handheld"
4. To prevent AI music generation, write: "AI generation outputs ambience only" or "all music is added in post"
5. Segment order = video order; the contents inside each timestamp ARE that segment's full spec
6. If writing in Chinese, do NOT translate shots / locations / actions to English — Artemis handles multilingual content
7. Do NOT assume subtitles will render — subtitle rendering is controlled by Artemis's subtitle-mode flag set by the user
8. Per-segment duration is bound by the current video model's max — follow whatever cap Artemis surfaces in the workflow (6-10 sec per segment is the safest default)
9. State the aspect ratio explicitly: 16:9 landscape / 9:16 portrait / 1:1 square; composition and motion MUST match
10. Dialogue length MUST match its timecode: ~4-5 Chinese chars/sec, ~2-3 English words/sec; a 5-sec shot should not carry more than ~20 Chinese chars
11. Avoid complex physical interactions, hand-to-hand combat, fine hand close-ups, eating, passing small objects; rewrite as blocking, eye contact, emotion, simple gestures
12. If a long take is split across segments, the next segment MUST continue: same camera, same motion vector, same body pose at the seam
13. Without a reference image, CHARACTER LOCK MUST be a full character card: age, ethnicity / skin tone, face shape, hair style and colour, signature wardrobe, fixed accessories / moles / scars
14. Do NOT stuff prompt-engineering filler such as "8K / masterpiece / best quality / photorealistic / ultra detailed" — only describe real photography, light, lenses, materials, and story intent
15. Do NOT repeat "no plastic skin / no distortion / no extra fingers" — Artemis adds more professional versions automatically
Now write the full brief from the topic / concept below:
[your topic goes here]
§ 1 Artemis Saga 工作流速览
§ 1 Artemis Saga workflow overview
§ 2 Brief 标准骨架
§ 2 Standard brief skeleton
每份 Brief 推荐按这个骨架写:
Use this skeleton for every brief:
【整片叙事】
(200-500 字写清楚整片讲什么、主角是谁、最终高潮是什么、情绪走向)
═════════════════════════════════════════════
【画质规格】
(画幅比例 / ratio、参考画质等级、摄影机感、镜头、色彩、光照风格、绝对禁止项)
【全局基调】
· VIBE: ...
· 镜头机位: ... ← 「锁死三脚架机位 / NO pan/tilt/zoom/dolly」
· CHARACTER LOCK: 主角永久不变的外貌特征;无参考图时写成完整角色卡
· 声音层级: 写明 AI 生成阶段输出什么,哪些是后期叠加;对白长度匹配时间码
═════════════════════════════════════════════
[0-8秒] 段 1 · 段标题
(段内具体内容:镜头角度、人物动作、背景细节、节奏、转场)
【声音】 ...
**对白(约 N 秒,说明)**: "实际中文/外文台词"
[8-16秒] 段 2 · ...
...
═════════════════════════════════════════════
【附录:对白回顾】(可选 · 仅供人类参考)
【附录:地点清单】(可选)
[Story]
(200-500 chars: what the video is about, who the protagonist is, what the final beat is, emotional arc)
═════════════════════════════════════════════
[Picture specs]
(aspect ratio, reference quality tier, camera feel, lens, palette, lighting style, hard exclusions)
[Global tone]
· VIBE: ...
· Camera position: ... ← "locked-off tripod / NO pan/tilt/zoom/dolly"
· CHARACTER LOCK: permanent appearance of the protagonist; write a full character card when no reference image is supplied
· Audio layers: what the AI stage outputs, what is layered in post; dialogue length must match its timecode
═════════════════════════════════════════════
[0-8sec] Segment 1 · title
(segment specifics: camera angle, character actions, background details, pacing, transition)
[Sound] ...
**dialogue (~N sec, note)**: "actual spoken line in original language"
[8-16sec] Segment 2 · ...
...
═════════════════════════════════════════════
[Appendix: dialogue recap] (optional · for human reference only)
[Appendix: location list] (optional)
§ 3 Artemis 识别的所有结构标记
§ 3 Every structural marker Artemis recognises
3.1 时间码(用来分段)
3.1 Timecodes (used for segmentation)
| 格式 | 例子 | 适用 |
|---|---|---|
| 整数秒 + 中文/英文单位 | `[0-8秒]` / `[8-16s]` / `[0-8]` | 短视频 (≤180s) |
| MM:SS 分秒 | `[0:00-0:08]` / `[1:30-1:38]` | 长视频或更精细 |
| 行首松散格式 | `0-8s:` / `0:00-0:08:` | 不用方括号 |
| 英文场景标记 | `Scene 1` / `Shot 1` / `Segment 1` | 无时间码时(平均分配) |
| 中文场景标记 | `镜头 1` / `第 1 段` / `段 1` | 无时间码时(平均分配) |
| Format | Example | When to use |
|---|---|---|
| Integer seconds + unit | `[0-8秒]` / `[8-16s]` / `[0-8]` | Short videos (≤180s) |
| MM:SS clock | `[0:00-0:08]` / `[1:30-1:38]` | Long videos / finer granularity |
| Loose line-leading form | `0-8s:` / `0:00-0:08:` | No brackets |
| English scene marker | `Scene 1` / `Shot 1` / `Segment 1` | When no timecodes exist (durations split evenly) |
| Chinese scene marker | `镜头 1` / `第 1 段` / `段 1` | When no timecodes exist (durations split evenly) |
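The timecode formats in the table can all be reduced to plain start/end seconds. The following is a minimal Python sketch under stated assumptions: Artemis's actual parser is not documented in this guide, so the regular expression and the names `parse_timecodes` and `_to_seconds` are hypothetical. Scene markers such as `Scene 1` are left out on purpose because they carry no timing.

```python
import re

# Hypothetical pattern: optional "[", start-end, optional unit, then "]" or ":".
# Each side is either integer seconds ("8") or an MM:SS clock ("1:30").
_TIMECODE = re.compile(
    r"^\s*\[?\s*(\d+(?::\d{2})?)\s*[-–]\s*(\d+(?::\d{2})?)\s*(?:秒|sec|s)?\s*(?:\]|:)"
)


def _to_seconds(token: str) -> int:
    """Convert '8' or '1:30' to a number of seconds."""
    if ":" in token:
        minutes, seconds = token.split(":")
        return int(minutes) * 60 + int(seconds)
    return int(token)


def parse_timecodes(brief: str) -> list[tuple[int, int]]:
    """Return (start, end) in seconds for every line that opens with a timecode."""
    spans = []
    for line in brief.splitlines():
        match = _TIMECODE.match(line)
        if match:
            spans.append((_to_seconds(match.group(1)), _to_seconds(match.group(2))))
    return spans
```

Calling `parse_timecodes` on a whole brief yields one span per segment header, which is enough to check durations against the per-segment cap before submitting.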
3.2 对白 marker(必须带 marker 才识别为对白)
3.2 Dialogue markers (only marked lines count as dialogue)
| 写法 | 用途 |
|---|---|
| `**对白(说明)**: "台词"` | 口语对白(带说明) |
| `**对白**: "台词"` | 口语对白(不带说明) |
| `对白: "台词"` | 同上(不带 markdown) |
| `**旁白**: "..."` | 旁白(不出口型) |
| `**字幕**: "..."` | 屏幕字幕(按字幕模式决定渲染) |
| `dialogue: "..."` | 英文等价 |
| `voiceover: "..."` | 英文旁白 |
| `she says: "..."` / `他低声说: "..."` | 自然语言变体 |
| Form | Purpose |
|---|---|
| `**dialogue (note)**: "line"` | Spoken dialogue with a note |
| `**dialogue**: "line"` | Spoken dialogue, no note |
| `dialogue: "line"` | Same as above, no markdown |
| `**voiceover**: "..."` | Voiceover (no lip-sync) |
| `**subtitle**: "..."` | On-screen subtitle (rendered per subtitle mode) |
| `**对白**: "..."` / `**旁白**: "..."` | Chinese equivalents |
| `she says: "..."` / `he whispers: "..."` | Natural-language variants |
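To make the marker rule and the speech-rate guideline used throughout this guide concrete, here is a minimal Python sketch. It is an illustration only: the regular expression, the marker list, and the names `extract_dialogue` and `fits_timecode` are assumptions, not Artemis's real recogniser. The natural-language variants (`she says:`) are not covered, and the default rates (4 Chinese characters per second, 3 English words per second) are taken from the guideline figures rather than from any published constant.

```python
import re

# Hypothetical marker list transcribed from the tables above.
_MARKERS = r"对白|台词|旁白|字幕|dialogue|line|voiceover|subtitle"
_DIALOGUE = re.compile(
    r'^\s*\*{0,2}(' + _MARKERS + r')\s*(?:[（(][^）)]*[）)])?\s*\*{0,2}'
    r'\s*[:：]\s*["“](.+?)["”]\s*$',
    re.IGNORECASE,
)
_CJK = re.compile(r"[\u4e00-\u9fff]")


def extract_dialogue(brief: str) -> list[tuple[str, str]]:
    """Return (marker, spoken text) for each explicitly marked line."""
    found = []
    for line in brief.splitlines():
        match = _DIALOGUE.match(line)
        if match:
            found.append((match.group(1).lower(), match.group(2)))
    return found


def fits_timecode(text: str, seconds: float,
                  cjk_per_sec: float = 4.0, words_per_sec: float = 3.0) -> bool:
    """Rough check that a line can be spoken within `seconds`."""
    cjk_count = len(_CJK.findall(text))
    if cjk_count:
        # Treat any line containing CJK characters as Chinese.
        return cjk_count <= seconds * cjk_per_sec
    return len(text.split()) <= seconds * words_per_sec
```

A bare-quoted line such as `她说"我来了"` returns nothing from `extract_dialogue`, which mirrors the rule that only marked lines count as dialogue.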
对白 marker 里的「约 N 秒」必须符合真实语速:中文约 4-5 字/秒,英文约 2-3 词/秒。[0-5秒] 镜头里,中文对白建议不超过约 20 个字;否则 lip-sync 容易失败、被截断或强行拉伸。
The "~N seconds" in a dialogue marker must match real speech rate: ~4-5 Chinese chars/sec, ~2-3 English words/sec. In a [0-5sec] shot, keep Chinese dialogue under ~20 chars; otherwise lip-sync fails, gets truncated, or is forcibly stretched.
以下这类裸引号内容不会被当成对白:
- "歌词引用": 段落副标题里的歌词
- "品牌名": "Parts Unknown" 这种节目引用
- "设计概念": "中国街道" "霓虹城市" 这种引号包裹的概念
Bare quoted text such as the following is not treated as dialogue:
- "quoted lyrics": song lyrics inside a sub-heading
- "brand name": references like "Parts Unknown"
- "design concepts": quoted ideas like "Chinese street" or "neon city"
3.3 镜头锁定关键词
3.3 Camera-lock keywords
Brief 任意位置出现以下任一关键词,Artemis 自动把 CAMERA 块和每段 camera 字段都改为锁死:
Any of the following keywords anywhere in the brief flips the CAMERA block and each segment's camera field to locked-off:
| 中文 Chinese | 英文 English |
|---|---|
| 锁死三脚架 / 锁死机位 / 完全锁死 | locked-off tripod |
| 镜头钉死 / 无任何镜头运动 | no camera movement (whatsoever) |
| NO pan/tilt/zoom/dolly/handheld | NO pan, NO tilt, NO zoom, NO dolly, NO handheld |
3.4 音频意图关键词(自动触发 [AUDIO-LOCK])
3.4 Audio-intent keywords (auto-trigger [AUDIO-LOCK])
出现以下任一关键词,Artemis 强制 video 模型只生成环境音、绝不合成音乐:
When any of these appear, Artemis forces the video model to output ambience only — never synthesise music:
| 关键词模式 |
|---|
| 音乐 + 后期叠加(任一组合) |
| 只出环境音 / 仅环境音 / 只生成环境音 |
| 不要 BGM / 无 BGM / 无背景音乐 / 无配乐 |
| AI 生成阶段只出环境音 |
| `no music` / `no BGM` / `no soundtrack` / `no instrumental` |
| `environmental audio only` / `ambient sounds only` |
| `music/score/soundtrack is added/overlaid in post(-production)` |
| Keyword pattern |
|---|
| "music" + "added/overlaid in post" (any combination) |
| "environmental audio only" / "ambient sounds only" |
| "no music" / "no BGM" / "no soundtrack" / "no instrumental" |
| "music/score/soundtrack is added/overlaid in post(-production)" |
| Chinese equivalents such as 只出环境音, 不要 BGM, AI 生成阶段只出环境音 |
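Before submitting a brief, you can check locally whether it contains one of the audio-intent phrases listed above. The sketch below is a rough approximation under stated assumptions: the patterns are transcribed from the tables, the name `wants_ambience_only` is hypothetical, and the real [AUDIO-LOCK] trigger may be broader or stricter than this.

```python
import re

# Hypothetical patterns transcribed from the keyword tables above.
_AUDIO_LOCK_PATTERNS = [
    r"音乐.*后期叠加|后期叠加.*音乐",            # "music" + "added in post", either order
    r"(?:只出|仅|只生成)环境音",                 # ambience-only phrases
    r"(?:不要|无)\s*(?:BGM|背景音乐|配乐)",      # no BGM / no score
    r"\bno\s+(?:music|bgm|soundtrack|instrumental)\b",
    r"\b(?:environmental audio|ambient sounds?)\s+only\b",
    r"\b(?:music|score|soundtrack)\b.*\b(?:added|overlaid)\s+in\s+post",
]
_AUDIO_LOCK = re.compile(
    "|".join(f"(?:{pattern})" for pattern in _AUDIO_LOCK_PATTERNS),
    re.IGNORECASE,
)


def wants_ambience_only(brief: str) -> bool:
    """True if any single line of the brief contains an audio-intent phrase."""
    return any(_AUDIO_LOCK.search(line) for line in brief.splitlines())
```

Matching is done line by line so that a stray "music" in one paragraph and "added in post" in another do not combine into a false positive.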
3.4b 本地 BGM / Soundtrack 后期混音
3.4b Local BGM / soundtrack post-mix
在 Saga 工作流中,用户确认总时长后会出现 BGM 菜单。BGM 是最终 FFmpeg 后期混音,不是让视频模型临场生成音乐。典型流程是:先选“添加 / 不添加”,再发送本地音频路径或直接音频 URL;如果要调参数,可以和路径写在同一行,也可以在后续混音参数步骤里补充。提供 BGM 后自动输出三个版本:
After you confirm total duration, the Saga workflow shows the BGM menu. BGM is a final FFmpeg post-mix step — never the video model improvising music. The usual flow is: choose add / no add first, then send a local audio path or direct audio URL; if you want to adjust parameters, you may put them on the same line as the path or provide them in the later mix-settings step. Providing a BGM produces three output variants automatically:
| 输出文件 | 音轨内容 | 适用场景 |
|---|---|---|
| `pre-soundtrack-*.mp4` | 原声版(只有视频环境音 + 对白,不混 BGM) | 需要纯净对白做后期再叠混音 |
| `*.mp4`(主交付文件) | BGM 全程混音版(音乐持续在前景) | 纯音乐 vibe 的氛围片,对白少或不重要 |
| `*_bgm_ducked.mp4` | 智能避让版:BGM 在对白/旁白时间码自动降到约 32% 音量,对白结束自动恢复 | 推荐。任何有对白的剧本都用这个 |
| File | Audio content | When to use |
|---|---|---|
| `pre-soundtrack-*.mp4` | Original-audio version (video ambience + dialogue only, no BGM) | You want clean dialogue to mix yourself in post |
| `*.mp4` (primary deliverable) | BGM mixed throughout, music in the foreground | Pure mood / vibe pieces where dialogue is minor or absent |
| `*_bgm_ducked.mp4` | Intelligent ducking: BGM drops to ~32% during dialogue / voiceover timecodes and restores when speech ends | Recommended. Use this for any brief with dialogue. |
可选参数与音频路径可以写在同一行(中英文都识别);如果先只发路径,后面也会再问一次混音参数:
Optional parameters may go on the same line as the audio path (Chinese and English are both recognised); if you only send the path first, the workflow will ask again for mix parameters later:
| 参数 | 写法 | 默认 | 说明 |
|---|---|---|---|
| 起点 | 从 1:19 开始 / start 1:19 / start 90s | 0 | 音乐从第几秒开始播 |
| 音乐音量 | 音量 -12dB / volume -12dB | cover -12dB / ducked -16dB | BGM 自身音量 |
| 环境音音量 | 环境音 -18dB / ambience -18dB | cover -18dB / ducked 0dB | 视频原声音量 |
| 淡入 | 淡入 0.5 秒 / fade in 0.5s | 0.3s | 开头淡入秒数 |
| 淡出 | 淡出 1.2 秒 / fadeout 1.2s | 1.0s | 结尾淡出秒数 |
| Parameter | Syntax | Default | Notes |
|---|---|---|---|
| Start offset | start 1:19 / start 90s / 从 1:19 开始 | 0 | Where in the track the music begins |
| Music volume | volume -12dB / 音量 -12dB | cover -12dB / ducked -16dB | BGM gain |
| Ambience volume | ambience -18dB / 环境音 -18dB | cover -18dB / ducked 0dB | Original video audio level |
| Fade in | fade in 0.5s / 淡入 0.5 秒 | 0.3s | Fade-in seconds at the head |
| Fade out | fadeout 1.2s / 淡出 1.2 秒 | 1.0s | Fade-out seconds at the tail |
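The parameter syntax in the table is regular enough to sketch a parser for it. The following Python example is illustrative only: `parse_mix_params`, its key names, and the regular expressions are assumptions made for this guide and are not the workflow's own code. Keys that are not present in the message are simply omitted, so the defaults from the table would still apply.

```python
import re

_NUM = r"(-?\d+(?:\.\d+)?)"


def parse_mix_params(line: str) -> dict[str, float]:
    """Extract optional mix parameters from one message (Chinese or English)."""
    params: dict[str, float] = {}

    # Start offset: "start 1:19", "start 90s" or "从 1:19 开始".
    start = re.search(r"(?:start|从)\s*(?:(\d+):(\d{2})|(\d+)\s*s?)", line, re.IGNORECASE)
    if start:
        if start.group(1) is not None:
            params["start_s"] = int(start.group(1)) * 60 + int(start.group(2))
        else:
            params["start_s"] = int(start.group(3))

    patterns = {
        "music_db": r"(?:volume|音量)\s*" + _NUM + r"\s*dB",
        "ambience_db": r"(?:ambience|环境音)\s*" + _NUM + r"\s*dB",
        "fade_in_s": r"(?:fade\s*in|淡入)\s*" + _NUM,
        "fade_out_s": r"(?:fade\s*out|淡出)\s*" + _NUM,
    }
    for key, pattern in patterns.items():
        match = re.search(pattern, line, re.IGNORECASE)
        if match:
            params[key] = float(match.group(1))
    return params
```

For example, a message that carries only a path and `start 1:19` yields a single `start_s` entry of 79 seconds.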
dB 是对数单位,越接近 0 越响:-3dB 较响,-12dB 适中(默认垫底音量),-20dB 很轻,-30dB 以下基本听不到。不是「-1 代表最小音量」;恰好相反,-1dB 几乎等于原音量。
dB is logarithmic. Closer to 0 means louder: -3dB is fairly loud, -12dB sits as a comfortable underscore (default), -20dB is quiet, below -30dB is almost inaudible. It is NOT "-1 = quietest" — the opposite: -1dB is essentially original volume.
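To see why -1dB is close to full volume, it helps to convert decibels to a linear multiplier. The sketch below assumes the common amplitude convention (a factor of 10^(dB/20)); whether the "~32%" ducking figure is measured the same way is an assumption, and the function names are illustrative.

```python
import math


def db_to_gain(db: float) -> float:
    """Convert a decibel value to a linear amplitude multiplier."""
    return 10 ** (db / 20)


def gain_to_db(gain: float) -> float:
    """Convert a linear amplitude multiplier (> 0) back to decibels."""
    return 20 * math.log10(gain)
```

Under that convention, -1dB keeps roughly 89% of the amplitude, -12dB about 25%, and -20dB exactly 10%. A 32% amplitude corresponds to roughly -10dB.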
3.5 字幕模式(在工作流里选)
3.5 Subtitle mode (chosen in the workflow)
| 模式 | 行为 |
|---|---|
| 自动(推荐) | 只有你 brief 明确要求字幕/屏幕文字时才渲染;这是当前工作流默认推荐项 |
| 带字幕 | 把对白/旁白渲染为可读字幕,原文保留不翻译 |
| 无字幕 | 对白只走音频/口型,不主动渲染成屏幕文字;适合怕模型把歌词/品牌名误渲染时使用 |
| Mode | Behaviour |
|---|---|
| Auto (recommended) | Renders subtitles / on-screen text only when the brief explicitly asks; this is the default the workflow recommends. |
| With subtitles | Renders dialogue / voiceover as readable subtitles, preserving the original language without translation. |
| No subtitles | Dialogue stays audio / lip-sync only; nothing is rendered as on-screen text. Use this when worried the model might misrender lyrics or brand names. |
3.6 首帧定位关键词(让人物朝向 / 位置 / 运动方向稳定可控)
3.6 Opening-frame keywords (lock orientation / position / motion direction reliably)
每段开头明确写出人物在画面里的位置、身体朝向、运动方向、取景、机位,Artemis 会把它们锁成首帧的硬规则。不写或写得模糊时,每次生成可能出现「人物朝向左右翻」「位置漂移到中央」之类的不稳定结果。
At the top of each segment, state the subject's frame position, body orientation, motion direction, shot size, and camera explicitly. Artemis then locks them as hard rules for the opening keyframe. Without these cues — or with vague ones — every run risks left/right facing flips or position drift to centre.
| 维度 | 怎么写(中文) | 怎么写(英文) |
|---|---|---|
| 横向位置 | 画面左边缘 5% / 画面右边缘 10% / 画面 30% 处 / 画面中央 | left edge 5% / 30% horizontal / centred in the frame |
| 身体朝向 | 侧面 / 正面朝镜头 / 背影 / 3/4 侧背 | profile / facing camera / back to camera / three-quarter back |
| 运动方向 | 向画面右侧走 / 向左移动 / 朝镜头走来 | moving rightward / drifting left / walking toward the camera |
| 取景 | 中景全身 / 中景 / 特写 / 极近特写 / 大远景 | medium wide / medium shot / close-up / extreme close-up / long shot |
| 机位 | 锁死三脚架 / 推镜 | locked-off tripod / slow dolly-in |
| Dimension | How to write (Chinese) | How to write (English) |
|---|---|---|
| Horizontal position | 画面左边缘 5% / 画面右边缘 10% / 画面 30% 处 / 画面中央 | left edge 5% / 30% horizontal / centred in the frame |
| Body orientation | 侧面 / 正面朝镜头 / 背影 / 3/4 侧背 | profile / facing camera / back to camera / three-quarter back |
| Motion direction | 向画面右侧走 / 向左移动 / 朝镜头走来 | moving rightward / drifting left / walking toward the camera |
| Shot size | 中景全身 / 中景 / 特写 / 极近特写 / 大远景 | medium wide / medium shot / close-up / extreme close-up / long shot |
| Camera | 锁死三脚架 / 推镜 | locked-off tripod / slow dolly-in |
推荐写法:在每段的第一句用「首帧定位」整理这 5 类信息,再写场景细节。这样首帧画面方向不会再翻车。
Recommended pattern: open each segment with an "Opening framing" line that consolidates these 5 cues, then describe the scene. The opening frame will stop flipping direction across runs.
[0-8秒] 段 1 · 东京迷失
**首帧定位**:女主侧面剪影位于画面左边缘 5%,向画面右侧缓慢迈步;中景全身,锁死三脚架。
(接下来正文场景细节...)
[0-8sec] Segment 1 · Tokyo lost
**Opening framing**: the protagonist is in profile, positioned at left edge ~5%, moving rightward in a slow stride; medium wide / full body; locked-off tripod.
(then continue with scene specifics...)
§ 4 Artemis 自动为你生成的导演级模块
§ 4 Director-level modules Artemis generates for you
你不需要在 brief 里写下列任何东西,Artemis 会自动加上:
You do not need to write any of the following in your brief — Artemis adds them automatically:
| 自动模块 | 作用 |
|---|---|
| `[SAGA-CONTINUITY-POLICY]` | 全片角色身份一致性硬规则 |
| `[CHARACTERS]` / `[LOCKED-CHARACTERS]` | 角色 lock |
| `[ACCESSORY-LOCK]` | 永久配饰(墨镜、头巾、手镯,自动从 brief 提取) |
| `[WARDROBE]` / `[LOCKED-WARDROBE]` | 服装 lock |
| `[LOCKED-PROPS]` | 道具 lock(已自动过滤 reference 照片偶发背景) |
| `[LOCKED-LOCATIONS]` | 场景/地点 lock |
| `[PALETTE]` `[LIGHTING]` `[CAMERA]` `[MOOD]` | 调色 / 光照 / 镜头 / 情绪 |
| `[NEGATIVE]` | 负面约束(含字幕模式联动) |
| `[AUDIO-LOCK]` | 音频锁(按 brief 意图自动加) |
| `[STYLE-LOCK]` `[SCENE-PRIORITY]` | 跨段风格 / storyBeat 优先级 |
| `[EXPLICIT USER BRIEF LOCK]` | 自动提取的具体地点/道具/对白锚 |
| `[REFERENCE-ROLE-SEPARATION]` | 参考图职责分离(身份用 / 不要继承塑料皮肤) |
| `[AESTHETIC-LOCK: HUMAN-EDITORIAL]` 或 `PRODUCT-CINEMATIC` | 美学锁 |
| `[Saga Narrative Entity Map]` | 主角/商品/环境“上帝实体”地图,传给 Saga Critic 防跑题 |
| 首帧定位锁 | 从你 brief 里自动提取每段的位置 / 朝向 / 运动方向 / 取景 / 机位,固化为首帧硬规则(详见 §3.6) |
| BGM 三版本输出 | 提供 BGM 后自动生成原声、混音、智能避让对白三个最终文件(详见 §3.4b) |
| `cleanDirect`(显式关键词才启用) | 原始/少滤镜模式,去掉美学修饰;角色、配饰、服装、场景、字幕防护这些一致性锁仍然保留,可以放心使用 |
| Auto module | Purpose |
|---|---|
| `[SAGA-CONTINUITY-POLICY]` | Whole-film character-identity hard rule |
| `[CHARACTERS]` / `[LOCKED-CHARACTERS]` | Character lock |
| `[ACCESSORY-LOCK]` | Permanent accessories (sunglasses, headscarf, bracelets; auto-extracted from the brief) |
| `[WARDROBE]` / `[LOCKED-WARDROBE]` | Wardrobe lock |
| `[LOCKED-PROPS]` | Prop lock (incidental reference-photo backgrounds are auto-filtered) |
| `[LOCKED-LOCATIONS]` | Scene / location lock |
| `[PALETTE]` `[LIGHTING]` `[CAMERA]` `[MOOD]` | Palette / lighting / camera / mood |
| `[NEGATIVE]` | Negative constraints (subtitle-mode aware) |
| `[AUDIO-LOCK]` | Audio lock (added automatically based on brief intent) |
| `[STYLE-LOCK]` `[SCENE-PRIORITY]` | Cross-segment style / storyBeat priority |
| `[EXPLICIT USER BRIEF LOCK]` | Auto-extracted concrete location / prop / dialogue anchors |
| `[REFERENCE-ROLE-SEPARATION]` | Reference-image role separation (identity only / do not inherit plastic skin) |
| `[AESTHETIC-LOCK: HUMAN-EDITORIAL]` or `PRODUCT-CINEMATIC` | Aesthetic lock |
| `[Saga Narrative Entity Map]` | "God-entity" map of protagonist / product / environment, fed to Saga Critic to prevent drift |
| Opening-frame lock | Auto-extracts each segment's position / orientation / motion / shot size / camera from your brief and hard-locks them for the opening keyframe (see §3.6) |
| 3-variant BGM output | When a BGM is supplied, three final files are produced: original, mixed, intelligent-ducking (see §3.4b) |
| `cleanDirect` (explicit keyword only) | Raw / low-filter mode: removes aesthetic dressing; character, accessory, wardrobe, scene, and subtitle-protection locks all stay in place, so it is safe to use |
§ 5 Do & Don't
把时间码写在段落起始的方括号里
Put the timecode in brackets at the start of each segment
[8-16秒] 段 2 · 段标题
不要把对白写成裸引号
Do NOT use bare quotes for dialogue
她说"我来了"
↑ Artemis 不会识别为对白
She said "I'm here"
↑ Artemis will not recognise this as dialogue
对白必须用 marker 包裹
Wrap every line with an explicit dialogue marker
**对白**: "我来了"
不要在 brief 中重复 Artemis 自动加的套话
Don't repeat boilerplate Artemis already adds
NO waxy skin, NO plastic skin,
no AI smoothing, anatomically coherent...
↑ Artemis 自动加更专业版本
NO waxy skin, NO plastic skin,
no AI smoothing, anatomically coherent...
↑ Artemis adds a more professional version automatically
锁机位用明确关键词,不要绕
Use direct camera-lock keywords; don't paraphrase
完全锁死的三脚架机位
NO pan, NO tilt, NO zoom, NO dolly
locked-off tripod, no camera movement whatsoever
NO pan, NO tilt, NO zoom, NO dolly
不要把角色参考照片的偶发背景写进剧本
Don't bring incidental background from the reference photo into the brief
我上传的参考照片里有黄色座椅,
把它放进每段背景
↑ Artemis 会自动过滤这条
The reference photo I uploaded has yellow seats —
put them in every background
↑ Artemis filters this kind of injection automatically
想用后期 BGM 时写明环境音意图
State ambience-only intent when you plan to overlay BGM in post
AI 生成阶段只出环境音;
音乐和对白都是后期叠加
AI stage outputs ambience only;
music and dialogue are layered in post
不要要求生成特定品牌 BGM
Don't ask the model to generate a specific licensed track
用 JVKE Golden Hour 当 BGM
↑ AI 不会用,且会触发版权问题
Use JVKE "golden hour" as BGM
↑ The model will not match it and may trigger copyright issues
引用真实地标 + 标志性细节
Reference real landmarks plus signature details
成都太古里街头,远处可见
成都IFS熊猫爬楼雕塑
Sino-Ocean Taikoo Li in Chengdu, with the
IFS climbing-panda sculpture visible in the distance
不要在节标题里加英文歌词引号
Don't put quoted English lyrics into section titles
段 1 · "It was just two lovers"
↑ 会被渲染成屏幕字幕
Segment 1 · "It was just two lovers"
↑ Risks being rendered as on-screen text
把动作拆成稳定、可生成的物理状态
Break action into stable, generatable physical states
她停在门口,抬眼看向对方,
手指轻触门框,风吹动头发
She pauses at the doorway, lifts her gaze,
fingertip brushing the doorframe, wind in her hair
不要设计复杂交互或手部特写
Don't design complex interactions or fine hand close-ups
两人近身格斗并交换硬币;
主角用筷子精准夹起花生米
↑ 容易肢体融合、手指错乱、物体漂浮
Two characters fighting and exchanging a coin;
protagonist picking up a peanut with chopsticks
↑ Causes limb merging, broken finger counts, floating props
§ 6 完整剧本模板(拿来即用)
§ 6 Full brief template (copy & fill)
所有 [填入...] 都要替换成你自己的内容。
Replace every [fill in...] with your own content.
【整片叙事】
[填入 200-500 字整片故事:主角是谁、视觉风格、整体节奏、最终高潮]
═══════════════════════════════════════════════════════════════
【画质规格】
· 画幅比例 / ratio: [16:9 横屏 / 9:16 竖屏 / 1:1 方屏]
· 参考画质等级: [填入参考片名 / 风格]
· 摄影机感: [填入摄影机品牌 / 大画幅 / iPhone 等]
· 镜头: [35mm / 50mm 定焦 / 鱼眼 / 长焦 / etc]
· 色彩: [teal-orange / 高饱和暖色 / 低饱和冷调 / etc]
· 光照: [自然光 / 实拍灯位 / 霓虹 / 烛光 / etc]
· 运动强度: [微幅 / 舒缓 / 中等 / 剧烈 / 爆发]
· 高级摄影质感(可选): [anamorphic flare / volumetric light / 35mm film grain / halation / practical lighting]
═══════════════════════════════════════════════════════════════
【全局基调】
· VIBE: [一句话概括整片调性]
· 镜头机位: [如果要锁机位,写:完全锁死的三脚架机位 (locked-off tripod);
NO pan, NO tilt, NO zoom, NO dolly, NO handheld shake]
· CHARACTER LOCK: [主角永久不变的年龄、人种/肤色、脸型、发型发色、服装、配饰、妆容;无参考图时必须写成完整角色卡]
· 声音层级:
- 背景音乐: [不加 BGM / 后期叠加本地 BGM / 使用直接音频 URL]
- BGM 参数(可选): 本地路径或音频直链、从第几秒开始、BGM 音量、环境音音量、淡入淡出;也可先只发路径,后续再补参数
- 环境音: [描述每段的环境音风格]
- 对白: [描述对白整体音量、口吻、几句话]
- AI 生成阶段只出环境音;音乐和对白都是后期叠加
═══════════════════════════════════════════════════════════════
[0-N秒] 段 1 · [段标题]
[填入这段镜头细节:
- 摄影机机位
- 人物在画面里的位置、动作
- 背景细节(具体地标 + 特征)
- 节奏(XX 秒一硬切 / 持续 / 慢放)
- 转场(INSTANT HARD CUT / cross-fade / etc)]
【声音】 [该段的具体环境音清单]
**对白(约 N 秒,[情绪描述])**: "[实际中文对白]"
═══════════════════════════════════════════════════════════════
[N-M秒] 段 2 · [段标题]
[同上]
═══════════════════════════════════════════════════════════════
... 重复需要的段数 ...
═══════════════════════════════════════════════════════════════
【附录:对白回顾】(可选 · 仅供人类参考)
| 时间 | 段 | 对白 | 情绪 |
|---|---|---|---|
| ~Ns | 段 X | "[对白]" | [情绪] |
[Story]
[Fill in 200-500 chars of overall story: who the protagonist is, the visual style, the pacing, the final beat]
═══════════════════════════════════════════════════════════════
[Picture specs]
· Aspect ratio: [16:9 landscape / 9:16 portrait / 1:1 square]
· Reference quality tier: [film / show name / style label]
· Camera feel: [camera body / large format / iPhone / etc.]
· Lens: [35mm / 50mm prime / fisheye / telephoto / etc.]
· Palette: [teal-orange / saturated warm / desaturated cool / etc.]
· Lighting: [natural / practical lamps / neon / candlelight / etc.]
· Motion intensity: [micro / smooth / medium / intense / explosive]
· Advanced look (optional): [anamorphic flare / volumetric light / 35mm film grain / halation / practical lighting]
═══════════════════════════════════════════════════════════════
[Global tone]
· VIBE: [one-sentence tone summary]
· Camera position: [If locked, write: "locked-off tripod;
NO pan, NO tilt, NO zoom, NO dolly, NO handheld shake"]
· CHARACTER LOCK: [Permanent age / ethnicity-skin / face shape / hair / wardrobe / accessories / makeup; if no reference image, write a full character card]
· Audio layers:
- Background music: [no BGM / overlay local BGM in post / direct audio URL]
- BGM params (optional): local path or direct audio URL, start offset, music volume, ambience volume, fade in / out; you can send the path first and add params later
- Ambience: [per-segment ambience styles]
- Dialogue: [overall volume, tone, number of lines]
- AI generation outputs ambience only; music and dialogue are layered in post
═══════════════════════════════════════════════════════════════
[0-Nsec] Segment 1 · [title]
[Fill in shot details:
- camera position
- character position in frame, action
- background specifics (real landmark + signature feature)
- pacing (cuts every X sec / sustained / slowed)
- transition (INSTANT HARD CUT / cross-fade / etc.)]
[Sound] [Specific ambience cues for this segment]
**dialogue (~N sec, [emotion])**: "[actual line in original language]"
═══════════════════════════════════════════════════════════════
[N-Msec] Segment 2 · [title]
[same structure]
═══════════════════════════════════════════════════════════════
... repeat for as many segments as you need ...
═══════════════════════════════════════════════════════════════
[Appendix: dialogue recap] (optional · for human reference only)
| Time | Seg | Line | Emotion |
|---|---|---|---|
| ~Ns | Seg X | "[line]" | [emotion] |
§ 7 常见错误示例(带注释)
§ 7 Common mistakes (annotated)
错误 1 · 对白没 marker
Mistake 1 · Dialogue without a marker
- 这时候她轻声说:"我等了你好久。"
+ **对白(约 2 秒,温柔低语)**: "我等了你好久。"
- Then she whispered: "I've waited so long."
+ **dialogue (~2 sec, soft whisper)**: "I've waited so long."
原因:裸引号不会被识别为对白,模型不会做口型同步,音频也不会带出来。
Why: bare quotes are not recognised as dialogue. The model won't lip-sync and no audio will be emitted.
错误 2 · 在节标题里加英文歌词引号
Mistake 2 · Putting quoted English lyrics in a section title
- [8-16秒] 段 2 · "Listenin' to Blonde / fallin' for each other"
+ [8-16秒] 段 2 · 中东北非四连穿越(对应副歌 1:24-1:32)
- [8-16sec] Segment 2 · "Listenin' to Blonde / fallin' for each other"
+ [8-16sec] Segment 2 · Middle East & North Africa quadruple cut (aligns with chorus 1:24-1:32)
原因:节标题里的英文引号会被 video 模型当成"屏幕字幕指令",渲染到画面上变成花字。
Why: quoted English in a section title is treated as a "render this on-screen" instruction by the video model and ends up as decorative typography on the frame.
错误 3 · 要求 AI 生成特定 BGM
Mistake 3 · Asking the model to generate a specific licensed track
- AI 在段 5 高潮处加入 JVKE - golden hour 副歌
+ AI 生成阶段只出环境音(海浪 + 棕榈叶风);BGM 后期 DaVinci 中叠加
- AI adds the JVKE "golden hour" chorus at the climax of segment 5
+ AI stage outputs ambience only (surf + palm-frond wind); BGM is layered in DaVinci in post
原因:AI 视频模型不会版权安全地"重现"具体歌曲,且写明音乐名反而让模型 hallucinate 一段不像的音乐。
Why: video models can't reproduce a specific licensed track safely, and naming one tends to make the model hallucinate a poor imitation.
错误 4 · 混用时间码格式
Mistake 4 · Mixing timecode formats
- [0-8秒] 段 1
- Scene 2: 8-16s
- [16-24] 段 3
+ [0-8秒] 段 1
+ [8-16秒] 段 2
+ [16-24秒] 段 3
- [0-8sec] Segment 1
- Scene 2: 8-16s
- [16-24] Segment 3
+ [0-8sec] Segment 1
+ [8-16sec] Segment 2
+ [16-24sec] Segment 3
原因:Artemis 优先按方括号时间码解析,混用会让某些段被误识别为场景标记从而错误平均分配时长。
Why: Artemis parses bracketed timecodes first; mixing formats causes some segments to fall through to the scene-marker fallback and be reassigned wrong durations.
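A quick local check can catch mixed header styles before submission. This is a minimal sketch under stated assumptions: it only distinguishes three coarse styles (bracketed timecode, loose line-leading timecode, scene marker), the patterns and the names `header_styles` and `mixes_formats` are hypothetical, and it does not reproduce Artemis's actual parsing order.

```python
import re

# Shared fragment: "0-8", "0-8秒", "8-16s", "0:00-0:08" and similar.
_TC = r"\d+(?::\d{2})?\s*[-–]\s*\d+(?::\d{2})?\s*(?:秒|sec|s)?"

# Checked in this order; the first style that matches a line wins.
_STYLES = [
    ("bracket", re.compile(r"^\s*\[\s*" + _TC + r"\s*\]")),
    ("loose", re.compile(r"^\s*" + _TC + r"\s*:")),
    ("scene", re.compile(
        r"^\s*(?:(?:scene|shot|segment)\s+\d+|镜头\s*\d+|第\s*\d+\s*段|段\s*\d+)",
        re.IGNORECASE,
    )),
]


def header_styles(brief: str) -> set[str]:
    """Collect the header styles used by lines that look like segment headers."""
    styles = set()
    for line in brief.splitlines():
        for name, pattern in _STYLES:
            if pattern.match(line):
                styles.add(name)
                break
    return styles


def mixes_formats(brief: str) -> bool:
    """True when more than one header style appears in the same brief."""
    return len(header_styles(brief)) > 1
```

Run against the incorrect example above, `header_styles` reports both `bracket` and `scene`, which is the combination the explanation warns about.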
错误 5 · 把"参考照片背景"写进剧本
Mistake 5 · Carrying reference-photo backgrounds into the brief
- 我上传的参考照片里有一辆黄色座椅的车,把它放进每段背景
+ [删掉这条] 角色参考照只用于身份提取(脸 / 服装 / 配饰),
+ 不要把照片里的偶发背景元素带进视频
- The reference photo I uploaded has a car with yellow seats; put it in every background
+ [delete this line] Reference photos are for identity only (face / wardrobe / accessories);
+ do NOT pull incidental background elements from the photo into the video
原因:Artemis 已经自动过滤"参考图中"开头的道具,但不要自己手动写出来诱导。
Why: Artemis already filters props tagged "from the reference image" automatically — don't manually write them in.
错误 6 · 单段超过当前模型上限
Mistake 6 · Single segment exceeds the current model's cap
- [0-20秒] 段 1(20 秒长镜头;若当前模型上限为 15 秒会被截断)
+ [0-15秒] 段 1(前 15 秒)
+ [15-20秒] 段 2(紧接上文:同机位、同场景、主角延续上一段结尾动作)
- [0-20sec] Segment 1 (20-sec long take; will be truncated if the model cap is 15 sec)
+ [0-15sec] Segment 1 (first 15 sec)
+ [15-20sec] Segment 2 (continues from above: same camera, same scene, protagonist resumes the previous ending action)
原因:Artemis 会读取当前视频模型允许的单段最大秒数(工作流会提示具体数值),超过会被截断。所以同一长镜头建议主动拆成 6-10 秒小段。拆分时后一段必须写清"紧接上文"的机位、姿态和动作方向,否则模型可能换机位或重置动作。
Why: Artemis reads the current video model's per-segment cap (shown in the workflow); anything longer is truncated. Split long takes into 6-10 sec segments yourself; each follow-up segment must restate the continued camera, pose, and motion direction, or the model may reset.
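Splitting a long take by hand is simple arithmetic, sketched here in Python. The functions `split_take` and `render_headers` are hypothetical helpers, not part of Artemis, and the default cap of 10 seconds is only the guideline figure quoted above; use whatever cap the workflow actually shows. The sketch divides the take into near-equal chunks rather than filling the cap first, which avoids a very short final segment.

```python
def split_take(start: int, end: int, max_len: int = 10) -> list[tuple[int, int]]:
    """Split [start, end) into near-equal chunks of at most max_len seconds."""
    if end <= start or max_len <= 0:
        raise ValueError("need end > start and max_len > 0")
    total = end - start
    count = -(-total // max_len)          # ceiling division
    base, extra = divmod(total, count)    # the first `extra` chunks get one more second
    spans = []
    cursor = start
    for index in range(count):
        length = base + (1 if index < extra else 0)
        spans.append((cursor, cursor + length))
        cursor += length
    return spans


def render_headers(spans: list[tuple[int, int]], first_index: int = 1) -> list[str]:
    """Format spans as bracketed segment headers."""
    return [
        f"[{begin}-{finish}sec] Segment {number}"
        for number, (begin, finish) in enumerate(spans, start=first_index)
    ]
```

The headers only fix the timing. Each continuation segment still needs its own "continues from above" line restating camera, pose, and motion direction.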
错误 7 · 要求过于抽象
Mistake 7 · Asking for something too abstract
- 拍出"自由"的感觉
+ 镜头跟随主角向画面右上 45° 仰角奔跑,长发被风吹起,
+ 背景从城市灰色渐变为开阔金色麦田,光斑闪烁
- Capture the feeling of "freedom"
+ Camera tracks the protagonist running upward to frame-right at a 45° low angle, long hair lifted by the wind,
+ background fades from grey cityscape to an open golden wheat field with shimmering bokeh
原因:Artemis 把 storyBeat 原文丢给 video 模型,越具体的视觉描述出片越准。
Why: Artemis passes the storyBeat to the video model verbatim — the more concrete the visual description, the more accurate the output.
§ 8 自查 Checklist
§ 8 Self-check checklist
写完 brief 后逐条对照:
Run through every item after finishing your brief:
- 每段有时间码(`[X-Y秒]` 或 `[MM:SS-MM:SS]` 或 `Scene N`)
- 已明确画幅比例 / ratio;横屏、竖屏、方屏的构图和运动方式没有互相打架
- 所有对白都有 `**对白**:` / `对白:` / `dialogue:` 等 marker,且字数与时间码匹配
- CHARACTER LOCK 段已写明主角永久外观;无参考图时已写成完整角色卡
- 如果要锁机位,brief 至少出现一次「锁死三脚架」或「locked-off tripod」
- 如果不要 AI 生成音乐,brief 至少出现一次「只出环境音」或「后期叠加」
- 没有把英文歌词引号写在段标题里
- 没有写 Artemis 自动会加的"防塑料感、防多手指"等套话
- 单段时长不超过当前模型上限(工作流会提示);不确定时按 6-10 秒拆段最稳
- 如果把同一长镜头拆成多段,后一段开头写明紧接上文的机位、姿态、动作方向
- 没有设计复杂物理交互、精细手部特写、近身格斗、递接小物件、吃东西等高翻车动作
- 选了正确的字幕模式:默认选「自动」;明确要烧录字幕才选「带字幕」;完全不要屏幕文字才选「无字幕」
- Every segment has a timecode (`[X-Ysec]` or `[MM:SS-MM:SS]` or `Scene N`)
- Aspect ratio is stated; composition and motion match the chosen ratio (landscape / portrait / square)
- Every spoken line uses an explicit marker (`**dialogue**:` / `dialogue:` / `对白:`) and its length matches the timecode
- CHARACTER LOCK describes the protagonist's permanent appearance; without a reference image, a full character card is written out
- If the camera should be locked, the brief contains at least one "locked-off tripod" or 锁死三脚架 keyword
- If AI music must be suppressed, the brief contains at least one "ambience only" or "added in post" phrase
- No quoted English lyrics in section titles
- No quality-prompt boilerplate Artemis adds automatically (anti-plastic, anti-extra-fingers, etc.)
- Per-segment duration stays under the current model's cap (shown in the workflow); when in doubt, 6-10 sec is safest
- If a long take is split across segments, the next segment restates the continued camera, pose, and motion direction
- No complex interactions, fine hand close-ups, hand-to-hand combat, object hand-offs, or eating shots
- Subtitle mode is correct: default to Auto; choose "With subtitles" only when subtitles must burn in; choose "No subtitles" when you want zero on-screen text
§ 9 高级用法
§ 9 Advanced usage
9.1 多角色身份锁
9.1 Multi-character identity locks
Brief 里直接列:
List them directly in the brief:
· CHARACTER LOCK:
A(女主 1): 黑长直、白皮、红裙
B(女主 2): 金发短发、橄榄肤、黑色皮夹克
C(男主): 棕色卷发、络腮胡、亚麻色衬衫
· CHARACTER LOCK:
A (female lead 1): long straight black hair, fair skin, red dress
B (female lead 2): short blonde hair, olive skin, black leather jacket
C (male lead): curly brown hair, full beard, linen shirt
每段引用时用 A / B / C 即可:
Reference them as A / B / C in every segment:
[0-8秒] 段 1
A 走进画面,B 从右边出现拥抱她。C 在远处看。
[0-8sec] Segment 1
A walks into frame, B appears from the right and hugs her. C watches from afar.
9.2 子弹时间(局部慢放)
9.2 Bullet time (localised slow motion)
· 11–11.4 秒(0.4 秒丝滑慢放 · 子弹时间):
时间丝滑降至 25%,她的步态、长发、衣摆都在缓慢流动。
NOT a freeze, motion smoothly slowed only.
· 11–11.4 sec (0.4-sec silky slow motion · bullet time):
time slows smoothly to 25%; her stride, long hair, and hemline still flow.
NOT a freeze frame, motion smoothly slowed only.
9.3 多语言对白混用
9.3 Mixed-language dialogue
**对白(中文,温柔)**: "我等你很久了。"
**对白(French, intimate)**: "Je t'ai attendu si longtemps."
Artemis 自动检测每句语言,分别送 lip-sync 引擎。
Artemis detects the language of each line and routes them to lip-sync separately.
9.4 非语言发声 / 歌唱意图
9.4 Non-verbal vocalisations / singing intent
【声音】
- 非语言发声:她短促叹气,随后压低声音啜泣。
- 歌唱/哼唱意图:她用很轻的气声哼唱旋律;不要当成普通对白节奏。
[Sound]
- Non-verbal: she sighs briefly, then suppresses a soft sob.
- Singing / humming: she hums a melody in a light breathy voice; do not treat it as ordinary dialogue rhythm.
这类内容建议写在【声音】里,不要冒充对白 marker。唱歌/哼唱对不同视频模型和 lip-sync 能力要求更高,必须明确写成声音意图,而不是默认当普通说话处理。
Put this kind of content under the [Sound] block, not under a dialogue marker. Singing / humming pushes lip-sync harder across video models and must be flagged as a sound intent, not treated as normal speech by default.
9.5 INSTANT HARD CUT 显式标注
9.5 Mark INSTANT HARD CUT explicitly
· 2 秒(INSTANT HARD CUT,背景切到东京)
· 2 sec (INSTANT HARD CUT, background switches to Tokyo)
帮 Artemis 内部规划帧 chain 时知道这里是硬切而不是渐变。
Helps Artemis's internal frame-chain planner know the boundary is a hard cut rather than a fade.
9.6 主体模式 / 身份来源显式备注
9.6 Declare subject mode and identity source up front
主体模式:纯视觉 / 无主角。
主体模式:有主角。身份来源:三视图参考图;不要继承参考图背景。
Subject mode: pure visual / no protagonist.
Subject mode: has protagonist. Identity source: three-view turnaround sheet; do not inherit the reference image's background.
当前 Saga 会在工作流里询问「有主角 / 纯视觉」以及身份来源;brief 里提前写清楚可以减少澄清轮次。
Saga's workflow currently asks "has protagonist / pure visual" and the identity source; stating both in the brief up front reduces clarification rounds.
9.7 同场景续写 / 长镜头拆分
9.7 Continuing the same scene / splitting a long take
[0-8秒] 段 1 · 走廊奔跑前半段
机位锁死在走廊尽头,主角从画面左侧向右前方奔跑,右手扶过墙面。
[8-16秒] 段 2 · 走廊奔跑后半段
紧接上文:机位保持锁死,走廊、光线、服装完全一致;主角延续上一段结尾的向右前方奔跑姿态,右手刚离开墙面。
[0-8sec] Segment 1 · Hallway sprint, first half
Camera locked at the end of the hallway; protagonist sprints from frame-left toward the front right, right hand brushing the wall.
[8-16sec] Segment 2 · Hallway sprint, second half
Continues from above: camera stays locked; hallway, lighting, wardrobe identical; protagonist continues the front-right sprint in the pose that ended the previous segment, right hand having just left the wall.
如果只是把长镜头按时间切开,后一段容易换机位或重置动作。续写段开头必须明确承接上一段的机位、场景、姿态、动作方向。
If you simply slice a long take by time, the next segment tends to switch cameras or reset the pose. Continuation segments must restate the inherited camera, scene, pose, and motion direction up front.
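Before handing over a split long take, it can help to confirm that the timecodes actually join up. The sketch below is a hypothetical pre-flight check, not part of Artemis: `TIMECODE` and `find_gaps` are illustrative names, and only the `[X-Y秒]` / `[X-Ysec]` form is handled (not `[MM:SS-MM:SS]`).

```python
import re

# Matches segment headers such as [0-8秒] or [8-16sec].
# Anchor headers like [锚点·... | 50-120秒] do not match, because the
# bracket must be followed directly by digits.
TIMECODE = re.compile(r"\[(\d+)-(\d+)\s*(?:秒|sec)\]")


def find_gaps(brief: str) -> list[tuple[tuple[int, int], tuple[int, int]]]:
    """Return neighbouring segment pairs whose end and start times differ."""
    spans = [(int(start), int(end)) for start, end in TIMECODE.findall(brief)]
    return [
        (prev, cur)
        for prev, cur in zip(spans, spans[1:])
        if prev[1] != cur[0]
    ]
```

An empty result only shows that the time ranges are contiguous; the continuation segment still has to restate camera, scene, pose, and motion direction in its own text.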
9.8 无参考图时的纯文本角色卡
9.8 Text-only character card (no reference image)
主体模式:有主角。身份来源:纯文字。
· CHARACTER LOCK:
25 岁亚洲男性,冷白肤色,窄长脸,下颌线分明;
黑色寸头,左眼角一颗小泪痣;
始终穿黑色高领毛衣、银色细项链、深灰长风衣;
表情克制,眼神疲惫但清醒。
Subject mode: has protagonist. Identity source: text only.
· CHARACTER LOCK:
25-year-old Asian male, cool-toned fair skin, long narrow face, sharp jawline;
black crew cut, a small tear mole at the corner of the left eye;
always in a black turtleneck, a slim silver chain, and a long charcoal trench coat;
restrained expression, tired but clear-eyed gaze.
不要只写“帅哥 / 美女 / 老人”。纯文本驱动时,角色卡越具体,跨段一致性越稳。
Avoid generic labels like "handsome guy / pretty girl / old man". With text-only drive, the more specific the character card, the more stable the cross-segment identity.
9.9 时空锚点 / WORLD ANCHOR(长视频)
9.9 World anchor (for long videos)
【时空锚点 / WORLD ANCHOR】
[锚点·黄昏大雨 | 50-120秒]
从第 50 秒到第 120 秒,所有相关段落都继承:
- 地面湿滑,雨水反光
- 角色头发贴在额头,外套边缘被雨打湿
- 远处车灯在雨幕中散开
- 整体光线保持冷蓝 + 暖车灯反差
[World anchor]
[Anchor · Dusk in heavy rain | 50-120 sec]
From sec 50 to sec 120, every related segment inherits:
- wet, reflective ground
- character's hair clings to the forehead, jacket edge soaked
- distant car headlights diffuse through the rain
- overall lighting stays cool blue with warm headlight contrast
用于超长视频里的长期环境状态提醒。把它当作写剧本 AI 的全局记忆锚,不要依赖单个段落临时重复。
Use this for long-running environmental state in very long videos. Treat it as a global memory anchor for the brief-writing AI instead of relying on ad-hoc repetition inside individual segments.
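One way to reason about which segments an anchor covers is a plain interval-overlap test. This is a sketch under the assumption that any segment sharing screen time with the anchor range counts as "related"; how Artemis itself resolves this is not stated here, and `inherits_anchor` is a hypothetical helper.

```python
def inherits_anchor(anchor: tuple[int, int], segment: tuple[int, int]) -> bool:
    """True if the segment shares any screen time with the anchor range.

    Both arguments are (start_sec, end_sec) pairs. A segment that merely
    touches the anchor boundary (e.g. starts exactly at its end) is excluded.
    """
    anchor_start, anchor_end = anchor
    seg_start, seg_end = segment
    return seg_start < anchor_end and anchor_start < seg_end


# Usage: the 50-120 sec rain anchor from the example above.
rain = (50, 120)
segments = [(40, 48), (48, 56), (112, 120), (120, 128)]
covered = [seg for seg in segments if inherits_anchor(rain, seg)]
```

With these sample segments, `covered` holds the two middle segments: (48, 56) starts before the rain but overlaps it, while (120, 128) begins exactly when the anchor ends.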
9.10 原始质感 / 少滤镜模式(cleanDirect)
9.10 Raw look / low-filter mode (cleanDirect)
请用原始质感 / 少滤镜 / raw-seedance / clean-direct。
保留自然纹理,不要过度导演包装。
Use raw look / low filter / raw-seedance / clean-direct.
Keep natural texture; do not over-package with director-grade dressing.
这些显式关键词启用 cleanDirect:去掉美学修饰层,换取更干净自然的模型原始质感。
These explicit keywords enable cleanDirect: aesthetic dressing is removed in exchange for a cleaner, more natural raw look from the model.
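As an illustration of what "explicit keywords" means in practice, the sketch below checks a brief for the trigger phrases listed in the example. The keyword list is copied from that example; the matching logic and the name `requests_clean_direct` are assumptions, since the guide does not describe how Artemis matches them.

```python
# Trigger phrases taken from the example above. How Artemis actually
# matches them is an assumption made for this sketch.
CLEAN_DIRECT_KEYWORDS = (
    "原始质感",
    "少滤镜",
    "raw look",
    "low filter",
    "raw-seedance",
    "clean-direct",
)


def requests_clean_direct(brief: str) -> bool:
    """True if the brief contains any cleanDirect trigger phrase."""
    lowered = brief.lower()
    return any(keyword in lowered for keyword in CLEAN_DIRECT_KEYWORDS)
```

Writing one of the listed phrases verbatim is the safest option; paraphrases such as "natural style" would not be caught by a literal match like this one.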
§ 10 给写剧本 AI 的最终提醒
§ 10 Final note to the AI writing the brief
如果你(AI)是被用户拿来代写 Artemis brief 的:
If you (the AI) are being used to ghost-write an Artemis brief:
- 你的产出是一份 结构化 Markdown 剧本,不是已经渲染好的 video prompt
- 每个 [X-Y秒] 段通常会被 Artemis 作为独立的 video 模型调用,再由 Saga 拼接;所以每段要自包含关键信息
- 对白必须带 marker(**对白**:),不带就被忽略
- 不要试图"帮"用户加导演术语堆砌,Artemis 自动加更专业的
- 40-60 秒视频通常不需要堆到 8000 字;更推荐每 6-10 秒写 120-250 个中文字的有效镜头信息,复杂段落可到约 420 字。长片可以扩展到多千字,但每段必须清晰自包含,避免无效画质词占用提示空间
- 输出剧本前自查一遍 §8 的 checklist
- Your output is a structured Markdown brief, NOT a ready-to-render video prompt.
- Each [X-Ysec] segment is usually a separate video-model call, then stitched by Saga, so every segment must be self-contained with its critical info.
- Dialogue must have a marker (**dialogue**:); unmarked quotes are ignored.
- Do NOT pad with director jargon; Artemis adds better, more professional versions.
- A 40-60 sec video rarely needs 8000 characters; aim for 120-250 effective Chinese chars (or ~80-160 English words) per 6-10 sec segment, up to ~420 for complex shots. Long-form can run into thousands of chars, but every segment must stay self-contained without quality-prompt filler.
- Run the §8 checklist before handing the brief over.
写完后告诉用户:"这份剧本可直接发给 Artemis /saga,预期分段数:N,预计渲染时长:X 分钟。"
When done, tell the user: "This brief is ready for Artemis /saga; expected segments: N; estimated render time: X minutes."
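The segment count N and the per-segment length guidance above can be sanity-checked with a few lines of arithmetic. This is a rough sketch: `expected_segments` and `length_verdict` are hypothetical helpers, the 8-second default is only an example within the 6-10 sec range, and the thresholds are the 120 / 250 / 420 character figures quoted in the list. Render time is not estimated because it depends on the video model.

```python
import math


def expected_segments(total_sec: float, max_segment_sec: float = 8) -> int:
    """Smallest number of segments that keeps each one within the cap."""
    return math.ceil(total_sec / max_segment_sec)


def length_verdict(segment_text: str) -> str:
    """Classify a segment by its non-whitespace character count."""
    count = len("".join(segment_text.split()))
    if count < 120:
        return "thin"           # likely missing shot information
    if count <= 250:
        return "ok"             # recommended range
    if count <= 420:
        return "ok-if-complex"  # acceptable for complex shots only
    return "too-long"
```

For a 60-second video with an 8-second cap this gives 8 segments, the last one shorter than the rest. The character count suits Chinese briefs; for English text a word count against the ~80-160 word guideline would be the closer equivalent.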