2026-05-19-00-28-03 Remove local voice-over tool backup

2026-05-19 00:32:40 +08:00
parent 33e804655c
commit 31b6bcac37
14 changed files with 98 additions and 900 deletions
--- a/.dockerignore
+++ b/.dockerignore
@@ -11,6 +11,8 @@ packaging_build
 storage/jobs
 storage/uploads
 待配音视频
+Tools_scripts_XunFei-Ubuntu
+配音生成工作流-Ubuntu-Agent.md
 storage/**/*.raw.mp4
 *.tar.gz
 *.zip
--- a/.gitignore
+++ b/.gitignore
@@ -11,6 +11,8 @@ storage/jobs/
 storage/uploads/
 storage/demos/
 待配音视频/
+Tools_scripts_XunFei-Ubuntu/
+配音生成工作流-Ubuntu-Agent.md
 packaging_build/
 release_packages/
 *.log
--- a/Tools_scripts_XunFei-Ubuntu/README.md
+++ b/Tools_scripts_XunFei-Ubuntu/README.md
@@ -1,38 +0,0 @@
-# Tools_scripts_XunFei-Ubuntu
-
-Ubuntu version of the voice-over tools, using Bash + Python + ffmpeg in place of PowerShell.
-
-## Install
-
-```bash
-sudo apt update
-sudo apt install -y python3 python3-pip ffmpeg
-python3 -m pip install -r Tools_scripts_XunFei-Ubuntu/requirements-ubuntu.txt
-```
-
-## Environment
-
-```bash
-export XF_APPID="your_app_id"
-export XF_APIKEY="your_api_key"
-export XF_APISECRET="your_api_secret"
-```
-
-## Generate Voice
-
-```bash
-./Tools_scripts_XunFei-Ubuntu/synthesize_xfyun_super_tts.sh \
-  --script 配音稿.md \
-  --output-dir 02_audio/super_tts \
-  --voice x5_lingfeiyi_flow \
-  --speed 50
-```
-
-## Build Final Video
-
-```bash
-python3 Tools_scripts_XunFei-Ubuntu/build_final_video_ubuntu.py \
-  --video input.mp4 \
-  --audio-dir 02_audio/super_tts \
-  --output 05_outputs/final_voiceover.mp4
-```
--- a/Tools_scripts_XunFei-Ubuntu/build_final_video_ubuntu.py
+++ b/Tools_scripts_XunFei-Ubuntu/build_final_video_ubuntu.py
@@ -1,232 +0,0 @@
-#!/usr/bin/env python3
-"""Build a final voice-over video on Ubuntu with ffmpeg."""
-
-from __future__ import annotations
-
-import argparse
-import shutil
-import subprocess
-from pathlib import Path
-
-
-AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".aac", ".flac", ".ogg"}
-
-
-def run(cmd: list[str]) -> None:
-    print("+ " + " ".join(cmd))
-    subprocess.run(cmd, check=True)
-
-
-def require_tool(name: str) -> str:
-    path = shutil.which(name)
-    if not path:
-        raise SystemExit(f"{name} is required. Install it with: sudo apt install -y ffmpeg")
-    return path
-
-
-def media_duration(path: Path) -> float:
-    result = subprocess.check_output(
-        [
-            "ffprobe",
-            "-v",
-            "error",
-            "-show_entries",
-            "format=duration",
-            "-of",
-            "default=nw=1:nk=1",
-            str(path),
-        ],
-        text=True,
-    ).strip()
-    return float(result)
-
-
-def audio_files(audio_dir: Path) -> list[Path]:
-    files = [
-        path
-        for path in sorted(audio_dir.iterdir())
-        if path.is_file() and path.suffix.lower() in AUDIO_EXTS
-    ]
-    if not files:
-        raise FileNotFoundError(f"No audio files found in {audio_dir}")
-    return files
-
-
-def concat_audio_dir(audio_dir: Path, work_dir: Path, silence: float) -> Path:
-    work_dir.mkdir(parents=True, exist_ok=True)
-    normalized: list[Path] = []
-    silence_path = work_dir / "silence.wav"
-    run(
-        [
-            "ffmpeg",
-            "-hide_banner",
-            "-loglevel",
-            "error",
-            "-y",
-            "-f",
-            "lavfi",
-            "-t",
-            f"{silence:.3f}",
-            "-i",
-            "anullsrc=channel_layout=stereo:sample_rate=48000",
-            "-c:a",
-            "pcm_s16le",
-            str(silence_path),
-        ]
-    )
-
-    for index, src in enumerate(audio_files(audio_dir), start=1):
-        dst = work_dir / f"audio_{index:02d}.wav"
-        run(
-            [
-                "ffmpeg",
-                "-hide_banner",
-                "-loglevel",
-                "error",
-                "-y",
-                "-i",
-                str(src),
-                "-vn",
-                "-ar",
-                "48000",
-                "-ac",
-                "2",
-                "-c:a",
-                "pcm_s16le",
-                str(dst),
-            ]
-        )
-        normalized.append(dst)
-
-    concat_items: list[Path] = []
-    for index, item in enumerate(normalized):
-        concat_items.append(item)
-        if index != len(normalized) - 1 and silence > 0:
-            concat_items.append(silence_path)
-
-    list_path = work_dir / "audio_concat.txt"
-    with list_path.open("w", encoding="utf-8") as handle:
-        for item in concat_items:
-            escaped = item.resolve().as_posix().replace("'", "'\\''")
-            handle.write(f"file '{escaped}'\n")
-
-    out_audio = work_dir / "combined_voice.wav"
-    run(
-        [
-            "ffmpeg",
-            "-hide_banner",
-            "-loglevel",
-            "error",
-            "-y",
-            "-f",
-            "concat",
-            "-safe",
-            "0",
-            "-i",
-            str(list_path),
-            "-c:a",
-            "pcm_s16le",
-            str(out_audio),
-        ]
-    )
-    return out_audio
-
-
-def parse_args() -> argparse.Namespace:
-    parser = argparse.ArgumentParser(description="Combine one video with voice-over audio.")
-    parser.add_argument("--video", type=Path, required=True, help="Source video path.")
-    parser.add_argument("--audio", type=Path, default=None, help="Single voice-over audio file.")
-    parser.add_argument("--audio-dir", type=Path, default=None, help="Directory of ordered audio files.")
-    parser.add_argument("--output", type=Path, default=Path("05_outputs/final_voiceover.mp4"))
-    parser.add_argument("--work-dir", type=Path, default=Path("04_intermediate/ubuntu_voiceover"))
-    parser.add_argument("--silence", type=float, default=0.35, help="Gap seconds between audio files.")
-    parser.add_argument("--width", type=int, default=1920)
-    parser.add_argument("--height", type=int, default=1080)
-    parser.add_argument("--fps", type=int, default=30)
-    parser.add_argument("--crf", type=int, default=20)
-    parser.add_argument("--preset", default="medium")
-    parser.add_argument("--video-speed", type=float, default=None, help="Override automatic speed.")
-    return parser.parse_args()
-
-
-def main() -> int:
-    args = parse_args()
-    require_tool("ffmpeg")
-    require_tool("ffprobe")
-
-    if not args.video.exists():
-        raise FileNotFoundError(args.video)
-    if bool(args.audio) == bool(args.audio_dir):
-        raise SystemExit("Use exactly one of --audio or --audio-dir.")
-
-    args.work_dir.mkdir(parents=True, exist_ok=True)
-    args.output.parent.mkdir(parents=True, exist_ok=True)
-
-    audio_path = args.audio if args.audio else concat_audio_dir(args.audio_dir, args.work_dir, args.silence)
-    if not audio_path or not audio_path.exists():
-        raise FileNotFoundError(audio_path)
-
-    video_duration = media_duration(args.video)
-    audio_duration = media_duration(audio_path)
-    if video_duration <= 0 or audio_duration <= 0:
-        raise RuntimeError("Invalid media duration.")
-
-    speed = args.video_speed if args.video_speed else video_duration / audio_duration
-    if speed <= 0:
-        raise ValueError("--video-speed must be greater than 0.")
-
-    print(f"video_duration={video_duration:.3f}s")
-    print(f"audio_duration={audio_duration:.3f}s")
-    print(f"video_speed={speed:.6f}x")
-
-    vf = (
-        f"[0:v]setpts=PTS/{speed:.8f},fps={args.fps},"
-        f"scale={args.width}:{args.height}:force_original_aspect_ratio=decrease,"
-        f"pad={args.width}:{args.height}:(ow-iw)/2:(oh-ih)/2:black,"
-        "setsar=1,format=yuv420p[v];"
-        "[1:a]aresample=48000,apad[a]"
-    )
-    run(
-        [
-            "ffmpeg",
-            "-hide_banner",
-            "-y",
-            "-i",
-            str(args.video),
-            "-i",
-            str(audio_path),
-            "-filter_complex",
-            vf,
-            "-map",
-            "[v]",
-            "-map",
-            "[a]",
-            "-t",
-            f"{audio_duration:.3f}",
-            "-c:v",
-            "libx264",
-            "-preset",
-            args.preset,
-            "-crf",
-            str(args.crf),
-            "-c:a",
-            "aac",
-            "-b:a",
-            "192k",
-            "-ar",
-            "48000",
-            "-ac",
-            "2",
-            "-movflags",
-            "+faststart",
-            str(args.output),
-        ]
-    )
-    final_duration = media_duration(args.output)
-    print(f"output={args.output}")
-    print(f"final_duration={final_duration:.3f}s")
-    return 0
-
-
-if __name__ == "__main__":
-    raise SystemExit(main())
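
Review note on the removed `build_final_video_ubuntu.py`: when `--video-speed` is not given, the script derives the speed as `video_duration / audio_duration` and applies it through `setpts=PTS/{speed}`, so the retimed picture should last about as long as the narration. The arithmetic can be sketched as below; `auto_video_speed` and `retimed_duration` are hypothetical helper names used only for illustration, not functions from the script.

```python
def auto_video_speed(video_duration: float, audio_duration: float) -> float:
    """Speed factor that fits the video to the narration length."""
    if video_duration <= 0 or audio_duration <= 0:
        raise ValueError("durations must be positive")
    return video_duration / audio_duration


def retimed_duration(video_duration: float, speed: float) -> float:
    """Duration after setpts=PTS/speed: every timestamp is divided by speed."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return video_duration / speed


if __name__ == "__main__":
    # Hypothetical durations: a 120 s recording and a 96 s narration.
    speed = auto_video_speed(120.0, 96.0)
    print(f"speed={speed:.3f}x retimed={retimed_duration(120.0, speed):.1f}s")
```

A factor above 1 speeds the picture up (narration shorter than the video); a factor below 1 slows it down (narration longer than the video).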
--- a/Tools_scripts_XunFei-Ubuntu/check_audio_duration.sh
+++ b/Tools_scripts_XunFei-Ubuntu/check_audio_duration.sh
@@ -1,21 +0,0 @@
-#!/usr/bin/env bash
-set -euo pipefail
-
-TARGET="${1:-02_audio}"
-
-if ! command -v ffprobe >/dev/null 2>&1; then
-  echo "ffprobe is required. Install it with: sudo apt install -y ffmpeg" >&2
-  exit 1
-fi
-
-if [[ ! -e "$TARGET" ]]; then
-  echo "Path not found: $TARGET" >&2
-  exit 1
-fi
-
-find "$TARGET" -type f \( -iname '*.mp3' -o -iname '*.wav' -o -iname '*.m4a' \) -print0 |
-  sort -z |
-  while IFS= read -r -d '' file; do
-    duration="$(ffprobe -v error -show_entries format=duration -of default=nw=1:nk=1 "$file")"
-    printf '%8.3fs  %s\n' "$duration" "$file"
-  done
--- a/Tools_scripts_XunFei-Ubuntu/requirements-ubuntu.txt
+++ b/Tools_scripts_XunFei-Ubuntu/requirements-ubuntu.txt
@@ -1 +0,0 @@
-websocket-client>=1.8.0
--- a/Tools_scripts_XunFei-Ubuntu/synthesize_xfyun_super_tts.sh
+++ b/Tools_scripts_XunFei-Ubuntu/synthesize_xfyun_super_tts.sh
@@ -1,5 +0,0 @@
-#!/usr/bin/env bash
-set -euo pipefail
-
-SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
-python3 "$SCRIPT_DIR/xfyun_tts_ubuntu.py" --mode super "$@"
--- a/Tools_scripts_XunFei-Ubuntu/synthesize_xfyun_tts.sh
+++ b/Tools_scripts_XunFei-Ubuntu/synthesize_xfyun_tts.sh
@@ -1,5 +0,0 @@
-#!/usr/bin/env bash
-set -euo pipefail
-
-SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
-python3 "$SCRIPT_DIR/xfyun_tts_ubuntu.py" --mode normal "$@"
--- a/Tools_scripts_XunFei-Ubuntu/xfyun_tts_ubuntu.py
+++ b/Tools_scripts_XunFei-Ubuntu/xfyun_tts_ubuntu.py
@@ -1,356 +0,0 @@
-#!/usr/bin/env python3
-"""Generate XFYUN TTS voice files on Ubuntu.
-
-This script supports both the normal XFYUN online TTS endpoint and the
-super-realistic TTS endpoint used by the PowerShell workflow.
-"""
-
-from __future__ import annotations
-
-import argparse
-import base64
-import email.utils
-import hashlib
-import hmac
-import json
-import os
-import re
-import sys
-from dataclasses import dataclass
-from pathlib import Path
-from typing import Any
-from urllib.parse import quote, urlparse
-
-
-NORMAL_TTS_URL = "wss://tts-api.xfyun.cn/v2/tts"
-SUPER_TTS_URL = "wss://cbm01.cn-huabei-1.xf-yun.com/v1/private/mcd9m97e6"
-
-
-@dataclass(frozen=True)
-class ScriptSegment:
-    index: str
-    title: str
-    text: str
-
-
-def safe_filename(value: str) -> str:
-    cleaned = re.sub(r'[\\/:*?"<>|]', "", value).strip()
-    return cleaned or "segment"
-
-
-def find_default_script(cwd: Path) -> Path:
-    candidates = sorted(cwd.glob("*.md"))
-    preferred = [
-        path
-        for path in candidates
-        if path.name.startswith("配音稿") or path.name.lower().startswith("voice")
-    ]
-    fallback = [
-        path
-        for path in candidates
-        if not path.name.startswith("配音生成工作流") and path.name.lower() != "readme.md"
-    ]
-    selected = (preferred or fallback or candidates)
-    if not selected:
-        raise FileNotFoundError("Cannot find script Markdown file. Use --script to specify one.")
-    return selected[0]
-
-
-def load_segments(script_path: Path) -> list[ScriptSegment]:
-    content = script_path.read_text(encoding="utf-8-sig")
-    pattern = re.compile(
-        r"(?ms)^##\s+([1-9])\.\s+(.+?)\r?\n(.*?)(?=^##\s+[1-9]\.\s+|\Z)"
-    )
-    matches = pattern.findall(content)
-    if not matches:
-        raise ValueError("Cannot find sections like '## 1. title' in script Markdown.")
-
-    segments: list[ScriptSegment] = []
-    metadata = re.compile(r"^(说明|时长|备注|镜头|画面|音色|语速|输出|提示)[:：]")
-    for index, title, body in matches:
-        lines = []
-        for raw_line in body.splitlines():
-            line = raw_line.strip()
-            if not line or line.startswith("#") or metadata.match(line):
-                continue
-            lines.append(line)
-        text = "\n".join(lines).replace("\t", " ").strip()
-        if not text:
-            raise ValueError(f"Section {index} has no readable text.")
-        segments.append(ScriptSegment(index=index, title=title.strip(), text=text))
-    return segments
-
-
-def build_auth_url(request_url: str, api_key: str, api_secret: str) -> str:
-    uri = urlparse(request_url)
-    host_name = uri.hostname or ""
-    path = uri.path or "/"
-    date = email.utils.formatdate(usegmt=True)
-    signature_origin = f"host: {host_name}\ndate: {date}\nGET {path} HTTP/1.1"
-    digest = hmac.new(
-        api_secret.encode("utf-8"),
-        signature_origin.encode("utf-8"),
-        hashlib.sha256,
-    ).digest()
-    signature = base64.b64encode(digest).decode("ascii")
-    authorization_origin = (
-        f'api_key="{api_key}", algorithm="hmac-sha256", '
-        f'headers="host date request-line", signature="{signature}"'
-    )
-    authorization = base64.b64encode(authorization_origin.encode("utf-8")).decode("ascii")
-    return (
-        f"{request_url}?authorization={quote(authorization)}"
-        f"&date={quote(date)}&host={quote(host_name)}"
-    )
-
-
-def require_websocket():
-    try:
-        import websocket  # type: ignore
-    except ImportError as exc:
-        raise SystemExit(
-            "Missing dependency: websocket-client. Install it with:\n"
-            "  python3 -m pip install -r Tools_scripts_XunFei-Ubuntu/requirements-ubuntu.txt"
-        ) from exc
-    return websocket
-
-
-def recv_json(socket: Any) -> dict[str, Any]:
-    message = socket.recv()
-    if isinstance(message, bytes):
-        message = message.decode("utf-8")
-    return json.loads(message)
-
-
-def synthesize_normal(
-    *,
-    text: str,
-    out_file: Path,
-    app_id: str,
-    api_key: str,
-    api_secret: str,
-    voice: str,
-    speed: int,
-    volume: int,
-    pitch: int,
-) -> None:
-    websocket = require_websocket()
-    url = build_auth_url(NORMAL_TTS_URL, api_key, api_secret)
-    socket = websocket.create_connection(url, timeout=30)
-    audio = bytearray()
-    try:
-        payload = {
-            "common": {"app_id": app_id},
-            "business": {
-                "aue": "lame",
-                "sfl": 1,
-                "auf": "audio/L16;rate=16000",
-                "vcn": voice,
-                "speed": speed,
-                "volume": volume,
-                "pitch": pitch,
-                "bgs": 0,
-                "tte": "UTF8",
-                "reg": "2",
-                "rdn": "0",
-            },
-            "data": {
-                "status": 2,
-                "text": base64.b64encode(text.encode("utf-8")).decode("ascii"),
-            },
-        }
-        socket.send(json.dumps(payload, ensure_ascii=False, separators=(",", ":")))
-        while True:
-            response = recv_json(socket)
-            if response.get("code", 0) != 0:
-                raise RuntimeError(
-                    f"XFYUN normal TTS failed: code={response.get('code')}, "
-                    f"message={response.get('message')}"
-                )
-            data = response.get("data") or {}
-            if data.get("audio"):
-                audio.extend(base64.b64decode(data["audio"]))
-            if data.get("status") == 2:
-                break
-    finally:
-        socket.close()
-
-    if not audio:
-        raise RuntimeError("No audio data returned by XFYUN normal TTS.")
-    out_file.write_bytes(audio)
-
-
-def synthesize_super(
-    *,
-    text: str,
-    out_file: Path,
-    app_id: str,
-    api_key: str,
-    api_secret: str,
-    voice: str,
-    speed: int,
-    volume: int,
-    pitch: int,
-    raw_text: bool,
-) -> None:
-    websocket = require_websocket()
-    url = build_auth_url(SUPER_TTS_URL, api_key, api_secret)
-    socket = websocket.create_connection(url, timeout=30)
-    audio = bytearray()
-    request_text = text if raw_text else base64.b64encode(text.encode("utf-8")).decode("ascii")
-    try:
-        payload = {
-            "header": {"app_id": app_id, "status": 2},
-            "parameter": {
-                "oral": {
-                    "oral_level": "mid",
-                    "spark_assist": 1,
-                    "remain": 1,
-                },
-                "tts": {
-                    "vcn": voice,
-                    "speed": speed,
-                    "volume": volume,
-                    "pitch": pitch,
-                    "bgs": 0,
-                    "reg": 0,
-                    "rdn": 0,
-                    "rhy": 0,
-                    "watermask": 0,
-                    "implicit_watermark": False,
-                    "audio": {
-                        "encoding": "lame",
-                        "sample_rate": 24000,
-                        "channels": 1,
-                        "bit_depth": 16,
-                        "frame_size": 0,
-                    },
-                },
-            },
-            "payload": {
-                "text": {
-                    "encoding": "utf8",
-                    "compress": "raw",
-                    "format": "plain",
-                    "status": 2,
-                    "seq": 0,
-                    "text": request_text,
-                }
-            },
-        }
-        socket.send(json.dumps(payload, ensure_ascii=False, separators=(",", ":")))
-        while True:
-            response = recv_json(socket)
-            header = response.get("header") or {}
-            if header and header.get("code", 0) != 0:
-                raise RuntimeError(
-                    f"XFYUN super TTS failed: code={header.get('code')}, "
-                    f"message={header.get('message')}, sid={header.get('sid')}"
-                )
-            if response.get("code", 0) != 0:
-                raise RuntimeError(
-                    f"XFYUN super TTS failed: code={response.get('code')}, "
-                    f"message={response.get('message')}"
-                )
-            payload_audio = ((response.get("payload") or {}).get("audio") or {})
-            if payload_audio.get("audio"):
-                audio.extend(base64.b64decode(payload_audio["audio"]))
-            if header.get("status") == 2 or payload_audio.get("status") == 2:
-                break
-    finally:
-        socket.close()
-
-    if not audio:
-        raise RuntimeError("No audio data returned by XFYUN super TTS.")
-    out_file.write_bytes(audio)
-
-
-def validate_range(name: str, value: int) -> None:
-    if value < 0 or value > 100:
-        raise ValueError(f"{name} must be between 0 and 100.")
-
-
-def parse_args() -> argparse.Namespace:
-    parser = argparse.ArgumentParser(description="Generate XFYUN TTS audio on Ubuntu.")
-    parser.add_argument("--script", type=Path, default=None, help="Markdown script path.")
-    parser.add_argument("--output-dir", type=Path, default=Path("02_audio/xfyun_tts"))
-    parser.add_argument("--mode", choices=["normal", "super"], default="super")
-    parser.add_argument("--voice", default=None, help="XFYUN vcn voice name.")
-    parser.add_argument("--speed", type=int, default=50)
-    parser.add_argument("--volume", type=int, default=70)
-    parser.add_argument("--pitch", type=int, default=50)
-    parser.add_argument("--raw-text", action="store_true", help="Use raw text for super TTS.")
-    parser.add_argument("--overwrite", action="store_true", help="Overwrite existing mp3 files.")
-    parser.add_argument("--dry-run", action="store_true", help="Only parse script and print plan.")
-    return parser.parse_args()
-
-
-def main() -> int:
-    args = parse_args()
-    validate_range("speed", args.speed)
-    validate_range("volume", args.volume)
-    validate_range("pitch", args.pitch)
-
-    script_path = args.script or find_default_script(Path.cwd())
-    segments = load_segments(script_path)
-    voice = args.voice or ("xiaoyan" if args.mode == "normal" else "x5_lingfeiyi_flow")
-
-    print(f"script={script_path}")
-    print(f"mode={args.mode}")
-    print(f"voice={voice}")
-    print(f"segments={len(segments)}")
-    for segment in segments:
-        print(f"  {segment.index}. {segment.title} ({len(segment.text)} chars)")
-
-    if args.dry_run:
-        return 0
-
-    app_id = os.environ.get("XF_APPID")
-    api_key = os.environ.get("XF_APIKEY")
-    api_secret = os.environ.get("XF_APISECRET")
-    if not app_id or not api_key or not api_secret:
-        raise SystemExit("Please set XF_APPID, XF_APIKEY and XF_APISECRET first.")
-
-    args.output_dir.mkdir(parents=True, exist_ok=True)
-    for segment in segments:
-        out_file = args.output_dir / f"{segment.index}-{safe_filename(segment.title)}.mp3"
-        if out_file.exists() and not args.overwrite:
-            print(f"skip existing: {out_file}")
-            continue
-        print(f"synthesizing {segment.index}: {segment.title}")
-        if args.mode == "normal":
-            synthesize_normal(
-                text=segment.text,
-                out_file=out_file,
-                app_id=app_id,
-                api_key=api_key,
-                api_secret=api_secret,
-                voice=voice,
-                speed=args.speed,
-                volume=args.volume,
-                pitch=args.pitch,
-            )
-        else:
-            synthesize_super(
-                text=segment.text,
-                out_file=out_file,
-                app_id=app_id,
-                api_key=api_key,
-                api_secret=api_secret,
-                voice=voice,
-                speed=args.speed,
-                volume=args.volume,
-                pitch=args.pitch,
-                raw_text=args.raw_text,
-            )
-        print(f"generated: {out_file}")
-
-    print("all voice files generated")
-    return 0
-
-
-if __name__ == "__main__":
-    try:
-        raise SystemExit(main())
-    except KeyboardInterrupt:
-        raise SystemExit(130)
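
Review note on the removed `build_auth_url`: it signs a three-line string (`host`, `date`, request line) with HMAC-SHA256 and wraps the result in a Base64-encoded authorization value. The sketch below reproduces that layout with dummy credentials and shows that the signature can be read back out of the authorization value. The helper names `sign_request`, `build_authorization`, and `extract_signature` are hypothetical, and nothing here contacts the XFYUN service.

```python
import base64
import hashlib
import hmac
import re


def sign_request(host: str, date: str, path: str, api_secret: str) -> str:
    """Base64 HMAC-SHA256 over the same three-line string the removed script signs."""
    origin = f"host: {host}\ndate: {date}\nGET {path} HTTP/1.1"
    digest = hmac.new(
        api_secret.encode("utf-8"), origin.encode("utf-8"), hashlib.sha256
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def build_authorization(api_key: str, signature: str) -> str:
    """Base64-encoded authorization value in the removed script's field order."""
    origin = (
        f'api_key="{api_key}", algorithm="hmac-sha256", '
        f'headers="host date request-line", signature="{signature}"'
    )
    return base64.b64encode(origin.encode("utf-8")).decode("ascii")


def extract_signature(authorization: str) -> str:
    """Decode the authorization value and pull the signature field back out."""
    decoded = base64.b64decode(authorization).decode("utf-8")
    match = re.search(r'signature="([^"]+)"', decoded)
    if match is None:
        raise ValueError("no signature field found")
    return match.group(1)


if __name__ == "__main__":
    # Dummy values for illustration only; these are not real credentials.
    date = "Mon, 18 May 2026 16:32:40 GMT"
    signature = sign_request("tts-api.xfyun.cn", date, "/v2/tts", "dummy_secret")
    authorization = build_authorization("dummy_key", signature)
    print(hmac.compare_digest(extract_signature(authorization), signature))
```

Because the date is part of the signed string, a signature is only valid for the timestamp it was generated with.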
--- a/工程分析/实现方案-2026-05-19-00-28-03.md
+++ b/工程分析/实现方案-2026-05-19-00-28-03.md
@@ -0,0 +1,23 @@
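+# Implementation Plan
+
+Start time: 2026-05-19-00-28-03
+
+## Git Tracking
+
+- Keep `Tools_scripts_XunFei-Ubuntu/` and `配音生成工作流-Ubuntu-Agent.md` in their deleted state.
+- Commit the deletions so that the latest `main` on Gitea no longer contains these local voice-over tools.
+
+## Ignore Rules
+
+- Add to `.gitignore`:
+  - `Tools_scripts_XunFei-Ubuntu/`
+  - `配音生成工作流-Ubuntu-Agent.md`
+- Add to `.dockerignore`:
+  - `Tools_scripts_XunFei-Ubuntu`
+  - `配音生成工作流-Ubuntu-Agent.md`
+
+## Verification
+
+- Use `git check-ignore` to verify that all three paths are ignored.
+- Use `git status --short` to confirm that the pending commit contains only the deletions, the ignore rules, and the engineering analysis documents.
+- After pushing to Gitea, redeploy and check the health endpoint.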
--- a/工程分析/测试方案-2026-05-19-00-28-03.md
+++ b/工程分析/测试方案-2026-05-19-00-28-03.md
@@ -0,0 +1,34 @@
+# Test Plan
+
+Start time: 2026-05-19-00-28-03
+
+## Ignore Checks
+
+- `git check-ignore -v 待配音视频/<any-file>`
+- `git check-ignore -v Tools_scripts_XunFei-Ubuntu/<any-file>`
+- `git check-ignore -v 配音生成工作流-Ubuntu-Agent.md`
+
+Results:
+
+- `待配音视频/` matches the rule on line 13 of `.gitignore`.
+- `Tools_scripts_XunFei-Ubuntu/` matches the rule on line 14 of `.gitignore`.
+- `配音生成工作流-Ubuntu-Agent.md` matches the rule on line 15 of `.gitignore`.
+- `.dockerignore` also lists these three paths, which do not need to enter the build context.
+
+## Git Checks
+
+- `git diff --check`
+- `git status --short`
+- `git ls-files Tools_scripts_XunFei-Ubuntu 配音生成工作流-Ubuntu-Agent.md 待配音视频`
+
+Results:
+
+- `git diff --check` passes.
+- `待配音视频/` is not tracked by Git.
+- `Tools_scripts_XunFei-Ubuntu/` and `配音生成工作流-Ubuntu-Agent.md` were added to Git in the previous round; this round keeps them deleted and commits the deletions so that the latest `main` on Gitea no longer contains them.
+
+## Deployment Checks
+
+- `docker compose -f docker_compose_huijutec.yaml up -d --build`
+- `curl -fsS http://127.0.0.1:10004/api/health`
+- `curl -fsS https://isiseg.huijutec.cn/api/health`
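
Review note on the ignore checks in the test plan: `git check-ignore -v` prints one line per matched path in the form `<source>:<linenum>:<pattern><TAB><pathname>`, which is where the "line 13/14/15 of `.gitignore`" results come from. A minimal sketch of turning such a line into structured data follows. `IgnoreMatch` and `parse_check_ignore` are hypothetical names, the sample line is constructed by hand rather than captured from this repository, and the parser assumes the source file name contains no colon and that `-z` is not used.

```python
from typing import NamedTuple


class IgnoreMatch(NamedTuple):
    source: str   # file that holds the matching rule, e.g. ".gitignore"
    line: int     # 1-based line number of the rule inside that file
    pattern: str  # the ignore pattern that matched
    path: str     # the path that was queried


def parse_check_ignore(line: str) -> IgnoreMatch:
    """Parse one line of `git check-ignore -v` output (non -z format)."""
    rule, sep, path = line.rstrip("\n").partition("\t")
    if not sep:
        raise ValueError(f"not a verbose check-ignore line: {line!r}")
    # Split at most twice so colons inside the pattern are preserved.
    source, number, pattern = rule.split(":", 2)
    return IgnoreMatch(source, int(number), pattern, path)


if __name__ == "__main__":
    sample = (
        ".gitignore:14:Tools_scripts_XunFei-Ubuntu/"
        "\tTools_scripts_XunFei-Ubuntu/README.md"
    )
    match = parse_check_ignore(sample)
    print(match.source, match.line, match.pattern)
```

Note that Git may quote path names containing non-ASCII characters unless `core.quotePath` is disabled, so paths such as `待配音视频/` can appear escaped in real output.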
--- a/工程分析/经验记录.md
+++ b/工程分析/经验记录.md
@@ -271,3 +271,15 @@ B. 产生问题原因：ffmpeg concat demuxer 要求输入流参数一致，多
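 C. Solution: when merging an audio directory, `build_final_video_ubuntu.py` first converts every clip to 48 kHz stereo PCM WAV, then concatenates the clips and muxes the result with the video, producing an H.264/AAC MP4.

 D. Prevention: before concatenating multiple audio clips, normalize the sample rate, channel count, and codec; always encode the final video as H.264/AAC/yuv420p/faststart.
+
+## 2026-05-19-00-28-03 Voice-over tools excluded from the Gitea backup
+
+### 1. Temporary voice-over tools mistakenly included in the source backup
+
+A. Problem: the Ubuntu voice-over tools and the workflow document are local auxiliary materials; the user explicitly stated that, like `待配音视频`, they do not need to be backed up to Gitea.
+
+B. Cause: in the previous round, "create new tools" was taken to mean that the tools should be saved with the repository, without first confirming whether these local voice-over tools were within the scope of source delivery.
+
+C. Solution: remove `Tools_scripts_XunFei-Ubuntu/` and `配音生成工作流-Ubuntu-Agent.md` from Git tracking and add them to `.gitignore` and `.dockerignore`; `待配音视频/` remains ignored.
+
+D. Prevention: treat videos, voice-overs, screen recordings, temporary tools, and local workflow materials as "not part of the source repository" by default; include them in Gitea only when the user explicitly asks for a backup.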
--- a/工程分析/需求分析-2026-05-19-00-28-03.md
+++ b/工程分析/需求分析-2026-05-19-00-28-03.md
@@ -0,0 +1,25 @@
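+# Requirements Analysis
+
+Start time: 2026-05-19-00-28-03
+
+## User Requirement
+
+The user explicitly stated that the newly created Ubuntu voice-over tools and workflow document, as well as `待配音视频`, do not need to be backed up to Gitea.
+
+## Items to Handle
+
+- `Tools_scripts_XunFei-Ubuntu/`
+- `配音生成工作流-Ubuntu-Agent.md`
+- `待配音视频/`
+
+## Goals
+
+- Remove the Ubuntu voice-over tool directory and the Ubuntu workflow document from Git tracking.
+- Add the paths above to `.gitignore` to prevent accidental commits.
+- Add the paths above to `.dockerignore` so the Docker build context does not carry voice-over tools or video material.
+- Keep the engineering analysis documents as a record of this change.
+
+## Notes
+
+- `待配音视频/` was already added to `.gitignore` and `.dockerignore` in the previous round and stays there.
+- In the current working tree, the Ubuntu voice-over tool directory and the Ubuntu workflow document are already deleted; this round does not restore them and simply confirms their removal from the latest version on Gitea.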
--- a/配音生成工作流-Ubuntu-Agent.md
+++ b/配音生成工作流-Ubuntu-Agent.md
@@ -1,242 +0,0 @@
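-# Voice-over Generation Workflow for the Ubuntu Agent
-
-This document guides the Agent in using `Tools_scripts_XunFei-Ubuntu` on Ubuntu to turn voice-over script text into XFYUN voice-over audio and combine it with a video into the final introduction video.
-
-## 1. Directory Layout
-
-The following structure is recommended: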
-
-```text
-project-directory/
-  配音稿.md
-  Tools_scripts_XunFei-Ubuntu/
-    requirements-ubuntu.txt
-    xfyun_tts_ubuntu.py
-    synthesize_xfyun_tts.sh
-    synthesize_xfyun_super_tts.sh
-    check_audio_duration.sh
-    build_final_video_ubuntu.py
-```
-
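-Where:
-
-- `xfyun_tts_ubuntu.py`: core Python script; supports both normal TTS and super-realistic TTS.
-- `synthesize_xfyun_tts.sh`: entry point for normal XFYUN TTS; default voice `xiaoyan`.
-- `synthesize_xfyun_super_tts.sh`: entry point for XFYUN super-realistic TTS; default voice `x5_lingfeiyi_flow`.
-- `check_audio_duration.sh`: lists the durations of mp3, wav, and m4a audio files in bulk.
-- `build_final_video_ubuntu.py`: combines a single video with the voice-over audio into the final video and automatically adjusts the video speed to match the narration length.
-
-## 2. Install Dependencies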
-
-```bash
-sudo apt update
-sudo apt install -y python3 python3-pip ffmpeg
-python3 -m pip install -r Tools_scripts_XunFei-Ubuntu/requirements-ubuntu.txt
-chmod +x Tools_scripts_XunFei-Ubuntu/*.sh Tools_scripts_XunFei-Ubuntu/*.py
-```
-
-If you use a virtual environment:
-
-```bash
-python3 -m venv .venv-tts
-source .venv-tts/bin/activate
-python -m pip install -r Tools_scripts_XunFei-Ubuntu/requirements-ubuntu.txt
-```
-
-## 3. Configure XFYUN Environment Variables
-
-```bash
-export XF_APPID="your_app_id"
-export XF_APIKEY="your_api_key"
-export XF_APISECRET="your_api_secret"
-```
-
-To make the variables persistent, append them to `~/.bashrc`:
-
-```bash
-cat >> ~/.bashrc <<'EOF'
-export XF_APPID="your_app_id"
-export XF_APIKEY="your_api_key"
-export XF_APISECRET="your_api_secret"
-EOF
-source ~/.bashrc
-```
-
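-The scripts do not store the credentials; do not commit real credentials to the repository either.
-
-## 4. Voice-over Script Format
-
-The script recognizes numbered section headings in Markdown:
-
-```markdown
-## 1. First section title
-Voice-over text for the first section.
-
-## 2. Second section title
-Voice-over text for the second section.
-
-## 3. Third section title
-Voice-over text for the third section.
-
-## 4. Fourth section title
-Voice-over text for the fourth section.
-```
-
-Notes:
-
-- Keep headings in the range `## 1.` to `## 4.` where possible (the parser only matches single-digit section numbers, `1` to `9`).
-- Output file names combine the section number and title, for example `1-First section title.mp3`.
-- Metadata lines such as `说明：` (description), `时长：` (duration), `备注：` (remark), `镜头：` (shot), and `画面：` (picture) are ignored.
-- The body should contain only the final text to be read aloud; do not include internal prompts.
-
-To check the parsing first, do a dry run: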
-
-```bash
-python3 Tools_scripts_XunFei-Ubuntu/xfyun_tts_ubuntu.py \
-  --script 配音稿.md \
-  --dry-run
-```
-
-## 5. Normal TTS Synthesis
-
-Normal TTS is suited to quickly generating clear, stable Chinese voice-overs.
-
-```bash
-./Tools_scripts_XunFei-Ubuntu/synthesize_xfyun_tts.sh \
-  --script 配音稿.md \
-  --output-dir 02_audio/tts_audio_xiaoyan \
-  --voice xiaoyan \
-  --speed 50 \
-  --volume 70 \
-  --pitch 50
-```
-
-## 6. Super-realistic TTS Synthesis
-
-Super-realistic TTS is better suited to project reports, system introductions, and promotional videos.
-
-```bash
-./Tools_scripts_XunFei-Ubuntu/synthesize_xfyun_super_tts.sh \
-  --script 配音稿.md \
-  --output-dir 02_audio/super_tts_x5_lingfeiyi \
-  --voice x5_lingfeiyi_flow \
-  --speed 50 \
-  --volume 70 \
-  --pitch 50
-```
-
-If the endpoint requires plain-text mode, add `--raw-text`:
-
-```bash
-./Tools_scripts_XunFei-Ubuntu/synthesize_xfyun_super_tts.sh \
-  --script 配音稿.md \
-  --output-dir 02_audio/super_tts_raw \
-  --raw-text
-```
-
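-## 7. Voice and Speed Selection
-
-- `--voice`: the XFYUN speaker, i.e. `vcn`.
-- `--speed`: speaking rate, range `0-100`, default `50`.
-- `--volume`: volume, range `0-100`, default `70`.
-- `--pitch`: pitch, range `0-100`, default `50`.
-- `--overwrite`: overwrite audio files that already exist.
-
-Recommendations:
-
-- System introductions: prefer super-realistic TTS, for example `x5_lingfeiyi_flow`.
-- Quick proofreading: use normal TTS, for example `xiaoyan`.
-- To shorten the final video: trim the script first, then raise `--speed` to `55-60`.
-
-## 8. Check Audio Duration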
-
-```bash
-./Tools_scripts_XunFei-Ubuntu/check_audio_duration.sh 02_audio/super_tts_x5_lingfeiyi
-```
-
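-If the total duration is too long:
-
-- First, trim the voice-over script.
-- Next, raise `--speed` slightly.
-- Only as a last resort, adjust the video speed.
-
-Heavily retiming the video to fit an overlong narration is not recommended; the footage will look unnatural.
-
-## 9. Build the Final Video
-
-If you already have one complete screen recording and a set of per-section voice-over clips: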
-
-```bash
-python3 Tools_scripts_XunFei-Ubuntu/build_final_video_ubuntu.py \
-  --video 待配音视频/ISISeg-介入导丝视频分割工作台-使用展示.mp4 \
-  --audio-dir 02_audio/super_tts_x5_lingfeiyi \
-  --output 05_outputs/ISISeg-系统使用视频-配音版.mp4
-```
-
-If you already have a single merged narration audio file:
-
-```bash
-python3 Tools_scripts_XunFei-Ubuntu/build_final_video_ubuntu.py \
-  --video input.mp4 \
-  --audio voiceover.mp3 \
-  --output 05_outputs/final_voiceover.mp4
-```
-
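-The script will:
-
-- Read the video and narration durations with `ffprobe`.
-- Compute the video speed factor automatically.
-- Drop the original video's audio track and keep only the new narration.
-- Output an H.264/AAC, yuv420p, faststart MP4.
-
-Common parameters:
-
-- `--width 1920 --height 1080`: output resolution.
-- `--fps 30`: output frame rate.
-- `--silence 0.35`: seconds of silence inserted between voice-over clips.
-- `--video-speed 1.25`: set the video speed manually, overriding the automatic calculation.
-
-## 10. Agent Checklist
-
-1. Confirm that `Tools_scripts_XunFei-Ubuntu` exists.
-2. Check that `ffmpeg`, `ffprobe`, and `python3` are available.
-3. Install the dependencies listed in `requirements-ubuntu.txt`.
-4. Check `XF_APPID`, `XF_APIKEY`, and `XF_APISECRET`.
-5. Create or read the voice-over script and validate its sections with `--dry-run`.
-6. Choose normal TTS or super-realistic TTS according to the use case.
-7. Use a separate output directory for each voice and speed so that trial results do not overwrite each other.
-8. After generating the audio, run `check_audio_duration.sh`.
-9. Build the final video with `build_final_video_ubuntu.py`.
-10. Use `ffprobe` to check the final video's duration, codecs, and audio stream.
-
-## 11. Troubleshooting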
-
-### Missing dependency: websocket-client
-
-Run:
-
-```bash
-python3 -m pip install -r Tools_scripts_XunFei-Ubuntu/requirements-ubuntu.txt
-```
-
-### Please set XF_APPID, XF_APIKEY and XF_APISECRET first
-
-The current terminal has no XFYUN credential environment variables. Set them and run the script again:
-
-```bash
-export XF_APPID="your_app_id"
-export XF_APIKEY="your_api_key"
-export XF_APISECRET="your_api_secret"
-```
-
-### Cannot find script Markdown file
-
-Specify the voice-over script explicitly with `--script 配音稿.md`.
-
-### ffmpeg or ffprobe not found
-
-Run:
-
-```bash
-sudo apt install -y ffmpeg
-```
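
Review note on the script format described in the removed workflow document: the removed `load_segments` matches headings with `[1-9]`, so only single-digit section numbers are recognized. The sketch below shows the same splitting idea with one deliberate change, `\d+` instead of `[1-9]`, so that a heading such as `## 10.` also starts a new section. This is an illustrative variant, not the removed implementation; `split_sections` is a hypothetical name, while the metadata prefixes are copied from the removed script.

```python
from __future__ import annotations

import re

# Assumption: multi-digit section numbers are allowed (the removed script used [1-9]).
SECTION = re.compile(r"(?ms)^##\s+(\d+)\.\s+(.+?)\r?\n(.*?)(?=^##\s+\d+\.\s+|\Z)")
# Metadata prefixes copied from the removed script; such lines are not read aloud.
METADATA = re.compile(r"^(说明|时长|备注|镜头|画面|音色|语速|输出|提示)[:：]")


def split_sections(markdown: str) -> list[tuple[str, str, str]]:
    """Return (number, title, text) for every '## N. title' section."""
    sections = []
    for number, title, body in SECTION.findall(markdown):
        lines = []
        for raw_line in body.splitlines():
            line = raw_line.strip()
            # Skip blank lines, nested headings, and metadata lines.
            if not line or line.startswith("#") or METADATA.match(line):
                continue
            lines.append(line)
        sections.append((number, title.strip(), "\n".join(lines)))
    return sections


if __name__ == "__main__":
    sample = "## 1. Intro\n说明：not read aloud\nHello.\n\n## 10. Outro\nBye.\n"
    for number, title, text in split_sections(sample):
        print(number, title, repr(text))
```

Unlike the removed script, this sketch does not raise when a section ends up with no readable text; a caller would need to add that check.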