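Configuration Options

Configuration options are used to initialize an AaaS Pilot Kit instance. They fall into two groups: required options and optional options.

Required Options

About required options

The options below are required in normal use, but some of them (such as agentConfig) may be omitted under specific conditions. See each option's description for details.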

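`token` (string)

🔑 Digital employee auth token - the credential used to authenticate calls to the digital employee service.

Example: "your-auth-token-here"
How to obtain: contact your platform administrator or consult the platform documentation

`figureId` (string)

🆔 Digital employee figure ID - the unique identifier of the figure resource, obtained from the platform.

Example: "209337", "209342"
How to obtain: see the platform documentation

`ttsPer` (string)

🎙️ Digital employee voice ID - the speaker identifier used by the TTS engine.

Example: "LITE_audiobook_female_1", "LITE_badao_shaoye"
How to obtain: see the platform documentation

`agentConfig` (AgentConfig)

🧠 AI Agent API configuration - settings for LLM-based conversation handling.

interface IAgentConfig {
    // 客悦ONE intelligent outbound-call robot ID: https://ky.cloud.baidu.com/ky/telemarketing/config/robot/manage
    robotId?: string;
    // 客悦ONE intelligent customer-service robot token: https://ky.cloud.baidu.com/ky/unit-app
    token?: string;
}

Requirements:

⚠️ Required unless a custom agentService is provided.
✅ When agentService is provided, agentConfig can be omitted; the custom service then handles all conversation logic.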
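
The requirement rules above can be expressed as a small pre-flight check. This is a minimal sketch, not part of the SDK: the interface names and `findMissingOptions` are hypothetical, and it only assumes the four required options and the agentService exemption described in this section.

```typescript
// Hypothetical shapes for illustration only; the real SDK types may differ.
interface AgentConfigSketch {
    robotId?: string;
    token?: string;
}

interface RequiredOptionsSketch {
    token?: string;
    figureId?: string;
    ttsPer?: string;
    agentConfig?: AgentConfigSketch;
    agentService?: unknown; // a custom agent service class, if one is supplied
}

// Returns the names of required options that are missing.
function findMissingOptions(options: RequiredOptionsSketch): string[] {
    const missing: string[] = [];
    if (!options.token) missing.push('token');
    if (!options.figureId) missing.push('figureId');
    if (!options.ttsPer) missing.push('ttsPer');
    // agentConfig is only required when no custom agentService is supplied.
    if (!options.agentService && !options.agentConfig) missing.push('agentConfig');
    return missing;
}
```

Running such a check before calling createAaaSPilotKit lets an application report all missing options at once instead of failing on the first one.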

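Optional Options

`ttsSample` (number)

📊 TTS audio sample rate (Hz) - affects audio quality and bandwidth.

Default: 16000
Recommended values:
- 16000 (16 kHz, the default; balances quality and performance)
- 24000 (24 kHz, for high-fidelity scenarios)
- 8000 (8 kHz, for low-bandwidth or embedded devices)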

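`locale` (string | LanguageCode)

🌐 Unified language setting (v1.1.2+) - sets both the SDK UI language and the speech-service language.

Type: 'zh' | 'en' | 'ja' | 'ko' | LanguageCode | string
Default: 'zh'
Description: controls both of the following:
- The language of the SDK's internal message text (such as error and status messages)
- The language used for ASR speech recognition and TTS speech synthesis

import {createAaaSPilotKit, Language} from '@bdky/aaas-pilot-kit';

// Option 1: use a string
const kitFromString = createAaaSPilotKit({
    locale: 'en', // UI and speech both use English
});

// Option 2: use the Language enum (recommended; gives type hints)
const kitFromEnum = createAaaSPilotKit({
    locale: Language.ENGLISH,
});

Unified configuration

Since v1.1.2, setting locale alone configures both the UI language and the speech-service language. If the UI and speech need different languages, override the speech-service language with asr.config.lang.

Details: see Internationalization (i18n)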

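`messages` (Partial<I18nMessages>)

📝 Custom translation messages (v1.1.2+) - override or extend the built-in translations.

Type: Partial<I18nMessages>
Default: undefined

Lookup priority:

1. Custom messages
2. The built-in language pack for the current locale
3. Chinese fallback (zh)
4. The key itself (the original Chinese text)

createAaaSPilotKit({
    locale: 'en',
    messages: {
        // Override a built-in translation (the key is the original Chinese text)
        '网络连接错误': 'Custom network error message',
    },
});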
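
The four-step lookup priority can be sketched as a plain function. This is an illustration of the documented order only; `resolveMessage`, `MessageTable`, and the sample language pack are hypothetical and do not reflect the SDK's internal implementation.

```typescript
type MessageTable = Record<string, string>;

// Resolves a message key using the documented priority:
// custom messages -> built-in pack for the locale -> built-in zh pack -> the key itself.
function resolveMessage(
    key: string,
    locale: string,
    custom: MessageTable,
    builtIn: Record<string, MessageTable>
): string {
    const localePack: MessageTable = builtIn[locale] || {};
    const zhPack: MessageTable = builtIn['zh'] || {};
    const candidates: Array<string | undefined> = [custom[key], localePack[key], zhPack[key]];
    for (const candidate of candidates) {
        if (candidate !== undefined) return candidate;
    }
    return key; // last resort: the key is the original Chinese text
}

// Sample data (assumed for illustration).
const samplePacks: Record<string, MessageTable> = {
    en: {'网络连接错误': 'Network connection error'},
};
console.log(resolveMessage('网络连接错误', 'en', {}, samplePacks));
```

A key that appears in none of the tables is returned unchanged, which is why untranslated messages show up in Chinese.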

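Details: see Internationalization (i18n)

`lang` (LanguageCode)

Deprecated (v1.1.2+)

The top-level lang option is deprecated; migrate to locale.

// ❌ Old style (deprecated)
{ lang: 'en' }

// ✅ New style
{ locale: 'en' }

To change only the speech-service language while keeping the UI in Chinese, use asr.config.lang:

// ✅ Only the speech services use English
{
    locale: 'zh',  // UI stays in Chinese
    asr: { provider: 'baidu', config: { lang: 'en' } }
}

🌐 Language setting - configures the language used for ASR speech recognition and TTS speech synthesis.

Type: LanguageCode (a string or a Language constant)
Default: 'zh' (Chinese)
Supported languages:

Language constant	Language code	Language name
`Language.CHINESE`	`'zh'`	Chinese (Mandarin)
`Language.ENGLISH`	`'en'`	English
`Language.JAPANESE`	`'ja'`	Japanese
`Language.SPANISH`	`'es'`	Spanish
`Language.RUSSIAN`	`'ru'`	Russian
`Language.KOREAN`	`'ko'`	Korean
`Language.VIETNAMESE`	`'vi'`	Vietnamese
`Language.GERMAN`	`'de'`	German
`Language.INDONESIAN`	`'id'`	Indonesian
`Language.THAI`	`'th'`	Thai

Usage (legacy lang option; prefer locale in new code):

import {createAaaSPilotKit, Language} from '@bdky/aaas-pilot-kit';

// Option 1: use a Language constant (recommended; gives type hints)
const controllerWithConstant = createAaaSPilotKit({
    figureId: 'your-figure-id',
    token: 'your-token',
    lang: Language.ENGLISH,  // English
});

// Option 2: pass the language code string directly
const controllerWithCode = createAaaSPilotKit({
    figureId: 'your-figure-id',
    token: 'your-token',
    lang: 'en',  // English
});

Notes

- Passing an unsupported language code automatically falls back to Chinese ('zh') and logs a warning to the console
- Prefer the Language constants to get TypeScript type hints and IDE autocompletion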
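
The fallback behavior described in the notes can be sketched as follows. This is an illustration only: `normalizeLanguage` and `SUPPORTED_LANGUAGE_CODES` are hypothetical names, the code list is copied from the table above, and the exact warning text the SDK prints is not documented here.

```typescript
// Language codes taken from the supported-languages table.
const SUPPORTED_LANGUAGE_CODES: string[] = ['zh', 'en', 'ja', 'es', 'ru', 'ko', 'vi', 'de', 'id', 'th'];

// Returns the code unchanged when supported; otherwise warns and falls back to 'zh'.
function normalizeLanguage(
    code: string,
    warn: (message: string) => void = console.warn
): string {
    if (SUPPORTED_LANGUAGE_CODES.indexOf(code) !== -1) {
        return code;
    }
    warn('Unsupported language code "' + code + '", falling back to "zh"');
    return 'zh';
}
```

Normalizing the value in application code before passing it to the SDK makes the fallback explicit instead of relying on a console warning.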

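`rendererMode` ('cloud' | 'cloud-native' | 'client')

🖥️ Rendering mode - selects the technical implementation of the digital human service.

Default: 'cloud'
Options:
- 'cloud' (default) → cloud-streamed rendering (via iframe)
  - Use case: desktop (PC)
  - Characteristics: high-fidelity animated figure; requires network and RTC support
- 'cloud-native' → cloud-streamed rendering (native SDK integration)
  - Use case: mobile (avoids iframe click restrictions)
  - Characteristics: high-fidelity animated figure; requires network and RTC support
  - Recommendation: prefer this mode in mobile browser environments
- 'client' → local 2D rendering (static image plus lip-sync animation)
  - Use case: offline scenarios or low resource budgets
  - Characteristics: low resource usage; works offline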
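
The guidance above can be condensed into a small selection helper. This is a sketch under stated assumptions: `pickRendererMode` and `RuntimeHints` are hypothetical, and how an application detects "mobile" or "offline" is left to the caller.

```typescript
type RendererMode = 'cloud' | 'cloud-native' | 'client';

// Hypothetical description of the runtime, supplied by the application.
interface RuntimeHints {
    isMobile: boolean; // e.g. derived from the user agent or viewport
    offline: boolean;  // true when no network / RTC is available
}

// Applies the recommendations from the option list:
// offline -> 'client', mobile -> 'cloud-native', otherwise the default 'cloud'.
function pickRendererMode(hints: RuntimeHints): RendererMode {
    if (hints.offline) return 'client';
    return hints.isMobile ? 'cloud-native' : 'cloud';
}
```

The returned value can be passed directly as the rendererMode option.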

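Usage examples:

// Desktop (default)
const desktopController = createAaaSPilotKit({
    figureId: 'xxx',
    rendererMode: 'cloud'
});

// Optimized for mobile
const mobileController = createAaaSPilotKit({
    figureId: 'xxx',
    rendererMode: 'cloud-native'
});

`clientRendererConfig` (object)

🖼️ Client rendering configuration - only takes effect when rendererMode='client'.

Includes the figure resource path, lip-shape mapping, TTS driving parameters, and so on.

`timeoutSec` (number)

⏱️ Global session timeout (seconds) - when it elapses, the conversation ends automatically and resources are released.

Default: 60

`disconnectAlertSec` (number)

⏰ Pre-timeout reminder (seconds) - how long before the session timeout the reminder is announced.

Default: 10
Example: set to 10 → "the conversation is about to end" is announced 10 seconds before the timeout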
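
The relationship between timeoutSec and disconnectAlertSec is simple arithmetic. The sketch below assumes the reminder is scheduled relative to the start of the timeout countdown; when the SDK starts or resets that countdown is not stated here, and `alertOffsetSec` is a hypothetical helper.

```typescript
// Seconds after the timeout countdown starts at which the reminder would play.
// Never negative: if the reminder window is longer than the timeout, it plays immediately.
function alertOffsetSec(timeoutSec: number = 60, disconnectAlertSec: number = 10): number {
    return Math.max(0, timeoutSec - disconnectAlertSec);
}

console.log(alertOffsetSec()); // with the documented defaults: 60 - 10 = 50
```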

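`figureResolutionWidth` / `figureResolutionHeight` (number)

📐 Digital employee figure resolution (pixels).

Requirements:
- Must be an even number
- Minimum 400, maximum 1920
- The width × height combination must not exceed 1080×1920 or 1920×1080
⚠️ Warning: an overly high resolution may degrade performance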
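
The resolution constraints can be checked before initialization. This is a minimal sketch: `checkFigureResolution` is a hypothetical helper, and it assumes the 400 to 1920 range applies to width and height alike.

```typescript
// Returns a list of violated constraints; an empty list means the size is acceptable.
function checkFigureResolution(width: number, height: number): string[] {
    const errors: string[] = [];
    const dims: Array<[string, number]> = [['width', width], ['height', height]];
    for (const [name, value] of dims) {
        if (!Number.isInteger(value) || value % 2 !== 0) {
            errors.push(name + ' must be an even integer');
        }
        if (value < 400 || value > 1920) {
            errors.push(name + ' must be between 400 and 1920');
        }
    }
    // The pair must fit inside either a portrait or a landscape Full HD frame.
    const fitsPortrait = width <= 1080 && height <= 1920;
    const fitsLandscape = width <= 1920 && height <= 1080;
    if (!fitsPortrait && !fitsLandscape) {
        errors.push('width x height must not exceed 1080x1920 or 1920x1080');
    }
    return errors;
}
```

For example, 1920×1920 passes the per-dimension checks but fails the combined limit.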

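`speechSpeed` (number)

🗣️ Speech rate (characters per second).

Default: 6 (standard broadcast pace)
Suggested adjustments:
- Teaching or elderly users → 4~5
- Fast-paced customer service → 7~8

`typeDelay` / `enTypeDelay` (number)

⌨️ Interval between characters for the "typewriter effect" (milliseconds).

typeDelay default: 163 (smooth and natural). Formula: 1000 ms / 163 ms ≈ 6 characters per second
enTypeDelay default: 45 (for English characters)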
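
The formula above converts between a per-character delay and an output rate. The two helpers below are hypothetical and only restate that arithmetic; they can be handy for keeping typeDelay roughly in step with speechSpeed.

```typescript
// Characters emitted per second for a given per-character delay in milliseconds.
function charsPerSecond(typeDelayMs: number): number {
    return 1000 / typeDelayMs;
}

// Per-character delay (rounded to whole milliseconds) for a target rate.
function typeDelayFor(targetCharsPerSecond: number): number {
    return Math.round(1000 / targetCharsPerSecond);
}

console.log(charsPerSecond(163).toFixed(2)); // "6.13", i.e. about 6 characters per second
console.log(typeDelayFor(6));                // 167
```

The default of 163 ms therefore sits slightly above 6 characters per second, close to the default speechSpeed of 6.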

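`interruptible` (boolean)

🛑 Allow interruption - whether the user may interrupt the current announcement (voice and manual input can both trigger it).

Default: true (recommended; improves the interaction experience)

`prologue` (string)

🎬 Opening line - announced automatically after initialization.

Example: "您好，我是您的数字员工小悦，有什么可以帮您？" (a greeting that introduces the digital employee and offers help)

`scaleX` / `scaleY` (number)

📏 Scale factor of the digital employee figure.

Default: 1.0
>1 enlarges, <1 shrinks

`translateX` / `translateY` (number)

↔️↕️ Position offset of the digital employee figure (pixels).

translateX: positive moves right, negative moves left
translateY: positive moves down, negative moves up

`position` (IPosition)

🧩 Pixel-level position and crop control - precisely lays out the region of the final frame in which the digital employee figure is shown.

🔧 Workflow:

1. Crop the "main rectangle of the digital employee figure" out of the original base plate according to crop
2. Scale and place the cropped region on the final video canvas according to location

💡 Use cases:

- Embedding the figure into a fixed-size "dialog box" or "product card"
- Matching pixel-exact positions from a UI design
- Hiding background clutter and showing only the upper body or a facial close-up

📐 Parameter structure:

interface IPosition {
    // Crop region (relative to the original base plate)
    crop: {x: number, y: number, width: number, height: number}
    // Final position and size (relative to the output canvas)
    location: {x: number, y: number, width: number, height: number}
}

⚠️ Notes:

- All values are in pixels; percentages are not supported
- If crop extends beyond the original base plate → it is automatically clamped to the bounds
- If location extends beyond the canvas → the figure is partly or entirely invisible
- Does not conflict with scaleX/Y and translateX/Y; the effects are combined (crop and place first → then scale and translate)

Example: show the upper body of the figure, horizontally centered (original base plate 1920x1080, output canvas 800x600):

const position = {
    // Crop the upper body (a 720x540 region centered horizontally on the base plate)
    crop: { 
        x: 600,
        y: 0,
        width: 720,
        height: 540 
    },
    // Place at 1:1 scale: x: 40 centers it horizontally, y: 20 leaves a 20px top margin
    location: {
        x: 40,
        y: 20,
        width: 720,
        height: 540 
    }
};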
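
The location values can be computed rather than hand-tuned. The helpers below are a sketch with hypothetical names: `centerOnCanvas` computes an exactly centered placement, and `clampCrop` shows one plausible reading of "automatically clamped" (the SDK's actual clamping rule is not specified here).

```typescript
interface Rect {
    x: number;
    y: number;
    width: number;
    height: number;
}

// Places a width x height region in the exact center of the canvas.
function centerOnCanvas(canvasWidth: number, canvasHeight: number, width: number, height: number): Rect {
    return {
        x: Math.round((canvasWidth - width) / 2),
        y: Math.round((canvasHeight - height) / 2),
        width: width,
        height: height
    };
}

// Assumed clamping: move a negative origin to 0 and trim the part
// that would extend past the right or bottom edge of the base plate.
function clampCrop(crop: Rect, plateWidth: number, plateHeight: number): Rect {
    const x = Math.min(Math.max(crop.x, 0), plateWidth);
    const y = Math.min(Math.max(crop.y, 0), plateHeight);
    return {
        x: x,
        y: y,
        width: Math.min(crop.width, plateWidth - x),
        height: Math.min(crop.height, plateHeight - y)
    };
}

console.log(centerOnCanvas(800, 600, 720, 540)); // x: 40, y: 30
```

For the 800x600 canvas above, exact centering gives x: 40 and y: 30; the example's y: 20 sits 10 pixels higher than the vertical center.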

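`minSplitLen` (number)

✂️ Streaming split granularity for the first sentence (number of characters).

Default: 5
Logic: "accumulate at least N characters, then start playback at the next punctuation mark", which avoids character-by-character stutter
Example: "今天天气不错。" → playback starts once the text reaches "不错。"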
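
The splitting rule can be sketched as a function over the text received so far. This is an illustration, not the SDK's implementation: `firstSegment` and the punctuation set are assumptions, and punctuation marks are counted as characters here.

```typescript
// Punctuation marks treated as split points (an assumed, non-exhaustive set).
const SPLIT_PUNCTUATION = '。，！？；,.!?;';

// Returns the first playable segment, or null if the buffered text
// has not yet reached minSplitLen characters followed by punctuation.
function firstSegment(buffered: string, minSplitLen: number = 5): string | null {
    for (let i = 0; i < buffered.length; i++) {
        const enoughChars = i + 1 >= minSplitLen;
        const isPunctuation = SPLIT_PUNCTUATION.indexOf(buffered.charAt(i)) !== -1;
        if (enoughChars && isPunctuation) {
            return buffered.slice(0, i + 1);
        }
    }
    return null;
}

console.log(firstSegment('今天天气不错。明天见')); // "今天天气不错。"
```

A larger minSplitLen delays the first audio slightly but avoids very short first segments such as "你好，".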

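`ttsModel` ('turbo_v2' | 'quality_v2' | undefined)

⚡ TTS model version.

undefined → standard (stable, low latency)
'turbo_v2' → turbo (faster response; suited to real-time conversation)
'quality_v2' → quality (prioritizes audio quality; latency increases somewhat)

Example:

const controller = await createAaaSPilotKit({
    ttsModel: 'turbo_v2', // use the turbo TTS model
    // ... other options
});

`asrVad` (number)

🎙️ Silence timeout for ASR voice activity detection (VAD), in milliseconds - the pause threshold used to decide that the user has finished a sentence.

Default: 600
In plain terms: this is how long you must pause before the system decides you are done and starts recognition
- Larger value → more "patient"; suits slow speakers or users who pause to think
- Smaller value → more "responsive"; suits fast speakers or scenarios that need quick replies
Recommended values:
- Default 600ms (fits most conversations)
- 300~400ms for rapid Q&A
- 800~1000ms for slow-paced or English-teaching scenarios
⚠️ Note: too small a value may cut users off mid-sentence; too large a value makes the system feel sluggish

Deprecation notice (v1.2.0+)

The top-level asrVad option is deprecated; migrate to asr.config.asrVad (Baidu ASR). The old style still works but is not recommended.

// Old style (still supported)
{ asrVad: 600 }

// New style (recommended)
{ asr: { provider: 'baidu', config: { asrVad: 600 } } }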
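
Both deprecated top-level options (lang and asrVad) can be migrated mechanically. The helper below is a sketch with hypothetical type and function names; it only covers the Baidu provider, and note that moving lang to locale also switches the UI language, as described under locale.

```typescript
// Hypothetical, simplified option shapes for illustration.
interface BaiduAsrSketch {
    provider: 'baidu';
    config: {asrVad?: number; lang?: string};
}

interface OptionsSketch {
    locale?: string;
    lang?: string;   // deprecated since v1.1.2
    asrVad?: number; // deprecated since v1.2.0
    asr?: BaiduAsrSketch;
}

// Returns a copy with lang -> locale and asrVad -> asr.config.asrVad.
// Values already present in the new locations are never overwritten.
function migrateLegacyOptions(input: OptionsSketch): OptionsSketch {
    const {lang, asrVad, ...rest} = input;
    const result: OptionsSketch = {...rest};
    if (lang !== undefined && result.locale === undefined) {
        result.locale = lang;
    }
    if (asrVad !== undefined) {
        const config: BaiduAsrSketch['config'] = result.asr ? {...result.asr.config} : {};
        if (config.asrVad === undefined) {
            config.asrVad = asrVad;
        }
        result.asr = {provider: 'baidu', config: config};
    }
    return result;
}
```

Calling `migrateLegacyOptions({lang: 'en', asrVad: 400})` yields an object with locale 'en' and asr.config.asrVad 400, with neither legacy key present.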

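`asr` (AsrConfig)

🎤 ASR service configuration (v1.1.0+) - selects the speech recognition provider and its parameters.

New feature

Since v1.1.0, the Azure (Microsoft cloud) speech recognition service is supported, which suits international scenarios.

Type definition:

type AsrConfig =
    | {provider: 'baidu', config: IBaiduAsrConfig}
    | {provider: 'azure', config: IAzureSpeechConfig};

Baidu ASR configuration (default)

Uses the Baidu speech recognition service; suited to deployments in mainland China.

Field	Type	Default	Description
`asrVad`	number	600	ASR VAD silence timeout (milliseconds)
`lang`	LanguageCode	'zh'	Language setting
`audioConstraints`	MediaTrackConstraints	see below	Audio constraints
`enableEchoCancellation`	boolean	false	Enable echo cancellation (recommended on mobile)
`echoCancellationConfig`	object	-	Echo cancellation tuning parameters

Example:

import {createAaaSPilotKit, Language} from '@bdky/aaas-pilot-kit';

const controller = createAaaSPilotKit({
    token: 'xxx',
    figureId: 'xxx',
    ttsPer: 'xxx',
    agentConfig: {/* ... */},
    asr: {
        provider: 'baidu',
        config: {
            asrVad: 600,
            lang: Language.CHINESE
        }
    }
});

Azure ASR configuration (international)

Uses the Azure (Microsoft cloud) speech recognition service; suited to international, multilingual scenarios.

Field	Type	Required	Default	Description
`subscriptionKey`	string	✅	-	Azure Speech subscription key
`region`	string	✅	-	Azure region (e.g. `'eastasia'`, `'southeastasia'`)
`languages`	string[]	-	`['en-US']`	Recognition languages (up to 4; supports automatic switching between languages)
`phraseList`	string[]	-	-	Custom phrase list (improves recognition of proper nouns)
`phraseWeight`	number	-	2	Phrase weight (1-10)
`initialSilenceTimeoutMs`	number	-	30000	Initial silence timeout (milliseconds)
`endSilenceTimeoutMs`	number	-	30000	End silence timeout (milliseconds)
`segmentationSilenceTimeoutMs`	number	-	1000	Segmentation silence timeout (milliseconds)
`connectionTimeoutMs`	number	-	-	Connection timeout (milliseconds; no timeout if unset)
`enableAudioLogging`	boolean	-	false	Enable audio logging (for debugging)
`customEndpointId`	string	-	-	Custom speech endpoint ID
`advancedConfig`	`Record<string, any>`	-	-	Advanced configuration
`audioConstraints`	MediaTrackConstraints	-	see below	Audio constraints
`enableEchoCancellation`	boolean	-	false	Enable echo cancellation
`echoCancellationConfig`	object	-	-	Echo cancellation tuning parameters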
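
The constraints stated in the Azure table (two required fields, at most 4 languages, phraseWeight between 1 and 10) can be verified up front. This is a sketch: `AzureConfigSketch` and `checkAzureConfig` are hypothetical and cover only the fields with documented limits.

```typescript
// Hypothetical subset of the Azure configuration, limited to constrained fields.
interface AzureConfigSketch {
    subscriptionKey?: string;
    region?: string;
    languages?: string[];
    phraseWeight?: number;
}

// Returns a list of problems; an empty list means the documented limits are met.
function checkAzureConfig(config: AzureConfigSketch): string[] {
    const errors: string[] = [];
    if (!config.subscriptionKey) errors.push('subscriptionKey is required');
    if (!config.region) errors.push('region is required');
    if (config.languages !== undefined && config.languages.length > 4) {
        errors.push('languages accepts at most 4 entries');
    }
    if (config.phraseWeight !== undefined && (config.phraseWeight < 1 || config.phraseWeight > 10)) {
        errors.push('phraseWeight must be between 1 and 10');
    }
    return errors;
}
```

Such a check is cheap to run before handing the configuration to the SDK and gives clearer messages than a failed recognition session.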

Example:

import {
    createAaaSPilotKit,
    AzureSpeechRegion,
    AzureSpeechLanguage
} from '@bdky/aaas-pilot-kit';

const controller = createAaaSPilotKit({
    token: 'xxx',
    figureId: 'xxx',
    ttsPer: 'xxx',
    agentConfig: {/* ... */},
    asr: {
        provider: 'azure',
        config: {
            subscriptionKey: 'YOUR_AZURE_SUBSCRIPTION_KEY',
            region: AzureSpeechRegion.SOUTHEAST_ASIA,
            languages: [
                AzureSpeechLanguage.CHINESE_SIMPLIFIED_CN,
                AzureSpeechLanguage.ENGLISH_US
            ],
            initialSilenceTimeoutMs: 30000,
            endSilenceTimeoutMs: 30000,
            phraseList: ['客悦ONE', 'AaaS']
        }
    }
});

Common audio configuration

The following settings apply to both the Baidu and Azure providers:

audioConstraints default:

{
    echoCancellation: true,
    noiseSuppression: true,
    autoGainControl: true
}

echoCancellationConfig structure:

{
    energyMultiplier?: number;   // energy multiplier
    idleThreshold?: number;      // idle threshold
    smoothingFactor?: number;    // smoothing factor
    recoveryDelay?: number;      // recovery delay (milliseconds)
}

Example of enabling echo cancellation (recommended on mobile):

asr: {
    provider: 'azure', // or 'baidu'
    config: {
        // ... provider-specific settings
        enableEchoCancellation: true,
        audioConstraints: {
            echoCancellation: true,
            noiseSuppression: true,
            autoGainControl: true
        }
    }
}

`checkAudioDeviceBeforeStart` (boolean)

🎙️ [Optional] Check audio device availability before ASR starts - runs a layered, progressive check (API support, HTTPS, device enumeration, permission, stream acquisition).

Default: true

When enabled (default true):

- checkAudioDevice() is called automatically before ASR starts
- A failed check does not by itself block ASR startup; what happens next is decided by microphoneFailureHandling
- Listen for the microphone_available event to receive the diagnostic result

Performance impact:

- The first check adds ~100-500ms (mostly getUserMedia)
- Results are cached for 5 seconds; repeated calls within that window take <1ms
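
The 5-second caching behavior can be pictured as a time-to-live wrapper around the check. This is a generic sketch, not the SDK's code: `cacheFor` is a hypothetical helper, and the clock is injectable so the behavior can be exercised without waiting.

```typescript
// Wraps compute() so that its result is reused for ttlMs milliseconds.
// compute may return a Promise, in which case the Promise itself is cached.
function cacheFor<T>(
    ttlMs: number,
    compute: () => T,
    now: () => number = Date.now
): () => T {
    let cached: {value: T; at: number} | undefined;
    return function cachedCompute(): T {
        const current = now();
        let entry = cached;
        if (entry === undefined || current - entry.at >= ttlMs) {
            entry = {value: compute(), at: current};
            cached = entry;
        }
        return entry.value;
    };
}
```

Wrapping an expensive device check with `cacheFor(5000, ...)` gives the same cost profile described above: one slow call, then near-instant repeats until the entry expires.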

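Recommended scenarios:

- Scenarios with high user-experience requirements (problems are caught early)
- Scenarios that need precise error messages (distinguishing "no device", "permission denied", and "device in use")

Example:

const controller = await createAaaSPilotKit({
    checkAudioDeviceBeforeStart: true, // check automatically before start (recommended)
    // ... other options
});

// Listen for the check result
// (showPermissionGuide and showHTTPSWarning are app-defined helpers, not SDK APIs)
controller.emitter.on('microphone_available', (result) => {
    if (!result.available) {
        console.error('Device check failed:', result.userMessage);

        // Offer a remedy based on the error type
        if (result.error === 'PERMISSION_DENIED') {
            showPermissionGuide();
        }
        else if (result.error === 'HTTPS_REQUIRED') {
            showHTTPSWarning();
        }
    }
});

Related event: microphone_available - receives the check result
Related method: checkAudioDevice() - triggers a device check manually

`microphoneFailureHandling` ('error' | 'warn' | 'silent' | 'prompt')

🎙️ [Optional] Strategy when the microphone check fails - the behavior when checkAudioDeviceBeforeStart=true and the microphone check fails.

Default: 'error'

Strategies:

- 'error' (default) → throws AsrInitializationError and aborts startup
- 'warn' → logs a warning to the console, disables voice input, and continues startup (text-only mode)
- 'silent' → silently degrades to text-only mode without logging a warning (events are still emitted)
- 'prompt' → calls the onMicrophoneCheckFailed callback so the developer can implement custom interaction logic

When to use each:

- 'error' → scenarios that strictly require voice (such as voice customer service)
- 'warn' → voice is optional and text input is an acceptable fallback
- 'silent' → degrade automatically without disturbing the user
- 'prompt' → a custom dialog must be shown for the user to confirm

Notes:

- After degrading, users can still enter text through the input(text) method
- The system emits the device_check_completed and microphone_available events
- The ASR service is disabled automatically (controller.asrService.disabled = true)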
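
The four strategies map to three broad outcomes, which can be sketched as a small dispatcher. This is an illustration of the documented behavior only; `outcomeFor` and the outcome labels are hypothetical and not part of the SDK.

```typescript
type FailureHandling = 'error' | 'warn' | 'silent' | 'prompt';
type FailureOutcome = 'abort' | 'text-only' | 'ask-user';

// Maps a strategy to what happens after a failed microphone check.
function outcomeFor(
    strategy: FailureHandling,
    warn: (message: string) => void = console.warn
): FailureOutcome {
    switch (strategy) {
        case 'error':
            return 'abort'; // startup stops with an initialization error
        case 'warn':
            warn('Microphone unavailable, continuing in text-only mode');
            return 'text-only';
        case 'silent':
            return 'text-only'; // same degradation, but no console warning
        case 'prompt':
            return 'ask-user'; // the decision is delegated to a callback
    }
}
```

Only 'warn' produces console output; 'warn' and 'silent' otherwise end in the same text-only state.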

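Example:

const controller = await createAaaSPilotKit({
    checkAudioDeviceBeforeStart: true,
    microphoneFailureHandling: 'warn', // warn but continue
    // ... other options
});

// Listen for microphone availability
// (showTextOnlyModeNotice is an app-defined helper, not an SDK API)
controller.emitter.on('microphone_available', (result) => {
    if (!result.available) {
        console.warn('Microphone unavailable; degraded to text-only mode');
        showTextOnlyModeNotice();
    }
});

Related options:

- checkAudioDeviceBeforeStart - check the audio device before start
- onMicrophoneCheckFailed - custom failure-handling callback

Related events:

- device_check_completed - device check completed
- microphone_available - microphone availability result

`onMicrophoneCheckFailed` (Function)

🔔 [Optional] Custom callback for a failed microphone check - only takes effect when microphoneFailureHandling='prompt'.

Type definition:

type OnMicrophoneCheckFailed = (
    result: IAudioDeviceCheckResult,
    continueCallback: () => void
) => void | Promise<void>

Parameters:

Parameter	Type	Description
`result`	`IAudioDeviceCheckResult`	Device check result, including error details and a user-friendly message
`continueCallback`	`() => void`	Callback that, when invoked, continues initialization (in no-microphone mode)

Typical uses:

- Show a custom confirmation dialog
- Offer "continue" or "cancel" choices
- Record the user's decision for analytics

Full example:

import {createAaaSPilotKit} from '@bdky/aaas-pilot-kit';

const controller = await createAaaSPilotKit({
    checkAudioDeviceBeforeStart: true,
    microphoneFailureHandling: 'prompt',
    onMicrophoneCheckFailed: async (result, continueCallback) => {
        // Show a custom dialog (showDialog is an app-defined helper, not an SDK API)
        const userChoice = await showDialog({
            title: 'Microphone unavailable',
            message: result.userMessage,
            buttons: [
                {text: 'Continue (text only)', value: 'continue'},
                {text: 'Cancel', value: 'cancel'}
            ]
        });

        if (userChoice === 'continue') {
            console.log('User chose to degrade to text-only mode');
            continueCallback(); // continue initialization
        } else {
            console.log('User cancelled initialization');
            // continueCallback is not called, so initialization fails
            throw new Error('User cancelled initialization');
        }
    },
    // ... other options
});

Error handling:

- If the callback throws, initialization fails
- If continueCallback is never called, initialization waits (asynchronous callback) or fails (synchronous callback)
- If the callback fails to execute, a warning is logged to the console and false is returned

Notes:

- Only takes effect when microphoneFailureHandling='prompt'
- The callback may be synchronous or asynchronous
- continueCallback() must be called for initialization to continue

Related options:

- microphoneFailureHandling - failure-handling strategy
- checkAudioDeviceBeforeStart - check the device before start

`env` ('development' | 'sandbox' | 'production')

🌐 Runtime environment.

'development' → development and debugging (full logging)
'sandbox' → sandbox testing (simulates production)
'production' → production (default; best performance)

`enableDebugMode` (boolean)

🐞 Enable debug mode - outputs logs for the full pipeline (ASR/Agent/TTS/rendering).

Recommended during development and debugging; must be turned off in production

`hotWordReplacementRules` (ReplacementRule[])

🧩 Hot-word correction rules for speech recognition (regular-expression replacement).

Used to correct proper nouns, brand names, and similar terms that ASR tends to misrecognize.

// Example
hotWordReplacementRules: [
    {pattern: /客悦\s*one/gi, replacement: '客悦·ONE'},
    {pattern: /A I/g, replacement: 'AI'}
]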
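
How such rules act on a recognized sentence can be sketched with plain String.replace. This assumes rules are applied in array order, which the original does not state; `ReplacementRuleSketch` and `applyReplacementRules` are hypothetical names.

```typescript
// Hypothetical rule shape mirroring the example above.
interface ReplacementRuleSketch {
    pattern: RegExp;
    replacement: string;
}

// Applies each rule in order to the recognized text.
function applyReplacementRules(text: string, rules: ReplacementRuleSketch[]): string {
    return rules.reduce(
        (current, rule) => current.replace(rule.pattern, rule.replacement),
        text
    );
}

const sampleRules: ReplacementRuleSketch[] = [
    {pattern: /客悦\s*one/gi, replacement: '客悦·ONE'},
    {pattern: /A I/g, replacement: 'AI'}
];

console.log(applyReplacementRules('欢迎使用客悦 one 的 A I 能力', sampleRules));
// → "欢迎使用客悦·ONE 的 AI 能力"
```

Patterns without the g flag replace only the first match, so the flag matters for terms that may occur more than once in an utterance.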

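`speechFormatters` / `conversationFormatters` (TFormatter[])

📝 Text formatting functions.

speechFormatters: format text that comes from speech input
conversationFormatters: format message content

`agentService` (Newable<BaseAgentService>)

🧠 Custom Agent service class (must extend BaseAgentService).

Used to integrate a private Agent streaming API protocol. ⚠️ Once this is set, agentConfig no longer takes effect.

Implementation guide: Custom AgentService configuration

`inactivityPrompt` (string)

⏸️ Inactivity prompt - after the user has been silent past the timeout, the digital employee proactively announces this reminder.

Example: "您还在吗？我可以继续为您服务~" (asks whether the user is still there and offers to keep helping)

`autoChromaKey` (boolean)

🟢 Whether to enable green-screen keying automatically (only effective when rendererMode='cloud').

Default: true → removes the background automatically so the figure blends into your page
false → keeps the original background (suited to assets that already have an alpha channel)

Configuration Example

import {createAaaSPilotKit, Language} from '@bdky/aaas-pilot-kit';

// IOptions is the SDK's options type; its import is not shown in the original example.
const options: IOptions = {
    // Required options
    token: 'your-auth-token-here',
    figureId: '209337',
    ttsPer: 'LITE_audiobook_female_1',
    agentConfig: {
        token: 'your-agent-token',
        robotId: 'your-robot-id'
    },

    // Optional options
    locale: Language.ENGLISH,  // unified language setting (UI + speech)
    ttsSample: 16000,
    rendererMode: 'cloud',
    timeoutSec: 60,
    speechSpeed: 6,
    interruptible: true,
    prologue: '您好,我是您的数字员工,有什么可以帮您?',
    // asrVad lives under asr.config (the top-level asrVad is deprecated since v1.2.0)
    asr: {provider: 'baidu', config: {asrVad: 600}},
    env: 'production',
    enableDebugMode: false,
    autoChromaKey: true,
    inactivityPrompt: '您长时间未讲话,我先挂断啦~',

    // Hot-word replacement rules
    hotWordReplacementRules: [
        {pattern: /客悦\s*one/gi, replacement: '客悦·ONE'},
        {pattern: /A I/g, replacement: 'AI'}
    ]
};

const controller = createAaaSPilotKit(options);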
