Event System

AaaS Pilot Kit is built on an event-driven architecture, providing rich event listening capabilities through controller.emitter.

Lifecycle Events

ready

🚀 [One-time] Digital Employee initialization complete, safe to call other APIs - Triggered when the Digital Employee avatar is loaded and ASR is initialized.

Best Practice: Only call playXXX / input methods after the 'ready' event to avoid race conditions.

Payload: No parameters

controller.emitter.on('ready', () => {
  console.log('Digital human is ready');
  controller.input('Hello!');
});

error

[Must Listen] System error occurred - Covers full-chain exceptions including ASR / Agent / TTS / Network / Rendering.

🚨 Must monitor and handle in production! Recommended to report to monitoring system + provide user-friendly prompts

Payload: IErrorEventPayload containing:

  • code: Unique error identifier, such as 'ASR_TIMEOUT', 'AGENT_UNAUTHORIZED'
  • message: Human-readable error description, can be used for logging or user prompts
  • stack?: Error stack trace (recommended to keep in development, optional to report in production)
  • actionRequired: Action required by user or developer, such as "Please check microphone permissions", "Try again later"
  • severity: 'low' | 'medium' | 'high' | 'critical' - Used for alert classification
  • metadata: Error context (time, session ID, user ID, service module, etc., for tracking)
  • originalError?: Original Error object (for internal debugging, can be ignored in production)

Response Recommendations:

  • low/medium → Log + local notification
  • high/critical → Interrupt flow + popup prompt + report to operations

controller.emitter.on('error', (error) => {
  console.error('System error:', error);

  if (error.severity === 'critical') {
    showErrorDialog(error.message);
    reportToMonitoring(error);
  }
});

Conversation Events

conversation_add

💬 [High Frequency] New conversation message (user or AI) - Triggered whenever new conversation content is generated (ASR recognition result / Agent reply).

Use Cases: Chat history display, logging, data analysis

Payload: AnyConversation - Contains role (client/aiWorker), content, timestamp, etc.

controller.emitter.on('conversation_add', (conversation) => {
  console.log('New conversation:', conversation);
  addToChatHistory(conversation);
});

conversation_change

📝 [High Frequency] Current conversation fragment status update (streaming broadcast in progress) - Used for real-time updates of "typing..." animations or typewriter effects.

Use Cases: UI real-time rendering of streaming replies, progress indication

Payload: Object containing the following fields:

  • text: string
  • type: 'client' | 'aiWorker'
  • id: string
  • completed: boolean - true indicates this fragment has finished broadcasting (bubble rendering complete)

controller.emitter.on('conversation_change', (update) => {
  if (update.completed) {
    console.log('Fragment completed:', update.text);
  } else {
    console.log('Streaming update:', update.text);
  }
});

reply_start

Agent starts replying (Digital Human service processing complete) - Triggered when the Digital Human service has received the streaming text, completed preprocessing (TTS/action synthesis), and is ready to start broadcasting.

Use Cases: Track first-word latency, debug performance

Payload: IReplyStartEventPayload containing:

  • agentResponse: Agent's response object
  • latency?: Digital Human service processing time (milliseconds), time from receiving text stream to ready to broadcast (excluding LLM time)
  • requestId: Tracking ID for this request

controller.emitter.on('reply_start', (payload) => {
  console.log('Ready to broadcast, Latency:', payload.latency, 'ms');

  // Can analyze full-chain latency with the ttft event:
  // LLM time ≈ ttft.totalLatency - payload.latency
});
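
The comment above can be made concrete by pairing the two events. A minimal sketch, assuming reply_start and ttft fire for the same conversation round and both latency fields are populated (a production version should correlate the two events by requestId / queryId):

let pipelineLatency: number | undefined;

controller.emitter.on('reply_start', (payload) => {
  // Digital Human preprocessing time for this round (excludes LLM time)
  pipelineLatency = payload.latency;
});

controller.emitter.on('ttft', (payload) => {
  if (payload.totalLatency !== undefined && pipelineLatency !== undefined) {
    // Rough full-chain split: LLM time ≈ total latency - Digital Human preprocessing
    const llmLatency = payload.totalLatency - pipelineLatency;
    console.log(`LLM ≈ ${llmLatency}ms, Digital Human ≈ ${pipelineLatency}ms`);
  }
});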

ttft

⏱️ Agent first-token latency performance metric - Triggered when first Agent response content is received, containing performance data for the first character generated by Agent.

Use Cases: Performance monitoring, first-token latency analysis.

Payload: ITtftEventPayload containing:

  • firstToken: First character returned by Agent.
  • timestamp: Timestamp of first character return.
  • sessionId: Current session ID.
  • queryId?: Current round query ID.
  • totalLatency?: Total latency from user input to receiving first character (milliseconds), calculated and populated by Controller layer.

controller.emitter.on('ttft', (payload) => {
  console.log('Agent first token:', payload.firstToken);
  console.log('First token timestamp:', new Date(payload.timestamp));
  if (payload.totalLatency) {
    console.log('Total latency:', payload.totalLatency, 'ms');
  }
});

conversation_end

[Conversation Complete] Digital Employee completes one full reply - Triggered when AIWorkerConversationBean completes broadcasting and renders all content.

Trigger Timing:

  • Digital Employee's one complete reply (including all streaming fragments) completes broadcasting
  • All multimodal content (text, images, videos) complete rendering

Use Cases:

  • Know when the AI has "finished speaking" so the next operation can proceed
  • Track complete duration for each round of conversation
  • Show feedback buttons after conversation ends ("Satisfied"/"Dissatisfied")
  • Record complete conversation content for analysis

Payload: IConversationEndPayload containing:

  • id: string - Unique conversation identifier
  • text: string - Complete conversation text content
  • contents: IContent[] - All content included in the conversation (text, images, videos, etc.)
  • timestamp: number - Conversation end timestamp (milliseconds)

controller.emitter.on('conversation_end', (payload) => {
  console.log('Digital Employee reply completed:', payload.text);

  // Show feedback buttons
  showFeedbackButtons(payload.id);

  // Record conversation data
  saveConversation({
    id: payload.id,
    message: payload.text,
    contentCount: payload.contents.length,
    timestamp: payload.timestamp
  });

  // Count multimodal content
  const multimodalStats = payload.contents.reduce((acc, content) => {
    acc[content.type] = (acc[content.type] || 0) + 1;
    return acc;
  }, {} as Record<string, number>);

  console.log('This round includes:', multimodalStats);
  // For example: {text: 3, image: 1, video: 0}
});

Coordination with other events:

  • reply_start: AI starts replying
  • conversation_change: Streaming content updating
  • conversation_add: New conversation message
  • conversation_end: AI reply complete ← This event

Complete Conversation Flow Example:

let replyStartTime: number;

// 1. AI starts replying
controller.emitter.on('reply_start', () => {
  replyStartTime = Date.now();
  showTypingIndicator();
});

// 2. Streaming update
controller.emitter.on('conversation_change', (update) => {
  if (!update.completed) {
    updateTypingText(update.text);
  }
});

// 3. Conversation complete
controller.emitter.on('conversation_end', (payload) => {
  const duration = Date.now() - replyStartTime;
  console.log(`Complete reply duration: ${duration}ms`);

  hideTypingIndicator();
  showFeedbackButton();
});

Important Notes:

  • This event is only triggered when AIWorkerConversationBean (Digital Employee message) completes
  • ClientConversationBean (user message) will not trigger this event
  • If the conversation is interrupted (interrupt()), this event may not trigger (see the sketch below)
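
Because an interrupted reply may never emit conversation_end, any UI that waits for it should also reset on interrupt. A minimal sketch (hideTypingIndicator is the same hypothetical UI helper used above):

let awaitingReplyEnd = false;

controller.emitter.on('reply_start', () => {
  awaitingReplyEnd = true;
});

controller.emitter.on('conversation_end', () => {
  awaitingReplyEnd = false;
  hideTypingIndicator();
});

controller.emitter.on('interrupt', () => {
  // conversation_end may never arrive for this round, so reset the UI here too
  if (awaitingReplyEnd) {
    awaitingReplyEnd = false;
    hideTypingIndicator();
  }
});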

Speech Recognition Events

asr_start

🎙️ ASR audio capture starts (user starts speaking) - Usually triggered after VAD detects voice activity.

Use Cases: Show "Listening..." animation, start interruption detection, etc.

Payload: No parameters

controller.emitter.on('asr_start', () => {
  showListeningAnimation();
});

asr_message

🎙️ [High Frequency Streaming] ASR real-time recognition results (sentence-by-sentence/word-by-word push) - Continuously triggered during user speaking, used for implementing real-time subtitles, voice preview, hot word highlighting, etc.

Trigger Timing:

  • Triggered whenever speech recognition engine outputs an "intermediate fragment" or "final sentence"
  • Driven jointly by VAD (Voice Activity Detection) and semantic sentence-breaking strategies

Use Cases:

  • Real-time display "User is saying: xxx..."
  • Hot word replacement/sensitive word marking (with opts.hotWordReplacementRules)
  • Interruption detection (when completed=false but the pause is too long, an interruption can be anticipated)

Payload: IAsrMessageEventPayload containing:

  • text: Currently recognized text fragment (may be incomplete)
  • completed: true=sentence recognition complete (can be sent to Agent), false=still speaking (intermediate result)
  • id: Unique identifier for this voice fragment (can be used for deduplication or tracking)
  • sessionId: Current session ID (used for multi-conversation isolation)

⚠️ Note:

  • Intermediate results (completed=false) may be overwritten by subsequent results, do not use directly for business logic
  • Only text with completed=true should be submitted to Agent or recorded to conversation history

controller.emitter.on('asr_message', (payload) => {
  if (payload.completed) {
    console.log('Final recognition:', payload.text);
    sendToAgent(payload.text);
  } else {
    console.log('Intermediate result:', payload.text);
    showPreview(payload.text);
  }
});

microphone_available

🎙️ [Device Detection] Microphone device availability check result - Triggered after audio device detection, providing user-friendly error prompts.

Trigger Timing:

  • Manually calling controller.checkAudioDevice()
  • Automatic detection before ASR starts (requires the checkAudioDeviceBeforeStart: true configuration)

Use Cases:

  • Display a real-time device status prompt: "Microphone not available, please check device"
  • Show targeted solutions based on different error types
  • Collect device statistics (device count, permission status, etc.)

Payload: Object containing the following fields:

  • available: boolean - Whether the device is available
  • error?: AudioDeviceError - Error type (only present when available=false). Possible values: BROWSER_NOT_SUPPORTED, HTTPS_REQUIRED, DEVICE_ENUMERATION_FAILED, PERMISSION_DENIED, DEVICE_NOT_READABLE, DEVICE_CHECK_TIMEOUT
  • userMessage?: string - User-friendly error prompt text (can be used directly for UI display)
  • devices?: MediaDeviceInfo[] - Detected audio device list
  • permissionState?: PermissionState - Permission status ('granted' | 'denied' | 'prompt')

⚠️ Important Notes:

  • userMessage is an optimized user prompt that can be used directly for UI display
  • If available=true, error and userMessage are undefined
  • permissionState may be undefined (Safari doesn't support Permissions API)

Example Code:

// Listen to device check result
controller.emitter.on('microphone_available', (result) => {
  if (!result.available) {
    // Display user-friendly error message directly
    alert(result.userMessage);

    // Or use Toast/Modal and other UI components
    showToast({
      type: 'error',
      message: result.userMessage
    });

    // Special handling based on error type
    if (result.error === 'PERMISSION_DENIED') {
      showPermissionGuide(); // Guide user to enable permission
    } else if (result.error === 'HTTPS_REQUIRED') {
      showHTTPSWarning(); // Prompt need for HTTPS
    }
  } else {
    // Device available
    console.log('Microphone available, found', result.devices?.length, 'audio devices');
    console.log('Permission status:', result.permissionState);
  }
});

// Manually trigger device check
await controller.checkAudioDevice();

Error Type Details: See FAQ - Error Code Table for error codes 3100-3105

Related Configuration: checkAudioDeviceBeforeStart - Automatically detect device before ASR starts

Related Method: checkAudioDevice() - Manually trigger device check

device_check_completed

🔍 [Device Detection] Audio device check completed - Low-level device check completion event, containing complete diagnostic data.

Trigger Timing:

  • When manual controller.checkAudioDevice() call completes
  • When automatic detection before ASR start completes (requires checkAudioDeviceBeforeStart: true)

Difference from microphone_available:

  • device_check_completed: Low-level completion event, contains more technical diagnostic data
  • microphone_available: User-facing availability result event

Use Cases:

  • Technical diagnostics and debugging
  • Record detailed device check logs
  • Monitor device check performance metrics

Payload: IDeviceCheckCompletedPayload - Contains fields similar to microphone_available, but may include additional diagnostic information

controller.emitter.on('device_check_completed', (result) => {
  console.log('Device check completed:', {
    available: result.available,
    error: result.error,
    deviceCount: result.devices?.length,
    timestamp: Date.now()
  });

  // Send to analytics system
  analytics.track('audio_device_check', {
    success: result.available,
    error_type: result.error,
    device_count: result.devices?.length
  });
});

Related Configuration: checkAudioDeviceBeforeStart - Automatically detect device before ASR starts

Related Method: checkAudioDevice() - Manually trigger device check

Related Event: microphone_available - User-friendly availability result

Rendering Broadcast Events

render_start

🎬 [High Frequency] Digital Employee starts broadcasting a text segment - Triggered each time a sentence/fragment starts playing.

Use Cases: Sync broadcast subtitles, analytics tracking of broadcast content

Payload: Object containing text, sessionId, timestamp

controller.emitter.on('render_start', (payload) => {
  showSubtitle(payload.text);
  logPlaybackEvent(payload);
});

is_rendering_change

🖼️ [State Sync] Digital Employee rendering/broadcasting state change - Reflects in real-time whether "currently speaking".

Core Use Cases:

  • Control UI: Show "Broadcasting..." loading state, disable send button
  • Experience optimization: Avoid user interruption causing fragmented experience during broadcast

Payload: boolean

  • true → Digital Human is rendering lip movements + playing audio (broadcasting)
  • false → Broadcast complete, idle and interactive (safe to call input or perform the next action at this point)

⚠️ Note: This state only reflects "whether Digital Human is broadcasting", not whether ASR or Agent is working

controller.emitter.on('is_rendering_change', (isRendering) => {
  if (isRendering) {
    showPlayingState();
    disableSendButton();
  } else {
    hidePlayingState();
    enableSendButton();
  }
});

Control Events

mute

🔇 Mute state change - Triggered when calling .mute(true/false).

Use Cases: Sync UI mute button state

Payload: boolean - true=muted, false=unmuted

controller.emitter.on('mute', (isMuted) => {
  console.log('Mute state:', isMuted);
  updateMuteButton(isMuted);
});

interrupt

🛑 Conversation interrupted (user triggered or system triggered) - Triggered when calling .interrupt() or voice interruption occurs.

Use Cases: Stop animations, clear input box, reset state

Payload: IInterruptEventPayload containing optional requestId, queryId, sessionId

controller.emitter.on('interrupt', (payload) => {
  stopAllAnimations();
  clearInputBox();
  resetConversationState();
});

Session Management Events

inactivity

⏸️ [Low Frequency] User inactive for a long time (triggers opts.inactivityPrompt) - Triggered after the configured inactivity period; the Digital Employee will broadcast a reminder.

Use Cases: Retain users, prompt before automatically ending session

Payload: Object containing requestId

controller.emitter.on('inactivity', (payload) => {
  console.log('User inactive, broadcasting prompt');
  showInactivityWarning();
});

busy

🚧 [Low Frequency] System busy (resource limit reached) - Digital Human concurrent connection limit / Agent QPS / RTC bandwidth exceeded.

⚠️ Should guide users to "try again later" or "contact customer service"

Payload: IBusyEventPayload

controller.emitter.on('busy', (payload) => {
  showBusyMessage('System currently busy, please try again later');
});

Complete Error Handling Example

Always listen to the error event in production:

controller.emitter.on('error', (error) => {
  // Log error
  console.error('AaaS Pilot Kit Error:', {
    code: error.code,
    message: error.message,
    severity: error.severity,
    metadata: error.metadata
  });

  // Handle based on severity
  switch (error.severity) {
    case 'low':
    case 'medium':
      // Show minor notification
      showToast(error.message);
      break;

    case 'high':
    case 'critical':
      // Show obvious error dialog
      showErrorDialog({
        title: 'Service Error',
        message: error.actionRequired || error.message,
        action: 'Retry'
      });

      // Report to monitoring
      reportError(error);
      break;
  }
});

Real-time Subtitle Implementation Example

Combine multiple events to implement complete real-time subtitle functionality:

// For final subtitles
controller.emitter.on('conversation_add', (payload) => {
  // Create new reply bubble
  createNewResponseBubble(payload.text);
});

// For real-time typewriter effect of Digital Employee broadcast subtitles
controller.emitter.on('conversation_change', (update) => {
  updateResponseBubbleTypingText(update.text, update.completed);
});

// User speech real-time display
controller.emitter.on('asr_message', (payload) => {
  if (payload.completed) {
    showUserSpeech(payload.text);
  } else {
    showUserSpeechPreview(payload.text);
  }
});

Event Listening Best Practices

  1. Must Listen Events: ready, error
  2. Recommended Events: conversation_add, is_rendering_change
  3. On-Demand Events: Select other events based on specific functional requirements
  4. Remember to Remove Listeners: calling dispose() on component destruction automatically removes event listeners, avoiding memory leaks

// Remove all event listeners
controller.emitter.clearListeners();

// Or remove a specific event listener
controller.emitter.off('ready', readyHandler);
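
For components that register many handlers, one pattern is to collect unsubscribe callbacks and run them all on teardown. A minimal sketch, assuming only the emitter.on / emitter.off signatures shown above:

const subscriptions: Array<() => void> = [];

function listen(event: string, handler: (...args: any[]) => void) {
  controller.emitter.on(event, handler);
  // Remember how to detach this exact handler later
  subscriptions.push(() => controller.emitter.off(event, handler));
}

listen('ready', () => console.log('ready'));
listen('error', (e) => console.error('error', e));

// On component teardown:
subscriptions.forEach((unsubscribe) => unsubscribe());
subscriptions.length = 0;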