FAQ

General Questions

What is the Woven City AI Vision Engine?

The Woven City AI Vision Engine is a multimodal large-scale foundation AI designed for video analysis. It interprets visual scenes by identifying objects, understanding behaviors and situations, and providing contextual insights through text-based responses.

System Requirements & Setup

What are the system requirements?

See the Introduction page for detailed system requirements including GPU specifications, memory requirements, and supported operating systems.

How do I activate my license?

Follow the License Activation guide for step-by-step instructions on activating your software license using CodeMeter.

Can I use the SDK without a GUI on Ubuntu Server?

Yes. The License Activation page includes instructions for headless mode using XQuartz for X11 forwarding from a Mac client.

The SDK fails with "LICENSE_NOT_AVAILABLE" (Error 35) — what does it mean?

CodeMeter could not find a valid license for the SDK's tier. The message looks like:

CodeMeter License: <FirmCode>:<ProductCode> : The Expiration Time is overrun
- the en-/decryption cannot be operated, Error 35. (LICENSE_NOT_AVAILABLE)

Common causes:

- The license has expired: check the Expiration Date with cmu --list-content.
- No license for your tier is activated in any connected CmContainer.
- The CodeMeter service is not running (check with sudo systemctl status codemeter).

To resolve, ensure the CodeMeter service is running and that cmu --list-content shows a valid, non-expired license for your tier (see License Activation). For license renewal, please contact your sales representative.
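
If your application needs to recognize this failure in logs or exception text, a small parser can pull out the error number and name. The following is a minimal sketch based only on the sample message above. The function name parse_codemeter_error is hypothetical, and how the SDK surfaces the message (exception, log line, or stderr) is an assumption you should verify.

```python
import re

# Pattern derived from the sample message above; the exact wording
# may differ between CodeMeter versions (assumption).
_ERROR_RE = re.compile(r"Error\s+(\d+)\.\s*\(([A-Z_]+)\)")


def parse_codemeter_error(message):
    """Return (code, name) if the text contains a CodeMeter-style error, else None."""
    match = _ERROR_RE.search(message)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


sample = (
    "CodeMeter License: <FirmCode>:<ProductCode> : The Expiration Time is overrun\n"
    "- the en-/decryption cannot be operated, Error 35. (LICENSE_NOT_AVAILABLE)"
)
print(parse_codemeter_error(sample))
```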

API Usage

What are the minimum image size requirements?

Images should not be smaller than 448x448 pixels for optimal performance.
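
A quick pre-check can flag undersized images before they reach the SDK. This is a minimal sketch, assuming the image shape is available in HWC order (for example, the shape attribute of a NumPy array). The helper name meets_min_size is hypothetical and not part of the SDK.

```python
MIN_SIDE = 448  # recommended minimum side length, taken from the answer above


def meets_min_size(shape, min_side=MIN_SIDE):
    """Check an HWC (height, width, channels) shape against the recommended minimum."""
    height, width = shape[0], shape[1]
    return height >= min_side and width >= min_side


print(meets_min_size((720, 1280, 3)))  # True
print(meets_min_size((448, 300, 3)))   # False: width is below 448
```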

What is the difference between image_chat, rgb_image_chat, and rgb_images_chat?

- image_chat: Uses an image file path
- rgb_image_chat: Uses a single in-memory RGB NumPy array (HWC format)
- rgb_images_chat: Uses a list of in-memory RGB NumPy arrays

How do I maintain conversation history across multiple turns?

Set return_history=True and pass the returned chat_history to subsequent API calls:

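# First turn: request the history along with the response.
response1, chat_history = client.video_chat(
    prompt=prompt1,
    video_path=video_path,
    return_history=True
)

# Second turn: pass the history back in and receive the updated history.
response2, chat_history = client.video_chat(
    prompt=prompt2,
    video_path=video_path,
    chat_history=chat_history,
    return_history=True
)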

What do the API methods return?

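By default, only the generated text is returned. With return_history or return_sampling_info enabled, the function returns a tuple instead, with elements in the order (text, history, sampling_info). Unpack according to the options you enabled. For example, with only return_history=True, the multi-turn example above unpacks two values: the text and the history.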
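
Because the return shape changes with the options, application code may prefer to normalize it once. The sketch below is one way to do that. It assumes the tuple either contains only the enabled extras in the order above or always has all three elements, and it handles both cases. The helper name normalize_result is hypothetical and not part of the SDK.

```python
def normalize_result(result, return_history=False, return_sampling_info=False):
    """Return (text, history, sampling_info), using None for anything not requested."""
    if not (return_history or return_sampling_info):
        # Default case: the call returned only the generated text.
        return result, None, None
    if len(result) == 3:
        # Full tuple already: pass it through unchanged.
        return tuple(result)
    # Assumption: only the enabled extras are present, in (history, sampling_info) order.
    extras = list(result[1:])
    history = extras.pop(0) if return_history else None
    sampling_info = extras.pop(0) if return_sampling_info else None
    return result[0], history, sampling_info


print(normalize_result("a red car"))
print(normalize_result(("a red car", ["turn 1"]), return_history=True))
```

With this helper, downstream code can always unpack three values regardless of which flags were passed to the API call.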

Can I use the same chat history across different videos or images?

Technically yes, but it's not recommended. Chat history is designed for multi-turn conversations about the same visual content. Using it across different content may lead to confusing or incorrect responses.

Sampling Methods

What is the difference between min_num_frames and max_num_frames?

These parameters control the frame count range when using "duration", "middle", or "random" sampling methods:

- min_num_frames: Minimum frames to sample (default: 64)
- max_num_frames: Maximum frames to sample (default: 512)

The actual number sampled depends on video duration and the selected method.

How can I see which frames were sampled?

Set return_sampling_info=True to receive detailed sampling metadata including frame indices and timestamps.

How does the SDK handle very short videos?

For very short videos with "duration" sampling, the frame count is derived from the video's duration and FPS, then rounded down to the nearest multiple of 8. For example, a 1-second video at 30 FPS contains 30 frames, so 24 frames are sampled.
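
The arithmetic can be sketched as follows. This is an illustration that reproduces the worked example, not the SDK's actual implementation, and the function name short_video_frame_count is hypothetical.

```python
def short_video_frame_count(duration_s, fps, multiple=8):
    """Round the available frame count down to the nearest multiple of `multiple`."""
    total_frames = int(duration_s * fps)
    return (total_frames // multiple) * multiple


print(short_video_frame_count(1.0, 30))  # 24 (30 frames rounded down to a multiple of 8)
```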

Generation Configuration

How do I control the randomness of generated text?

Use the generation_config parameter with temperature:

generation_config = {"temperature": 0.7}
response = client.video_chat(
    prompt=prompt,
    video_path=video_path,
    generation_config=generation_config
)

Lower temperature (e.g., 0.2) produces more deterministic output, while higher values (e.g., 0.8) produce more varied, creative output.
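
To see why temperature has this effect, the sketch below applies standard temperature scaling to a toy set of token scores before a softmax. This is the general technique used in text generation. Whether the engine applies it in exactly this form is an assumption, and the scores are made up for illustration.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # illustrative scores for three candidate tokens
for t in (0.2, 0.8):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the lower temperature, nearly all of the probability mass sits on the top-scoring token, so sampling almost always picks it. At the higher temperature, the other tokens keep a meaningful share, so outputs vary more between runs.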

What other generation parameters can I configure?

See the Generation Configuration page for details on top_k, top_p, max_new_tokens, do_sample, and other parameters.

Performance & Optimization

How can I improve inference speed?

- Reduce max_num_frames to process fewer frames
- Use appropriate sampling methods for your use case
- Choose shorter input videos when possible
- Warm up the model with dummy inferences for real-time applications

Does processing time increase with more frames?

Yes, processing time increases with the number of frames. Balance speed and output quality by selecting appropriate sampling strategies and frame limits.

Why is the first inference slower than subsequent ones?

The first inference includes initialization overhead (loading model weights, allocating GPU memory, and optimizing execution). For real-time applications, run a few dummy inferences to warm up the model after loading.
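
A warm-up routine can be as simple as calling the inference function a few times and recording how long each call takes. The sketch below accepts any zero-argument callable, so it runs without the SDK. The helper name warm_up and the default of three runs are assumptions, not SDK recommendations.

```python
import time


def warm_up(infer, runs=3):
    """Call `infer` several times and return each call's wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        infer()
        durations.append(time.perf_counter() - start)
    return durations


# Stand-in workload; in an application this would wrap a real call such as
# lambda: client.video_chat(prompt=prompt, video_path=video_path)
durations = warm_up(lambda: sum(range(10_000)))
print([round(d, 6) for d in durations])
```

Comparing the first duration with the later ones shows how much of the latency is one-time initialization.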

What is the inference speed?

Inference speed varies based on hardware configuration, input resolution, and sampling settings. Please contact your sales representative for detailed performance information.

Application Integration

Should I validate inputs before passing them to the SDK?

Yes, the SDK assumes inputs are already validated. Validate file paths, image arrays, and parameters before API calls to avoid errors.
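
As a minimal sketch of such validation, the function below checks a video path and the frame-count parameters before a call. The specific rules and the name validate_video_request are assumptions for illustration. The default values mirror the documented defaults for min_num_frames and max_num_frames.

```python
import os


def validate_video_request(video_path, min_num_frames=64, max_num_frames=512):
    """Return a list of problems found; an empty list means the inputs look usable."""
    problems = []
    if not os.path.isfile(video_path):
        problems.append(f"video file not found: {video_path}")
    if min_num_frames < 1:
        problems.append("min_num_frames must be at least 1")
    if max_num_frames < min_num_frames:
        problems.append("max_num_frames must be >= min_num_frames")
    return problems


print(validate_video_request("/no/such/video.mp4", min_num_frames=10, max_num_frames=5))
```

Returning a list of problems, rather than raising on the first one, lets the caller report every issue at once.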

What is the maximum number of frames for inference?

The maximum number of frames depends on your GPU specifications. We recommend testing with your hardware to find the optimal limit. Note that processing more frames requires more GPU memory and increases inference time.

Why is latency high when integrating with other systems?

Image transfer between services is often the bottleneck. We recommend profiling data transfer times across your services. Consider compressing images or reducing image size to minimize transfer time. Test different compression levels and sizes to find the balance between transfer speed and quality that meets your business requirements.
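
One lightweight way to profile where time goes is to wrap each stage in a timing context manager. The sketch below uses only the standard library. The stage names and the stand-in workloads are placeholders for your own encoding and transfer code, and timed is a hypothetical helper.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(stage, store):
    """Accumulate the wall-clock time spent inside the block under store[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        store[stage] = store.get(stage, 0.0) + (time.perf_counter() - start)


timings = {}
with timed("encode", timings):
    payload = bytes(1024)  # stand-in for compressing or encoding an image
with timed("transfer", timings):
    time.sleep(0.01)  # stand-in for sending the payload to another service

for stage, seconds in timings.items():
    print(f"{stage}: {seconds:.4f} s")
```

Running the same measurement at different compression levels and image sizes gives the data needed to choose a setting.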
