FAQ
General Questions
What is the Woven City AI Vision Engine?
The Woven City AI Vision Engine is a multimodal large-scale foundation AI designed for video analysis. It interprets visual scenes by identifying objects, understanding behaviors and situations, and providing contextual insights through text-based responses.
What is the model parameter size?
8 billion parameters.
System Requirements & Setup
What are the system requirements?
See the Introduction page for detailed system requirements including GPU specifications, memory requirements, and supported operating systems.
How do I activate my license?
Follow the License Activation guide for step-by-step instructions on activating your software license using CodeMeter.
Can I use the SDK without a GUI on Ubuntu Server?
Yes. The License Activation page includes instructions for headless mode using XQuartz for X11 forwarding from a Mac client.
The SDK fails with "LICENSE_NOT_AVAILABLE" (Error 35) — what does it mean?
CodeMeter could not find a valid license for the SDK's tier. The message looks like:
    CodeMeter License: <FirmCode>:<ProductCode> : The Expiration Time is overrun - the en-/decryption cannot be operated, Error 35. (LICENSE_NOT_AVAILABLE)
Common causes:
- The license has expired. Check the Expiration Date with cmu --list-content.
- No license for your tier is activated in any connected CmContainer.
- The CodeMeter service is not running (check with sudo systemctl status codemeter).
Make sure that cmu --list-content shows a valid, non-expired license for your tier (see License Activation). For license renewal, please contact your sales representative.
API Usage
What are the minimum image size requirements?
Images should not be smaller than 448x448 pixels for optimal performance.
What is the difference between image_chat, rgb_image_chat, and rgb_images_chat?
- image_chat: Uses an image file path
- rgb_image_chat: Uses a single in-memory RGB NumPy array (HWC format)
- rgb_images_chat: Uses a list of in-memory RGB NumPy arrays
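A minimal pre-check for the in-memory variants, assuming only what the answers above state: arrays are in HWC layout with three RGB channels, and each side should be at least 448 pixels. The helper name `check_rgb_shape` is hypothetical and not part of the SDK. It takes a shape tuple (for example `array.shape`) so the sketch runs without NumPy installed.

```python
MIN_SIDE = 448  # recommended minimum side length, taken from the FAQ answer above


def check_rgb_shape(shape, min_side=MIN_SIDE):
    """Return a list of problems for an HWC RGB shape; an empty list means OK.

    Hypothetical helper for illustration, not an SDK function.
    """
    if len(shape) != 3:
        return [f"expected 3 dimensions (H, W, C), got {len(shape)}"]
    problems = []
    height, width, channels = shape
    if channels != 3:
        problems.append(f"expected 3 channels (RGB), got {channels}")
    if height < min_side or width < min_side:
        problems.append(
            f"{width}x{height} is below the recommended {min_side}x{min_side}"
        )
    return problems


print(check_rgb_shape((720, 1280, 3)))  # large enough, three channels
print(check_rgb_shape((224, 224, 3)))   # too small
```

For `rgb_images_chat`, the same check could be applied to each array in the list before the call.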
How do I maintain conversation history across multiple turns?
Set return_history=True and pass the returned chat_history to subsequent API calls.
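The multi-turn pattern might look like the sketch below. `StubEngine` is a hypothetical stand-in so the example runs without the SDK. Only the names `image_chat`, `return_history`, and `chat_history` come from this FAQ. The positional arguments, the history format, and the two-item return value are assumptions, so check the API reference for the real signature.

```python
class StubEngine:
    """Hypothetical stand-in for the SDK model object (not the real API)."""

    def image_chat(self, image_path, question, chat_history=None,
                   return_history=False):
        history = list(chat_history or [])          # copy; caller's list is untouched
        answer = f"stub answer {len(history) + 1}"  # placeholder for generated text
        history.append((question, answer))
        return (answer, history) if return_history else answer


engine = StubEngine()

# Turn 1: request the history alongside the answer.
answer1, history = engine.image_chat(
    "scene.jpg", "What is happening here?", return_history=True
)

# Turn 2: pass the history from turn 1 into the follow-up question.
answer2, history = engine.image_chat(
    "scene.jpg", "How many people are involved?",
    chat_history=history, return_history=True,
)

print(answer2)
print(len(history))
```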
What do the API methods return?
By default, only the generated text is returned. With return_history or return_sampling_info enabled, the function returns a tuple: (text, history, sampling_info). Unpack based on which options you enabled.
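Because the return value changes with the options, a small normalizing wrapper can keep calling code uniform. The sketch below is a hypothetical helper, not part of the SDK. It assumes the result is either a bare string, the full three-item tuple, or a two-item tuple whose second item belongs to the single option that was enabled. Which of these the SDK actually produces is not stated here, so verify against the API reference.

```python
def unpack_result(result, return_history=False, return_sampling_info=False):
    """Normalize a chat result to (text, history, sampling_info).

    Hypothetical helper; missing parts are returned as None.
    """
    if isinstance(result, str):
        return result, None, None          # default case: text only
    if len(result) == 3:
        return tuple(result)               # already (text, history, sampling_info)
    text, extra = result                   # two items: decide what `extra` is
    if return_history and not return_sampling_info:
        return text, extra, None
    if return_sampling_info and not return_history:
        return text, None, extra
    raise ValueError("cannot tell which option a 2-tuple belongs to")


text, history, info = unpack_result(("a car turns left", ["turn 1"]),
                                    return_history=True)
print(text, history, info)
```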
Can I use the same chat history across different videos or images?
Technically yes, but it's not recommended. Chat history is designed for multi-turn conversations about the same visual content. Using it across different content may lead to confusing or incorrect responses.
Sampling Methods
What is the difference between min_num_frames and max_num_frames?
These parameters control the frame count range when using "duration", "middle", or "random" sampling methods:
- min_num_frames: Minimum frames to sample (default: 64)
- max_num_frames: Maximum frames to sample (default: 512)
How can I see which frames were sampled?
Set return_sampling_info=True to receive detailed sampling metadata including frame indices and timestamps.
How does the SDK handle very short videos?
For very short videos with "duration" sampling, the frame count is derived from the video's duration and FPS, then rounded down to the nearest multiple of 8. For example, a 1-second video at 30 FPS will sample 24 frames.
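The rounding in that example can be reproduced with a few lines of arithmetic. This is a sketch of the stated rule only, not the SDK's implementation. The function name is hypothetical, and how the result interacts with min_num_frames is not covered here.

```python
def frames_for_short_video(duration_s, fps, multiple=8):
    """Round the available frame count down to a multiple of `multiple`."""
    raw = int(duration_s * fps)       # frames available in the clip
    return raw - (raw % multiple)     # drop the remainder


print(frames_for_short_video(1.0, 30))  # 30 available frames -> 24
```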
Generation Configuration
How do I control the randomness of generated text?
Use the generation_config parameter with temperature.
Lower temperature (e.g., 0.2) produces more deterministic output, while higher values (e.g., 0.8) increase creativity.
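To see why lower values are more deterministic, the sketch below applies temperature in the standard way, by dividing the scores (logits) by the temperature before a softmax. That the SDK follows this standard scheme internally is an assumption, and the logits are made-up values for three candidate tokens.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after scaling by 1/temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                           # subtract max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # illustrative scores, not SDK output
for t in (0.2, 0.8):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At 0.2 nearly all probability mass sits on the top-scoring token, while at 0.8 the alternatives keep a meaningful share, which is what makes sampled output more varied.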
What other generation parameters can I configure?
See the Generation Configuration page for details on top_k, top_p, max_new_tokens, do_sample, and other parameters.
Performance & Optimization
How can I improve inference speed?
- Reduce max_num_frames to process fewer frames
- Use appropriate sampling methods for your use case
- Choose shorter input videos when possible
- Warm up the model with dummy inferences for real-time applications
Does processing time increase with more frames?
Yes, processing time increases with the number of frames. Balance speed and output quality by selecting appropriate sampling strategies and frame limits.
Why is the first inference slower than subsequent ones?
The first inference includes initialization overhead (loading model weights, allocating GPU memory, and optimizing execution). For real-time applications, run a few dummy inferences to warm up the model after loading.
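A warm-up step can be as simple as the sketch below, which calls an inference function a few times and records how long each call took. `fake_infer` is a hypothetical stand-in for a real SDK call such as image_chat, and the number of runs is an arbitrary choice to tune for your hardware.

```python
import time


def warm_up(infer, dummy_input, runs=3):
    """Call infer(dummy_input) `runs` times; return per-run durations in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        infer(dummy_input)                       # result is discarded on purpose
        timings.append(time.perf_counter() - start)
    return timings


def fake_infer(_input):
    """Hypothetical stand-in for an SDK inference call."""
    return "ok"


timings = warm_up(fake_infer, dummy_input="dummy.jpg", runs=3)
print([f"{t:.6f}s" for t in timings])
```

With a real model, the first entry would be expected to be noticeably larger than the rest. Once the timings level off, the model is ready for latency-sensitive traffic.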
What is the inference speed?
Inference speed varies based on hardware configuration, input resolution, and sampling settings. Please contact your sales representative for detailed performance information.
Application Integration
Should I validate inputs before passing them to the SDK?
Yes, the SDK assumes inputs are already validated. Validate file paths, image arrays, and parameters before API calls to avoid errors.
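A minimal sketch of such a pre-flight check for file-based calls is shown below. The function name and the specific rules are assumptions for illustration. The defaults of 64 and 512 are the documented defaults for min_num_frames and max_num_frames.

```python
import os


def validate_request(video_path, min_num_frames=64, max_num_frames=512):
    """Return a list of error strings; an empty list means the inputs look sane.

    Hypothetical helper, not an SDK function.
    """
    errors = []
    if not isinstance(video_path, (str, os.PathLike)) or not os.path.isfile(video_path):
        errors.append(f"not an existing file: {video_path!r}")

    frames_ok = True
    for name, value in (("min_num_frames", min_num_frames),
                        ("max_num_frames", max_num_frames)):
        if not isinstance(value, int) or value <= 0:
            errors.append(f"{name} must be a positive integer, got {value!r}")
            frames_ok = False
    # Only compare the bounds when both are valid integers.
    if frames_ok and min_num_frames > max_num_frames:
        errors.append("min_num_frames must not exceed max_num_frames")
    return errors


print(validate_request("/no/such/video.mp4", min_num_frames=600))
```

Collecting every problem into a list, instead of raising on the first one, lets the caller report all issues at once.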
What is the maximum number of frames for inference?
The maximum number of frames depends on your GPU specifications. We recommend testing with your hardware to find the optimal limit. Note that processing more frames requires more GPU memory and increases inference time.
Why is latency high when integrating with other systems?
Image transfer between services is often the bottleneck. We recommend profiling data transfer times across your services. Consider compressing images or reducing image size to minimize transfer time. Test different compression levels and sizes to find the balance between transfer speed and quality that meets your business requirements.