Welcome to Hera's site

Keye Video Understanding Platform – A Multimodal AI Interaction Tool for Kuaishou

Working on Keye gave me a unique opportunity to engage with a large-scale multimodal AI model

Project Overview

Keye is a large-scale multimodal AI model independently developed by Kuaishou, capable of video understanding and language generation.

My Goal

The platform supports uploading, browsing, and managing Kuaishou in-app videos and enables real-time AI interaction to analyze videos and extract key information.

My goal was to design a simple, intuitive interface and an efficient interaction flow, minimizing the learning curve while ensuring quick AI feedback to enhance productivity.

Product Statement

Kuaishou teams have a growing need for efficient video management and deeper content understanding. The Keye platform is expected to provide...

AI-powered Conversational Understanding

enabling users to interact with the video through real-time Q&A and receive intelligent, context-aware responses.

Seamless Video Upload and Management

allowing users to easily upload, view, delete, and replace videos using a video ID.

Low Learning Cost with Intuitive UI

a clean and minimal interface that requires little to no onboarding for new users.

Identified key workflows

1. Video Upload

2. Video Replacement

3. AI Q&A interaction

4. Conversation History

Product Features

1. Video Upload and Management

Single video upload by entering a video ID (PID)

Display of detailed video information after upload

Ability to delete or replace uploaded videos
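
The upload, delete, and replace actions above can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not Keye's actual implementation: `VideoInfo`, `VideoLibrary`, `fake_fetch`, and the `title` and `duration_s` fields are hypothetical names invented for the example, and the metadata lookup is stubbed out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoInfo:
    pid: str         # video ID (PID) entered by the user
    title: str       # hypothetical metadata field
    duration_s: int  # hypothetical metadata field


class VideoLibrary:
    """Hypothetical in-memory stand-in for the list of uploaded videos."""

    def __init__(self, fetch_info):
        # fetch_info is an assumed lookup function: PID -> VideoInfo
        self._fetch_info = fetch_info
        self._videos = {}

    def upload(self, pid):
        if pid in self._videos:
            raise ValueError(f"video {pid} is already uploaded")
        info = self._fetch_info(pid)
        self._videos[pid] = info
        return info  # detailed info shown to the user after upload

    def delete(self, pid):
        if pid not in self._videos:
            raise KeyError(pid)
        del self._videos[pid]

    def replace(self, old_pid, new_pid):
        if old_pid not in self._videos:
            raise KeyError(old_pid)
        if new_pid != old_pid and new_pid in self._videos:
            raise ValueError(f"video {new_pid} is already uploaded")
        # Fetch first so a failed lookup leaves the old video in place.
        info = self._fetch_info(new_pid)
        del self._videos[old_pid]
        self._videos[new_pid] = info
        return info

    def list_pids(self):
        return list(self._videos)


def fake_fetch(pid):
    # Stub standing in for a real metadata service.
    return VideoInfo(pid=pid, title=f"Video {pid}", duration_s=30)


library = VideoLibrary(fake_fetch)
library.upload("1001")
library.replace("1001", "2002")
```

Fetching the new video's details before removing the old one means a failed lookup never leaves the user without a video, which matches the low-friction replace flow described above.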

2. Video Browsing

View uploaded videos directly on the platform

Easily switch between videos to initiate new conversations


3. AI Interaction for Video Understanding

Real-time conversational Q&A about the uploaded video

Prompt suggestions based on keywords to guide user input

Support for Like/Dislike, Copy, and Regenerate response actions

Each new conversation records the current video information
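
The interaction features above (a conversation that records the current video information, plus Like/Dislike and Regenerate) could be modeled roughly as follows. This is a minimal sketch under stated assumptions rather than the platform's real data model: `Message`, `Conversation`, `start_conversation`, and `echo_model` are hypothetical names, and `answer_fn` stands in for whatever call actually reaches the Keye model.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Message:
    role: str                       # "user" or "assistant"
    text: str
    feedback: Optional[str] = None  # "like", "dislike", or None


@dataclass
class Conversation:
    video_snapshot: dict                   # video info copied at creation time
    answer_fn: Callable[[dict, str], str]  # hypothetical model call
    messages: List[Message] = field(default_factory=list)

    def ask(self, question):
        self.messages.append(Message("user", question))
        reply = Message("assistant", self.answer_fn(self.video_snapshot, question))
        self.messages.append(reply)
        return reply

    def regenerate(self):
        # Replace the last assistant reply with a fresh answer to the same question.
        if len(self.messages) < 2 or self.messages[-1].role != "assistant":
            raise ValueError("no assistant reply to regenerate")
        self.messages.pop()
        question = self.messages[-1].text
        reply = Message("assistant", self.answer_fn(self.video_snapshot, question))
        self.messages.append(reply)
        return reply

    def rate_last(self, feedback):
        if feedback not in ("like", "dislike"):
            raise ValueError("feedback must be 'like' or 'dislike'")
        if not self.messages or self.messages[-1].role != "assistant":
            raise ValueError("no assistant reply to rate")
        self.messages[-1].feedback = feedback


def start_conversation(video_info, answer_fn):
    # Copy the info so later changes to the video do not rewrite history.
    return Conversation(dict(video_info), answer_fn)


def echo_model(video, question):
    # Stub standing in for the real model.
    return f"[{video['pid']}] {question}"


video = {"pid": "1001", "title": "Demo"}
chat = start_conversation(video, echo_model)
chat.ask("What happens in this video?")
chat.rate_last("like")
video["title"] = "Renamed"  # the snapshot inside chat is unaffected
```

Storing a copy of the video information with each conversation keeps the history meaningful even after the video is replaced or deleted.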

Next Iteration: Structural Video Understanding

Following the successful launch of Keye v1.0 with AI-based conversational video understanding, the next version focuses on structural video understanding to provide deeper insights and content validation.

New Page

Automatic Tag Extraction

Originality Check

Related Video Recommendations

Keye Webpage for Product Presentation

To better showcase the platform internally and externally, I also designed and developed a dedicated webpage. The webpage serves as a centralized product presentation hub, enabling audiences to quickly understand the Keye team's contributions.

Reflections


Working on Keye gave me a unique opportunity to engage with a large-scale multimodal AI model and translate cutting-edge technology into practical user experiences.
This project deepened my understanding of how AI can be positioned not just as a backend system, but as an interactive assistant that enhances productivity.
Development faced significant resource constraints: there was no dedicated engineering team, and delivery was time-bound. Despite these limitations, I was able to demonstrate the potential of multimodal AI in video understanding.