Welcome to Hera's site

Keye Video Understanding Platform – A Multimodal AI Interaction Tool for Kuaishou

Hey Everyone 👋🏻 Let's have a look at my latest project at Kuaishou!

Project Overview

Keye is a large multimodal AI model developed in-house by Kuaishou. It is capable of visual perception (image and video understanding), language generation (copywriting, translation), and logical reasoning (math, coding).

This project aimed to design and build the Keye platform (v1.0) as an internal tool for Kuaishou teams. The platform supports uploading, browsing, and managing Kuaishou in-app videos and enables real-time AI interaction to analyze videos and extract key information.

The platform was designed with a simple, intuitive interface and efficient interaction flow, minimizing the learning curve while ensuring quick AI feedback to enhance productivity.

Product Statement and Goals

Kuaishou teams have a growing need for efficient video management and deeper content understanding. The Keye platform is expected to provide:

AI-powered Conversational Understanding

enabling users to interact with the video through real-time Q&A and receive intelligent, context-aware responses.

Seamless Video Upload and Management

allowing users to easily upload, view, delete, and replace videos using a video ID.

Low Learning Cost with Intuitive UI

a clean and minimal interface that requires little to no onboarding for new users.

Identified key workflows

1. Video Upload

2. Video Replacement

3. AI Q&A Interaction

4. Conversation History

Product Features

1. Video Upload and Management

Single video upload by entering a video ID (PID)

Display of detailed video information after upload

Ability to delete or replace uploaded videos
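
To make the upload, delete, and replace flow above more concrete, here is a minimal sketch of how it could be modeled. This is not the platform's actual implementation: the `Video` fields, the `VideoLibrary` class, and the `catalog` lookup are all hypothetical names used for illustration, and everything lives in memory.

```python
from dataclasses import dataclass


@dataclass
class Video:
    pid: str         # video ID entered by the user
    title: str       # hypothetical detail field shown after upload
    duration_s: int  # hypothetical detail field, in seconds


class VideoLibrary:
    """In-memory stand-in for the platform's video list (illustrative only)."""

    def __init__(self, catalog: dict[str, Video]):
        # `catalog` is an assumed lookup of in-app videos keyed by PID.
        self._catalog = catalog
        self._uploaded: dict[str, Video] = {}

    def upload(self, pid: str) -> Video:
        """Add a single video by PID and return its details for display."""
        if pid not in self._catalog:
            raise KeyError(f"unknown PID: {pid}")
        video = self._catalog[pid]
        self._uploaded[pid] = video
        return video

    def delete(self, pid: str) -> None:
        """Remove an uploaded video; raises KeyError if it is not uploaded."""
        del self._uploaded[pid]

    def replace(self, old_pid: str, new_pid: str) -> Video:
        """Swap an uploaded video for another one identified by its PID."""
        # Validate both IDs first so a failed replace changes nothing.
        if old_pid not in self._uploaded:
            raise KeyError(f"not uploaded: {old_pid}")
        if new_pid not in self._catalog:
            raise KeyError(f"unknown PID: {new_pid}")
        del self._uploaded[old_pid]
        return self.upload(new_pid)

    def list_pids(self) -> list[str]:
        """Return the PIDs of all uploaded videos, in insertion order."""
        return list(self._uploaded)
```

Checking both IDs before removing anything keeps a failed replacement from leaving the user without their original video.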

2. Video Browsing

View uploaded videos directly on the platform

Easily switch between videos to initiate new conversations

3. AI Interaction for Video Understanding

Real-time conversational Q&A about the uploaded video

Prompt suggestions based on keywords to guide user input

Support for Like/Dislike, Copy, and Regenerate response actions

Each new conversation records the current video information
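
The interaction features above could be represented with a small data model like the one below. It is only a sketch under assumptions: the `Conversation` and `Message` classes are hypothetical, and `model` is a placeholder callable standing in for whatever interface the Keye model actually exposes.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Assumed shape of the model call: (video PID, question) -> answer text.
ModelFn = Callable[[str, str], str]


@dataclass
class Message:
    question: str
    answer: str
    feedback: Optional[str] = None  # "like", "dislike", or None


@dataclass
class Conversation:
    video_pid: str  # video information recorded when the conversation starts
    messages: list[Message] = field(default_factory=list)

    def ask(self, question: str, model: ModelFn) -> Message:
        """Send a question about the recorded video and store the reply."""
        msg = Message(question, model(self.video_pid, question))
        self.messages.append(msg)
        return msg

    def regenerate(self, index: int, model: ModelFn) -> Message:
        """Ask the same question again and overwrite the stored answer."""
        msg = self.messages[index]
        msg.answer = model(self.video_pid, msg.question)
        msg.feedback = None  # earlier feedback referred to the old answer
        return msg

    def rate(self, index: int, feedback: str) -> None:
        """Record a Like or Dislike on one response."""
        if feedback not in ("like", "dislike"):
            raise ValueError(f"unsupported feedback: {feedback}")
        self.messages[index].feedback = feedback
```

Storing the PID on the conversation itself means that switching to another video later does not change which video an older conversation refers to.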

Next Iteration: Structural Video Understanding

Following the successful launch of Keye v1.0 with AI-based conversational video understanding, the next version focuses on structural video understanding to provide deeper insights and content validation.

New Page

Automatic Tag Extraction

Originality Check

Related Video Recommendations

Results

The clean, minimal interface let users start using the platform without formal training and improved collaboration.

Real-time AI responses significantly reduced the time spent on manual video analysis and content review.

Internal users reported higher productivity and found the platform particularly valuable for quick decision-making and content evaluation.