Enhancing Video Content Metadata Using AI-Generated Summaries
Our organization faced the challenge of managing vast video archives with minimal metadata, making content difficult to search and access. The presented solution transcribes and summarizes video files using advanced Natural Language Processing (NLP) tools, turning hours of footage into concise, searchable metadata. Key takeaways include significant time savings—reducing months of manual work to days—and improved accessibility of video archives. The implementation’s robust error handling, seamless integration, and adaptability make it valuable for a range of applications beyond our initial use case.
Background Information
In the realm of digital preservation, the accurate and detailed description of stored content is crucial for effective retrieval and management. Preservica, a widely used digital preservation system, has been used to ingest terabytes of video files in our organization that currently lack descriptive metadata beyond their titles. This limitation hampers the ability to search and utilize these video files effectively, further complicating the management and retrieval of valuable information. This problem necessitates a solution that can automatically generate and update descriptive metadata for the video files. To address this, we developed an automated solution that employs a multi-step process for summarizing video files and updating their metadata within Preservica.
The Workflow Process Entails the Following Steps:
- Transcription of video files using OpenAI Whisper.
- Tokenization of transcripts with Tiktoken.
- Chunking transcripts if they exceed OpenAI ChatGPT-4o’s token limit.
- Generation of summaries for each chunk and a final summary via the OpenAI ChatGPT-4o API.
- Upload of AI-generated summaries to Preservica using the pyPreservica library.
Key Takeaways
- Reduces manual effort from months to days.
- Greatly improves accessibility and searchability of video archives.
- Integrates seamlessly with existing systems.
- Adaptable for various domains (e.g., interviews, education, etc.).
- Scalable and efficient for managing large volumes of video content.
Listen to the poster presentation by clicking on this video.