Transform a Video into a Document with ChatGPT, CLIP, BLIP2, GRIT, Whisper, and LangChain.
This project transforms a long video into a text document that captures both its visual and audio information. Once the video is in document form, we can easily chat over it.
To achieve this, we use several technologies such as ChatGPT, CLIP, BLIP2, GRIT, Whisper, and LangChain. These technologies work together to generate a comprehensive document that is easy to understand and use.
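As a rough illustration of the idea, the sketch below merges timestamped visual captions and audio transcript segments into a single text document. It is not the project's actual code: `Segment` and `build_video_document` are hypothetical names, the sample captions and transcript are made up, and the sketch assumes the captioning and speech-recognition steps have already produced timestamped text.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One timestamped piece of text (hypothetical structure, for illustration)."""
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    source: str   # "visual" (frame caption) or "audio" (speech transcript)
    text: str


def build_video_document(visual, audio):
    """Merge visual and audio segments into one time-ordered text document."""
    # Sort by start time; ties are broken alphabetically by source name,
    # so an "audio" segment precedes a "visual" one with the same start.
    merged = sorted(visual + audio, key=lambda s: (s.start, s.source))
    lines = [
        f"[{s.start:.1f}s - {s.end:.1f}s] ({s.source}) {s.text}"
        for s in merged
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample data standing in for model outputs.
    visual = [
        Segment(0.0, 5.0, "visual", "a man stands at a fruit stall"),
        Segment(5.0, 10.0, "visual", "a man holds a watermelon"),
    ]
    audio = [Segment(2.0, 4.5, "audio", "How much is this melon?")]
    print(build_video_document(visual, audio))
```

A plain-text document of this kind can then be passed to a chat model as context, which is what makes chatting over the video possible.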
To get started with this project, follow the installation instructions in the install.md file. Once installed, you can generate a video document, which is saved as a log file such as examples/buy_watermelon.log.
If you have any suggestions or features that you would like to see in this codebase, please feel free to drop us an email or open an issue on GitHub.
This project is based on ChatGPT, BLIP2, GRIT, KTS, Whisper, LangChain, and Image2Paragraph.