Creating My First AI Agent With Vicuna and Langchain
If you want to create an AI agent that writes and executes Python code based on a prompt, you can follow the steps described in this article. The process involves using open-source tools and resources, and it can be a fun experience.
Tools Used
- Vicuna service
- Langchain
- FastChat library
Accessing these models is fairly easy. You need two models: the llama-7b weights converted to the Hugging Face format, and the Vicuna weights that are applied on top of them. You can then follow the FastChat documentation on GitHub for the conversion steps. I'd also suggest installing the FastChat library to resolve the dependencies for you.
For an AI agent, it's important that we can pass stop tokens and detect them during generation, stopping it and returning what has been produced so far. This is the key to an interactive flow for an AI agent.
To do this, you could implement a fork of the API server provided by FastChat. One important base function is called generate_stream. Once this is sorted, you can start a FastAPI server with a single endpoint.
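As a rough sketch of what that fork might look like, the snippet below wraps generate_stream behind a single FastAPI endpoint that accepts a prompt and a stop token. The /prompt route, the model path, and the exact generate_stream signature are assumptions based on an early FastChat release; the function has changed between versions, so check the one you have installed.

```python
# Sketch of a single-endpoint FastAPI wrapper around FastChat's generate_stream.
# The generate_stream signature below follows an early FastChat release and may
# differ in newer versions.
from typing import Optional

import torch
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer
from fastchat.serve.inference import generate_stream

MODEL_PATH = "path/to/vicuna-7b"  # placeholder: your converted Vicuna weights
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH).to(DEVICE)

app = FastAPI()

class PromptRequest(BaseModel):
    prompt: str
    stop: Optional[str] = None       # stop token passed in by the agent
    temperature: float = 0.7
    max_new_tokens: int = 512

@app.post("/prompt")
def prompt(request: PromptRequest):
    params = {
        "prompt": request.prompt,
        "temperature": request.temperature,
        "max_new_tokens": request.max_new_tokens,
        "stop": request.stop,        # generation halts once this string appears
    }
    output = ""
    # generate_stream yields progressively longer outputs; keep the last one.
    for partial in generate_stream(model, tokenizer, params, DEVICE):
        # Depending on the FastChat version, each item is either the cumulative
        # string or a dict with a "text" key.
        output = partial["text"] if isinstance(partial, dict) else partial
    return {"response": output}
```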
Creating a ReAct Agent
To create a ReAct agent, you need a custom LLM that uses your Vicuna service. This is a rather straightforward process, and you can follow the documentation that I used or just copy my code.
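If you'd rather see the shape of it, here is a minimal sketch of such a custom LLM. It assumes the /prompt endpoint from the server sketch above and the LLM base class exposed by the LangChain versions available at the time (langchain.llms.base.LLM); newer releases have moved that interface, so treat the import path as an assumption.

```python
# Sketch of a custom LangChain LLM that calls the local Vicuna endpoint.
# The base-class import path matches older LangChain releases.
from typing import Any, List, Mapping, Optional

import requests
from langchain.llms.base import LLM

VICUNA_URL = "http://localhost:8000/prompt"  # hypothetical local server URL

class VicunaLLM(LLM):
    @property
    def _llm_type(self) -> str:
        return "vicuna"

    def _call(self, prompt: str, stop: Optional[List[str]] = None, **kwargs: Any) -> str:
        # Forward the prompt and the first stop token to the inference server.
        payload = {"prompt": prompt, "stop": stop[0] if stop else None}
        response = requests.post(VICUNA_URL, json=payload, timeout=300)
        response.raise_for_status()
        return response.json()["response"]

    @property
    def _identifying_params(self) -> Mapping[str, Any]:
        return {"endpoint": VICUNA_URL}
```

The important detail is forwarding the stop list that the ReAct agent passes in, so the server can cut generation at the agent's stop token and hand control back to the agent loop.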
The final step to connect everything together is rather simple. You just import the langchain library components and the VicunaLLM we created earlier. At this point, you should also have the local inference server running.
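Concretely, the wiring could look roughly like this: a zero-shot ReAct agent with a Python REPL tool built around the VicunaLLM from the previous step. The import paths and the example prompt are assumptions based on the LangChain releases current at the time; PythonREPL in particular has since moved to langchain_experimental in newer versions.

```python
# Sketch of the final wiring: a ReAct agent that can write and execute Python.
from langchain.agents import AgentType, Tool, initialize_agent
from langchain.utilities import PythonREPL

llm = VicunaLLM()  # the custom LLM defined earlier; the local server must be running

python_repl = PythonREPL()
tools = [
    Tool(
        name="Python REPL",
        func=python_repl.run,
        description="Executes Python code and returns whatever it prints.",
    )
]

agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)

agent.run("Write Python code that prints the first ten Fibonacci numbers, then run it.")
```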
Although it took much longer than just writing the script myself, the process was actually very fun! It is indeed possible to develop LLM AI agents with open-source tools, although the experience also shows that the currently available models are not so straightforward to use; we probably need these models fine-tuned for this specific use case to achieve satisfactory results.