Multiple Tool use Duplicates output_text - Bugs in Responses API
When using multiple tools for a single Responses API call, I encounter a concerning issue where I receive multiple 'end user' messages ('output_text') instead of an integrated response. This means that I may get a response based on web search followed by a completely different response based on file search. This unexpected behavior occurs both in the Playground, where both responses are displayed one after the other, and when using the Python Library.
Issue with OpenAI Injection
The root of this problem seems to lie in OpenAI injecting their own message related to how to reproduce information from news. The AI is influenced by this injected message, which leads to unexpected behavior in the responses provided. This can be mitigated by providing clearer instructions to the AI, guiding it on how to handle different types of searches and tools.

Handling Multiple Queries
If more specific instructions are provided, such as directing the AI to first search the web and then search files, the Responses API can iterate internally and make multiple tool calls to generate comprehensive output. The system is designed to accommodate such scenarios, allowing for seamless integration of different tools and queries without manual intervention.
Ensuring Control Over AI
To prevent the AI from operating beyond your control and generating content not authored by you, it is advisable to utilize your own API services for web search and RAG. By customizing the API usage, you can ensure that the AI follows your specified guidelines and does not deviate into unauthorized realms of content generation.

Integration Challenges
Despite some level of integration between the first and second responses, there remains a challenge in consolidating the details from different user-targeted messages into a coherent final output. The process flow of 'Web Search → User output 1 → File Search → Final Output' does not allow for simultaneous access to all information by the model, resulting in fragmented outputs.
Powered by Discourse, best viewed with JavaScript enabled