10 Creative Ways to Monitor Cache Hits in Deepseek Provider

Published on Fri Jan 10 2025

Deepseek provider isn't exposing caching values in usage · Issue ...


Monitoring Cache Hits

Two new fields in the API response's usage section help users monitor cache performance:

  • prompt_cache_hit_tokens: The number of input tokens that were served from the cache, billed at $0.014 per million tokens
  • prompt_cache_miss_tokens: The number of input tokens that were not served from the cache, billed at $0.14 per million tokens
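
Given those two fields and the per-million-token prices above, a small helper can turn the raw usage record into a cache hit rate and an input cost estimate. This is a minimal sketch; the function and interface names are hypothetical, while the field names and prices come from the issue:

```ts
// Hypothetical helper; field names and per-million-token prices are from the issue above.
interface DeepseekUsage {
  prompt_cache_hit_tokens: number;
  prompt_cache_miss_tokens: number;
}

function summarizeCacheUsage(usage: DeepseekUsage) {
  const hits = usage.prompt_cache_hit_tokens;
  const misses = usage.prompt_cache_miss_tokens;
  const total = hits + misses;
  // Cache hit rate as a fraction of all input tokens.
  const hitRate = total > 0 ? hits / total : 0;
  // Input cost in USD: $0.014 per million cached tokens, $0.14 per million uncached tokens.
  const inputCostUsd = (hits * 0.014 + misses * 0.14) / 1_000_000;
  return { hitRate, inputCostUsd };
}

// Example: 90,000 of 100,000 input tokens served from the cache.
console.log(summarizeCacheUsage({
  prompt_cache_hit_tokens: 90_000,
  prompt_cache_miss_tokens: 10_000,
}));
// => { hitRate: 0.9, inputCostUsd: 0.00266 }
```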

This is most likely related to the OpenAI-compatible package, which isn't spreading the usage record through to the caller.
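
Until the provider surfaces these values, one workaround is to call DeepSeek's OpenAI-compatible endpoint directly and read the cache fields from the raw response. The sketch below assumes the standard /chat/completions route on api.deepseek.com, the deepseek-chat model, and a DEEPSEEK_API_KEY environment variable; adjust as needed:

```ts
// Workaround sketch: inspect the raw usage object, which carries the cache
// fields even when a wrapper library drops them.
async function chatWithCacheStats(prompt: string): Promise<string> {
  const res = await fetch("https://api.deepseek.com/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.DEEPSEEK_API_KEY}`,
    },
    body: JSON.stringify({
      model: "deepseek-chat",
      messages: [{ role: "user", content: prompt }],
    }),
  });
  const data = await res.json();

  // Assumed response shape: usage.prompt_cache_hit_tokens / prompt_cache_miss_tokens,
  // as described in the issue.
  const { prompt_cache_hit_tokens, prompt_cache_miss_tokens } = data.usage;
  console.log({ prompt_cache_hit_tokens, prompt_cache_miss_tokens });

  return data.choices[0].message.content;
}
```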

A maintainer replied on the issue: "Hi, thanks for the report, looking into support for this."
