
Extracting Provider Metrics

To send accurate telemetry to the OTM API, you need to grab real-time metrics (like prompt tokens, completion tokens, and timing) directly from your LLM provider’s response.

Every SDK hides these metrics in slightly different places. This guide shows you exactly where to find them so you don’t have to spend hours digging through documentation or console logging response objects.

Pro Tip: Always capture the startTime right before you make the API call and, if you're streaming, firstTokenTime when you receive the first chunk. This ensures your TTFT (Time To First Token) metrics are spot on.
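That timing pattern can be sketched as a small helper. It works over any async iterable of chunks; the `measureStream` name and return shape are illustrative, not part of any SDK:

```javascript
// Illustrative timing capture for a streamed response.
// `stream` can be any async iterable of chunks.
async function measureStream(stream) {
  const startTime = Date.now(); // capture immediately before iterating
  let firstTokenTime = null;
  const chunks = [];

  for await (const chunk of stream) {
    // The first chunk to arrive marks time-to-first-token.
    if (firstTokenTime === null) firstTokenTime = Date.now();
    chunks.push(chunk);
  }

  const endTime = Date.now();
  return {
    chunks,
    ttftMs: firstTokenTime === null ? null : firstTokenTime - startTime,
    totalMs: endTime - startTime,
  };
}
```

Because the helper only iterates, you can wrap any provider's stream with it without touching the request itself.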

OpenAI

OpenAI makes it relatively easy to get token usage, provided you aren't streaming. For streaming, request usage with stream_options: { include_usage: true } and read it from the final chunk; every other chunk reports usage as null.

const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});
 
// Here is the gold mine:
const { prompt_tokens, completion_tokens, total_tokens } = response.usage;
 
console.log(`Used ${prompt_tokens} input and ${completion_tokens} output tokens.`);
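For the streaming case, here's a sketch of collecting both the text and the final-chunk usage. It assumes the request was created with stream_options: { include_usage: true }; the `consumeStream` helper name is ours, not the SDK's:

```javascript
// Sketch: drain a streamed Chat Completions response and keep its usage.
// Assumes the request set stream_options: { include_usage: true }, so the
// final chunk carries a non-null `usage` object (and an empty choices array).
async function consumeStream(stream) {
  let text = "";
  let usage = null;

  for await (const chunk of stream) {
    // Optional chaining handles the final usage-only chunk, which has no choices.
    text += chunk.choices[0]?.delta?.content ?? "";
    if (chunk.usage) usage = chunk.usage; // only the last chunk has this
  }
  return { text, usage };
}
```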

Anthropic

Anthropic calls their usage fields input_tokens and output_tokens. If you use prompt caching, they also provide cache_creation_input_tokens and cache_read_input_tokens.

const message = await anthropic.messages.create({
  model: "claude-3-5-sonnet-20240620",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hi Claude!" }],
});
 
// Standard usage extraction
const { input_tokens, output_tokens } = message.usage;
 
// If you use caching, keep an eye on these too:
const cacheRead = message.usage.cache_read_input_tokens ?? 0;
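If you report a single input total, you may want to fold the cache fields in. The sketch below assumes input_tokens excludes cached tokens (worth verifying against Anthropic's current docs); the `totalInputTokens` helper is illustrative:

```javascript
// Sketch: collapse an Anthropic usage object into one input-token total.
// Assumption: input_tokens does NOT already include the cache fields.
function totalInputTokens(usage) {
  return (
    (usage.input_tokens ?? 0) +
    (usage.cache_creation_input_tokens ?? 0) +
    (usage.cache_read_input_tokens ?? 0)
  );
}
```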

Google Gemini

Google wraps usage in a usageMetadata object (camelCase in the JS/Go SDKs, snake_case — usage_metadata — in Python).

const result = await model.generateContent("Explain telemetry.");
const response = await result.response;
 
// Gemini calls them promptTokenCount and candidatesTokenCount
const { promptTokenCount, candidatesTokenCount } = response.usageMetadata;
 
console.log(`Prompt: ${promptTokenCount}, Output: ${candidatesTokenCount}`);
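Since Gemini's field names differ from the other providers, a tiny adapter keeps your telemetry code uniform. The `toTokenCounts` name and output shape here are illustrative; Gemini also exposes a totalTokenCount field, which the sketch falls back to computing if absent:

```javascript
// Sketch: map Gemini's usageMetadata fields onto the input/output naming
// used by the other providers above.
function toTokenCounts(usageMetadata) {
  const input = usageMetadata.promptTokenCount ?? 0;
  const output = usageMetadata.candidatesTokenCount ?? 0;
  return {
    input,
    output,
    total: usageMetadata.totalTokenCount ?? input + output,
  };
}
```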

Next Steps

Now that you’ve got the raw numbers, you’re ready to format them for the Telemetry Usage API. Just map these values to the tokens.input and tokens.output fields in your payload, and you’re good to go!
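That mapping might look like the sketch below. Only tokens.input and tokens.output come from this guide; the surrounding payload fields (model, durationMs) and the `buildTelemetryPayload` helper are illustrative:

```javascript
// Sketch: shape any provider's raw counts into a telemetry payload.
function buildTelemetryPayload({ model, inputTokens, outputTokens, durationMs }) {
  return {
    model,
    durationMs,
    tokens: {
      input: inputTokens,   // prompt_tokens / input_tokens / promptTokenCount
      output: outputTokens, // completion_tokens / output_tokens / candidatesTokenCount
    },
  };
}
```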