This guide explains how to perform Human-in-the-Loop (HITL) evaluations of your prompt’s performance.
Prompt
In this example, we have a simple prompt that asks the LLM to generate a joke. We want users to be able to evaluate the quality of the joke and provide feedback.
---
provider: Latitude
model: gpt-4o-mini
temperature: 0.7
---
Please tell me a joke about cats.
How does this work?
In this scenario, we use OpenAI’s API directly to run the prompt defined in Latitude. We retrieve the prompt using Latitude’s SDK, render it into OpenAI-compatible messages, and send those messages to the OpenAI API.
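As a minimal sketch of this step (the full, runnable script is under Code examples below), retrieving and rendering the prompt and sending the result to OpenAI looks like this:

import { Latitude, Adapters } from '@latitude-data/sdk'
import OpenAI from 'openai'

const sdk = new Latitude(process.env.LATITUDE_API_KEY, {
  projectId: Number(process.env.PROJECT_ID),
  versionUuid: 'live',
})
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY })

// Fetch the prompt and render it into OpenAI-compatible messages
const prompt = await sdk.prompts.get('annotate-log/example')
const { config, messages } = await sdk.prompts.render({
  prompt: { content: prompt.content },
  parameters: {},
  adapter: Adapters.openai,
})

// Send the rendered messages straight to the OpenAI API
const llmResponse = await openai.chat.completions.create({
  // @ts-ignore
  messages,
  model: config.model as string,
})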
Once the model completes its response, we upload a log to our prompt in Latitude, which you can view in the prompt’s logs section.
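Continuing the sketch above, uploading the log is a single SDK call; messages and llmResponse come from the previous step:

// Create a log for the prompt, attaching the text the model returned
const { uuid } = await sdk.logs.create('annotate-log/example', messages, {
  response: llmResponse.choices[0].message.content,
})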
Finally, we annotate the log with the feedback received from the user. In this case, the user rates the joke on a scale from 1 to 5 and provides a reason for their rating.
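To close the loop, we pass that feedback to the evaluation. In this sketch, userScore and userReason are placeholders for whatever values you collect from your user, and EVALUATION_UUID is the UUID of the HITL evaluation you created for this prompt:

// Annotate the log with the user's 1-5 score and their reasoning
// (userScore and userReason are hypothetical placeholders for the collected feedback)
const result = await sdk.evaluations.annotate(uuid, userScore, EVALUATION_UUID, {
  reason: userReason,
})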
You can learn more about HITL (Human-in-the-Loop) evaluations in our documentation.
Code examples
import { Latitude, Adapters } from '@latitude-data/sdk'
import OpenAI from 'openai'

// To run this example you need to create an evaluation on the prompt: `annotate-log/example`
// Info: https://docs.latitude.so/guides/evaluations/overview
const EVALUATION_UUID = 'YOUR_EVALUATION_UUID'

async function run() {
  const sdk = new Latitude(process.env.LATITUDE_API_KEY, {
    projectId: Number(process.env.PROJECT_ID),
    versionUuid: 'live',
  })
  const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY })

  // Get the prompt from Latitude
  const prompt = await sdk.prompts.get('annotate-log/example')

  // Generate messages from the Latitude prompt.
  // These messages are valid OpenAI messages because we pass Adapters.openai as the adapter.
  const { config, messages } = await sdk.prompts.render({
    prompt: { content: prompt.content },
    parameters: {},
    adapter: Adapters.openai,
  })

  // Call OpenAI with the rendered messages
  const llmResponse = await openai.chat.completions.create({
    // @ts-ignore
    messages,
    model: config.model as string,
  })

  // Upload a log for this prompt to Latitude, attaching the model's response
  const { uuid } = await sdk.logs.create('annotate-log/example', messages, {
    response: llmResponse.choices[0].message.content,
  })

  // Score from 1 to 5, matching the rating scale of the HITL evaluation we created
  // More info: https://docs.latitude.so/guides/evaluations/humans-in-the-loop
  const result = await sdk.evaluations.annotate(uuid, 5, EVALUATION_UUID, {
    reason: 'This is a good joke!',
  })

  console.log('Result:', JSON.stringify(result, null, 2))
}

run()
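To run the script, set LATITUDE_API_KEY, PROJECT_ID, and OPENAI_API_KEY in your environment, and replace YOUR_EVALUATION_UUID with the UUID of the evaluation you created for the annotate-log/example prompt.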