Hi,
I have a llama3-70b-001 model deployed to Vertex AI via the Model Garden. I want to get predictions via the REST API from a Node.js application.
Here's the request I am making:
const response = await fetch(
  `https://${region}-aiplatform.googleapis.com/v1/projects/${project}/locations/${region}/endpoints/${endpoint}:predict`,
  {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${token}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      instances: [
        {
          prompt: 'You are a career advisor. Give me 10 tips for a good CV.',
        },
      ],
      parameters: {
        max_output_tokens: maxTokens,
        temperature,
      },
    }),
    cache: 'no-store',
  },
);
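For completeness, here's roughly how the variables above are populated; the project and endpoint values are placeholders, and the token comes from google-auth-library using Application Default Credentials:

// Sketch of how the variables used in the request are set up.
import { GoogleAuth } from 'google-auth-library';

const auth = new GoogleAuth({
  scopes: 'https://www.googleapis.com/auth/cloud-platform',
});
const client = await auth.getClient();
const { token } = await client.getAccessToken();

const project = 'my-project';   // placeholder: GCP project ID
const region = 'us-west4';      // region where the endpoint is deployed
const endpoint = '1234567890';  // placeholder: numeric endpoint ID
const maxTokens = 1024;         // passed as max_output_tokens
const temperature = 0.2;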
Here's the response I am getting:
{
  predictions: [
    'Prompt:\n' +
      'You are a career advisor. Give me 10 tips for a good CV.\n' +
      'Output:\n' +
      ' _Use the phrases in the box_.\n' +
      '\\begin{tabular}{l'
  ],
  deployedModelId: <redacted>,
  model: <redacted>,
  modelDisplayName: 'llama3-70b-001',
  modelVersionId: '1'
}
I have a couple of questions:

1. Why does the model echo the prompt back and then produce output that has nothing to do with it (a LaTeX table fragment instead of CV tips)?
2. How am I supposed to format the prompt and pass parameters for these models? I have tried with llama-3-70b-chat-001 as well, with similar results. The documentation on how to pass parameters to specific models is lacking, or at least I couldn't find it. My best guess at the chat prompt format is sketched below.
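In case it's relevant, my guess is that the chat variant expects Meta's Llama 3 instruct template with its special tokens, roughly as below; I don't know whether the Vertex AI endpoint applies this template itself or expects it verbatim in the prompt field:

// My attempt at Meta's Llama 3 instruct prompt template. I'm not sure
// whether Vertex AI applies this automatically or wants it spelled out
// in the `prompt` field.
const prompt = [
  '<|begin_of_text|><|start_header_id|>system<|end_header_id|>',
  '',
  'You are a career advisor.<|eot_id|><|start_header_id|>user<|end_header_id|>',
  '',
  'Give me 10 tips for a good CV.<|eot_id|><|start_header_id|>assistant<|end_header_id|>',
  '',
].join('\n');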
Thanks!