Repo: Meta AI (Llama), published Apr 2, 2025

meta-llama/llama-api-typescript

TypeScript

Description: The official Typescript library for the Llama API

Language: TypeScript

License: MIT

Stars: 37

Forks: 13

Open issues: 14

Created: 2025-04-02T23:05:49Z

Pushed: 2026-05-27T15:57:19Z

Default branch: main

Fork: no

Archived: no

README:

Llama API Client TypeScript API Library

[NPM version](https://npmjs.org/package/llama-api-client)

This library provides convenient access to the Llama API Client REST API from server-side TypeScript or JavaScript.

The REST API documentation can be found on llama.developer.meta.com. The full API of this library can be found in [api.md](api.md).

It is generated with Stainless.

Installation

npm install llama-api-client

Usage

The full API of this library can be found in [api.md](api.md).

import LlamaAPIClient from 'llama-api-client';

const client = new LlamaAPIClient({
  apiKey: process.env['LLAMA_API_KEY'], // This is the default and can be omitted
});

const createChatCompletionResponse = await client.chat.completions.create({
  messages: [{ content: 'string', role: 'user' }],
  model: 'model',
});

console.log(createChatCompletionResponse.completion_message);

Streaming responses

We provide support for streaming responses using Server-Sent Events (SSE).

import LlamaAPIClient from 'llama-api-client';

const client = new LlamaAPIClient();

const stream = await client.chat.completions.create({
  messages: [{ content: 'string', role: 'user' }],
  model: 'model',
  stream: true,
});
for await (const createChatCompletionResponseStreamChunk of stream) {
  console.log(createChatCompletionResponseStreamChunk);
}

If you need to cancel a stream, you can break from the loop or call stream.controller.abort().
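The two cancellation paths can be sketched without a network call. This is a minimal stand-in, not the library's implementation: `fakeStream` is a hypothetical async generator playing the role of the SSE stream, and a standard `AbortController` plays the role of `stream.controller`.

```typescript
// Hypothetical stand-in for an SSE stream: yields chunks until the signal is aborted.
async function* fakeStream(chunks: string[], signal: AbortSignal): AsyncGenerator<string> {
  for (const chunk of chunks) {
    if (signal.aborted) return; // stop producing once the consumer has aborted
    yield chunk;
  }
}

// Path 1: stop early by breaking out of the loop.
async function readUntilBreak(chunks: string[], limit: number): Promise<string[]> {
  const controller = new AbortController();
  const seen: string[] = [];
  for await (const chunk of fakeStream(chunks, controller.signal)) {
    seen.push(chunk);
    if (seen.length >= limit) break; // leaving the loop ends iteration
  }
  return seen;
}

// Path 2: stop early by aborting the controller.
async function readUntilAbort(chunks: string[], limit: number): Promise<string[]> {
  const controller = new AbortController();
  const seen: string[] = [];
  for await (const chunk of fakeStream(chunks, controller.signal)) {
    seen.push(chunk);
    if (seen.length >= limit) controller.abort(); // generator returns on its next step
  }
  return seen;
}
```

Both helpers return the same prefix of chunks for the same `limit`. With the real client, the abort path is also usable from outside the loop, for example from a timer or a UI event handler.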

Request & Response types

This library includes TypeScript definitions for all request params and response fields. You may import and use them like so:

import LlamaAPIClient from 'llama-api-client';

const client = new LlamaAPIClient({
  apiKey: process.env['LLAMA_API_KEY'], // This is the default and can be omitted
});

const params: LlamaAPIClient.Chat.CompletionCreateParams = {
  messages: [{ content: 'string', role: 'user' }],
  model: 'model',
};
const createChatCompletionResponse: LlamaAPIClient.CreateChatCompletionResponse =
  await client.chat.completions.create(params);

Documentation for each method, request param, and response field is available in docstrings and will appear on hover in most modern editors.

File uploads

Request parameters that correspond to file uploads can be passed in many different forms:

  • File (or an object with the same structure)
  • a fetch Response (or an object with the same structure)
  • an fs.ReadStream
  • the return value of our toFile helper

import fs from 'fs';
import LlamaAPIClient, { toFile } from 'llama-api-client';

const client = new LlamaAPIClient();

// If you have access to Node `fs` we recommend using `fs.createReadStream()`:
await client.uploads.part('upload_id', { data: fs.createReadStream('/path/to/file') });

// Or if you have the web `File` API you can pass a `File` instance:
await client.uploads.part('upload_id', { data: new File(['my bytes'], 'file') });

// You can also pass a `fetch` `Response`:
await client.uploads.part('upload_id', { data: await fetch('https://somesite/file') });

// Finally, if none of the above are convenient, you can use our `toFile` helper:
await client.uploads.part('upload_id', { data: await toFile(Buffer.from('my bytes'), 'file') });
await client.uploads.part('upload_id', { data: await toFile(new Uint8Array([0, 1, 2]), 'file') });

Handling errors

When the library is unable to connect to the API, or if the API returns a non-success status code (i.e., 4xx or 5xx response), a subclass of APIError will be thrown:

const createChatCompletionResponse = await client.chat.completions
  .create({ messages: [{ content: 'string', role: 'user' }], model: 'model' })
  .catch(async (err) => {
    if (err instanceof LlamaAPIClient.APIError) {
      console.log(err.status); // 400
      console.log(err.name); // BadRequestError
      console.log(err.headers); // {server: 'nginx', ...}
    } else {
      throw err;
    }
  });

Error codes are as follows:

| Status Code | Error Type                 |
| ----------- | -------------------------- |
| 400         | BadRequestError            |
| 401         | AuthenticationError        |
| 403         | PermissionDeniedError      |
| 404         | NotFoundError              |
| 422         | UnprocessableEntityError   |
| 429         | RateLimitError             |
| >=500       | InternalServerError        |
| N/A         | APIConnectionError         |
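The table above can be read as a simple lookup. The following sketch encodes it as a standalone function for illustration; `errorTypeForStatus` is a hypothetical helper, not part of the library, and the fallback to the base `APIError` name for unlisted status codes is an assumption.

```typescript
// Hypothetical helper: maps an HTTP status to the error type name from the table.
// `undefined` models the "N/A" row, where no HTTP response was received.
function errorTypeForStatus(status: number | undefined): string {
  if (status === undefined) return 'APIConnectionError';
  if (status >= 500) return 'InternalServerError';
  const byStatus: Record<number, string> = {
    400: 'BadRequestError',
    401: 'AuthenticationError',
    403: 'PermissionDeniedError',
    404: 'NotFoundError',
    422: 'UnprocessableEntityError',
    429: 'RateLimitError',
  };
  // Assumption: codes not in the table surface as the generic base class.
  return byStatus[status] ?? 'APIError';
}
```

In application code the usual pattern is the reverse direction: check `err instanceof LlamaAPIClient.APIError` and branch on `err.status` or `err.name`.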

Retries

Certain errors will be automatically retried 2 times by default, with a short exponential backoff. Connection errors (for example, due to a network connectivity problem), 408 Request Timeout, 409 Conflict, 429 Rate Limit, and >=500 Internal errors will all be retried by default.
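As a rough illustration of the retry policy just described, the sketch below separates the two questions involved: whether a failure is retried, and how long to wait before the next attempt. Both functions are hypothetical; the base delay and cap in `backoffMs` are assumed values, since the text only says the backoff is short and exponential.

```typescript
// Statuses listed above as retried by default; `undefined` models a connection error.
function isRetriedByDefault(status: number | undefined): boolean {
  if (status === undefined) return true;
  return status === 408 || status === 409 || status === 429 || status >= 500;
}

// Hypothetical exponential backoff: the delay doubles per attempt, capped at maxMs.
// baseMs and maxMs are illustrative assumptions, not documented library values.
function backoffMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}
```

Note that 400-class errors other than 408, 409, and 429 are not retried under this policy, which matches the intuition that a malformed or unauthorized request will not succeed on a second try.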

You can use the maxRetries option to configure or disable this:

// Configure the default for all requests:
const client = new LlamaAPIClient({
  maxRetries: 0, // default is 2
});

// Or, configure per-request:
await client.chat.completions.create({ messages: [{ content: 'string', role: 'user' }], model: 'model' }, {
  maxRetries: 5,
});

Timeouts

Requests time out after 1 minute by default. You can configure this with a timeout option:

// Configure the default for all requests:
const client = new LlamaAPIClient({
  timeout: 20 * 1000, // 20 seconds (default is 1 minute)
});

// Override per-request:
await client.chat.completions.create({ messages: [{ content: 'string', role: 'user' }], model: 'model' }, {
  timeout: 5 * 1000,
});

On timeout, an APIConnectionTimeoutError is thrown.

Note that requests which time out will be [retried twice by default](#retries).
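Because timeouts and retries interact, it can help to estimate how long a call may block in the worst case. The sketch below assumes the timeout applies to each attempt separately and ignores the backoff pauses between attempts; `worstCaseWaitMs` is a hypothetical helper, not a library function.

```typescript
// Upper bound on time spent waiting for responses when every attempt times out.
// Assumption: the timeout applies per attempt; backoff pauses are not included.
function worstCaseWaitMs(timeoutMs: number, maxRetries: number): number {
  const attempts = maxRetries + 1; // the initial request plus each retry
  return timeoutMs * attempts;
}

// Defaults from the text: 1 minute timeout, 2 retries.
const defaultWorstCaseMs = worstCaseWaitMs(60 * 1000, 2); // 180000 ms, i.e. 3 minutes
```

If a caller needs a hard ceiling, lowering `timeout` and setting `maxRetries: 0` on that request makes the bound equal to the timeout itself.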

Advanced Usage

Accessing raw Response data (e.g., headers)

The "raw" Response returned by fetch() can be accessed through the .asResponse() method on the APIPromise type that all methods return. This method returns as soon as the headers for a successful response are received and does not consume the response body, so you are free to write custom parsing or streaming logic.

You can also use the .withResponse() method to get the raw Response along with the parsed data. Unlike .asResponse() this method consumes the body, returning once it is parsed.
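The difference between the two accessors comes down to whether the response body has been read. The sketch below demonstrates that distinction using only the standard fetch `Response` class, with no network call and no library code; the helper names and the `x-request-id` header are hypothetical.

```typescript
// Build a standard fetch Response locally; no network call is made.
function makeResponse(): Response {
  return new Response(JSON.stringify({ ok: true }), {
    status: 200,
    headers: { 'content-type': 'application/json', 'x-request-id': 'req_123' },
  });
}

// Headers-only access: the body stays unread, as the text describes for .asResponse().
function readHeaderOnly(res: Response): { requestId: string | null; bodyUsed: boolean } {
  return { requestId: res.headers.get('x-request-id'), bodyUsed: res.bodyUsed };
}

// Parsed access: reading JSON consumes the body, as the text describes for .withResponse().
async function readParsed(res: Response): Promise<{ data: unknown; bodyUsed: boolean }> {
  const data = await res.json();
  return { data, bodyUsed: res.bodyUsed };
}
```

A body can be read only once, so code that wants both custom parsing and the library's parsed result should pick one accessor per request rather than combining them.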

const client = new LlamaAPIClient();

const response = await client.chat.completions…

Excerpt shown — open the source for the full document.
