Embed Text
Usage
embed_ollama(
x,
base_url = "http://localhost:11434",
model = "snowflake-arctic-embed2:568m",
batch_size = 10L
)
embed_openai(
x,
model = "text-embedding-3-small",
base_url = "https://api.openai.com/v1",
api_key = get_envvar("OPENAI_API_KEY"),
dims = NULL,
user = get_user(),
batch_size = 20L
)
Arguments
- x
x can be:
  - A character vector, in which case a matrix of embeddings is returned.
  - A data frame with a column named text, in which case the data frame is returned with an additional column named embedding.
  - Missing or NULL, in which case a function is returned that can be called to get embeddings. This is a convenient way to partial in additional arguments like model, and is the most convenient way to produce a function that can be passed to the embed argument of ragnar_store_create() (see the sketch after this argument list).
- base_url
string; URL where the service is available.
- model
string; model name.
- batch_size
Integer; x is split into batches when embedding, with at most this many strings in a single request.
- api_key
string; resolved by default from the env var OPENAI_API_KEY.
- dims
An integer; can be used to truncate the embedding to a specific size.
- user
User name passed via the API.
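A minimal sketch of the missing-x form follows (not run; it assumes the ragnar package is loaded, an Ollama server at the default URL, and that the store location passed to ragnar_store_create() is writable; the file name is only illustrative):
# \dontrun{
library(ragnar)

# Leaving `x` missing returns an embedding function with these
# settings partialed in; calling it on text produces the embeddings.
embedder <- embed_ollama(model = "snowflake-arctic-embed2:568m", batch_size = 5L)
embedder("a chunk of text") |> str()

# The same function can be passed to the `embed` argument of
# ragnar_store_create().
store <- ragnar_store_create(location = "my_store.duckdb", embed = embedder)
# }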
Value
If x is a character vector, then a numeric matrix is returned, where nrow = length(x) and ncol = <model-embedding-size>. If x is a data.frame, then a new embedding matrix "column" is added, containing the matrix described in the previous sentence. In short: a matrix of embeddings with 1 row per input string, or a data frame with an embedding column.
Examples
text <- c("a chunk of text", "another chunk of text", "one more chunk of text")
# \dontrun{
text |>
embed_ollama() |>
str()
#> Error in req_perform(req): Failed to perform HTTP request.
#> Caused by error in `curl::curl_fetch_memory()`:
#> ! Couldn't connect to server [localhost]:
#> Failed to connect to localhost port 11434 after 0 ms: Couldn't connect to server
text |>
embed_openai() |>
str()
#> Error in embed_openai(text): Can't find env var `OPENAI_API_KEY`.
# }
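A data frame input, together with the dims truncation, can be sketched the same way (not run; it assumes a valid OPENAI_API_KEY is set and that the chosen model supports truncated embeddings):
# \dontrun{
chunks <- data.frame(text = text)

# Returns `chunks` with an additional `embedding` matrix column,
# here truncated to 256 dimensions.
chunks |>
  embed_openai(dims = 256L) |>
  str()
# }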