embed_databricks() gets embeddings for text using a model hosted in a
Databricks workspace. It relies on the ellmer package for managing
Databricks credentials. See ellmer::chat_databricks for more on
supported modes of authentication.
Usage
embed_databricks(
  x,
  workspace = databricks_workspace(),
  model = "databricks-bge-large-en",
  batch_size = 512L
)
Arguments
- x
x can be:
A character vector, in which case a matrix of embeddings is returned.
A data frame with a column named text, in which case the data frame is returned with an additional column named embedding.
Missing or NULL, in which case a function is returned that can be called to get embeddings. This is a convenient way to partially apply additional arguments like model, and is the most convenient way to produce a function that can be passed to the embed argument of ragnar_store_create().
- workspace
The URL of a Databricks workspace, e.g. "https://example.cloud.databricks.com". Will use the value of the environment variable DATABRICKS_HOST, if set.
- model
The name of a text embedding model.
- batch_size
Integer. The maximum number of strings to include in a single request; x is split into batches of at most this size when embedding.
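The effect of batch_size can be illustrated with a small base R sketch. The helper split_into_batches() below is hypothetical and is not part of ragnar; it only shows one plausible way a character vector could be divided into request-sized chunks, and the package's internal implementation may differ.

```r
# Hypothetical helper (not a ragnar function): split a character vector
# into consecutive batches of at most `batch_size` strings.
split_into_batches <- function(x, batch_size = 512L) {
  stopifnot(is.character(x), batch_size >= 1L)
  if (length(x) == 0L) {
    return(list())
  }
  # Elements 1..batch_size get id 1, the next batch_size get id 2, and so on.
  batch_id <- ceiling(seq_along(x) / batch_size)
  unname(split(x, batch_id))
}

# 1200 strings with the default limit of 512 yield three batches.
batches <- split_into_batches(sprintf("passage %d", 1:1200), batch_size = 512L)
lengths(batches)
# 512 512 176
```

Under this assumption, each batch would correspond to one request, so a smaller batch_size means more, smaller requests for the same input.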

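The three forms of x described above might be used as in the following sketch. It assumes that ragnar is installed, that DATABRICKS_HOST points at a reachable workspace, and that credentials are available through one of the authentication modes ellmer supports; the example texts and the store file name are hypothetical.

```r
library(ragnar)

# Assumption: the workspace URL is supplied through the environment, e.g.
# Sys.setenv(DATABRICKS_HOST = "https://example.cloud.databricks.com")

texts <- c("The first passage to embed.", "The second passage to embed.")

# Character vector in, matrix of embeddings out.
embeddings <- embed_databricks(texts)

# Data frame with a `text` column in, same data frame with an added
# `embedding` column out.
df <- data.frame(text = texts)
df <- embed_databricks(df)

# x missing: a function is returned with `model` already filled in.
embed_fn <- embed_databricks(model = "databricks-bge-large-en")

# The returned function can be passed as the `embed` argument of
# ragnar_store_create(). The file name here is hypothetical.
store <- ragnar_store_create("example_store.duckdb", embed = embed_fn)
```

Passing the partially applied function keeps the model choice in one place, so the store uses the same settings every time it computes embeddings.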