Skip to content

Uses the Cortex API EMBED functions to generate embeddings.

Usage

embed_snowflake(
  x,
  account = snowflake_account(),
  credentials = NULL,
  model = "snowflake-arctic-embed-m-v1.5",
  api_args = list(),
  batch_size = 512L
)

Arguments

x

x can be:

  • A character vector, in which case a matrix of embeddings is returned.

  • A data frame with a column named text, in which case the dataframe is returned with an additional column named embedding.

  • Missing or NULL, in which case a function is returned that can be called to get embeddings. This is a convenient way to partial in additional arguments like model, and is the most convenient way to produce a function that can be passed to the embed argument of ragnar_store_create().

account

A Snowflake account identifier, e.g. "testorg-test_account". Defaults to the value of the SNOWFLAKE_ACCOUNT environment variable.

credentials

A list of authentication headers to pass into httr2::req_headers(), a function that returns them when called, or NULL, the default, to use ambient credentials.

model

string; model name

api_args

Named list of arbitrary extra arguments appended to the body of every chat API call. Combined with the body object generated by ellmer with modifyList().

batch_size

split x into batches when embedding. Integer, limit of strings to include in a single request.

Details

See complete documentation.

Authentication

  • a Programmatic Access Token (PAT) defined via the SNOWFLAKE_PAT environment variable.

  • A static OAuth token defined via the SNOWFLAKE_TOKEN environment variable.

  • Key-pair authentication credentials defined via the SNOWFLAKE_USER and SNOWFLAKE_PRIVATE_KEY (which can be a PEM-encoded private key or a path to one) environment variables.

  • Posit Workbench-managed Snowflake credentials for the corresponding account.

  • Viewer-based credentials on Posit Connect. Requires the connectcreds package.