ragnar_retrieve() is a thin wrapper around ragnar_retrieve_vss_and_bm25() using the recommended best practices.

Usage

ragnar_retrieve(store, text, top_k = 3L)

Arguments

store

A RagnarStore object or a dplyr::tbl() derived from it. When you pass a tbl, you can use the usual dplyr verbs (e.g. filter(), slice()) to restrict the rows examined before similarity scoring. Avoid dropping essential columns such as text, embedding, origin, and hash.

text

A string to find the nearest match to.

top_k

Integer, the number of nearest entries to find per retrieval method (vector similarity search and BM25).

Value

A data frame of retrieved chunks. Each row corresponds to an individual chunk in the store. It always contains a column named text that holds the chunk contents.

Pre-filtering with dplyr

The store behaves like a lazy table backed by DuckDB, so row‑wise filtering is executed directly in the database. This lets you narrow the search space efficiently without pulling data into R.

Examples

if (FALSE) { # (rlang::is_installed("dbplyr") && nzchar(Sys.getenv("OPENAI_API_KEY")))
# Basic usage
store <- ragnar_store_create(
  embed = \(x) ragnar::embed_openai(x, model = "text-embedding-3-small")
)
ragnar_store_insert(store, data.frame(text = c("foo", "bar")))
ragnar_store_build_index(store)
ragnar_retrieve(store, "foo")

# More Advanced: store metadata, retrieve with pre-filtering
store <- ragnar_store_create(
  embed = \(x) ragnar::embed_openai(x, model = "text-embedding-3-small"),
  extra_cols = data.frame(category = character())
)

ragnar_store_insert(
  store,
  data.frame(
    category = "dessert",
    text = c("ice cream", "cake", "cookies")
  )
)

ragnar_store_insert(
  store,
  data.frame(
    category = "meal",
    text = c("steak", "potatoes", "salad")
  )
)

ragnar_store_build_index(store)

# simple retrieve
ragnar_retrieve(store, "carbs")

# retrieve with pre-filtering
dplyr::tbl(store) |>
  dplyr::filter(category == "meal") |>
  ragnar_retrieve("carbs")
}