Runs ragnar_retrieve_vss() and ragnar_retrieve_bm25() and returns the distinct documents.

Usage

ragnar_retrieve_vss_and_bm25(store, text, top_k = 3, ...)

Arguments

store

A RagnarStore object or a dplyr::tbl() derived from it. When you pass a tbl, you may use the usual dplyr verbs (e.g. filter(), slice()) to restrict the rows examined before similarity scoring. Avoid dropping essential columns such as text, embedding, origin, and hash.

text

A string to find the nearest match to.

top_k

Integer, the number of entries to retrieve per method.

...

Forwarded to ragnar_retrieve_vss()

Value

A dataframe of retrieved chunks. Each row corresponds to an individual chunk in the store. It always contains a column named text that contains the chunks.

Note

The results are not re-ranked after identifying the unique values.
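To picture what "distinct but not re-ranked" means, here is a minimal sketch in plain dplyr. It is not the package's implementation: the two result tables, their `id` column, and their contents are hypothetical stand-ins for the output of the two retrieval methods. With `top_k = 3` per method, the combined result has between 3 and 6 rows, depending on how many chunks both methods return.

```r
library(dplyr)

# Hypothetical result sets (assumption: real retrieval output has more columns).
vss_hits <- tibble(
  id   = c(1L, 2L, 3L),
  text = c("potatoes", "cake", "cookies")
)
bm25_hits <- tibble(
  id   = c(3L, 4L, 1L),
  text = c("cookies", "salad", "potatoes")
)

# Stack both result sets and keep the first occurrence of each chunk.
# The row order reflects which table a chunk first appeared in,
# not a combined relevance score.
combined <- bind_rows(vss_hits, bm25_hits) |>
  distinct(id, .keep_all = TRUE)
```

If a single relevance ordering matters for your application, you would need to re-rank `combined` yourself after retrieval.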

Pre-filtering with dplyr

The store behaves like a lazy table backed by DuckDB, so row‑wise filtering is executed directly in the database. This lets you narrow the search space efficiently without pulling data into R.
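As a sketch of how pre-filtering might be combined with this function, the helper below wraps the pattern in a small function. The name `retrieve_meals()` is hypothetical, and it assumes the store was created with an extra `category` column, as in the examples on this page.

```r
# Hypothetical helper (not part of ragnar): restrict retrieval to rows whose
# `category` column equals "meal", then run both retrieval methods.
# Assumes `store` was created with extra_cols containing `category`.
retrieve_meals <- function(store, query, top_k = 3) {
  dplyr::tbl(store) |>
    dplyr::filter(category == "meal") |>
    ragnar::ragnar_retrieve_vss_and_bm25(query, top_k = top_k)
}
```

Calling `retrieve_meals(store, "carbs")` would then score only the rows that pass the filter, with the filtering done in the database.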

Examples

if (FALSE) { # (rlang::is_installed("dbplyr") && nzchar(Sys.getenv("OPENAI_API_KEY")))
# Basic usage
store <- ragnar_store_create(
  embed = \(x) ragnar::embed_openai(x, model = "text-embedding-3-small")
)
ragnar_store_insert(store, data.frame(text = c("foo", "bar")))
ragnar_store_build_index(store)
ragnar_retrieve(store, "foo")

# More Advanced: store metadata, retrieve with pre-filtering
store <- ragnar_store_create(
  embed = \(x) ragnar::embed_openai(x, model = "text-embedding-3-small"),
  extra_cols = data.frame(category = character())
)

ragnar_store_insert(
  store,
  data.frame(
    category = "dessert",
    text = c("ice cream", "cake", "cookies")
  )
)

ragnar_store_insert(
  store,
  data.frame(
    category = "meal",
    text = c("steak", "potatoes", "salad")
  )
)

ragnar_store_build_index(store)

# simple retrieve
ragnar_retrieve(store, "carbs")

# retrieve with pre-filtering
dplyr::tbl(store) |>
  dplyr::filter(category == "meal") |>
  ragnar_retrieve("carbs")
}