ragnar_retrieve() is a thin wrapper around ragnar_retrieve_vss_and_bm25() that applies the recommended best practices.
Arguments
- store
A RagnarStore object or a dplyr::tbl() derived from it. When you pass a tbl, you may use the usual dplyr verbs (e.g. filter(), slice()) to restrict the rows examined before similarity scoring. Avoid dropping essential columns such as text, embedding, origin, and hash.
- text
A string to find the nearest match to.
- top_k
Integer, the number of nearest entries to find per method.
Value
A dataframe of retrieved chunks. Each row corresponds to an
individual chunk in the store. It always contains a column named text
that contains the chunks.
Pre-filtering with dplyr
The store behaves like a lazy table backed by DuckDB, so row‑wise filtering is executed directly in the database. This lets you narrow the search space efficiently without pulling data into R.
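The lazy evaluation described above can be illustrated with a plain in-memory DuckDB table. This is only a sketch: it assumes the DBI, duckdb, dplyr, and dbplyr packages are installed, and the table name chunks and its two columns are hypothetical stand-ins, not the actual layout of a RagnarStore.

```r
library(dplyr)

# Hypothetical stand-in for a store: a plain DuckDB table, not a RagnarStore
con <- DBI::dbConnect(duckdb::duckdb())
DBI::dbWriteTable(
  con, "chunks",
  data.frame(
    category = c("dessert", "dessert", "meal"),
    text = c("cake", "cookies", "steak")
  )
)

# filter() only builds a query here; no rows are pulled into R yet
lazy <- tbl(con, "chunks") |>
  filter(category == "meal")

# Inspect the SQL that DuckDB will run
show_query(lazy)

# collect() executes the query and returns only the matching rows
result <- collect(lazy)

DBI::dbDisconnect(con, shutdown = TRUE)
```

Because the filter is translated to a WHERE clause, only matching rows ever leave the database, which is the same reason pre-filtering a tbl before retrieval is cheap.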
See also
Other ragnar_retrieve: ragnar_retrieve_bm25(), ragnar_retrieve_vss(), ragnar_retrieve_vss_and_bm25()
Examples
if (FALSE) { # (rlang::is_installed("dbplyr") && nzchar(Sys.getenv("OPENAI_API_KEY")))
  # Basic usage
  store <- ragnar_store_create(
    embed = \(x) ragnar::embed_openai(x, model = "text-embedding-3-small")
  )
  ragnar_store_insert(store, data.frame(text = c("foo", "bar")))
  ragnar_store_build_index(store)
  ragnar_retrieve(store, "foo")

  # More advanced: store metadata, retrieve with pre-filtering
  store <- ragnar_store_create(
    embed = \(x) ragnar::embed_openai(x, model = "text-embedding-3-small"),
    extra_cols = data.frame(category = character())
  )

  ragnar_store_insert(
    store,
    data.frame(
      category = "dessert",
      text = c("ice cream", "cake", "cookies")
    )
  )

  ragnar_store_insert(
    store,
    data.frame(
      category = "meal",
      text = c("steak", "potatoes", "salad")
    )
  )

  ragnar_store_build_index(store)

  # Simple retrieve: searches every chunk in the store
  ragnar_retrieve(store, "carbs")

  # Retrieve with pre-filtering: only chunks in the "meal" category are scored
  dplyr::tbl(store) |>
    dplyr::filter(category == "meal") |>
    ragnar_retrieve("carbs")
}