Runs ragnar_retrieve_vss() and ragnar_retrieve_bm25() and returns the distinct documents.
Arguments
- store
A RagnarStore object or a dplyr::tbl() derived from it. When you pass a tbl, you may use the usual dplyr verbs (e.g. filter(), slice()) to restrict the rows examined before similarity scoring. Avoid dropping essential columns such as text, embedding, origin, and hash.
- text
A string to find the nearest match to.
- top_k
Integer, the number of entries to retrieve per method.
- ...
Forwarded to ragnar_retrieve_vss()
Value
A data frame of retrieved chunks. Each row corresponds to an individual chunk in the store. It always contains a column named text that contains the chunks.
Pre-filtering with dplyr
The store behaves like a lazy table backed by DuckDB, so row‑wise filtering is executed directly in the database. This lets you narrow the search space efficiently without pulling data into R.
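The lazy evaluation described above can be seen without a store at all. The following is a minimal sketch, assuming the DBI, duckdb, dplyr, and dbplyr packages are installed; the table name chunks and its contents are hypothetical stand-ins, not an actual RagnarStore.

```r
library(dplyr)

# In-memory DuckDB database holding a small stand-in table
# (hypothetical data, not a real RagnarStore).
con <- DBI::dbConnect(duckdb::duckdb())
chunks <- data.frame(
  category = c("dessert", "dessert", "meal", "meal"),
  text = c("ice cream", "cake", "steak", "potatoes")
)
DBI::dbWriteTable(con, "chunks", chunks)

# tbl() returns a lazy reference; filter() only builds up a query.
meals <- tbl(con, "chunks") |>
  filter(category == "meal")

# Print the SQL that dbplyr would send to DuckDB.
show_query(meals)

# Rows are transferred into R only when collect() is called.
result <- collect(meals)

DBI::dbDisconnect(con, shutdown = TRUE)
```

Because the filter runs inside DuckDB, only the matching rows ever reach R, which is the same reason pre-filtering a tbl derived from a store narrows the search before scoring.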
See also
Other ragnar_retrieve: ragnar_retrieve(), ragnar_retrieve_bm25(), ragnar_retrieve_vss()
Examples
if (FALSE) { # (rlang::is_installed("dbplyr") && nzchar(Sys.getenv("OPENAI_API_KEY")))
  # Basic usage
  store <- ragnar_store_create(
    embed = \(x) ragnar::embed_openai(x, model = "text-embedding-3-small")
  )
  ragnar_store_insert(store, data.frame(text = c("foo", "bar")))
  ragnar_store_build_index(store)
  ragnar_retrieve(store, "foo")

  # More advanced: store metadata, retrieve with pre-filtering
  store <- ragnar_store_create(
    embed = \(x) ragnar::embed_openai(x, model = "text-embedding-3-small"),
    extra_cols = data.frame(category = character())
  )
  ragnar_store_insert(
    store,
    data.frame(
      category = "dessert",
      text = c("ice cream", "cake", "cookies")
    )
  )
  ragnar_store_insert(
    store,
    data.frame(
      category = "meal",
      text = c("steak", "potatoes", "salad")
    )
  )
  ragnar_store_build_index(store)

  # Simple retrieve over the whole store
  ragnar_retrieve(store, "carbs")

  # Retrieve with pre-filtering: only rows in the "meal" category are scored
  dplyr::tbl(store) |>
    dplyr::filter(category == "meal") |>
    ragnar_retrieve("carbs")
}