```python
from fastcore.utils import *
from pathlib import Path
from sentence_transformers import CrossEncoder
```
audrey.feldroy.com
The experimental notebooks of Audrey M. Roy Greenfeld. This website and all its notebooks are open-source at github.com/audreyfeldroy/audrey.feldroy.com
# Semantic Search With Sentence Transformers and a Cross-Encoder Model
by Audrey M. Roy Greenfeld | Tue, Apr 15, 2025
Continuing the Sentence Transformers exploration, I use a cross-encoder model to rank my notebooks by similarity to search queries.
## Setup
## Bi-Encoder vs. Cross-Encoder Model
A cross-encoder model takes two texts as input and outputs a single similarity score. That means you can't precompute embeddings as I did with the bi-encoder model; instead, the cross-encoder must generate similarity scores fresh at query time.
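To see what that pairwise scoring looks like, here's a minimal sketch of the shape of `CrossEncoder.rank`'s output, using a hypothetical word-overlap scorer (`toy_score`) as a stand-in for the real model's forward pass:

```python
def toy_score(query, doc):
    # Stand-in for a cross-encoder forward pass: the fraction of query
    # words that appear in the document. A real model is far smarter.
    q_words = set(query.lower().split())
    return sum(w in doc.lower() for w in q_words) / len(q_words)

def toy_rank(query, docs):
    # Score every (query, doc) pair, then sort best-first, mirroring the
    # [{'corpus_id': ..., 'score': ...}, ...] shape that rank() returns.
    hits = [{"corpus_id": i, "score": toy_score(query, d)}
            for i, d in enumerate(docs)]
    return sorted(hits, key=lambda h: h["score"], reverse=True)

docs = ["notes on web search", "gardening tips", "search engine basics"]
print(toy_rank("web search", docs))
```

The key point: every document must be paired with the query and scored, which is why there's nothing to cache between queries.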
| Aspect | Bi-Encoder | Cross-Encoder |
|---|---|---|
| Input/Output | Encodes texts separately into embeddings | Takes a text pair, outputs a similarity score |
| Accuracy | Lower accuracy but sufficient for initial retrieval | Higher accuracy for relevance ranking |
| Computational cost | More efficient (can precompute embeddings) | More expensive (must process each text pair) |
| Scalability | Good for large-scale retrieval | Poor for large datasets |
| Use case | Initial retrieval from a large corpus | Re-ranking a small set of candidates |
| Storage | Requires storing embeddings | No embedding storage needed |
Cross-encoders excel at precision, but they're typically used only after a bi-encoder has narrowed the search results down to 10-100 documents. In my case, I have fewer than 100 notebooks on this site, so I can get away with using just a cross-encoder.
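To make the scalability row concrete, here's a back-of-envelope count of model forward passes for each approach, assuming `n_docs` documents and `n_queries` queries:

```python
def bi_encoder_passes(n_docs, n_queries):
    # Each document is embedded once (and cached), plus one pass per query.
    return n_docs + n_queries

def cross_encoder_passes(n_docs, n_queries):
    # Every (query, document) pair needs its own forward pass.
    return n_docs * n_queries

# With ~100 notebooks, one query costs ~100 cross-encoder passes: fine.
print(bi_encoder_passes(100, 1), cross_encoder_passes(100, 1))
# With a million documents, the cross-encoder cost per query explodes.
print(bi_encoder_passes(1_000_000, 1), cross_encoder_passes(1_000_000, 1))
```

This is why the standard pipeline is bi-encoder retrieval first, cross-encoder re-ranking second, and why skipping the first stage is only viable for a small corpus like mine.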
## Download a Cross-Encoder Model
```python
ce_model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L6-v2")
```
## Get All Notebook Paths
We put each notebook to be searched into a list.
```python
def get_nb_paths():
    root = Path() if IN_NOTEBOOK else Path("nbs/")
    return L(root.glob("*.ipynb")).sorted(reverse=True)

nb_paths = get_nb_paths()
```
```python
def read_nb_simple(nb_path):
    with open(nb_path, 'r', encoding='utf-8') as f:
        return f.read()

nbs = L(nb_paths).map(read_nb_simple)
```
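Note that `read_nb_simple` returns the raw `.ipynb` JSON, so cell outputs and metadata are scored along with the prose and code. A hypothetical alternative (`read_nb_sources`, not part of the original setup) would parse the JSON and keep only the cell sources:

```python
import json
import tempfile
from pathlib import Path

def read_nb_sources(nb_path):
    # Parse the .ipynb JSON and keep only cell sources, dropping outputs
    # and metadata that could add noise to the similarity scores.
    nb = json.loads(Path(nb_path).read_text(encoding="utf-8"))
    return "\n".join("".join(cell["source"]) for cell in nb["cells"])

# Tiny demo notebook written to a temp file, just to exercise the parser.
demo = {"cells": [
    {"cell_type": "markdown", "source": ["# Title"]},
    {"cell_type": "code", "source": ["print('hi')"], "outputs": []},
]}
with tempfile.TemporaryDirectory() as d:
    p = Path(d) / "demo.ipynb"
    p.write_text(json.dumps(demo), encoding="utf-8")
    result = read_nb_sources(p)
print(result)
```

Whether stripping outputs actually improves the rankings is an open question; the raw-JSON approach clearly works well enough here.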
## Search for a Test Query String
Let's search my notebooks for a test string.
```python
q = "Web search"
hits = ce_model.rank(q, nbs, return_documents=False)
hits[:10]
```
```python
def print_search_result(hit):
    print(f"{hit['score']} {nb_paths[hit['corpus_id']]}")

L(hits[:10]).map(print_search_result)
```
Those results seem not as good as those from the bi-encoder. Let's try another cross-encoder model.
## Another Cross-Encoder: ms-marco-MiniLM-L12-v2
```python
ce_model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L12-v2")
hits = ce_model.rank(q, nbs, return_documents=False)
L(hits[:10]).map(print_search_result)
```
Fascinating how "Web" is emphasized so much, rather than the idea of "Web search".
## Another Cross-Encoder: ms-marco-TinyBERT-L2-v2
```python
ce_model = CrossEncoder("cross-encoder/ms-marco-TinyBERT-L2-v2")
hits = ce_model.rank(q, nbs, return_documents=False)
L(hits[:10]).map(print_search_result)
```
This seems the best! I like this ranking.
## Reflection
After experimenting with a few cross-encoder models, I found that the TinyBERT model (`cross-encoder/ms-marco-TinyBERT-L2-v2`) gave the most intuitive results of all the cross-encoder and bi-encoder models I tried. It seemed to understand the semantic relationship between "Web search" and my notebooks about search functionality better than the larger models did.
© 2024-2025 Audrey M. Roy Greenfeld