• Subtitles and audio files from the Lex Fridman YouTube playlist downloaded using yt-dlp.
  • Speaker diarization by clustering audio file embeddings (computed using pyannote).
  • Aggregation of the speaker-labeled subtitle chunks into larger chunks, and computation of text embeddings using sentence transformers.
  • For a given query and a given subtitle chunk, the search score is the cosine similarity between the query embedding and the chunk embedding. The 12 results with the highest cosine similarity are shown.
  • The score ranges from 0 to 100 and is color-coded by relevance: green (score > 60), yellow (30 < score < 60), or red (score < 30).
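
The ranking and coloring steps above can be sketched in a few lines of plain Python. This is only an illustration, not the project's actual code: the function names are hypothetical, the embeddings are toy 2-dimensional vectors rather than real sentence-transformer outputs, and two details are assumptions because the description does not state them: the score is taken to be 100 times the cosine similarity (clipped to the 0 to 100 range), and a score of exactly 30 or 60 is treated as yellow.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def to_score(similarity):
    # Assumption: score = 100 * cosine similarity, clipped to [0, 100].
    return max(0.0, min(100.0, 100.0 * similarity))


def score_color(score):
    # Thresholds come from the description; treating exactly 30 and
    # exactly 60 as yellow is an assumption.
    if score > 60:
        return "green"
    if score >= 30:
        return "yellow"
    return "red"


def search(query_embedding, chunk_embeddings, top_k=12):
    """Return (chunk_index, score, color) for the top_k best-scoring chunks."""
    scored = [
        (index, to_score(cosine_similarity(query_embedding, embedding)))
        for index, embedding in enumerate(chunk_embeddings)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(index, score, score_color(score)) for index, score in scored[:top_k]]


# Toy usage: three "chunks" and one "query" in a 2-dimensional space.
query = [1.0, 0.0]
chunks = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
results = search(query, chunks, top_k=2)
```

In a real pipeline the vectors would come from the text embedding model, and a vectorized library would likely replace the pure-Python loops once the number of chunks grows.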