About
This site mirrors recorded dharma talks by turning them into readable, searchable articles.
The Backstory & Mission
This project originally started as a personal tool to solve a common frustration: remembering a beautiful story, poem, or teaching from a dharma talk, but being unable to find it again easily. Traditional search tools rarely index the spoken content inside audio and video files, leaving us to manually "scrub" through hours of recordings to locate a single specific passage.
Because the talks are transcribed and formatted as text articles, you can search the entire collection instantly using our static, client-side search (powered by Pagefind), which runs entirely in your browser for speed and privacy.
We hope this site serves as a helpful resource for others—providing an intuitive way to cross-reference concepts, study the teachings in depth, and locate those hard-to-find moments without the scrubbing.
Currently, the talks are sourced from a few specific YouTube channels and AudioDharma, with the potential to expand to more sources over time.
Feedback & Improvements
We are always interested in making this resource better! If you have suggestions for new talks, additional sources, or feature improvements, please feel free to reach out to us at:
admin [at] moonpointing.com (replace [at] with @)
How Topics & Keywords Work
To help you discover relevant talks, the site automatically extracts key topics for each talk and speaker, and suggests helpful keywords when you browse.
Finding the Core Themes of a Talk (TF-IDF)
To identify what each talk is primarily about, we use a well-established mathematical method called TF-IDF, which stands for Term Frequency-Inverse Document Frequency. You can read a detailed breakdown of the math on Wikipedia's TF-IDF page.
Conceptually, the method looks at two simple factors to find keywords:
- Term Frequency (How often it is used in a talk): If a word is spoken many times during a talk, there is a good chance it represents a core theme. For example, if a teacher mentions "breath" 30 times, the talk is likely about breath meditation.
- Inverse Document Frequency (How unique the word is across all talks): Some words are used frequently everywhere (e.g., "buddhism", "practice", or common daily words). These aren't very helpful for distinguishing one talk from another. IDF penalizes words that appear in almost every talk, and rewards words that appear in only a few specific talks (like a specific name or a distinct Buddhist term like "Jataka").
By combining these two factors (TF * IDF), we get a relevance score for each word. The words with the highest scores are selected as the talk's keywords.
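As a rough illustration of the TF * IDF combination described above, here is a minimal Python sketch. It uses the textbook form (relative term frequency times log(N / df)); the exact variant, smoothing, and normalization used by this site are not stated here, so treat the function name `tf_idf` and the toy corpus as assumptions made for the example.

```python
import math
from collections import Counter

def tf_idf(talks):
    """Return one {word: score} dict per talk, using TF * IDF."""
    n_talks = len(talks)
    # Document frequency: in how many talks does each word appear?
    doc_freq = Counter()
    for words in talks:
        doc_freq.update(set(words))

    scores = []
    for words in talks:
        counts = Counter(words)
        total = len(words)
        scores.append({
            # TF = share of the talk's words; IDF = log(N / df)
            word: (count / total) * math.log(n_talks / doc_freq[word])
            for word, count in counts.items()
        })
    return scores

# Toy corpus (made up): each talk is already a list of words.
talks = [
    ["breath", "breath", "breath", "practice"],
    ["jataka", "story", "practice"],
    ["practice", "kindness", "story"],
]
scores = tf_idf(talks)
top_word = max(scores[0], key=scores[0].get)
print(top_word)  # breath
```

In this toy corpus "practice" appears in every talk, so its IDF is log(3 / 3) = 0 and it never becomes a keyword, while "breath" is both frequent in the first talk and rare elsewhere.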
To make this process accurate, the site automatically does some behind-the-scenes preparation: it filters out very common words (like "the", "is") and groups different forms of the same word together. For example, variations like "meditating", "meditations", and "meditated" are all counted together under the single theme "meditate".
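The preparation step can be pictured with a small Python sketch. The stop-word set and the `LEMMAS` lookup table below are tiny hand-written stand-ins invented for this example; a real pipeline would more likely rely on a full stop-word list and a proper lemmatizer, and the tools actually used by this site are not named here.

```python
# Hypothetical stand-ins: a real pipeline would use a full stop-word list
# and a proper lemmatizer rather than these tiny hand-written tables.
STOP_WORDS = {"the", "is", "a", "and", "of", "we"}
LEMMAS = {
    "meditating": "meditate",
    "meditations": "meditate",
    "meditated": "meditate",
}

def normalize(text):
    """Lowercase, drop stop words, and map word forms to a shared base form."""
    words = []
    for raw in text.lower().split():
        word = raw.strip(".,;:!?\"'")  # trim surrounding punctuation
        if not word or word in STOP_WORDS:
            continue
        words.append(LEMMAS.get(word, word))
    return words

print(normalize("We meditated, and the meditations deepened."))
# ['meditate', 'meditate', 'deepened']
```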
Summarizing Themes for Speakers
To help you understand the topics a speaker focuses on most across their entire body of work, we aggregate their keywords:
- We collect the topic relevance scores (from the TF-IDF step) across all the talks given by a speaker.
- We combine these scores for each unique word. If a speaker frequently talks about a topic with high relevance in multiple talks, that topic's aggregated score becomes very strong.
- The top 40 topics for each speaker are used to generate the Speaker Wordcloud on their speaker page. The size of each word visually represents how central that topic is to the speaker's talks on this site.
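The aggregation steps above could look roughly like the following Python sketch. The function name `speaker_topics` and the sample scores are made up for illustration; only the idea of summing per-talk scores and keeping the top 40 comes from the description.

```python
from collections import defaultdict

TOP_N = 40  # number of topics shown in the wordcloud, per the text above

def speaker_topics(talk_scores, top_n=TOP_N):
    """Sum per-talk TF-IDF scores for one speaker and keep the strongest words."""
    totals = defaultdict(float)
    for scores in talk_scores:
        for word, score in scores.items():
            totals[word] += score
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]

# Made-up scores for two talks by the same speaker.
talk_scores = [
    {"awareness": 0.5, "breath": 0.25},
    {"awareness": 0.25, "kindness": 0.125},
]
topics = speaker_topics(talk_scores)
print(topics[0])  # ('awareness', 0.75)
```

The summed score for each word could then drive the font size in the wordcloud.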
A Subtle Distinction (Speaker Caveat): Technically, we do not re-run the TF-IDF algorithm exclusively within a speaker's subset of talks. If we did, words that the speaker uses in all of their talks (e.g., a teacher who always focuses on "awareness") would become "common" within their talks and get a score of zero, causing them to vanish from their wordcloud! By summing the globally calculated TF-IDF scores instead, we successfully capture what makes this speaker's topics unique relative to the entire site's collection.
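A small Python calculation makes the caveat concrete. It assumes the unsmoothed textbook IDF, log(N / df); a smoothed variant would give a small nonzero value instead of exactly zero. The talk counts are hypothetical.

```python
import math

def idf(n_talks, talks_with_word):
    """Textbook inverse document frequency: log(N / df)."""
    return math.log(n_talks / talks_with_word)

# Hypothetical numbers: a speaker with 12 talks on a site of 600 talks,
# where "awareness" appears in all 12 of theirs and in 90 talks site-wide.
within_speaker = idf(12, 12)
site_wide = idf(600, 90)
print(within_speaker)  # 0.0
print(site_wide > 0)   # True
```

Computed only within the speaker's own talks, the word's weight is zero no matter how often it is spoken; computed site-wide, it keeps a positive weight.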
Dynamic Topic Suggestions in the Browse Table
When browsing the main table of talks, you can click on topic tags to filter the list. As you filter the talks, the site dynamically suggests additional topics to help you refine your search further.
The site calculates a score for potential topic suggestions to ensure you get the most helpful options first. The formula is:
Score = (Split Score * Global Weight * Core Multiplier) ** 0.5
Here is what this means in simple terms:
- Dynamic Geometric Scorer: Taking the square root of the product (a geometric mean) instead of a simple sum requires both the local narrowing power and the global keyword authority to be high. If a conversational noise word splits the remaining talks perfectly but has near-zero global authority, its score collapses toward zero.
- Split Score (Narrowing Power): Prioritizes words that appear in roughly half of the talks currently visible on your screen. Choosing such a word divides the remaining list most evenly, so it narrows the results most efficiently.
- Global Weight: The site-wide weight of the word, precomputed across all talks in the collection during the Python extraction phase.
- Core Multiplier: Curated Dharma subjects receive a 2.0x relevance boost (standard words get 1.0x) inside the geometric product, elevating traditional themes in suggestions.
- The Single-Match Penalty: If a topic appears in only one talk remaining inside the visible rows, the system applies a strong penalty (reducing the score to 25% of its value) to push the topic down, allowing words that cluster multiple talks to be surfaced first.
The top 20 topics with the highest combined scores are presented in the "Filter by Topic" UI above the table.
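The scoring rule can be sketched in Python as follows. Several details are assumptions made for this example and are not confirmed by the description: the shape of `split_score` (1.0 at an even split, falling linearly to 0.0), applying the single-match penalty to the final score, the helper names, and the sample weights. The sketch also assumes at least one talk is visible.

```python
import math

CORE_BOOST = 2.0            # curated Dharma subjects
STANDARD_BOOST = 1.0        # all other words
SINGLE_MATCH_FACTOR = 0.25  # penalty when only one visible talk matches
TOP_N = 20

def split_score(matches, visible):
    """Assumed shape: 1.0 at an even split, falling to 0.0 at none or all."""
    share = matches / visible
    return 1.0 - abs(2.0 * share - 1.0)

def suggestion_score(matches, visible, global_weight, is_core):
    """Score = (Split Score * Global Weight * Core Multiplier) ** 0.5"""
    boost = CORE_BOOST if is_core else STANDARD_BOOST
    score = math.sqrt(split_score(matches, visible) * global_weight * boost)
    if matches == 1:
        score *= SINGLE_MATCH_FACTOR  # assumed to scale the final score
    return score

def suggest(candidates, visible, top_n=TOP_N):
    """candidates: {topic: (matches, global_weight, is_core)}"""
    scored = {
        topic: suggestion_score(matches, visible, weight, is_core)
        for topic, (matches, weight, is_core) in candidates.items()
    }
    return sorted(scored, key=scored.get, reverse=True)[:top_n]

candidates = {
    "metta": (5, 0.5, True),    # even split, curated subject
    "um": (5, 0.0, False),      # even split, no global authority
    "jataka": (1, 0.5, False),  # matches a single visible talk
}
print(suggest(candidates, visible=10))  # ['metta', 'jataka', 'um']
```

Because the factors are multiplied rather than added, the filler word "um" scores zero despite splitting the visible talks perfectly.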
Performance & Background Async Downloading (Suggestions Caveat): Recomputing local TF-IDF weights in real time inside the browser would require heavy processing and slow down the UI. Instead, the system reuses the precomputed site-wide weights: a Web Worker asynchronously downloads the precomputed 853 KB globalKeywords.json file in the background and keeps it in worker memory. Search filtering and suggested-keyword calculations then run in under 20 milliseconds without blocking the page.
Respecting Content and Retractions
This site operates as a mirror and discovery tool for existing public teachings, and we strive to be completely respectful of the original creators' intent and content control:
- Respecting Complete Deletions (Full Purge): If a talk entry is removed entirely from the source site feed, or if the talk has both its YouTube and MP3 player links removed, we respect this choice. Our daily automation scans for missing or fully stripped feed items, erases the text transcript from our static search index, and deletes the article page from disk.
- YouTube Video Availability & Audio Fallback: If a YouTube video goes private or is removed by the uploader, but the audio recording remains active on AudioDharma, we preserve the transcription text so the teaching isn't lost. The page automatically transitions from a YouTube iframe player to a native HTML5 <audio> player streaming the original MP3 file.
- Preserving Active Content: If a talk entry on the source website is missing its video link (either due to an oversight or an omission), but the underlying YouTube recording is verified as public and healthy, we preserve the rich video embed layout.
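The three rules above can be read as a small decision procedure. The Python sketch below is one possible reading, not the site's actual automation: the `Action` names, the `decide` function, and its four boolean inputs are hypothetical, and the final fallthrough case is one the rules do not describe.

```python
from enum import Enum

class Action(Enum):
    PURGE = "purge"                # delete article and search-index entry
    VIDEO_EMBED = "video_embed"    # keep the YouTube iframe layout
    AUDIO_PLAYER = "audio_player"  # keep transcript, switch to <audio>
    UNSPECIFIED = "unspecified"    # case the rules above do not cover

def decide(in_feed, feed_has_video, feed_has_audio, youtube_public):
    """Map the state of one talk to an action, following the rules above."""
    # Full purge: entry gone, or both player links stripped.
    if not in_feed or not (feed_has_video or feed_has_audio):
        return Action.PURGE
    # Recording still public: keep the video layout even if the link is missing.
    if youtube_public:
        return Action.VIDEO_EMBED
    # Video unavailable but audio still offered: fall back to the MP3.
    if feed_has_audio:
        return Action.AUDIO_PLAYER
    return Action.UNSPECIFIED

print(decide(True, True, True, False))  # Action.AUDIO_PLAYER
```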