Overview
The Semantic Search page provides natural language search across the major entities in the HURE platform: counties, brokers, agents, leads, and states. Unlike traditional keyword search, semantic search matches on the meaning of a query, so it returns relevant results even when the exact words do not appear in the record.
Key Features
How It Works
- Vector Embeddings: Text is converted to high-dimensional vectors by the mxbai-embed-large model, served through Ollama
- Semantic Matching: Queries are matched by meaning, not just keywords
- Distance Scoring: Lower distance scores indicate better semantic matches
- Example: “beach retirement” finds coastal counties even when their descriptions do not contain those exact words
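The distance scoring described above can be illustrated with a minimal sketch. It assumes cosine distance (the metric named under Embedding Pipeline) and uses toy 3-dimensional vectors in place of real embeddings; the vectors and labels are hypothetical and chosen only to show how a lower distance produces a higher rank.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0.0 means the vectors point the same way."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)

# Hypothetical toy vectors standing in for real embeddings.
query = [0.9, 0.1, 0.0]
candidates = {
    "coastal county": [0.8, 0.2, 0.1],
    "mountain county": [0.1, 0.2, 0.9],
}

# Sort ascending: the smallest distance is the best semantic match.
ranked = sorted(candidates, key=lambda name: cosine_distance(query, candidates[name]))
```

Here `ranked` places "coastal county" first because its vector points in nearly the same direction as the query vector.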
Ollama Service Integration
- Status Indicator: Real-time service health check (Ready/Unavailable)
- Service URL: Configurable Ollama endpoint (https://ollama.supported.systems)
- Model Verification: Confirms mxbai-embed-large model is available
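One way such a health check and model verification could work is sketched below. It relies on Ollama's `/api/tags` endpoint, which lists locally available models; the function names are hypothetical, and mapping a missing model to "Unavailable" is an assumption rather than confirmed HURE behavior.

```python
import json
import urllib.request

def model_is_available(tags_payload, model_name):
    """Check a parsed /api/tags response for a model, ignoring any ':tag' suffix."""
    for entry in tags_payload.get("models", []):
        base_name = entry.get("name", "").split(":", 1)[0]
        if base_name == model_name:
            return True
    return False

def check_ollama(base_url, model_name, timeout=5.0):
    """Return 'Ready' if the service responds and lists the model, else 'Unavailable'."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (OSError, ValueError):
        # Network failures (URLError is an OSError) and malformed JSON both
        # count as an unavailable service in this sketch.
        return "Unavailable"
    return "Ready" if model_is_available(payload, model_name) else "Unavailable"

# Hypothetical response body, shaped like Ollama's /api/tags output.
sample = {"models": [{"name": "mxbai-embed-large:latest"}]}
found = model_is_available(sample, "mxbai-embed-large")
```

Calling `check_ollama("https://ollama.supported.systems", "mxbai-embed-large")` would run the same check against the configured endpoint.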
Index Management Dashboard
Five separate vector indices with status tracking:
| Index | Description | Reindex Action |
|---|---|---|
| County Index | 3,222 US counties with descriptions | One-click reindex |
| Broker Index | Brokerage profiles and specializations | One-click reindex |
| Agent Index | Agent bios and expertise areas | One-click reindex |
| Lead Index | Customer inquiry details | One-click reindex |
| State Index | State semantic content (region, climate, military, economic) | One-click reindex |
Each index shows:
- Current indexed count vs. total records
- Progress bar visualization
- Status badge (Fully Indexed / Needs Reindex)
- Manual reindex button
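The status fields listed above could be derived from two counts per index. The sketch below is an assumption about how the badge and progress bar relate to those counts; the `IndexStatus` class and the sample indexed count are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class IndexStatus:
    name: str
    indexed: int  # records currently present in the vector index
    total: int    # records in the source table

    @property
    def progress(self) -> float:
        """Fraction of records indexed, in [0, 1]; an empty table counts as complete."""
        if self.total == 0:
            return 1.0
        return min(self.indexed / self.total, 1.0)

    @property
    def badge(self) -> str:
        """Badge text matching the two states shown on the dashboard."""
        return "Fully Indexed" if self.indexed >= self.total else "Needs Reindex"

# Hypothetical snapshot: 3,000 of the 3,222 counties have been indexed.
county = IndexStatus("County Index", indexed=3000, total=3222)
```

With these numbers `county.badge` is "Needs Reindex" and `county.progress` is roughly 0.93, which would drive the progress bar.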
Test Search Interface
- Entity Tabs: Switch between Counties, Brokers, Agents, Leads, States
- Natural Language Input: Free-form query textbox
- Example Queries:
  - “sunny beach retirement”
  - “military relocation specialist”
  - “warm coastal state with military bases and no income tax”
  - “affordable midwest family-friendly”
Search Results
- Ranked by semantic distance (lower = better match)
- Entity-specific result cards with key details
- Direct links to view/edit matched records
Technical Architecture
Embedding Pipeline
- Entity text fields are concatenated into searchable content
- The content is sent to the Ollama API for embedding generation
- The resulting 1024-dimensional vectors are stored in sqlite-vec
- Queries are embedded at search time and compared against the stored vectors by cosine distance
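The embedding step of this pipeline might look like the following sketch. It targets Ollama's `/api/embeddings` endpoint, which accepts a `model` and a `prompt` and returns an `embedding` array; whether HURE uses this endpoint or the newer `/api/embed` is not stated here, so treat that choice and all function names as assumptions. Storage in sqlite-vec is omitted.

```python
import json
import urllib.request

EXPECTED_DIM = 1024  # vector size stated in this document

def build_embedding_request(base_url, model, text):
    """Build (but do not send) a POST request for Ollama's /api/embeddings endpoint."""
    body = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def parse_embedding(response_body):
    """Extract the vector from a response body and check its dimensionality."""
    vector = json.loads(response_body)["embedding"]
    if len(vector) != EXPECTED_DIM:
        raise ValueError(f"expected {EXPECTED_DIM} dimensions, got {len(vector)}")
    return vector

def embed(base_url, model, text, timeout=30.0):
    """Send the request and return the embedding (needs a reachable Ollama service)."""
    request = build_embedding_request(base_url, model, text)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return parse_embedding(resp.read())
```

The same `embed` call would serve both indexing and search, since a query must be embedded by the same model as the stored content for the distances to be meaningful.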
Indexed Content by Entity
| Entity | Fields Used for Embedding |
|---|---|
| Counties | name, state, description, highlights |
| Brokers | company_name, description, specializations |
| Agents | name, bio, specializations |
| Leads | name, notes, county preferences |
| States | description, highlights, climate, military bases |
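A rough sketch of how the fields in this table could be joined into the text that gets embedded follows. The snake_case keys `county_preferences` and `military_bases`, the newline separator, and the handling of list values are all assumptions; the actual column names and joining rules may differ.

```python
# Field lists mirror the table above; two keys are assumed snake_case spellings.
EMBEDDING_FIELDS = {
    "counties": ["name", "state", "description", "highlights"],
    "brokers": ["company_name", "description", "specializations"],
    "agents": ["name", "bio", "specializations"],
    "leads": ["name", "notes", "county_preferences"],
    "states": ["description", "highlights", "climate", "military_bases"],
}

def build_content(entity, record):
    """Join an entity's non-empty embedding fields into one text block."""
    parts = []
    for field in EMBEDDING_FIELDS[entity]:
        value = record.get(field)
        if isinstance(value, (list, tuple)):
            value = ", ".join(str(v) for v in value)
        text = str(value).strip() if value is not None else ""
        if text:
            parts.append(text)  # missing or blank fields are skipped
    return "\n".join(parts)

# Hypothetical agent record with a missing bio.
agent = {
    "name": "Jane Doe",
    "bio": None,
    "specializations": ["military relocation", "coastal homes"],
}
content = build_content("agents", agent)
```

Skipping empty fields keeps placeholder text such as "None" out of the embedded content, where it would otherwise add noise to the vector.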
Use Cases
Military Family Relocations
- “warm state near air force base with good schools”
- “coastal area with navy presence and no state income tax”
Lifestyle Matching
- “mountain skiing resort town”
- “retirement community golf course warm weather”
Economic Factors
- “low cost of living affordable housing”
- “no income tax business friendly”
Testing Results
- ✅ Ollama service status displays correctly
- ✅ All five entity indices show accurate counts
- ✅ Reindex buttons trigger background reindexing
- ✅ Tab switching updates search context
- ✅ Natural language queries return semantically relevant results
- ✅ Distance scores properly rank results by relevance