MariaDB Cloud Hybrid & RAG-Ready Search: Vector Search Where Your Data Lives + Real-Time MCP Augmentation
Deploy a RAG-ready hybrid search engine instantly on MariaDB Cloud.
A plug-and-play MariaDB Cloud demo that combines MariaDB Vector Search, a FastMCP server with Brave Search augmentation, and the 20 Newsgroups dataset from scikit-learn. It unifies cloud-based semantic search with real-time web search augmentation. Because the vector search runs in the same database that stores the data, there is no separate vector store to deploy or to query across, which keeps the architecture streamlined and RAG-ready.
Easily adaptable to any search source and ready for integration into any RAG workflow.
Application Domains
This architecture is designed for scalability and can be adapted for:
- Startup Search Engines: Combine a proprietary/owned website database (indexed via MariaDB Vector and deployed via MariaDB Cloud) with results from larger search engines via MCP augmentation to bootstrap comprehensive search coverage.
- Product Search Engines: Display results from internal stock inventory first via MariaDB Cloud with Vector Search, then augment the catalog with products from large sellers like Amazon via remote MCP tools to ensure users never hit a "zero results" page.
- Enterprise Federated Search: Distribute load by resolving high-fidelity internal queries locally on MariaDB, while offloading trend-based or ephemeral queries to multiple remote MCP instances.
- ...and more. The possibilities for hybridizing static and dynamic knowledge are endless.
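The "internal results first, external results as augmentation" pattern described in the list above can be sketched as a small merge step. This is only an illustration, not the logic in mariadb_cloud_hybrid_search.py: the `Hit` type, the `merge_hits` function, and the `max_distance` cutoff are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    source: str   # "internal" (MariaDB Vector) or "external" (MCP tool)
    title: str
    score: float  # distance for internal hits (lower is better); unused for external


def merge_hits(internal, external, limit=8, max_distance=0.7):
    """Return internal hits first (closest first), then fill up with external hits."""
    # Keep only internal hits that are close enough, ordered by distance.
    kept = sorted((h for h in internal if h.score <= max_distance),
                  key=lambda h: h.score)
    merged = kept[:limit]
    seen = {h.title.lower() for h in merged}
    # Append external hits until the limit is reached, skipping duplicate titles.
    for hit in external:
        if len(merged) >= limit:
            break
        if hit.title.lower() not in seen:
            merged.append(hit)
            seen.add(hit.title.lower())
    return merged
```

With an empty internal result list, the function still returns the external hits, which is the behaviour that keeps a product search from showing a "zero results" page.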
Repository Structure
| File | Type | Function |
|---|---|---|
| mariadb_cloud_hybrid_search.py | Core Logic | The main application. Connects to MariaDB for semantic vector-based search and spawns the MCP instance to fetch external web results. |
| server.py | MCP Server | A lightweight, pure-Python implementation of the FastMCP + Brave Search MCP server. Requires no Node.js/NPX dependencies. |
| load_data.py | Ingestion | Data loader. Reads newsgroups_with_embeddings.csv, parses JSON vectors, and performs safe batch inserts into a MariaDB Cloud instance. |
| schema.sql | Database | DDL script. Defines the documents table and configures the HNSW (Hierarchical Navigable Small World) vector index with cosine distance. |
| embeddings.py | Transformation | Generates 384-dimensional dense vectors from raw text using sentence-transformers (all-MiniLM-L6-v2). |
| import_newsgroups.py | Source | ETL script. Downloads the "20 Newsgroups" dataset from scikit-learn and organizes it for processing. |
| brave_api_config.py | Config | Centralized credentials file for the Brave API key. |
| db_config.py | Config | Centralized credentials file for MariaDB Cloud connection details. |
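To make the schema.sql and search rows of the table more concrete, here is a minimal sketch of what a documents table with a 384-dimensional vector column and a cosine-distance vector index might look like, together with a matching nearest-neighbour query. The column names (`category`, `content`, `embedding`) and both helper functions are assumptions for illustration; the repository's actual schema.sql may differ. The `VECTOR(n)` type, `VECTOR INDEX ... DISTANCE=cosine`, `VEC_DISTANCE_COSINE`, and `VEC_FromText` are MariaDB Vector features.

```python
EMBEDDING_DIM = 384  # output size of all-MiniLM-L6-v2, as stated in the table


def build_schema_ddl(table="documents", dim=EMBEDDING_DIM):
    """Return DDL for a table with a vector column and a cosine vector index."""
    # Column names are hypothetical; only the vector syntax is MariaDB-specific.
    return (
        f"CREATE TABLE IF NOT EXISTS {table} (\n"
        "  id INT AUTO_INCREMENT PRIMARY KEY,\n"
        "  category VARCHAR(64),\n"
        "  content TEXT,\n"
        f"  embedding VECTOR({dim}) NOT NULL,\n"
        "  VECTOR INDEX (embedding) DISTANCE=cosine\n"
        ")"
    )


def build_search_sql(table="documents", top_k=5):
    """Return a parameterized top-k query; pass the query vector text twice."""
    distance = "VEC_DISTANCE_COSINE(embedding, VEC_FromText(%s))"
    return (
        f"SELECT category, content, {distance} AS dist "
        f"FROM {table} ORDER BY {distance} LIMIT {int(top_k)}"
    )


if __name__ == "__main__":
    print(build_schema_ddl())
    print(build_search_sql())
```

The query repeats the distance expression in `ORDER BY` rather than ordering by the alias, so the statement takes the query vector (as a JSON-style text such as `[0.1,0.2,...]`) as both parameters.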
Quick Start
1. Prerequisites
- MariaDB Cloud (or MariaDB 11.8+ with MariaDB Vector). A free trial is available at https://mariadb.com/cloud-get-started/
- Python 3.10+
- Brave Search API key (a free key is available)
2. Installation
Install the required Python packages (no Node.js required!):
sudo apt update
sudo apt install -y python3
pip3 install --break-system-packages scikit-learn pandas sentence-transformers mysql-connector-python mcp fastmcp httpx
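After installation, a quick import check can confirm that every package is visible to the interpreter you plan to use. The sketch below is not part of the repository; `missing_modules` is a hypothetical helper, and the module names are the import names that correspond to the pip packages listed above.

```python
import importlib.util

# Import names for the pip packages installed above.
REQUIRED_MODULES = [
    "sklearn",                # scikit-learn
    "pandas",
    "sentence_transformers",  # sentence-transformers
    "mysql.connector",        # mysql-connector-python
    "mcp",
    "fastmcp",
    "httpx",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of module names that cannot be located."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. "mysql") is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Missing modules:", missing_modules() or "none")
```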
3. Database Setup
Get a free trial at https://mariadb.com/cloud-get-started/
After signing up, enter your credentials in db_config.py: replace the host, port, user, and password in the file with the values provided by MariaDB Cloud. The host has the form <some_host>.skysql.com.
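If you would rather not keep credentials in a source file, one possible variation is to read the same four settings from environment variables. This is a sketch under assumptions: the `MARIADB_*` variable names and the `load_db_config` helper are hypothetical and not part of the repository, which expects the values in db_config.py.

```python
import os

REQUIRED_KEYS = ("host", "port", "user", "password")


def load_db_config(env=os.environ, prefix="MARIADB_"):
    """Build a connection dict from variables such as MARIADB_HOST (hypothetical names)."""
    config = {}
    for key in REQUIRED_KEYS:
        name = prefix + key.upper()
        value = env.get(name)
        if not value:
            raise KeyError(f"missing setting: {name}")
        config[key] = value
    config["port"] = int(config["port"])  # connectors expect a numeric port
    return config
```

The resulting dictionary uses the keyword names that `mysql.connector.connect(**config)` accepts (`host`, `port`, `user`, `password`).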
Next, initialize the schema on MariaDB Cloud:
python3 init_schema.py
4. MCP/Brave API Setup
Set your Brave Search API key (see Prerequisites) in brave_api_config.py.
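For orientation, the following sketch shows roughly how the API key is used when calling the Brave Web Search API and how a response can be reduced to the title, link, and description fields shown in the runtime demo. It is not the code in server.py; `build_brave_request` and `parse_brave_results` are hypothetical helpers, and the payload handling assumes the `web.results` layout of the Brave Web Search response.

```python
BRAVE_ENDPOINT = "https://api.search.brave.com/res/v1/web/search"


def build_brave_request(query, api_key, count=3):
    """Return (url, headers, params) for a Brave web search request."""
    headers = {
        "Accept": "application/json",
        "X-Subscription-Token": api_key,  # the key from brave_api_config.py
    }
    params = {"q": query, "count": count}
    return BRAVE_ENDPOINT, headers, params


def parse_brave_results(payload):
    """Extract title, url, and description from a decoded JSON response."""
    results = payload.get("web", {}).get("results", [])
    return [
        {
            "title": item.get("title", ""),
            "url": item.get("url", ""),
            "description": item.get("description", ""),
        }
        for item in results
    ]
```

An MCP tool could pass the three returned values to `httpx.get(url, headers=headers, params=params)` and feed the decoded JSON into `parse_brave_results`.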
5. Data Pipeline
Ingest the sample data (20 Newsgroups):
# 1. Download raw data
python3 import_newsgroups.py
# 2. Generate Vectors
python3 embeddings.py
# 3. Load into MariaDB
python3 load_data.py
You should see output similar to the following:
$ python3 load_data.py
Reading load_data.sql...
Connecting to MariaDB (serverless-eu-west-2.sysp0000.db1.skysql.com)...
Parsing and executing statements (This may take some time)...
Note that if you have a mariadb CLI client connection to your MariaDB Cloud instance,
you can use the SHOW PROCESSLIST; there to see the LOAD DATA LOCAL INFILE command executing!
-> LOAD DATA executed. Rows affected: 17872
-> Verification Result: [(17872, '[0.00207798,0.0234504,0.0248088, ... ,0.00143591,0.0151075,0.0528758]')]
SUCCESS: Data load completed.
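The log above indicates that this run used LOAD DATA LOCAL INFILE. The repository table also describes load_data.py as parsing JSON vectors and inserting in batches, so the sketch below illustrates that second approach. Both helpers are hypothetical: `parse_embedding` validates one JSON-encoded vector and normalizes it to the bracketed text form that `VEC_FromText` accepts, and `batched` groups rows for `executemany`-style inserts.

```python
import json


def parse_embedding(raw, dim=384):
    """Validate a JSON-encoded vector and return it as compact '[x,y,...]' text."""
    values = json.loads(raw)
    if len(values) != dim:
        raise ValueError(f"expected {dim} dimensions, got {len(values)}")
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def batched(rows, size=500):
    """Yield lists of at most `size` rows, preserving order."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly shorter, batch
        yield batch
```

Checking the dimension before inserting catches truncated or malformed CSV cells early, rather than surfacing them as database errors partway through the load.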
6. Run MariaDB Hybrid Search
Launch the engine:
python3 mariadb_cloud_hybrid_search.py
7. Troubleshooting
Issue: "Database or MCP/Brave API Connection Error"
Ensure that db_config.py and brave_api_config.py contain the correct credentials.
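When it is unclear whether the database side or the MCP/Brave side is failing, it can help to run each connection check separately and report the outcomes side by side. The sketch below is a generic pattern, not something shipped with the repository; `run_checks` is a hypothetical helper that takes named zero-argument callables.

```python
def run_checks(checks):
    """Run each named check and record 'ok' or the exception that it raised."""
    report = {}
    for name, check in checks.items():
        try:
            check()
            report[name] = "ok"
        except Exception as exc:  # report every failure instead of stopping at the first
            report[name] = f"failed: {type(exc).__name__}: {exc}"
    return report
```

In practice, one callable could open a MariaDB connection with the values from db_config.py and another could issue a single Brave Search request with the key from brave_api_config.py, so the report shows which credentials need attention.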
Runtime Demo
$ python3 mariadb_cloud_hybrid_search.py
Initializing AI Model (SentenceTransformer)...
======================================================================
Starting Hybrid RAG Search for: 'AI technology'
======================================================================
--- PHASE 1: INTERNAL KNOWLEDGE RETRIEVAL VIA MARIADB VECTOR SEARCH ---
Connecting to MariaDB (serverless-eu-west-2.sysp0000.db1.skysql.com)...
Internal Search executed in 0.7903 seconds.
Found 5 internal documents:
[sci.med] (Dist: 0.5179)
> If you have any information on artificial intelligence in medicine, then I would appreciate it if you could mail me with whatever it is. The informati...
----------------------------------------
[comp.graphics] (Dist: 0.5711)
> From article <1993May1.092058.1@aurora.alaska.edu>, by pstlb@aurora.alaska.edu: Since this was posted on comp.ai, I assume there is an AI angle to thi...
----------------------------------------
[sci.med] (Dist: 0.6187)
> [For those attending the AAAI conf this summer, note that this conference is immediately preceding it.] PRELIMINARY PROGRAM AND REGISTRATION MATERIALS...
----------------------------------------
[comp.graphics] (Dist: 0.6231)
> The Harvard Computer Society is pleased to announce its third lecture of the spring. Ivan Sutherland, the father of computer graphics and an innovator...
----------------------------------------
[comp.graphics] (Dist: 0.6522)
> Technion - Israel Institute of Technology Department of Computer Science GRADUATE STUDIES IN COMPUTER GRAPHICS Applications are invited for graduate s...
----------------------------------------
--- PHASE 2: EXTERNAL WEB RETRIEVAL (MCP) FOR SEARCH RESULTS AUGMENTATION ---
Spinning up Local Python MCP Server...
[FastMCP startup banner]
FastMCP 2.14.2
https://gofastmcp.com
Server: brave-search
Deploy free: https://fastmcp.cloud
FastMCP 3.0 is coming!
Pin fastmcp<3 in production, then upgrade when you're ready.
[01/08/26 19:02:59] INFO Starting MCP server 'brave-search' with transport 'stdio' server.py:2504
Calling MCP Tool 'brave_web_search' with query: 'AI technology'
Found 3 external results:
----------------------------------------
Artificial intelligence - Wikipedia
Link: https://en.wikipedia.org/wiki/Artificial_intelligence
> The prevalence of generative AI tools has increased significantly since the AI boom in the 2020s. This boom was made possible by improvements in deep neural networks, particularly large language models (LLMs), which are based on the transformer architecture. Major tools include LLM-based chatbots such as ChatGPT, Claude, Copilot, DeepSeek, Google Gemini and Grok; text-to-image models such as Stable Diffusion, Midjourney, and DALL-E; and text-to-video models such as Veo, LTX and Sora. Technology companies developing generative AI include Alibaba, Anthropic, Baidu, DeepSeek, Google, Lightricks, Meta AI, Microsoft, Mistral AI, OpenAI, Perplexity AI, xAI, and Yandex.
----------------------------------------
What Is Artificial Intelligence (AI)? | IBM
Link: https://www.ibm.com/think/topics/artificial-intelligence
> Artificial intelligence (AI) is technology that <strong>enables computers and machines to simulate human learning, comprehension, problem solving, decision making, creativity and autonomy</strong>.
----------------------------------------
Home - AI Technology, Inc.
Link: https://www.aitechnology.com/
> Since pioneering the use of flexible epoxy technology for microelectronic packaging in 1985, AI Technology has been one of the leading forces in development and patented applications of advanced materials and adhesive solutions for electronic interconnection and packaging.
----------------------------------------
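The `Dist` values in Phase 1 are consistent with cosine distance, that is, one minus the cosine similarity between the query embedding and each document embedding, so smaller values mean closer matches. The following pure-Python sketch shows that calculation and a brute-force top-k ranking for intuition only; in the demo, the distance is computed by MariaDB and the HNSW index avoids scanning every row. The function names are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query, docs, k=5):
    """Rank (label, vector) pairs by distance to the query, closest first."""
    scored = [(cosine_distance(query, vec), label) for label, vec in docs]
    return sorted(scored)[:k]
```

Identical directions give a distance of 0, orthogonal vectors give 1, and opposite vectors give 2, which puts the demo's values of roughly 0.52 to 0.65 in the "loosely related" range.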
