{"id":47394,"date":"2024-09-23T16:04:56","date_gmt":"2024-09-23T16:04:56","guid":{"rendered":"https:\/\/writer.com\/?post_type=eng_post&#038;p=47394"},"modified":"2025-06-13T11:34:13","modified_gmt":"2025-06-13T11:34:13","slug":"rag-vector-database","status":"publish","type":"eng_post","link":"https:\/\/writer.com\/engineering\/rag-vector-database\/","title":{"rendered":"RAG vector database explained"},"content":{"rendered":"<!--auWTKFVgmi-->\n<div class=\"wpm-post-container -thm-1 p-3 p-md-4_6 mt-6 wpm-summarized \" >\n\n<!--auWTKFVgmi-->\n<div class=\"text-center \" >\n\n<!--auWTKFVgmi-->\n<div class=\"wpm-button-thm-1 \" >\n\n\n<svg width=\"30\" height=\"31\" viewBox=\"0 0 30 31\" fill=\"none\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\">\r\n\t\t\t\t\t\t<rect y=\"1\" width=\"30\" height=\"30\" rx=\"15\" fill=\"black\"><\/rect>\r\n\t\t\t\t\t\t<path fill-rule=\"evenodd\" clip-rule=\"evenodd\" d=\"M21.2611 19.0857L23.2645 10.8203H22.0451H22.0269H20.4954H20.4772H19.2578L21.2611 19.0857Z\" fill=\"white\"><\/path>\r\n\t\t\t\t\t\t<path fill-rule=\"evenodd\" clip-rule=\"evenodd\" d=\"M14.3392 10.8203H13.1016L15.8888 22.3202H17.1265H18.6761H19.9137L17.1265 10.8203H15.8888H14.3392Z\" fill=\"white\"><\/path>\r\n\t\t\t\t\t\t<path fill-rule=\"evenodd\" clip-rule=\"evenodd\" d=\"M8.18293 10.8203H6.94531L9.7326 22.3202H10.9702H12.5199H13.7575L10.9702 10.8203H9.7326H8.18293Z\" fill=\"white\"><\/path>\r\n\t\t\t\t\t\t<\/svg>\n\n\n\n<p data-styleid=\"style-xnfl09t10\">TL;DR by <a href=\"https:\/\/writer.com\/\">WRITER<\/a><\/p>\n\n\n<\/div>\n\n<\/div>\n\n\n<p data-styleid=\"style-skcfs6hvs\">RAG vector databases boost LLMs by integrating timely, relevant data, improving response accuracy and relevance. They use KNNs to query data and segment it into manageable embeddings. Due to issues like imprecision and high update costs, RAG vector databases prove to be a challenge to implement for enterprise environments. 
Our graph-based RAG solution, Writer Knowledge Graph, simplifies integration and improves accuracy in ML projects. Check out our <a href=\"https:\/\/dev.writer.com\/api-guides\/introduction\">Knowledge Graph API guide<\/a>.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"335\" height=\"38\" src=\"https:\/\/writer.com\/wp-content\/uploads\/2023\/02\/ic.svg?w=335\" alt=\"\" class=\"wp-image-26953\"\/><\/figure>\n<\/div>\n\n<\/div>\n\n\n<p data-styleid=\"style-6air2gvsq\">Generative AI is quickly becoming the go-to for enterprises wanting to tap into their wealth of internal data for company-specific question-answering and analysis in real time. But here&#8217;s the thing: relying only on large language models (LLMs) doesn\u2019t cut it. LLMs are great, but they\u2019re primarily trained on public data, so they don\u2019t understand your company\u2019s specific knowledge. True intelligence doesn\u2019t come from just adding more training data or building a more sophisticated model; it comes from augmenting the model with relevant, real-time data using smarter retrieval.<\/p>\n\n\n\n<p data-styleid=\"style-7xteg64fy\">That\u2019s where retrieval-augmented generation (RAG) steps in.<\/p>\n\n\n\n<p data-styleid=\"style-326018u0a\">RAG is all about finding the right data to answer a question and feeding it to the LLM. The traditional method for this is vector retrieval, which works, but it\u2019s far from perfect, especially in large, complex ML projects.<\/p>\n\n\n\n<p data-styleid=\"style-50ask6bak\">Let\u2019s take a look at the key benefits and limitations of RAG vector databases. 
We\u2019ll also go over why graph-based retrieval may be the key to making generative AI work for enterprise use at scale.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-what-s-a-rag-vector-database\" data-styleid=\"style-86ane1v1z\"><strong>What\u2019s a RAG vector database?<\/strong><\/h2>\n\n\n\n<p data-styleid=\"style-mckm67gnj\">At its core, a RAG vector database uses vector retrieval to locate relevant data for the LLM to process. This method involves breaking data into smaller chunks, converting each chunk into a vector embedding, and then matching a query with the closest vectors using algorithms like K-Nearest Neighbors (KNN). While effective for general use cases, this traditional approach has limitations when applied to large-scale machine learning (ML) projects.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-limitations-of-vector-retrieval-in-rag\" data-styleid=\"style-d7ouxavql\"><strong>Limitations of vector retrieval in RAG<\/strong><\/h2>\n\n\n\n<p data-styleid=\"style-ozrtb9b3r\">Using vector retrieval in a RAG vector database can sometimes feel like using a sledgehammer to crack a nut. It\u2019s powerful but not always precise. 
Vector databases store numerical representations of data, but they don\u2019t always capture the nuances or relationships between data points, leading to issues when handling complex, interconnected enterprise data.<\/p>\n\n\n\n<p data-styleid=\"style-i4amyoxc7\">For example, let&#8217;s look at a search against a phone company&#8217;s internal database.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" height=\"576\" width=\"1024\" src=\"https:\/\/writer.com\/wp-content\/uploads\/2024\/09\/Image-1-2-1.png?w=640\" alt=\"Vector retrieval: fails with concentrated data\" class=\"wp-image-47552\" srcset=\"https:\/\/writer.com\/wp-content\/uploads\/2024\/09\/Image-1-2-1.png 1900w, https:\/\/writer.com\/wp-content\/uploads\/2024\/09\/Image-1-2-1.png?resize=300,169 300w, https:\/\/writer.com\/wp-content\/uploads\/2024\/09\/Image-1-2-1.png?resize=768,432 768w, https:\/\/writer.com\/wp-content\/uploads\/2024\/09\/Image-1-2-1.png?resize=1024,576 1024w, https:\/\/writer.com\/wp-content\/uploads\/2024\/09\/Image-1-2-1.png?resize=1536,863 1536w, https:\/\/writer.com\/wp-content\/uploads\/2024\/09\/Image-1-2-1.png?resize=950,535 950w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p data-styleid=\"style-jbwb5nefv\">If you search for &#8220;comparison between NovaPhone and NovaPhone+,&#8221; vector retrieval might pull up documents that mention both models. But because these documents often use similar terms, vector retrieval might not get it right, confusing similar features or mixing up the two models. This is a problem for RAG because the retrieved information can be a mix of correct and incorrect data points, leading to answers that aren&#8217;t completely accurate.<\/p>\n\n\n\n<p data-styleid=\"style-0sl5zwfa9\">Given these limitations of vector retrieval, it&#8217;s important to understand the underlying mechanisms of a RAG vector database to see why these problems arise. 
Let&#8217;s look at how it works and explore its shortcomings in handling complex enterprise data scenarios.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-how-a-rag-vector-database-works-and-why-it-falls-short-in-enterprise-use-cases\" data-styleid=\"style-gc3pfvxt4\"><strong>How a RAG vector database works and why it falls short in enterprise use cases<\/strong><\/h2>\n\n\n\n<p data-styleid=\"style-cxinhsjxj\">So what leads to this confusion? The answer lies in how a RAG vector database works:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Data processing:<\/strong> Data is split into chunks (100\u2013200 characters each) and converted into vector embeddings. For example, the word &#8220;cat&#8221; might be represented as [1.5, -0.4, 7.2, &#8230;], turning text into a numerical form that algorithms can process.<\/li>\n\n\n\n<li><strong>Query and retrieval:<\/strong> Your query is turned into a vector, and algorithms like KNN or Approximate Nearest Neighbors (ANN) retrieve the closest matches.<\/li>\n\n\n\n<li><strong>Answer generation:<\/strong> The LLM pieces together an answer from the top \u201ck\u201d matches.<\/li>\n<\/ol>\n\n\n\n<p data-styleid=\"style-5kazljgu2\">Sounds straightforward, right? But here\u2019s where vector-based RAG is limited. Chunking data into small pieces can lose context: imagine reading a book where the pages are shuffled. Plus, KNN\/ANN algorithms aren\u2019t always efficient or accurate when dealing with large, complex datasets.<\/p>\n\n\n\n<p data-styleid=\"style-fxqyviwr2\">Finally, there are the issues of rigidity and cost. Every time you need to add new data, a vector database often can\u2019t just append it to the existing data set. Depending on how the index is built, it may need to reprocess existing data and reassign values, because the contents of the entire dataset can determine how each vector embedding is indexed. 
And every time you change your embedding model, the whole corpus has to be re-embedded, and <a href=\"https:\/\/medium.com\/@kelvin.lu.au\/what-we-need-to-know-before-adopting-a-vector-database-85e137570fbb\">it costs money<\/a>. The larger the corpus of data your company has, the more it&#8217;ll cost.<\/p>\n\n\n\n<p data-styleid=\"style-sh98kh4tu\">With new data added every day, an enterprise environment demands a more dynamic, flexible, and affordable solution.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-graph-based-retrieval-a-smarter-alternative-for-rag\" data-styleid=\"style-wr50g2fkx\"><strong>Graph-based retrieval: a smarter alternative for RAG<\/strong><\/h2>\n\n\n\n<p data-styleid=\"style-j7ixygc3x\">Graph-based retrieval offers a more sophisticated approach to RAG. Instead of simply looking at the distance between data points, it builds a web of relationships between them. Each data point becomes a node, and its relationships to other points become edges.<\/p>\n\n\n\n<p data-styleid=\"style-4jc77glb7\">Here&#8217;s how it works:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Data processing:<\/strong> Entities are represented as nodes, and the relationships between them are represented as edges. For example, a graph of a customer database could include a node for each customer, with edges representing their purchases.<\/li>\n\n\n\n<li><strong>Query and retrieval:<\/strong> Graph-based retrieval uses a combination of NLP algorithms, heuristic algorithms, and ML techniques to understand the context of the query and identify the most relevant entities and relationships.<\/li>\n\n\n\n<li><strong>Answer generation:<\/strong> The LLM then takes those relevant data points and formulates an answer. Because the data is stored in a cost-effective, easily updatable graph structure, semantic relationships are retained, resulting in accurate retrieval of relevant data for each query. 
Advanced retrieval techniques and LLM enhancements can further improve accuracy and reduce hallucinations.<\/li>\n<\/ol>\n\n\n\n<p data-styleid=\"style-g1qdnqg1g\">With graph-based retrieval, you\u2019re not just finding the closest match; you\u2019re finding the <em>right<\/em> match by understanding the deeper connections within your data. This method addresses many of the limitations of vector-based RAG when applied to complex, enterprise-level ML projects.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-why-building-a-graph-based-rag-system-is-challenging\" data-styleid=\"style-n15gs7g3h\"><strong>Why building a graph-based RAG system is challenging<\/strong><\/h2>\n\n\n\n<p data-styleid=\"style-pl5bjnruf\">Building a custom graph-based RAG system from scratch can be resource-intensive. Here\u2019s what you\u2019d typically need to do:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Data setup<\/strong>: Organize your data into nodes and edges, representing entities and their relationships, using tools like Neo4j or Amazon Neptune.<\/li>\n\n\n\n<li><strong>Query and retrieval<\/strong>: Implement natural language processing (NLP) and ML algorithms to understand user queries and find relevant data points.<\/li>\n\n\n\n<li><strong>Scaling and maintenance<\/strong>: Keep your system efficient as your graph database grows, which means adding new data, updating relationships, and running retrieval queries in real time.<\/li>\n\n\n\n<li><strong>Deployment<\/strong>: Fine-tune the system\u2019s ability to respond to different types of queries while optimizing retrieval speed.<\/li>\n<\/ol>\n\n\n\n<p data-styleid=\"style-wluf14qu2\">In other words, it\u2019s a big task. 
That\u2019s why many developers look for pre-built solutions to avoid the complexities of building graph-based retrieval from the ground up.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-the-writer-knowledge-graph-a-pre-built-solution\" data-styleid=\"style-l0jmti82k\">\u200b\u200b<strong>The Writer Knowledge Graph: a pre-built solution<\/strong><\/h2>\n\n\n\n<p data-styleid=\"style-q3rc8ca1c\">At WRITER, we\u2019ve already recognized the challenges of building a graph-based RAG system from scratch. That\u2019s why we built the Writer Knowledge Graph, a ready-made solution that takes care of the heavy lifting. With our API, you can easily integrate graph-based retrieval into your ML projects.<\/p>\n\n\n\n<p data-styleid=\"style-ucbv32i78\">Here\u2019s how to use the Writer Knowledge Graph API:<\/p>\n\n\n\n<p data-styleid=\"style-dgiqh54yz\"><strong>1. Create a Knowledge Graph<\/strong>: Organize your data into nodes and edges \u2014 but without the hassle of setting up graph databases.<\/p>\n\n\n\n<p data-styleid=\"style-paka8r1dw\">For example:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:flex;align-items:center;padding:10px 0px 10px 16px;margin-bottom:-2px;width:100%;text-align:left;background-color:#333545;color:#efefe1\">Python<\/span><span role=\"button\" tabindex=\"0\" style=\"color:#F8F8F2;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><textarea class=\"code-block-pro-copy-button-textarea\" aria-hidden=\"true\" readonly># Assumes that there is a Writer client instance\n# stored in `my_client`. 
\n\ndef create_knowledge_graph(graph_name, client):\n    &quot;&quot;&quot;Creates a new knowledge graph and returns its id.&quot;&quot;&quot;\n    return client.graphs.create(name=graph_name).id\n\n# Create a new KG named &quot;My Knowledge Graph&quot;\n# and display its graph ID.\ngraph_id = create_knowledge_graph(&quot;My Knowledge Graph&quot;, my_client)\nprint(graph_id)<\/textarea><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki dracula\" style=\"background-color: #282A36\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #6272A4\"># Assumes that there is a Writer client instance<\/span><\/span>\n<span class=\"line\"><span style=\"color: #6272A4\"># stored in `my_client`. 
<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #FF79C6\">def<\/span><span style=\"color: #F8F8F2\"> <\/span><span style=\"color: #50FA7B\">create_knowledge_graph<\/span><span style=\"color: #F8F8F2\">(<\/span><span style=\"color: #FFB86C; font-style: italic\">graph_name<\/span><span style=\"color: #F8F8F2\">, <\/span><span style=\"color: #FFB86C; font-style: italic\">client<\/span><span style=\"color: #F8F8F2\">):<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">    <\/span><span style=\"color: #6272A4\">&quot;&quot;&quot;Creates a new knowledge graph and returns its id.&quot;&quot;&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">    <\/span><span style=\"color: #FF79C6\">return<\/span><span style=\"color: #F8F8F2\"> client.graphs.create(<\/span><span style=\"color: #FFB86C; font-style: italic\">name<\/span><span style=\"color: #FF79C6\">=<\/span><span style=\"color: #F8F8F2\">graph_name).id<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #6272A4\"># Create a new KG named &quot;My Knowledge Graph&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #6272A4\"># and display its graph ID.<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">graph_id <\/span><span style=\"color: #FF79C6\">=<\/span><span style=\"color: #F8F8F2\"> create_knowledge_graph(<\/span><span style=\"color: #E9F284\">&quot;<\/span><span style=\"color: #F1FA8C\">My Knowledge Graph<\/span><span style=\"color: #E9F284\">&quot;<\/span><span style=\"color: #F8F8F2\">, my_client)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #8BE9FD\">print<\/span><span style=\"color: #F8F8F2\">(graph_id)<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p data-styleid=\"style-jaugjzov0\"><strong>2. 
Upload files<\/strong>: Upload PDFs, spreadsheets, or text documents using the Writer SDK without worrying about compatibility or manual data parsing.<br><br>For example:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:flex;align-items:center;padding:10px 0px 10px 16px;margin-bottom:-2px;width:100%;text-align:left;background-color:#333545;color:#efefe1\">Python<\/span><span role=\"button\" tabindex=\"0\" style=\"color:#F8F8F2;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><textarea class=\"code-block-pro-copy-button-textarea\" aria-hidden=\"true\" readonly>import os\n\ndef upload_file(file_path, client):\n    &quot;&quot;&quot;\n    Uploads a single file (specified by pathname)\n    and returns its id.\n    &quot;&quot;&quot;\n    # Open and read the file&#39;s contents\n    with open(file_path, &#39;rb&#39;) as file_obj:\n        file_contents = file_obj.read()\n\n    # Upload the file\n    file = client.files.upload(\n        content=file_contents,\n        content_disposition=f&quot;attachment; filename={os.path.basename(file_path)}&quot;,\n        content_type=&quot;application\/octet-stream&quot;,\n    )\n\n    return file.id\n\ndef upload_files(file_paths, client):\n    &quot;&quot;&quot;\n    Uploads a list of files (specified by pathnames)\n    and returns a corresponding list of ids.\n    &quot;&quot;&quot;\n    file_ids = []\n    \n    for file_path in file_paths:\n        file_ids.append(upload_file(file_path, client))\n\n    return file_ids\n\n# Upload three files to Writer and get their file IDs.\nfiles = [\n    &quot;.\/files\/My Brochure.pdf&quot;,\n    &quot;.\/files\/Additional Notes.txt&quot;,\n    
&quot;.\/files\/Supplementary Data.csv&quot;,\n]        \nfile_ids = upload_files(files, my_client)<\/textarea><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki dracula\" style=\"background-color: #282A36\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #FF79C6\">import<\/span><span style=\"color: #F8F8F2\"> os<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #FF79C6\">def<\/span><span style=\"color: #F8F8F2\"> <\/span><span style=\"color: #50FA7B\">upload_file<\/span><span style=\"color: #F8F8F2\">(<\/span><span style=\"color: #FFB86C; font-style: italic\">file_path<\/span><span style=\"color: #F8F8F2\">, <\/span><span style=\"color: #FFB86C; font-style: italic\">client<\/span><span style=\"color: #F8F8F2\">):<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">    <\/span><span style=\"color: #6272A4\">&quot;&quot;&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #6272A4\">    Uploads a single file (specified by pathname)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #6272A4\">    and returns its id.<\/span><\/span>\n<span class=\"line\"><span style=\"color: #6272A4\">    &quot;&quot;&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">    <\/span><span style=\"color: #6272A4\"># Open and read the file&#39;s contents<\/span><\/span>\n<span 
class=\"line\"><span style=\"color: #F8F8F2\">    <\/span><span style=\"color: #FF79C6\">with<\/span><span style=\"color: #F8F8F2\"> <\/span><span style=\"color: #8BE9FD\">open<\/span><span style=\"color: #F8F8F2\">(file_path, <\/span><span style=\"color: #E9F284\">&#39;<\/span><span style=\"color: #F1FA8C\">rb<\/span><span style=\"color: #E9F284\">&#39;<\/span><span style=\"color: #F8F8F2\">) <\/span><span style=\"color: #FF79C6\">as<\/span><span style=\"color: #F8F8F2\"> file_obj:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">        file_contents <\/span><span style=\"color: #FF79C6\">=<\/span><span style=\"color: #F8F8F2\"> file_obj.read()<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">    <\/span><span style=\"color: #6272A4\"># Upload the file<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">    file <\/span><span style=\"color: #FF79C6\">=<\/span><span style=\"color: #F8F8F2\"> client.files.upload(<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">        <\/span><span style=\"color: #FFB86C; font-style: italic\">content<\/span><span style=\"color: #FF79C6\">=<\/span><span style=\"color: #F8F8F2\">file_contents,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">        <\/span><span style=\"color: #FFB86C; font-style: italic\">content_disposition<\/span><span style=\"color: #FF79C6\">=<\/span><span style=\"color: #FF79C6\">f<\/span><span style=\"color: #F1FA8C\">&quot;attachment; filename=<\/span><span style=\"color: #BD93F9\">{<\/span><span style=\"color: #F8F8F2\">os.path.basename(file_path)<\/span><span style=\"color: #BD93F9\">}<\/span><span style=\"color: #F1FA8C\">&quot;<\/span><span style=\"color: #F8F8F2\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">        <\/span><span style=\"color: #FFB86C; font-style: italic\">content_type<\/span><span style=\"color: #FF79C6\">=<\/span><span 
style=\"color: #E9F284\">&quot;<\/span><span style=\"color: #F1FA8C\">application\/octet-stream<\/span><span style=\"color: #E9F284\">&quot;<\/span><span style=\"color: #F8F8F2\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">    )<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">    <\/span><span style=\"color: #FF79C6\">return<\/span><span style=\"color: #F8F8F2\"> file.id<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #FF79C6\">def<\/span><span style=\"color: #F8F8F2\"> <\/span><span style=\"color: #50FA7B\">upload_files<\/span><span style=\"color: #F8F8F2\">(<\/span><span style=\"color: #FFB86C; font-style: italic\">file_paths<\/span><span style=\"color: #F8F8F2\">, <\/span><span style=\"color: #FFB86C; font-style: italic\">client<\/span><span style=\"color: #F8F8F2\">):<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">    <\/span><span style=\"color: #6272A4\">&quot;&quot;&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #6272A4\">    Uploads a list of files (specified by pathnames)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #6272A4\">    and returns a corresponding list of ids.<\/span><\/span>\n<span class=\"line\"><span style=\"color: #6272A4\">    &quot;&quot;&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">    file_ids <\/span><span style=\"color: #FF79C6\">=<\/span><span style=\"color: #F8F8F2\"> []<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">    <\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">    <\/span><span style=\"color: #FF79C6\">for<\/span><span style=\"color: #F8F8F2\"> file_path <\/span><span style=\"color: #FF79C6\">in<\/span><span style=\"color: #F8F8F2\"> file_paths:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">        file_ids.append(upload_file(file_path, 
client))<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">    <\/span><span style=\"color: #FF79C6\">return<\/span><span style=\"color: #F8F8F2\"> file_ids<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #6272A4\"># Upload three files to Writer and get their file IDs.<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">files <\/span><span style=\"color: #FF79C6\">=<\/span><span style=\"color: #F8F8F2\"> [<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">    <\/span><span style=\"color: #E9F284\">&quot;<\/span><span style=\"color: #F1FA8C\">.\/files\/My Brochure.pdf<\/span><span style=\"color: #E9F284\">&quot;<\/span><span style=\"color: #F8F8F2\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">    <\/span><span style=\"color: #E9F284\">&quot;<\/span><span style=\"color: #F1FA8C\">.\/files\/Additional Notes.txt<\/span><span style=\"color: #E9F284\">&quot;<\/span><span style=\"color: #F8F8F2\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">    <\/span><span style=\"color: #E9F284\">&quot;<\/span><span style=\"color: #F1FA8C\">.\/files\/Supplementary Data.csv<\/span><span style=\"color: #E9F284\">&quot;<\/span><span style=\"color: #F8F8F2\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">]        <\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">file_ids <\/span><span style=\"color: #FF79C6\">=<\/span><span style=\"color: #F8F8F2\"> upload_files(files, my_client)<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p data-styleid=\"style-x6mjze3wv\"><strong>3. 
Associate files with the graph<\/strong>: Link uploaded files to the graph for retrieval operations.<br><br>For example:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:flex;align-items:center;padding:10px 0px 10px 16px;margin-bottom:-2px;width:100%;text-align:left;background-color:#333545;color:#efefe1\">Python<\/span><span role=\"button\" tabindex=\"0\" style=\"color:#F8F8F2;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><textarea class=\"code-block-pro-copy-button-textarea\" aria-hidden=\"true\" readonly>def associate_files_with_graph(file_ids, graph_id, client):\n    &quot;&quot;&quot;Associates a list of files with a graph.&quot;&quot;&quot;\n    for file_id in file_ids:\n        client.graphs.add_file_to_graph(graph_id, file_id=file_id)\n\n# Associate the files uploaded in the previous example\n# with the graph created in the earlier example.\nassociate_files_with_graph(file_ids, graph_id, my_client)<\/textarea><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki dracula\" style=\"background-color: #282A36\" tabindex=\"0\"><code><span class=\"line\"><span 
style=\"color: #FF79C6\">def<\/span><span style=\"color: #F8F8F2\"> <\/span><span style=\"color: #50FA7B\">associate_files_with_graph<\/span><span style=\"color: #F8F8F2\">(<\/span><span style=\"color: #FFB86C; font-style: italic\">file_ids<\/span><span style=\"color: #F8F8F2\">, <\/span><span style=\"color: #FFB86C; font-style: italic\">graph_id<\/span><span style=\"color: #F8F8F2\">, <\/span><span style=\"color: #FFB86C; font-style: italic\">client<\/span><span style=\"color: #F8F8F2\">):<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">    <\/span><span style=\"color: #6272A4\">&quot;&quot;&quot;Associates a list of files with a graph.&quot;&quot;&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">    <\/span><span style=\"color: #FF79C6\">for<\/span><span style=\"color: #F8F8F2\"> file_id <\/span><span style=\"color: #FF79C6\">in<\/span><span style=\"color: #F8F8F2\"> file_ids:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">        client.graphs.add_file_to_graph(graph_id, <\/span><span style=\"color: #FFB86C; font-style: italic\">file_id<\/span><span style=\"color: #FF79C6\">=<\/span><span style=\"color: #F8F8F2\">file_id)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #6272A4\"># Associate the files uploaded in the previous example<\/span><\/span>\n<span class=\"line\"><span style=\"color: #6272A4\"># with the graph created in the earlier example.<\/span><\/span>\n<span class=\"line\"><span style=\"color: #F8F8F2\">associate_files_with_graph(file_ids, graph_id, my_client)<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p data-styleid=\"style-yvk8t8dxd\"><strong>4. Integration with no-code AI apps<\/strong>: Once the Knowledge Graph is created and files are uploaded, you can use it in no-code apps built with AI Studio. 
These apps can also be embedded in Writer Framework apps.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-real-world-applications-of-graph-based-rag\" data-styleid=\"style-yusmqxodc\"><strong>Real-world applications of graph-based RAG<\/strong><\/h2>\n\n\n\n<p data-styleid=\"style-h30uaflfl\">Here\u2019s how graph-based retrieval can be applied to real-world scenarios:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Customer support systems<\/strong><br>Quickly resolve customer queries by linking historical tickets, product details, and policies. This speeds up response times and improves accuracy.<br><a href=\"https:\/\/writer.com\/blog\/customer-support-generative-ai-use-cases\/\">Read more about customer support systems here.<\/a><\/li>\n\n\n\n<li><strong>Sales enablement<\/strong><br>Equip sales teams with instant access to market data, product insights, and competitor analysis, improving decision-making and boosting sales velocity.<br><a href=\"https:\/\/writer.com\/engineering\/app-graph-based-rag\/\">Learn more about sales enablement here.<\/a><\/li>\n\n\n\n<li><strong>Financial advisors<\/strong><br>Make financial trend analysis and client personalization easier by using Knowledge Graphs to retrieve relevant financial insights.<br><a href=\"https:\/\/writer.com\/engineering\/financial-app-writer-framework-palmyra-fin\/\">Explore financial advising here.<\/a><\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-simplify-your-rag-setup-with-writer-knowledge-graph\" data-styleid=\"style-goz928m12\"><strong>Simplify your RAG setup with Writer Knowledge Graph<\/strong><\/h2>\n\n\n\n<p data-styleid=\"style-yglo0oh7u\">Smarter retrieval, like graph-based RAG, isn\u2019t just about finding the closest match; it\u2019s about finding the <em>right answer<\/em>. 
With the <a href=\"https:\/\/writer.com\/product\/graph-based-rag\/\">Writer Knowledge Graph<\/a>, you can skip the complexity of building a system from scratch and start building truly intelligent enterprise AI.<\/p>\n\n\n\n<p data-styleid=\"style-qpmdof5n8\">Ready to dive in? Start with our <a href=\"https:\/\/dev.writer.com\/api-guides\/introduction\">Knowledge Graph API guide<\/a> and see how you can implement smarter retrieval. Let us know what you build! \ud83d\ude80<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Discover the limitations of vector retrieval and how the Writer Knowledge Graph offers a scalable, accurate solution for complex enterprise data needs.<\/p>\n","protected":false},"author":1,"featured_media":47581,"comment_status":"closed","ping_status":"closed","template":"","meta":{"content-type":"","inline_featured_image":false,"illustrator_name":"","dc_display_publish_date":true,"footnotes":""},"eng_post_category":[117],"eng_post_tag":[],"class_list":["post-47394","eng_post","type-eng_post","status-publish","has-post-thumbnail","hentry","eng_post_category-thought-leadership"],"jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/writer.com\/wp-json\/wp\/v2\/eng_post\/47394","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/writer.com\/wp-json\/wp\/v2\/eng_post"}],"about":[{"href":"https:\/\/writer.com\/wp-json\/wp\/v2\/types\/eng_post"}],"author":[{"embeddable":true,"href":"https:\/\/writer.com\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/writer.com\/wp-json\/wp\/v2\/comments?post=47394"}],"version-history":[{"count":35,"href":"https:\/\/writer.com\/wp-json\/wp\/v2\/eng_post\/47394\/revisions"}],"predecessor-version":[{"id":56887,"href":"https:\/\/writer.com\/wp-json\/wp\/v2\/eng_post\/47394\/revisions\/56887"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/writer.com\/wp-json\/wp\/v2\/media\/47581"}],"wp:attachment":[{"href":"https:\/\/writer.com\/wp-json\/wp\/v2\/med
ia?parent=47394"}],"wp:term":[{"taxonomy":"eng_post_category","embeddable":true,"href":"https:\/\/writer.com\/wp-json\/wp\/v2\/eng_post_category?post=47394"},{"taxonomy":"eng_post_tag","embeddable":true,"href":"https:\/\/writer.com\/wp-json\/wp\/v2\/eng_post_tag?post=47394"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}