🕹 Quick Start

1. Getting Started

Environment Setup

Backend Environment Setup

First, ensure your machine has Python 3.8 - 3.10 installed.

$ python --version
Python 3.10.12
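
The version requirement can also be checked from Python itself. The following is a minimal sketch; the helper name `is_supported` is illustrative and not part of ChatBase, and the range is taken from the sentence above.

```python
import sys

# Supported range taken from the text above: Python 3.8 through 3.10.
MIN_VERSION = (3, 8)
MAX_VERSION = (3, 10)


def is_supported(version=None):
    """Return True if the (major, minor) version lies in the supported range.

    `version` is any tuple starting with (major, minor); it defaults to the
    running interpreter. This helper is hypothetical, not a ChatBase API.
    """
    major, minor = (version or sys.version_info)[:2]
    return MIN_VERSION <= (major, minor) <= MAX_VERSION


# Report the status of the interpreter running this snippet.
print("supported" if is_supported() else "unsupported")
```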

Next, create a virtual environment and install the project dependencies within it.

# Clone the repository
$ git clone https://github.com/zhouxh19/ChatBase.git

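# Enter the directory
$ cd ChatBase

# Create and activate a virtual environment (shown here with the standard venv module on Linux/macOS)
$ python -m venv .venv
$ source .venv/bin/activate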

# Install all dependencies
$ pip3 install -r requirements.txt

# Or, to run only the API service, install the API dependencies
$ pip3 install -r requirements_api.txt

# The default dependencies cover the basic runtime environment, including the Chroma-DB vector store.
# To use a different vector store, uncomment the corresponding dependencies in requirements.txt before installing.

Frontend Service Setup

First, ensure your machine has Node (>= 18.15.0) installed.

$ node -v
v18.15.0
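
The Node requirement can be verified the same way by parsing the output of `node -v`. This is a sketch under the assumption that the output has the form `vMAJOR.MINOR.PATCH`, as in the example above; the function names are hypothetical.

```python
import shutil
import subprocess

# Minimum version taken from the text above.
MIN_NODE = (18, 15, 0)


def parse_node_version(text):
    """Turn a string such as 'v18.15.0' into a tuple of ints, e.g. (18, 15, 0)."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def node_is_new_enough(text, minimum=MIN_NODE):
    """Return True if the reported version is at least `minimum`."""
    return parse_node_version(text) >= minimum


def installed_node_version():
    """Return the output of `node -v`, or None if node is not on PATH."""
    if shutil.which("node") is None:
        return None
    result = subprocess.run(["node", "-v"], capture_output=True, text=True, check=True)
    return result.stdout
```

Calling `node_is_new_enough(installed_node_version())` after confirming the result is not `None` gives a yes/no answer before running `pnpm install`.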

Next, install the project dependencies.

$ cd webui
# pnpm documentation: https://pnpm.io/zh/motivation
# Install dependencies (pnpm is recommended)
# pnpm itself can be installed with "npm -g i pnpm"
$ pnpm install

Download the Embedding Model from Hugging Face

To download the model, you need to install Git LFS first, then run:

$ git lfs install
$ git clone https://huggingface.co/moka-ai/m3e-base

Point the model settings in model_config.py at the download path, for example:

EMBEDDING_MODEL = "m3e-base"
LLM_MODELS = ["Qwen-1_8B-Chat"]
MODEL_PATH = {
    "embed_model": {
        "m3e-base": "m3e-base", # Download path of embedding model.
    },

    "llm_model": {
        "Qwen-1_8B-Chat": "Qwen-1_8B-Chat", # Download path of LLM.
    },
}
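
A common slip is to name a model in `EMBEDDING_MODEL` or `LLM_MODELS` without adding a matching entry to `MODEL_PATH`. The sketch below checks for that; `missing_model_entries` is a hypothetical helper written for illustration, and the values are copied from the example above.

```python
EMBEDDING_MODEL = "m3e-base"
LLM_MODELS = ["Qwen-1_8B-Chat"]
MODEL_PATH = {
    "embed_model": {"m3e-base": "m3e-base"},
    "llm_model": {"Qwen-1_8B-Chat": "Qwen-1_8B-Chat"},
}


def missing_model_entries(embedding_model, llm_models, model_path):
    """Return the configured model names that have no entry in model_path.

    Hypothetical helper: it only compares names and does not check that the
    paths exist on disk.
    """
    missing = []
    if embedding_model not in model_path.get("embed_model", {}):
        missing.append(embedding_model)
    for name in llm_models:
        if name not in model_path.get("llm_model", {}):
            missing.append(name)
    return missing


# An empty list means every configured model has a path entry.
print(missing_model_entries(EMBEDDING_MODEL, LLM_MODELS, MODEL_PATH))
```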

Modify Configuration Files

Copy the configuration files and check each file's comments to modify them according to your needs.

$ python copy_config_example.py
# The generated configuration files are placed in the configs/ directory:
# basic_config.py   Basic configuration; no changes are needed.
# kb_config.py      Knowledge base configuration. Set DEFAULT_VS_TYPE to choose the vector store that backs the knowledge base; the related paths can also be changed.
# model_config.py   Model configuration. Set LLM_MODELS to choose the models used. These settings currently apply mainly to knowledge base search; the diagnosis-related models are still partly hard-coded and will be unified here later.
# prompt_config.py  Prompt configuration, mainly the prompts for LLM dialogue and knowledge base dialogue.
# server_config.py  Service configuration, such as the service port numbers.
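
To confirm that the copy step produced every file, a short check along the following lines may help. It is a sketch only: the file names come from the comments above, the `configs` default assumes the script is run from the repository root, and `missing_config_files` is a hypothetical helper.

```python
from pathlib import Path

# File names as listed in the comments above.
EXPECTED_CONFIGS = (
    "basic_config.py",
    "kb_config.py",
    "model_config.py",
    "prompt_config.py",
    "server_config.py",
)


def missing_config_files(config_dir="configs"):
    """Return the expected configuration files that are absent from config_dir."""
    base = Path(config_dir)
    return [name for name in EXPECTED_CONFIGS if not (base / name).is_file()]


# Prints the missing file names, or a confirmation when none are missing.
print(missing_config_files() or "all configuration files present")
```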

!!! Note: Modify the following settings before initializing the knowledge base; otherwise database initialization may fail.

model_config.py

# EMBEDDING_MODEL   Embedding (vectorization) model. If you choose a local model, download it to the project root directory.
# LLM_MODELS        LLMs to use. If you choose a local model, download it to the project root directory.
# ONLINE_LLM_MODEL  Settings for online models. Edit these if you use an online model.

server_config.py

# WEBUI_SERVER.api_base_url   Check this parameter and update it if you deploy the project on a server.

Initialize the Knowledge Base

Once the configuration files have been copied and edited, initialize the knowledge base as follows:

$ python init_database.py --recreate-vs

One-Click Start

Start the project with the following command:

$ python startup.py -a
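
The initialization and startup commands can be chained in a small launcher. The sketch below is an assumption about how one might script them, not part of the project: `launch_commands` and `launch` are hypothetical names, the script names and flags are the ones shown in the steps above, and the launcher is expected to run from the repository root.

```python
import subprocess
import sys


def launch_commands(recreate_vs=True):
    """Build the command lines for initialization and startup, in order.

    Set recreate_vs=False to skip rebuilding the knowledge base and only
    start the services.
    """
    py = sys.executable  # reuse the interpreter of the active environment
    commands = []
    if recreate_vs:
        commands.append([py, "init_database.py", "--recreate-vs"])
    commands.append([py, "startup.py", "-a"])
    return commands


def launch(recreate_vs=True):
    """Run each command in turn and stop at the first failure."""
    for command in launch_commands(recreate_vs):
        subprocess.run(command, check=True)
```

Because `check=True` raises on a non-zero exit code, a failed initialization prevents the startup step from running.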

Example of the startup interface

If it starts successfully, the web UI provides the following pages:

RAG Dialogue Page

Database Dialogue Page
Database Dialogue Start Page
Database Dialogue History Page
Multi-file Linked Dialogue Page

Knowledge Base Page

Knowledge Base Management Page
Knowledge Base Details Page

⏱ Todo

Data-Driven workflow orchestration
ES Service

📒 Citation

Feel free to cite us if you like this project.

@misc{zhao2024llmdbdemo,
      title={Chat2Data: An Interactive Data Analysis System with RAG, Vector Databases and LLMs}, 
      author={Xinyang Zhao and Xuanhe Zhou and Guoliang Li},
      year={2024},
      journal={Proc. {VLDB} Endow.},
}

