An Agent Method Baseline for Spider 2.0 based on Docker environment.
# Clone the Spider 2.0 repository
git clone https://github.com/xlang-ai/Spider2.git
cd methods/spider-agent-dbt
# Optional: Create a Conda environment for Spider 2.0
# conda create -n spider2 python=3.11
# conda activate spider2
# Install required dependencies
pip install -r requirements.txt
Follow the instructions in the Docker setup guide to install Docker on your machine.
cd ../../spider2-dbt
or
cd ./spider2-dbt
gdown 'https://drive.google.com/uc?id=1N3f7BSWC4foj-V-1C9n8M2XmgV7FOcqL'
gdown 'https://drive.google.com/uc?id=1s0USV_iQLo4oe05QqAMnhGGp5jeejCzp'
python setup.py
export AZURE_API_KEY=your_azure_api_key
export AZURE_ENDPOINT=your_azure_endpoint
export OPENAI_API_KEY=your_openai_api_key
export GEMINI_API_KEY=your_genmini_api_key
cd ../../methods/spider-agent-dbt
python run.py -s <The name of this experiment>
python run.py --model gpt-4o -s test1python run.py --model gpt-4o -s experiment_name
python run.py --model gpt-4o -s experiment_nameReorganize run results into a standard submission format, here we store the answer directly into the evaluation suite
python get_spider2_submission_data.py --experiment_suffix <The name of this experiment> --results_folder_name <Standard Submission Folders>
python get_spider2_submission_data.py --experiment_suffix gpt-4o-test1 --results_folder_name ../../spider2-dbt/evaluation_suite/gpt-4o-test1You can run evaluate.py in the evaluation suite folder of Spider 2.0 to get the evaluation results.
Bash: Executes shell commands, such as checking file information, running code, or executing DBT commands.CREATE_FILE: Creates a new file with specified content.EDIT_FILE: Edits or overwrites the content of an existing file.BIGQUERY_EXEC_SQL: Executes a SQL query on BigQuery, with an option to save the results.BQ_GET_TABLES: Retrieves all table names and schemas from a specified BigQuery dataset.BQ_GET_TABLE_INFO: Retrieves detailed column information for a specific table in BigQuery.BQ_SAMPLE_ROWS: Samples a specified number of rows from a BigQuery table and saves them as JSON.Terminate: Marks the completion of the task, returning the final result or file path.