
DeepIsaHOL

This is the repository of the DeepIsaHOL project. It was supported by the DeepIsaHOL MSCA Fellowship (number: 101102608) titled Reinforcement learning to improve proof-automation in theorem proving. The project's long-term objective is to use the Isabelle proof assistant as a reinforcement learning (RL) environment for training RL algorithms that ultimately serve as tools for Isabelle users.

The project currently offers:

  1. proof-data retrieval capabilities from Isabelle libraries,
  2. a read-eval-print-loop (REPL) interface for Isabelle in Scala and Python,
  3. Python training loops for Google's T5 models (using Hugging Face) and Gemma models (using Unsloth),
  4. a simple depth-first search (dfs) algorithm for evaluating the models' ability to automatically prove Isabelle theorems,
  5. simple Python-server and Isabelle/ML-client structures for calling the models from Isabelle,
  6. Isabelle commands to interact with locally hosted models; currently, there is support for ChatGPT, Gemini, Ollama and Hugging Face transformers,
  7. (optional) an application of some libraries from this repository to the proof tools in super_sketch_and_super_fix, which produce a proof sketch of a stated lemma and automatically repair proof scripts broken by continuous development in the formal verification process.

Additionally, generated data examples, weights of the T5 models trained with this repository, and logs and examples of the proofs generated by the models are available in this Zenodo repository from March 2025.


Installation

Depending on your usage, you will want to install different dependencies, but all processes require the Isabelle2025 proof assistant. We provide the installer script install_isabelle.sh for it, which also installs the Archive of Formal Proofs (AFP) in case you want to work with AFP entries or your own .thy files. For the tasks above, we provide specific instructions:

  1. Clone this repository.
  2. LLMs in Isabelle: You will need a Python environment with the necessary Python packages installed. The fastest way to install them is via our installer install_py_env.sh, which installs the Python packages for ChatGPT, Ollama, Gemini and HuggingFace models. If using ChatGPT or Gemini, remember to set your API keys. This YouTube video shows the environment installation process, the configuration of the models, and the command to call them.
  3. Data extraction: For this you need:
  • The sbt build tool for Java and Scala.
  • Dominique Unruh's scala-isabelle library updated for Isabelle2025 (see "Scala level" below). The script install_scala_deps.sh installs and compiles these. After you have the dependencies, you will also need to adapt this project's directories.scala: provide the full path to this project in isabelle_rl, and the location of your Isabelle application in isabelle_app.
  4. Interaction with Isabelle in Python: For this you need:
  • The sbt build tool for Java and Scala.
  • Dominique Unruh's scala-isabelle updated library for Isabelle2025 (see "Scala level" below).
  • A Python environment with the necessary Python packages installed. For connecting Python to Isabelle, this repository uses the Py4J library (see "Python level" below). See the previous installation instructions for specific installers.
  5. Reproducing the training and evaluation of models: This requires all the dependencies listed in the previous points. Before using install_py_env.sh, change the .yml file in the script to one suitable for your machine; these files are located in /src/main/python/. For training and evaluating models, the project uses Hugging Face's Transformers and Accelerate libraries and Unsloth's libraries, all of which depend on PyTorch.
  6. If you wish to extract data from a specific AFP entry or work with your own .thy files, the project works best when you have specified an Isabelle session and made Isabelle aware of its location. That is, ensure that the entry's directory has a ROOT file along these lines:
session logic_name (AFP) = "parent_logic" +
  options [timeout = 1800]
  theories (* these correspond to .thy files in the entry's directory *)
    theory_file_name1 
    theory_file_name2
  document_files
    "root.tex"
    "root.bib"

Data extraction

This project's main.scala extracts data from a directory with Isabelle files.

One can call this functionality via sbt from this repository's top directory:

sbt "run /path/to/a/read/directory/ /path/to/a/writing/directory/"

Alternatively, you can use writer.scala or writer.py to extract data interactively via their corresponding REPLs.

Scala:

scala> import isabelle_rl._

scala> import scala.sys.process._

// For this example, let us extract data from the Isabelle session on linear temporal logic (LTL).
scala> val logic = "LTL"

scala> val data_extractor = new Writer("/path/to/your/afp/thys/LTL", "/path/to/your/data/output/dir/", logic)
/* (replies with) 
Imports: Adding local /path/to/your/afp/thys/LTL/example/Example.thy
Imports: Adding local /path/to/your/afp/thys/LTL/example/Rewriting.thy
...
Initialised minion for directory: /path/to/your/afp/thys/LTL
*/

// Extract data from the LTL theory file Equivalence_Relations.thy
scala> data_extractor.write_data("Equivalence_Relations.thy")

// Check that data is extracted
scala> "ls /path/to/your/data/output/dir/Equivalence_Relations".!
/* (replies with) 
proof0.json
proof1.json
proof10.json
proof11.json
...
*/

// Alternatively, extract data from all .thy files in the directory
scala> data_extractor.write_all()
/* (replies with)
Creating proofs for /path/to/your/afp/thys/LTL/LTL.thy
Creating proofs for /path/to/your/afp/thys/LTL/Equivalence_Relations.thy
...
*/

// Close the Isabelle process
scala> data_extractor.shutdown()

Python:

Enable a Scala port in a command-line window via sbt "runMain isabelle_rl.Py4j_Gateway_Main" (from this repository's top directory), then:

>>> import sys
>>> sys.path.append('/path/to/this/project/src/main/python')
>>> from writer import Writer

>>> data_extractor = Writer('/path/to/your/read/directory/with/isabelle/files/', '/path/to/your/data/output/dir', 'the_logic_name_in_the_root_file')

>>> data_extractor.write_data("Equivalence_Relations") # same interface in Python as in Scala

>>> data_extractor.write_all() # you can see the process working in the Scala window

>>> data_extractor.shutdown_gateway() # close the Isabelle and Scala processes

In case of connection issues, you can manage your ports in the (automatically created) ports.json at the top directory of this repository.
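As the `ls` output above shows, the extracted data lands as one JSON file per proof inside a subdirectory named after the theory. Below is a minimal sketch of how you might walk such an output directory from Python; the helper names and the per-file schema are illustrative assumptions, not the project's API — see the proofs package in src/main/python for the actual accessors:

```python
import json
from pathlib import Path

def count_proofs(data_dir):
    """Map each theory subdirectory to its number of extracted proof JSONs."""
    return {
        theory.name: len(list(theory.glob("proof*.json")))
        for theory in sorted(Path(data_dir).iterdir())
        if theory.is_dir()
    }

def load_proofs(theory_dir):
    """Load all proof JSONs of one theory, in file-name order."""
    return [
        json.loads(path.read_text())
        for path in sorted(Path(theory_dir).glob("proof*.json"))
    ]
```

For statistics over the generated proof-JSONs, prefer the project's own data_dir.py and data_stats.py modules.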

Data interaction

You can see the REPL's methods in this project's /src/main/scala/repl.scala. For an example of how these methods are used, check the script /src/main/python/dfs.py, which uses T5 models to interact with Isabelle via the Python REPL. The functionality is analogous to that of the Writer object above.
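The search in dfs.py can be pictured independently of any model: apply a candidate step, recurse on the resulting state, and backtrack when a branch fails. The toy sketch below rests on assumed conventions (a prover whose apply returns the new state string or None on failure, and a suggest function standing in for the model's ranked candidates) — it is not the actual dfs.py interface:

```python
def dfs_prove(prover, suggest, state, depth=0, max_depth=5):
    """Depth-first search for a proof: try suggested steps, backtrack on failure.

    prover.apply(state, step) -> new proof-state string, or None if the step
    fails. A finished proof is detected via "No subgoals" in the state.
    Returns the list of successful steps, or None if no proof was found.
    """
    if "No subgoals" in state:
        return []  # proof found: no further steps needed
    if depth >= max_depth:
        return None  # give up on this branch
    for step in suggest(state):
        new_state = prover.apply(state, step)
        if new_state is None:
            continue  # step failed in the prover; try the next suggestion
        rest = dfs_prove(prover, suggest, new_state, depth + 1, max_depth)
        if rest is not None:
            return [step] + rest  # the branch below closed the goal
    return None  # every suggestion failed at this depth
```

In the real script, suggest is a trained T5 model proposing proof steps for the current state and the prover is the Python REPL shown below.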

Scala:

scala> import isabelle_rl._

scala> val repl = new REPL("HOL") // see Isabelle's documentation for more logic options
REPL started!
val repl: isabelle_rl.REPL = isabelle_rl.REPL@7274a164

scala> repl.apply("lemma \"\\<forall>x. P x \\<Longrightarrow> P c\"")
val res0: String = proof (prove)  goal (1 subgoal):   1. \<forall>x. P x \<Longrightarrow> P c

scala> repl.apply("apply blast")
val res3: String = proof (prove)  goal:  No subgoals!

scala> repl.apply("done")
val res4: String = ""

scala> repl.shutdown_isabelle() // close the Isabelle process

Python:

Again, start a Scala port in a command-line window via sbt "runMain isabelle_rl.Py4j_Gateway_Main" (from this repository's top directory), then run a Python session:

>>> import sys
>>> sys.path.append('/path/to/this/project/src/main/python')
>>> from repl import REPL

>>> repl = REPL("HOL")
REPL and minion initialized.

>>> repl.apply("lemma \"\\<forall>x. P x \\<Longrightarrow> P c\"")
'proof (prove)  goal (1 subgoal):   1. \\<forall>x. P x \\<Longrightarrow> P c'

>>> repl.apply("apply simp")
'proof (prove)  goal:  No subgoals!'

>>> repl.apply("done")
''

>>> repl.shutdown_gateway() # do not forget to close the Isabelle process
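Since shutdown_gateway should run even when a proof step raises an exception, it can be convenient to wrap the REPL in a context manager. This is a sketch assuming only the constructor and the shutdown_gateway method shown above, not a utility the project ships:

```python
from contextlib import contextmanager

@contextmanager
def managed_repl(repl_factory, logic="HOL"):
    """Yield a REPL and guarantee its gateway is shut down afterwards."""
    repl = repl_factory(logic)
    try:
        yield repl
    finally:
        repl.shutdown_gateway()  # runs even if a command in the body raises
```

Used as `with managed_repl(REPL, "HOL") as repl: ...`, the Isabelle and Scala processes are closed however the block exits.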

Implementation details

How does it work?

The project has three levels: Isabelle, Scala and Python. They are linearly ordered, with Isabelle as the lowest level and Python as the highest. The Isabelle proof assistant is in charge of retrieving all data from its sources. The other two levels interact with functions, structures and/or methods from the level immediately below; each level satisfies a need that the level below cannot. Scala enables programmatic interaction with Isabelle while also having more features than the Isabelle/ML programming language. Python enables interaction with popular machine learning, deep learning, and reinforcement learning libraries.

Isabelle level

You can use Isabelle's jEdit prover IDE (PIDE) to experiment with this project's ML libraries. For instance, this project's file get.ML includes various functions for retrieving specific data from the proof assistant. You can open a .thy file (e.g. Temp.thy) with Isabelle, apply get.ML functions to arguments inside Temp.thy, and see the results in the PIDE's output panel:

theory "Temp"
imports "Complex_Main"

begin

ML_file "~/path/to/this/project/src/main/ml/pred.ML"
ML_file "~/path/to/this/project/src/main/ml/ops.ML"
ML_file "~/path/to/this/project/src/main/ml/get.ML"

ML ‹Get.thms \<^context> "list_all2_eq"›

ML ‹Get.grouped_commands \<^theory>›

(* Returns the pairs of names and associated theorem lists proved up to this point, where the name contains the word "List" and the associated list has only one theorem *)
ML ‹val list_thms = Get.filtered_thms [Pred.on_fst (Get.passes_fact_check \<^context>), Pred.neg (Pred.on_snd Pred.has_many), Pred.on_fst (Pred.contains "List")] \<^context>›

end

Scala level

This level is handled by scala-isabelle. The project provides a minion.scala that receives a working directory and manages Isabelle's theory stack via our imports.scala graph. Then, writer.scala asks this minion to call the writer.ML function extract_jsons to produce a string of JSON objects with proof data from a .thy file. Alternatively, the writer's write_all uses the minion to retrieve the data and write it to a directory of your choice. Similarly, repl.scala delegates its functions to minion.scala, which has access to a repl_state.ML model from this project's sources.

Python level

Finally, the project provides Python classes in writer.py and repl.py with the same functionality as their Scala analogs (writer.scala and repl.scala). In the background, these classes simply ask their Scala counterparts to execute the requested commands via Py4J.
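That delegation is thin enough to fit in a few lines. The sketch below shows the pattern with a hypothetical entry-point object — get_repl and the method names are illustrative assumptions, not the project's real Py4J interface (see repl.py for that):

```python
class ScalaBackedREPL:
    """Python facade that forwards every call to a Scala object.

    `entry_point` stands for the object Py4J exposes from the running JVM
    gateway; any object with the same method names works (e.g. a test stub).
    """

    def __init__(self, entry_point, logic):
        # the Scala side starts Isabelle with the requested logic
        self._scala = entry_point.get_repl(logic)  # hypothetical factory name

    def apply(self, command):
        # no proof logic on the Python side: the Scala counterpart does the work
        return self._scala.apply(command)
```

Because the facade holds no state of its own, the Python classes stay trivially in sync with whatever the Scala level implements.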

Repository structure

Warning: The description below might be outdated. It is here to aid navigation for new library users.

.
├── build.sbt
├── CITATION.cff
├── README.md
├── deepisahol_2025.pdf
├── install_isabelle.sh
├── install_py_env.sh
├── install_scala_deps.sh
└── src                           # sources for the above-listed implementations
    ├── main
    │   ├── ml
    │   │   ├── Libraries.thy     # Theory importing this project's ML libraries (works from Isabelle/Pure)
    │   │   ├── Isabelle_RL.thy   # Theory importing this project's ML libraries (works from Isabelle/HOL)
    │   │   ├── ROOT              # Corresponding ROOT file for this project
    │   │   ├── actions.ML        # Isabelle's top-level representation of user-actions
    │   │   ├── client.ML
    │   │   ├── data.ML           # definition of Data.T grouping retrieved data
    │   │   ├── get.ML            # methods for retrieving Isabelle data
    │   │   ├── hammer.ML         # libraries to call Sledgehammer programmatically
    │   │   ├── imports.ML
    │   │   ├── json.ML           # model for JSON files
    │   │   ├── llm.ML            # libraries interacting with models run in server.py
    │   │   ├── ops.ML            # frequently used operations
    │   │   ├── pred.ML           # frequently used operations using predicates
    │   │   ├── print.ML          # data displaying methods
    │   │   ├── repl_state.ML     # model for the REPLs' state
    │   │   ├── sections.ML       # methods for identifying sections of a .thy file
    │   │   ├── seps.ML           # .thy file section-delimiter methods based on keywords
    │   │   └── writer.ML         # methods for writing the JSON objects
    │   ├── python
    │   │   ├── environment_cuda.yml  # lists python packages that rely on CUDA
    │   │   ├── environment.yml       # lists python packages for training without CUDA
    │   │   ├── isa_generation.yml    # lists python packages for calling genAI models from Isabelle
    │   │   ├── config_ops.py     # operations on the configuration file for model training
    │   │   ├── dfs.py            # depth-first-search loop for automated theorem proving
    │   │   ├── dicts.py          # operations manipulating Python dictionaries
    │   │   ├── distrib.py        # operations for distributed model training and evaluation
    │   │   ├── eval_t5.py        # evaluation loops for the LLMs
    │   │   ├── generation_ops.py # frequently used model generation functions
    │   │   ├── llm_server.py     # a server hosting a genAI model
    │   │   ├── ops.py            # frequently used operations
    │   │   ├── proofs
    │   │   │   ├── __init__.py   # methods for data-retrieval from the generated proof-JSONs
    │   │   │   ├── data_dir.py   # methods on the generated proof-JSONs directory
    │   │   │   ├── data_stats.py # statistics from the generated proof-JSONs
    │   │   │   └── str_ops.py    # string operations on the generated proof-JSONs
    │   │   ├── repl.py           # implementation of the Python REPL
    │   │   ├── save_ops.py       # methods for saving tokenizers, datasets, and models
    │   │   ├── tokenizer_ops.py  # frequently used tokenizer operations
    │   │   ├── train_t5.py       # training loop for the T5 models
    │   │   ├── train_gemma.py    # training loop for the Gemma models with HuggingFace
    │   │   ├── train_unsloth_gemma.py # loop for training the Gemma models with Unsloth
    │   │   └── writer.py         # interface with the data-extraction algorithm
    │   └── scala
    │       ├── directories.scala # register of important directories for this repository
    │       ├── graph.scala       # a generic graph model
    │       ├── imports.scala     # graph for managing Isabelle sessions
    │       ├── main.scala        # user-entry to the data-extraction algorithm
    │       ├── minion.scala      # module in charge of managing Isabelle from Scala
    │       ├── py4j_gateway.scala # the interface between Scala and Python
    │       ├── repl.scala        # implementation of the Scala REPL
    │       ├── utils.scala       # frequently used operations
    │       └── writer.scala      # implementation of the data-extraction algorithm
    └── tests                     # testing files for the methods in the main directory
        ├── ml
        │   └── Get.thy
        └── python
            ├── count_samples.py
            ├── test_dataloading.py
            ├── test_proofs.py
            └── test_repl.py
