
Suggestions for implementing a composition-based optimization (i.e. fractional portion of ingredients) #727

@sgbaird

For starters, I'm fairly new to Ax: my experience is limited to running the Loop tutorial once and reading through some of the documentation, such as the page on parameter types. I do have some familiarity with Bayesian optimization.

The actual use-case is slightly different and more complicated, but I think the following is a suitable toy example. I go over the problem statement, some setup code, and possible solutions. Would love to hear some feedback.

Problem Statement

Take a composite material with the following ingredients, listed as class: ingredient (identifier):

  • Filler: Colloidal Silica (filler_A)
  • Filler: Milled Glass Fiber (filler_B)
  • Resin: Polyurethane (resin_A)
  • Resin: Silicone (resin_B)
  • Resin: Epoxy (resin_C)

Suppose we have toy training data: formulas made of various combinations and numbers of fillers and resins, the fractional prevalence of each component, and the corresponding objective (strength). Suppose we also have some model that takes arbitrary input parameters and predicts that objective, which we wish to maximize.

For constraints, I'm thinking:

  • limit the total number of components in any given "formula" (e.g. max of 3 components)
  • naturally, that the compositions sum to 1 (or that abs(1-sum(composition)) <= tol)
  • there has to be at least one filler and at least one resin (if feasible)
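The three constraints above can be written as a plain feasibility check. This is only a sketch: `FILLERS`, `RESINS`, and `is_feasible` are hypothetical names introduced for illustration, and a composition is assumed to be a dict mapping ingredient name to fraction.

```python
# Hypothetical helper: checks the three constraints from the list above.
FILLERS = {"filler_A", "filler_B"}
RESINS = {"resin_A", "resin_B", "resin_C"}


def is_feasible(composition, max_components=3, tol=1e-6):
    """Return True if a {ingredient: fraction} dict satisfies all constraints."""
    # Fractions must be non-negative.
    if any(frac < 0.0 for frac in composition.values()):
        return False
    # Ingredients that are actually present (fraction above the tolerance).
    present = {name for name, frac in composition.items() if frac > tol}
    # Constraint 1: limit on the number of components in a formula.
    if len(present) > max_components:
        return False
    # Constraint 2: fractions sum to 1 within the tolerance.
    if abs(1.0 - sum(composition.values())) > tol:
        return False
    # Constraint 3: at least one filler and at least one resin.
    return bool(present & FILLERS) and bool(present & RESINS)
```

A check like this could be used to filter candidate formulas before they are sent to an experiment, independently of how the optimizer itself enforces the constraints.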

Setup Code

To make it more concrete, it might look like the following:

choices = ["filler_A", "filler_B", "resin_A", "resin_B", "resin_C", "dummy"]

data = [
        [["filler_A", "filler_B", "resin_C"], [0.4, 0.4, 0.2]],
        [["filler_A", "resin_A", "resin_B"], [0.6, 0.2, 0.2]],
        [["filler_A", "filler_B", "resin_B"], [0.5, 0.3, 0.2]],
        [["filler_A", "resin_B", "dummy"], [0.5, 0.5, 0.0]],
        [["filler_B", "resin_C", "dummy"], [0.6, 0.4, 0.0]],
        [["filler_A", "filler_B", "resin_A"], [0.2, 0.2, 0.6]],
        [["filler_B", "resin_A", "resin_B"], [0.6, 0.2, 0.2]],
        ] # made-up data

def predict(objects, composition):
    ...
    return obj

Possible Solutions

One-hot-like prevalence encoding and components/composition

One-hot-like prevalence encoding

I've thought about trying to do a sort of "one-hot encoding" (assuming I'm using this term correctly), such that each component gets its own composition as a variable:

filler_A  filler_B  resin_A  resin_B  resin_C
0.4       0.4       --       --       0.2
0.6       --        0.2      0.2      --
0.5       0.3       --       0.2      --
0.5       --        --       0.5      --
--        0.6       --       --       0.4
0.2       0.2       0.6      --       --
--        0.6       0.2      0.2      --
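The table can be produced from rows of `data` with a small conversion. This is a sketch under stated assumptions: `INGREDIENTS` and `encode` are hypothetical names, absent ingredients are encoded as 0.0 rather than "--", and "dummy" entries are dropped.

```python
# Hypothetical helper: expands one row of `data` into one fraction per ingredient.
INGREDIENTS = ["filler_A", "filler_B", "resin_A", "resin_B", "resin_C"]


def encode(components, fractions, ingredients=INGREDIENTS):
    """Map (components, fractions) to a fixed-length list ordered like `ingredients`."""
    row = dict.fromkeys(ingredients, 0.0)  # assumption: absent means 0.0
    for name, frac in zip(components, fractions):
        if name in row:  # skips placeholders such as "dummy"
            row[name] += frac
    return [row[name] for name in ingredients]


# Usage with the first and fourth rows of the made-up data
first = encode(["filler_A", "filler_B", "resin_C"], [0.4, 0.4, 0.2])
fourth = encode(["filler_A", "resin_B", "dummy"], [0.5, 0.5, 0.0])
```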

In Ax, I think this encoding would look like the following:

best_parameters, values, experiment, model = optimize(
    parameters=[
        {
            "name": "filler_A",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "filler_B",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "resin_A",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "resin_B",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "resin_C",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
    ],
    experiment_name="composition_test",
    objective_name="strength",
    evaluation_function=predict,
    parameter_constraints=[
        "abs(1 - (filler_A + filler_B + resin_A + resin_B + resin_C)) <= 1e-6",  # not sure if I can use `abs` here
        "filler_A + filler_B > 0",
        "resin_A + resin_B + resin_C > 0",
    ],
    total_trials=30,
)

However, this could easily lead to compositions in which every component has a nonzero prevalence, which can be problematic from an experimental perspective.
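One possible way around the `abs` question is to drop one ingredient from the search space and recover it from the others, so that the sum-to-one condition becomes a single linear inequality. The sketch below assumes that Ax's string constraints accept linear inequalities of the form `x1 + x2 <= 1.0` (my understanding, not verified here). `FREE`, `DEPENDENT`, and `complete_composition` are hypothetical names, and choosing `resin_C` as the dependent ingredient is arbitrary.

```python
# Hypothetical reparameterization: optimize four fractions, derive the fifth.
FREE = ["filler_A", "filler_B", "resin_A", "resin_B"]
DEPENDENT = "resin_C"  # arbitrary choice of the ingredient to eliminate

# Parameter dicts in the same style as the `optimize` call above.
parameters = [{"name": name, "type": "range", "bounds": [0.0, 1.0]} for name in FREE]

# One linear inequality replaces the equality written with `abs`.
parameter_constraints = [" + ".join(FREE) + " <= 1.0"]


def complete_composition(parameterization):
    """Add the dependent fraction so that all five fractions sum to 1."""
    free_total = sum(parameterization[name] for name in FREE)
    full = {name: parameterization[name] for name in FREE}
    # Clamp at zero to guard against small floating-point overshoot.
    full[DEPENDENT] = max(0.0, 1.0 - free_total)
    return full
```

The evaluation function would then call `complete_composition` on the parameterization it receives before predicting strength. This does not address the limit on the number of components.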

components/composition

As I mentioned in the constraints, I've also thought about setting an upper limit to the number of components in a formula, which I think might look something like the following:

best_parameters, values, experiment, model = optimize(
    parameters=[
        {
            "name": "object1",
            "type": "choice",
            "values": choices,
        },
        {
            "name": "object2",
            "type": "choice",
            "values": choices,
        },
        {
            "name": "object3",
            "type": "choice",
            "values": choices,
        },
        {
            "name": "composition1",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "composition2",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "composition3",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
    ],
    experiment_name="composition_test",
    objective_name="strength",
    evaluation_function=predict,
    parameter_constraints=["abs(1 - (composition1 + composition2 + composition3)) <= 1e-6"],
    total_trials=30,
)
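With this slot-based layout, the same ingredient can be picked in more than one slot, and "dummy" slots carry no material. One way to handle both, and to sidestep the equality constraint, is to decode the slots into a normalized formula inside the evaluation function. This is a sketch only: `decode_formula` is a hypothetical helper, and treating the `composition` values as relative weights that get renormalized is an assumption, not something Ax does.

```python
# Hypothetical decoder for the object1..3 / composition1..3 parameterization.
def decode_formula(parameterization, n_slots=3, dummy="dummy"):
    """Merge duplicate slots, drop dummy slots, and normalize fractions to sum to 1."""
    totals = {}
    for i in range(1, n_slots + 1):
        name = parameterization[f"object{i}"]
        weight = parameterization[f"composition{i}"]
        if name == dummy or weight <= 0.0:
            continue  # empty slot: contributes nothing
        totals[name] = totals.get(name, 0.0) + weight
    total = sum(totals.values())
    if total == 0.0:
        return {}  # no material selected at all
    return {name: weight / total for name, weight in totals.items()}


# Usage: filler_A appears in two slots and is merged into one entry
formula = decode_formula({
    "object1": "filler_A", "composition1": 0.2,
    "object2": "filler_A", "composition2": 0.2,
    "object3": "resin_B", "composition3": 0.4,
})
```

One caveat with this design is that several different parameterizations decode to the same formula, which may make the search space harder for the surrogate model to learn.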

How would you suggest implementing this use-case in Ax? If it would help, I'd be happy to flesh this out into a full MWE or try out any suggestions. The real use-case involves ~100 different components across 4 different classes, and the idea is to (eventually) use this in an experimental adaptive design scheme.

(tag @ramz-i who is the individual in charge of this project in our research group, post here if you have anything to add)

#706
