<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Integer Programming on adventures in optimization</title>
    <link>https://ryanjoneil.dev/tags/integer-programming/</link>
    <description>Recent content in Integer Programming on adventures in optimization</description>
    <generator>Hugo -- 0.151.1</generator>
    <language>en</language>
    <lastBuildDate>Wed, 25 Jun 2025 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://ryanjoneil.dev/tags/integer-programming/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>🐏 RAMS Reboot</title>
      <link>https://ryanjoneil.dev/posts/2025-06-25-rams-reboot/</link>
      <pubDate>Wed, 25 Jun 2025 00:00:00 +0000</pubDate>
      <guid>https://ryanjoneil.dev/posts/2025-06-25-rams-reboot/</guid>
      <description>Solve MILPs with open source optimization solvers and Ruby.</description>
      <content:encoded><![CDATA[<p>Some years ago, I worked on real-time meal delivery at Zoomer, a YC startup based out of Philadelphia. Zoomer&rsquo;s production tech stack was primarily Ruby. As it grew we moved from using heuristics for things like routing and scheduling to open source optimization solvers.</p>
<p>Like most languages that aren&rsquo;t Python, Ruby doesn&rsquo;t have an especially mature ecosystem for optimization (or data science, or machine learning, for that matter). For some use cases that didn&rsquo;t matter. When we upgraded the routing engine, we built a model in C++ using <a href="https://www.gecode.org/">Gecode</a> and wrapped a Ruby gem around a <a href="https://www.swig.org/">SWIG</a> wrapper. But when we wanted to use integer programming to build schedules, the lack of solver APIs proved inconvenient.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<p>At the time, <a href="https://coin-or.github.io/pulp/index.html">PuLP</a> was probably the most commonly used open source multi-solver Python library for linear and integer programming.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> This gave me the opportunity to develop <a href="https://github.com/ryanjoneil/rams">RAMS</a>, a PuLP-inspired library for basic MILP modeling in Ruby.</p>
<p>Then the Zoomer team became part of Grubhub. We moved to a Java stack and a commercial optimization solver. Improvements to the RAMS project languished on my todo list. It lagged behind major versions of Ruby, optimization solvers, and dependencies, and became painfully out of date and unmaintained.</p>
<p>Then, last month, <a href="https://github.blog/news-insights/product-news/github-copilot-meet-the-new-coding-agent/">GitHub released its Copilot agent</a>. Unlike vibe coding directly in the editor, which sounds like <a href="https://deplet.ing/the-copilot-delusion/">speeding maniacally through a bad acid trip</a>, the idea here is more like running a project: create issues, receive and comment on pull requests, iterate.</p>
<p>I figured the grunt work of library upgrades should be perfect fodder to try out an AI developer assistant. RAMS is already well structured and tested. The upgrade is well defined. No creativity required.</p>
<h2 id="a-rams-modeling-example">A RAMS modeling example</h2>
<p>This post meanders through two topics: solving optimization models with Ruby and RAMS, and my experiences maintaining that library using Copilot. I could have split this into two posts, but that didn&rsquo;t feel right. So let&rsquo;s show what building a model in RAMS looks like first.</p>
<p>I don&rsquo;t use Ruby with any regularity these days<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>, but modeling with RAMS reminded me how elegant Ruby DSLs can be. Here&rsquo;s a simple example of a binary integer program.</p>
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-ruby" data-lang="ruby"><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic">#!/usr/bin/env ruby</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>require <span style="color:#a5d6ff">&#39;rams&#39;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>m <span style="color:#ff7b72;font-weight:bold">=</span> <span style="color:#79c0ff;font-weight:bold">RAMS</span><span style="color:#ff7b72;font-weight:bold">::</span><span style="color:#79c0ff;font-weight:bold">Model</span><span style="color:#ff7b72;font-weight:bold">.</span>new
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>x1 <span style="color:#ff7b72;font-weight:bold">=</span> m<span style="color:#ff7b72;font-weight:bold">.</span>variable <span style="color:#a5d6ff">type</span>: <span style="color:#a5d6ff">:binary</span>
</span></span><span style="display:flex;"><span>x2 <span style="color:#ff7b72;font-weight:bold">=</span> m<span style="color:#ff7b72;font-weight:bold">.</span>variable <span style="color:#a5d6ff">type</span>: <span style="color:#a5d6ff">:binary</span>
</span></span><span style="display:flex;"><span>x3 <span style="color:#ff7b72;font-weight:bold">=</span> m<span style="color:#ff7b72;font-weight:bold">.</span>variable <span style="color:#a5d6ff">type</span>: <span style="color:#a5d6ff">:binary</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>m<span style="color:#ff7b72;font-weight:bold">.</span>constrain(x1 <span style="color:#ff7b72;font-weight:bold">+</span> x2 <span style="color:#ff7b72;font-weight:bold">+</span> x3 <span style="color:#ff7b72;font-weight:bold">&lt;=</span> <span style="color:#a5d6ff">2</span>)
</span></span><span style="display:flex;"><span>m<span style="color:#ff7b72;font-weight:bold">.</span>constrain(x2 <span style="color:#ff7b72;font-weight:bold">+</span> x3 <span style="color:#ff7b72;font-weight:bold">&lt;=</span> <span style="color:#a5d6ff">1</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>m<span style="color:#ff7b72;font-weight:bold">.</span>sense <span style="color:#ff7b72;font-weight:bold">=</span> <span style="color:#a5d6ff">:max</span>
</span></span><span style="display:flex;"><span>m<span style="color:#ff7b72;font-weight:bold">.</span>objective <span style="color:#ff7b72;font-weight:bold">=</span> <span style="color:#a5d6ff">1</span> <span style="color:#ff7b72;font-weight:bold">*</span> x1 <span style="color:#ff7b72;font-weight:bold">+</span> <span style="color:#a5d6ff">2</span> <span style="color:#ff7b72;font-weight:bold">*</span> x2 <span style="color:#ff7b72;font-weight:bold">+</span> <span style="color:#a5d6ff">3</span> <span style="color:#ff7b72;font-weight:bold">*</span> x3
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>solution <span style="color:#ff7b72;font-weight:bold">=</span> m<span style="color:#ff7b72;font-weight:bold">.</span>solve
</span></span><span style="display:flex;"><span>puts <span style="color:#a5d6ff">&lt;&lt;-HERE
</span></span></span><span style="display:flex;"><span><span style="color:#a5d6ff"></span><span style="color:#a5d6ff">objective</span>: <span style="color:#8b949e;font-style:italic">#{solution.objective}</span>
</span></span><span style="display:flex;"><span>x1 <span style="color:#ff7b72;font-weight:bold">=</span> <span style="color:#8b949e;font-style:italic">#{solution[x1]}</span>
</span></span><span style="display:flex;"><span>x2 <span style="color:#ff7b72;font-weight:bold">=</span> <span style="color:#8b949e;font-style:italic">#{solution[x2]}</span>
</span></span><span style="display:flex;"><span>x3 <span style="color:#ff7b72;font-weight:bold">=</span> <span style="color:#8b949e;font-style:italic">#{solution[x3]}</span>
</span></span><span style="display:flex;"><span><span style="color:#79c0ff;font-weight:bold">HERE</span>
</span></span></code></pre></div><p>I think that&rsquo;s rather nice, and very clean.</p>
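<p>As a sanity check that doesn&rsquo;t depend on RAMS or any solver, a model this small can be verified by brute force. The following is a minimal sketch in plain Ruby, not part of the RAMS API: it enumerates all eight binary assignments, keeps those that satisfy both constraints, and picks the one with the largest objective. The variable names are illustrative.</p>

```ruby
# Brute-force check of the example model above. Plain Ruby only; nothing
# here is part of RAMS, and all names are illustrative.
candidates = [0, 1].product([0, 1], [0, 1])

# Keep only the assignments that satisfy both constraints.
feasible = candidates.select do |x1, x2, x3|
  x1 + x2 + x3 <= 2 && x2 + x3 <= 1
end

# Pick the feasible assignment with the largest objective value.
objective = ->(x1, x2, x3) { 1 * x1 + 2 * x2 + 3 * x3 }
best = feasible.max_by { |x| objective.call(*x) }
best_value = objective.call(*best)

puts "objective: #{best_value}"
puts "x1 = #{best[0]}, x2 = #{best[1]}, x3 = #{best[2]}"
```

<p>This reports an objective of 4 with x1 = 1, x2 = 0, and x3 = 1, which is the optimum a MILP solver should find for the model above.</p>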
<h2 id="rams-enhancements">RAMS enhancements</h2>
<p>The biggest change in RAMS is that it now supports the <a href="https://highs.dev/">HiGHS optimization solver</a>. Prior to v0.2.0, GLPK was the default solver; now the default is HiGHS. There are a number of smaller changes as well.</p>
<ul>
<li>RAMS requires Ruby v3.1.</li>
<li>CPLEX support was removed since I can&rsquo;t test it.<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup></li>
<li>One can set solver paths using environment variables (e.g. <code>RAMS_SOLVER_PATH_CBC</code>).</li>
<li>Improved documentation and <a href="https://raw.githubusercontent.com/ryanjoneil/rams/main/logo.svg">a logo</a>!</li>
</ul>
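<p>The environment variable override follows a common pattern: use the variable when it is set, and fall back to a default executable name otherwise. Here is a minimal sketch of that pattern in plain Ruby. Only the variable name <code>RAMS_SOLVER_PATH_CBC</code> comes from RAMS; the <code>solver_path</code> helper, the default name, and the example path are hypothetical and are not the library&rsquo;s actual implementation.</p>

```ruby
# Hypothetical helper illustrating an environment variable override.
# This is NOT the RAMS implementation, just the general pattern.
def solver_path(solver, default)
  # Builds a name like RAMS_SOLVER_PATH_CBC from the solver name.
  ENV.fetch("RAMS_SOLVER_PATH_#{solver.to_s.upcase}", default)
end

# Without the variable set, the default executable name is used.
ENV.delete('RAMS_SOLVER_PATH_CBC')
unset_path = solver_path(:cbc, 'cbc')

# With the variable set, its value wins. The path is an example only.
ENV['RAMS_SOLVER_PATH_CBC'] = '/usr/bin/coin.cbc'
set_path = solver_path(:cbc, 'cbc')

puts unset_path
puts set_path
```

<p>In practice one would usually export the variable in the shell or CI configuration rather than set it from Ruby, so the same script can run on machines where the solver binary has a different name or location.</p>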
<h2 id="the-copilot-agent-as-coding-companion">The Copilot agent as coding companion</h2>
<p>While I tend to err on the side of LLM skepticism, working with the Copilot agent for this upgrade was generally positive. It was a bit like working with a fast, responsive, and inexperienced developer. The issues it ran into were pretty much the ones such a developer would run into, but the time scale was compressed.</p>
<p>I had it open three pull requests for me.</p>
<h3 id="-pr-29-upgrade-ruby-and-dependencies">🤨 PR 29: <a href="https://github.com/ryanjoneil/rams/pull/29">Upgrade Ruby and dependencies</a></h3>
<p>Performance here was middling. Copilot got through some of the task without assistance. It also made a number of changes that were unhelpful and irrelevant to the request.</p>
<p>On a positive note, I forgot to ask it to change from CircleCI to GitHub Actions for testing. This gave me <a href="https://github.com/ryanjoneil/rams/pull/29#issuecomment-2997815058">the opportunity to test its response to feature creep</a>. It responded with <a href="https://github.com/ryanjoneil/rams/pull/29/commits/8f8f054044f3008cfd12a8f81bfd32c519555f70">a partially working GitHub Actions workflow</a> (and no grumbling!).</p>
<p>Copilot made a number of errors and wasn&rsquo;t able to finish the upgrade on its own.</p>
<ul>
<li>It decided to <a href="https://github.com/ryanjoneil/rams/pull/29/commits/8f8f054044f3008cfd12a8f81bfd32c519555f70#diff-faff1af3d8ff408964a57b2e475f69a6b7c7b71c9978cccc8f471798caac2c88R44-R65">build the optimizers from source</a> instead of simply installing binary packages using <code>apt</code> or <code>dnf</code>. Not only is this wasteful and overly complicated, <a href="https://github.com/ryanjoneil/rams/actions/runs/15834266623/job/44634918114#step:6:1563">it ultimately wasn&rsquo;t able to build and install them from source</a>.</li>
<li>Once I told it to use a Fedora 42 base image, this improved, but it couldn&rsquo;t figure out what package to use for the CBC solver. It switched back and forth without prompting between <code>cbc</code> (incorrect) and <code>coin-or-Cbc</code> (correct).</li>
<li>It inexplicably <a href="https://github.com/ryanjoneil/rams/pull/29#issuecomment-2997857566">couldn&rsquo;t figure out the latest stable version of Ruby</a>.</li>
<li><a href="https://github.com/ryanjoneil/rams/pull/29#discussion_r2162808749">It added a bunch of architecture-specific package definitions</a> to the build, unprompted. This was unnecessary given that RAMS is a vanilla Ruby project.</li>
<li>I had to help it figure out that <a href="https://github.com/ryanjoneil/rams/pull/29/commits/135b6967d35f530c35ed4e111a04f59ba7300a67">the CBC binary is now called <code>coin.cbc</code> on Fedora</a>. This wasn&rsquo;t entirely surprising.</li>
</ul>
<h3 id="-pr-32-add-environment-variables-for-solver-paths">🤩 PR 32: <a href="https://github.com/ryanjoneil/rams/pull/32">Add environment variables for solver paths</a></h3>
<p>Copilot did a great job on this task. I had no issue with the code it wrote. It followed the style of the rest of the package nicely. It added appropriate documentation and unit tests.</p>
<h3 id="-pr-34-support-the-highs-optimization-solver">👌 PR 34: <a href="https://github.com/ryanjoneil/rams/pull/34">Support the HiGHS optimization solver</a></h3>
<p>Copilot did pretty well here, even though it didn&rsquo;t get the feature working. It was able to create a new solver interface and get most of the logic for solution parsing right. I was a little surprised that <a href="https://github.com/ryanjoneil/rams/pull/34#pullrequestreview-2952084649">it forgot to test the new solver integration in GitHub Actions</a>. The biggest issue it needed my help on <a href="https://github.com/ryanjoneil/rams/pull/34/commits/18e8095e97edba8198e1830e4a2a86960975a964#diff-3ac91a373440402d1cdd88e9987f4ea20efa6c0d82fee69ed7a4b4e0ff24b1d1L22-L23">was solution status parsing</a>, where it didn&rsquo;t realize that the second condition here will never trigger.</p>
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-ruby" data-lang="ruby"><span style="display:flex;"><span><span style="color:#ff7b72">return</span> <span style="color:#a5d6ff">:feasible</span> <span style="color:#ff7b72">if</span> status <span style="color:#ff7b72;font-weight:bold">=~</span> <span style="color:#79c0ff">/feasible/i</span>
</span></span><span style="display:flex;"><span><span style="color:#ff7b72">return</span> <span style="color:#a5d6ff">:infeasible</span> <span style="color:#ff7b72">if</span> status <span style="color:#ff7b72;font-weight:bold">=~</span> <span style="color:#79c0ff">/infeasible/i</span>
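</span></span></code></pre></div><p>This should have been the following (note the <code>^</code>, which anchors the pattern to the start of a line so that an infeasible status no longer matches the first condition).</p>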
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-ruby" data-lang="ruby"><span style="display:flex;"><span><span style="color:#ff7b72">return</span> <span style="color:#a5d6ff">:feasible</span> <span style="color:#ff7b72">if</span> status <span style="color:#ff7b72;font-weight:bold">=~</span> <span style="color:#79c0ff">/^feasible/i</span>
</span></span><span style="display:flex;"><span><span style="color:#ff7b72">return</span> <span style="color:#a5d6ff">:infeasible</span> <span style="color:#ff7b72">if</span> status <span style="color:#ff7b72;font-weight:bold">=~</span> <span style="color:#79c0ff">/infeasible/i</span>
</span></span></code></pre></div><div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I don&rsquo;t remember finding any MILP modeling interfaces for Ruby like <a href="https://coin-or.github.io/pulp/index.html">PuLP</a> in 2016-17. More recently, <a href="https://github.com/wouterken/rulp">Rulp</a> and <a href="https://github.com/ankane/opt">Opt</a> have been developed.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>PuLP is still <a href="https://github.com/coin-or/pulp/graphs/contributors">heavily used and developed</a>.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Once upon a time <a href="/tags/obfuscation/">I was a Perl programmer</a>. Ruby was originally written to be a better Perl. I&rsquo;ve long since given up the old ways.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>For now, RAMS is focussing on open source solvers. Maintaining commercial solver licenses can be challenging when you&rsquo;re not part of academia. PRs welcome.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content:encoded>
    </item>
    <item>
      <title>👔 Hierarchical Optimization with HiGHS</title>
      <link>https://ryanjoneil.dev/posts/2024-11-11-hierarchical-optimization-with-highs/</link>
      <pubDate>Mon, 11 Nov 2024 00:00:00 +0000</pubDate>
      <guid>https://ryanjoneil.dev/posts/2024-11-11-hierarchical-optimization-with-highs/</guid>
      <description>Managing trade-offs between different objectives with HiGHS.</description>
      <content:encoded><![CDATA[<p>In the <a href="../2024-11-08-hierarchical-optimization-with-gurobi/">last post</a>, we used Gurobi&rsquo;s hierarchical optimization features to compute the Pareto front for primary and secondary objectives in an assignment problem. This relied on Gurobi&rsquo;s <code>setObjectiveN</code> method and its internal code for managing hierarchical problems.</p>
<p>Some practitioners may need to do this without access to a commercial license. This post adapts the previous example to use HiGHS and its native Python interface, <a href="https://pypi.org/project/highspy/"><code>highspy</code></a>. It&rsquo;s also useful to see what the procedure is in order to understand it better. This isn&rsquo;t exactly what I&rsquo;d call <em>hard</em>, but it is <em>easy to mess up</em>.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<h2 id="code">Code</h2>
<p>The mathematical models are available in the last post, so I won&rsquo;t restate them here. We start in roughly the same manner as before<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>: create a binary variable for each worker-patient pair, add assignment problem constraints, and state the primary objective.</p>
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#ff7b72">from</span> <span style="color:#ff7b72">itertools</span> <span style="color:#ff7b72">import</span> product
</span></span><span style="display:flex;"><span><span style="color:#ff7b72">import</span> <span style="color:#ff7b72">highspy</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>n <span style="color:#ff7b72;font-weight:bold">=</span> len(data[<span style="color:#a5d6ff">&#34;cost&#34;</span>])
</span></span><span style="display:flex;"><span>workers <span style="color:#ff7b72;font-weight:bold">=</span> range(n)
</span></span><span style="display:flex;"><span>patients <span style="color:#ff7b72;font-weight:bold">=</span> range(n)
</span></span><span style="display:flex;"><span>workers_patients <span style="color:#ff7b72;font-weight:bold">=</span> list(product(workers, patients))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>h <span style="color:#ff7b72;font-weight:bold">=</span> highspy<span style="color:#ff7b72;font-weight:bold">.</span>Highs()
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># x[w,p] = 1 if worker w is assigned to patient p.</span>
</span></span><span style="display:flex;"><span>x <span style="color:#ff7b72;font-weight:bold">=</span> {(w, p): h<span style="color:#ff7b72;font-weight:bold">.</span>addBinary(obj<span style="color:#ff7b72;font-weight:bold">=</span>data[<span style="color:#a5d6ff">&#34;cost&#34;</span>][w][p]) <span style="color:#ff7b72">for</span> w, p <span style="color:#ff7b72;font-weight:bold">in</span> workers_patients}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># Each worker is assigned to one patient.</span>
</span></span><span style="display:flex;"><span>h<span style="color:#ff7b72;font-weight:bold">.</span>addConstrs(sum(x[w, p] <span style="color:#ff7b72">for</span> p <span style="color:#ff7b72;font-weight:bold">in</span> patients) <span style="color:#ff7b72;font-weight:bold">==</span> <span style="color:#a5d6ff">1</span> <span style="color:#ff7b72">for</span> w <span style="color:#ff7b72;font-weight:bold">in</span> workers)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># Each patient is assigned one worker.</span>
</span></span><span style="display:flex;"><span>h<span style="color:#ff7b72;font-weight:bold">.</span>addConstrs(sum(x[w, p] <span style="color:#ff7b72">for</span> w <span style="color:#ff7b72;font-weight:bold">in</span> workers) <span style="color:#ff7b72;font-weight:bold">==</span> <span style="color:#a5d6ff">1</span> <span style="color:#ff7b72">for</span> p <span style="color:#ff7b72;font-weight:bold">in</span> patients)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># Primary objective: minimize cost.</span>
</span></span><span style="display:flex;"><span>h<span style="color:#ff7b72;font-weight:bold">.</span>setMinimize()
</span></span><span style="display:flex;"><span>h<span style="color:#ff7b72;font-weight:bold">.</span>solve()
</span></span><span style="display:flex;"><span>cost <span style="color:#ff7b72;font-weight:bold">=</span> h<span style="color:#ff7b72;font-weight:bold">.</span>getObjectiveValue()
</span></span></code></pre></div><p>Note that if the costs and affinities were lists instead of matrices, we could have used <code>h.addBinaries</code> instead of <code>h.addBinary</code>.</p>
<p>From here we&rsquo;ll be solving the model twice for every value of alpha. These expressions for total cost and affinity will make the code a little cleaner.</p>
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span>cost_expr <span style="color:#ff7b72;font-weight:bold">=</span> sum(data[<span style="color:#a5d6ff">&#34;cost&#34;</span>][w][p] <span style="color:#ff7b72;font-weight:bold">*</span> x[w, p] <span style="color:#ff7b72">for</span> w, p <span style="color:#ff7b72;font-weight:bold">in</span> workers_patients)
</span></span><span style="display:flex;"><span>affinity_expr <span style="color:#ff7b72;font-weight:bold">=</span> sum(data[<span style="color:#a5d6ff">&#34;affinity&#34;</span>][w][p] <span style="color:#ff7b72;font-weight:bold">*</span> x[w, p] <span style="color:#ff7b72">for</span> w, p <span style="color:#ff7b72;font-weight:bold">in</span> workers_patients)
</span></span></code></pre></div><p>Now comes the hierarchical optimization logic. For every value of alpha, we find the best affinity possible while keeping cost within alpha of its best possible value.</p>
<ul>
<li>Update the objective function to maximize affinity (see the calls to <code>h.changeColCost</code> and <code>h.setMaximize</code>).</li>
<li>Constrain the cost to be within alpha of the original optimal cost (see <code>cost_cons</code>).</li>
<li>Re-optimize and save the maximal affinity.</li>
</ul>
<p>Now we constrain the affinity and re-optimize cost.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
<ul>
<li>Update the objective function to minimize cost again.</li>
<li>Constrain the affinity.</li>
</ul>
<p>Once that&rsquo;s done, we remove the additional constraints and repeat for a new value of alpha.</p>
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#ff7b72">for</span> alpha <span style="color:#ff7b72;font-weight:bold">in</span> alphas:
</span></span><span style="display:flex;"><span>    <span style="color:#8b949e;font-style:italic"># Secondary objective: maximize affinity.</span>
</span></span><span style="display:flex;"><span>    <span style="color:#ff7b72">for</span> (w, p), x_wp <span style="color:#ff7b72;font-weight:bold">in</span> x<span style="color:#ff7b72;font-weight:bold">.</span>items():
</span></span><span style="display:flex;"><span>        h<span style="color:#ff7b72;font-weight:bold">.</span>changeColCost(x_wp<span style="color:#ff7b72;font-weight:bold">.</span>index, data[<span style="color:#a5d6ff">&#34;affinity&#34;</span>][w][p])
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#8b949e;font-style:italic"># Constrain cost to be within alpha of its minimum.</span>
</span></span><span style="display:flex;"><span>    cost_cons <span style="color:#ff7b72;font-weight:bold">=</span> h<span style="color:#ff7b72;font-weight:bold">.</span>addConstr(cost_expr <span style="color:#ff7b72;font-weight:bold">&lt;=</span> (<span style="color:#a5d6ff">1</span> <span style="color:#ff7b72;font-weight:bold">+</span> alpha) <span style="color:#ff7b72;font-weight:bold">*</span> cost)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    h<span style="color:#ff7b72;font-weight:bold">.</span>setMaximize()
</span></span><span style="display:flex;"><span>    h<span style="color:#ff7b72;font-weight:bold">.</span>solve()
</span></span><span style="display:flex;"><span>    affinity <span style="color:#ff7b72;font-weight:bold">=</span> h<span style="color:#ff7b72;font-weight:bold">.</span>getObjectiveValue()
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#8b949e;font-style:italic"># Re-optimize with original cost objective, constraining affinity.</span>
</span></span><span style="display:flex;"><span>    <span style="color:#ff7b72">for</span> (w, p), x_wp <span style="color:#ff7b72;font-weight:bold">in</span> x<span style="color:#ff7b72;font-weight:bold">.</span>items():
</span></span><span style="display:flex;"><span>        h<span style="color:#ff7b72;font-weight:bold">.</span>changeColCost(x_wp<span style="color:#ff7b72;font-weight:bold">.</span>index, data[<span style="color:#a5d6ff">&#34;cost&#34;</span>][w][p])
</span></span><span style="display:flex;"><span>    affinity_cons <span style="color:#ff7b72;font-weight:bold">=</span> h<span style="color:#ff7b72;font-weight:bold">.</span>addConstr(affinity_expr <span style="color:#ff7b72;font-weight:bold">&gt;=</span> affinity)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    h<span style="color:#ff7b72;font-weight:bold">.</span>setMinimize()
</span></span><span style="display:flex;"><span>    h<span style="color:#ff7b72;font-weight:bold">.</span>solve()
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#ff7b72">yield</span> alpha, h<span style="color:#ff7b72;font-weight:bold">.</span>getObjectiveValue(), affinity
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#8b949e;font-style:italic"># Remove cost and affinity constraints before the next alpha.</span>
</span></span><span style="display:flex;"><span>    h<span style="color:#ff7b72;font-weight:bold">.</span>removeConstr(cost_cons)
</span></span><span style="display:flex;"><span>    h<span style="color:#ff7b72;font-weight:bold">.</span>removeConstr(affinity_cons)
</span></span></code></pre></div><p>Encouragingly, running this using the <code>model.py</code> linked below gives the same values as the Gurobi model, albeit not as quickly. Floating point values are rounded for readability.</p>
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-txt" data-lang="txt"><span style="display:flex;"><span>| alpha | cost     | affinity |
</span></span><span style="display:flex;"><span>| ----- | -------- | -------- |
</span></span><span style="display:flex;"><span>| 0.0   | 11212.0  | 53816.0  |
</span></span><span style="display:flex;"><span>| 0.05  | 11761.0  | 74001.0  |
</span></span><span style="display:flex;"><span>| 0.1   | 12332.0  | 79981.0  |
</span></span><span style="display:flex;"><span>| 0.15  | 12886.0  | 83103.0  |
</span></span><span style="display:flex;"><span>| 0.2   | 13454.0  | 85394.0  |
</span></span><span style="display:flex;"><span>| 0.25  | 13996.0  | 87136.0  |
</span></span><span style="display:flex;"><span>| 0.3   | 14557.0  | 88546.0  |
</span></span><span style="display:flex;"><span>| 0.35  | 15125.0  | 89751.0  |
</span></span><span style="display:flex;"><span>| 0.4   | 15670.0  | 90664.0  |
</span></span><span style="display:flex;"><span>| 0.45  | 16255.0  | 91345.0  |
</span></span><span style="display:flex;"><span>| 0.5   | 16816.0  | 91997.0  |
</span></span><span style="display:flex;"><span>| 0.55  | 17370.0  | 92537.0  |
</span></span><span style="display:flex;"><span>| 0.6   | 17924.0  | 93012.0  |
</span></span><span style="display:flex;"><span>| 0.65  | 18495.0  | 93491.0  |
</span></span><span style="display:flex;"><span>| 0.7   | 19055.0  | 93829.0  |
</span></span><span style="display:flex;"><span>| 0.75  | 19591.0  | 94228.0  |
</span></span><span style="display:flex;"><span>| 0.8   | 20167.0  | 94530.0  |
</span></span><span style="display:flex;"><span>| 0.85  | 20737.0  | 94833.0  |
</span></span><span style="display:flex;"><span>| 0.9   | 21295.0  | 95114.0  |
</span></span><span style="display:flex;"><span>| 0.95  | 21812.0  | 95361.0  |
</span></span><span style="display:flex;"><span>| 1.0   | 22402.0  | 95613.0  |
</span></span></code></pre></div><h2 id="resources">Resources</h2>
<ul>
<li><a href="/files/2024-11-11-hierarchical-optimization-with-highs/model.py"><code>model.py</code></a> hierarchical objectives HiGHS model</li>
</ul>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>It gets even easier to mess up with more than two objectives.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Isn&rsquo;t it nice that MIP modeling is similar across different APIs?&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Exercise for the reader: why do we need to re-optimize cost?&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content:encoded>
    </item>
    <item>
      <title>👔 Hierarchical Optimization with Gurobi</title>
      <link>https://ryanjoneil.dev/posts/2024-11-08-hierarchical-optimization-with-gurobi/</link>
      <pubDate>Fri, 08 Nov 2024 00:00:00 +0000</pubDate>
      <guid>https://ryanjoneil.dev/posts/2024-11-08-hierarchical-optimization-with-gurobi/</guid>
      <description>Managing trade-offs between different objectives with Gurobi.</description>
      <content:encoded><![CDATA[<p>One of the first technology choices to make when setting up an optimization stack is which modeling interface to use. Even if we restrict our choices to Python interfaces for MIP modeling, there are lots of options to consider.</p>
<p>If you use a specific solver, you can opt for its native Python interface. Examples include libraries like <a href="https://pypi.org/project/gurobipy/"><code>gurobipy</code></a>, <a href="https://docs.mosek.com/latest/pythonfusion/index.html">Fusion</a>, <a href="https://pypi.org/project/highspy/"><code>highspy</code></a>, or <a href="https://github.com/scipopt/PySCIPOpt"><code>PySCIPOpt</code></a>. This approach provides access to important solver-specific features such as lazy constraints, heuristics, and various solver settings. However, it can also lock you into a solver before you&rsquo;re ready for that.</p>
<p>You can also choose a modeling API that targets multiple solvers. In the Python ecosystem, these are libraries like <a href="https://amplpy.ampl.com/en/latest/"><code>amplpy</code></a>, <a href="http://www.pyomo.org/">Pyomo</a>, <a href="https://github.com/metab0t/PyOptInterface">PyOptInterface</a>, and <a href="https://linopy.readthedocs.io/en/latest/"><code>linopy</code></a>. These interfaces target multiple solver backends (both open source and commercial) and provide a subset of the functionality of each. Since they make it easy to switch between solvers, this is usually where I start.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<h2 id="hierarchical-assignment">Hierarchical assignment</h2>
<p>However, there are plenty of times when solver-specific APIs are useful, or even critical. One example is hierarchical optimization. This is a simple technique for managing trade-offs between multiple objectives in a problem. Let&rsquo;s look at an example.</p>
<p>Imagine we are assigning in-home health care workers ($w \in W$) to patients ($p \in P$). For simplicity, let&rsquo;s say we have $n$ workers and $n$ patients, and we are assigning them one-to-one. Each worker has a given cost ($c_{wp}$) of assignment to each patient, which may reflect something like the travel time to get to them. We want to assign each worker to exactly one patient while minimizing the overall cost.</p>
<h3 id="model">Model</h3>
<p>So far, what we have is a simple linear sum assignment problem.</p>
<p>$$
\begin{align*}
&amp; \text{min}  &amp;&amp; z = \sum_{wp} c_{wp} x_{wp} \\
&amp; \text{s.t.} &amp;&amp; \sum_w x_{wp} = 1 &amp;&amp; \forall \quad p \in P \\
&amp;             &amp;&amp; \sum_p x_{wp} = 1 &amp;&amp; \forall \quad w \in W \\
&amp;             &amp;&amp; x \in \{0,1\}^{|W \times P|}
\end{align*}
$$</p>
<p>Solving this model gives us the minimum cost assignment. That&rsquo;s all well and good, but now say we have a secondary objective of maximizing <em>affinity</em> of workers to patients ($a_{wp}$). That is, we want to <em>prefer assignments that increase overall affinity while still minimizing cost</em>. This is actually a common goal in health care scheduling: if possible, send the same worker to a given patient that you usually send.</p>
<p>Hierarchical optimization gives us a simple way to solve this problem. First, we optimize the model as stated above. This gives us an optimal objective value $z^*$. Then we re-solve the same optimization model, while constraining the cost to be $z^*$ and using the secondary objective function. This says to the optimizer, &ldquo;improve the affinity as much as you can, but keep the cost optimal.&rdquo;</p>
<p>$$
\begin{align*}
&amp; \text{max}  &amp;&amp; w = \sum_{wp} a_{wp} x_{wp} \\
&amp; \text{s.t.} &amp;&amp; \sum_{wp} c_{wp} x_{wp} \le z^* \\
&amp;             &amp;&amp; \sum_w x_{wp} = 1 &amp;&amp; \forall \quad p \in P \\
&amp;             &amp;&amp; \sum_p x_{wp} = 1 &amp;&amp; \forall \quad w \in W \\
&amp;             &amp;&amp; x \in \{0,1\}^{|W \times P|}
\end{align*}
$$</p>
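<p>To make the two-stage idea concrete, here is a minimal brute-force sketch in plain Python. It is not how a solver works, and the 3x3 <code>cost</code> and <code>affinity</code> matrices are made up for illustration. It enumerates every one-to-one assignment, finds $z^*$, and then picks the highest-affinity assignment among those that keep the cost at $z^*$.</p>

```python
from itertools import permutations

# Hypothetical data: cost[w][p] and affinity[w][p] for 3 workers and 3 patients.
cost = [
    [1, 1, 5],
    [1, 1, 5],
    [5, 5, 1],
]
affinity = [
    [0, 4, 9],
    [3, 0, 9],
    [9, 9, 0],
]


def total(matrix, assignment):
    """Sum matrix[w][p] over an assignment, where assignment[w] = p."""
    return sum(matrix[w][p] for w, p in enumerate(assignment))


# Every permutation is a feasible one-to-one assignment.
assignments = list(permutations(range(3)))

# Stage 1: minimize cost.
z_star = min(total(cost, a) for a in assignments)

# Stage 2: maximize affinity subject to cost staying at z_star.
cost_optimal = [a for a in assignments if z_star >= total(cost, a)]
best = max(cost_optimal, key=lambda a: total(affinity, a))

print(z_star, best, total(affinity, best))
```

<p>With this data there are two cost-optimal assignments, and the second stage picks the one with more affinity. Enumeration only works for tiny instances, which is why the rest of the post hands the job to a solver.</p>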
<p>From here, the natural question becomes: what if we trade off some cost for affinity? If we&rsquo;re willing to increase cost by some percentage, how much more affinity do we get? We can do this by setting a constant $\alpha \ge 0$ and solving the model a number of times.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
<p>$$
\begin{align*}
&amp; \text{max}   &amp;&amp; w = \sum_{wp} a_{wp} x_{wp} \\
&amp; \text{s.t.} &amp;&amp; \sum_{wp} c_{wp} x_{wp} \le (1 + \alpha) z^* \\
&amp;             &amp;&amp; \sum_w x_{wp} = 1 &amp;&amp; \forall \quad p \in P \\
&amp;             &amp;&amp; \sum_p x_{wp} = 1 &amp;&amp; \forall \quad w \in W \\
&amp;             &amp;&amp; x \in \{0,1\}^{|W \times P|}
\end{align*}
$$</p>
<p>For example, if $\alpha = 0.05$, then we&rsquo;re willing to accept a 5% increase in overall cost to improve affinity. Setting different values of $\alpha$ lets us explore the space of that trade-off and its impact on cost and affinity.</p>
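<p>As a rough sketch of that exploration, the following plain Python snippet sweeps a few values of $\alpha$ by brute force. The data and the chosen $\alpha$ values are assumptions picked so the trade-off is visible on a 3x3 instance. Ties in affinity are broken in favor of lower cost.</p>

```python
from itertools import permutations

# Hypothetical data: cost[w][p] and affinity[w][p] for 3 workers and 3 patients.
cost = [
    [10, 12, 30],
    [12, 10, 30],
    [30, 30, 10],
]
affinity = [
    [1, 5, 9],
    [5, 1, 9],
    [9, 9, 1],
]


def total(matrix, assignment):
    """Sum matrix[w][p] over an assignment, where assignment[w] = p."""
    return sum(matrix[w][p] for w, p in enumerate(assignment))


assignments = list(permutations(range(3)))
z_star = min(total(cost, a) for a in assignments)


def solve(alpha):
    """Return (cost, affinity) of the best assignment within (1 + alpha) * z_star."""
    budget = (1 + alpha) * z_star
    feasible = [a for a in assignments if budget >= total(cost, a)]
    # Maximize affinity first, then prefer lower cost among equal affinities.
    best = max(feasible, key=lambda a: (total(affinity, a), -total(cost, a)))
    return total(cost, best), total(affinity, best)


front = {alpha: solve(alpha) for alpha in (0.0, 0.15, 1.5)}
for alpha, (c, a) in front.items():
    print(f"alpha={alpha}: cost={c}, affinity={a}")
```

<p>Each entry of <code>front</code> is one point on the cost versus affinity curve. Larger budgets can only keep or improve the affinity.</p>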
<p>Once we solve this and get the optimal affinity ($w^*$), we should re-optimize for the primary objective again while constraining the secondary one.</p>
<p>$$
\begin{align*}
&amp; \text{min}  &amp;&amp; \sum_{wp} c_{wp} x_{wp} \\
&amp; \text{s.t.} &amp;&amp; \sum_{wp} a_{wp} x_{wp} \ge w^* \\
&amp;             &amp;&amp; \sum_w x_{wp} = 1 &amp;&amp; \forall \quad p \in P \\
&amp;             &amp;&amp; \sum_p x_{wp} = 1 &amp;&amp; \forall \quad w \in W \\
&amp;             &amp;&amp; x \in \{0,1\}^{|W \times P|}
\end{align*}
$$</p>
<h3 id="code">Code</h3>
<p>So the math looks reasonable. How do we implement it? If we have a Gurobi license, we can use <a href="https://docs.gurobi.com/projects/optimizer/en/current/features/multiobjective.html#multiple-objectives">its built-in facilities for multiobjective optimization</a>. This means that, instead of solving a model multiple times and adding constraints to keep cost within $\alpha$ of its optimal value, we can create a single model that does all of this for us.</p>
<p>Assume we have input data which looks like this.</p>
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-json" data-lang="json"><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    <span style="color:#7ee787">&#34;cost&#34;</span>: [
</span></span><span style="display:flex;"><span>        [<span style="color:#a5d6ff">10</span>, <span style="color:#a5d6ff">20</span>, <span style="color:#f85149">...</span>],
</span></span><span style="display:flex;"><span>        [<span style="color:#a5d6ff">30</span>, <span style="color:#a5d6ff">40</span>, <span style="color:#f85149">...</span>],
</span></span><span style="display:flex;"><span>        <span style="color:#f85149">...</span>
</span></span><span style="display:flex;"><span>    ],
</span></span><span style="display:flex;"><span>    <span style="color:#7ee787">&#34;affinity&#34;</span>: [
</span></span><span style="display:flex;"><span>        [<span style="color:#a5d6ff">25</span>, <span style="color:#a5d6ff">15</span>, <span style="color:#f85149">...</span>],
</span></span><span style="display:flex;"><span>        [<span style="color:#a5d6ff">35</span>, <span style="color:#a5d6ff">25</span>, <span style="color:#f85149">...</span>],
</span></span><span style="display:flex;"><span>        <span style="color:#f85149">...</span>
</span></span><span style="display:flex;"><span>    ]
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>We start with a simple assignment problem formulation.</p>
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#ff7b72">import</span> <span style="color:#ff7b72">gurobipy</span> <span style="color:#ff7b72">as</span> <span style="color:#ff7b72">gp</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>n <span style="color:#ff7b72;font-weight:bold">=</span> len(data[<span style="color:#a5d6ff">&#34;cost&#34;</span>])
</span></span><span style="display:flex;"><span>workers <span style="color:#ff7b72;font-weight:bold">=</span> range(n)
</span></span><span style="display:flex;"><span>patients <span style="color:#ff7b72;font-weight:bold">=</span> range(n)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>m <span style="color:#ff7b72;font-weight:bold">=</span> gp<span style="color:#ff7b72;font-weight:bold">.</span>Model()
</span></span><span style="display:flex;"><span>m<span style="color:#ff7b72;font-weight:bold">.</span>ModelSense <span style="color:#ff7b72;font-weight:bold">=</span> gp<span style="color:#ff7b72;font-weight:bold">.</span>GRB<span style="color:#ff7b72;font-weight:bold">.</span>MINIMIZE
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># x[w,p] = 1 if worker w is assigned to patient p.</span>
</span></span><span style="display:flex;"><span>x <span style="color:#ff7b72;font-weight:bold">=</span> m<span style="color:#ff7b72;font-weight:bold">.</span>addVars(n, n, vtype<span style="color:#ff7b72;font-weight:bold">=</span>gp<span style="color:#ff7b72;font-weight:bold">.</span>GRB<span style="color:#ff7b72;font-weight:bold">.</span>BINARY)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#ff7b72">for</span> i <span style="color:#ff7b72;font-weight:bold">in</span> range(n):
</span></span><span style="display:flex;"><span>    <span style="color:#8b949e;font-style:italic"># Each worker is assigned to one patient.</span>
</span></span><span style="display:flex;"><span>    m<span style="color:#ff7b72;font-weight:bold">.</span>addConstr(gp<span style="color:#ff7b72;font-weight:bold">.</span>quicksum(x[i, p] <span style="color:#ff7b72">for</span> p <span style="color:#ff7b72;font-weight:bold">in</span> patients) <span style="color:#ff7b72;font-weight:bold">==</span> <span style="color:#a5d6ff">1</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#8b949e;font-style:italic"># Each patient is assigned one worker.</span>
</span></span><span style="display:flex;"><span>    m<span style="color:#ff7b72;font-weight:bold">.</span>addConstr(gp<span style="color:#ff7b72;font-weight:bold">.</span>quicksum(x[w, i] <span style="color:#ff7b72">for</span> w <span style="color:#ff7b72;font-weight:bold">in</span> workers) <span style="color:#ff7b72;font-weight:bold">==</span> <span style="color:#a5d6ff">1</span>)
</span></span></code></pre></div><p>We add primary and secondary objectives, and call <code>optimize</code>. The objectives are solved in descending order of the <code>priority</code> flag for <code>Model.setObjectiveN</code>. <code>reltol</code> allows us to degrade the primary objective by some amount (e.g. 5%) to improve the secondary objective.</p>
<p>One catch is that the model only has one objective sense. Since we are <em>minimizing</em> the primary objective, we give the secondary objective a <code>weight</code> of <code>-1</code> in order to <em>maximize</em> it.</p>
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#ff7b72">from</span> <span style="color:#ff7b72">itertools</span> <span style="color:#ff7b72">import</span> product
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># Primary objective: minimize cost.</span>
</span></span><span style="display:flex;"><span>z <span style="color:#ff7b72;font-weight:bold">=</span> (data[<span style="color:#a5d6ff">&#34;cost&#34;</span>][w][p] <span style="color:#ff7b72;font-weight:bold">*</span> x[w, p] <span style="color:#ff7b72">for</span> w, p <span style="color:#ff7b72;font-weight:bold">in</span> product(workers, patients))
</span></span><span style="display:flex;"><span>m<span style="color:#ff7b72;font-weight:bold">.</span>setObjectiveN(expr<span style="color:#ff7b72;font-weight:bold">=</span>gp<span style="color:#ff7b72;font-weight:bold">.</span>quicksum(z), index<span style="color:#ff7b72;font-weight:bold">=</span><span style="color:#a5d6ff">0</span>, name<span style="color:#ff7b72;font-weight:bold">=</span><span style="color:#a5d6ff">&#34;cost&#34;</span>, priority<span style="color:#ff7b72;font-weight:bold">=</span><span style="color:#a5d6ff">1</span>, reltol<span style="color:#ff7b72;font-weight:bold">=</span>alpha)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># Secondary objective: maximize affinity. Since the model sense is minimize,</span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># we negate the secondary objective in order to maximize it.</span>
</span></span><span style="display:flex;"><span>w <span style="color:#ff7b72;font-weight:bold">=</span> (data[<span style="color:#a5d6ff">&#34;affinity&#34;</span>][w][p] <span style="color:#ff7b72;font-weight:bold">*</span> x[w, p] <span style="color:#ff7b72">for</span> w, p <span style="color:#ff7b72;font-weight:bold">in</span> product(workers, patients))
</span></span><span style="display:flex;"><span>m<span style="color:#ff7b72;font-weight:bold">.</span>setObjectiveN(
</span></span><span style="display:flex;"><span>    expr<span style="color:#ff7b72;font-weight:bold">=</span>gp<span style="color:#ff7b72;font-weight:bold">.</span>quicksum(w), index<span style="color:#ff7b72;font-weight:bold">=</span><span style="color:#a5d6ff">1</span>, name<span style="color:#ff7b72;font-weight:bold">=</span><span style="color:#a5d6ff">&#34;affinity&#34;</span>, priority<span style="color:#ff7b72;font-weight:bold">=</span><span style="color:#a5d6ff">0</span>, weight<span style="color:#ff7b72;font-weight:bold">=-</span><span style="color:#a5d6ff">1</span>
</span></span><span style="display:flex;"><span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>m<span style="color:#ff7b72;font-weight:bold">.</span>optimize()
</span></span></code></pre></div><p>Then we use this magic syntax to pull out the optimal cost and affinity.</p>
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span>m<span style="color:#ff7b72;font-weight:bold">.</span>params<span style="color:#ff7b72;font-weight:bold">.</span>ObjNumber <span style="color:#ff7b72;font-weight:bold">=</span> <span style="color:#a5d6ff">0</span>
</span></span><span style="display:flex;"><span>cost <span style="color:#ff7b72;font-weight:bold">=</span> m<span style="color:#ff7b72;font-weight:bold">.</span>ObjNVal
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>m<span style="color:#ff7b72;font-weight:bold">.</span>params<span style="color:#ff7b72;font-weight:bold">.</span>ObjNumber <span style="color:#ff7b72;font-weight:bold">=</span> <span style="color:#a5d6ff">1</span>
</span></span><span style="display:flex;"><span>affinity <span style="color:#ff7b72;font-weight:bold">=</span> m<span style="color:#ff7b72;font-weight:bold">.</span>ObjNVal
</span></span></code></pre></div><h3 id="results">Results</h3>
<p>If we solve this in a loop with alpha values from <code>0</code> to <code>1</code> in increments of <code>0.05</code>, we can plot the trade-off between cost and affinity. Going from $\alpha = 0$ to $\alpha = 0.05$ or $\alpha = 0.1$ gives a pretty sizable improvement in affinity. After that, the return starts to gradually level off. This allows us to make a more informed choice about these two objectives.</p>
<p><img alt="Pareto front - cost vs affinity" loading="lazy" src="/files/2024-11-08-hierarchical-optimization-with-gurobi/plot.png#center"></p>
<h2 id="resources">Resources</h2>
<ul>
<li><a href="/files/2024-11-08-hierarchical-optimization-with-gurobi/generate.py"><code>generate.py</code></a> generates input data</li>
<li><a href="/files/2024-11-08-hierarchical-optimization-with-gurobi/input-100x100.json"><code>input-100x100.json</code></a> contains input data</li>
<li><a href="/files/2024-11-08-hierarchical-optimization-with-gurobi/model.py"><code>model.py</code></a> hierarchical objectives Gurobi model</li>
</ul>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>While commercial libraries like AMPL have always focussed on modeling performance, some of the open source options targeting multiple solvers come with significant performance penalties during formulation and model handoff to the solver. Newer options like <code>linopy</code> (<a href="https://linopy.readthedocs.io/en/latest/benchmark.html">benchmarks</a>) and PyOptInterface (<a href="https://metab0t.github.io/PyOptInterface/benchmark.html">benchmarks</a>) don&rsquo;t have that issue.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>This gives us a <a href="https://en.wikipedia.org/wiki/Pareto_front">Pareto front</a>, which explores the trade-offs between different objectives.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content:encoded>
    </item>
    <item>
      <title>📅 Reducing Overscheduling</title>
      <link>https://ryanjoneil.dev/posts/2023-11-26-reducing-overscheduling/</link>
      <pubDate>Sun, 26 Nov 2023 00:00:00 +0000</pubDate>
      <guid>https://ryanjoneil.dev/posts/2023-11-26-reducing-overscheduling/</guid>
      <description>Minimize overbooking while scheduling team meetings</description>
      <content:encoded><![CDATA[<p>At a <a href="https://nextmv.io">Nextmv</a> <a href="https://www.youtube.com/watch?v=XTeit7TAWj4">tech talk</a> a couple weeks ago, I showed a <a href="https://en.wikipedia.org/wiki/Least_absolute_deviations">least absolute deviations</a> (LAD) regression model using OR-Tools. This isn&rsquo;t new &ndash; I pulled the formulation from Rob Vanderbei&rsquo;s &ldquo;<a href="https://vanderbei.princeton.edu/tex/LocalWarming/LocalWarmingSIREVrev.pdf">Local Warming</a>&rdquo; paper, and I&rsquo;ve shown similar models at conference talks in the past using other modeling APIs and solvers.</p>
<p>There are a couple reasons I keep coming back to this problem. One is that it&rsquo;s a great example of how to build a machine learning model using an optimization solver. Unless you have an optimization background, it&rsquo;s probably not obvious you can do this. Building a regression or classification model with a solver directly is a great way to understand the model better. And you can customize it in interesting ways, like adding <a href="https://www.robots.ox.ac.uk/~az/lectures/ml/2011/lect6.pdf">epsilon insensitivity</a>.</p>
<p>Another is that <a href="https://en.wikipedia.org/wiki/Least_squares">least squares</a>, while the most commonly used form of regression, has a fatal flaw: it isn&rsquo;t robust to outliers in the input data. This is because least squares minimizes the <em>sum of squared residuals</em>, as shown in the formulation below. Here, $A$ is an $m \times n$ matrix of feature data, $b$ is a vector of observations to fit, and $x$ is a vector of coefficients the optimizer must find.</p>
<p>$$
\min f(x) = \Vert Ax-b \Vert^2
$$</p>
<p>Since the objective function minimizes squared residuals, outliers have a much bigger impact than other data. LAD regression addresses this by summing the absolute values of the residuals instead of squaring them.</p>
<p>$$
\min f(x) = \Vert Ax-b \Vert_1
$$</p>
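<p>A quick way to see the difference is the simplest possible model: a single constant coefficient, where $A$ is a column of ones. In that case the least squares fit is the mean of $b$ and the LAD fit is a median. The sketch below uses made-up observations with one outlier.</p>

```python
from statistics import mean, median

# Hypothetical observations; 100.0 is a deliberate outlier.
b = [1.0, 2.0, 3.0, 4.0, 100.0]


def squared_loss(x):
    """Sum of squared residuals for the constant model x."""
    return sum((x - bi) ** 2 for bi in b)


def absolute_loss(x):
    """Sum of absolute residuals for the constant model x."""
    return sum(abs(x - bi) for bi in b)


x_ls = mean(b)     # minimizes squared_loss
x_lad = median(b)  # minimizes absolute_loss

print(f"least squares fit: {x_ls}, LAD fit: {x_lad}")
```

<p>The outlier drags the least squares fit far away from the bulk of the data, while the LAD fit stays with it.</p>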
<p>So why isn&rsquo;t this used more? Simple &ndash; least squares has a convenient analytical solution, while LAD requires an algorithm to solve. For instance, you can formulate LAD regression as a linear program, but now you need a solver.</p>
<p>$$
\begin{align*}
\min         \quad &amp; 1^\intercal z \\
\text{s.t.}\ \quad &amp; z \ge Ax - b \\
&amp; z \ge b - Ax
\end{align*}
$$</p>
<p>While I like using this example, it paints a rather negative picture of squaring. If it does funny things to solvers, is there any good reason to square? Thus I&rsquo;ve been on the lookout for a practical example where squaring a variable or expression makes a model more useful.</p>
<p>Luckily for me, Erwin Kalvelagen recently <a href="https://yetanothermathprogrammingconsultant.blogspot.com/2023/10/scheduling-team-meetings.html">posted</a> about using optimization to schedule team meetings. This is an application where minimizing squared values of <em>overbooking</em> can be beneficial &ndash; it may be worse to be triple booked than double booked.</p>
<p>I won&rsquo;t recreate the reasoning behind Erwin&rsquo;s post here. You can read his blog for that. What we&rsquo;ll do is look at both the formulations in his post, along with a couple extras using <a href="https://julialang.org/">Julia</a> for code, <a href="https://jump.dev/">JuMP</a> for modeling, <a href="https://www.scipopt.org/">SCIP</a> for optimization, and <a href="https://gadflyjl.org/stable/">Gadfly</a> for visualization. All model code and data are linked in the resources section at the end.</p>
<h2 id="maximize-attendance">Maximize attendance</h2>
<p>To start off, I built a new data set, which you can find in the resources section. This differentiates team membership between two types of employees: individual contributors (starting with <code>ic</code> in the data), who attend meetings for 1 or 2 teams, and managers (prefixed with <code>mgr</code>), who attend meetings to coordinate across multiple teams. We schedule meetings for 10 teams (prefix <code>t</code>) into 3 time slots (<code>s</code>).</p>
<p>The first model in Erwin&rsquo;s post maximizes attendance. This means it tries to schedule team members for as many unique time slots as possible. It doesn&rsquo;t consider overbooking.</p>
<p>$$
\begin{align*}
\max\quad       &amp; \sum_{i,s} y_{i,s} \\
\text{s.t.}\quad&amp; \sum_{s} x_{t,s} = 1                  &amp;\quad\forall&amp;\ t   &amp; \text{schedule each team meeting once}\\
&amp; y_{i,s} \le \sum_{t} m_{i,t}\ x_{t,s} &amp;\quad\forall&amp;\ i,s &amp; \text{individuals attend team meetings}\\
&amp; x_{t,s} \in \{0,1\}                 &amp;\quad\forall&amp;\ t,s\\
&amp; y_{i,s} \in \{0,1\}                 &amp;\quad\forall&amp;\ i,s
\end{align*}
$$</p>
<p>This yields the following team schedule, with red representing a scheduled team meeting.</p>
<p><img alt="Maximize attendance - team schedules" loading="lazy" src="/files/2023-11-26-reducing-overscheduling/maximize-attendance-teams.svg#center"></p>
<p>If we look at the manager schedules, we&rsquo;ll see that every manager is completely booked. This makes sense. That&rsquo;s what managers do, right? Go to meetings?</p>
<p><img alt="Maximize attendance - manager attendance" loading="lazy" src="/files/2023-11-26-reducing-overscheduling/maximize-attendance-managers.svg#center"></p>
<h2 id="minimize-overbooking">Minimize overbooking</h2>
<p>The model gets more interesting once we account for overbooking. Erwin&rsquo;s post has a model that minimizes overbooking, where overbooking is the number of additional meetings in a time slot. If a team member is double booked, that&rsquo;s 1 overbooking. If they are triple booked, that&rsquo;s 2 overbookings.</p>
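<p>That definition is easy to check by hand for a fixed schedule. Here is a small Python sketch, separate from the Julia models linked in the resources. The <code>membership</code> and <code>schedule</code> dictionaries are hypothetical stand-ins for the real data.</p>

```python
from collections import Counter

# Hypothetical membership: the team meetings each person attends.
membership = {
    "mgr1": ["t1", "t2", "t3"],
    "ic1": ["t1"],
    "ic2": ["t2", "t3"],
}

# Hypothetical schedule: every team meeting lands in the same time slot.
schedule = {"t1": "s1", "t2": "s1", "t3": "s1"}


def overbookings(membership, schedule):
    """Map (person, slot) to the number of meetings beyond the first."""
    result = {}
    for person, teams in membership.items():
        meetings_per_slot = Counter(schedule[t] for t in teams)
        for slot, meetings in meetings_per_slot.items():
            result[person, slot] = max(0, meetings - 1)
    return result


c = overbookings(membership, schedule)
print(c)
print("total overbookings:", sum(c.values()))
```

<p>Here <code>mgr1</code> is triple booked (2 overbookings), <code>ic2</code> is double booked (1 overbooking), and <code>ic1</code> has none.</p>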
<h3 id="sum-of-overbooking">Sum of overbooking</h3>
<p>The second model in Erwin&rsquo;s post minimizes the sum of all overbookings. He does this by adding a continuous <code>c</code> vector that only incurs value once a team member goes over a single meeting in a given time slot.</p>
<p>$$
\begin{align*}
\min\quad       &amp; \sum_{i,s} c_{i,s} \\
\text{s.t.}\quad&amp; \sum_{s} x_{t,s} = 1                      &amp;\quad\forall&amp;\ t   &amp; \text{schedule each team meeting once}\\
&amp; c_{i,s} \ge \sum_{t} m_{i,t}\ x_{t,s} - 1 &amp;\quad\forall&amp;\ i,s &amp; \text{measure overbooking}\\
&amp; x_{t,s} \in \{0,1\}                     &amp;\quad\forall&amp;\ t,s\\
&amp; c_{i,s} \ge 0                             &amp;\quad\forall&amp;\ i,s
\end{align*}
$$</p>
<p>Given our data this results in the following team schedule, which is probably not all that interesting. I&rsquo;ll leave this visualization out from now on.</p>
<p><img alt="Minimize overbooking - team schedules" loading="lazy" src="/files/2023-11-26-reducing-overscheduling/minimize-overbooking-teams.svg#center"></p>
<p>Where it gets interesting is plotting overbookings for the managers. Here we see that 3 manager time slots are triple booked <em>(red)</em>, while 8 are double booked <em>(gray)</em>.</p>
<p><img alt="Minimize overbooking - manager overbooking" loading="lazy" src="/files/2023-11-26-reducing-overscheduling/minimize-overbooking-managers.svg#center"></p>
<h3 id="sum-of-squared-overbooking">Sum of squared overbooking</h3>
<p>Let&rsquo;s say it&rsquo;s worse to triple book (or, gasp, <em>quadruple</em> book) than to double book. How can the model account for this? One answer, if you have a MIQP-enabled solver, is to simply square the <code>c</code> values.</p>
<p>$$
\begin{align*}
\min\quad       &amp; \sum_{i,s} c_{i,s}^2 \\
\text{s.t.}\quad&amp; \sum_{s} x_{t,s} = 1                      &amp;\quad\forall&amp;\ t   &amp; \text{schedule each team meeting once}\\
&amp; c_{i,s} \ge \sum_{t} m_{i,t}\ x_{t,s} - 1 &amp;\quad\forall&amp;\ i,s &amp; \text{measure overbooking}\\
&amp; x_{t,s} \in \{0,1\}                     &amp;\quad\forall&amp;\ t,s\\
&amp; c_{i,s} \ge 0                             &amp;\quad\forall&amp;\ i,s
\end{align*}
$$</p>
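<p>To see why squaring helps, compare two hypothetical overbooking vectors with the same total. The linear objective can&rsquo;t tell them apart, but the squared objective prefers spreading the overbookings out. The values below are illustrative only.</p>

```python
# Hypothetical overbooking values c[i,s] for two manager time slots.
concentrated = [2, 0]  # one triple booking
spread = [1, 1]        # two double bookings


def linear(c):
    """Objective of the sum-of-overbooking model."""
    return sum(c)


def squared(c):
    """Objective of the sum-of-squared-overbooking model."""
    return sum(v * v for v in c)


print(linear(concentrated), linear(spread))    # same total
print(squared(concentrated), squared(spread))  # triple booking costs more
```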
<p>This completely eliminates triple booking, as shown below. No manager is worse off than being double booked, which seems normal given my experiences.</p>
<p><img alt="Minimize squared overbooking - manager overbooking" loading="lazy" src="/files/2023-11-26-reducing-overscheduling/minimize-overbooking-squared-managers.svg#center"></p>
<p>The problem with this is that the solver now takes a lot longer. It&rsquo;s not bad for the data in this example, but if you try it with something larger you&rsquo;ll see what I mean. You can find the data generator code in the resources section.</p>
<h3 id="constrained-bottleneck">Constrained bottleneck</h3>
<p>So how can we do something similar without the computational cost? One option is to continue using MILP formulations, but in the context of hierarchical optimization. This means splitting the model into two. First, we try to minimize the maximum overbookings for any team member (the <em>bottleneck</em>, if you will). This involves adding a variable $b$ representing that maximum.</p>
<p>$$ b = \max\Bigl\{\sum_{t} m_{i,t}\ x_{t,s} - 1 : i \in I, s \in S \Bigr\} $$</p>
<p>Now we can simply minimize $b$ using a MILP instead of a MIQP.</p>
<p>$$
\begin{align*}
\min\quad       &amp; b \\
\text{s.t.}\quad&amp; \sum_{s} x_{t,s} = 1                &amp;\quad\forall&amp;\ t   &amp; \text{schedule each team meeting once}\\
&amp; b \ge \sum_{t} m_{i,t}\ x_{t,s} - 1 &amp;\quad\forall&amp;\ i,s &amp; \text{maximum overbooking}\\
&amp; x_{t,s} \in \{0,1\}               &amp;\quad\forall&amp;\ t,s
\end{align*}
$$</p>
<p>Once we solve the first model, we get the minimal value of $b$, which we call $b^*$. We can then use $b^*$ as an upper bound on overbookings in the sum of overbooking model from earlier.</p>
<p>$$
\begin{align*}
\min\quad       &amp; \sum_{i,s} c_{i,s} \\
\text{s.t.}\quad&amp; \sum_{s} x_{t,s} = 1                      &amp;\quad\forall&amp;\ t   &amp; \text{schedule each team meeting once}\\
&amp; c_{i,s} \ge \sum_{t} m_{i,t}\ x_{t,s} - 1 &amp;\quad\forall&amp;\ i,s &amp; \text{measure overbooking}\\
&amp; x_{t,s} \in \{0,1\}                     &amp;\quad\forall&amp;\ t,s\\
&amp; 0 \le c_{i,s} \le b^*                     &amp;\quad\forall&amp;\ i,s
\end{align*}
$$</p>
<p>As we see below, this model also eliminates triple bookings, and it&rsquo;s quite a bit faster to solve than the MIQP.</p>
<p><img alt="Minimize bottleneck - manager overbooking" loading="lazy" src="/files/2023-11-26-reducing-overscheduling/minimize-bottleneck-managers.svg#center"></p>
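<p>The two stages can be sketched by brute force on a tiny instance. This is plain Python rather than the JuMP models linked below, and the teams, slots, and membership are assumptions chosen so that the plain sum model has a tie between a triple booking and two double bookings.</p>

```python
from itertools import product

# Hypothetical instance: 4 team meetings, 2 time slots, 3 people.
teams = ["t1", "t2", "t3", "t4"]
slots = ["s1", "s2"]
membership = {
    "mgr1": ["t1", "t2", "t3", "t4"],
    "mgr2": ["t1", "t2"],
    "ic1": ["t3"],
}


def overbookings(schedule):
    """List the overbooking of every (person, slot) pair for a team -> slot schedule."""
    return [
        max(0, sum(1 for t in member_teams if schedule[t] == s) - 1)
        for member_teams in membership.values()
        for s in slots
    ]


# Enumerate every way to place each team meeting in exactly one slot.
schedules = [dict(zip(teams, choice)) for choice in product(slots, repeat=len(teams))]

# Stage 1: minimize the bottleneck (the worst overbooking of anyone).
b_star = min(max(overbookings(s)) for s in schedules)

# Stage 2: minimize total overbooking, with every c[i,s] bounded by b_star.
bounded = [s for s in schedules if b_star >= max(overbookings(s))]
best = min(bounded, key=lambda s: sum(overbookings(s)))

print(b_star, sum(overbookings(best)), best)
```

<p>In this instance a schedule that triple books <code>mgr1</code> has the same total overbooking as one that double books them twice. The bound from the first stage rules out the triple booking.</p>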
<h2 id="resources">Resources</h2>
<ul>
<li><a href="/files/2023-11-26-reducing-overscheduling/main.go"><code>main.go</code></a> generates input data</li>
<li><a href="/files/2023-11-26-reducing-overscheduling/membership.csv"><code>membership.csv</code></a> contains input data</li>
<li><a href="/files/2023-11-26-reducing-overscheduling/maximize-attendance.jl"><code>maximize-attendance.jl</code></a> MILP model</li>
<li><a href="/files/2023-11-26-reducing-overscheduling/minimize-overbooking.jl"><code>minimize-overbooking.jl</code></a> MILP model</li>
<li><a href="/files/2023-11-26-reducing-overscheduling/minimize-overbooking-squared.jl"><code>minimize-overbooking-squared.jl</code></a> MIQP model</li>
<li><a href="/files/2023-11-26-reducing-overscheduling/minimize-bottleneck.jl"><code>minimize-bottleneck.jl</code></a> hierarchical MILP models</li>
</ul>
]]></content:encoded>
    </item>
    <item>
      <title>⭕ Chebyshev Centers of Polygons with Gurobi</title>
      <link>https://ryanjoneil.dev/posts/2014-02-03-chebyshev-centers-of-polygons-with-gurobi/</link>
      <pubDate>Mon, 03 Feb 2014 00:00:00 +0000</pubDate>
      <guid>https://ryanjoneil.dev/posts/2014-02-03-chebyshev-centers-of-polygons-with-gurobi/</guid>
      <description>Finding the maximum area inscribed circle inside a polygon.</description>
      <content:encoded><![CDATA[<p><em>Note: This post was written before Gurobi supported nonlinear optimization. It has been updated to work with Python 3.</em></p>
<p>A common problem in handling geometric data is determining the center of a given polygon. This is not quite so easy as it sounds as there is not a single definition of center that makes sense in all cases. For instance, sometimes computing the center of a polygon&rsquo;s bounding box may be sufficient. In some instances this may give a point on an edge (consider a right triangle). If the given polygon is non-convex, that point may not even be inside or on its boundary.</p>
<p>This post looks at computing Chebyshev centers for arbitrary convex polygons. We employ <a href="https://cvxopt.org/examples/book/centers.html">essentially the same model</a> as in Boyd &amp; Vandenberghe&rsquo;s <a href="https://www.stanford.edu/~boyd/cvxbook/">Convex Optimization</a> text, but using Gurobi instead of CVXOPT.</p>
<p>Consider a polygon defined by the intersection of a finite number of half-spaces, $Au \le b$. We assume we are given the set of vertices, $V$, in clockwise order around the polygon. $E$ is the set of edges connecting these vertices. Each edge in $E$ defines a boundary of the half-space $a_i^\intercal u \le b_i$.</p>
<p><img alt="Intersection of half-spaces" loading="lazy" src="/files/2014-02-03-chebyshev-centers-of-polygons-with-gurobi/intersection-of-half-spaces.svg#center"></p>
<p>$$
V = \{(1,1), (2,5), (5,4), (6,2), (4,1)\}\\
E = \{((1,1),(2,5)), ((2,5),(5,4)), ((5,4),(6,2)), ((6,2),(4,1)), ((4,1),(1,1))\}
$$</p>
<p>The Chebyshev center of this polygon is the center point $(x, y)$ of the maximum radius inscribed circle. That is, if we can find the largest circle that will fit inside our polygon without going outside its boundary, its center is the point we are looking for. Our decision variables are the center $(x, y)$ and the maximum inscribed radius, $r$.</p>
<p>In order to do this, we consider the edges independently. The long line segment below shows an arbitrary edge, $a_i^\intercal u \le b_i$. The short line connected to it is orthogonal in the direction $a$. $(x, y)$ satisfies the inequality.</p>
<p><img alt="Inequality" loading="lazy" src="/files/2014-02-03-chebyshev-centers-of-polygons-with-gurobi/inequality.svg#center"></p>
<p>The shortest distance from $(x, y)$ to the edge will be in the direction of $a_i$. We&rsquo;ll call this distance $r$. If we were to move the edge so it had the same slope but went through $(x, y)$, its right-hand side would differ from $b_i$ by $r||a_i||_2$. Thus we can add a constraint of the form $a_i^\intercal u + r||a_i||_2 \le b_i$ for each edge, where $u = (x, y)$, and maximize the value of $r$ as our objective function.</p>
<p>$$
\begin{align*}
&amp; \text{max}  &amp;&amp; r \\
&amp; \text{s.t.} &amp;&amp; (y_i-y_j)x + (x_j-x_i)y + r\sqrt{(x_j-x_i)^2 + (y_j-y_i)^2} \le (y_i-y_j)x_i + (x_j-x_i)y_i \\
&amp;             &amp;&amp; \quad \forall \quad ((x_i,y_i), (x_j,y_j)) \in E \\
\end{align*}
$$</p>
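<p>Before handing this to a solver, it can help to sanity check the constraint coefficients. The sketch below is plain Python with no solver involved. It builds $a_i$ and $b_i$ for each edge of the example polygon and computes the largest radius that fits around a <em>fixed</em> candidate center. The candidate points $(3, 3)$ and $(0, 0)$ are arbitrary choices for illustration, not the Chebyshev center.</p>

```python
from math import hypot

vertices = [(1, 1), (2, 5), (5, 4), (6, 2), (4, 1)]  # clockwise order
edges = list(zip(vertices, vertices[1:] + vertices[:1]))


def half_space(edge):
    """Return (a, b) for an edge, with a the outward normal and b = a . (x_i, y_i)."""
    (xi, yi), (xj, yj) = edge
    a = (yi - yj, xj - xi)
    b = a[0] * xi + a[1] * yi
    return a, b


def max_radius(center):
    """Largest r that satisfies every edge constraint for a fixed center."""
    x, y = center
    return min(
        (b - a[0] * x - a[1] * y) / hypot(*a)
        for a, b in map(half_space, edges)
    )


print(max_radius((3, 3)))  # positive: (3, 3) is inside the polygon
print(max_radius((0, 0)))  # negative: (0, 0) is outside the polygon
```

<p>The LP does the same thing, except that it also chooses the center that makes this radius as large as possible.</p>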
<p>As this is linear, we can solve it using any LP solver. The following code does so with Gurobi.</p>
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic">#!/usr/bin/env python3</span>
</span></span><span style="display:flex;"><span><span style="color:#ff7b72">from</span> <span style="color:#ff7b72">gurobipy</span> <span style="color:#ff7b72">import</span> Model, GRB
</span></span><span style="display:flex;"><span><span style="color:#ff7b72">from</span> <span style="color:#ff7b72">math</span> <span style="color:#ff7b72">import</span> sqrt
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>vertices <span style="color:#ff7b72;font-weight:bold">=</span> [(<span style="color:#a5d6ff">1</span>,<span style="color:#a5d6ff">1</span>), (<span style="color:#a5d6ff">2</span>,<span style="color:#a5d6ff">5</span>), (<span style="color:#a5d6ff">5</span>,<span style="color:#a5d6ff">4</span>), (<span style="color:#a5d6ff">6</span>,<span style="color:#a5d6ff">2</span>), (<span style="color:#a5d6ff">4</span>,<span style="color:#a5d6ff">1</span>)]
</span></span><span style="display:flex;"><span>edges <span style="color:#ff7b72;font-weight:bold">=</span> zip(vertices, vertices[<span style="color:#a5d6ff">1</span>:] <span style="color:#ff7b72;font-weight:bold">+</span> [vertices[<span style="color:#a5d6ff">0</span>]])
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>m <span style="color:#ff7b72;font-weight:bold">=</span> Model()
</span></span><span style="display:flex;"><span>r <span style="color:#ff7b72;font-weight:bold">=</span> m<span style="color:#ff7b72;font-weight:bold">.</span>addVar()
</span></span><span style="display:flex;"><span>x <span style="color:#ff7b72;font-weight:bold">=</span> m<span style="color:#ff7b72;font-weight:bold">.</span>addVar(lb<span style="color:#ff7b72;font-weight:bold">=-</span>GRB<span style="color:#ff7b72;font-weight:bold">.</span>INFINITY)
</span></span><span style="display:flex;"><span>y <span style="color:#ff7b72;font-weight:bold">=</span> m<span style="color:#ff7b72;font-weight:bold">.</span>addVar(lb<span style="color:#ff7b72;font-weight:bold">=-</span>GRB<span style="color:#ff7b72;font-weight:bold">.</span>INFINITY)
</span></span><span style="display:flex;"><span>m<span style="color:#ff7b72;font-weight:bold">.</span>update()
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#ff7b72">for</span> (x1, y1), (x2, y2) <span style="color:#ff7b72;font-weight:bold">in</span> edges:
</span></span><span style="display:flex;"><span>    dx <span style="color:#ff7b72;font-weight:bold">=</span> x2 <span style="color:#ff7b72;font-weight:bold">-</span> x1
</span></span><span style="display:flex;"><span>    dy <span style="color:#ff7b72;font-weight:bold">=</span> y2 <span style="color:#ff7b72;font-weight:bold">-</span> y1
</span></span><span style="display:flex;"><span>    m<span style="color:#ff7b72;font-weight:bold">.</span>addConstr((dx<span style="color:#ff7b72;font-weight:bold">*</span>y <span style="color:#ff7b72;font-weight:bold">-</span> dy<span style="color:#ff7b72;font-weight:bold">*</span>x) <span style="color:#ff7b72;font-weight:bold">+</span> (r <span style="color:#ff7b72;font-weight:bold">*</span> sqrt(dx<span style="color:#ff7b72;font-weight:bold">**</span><span style="color:#a5d6ff">2</span> <span style="color:#ff7b72;font-weight:bold">+</span> dy<span style="color:#ff7b72;font-weight:bold">**</span><span style="color:#a5d6ff">2</span>)) <span style="color:#ff7b72;font-weight:bold">&lt;=</span> dx<span style="color:#ff7b72;font-weight:bold">*</span>y1 <span style="color:#ff7b72;font-weight:bold">-</span> dy<span style="color:#ff7b72;font-weight:bold">*</span>x1)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>m<span style="color:#ff7b72;font-weight:bold">.</span>setObjective(r, GRB<span style="color:#ff7b72;font-weight:bold">.</span>MAXIMIZE)
</span></span><span style="display:flex;"><span>m<span style="color:#ff7b72;font-weight:bold">.</span>optimize()
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>print(<span style="color:#a5d6ff">&#39;r = </span><span style="color:#a5d6ff">%.04f</span><span style="color:#a5d6ff">&#39;</span> <span style="color:#ff7b72;font-weight:bold">%</span> r<span style="color:#ff7b72;font-weight:bold">.</span>x)
</span></span><span style="display:flex;"><span>print(<span style="color:#a5d6ff">&#39;(x, y) = (</span><span style="color:#a5d6ff">%.04f</span><span style="color:#a5d6ff">, </span><span style="color:#a5d6ff">%.04f</span><span style="color:#a5d6ff">)&#39;</span> <span style="color:#ff7b72;font-weight:bold">%</span> (x<span style="color:#ff7b72;font-weight:bold">.</span>x, y<span style="color:#ff7b72;font-weight:bold">.</span>x))
</span></span></code></pre></div><p>The model output shows our center and its maximum inscribed radius.</p>
<p>$$
r = 1.7466\\
(x, y) = (3.2370, 2.7466)
$$</p>
<p><img alt="Center" loading="lazy" src="/files/2014-02-03-chebyshev-centers-of-polygons-with-gurobi/center.svg#center"></p>
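<p>As a sanity check that needs no solver, the distance from the reported center to each edge can be computed directly from the constraint above. This is a minimal sketch: the helper name <code>edge_distances</code> is an assumption for illustration, and the center is the rounded value printed by the model.</p>

```python
from math import hypot

# Polygon from the example above, listed clockwise.
vertices = [(1, 1), (2, 5), (5, 4), (6, 2), (4, 1)]


def edge_distances(center, vertices):
    """Signed distance from center to each edge; positive means inside."""
    cx, cy = center
    edges = zip(vertices, vertices[1:] + [vertices[0]])
    distances = []
    for (x1, y1), (x2, y2) in edges:
        dx, dy = x2 - x1, y2 - y1
        # Slack of (dx*y - dy*x) <= dx*y1 - dy*x1, scaled by the edge length.
        slack = (dx * y1 - dy * x1) - (dx * cy - dy * cx)
        distances.append(slack / hypot(dx, dy))
    return distances


# Rounded center reported by the model output above.
distances = edge_distances((3.2370, 2.7466), vertices)
for d in distances:
    print(f"{d:.4f}")
print(f"min = {min(distances):.4f}")
```

<p>The smallest of these distances should agree with the reported $r$ up to rounding, and the edges that attain it are the ones the inscribed circle touches.</p>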
<p>Question for the reader: in certain circumstances, such as rectangles, the Chebyshev center is ambiguous. How might one get around this ambiguity?</p>
]]></content:encoded>
    </item>
    <item>
      <title>🏖️ Lagrangian Relaxation with Gurobi</title>
      <link>https://ryanjoneil.dev/posts/2012-09-22-lagrangian-relaxation-with-gurobi/</link>
      <pubDate>Sat, 22 Sep 2012 00:00:00 +0000</pubDate>
      <guid>https://ryanjoneil.dev/posts/2012-09-22-lagrangian-relaxation-with-gurobi/</guid>
      <description>Solving integer programs with Lagrangian relaxation and Gurobi.</description>
      <content:encoded><![CDATA[<p><em>Note: This post was updated to work with Python 3 and the 2nd edition of &ldquo;Integer Programming&rdquo; by Laurence Wolsey.</em></p>
<p>We&rsquo;ve been studying Lagrangian Relaxation (LR) in the Advanced Topics in Combinatorial Optimization course I&rsquo;m taking this term, and I had some difficulty finding a simple example covering its application. In case anyone else finds it useful, I&rsquo;m posting a Python version for solving the <a href="https://en.wikipedia.org/wiki/Generalized_assignment_problem">Generalized Assignment Problem</a> (GAP). This won&rsquo;t discuss the theory of LR at all, just give example code using Gurobi.</p>
<h2 id="generalized-assignment">Generalized assignment</h2>
<p>The GAP as defined by <a href="https://onlinelibrary.wiley.com/doi/book/10.1002/9781119606475">Wolsey</a> consists of a maximization problem subject to a set of set packing constraints followed by a set of knapsack constraints.</p>
<p>$$
\begin{align*}
&amp; \text{max}  &amp;&amp; \sum_i \sum_j c_{ij} x_{ij} \\
&amp; \text{s.t.} &amp;&amp; \sum_j x_{ij} \leq 1             &amp;&amp; \forall i \\
&amp;             &amp;&amp; \sum_i a_{ij} x_{ij} \leq b_j    &amp;&amp; \forall j \\
&amp;             &amp;&amp; x_{ij} \in \{0, 1\}              &amp;&amp; \forall i, j
\end{align*}
$$</p>
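<p>To make the formulation concrete, the toy instance solved in this post is small enough to enumerate. The following is a brute-force sketch, not part of the original scripts: the function name <code>brute_force_gap</code> is made up for illustration, and the data is copied from the Gurobi scripts in this post. Each item $i$ is either left unassigned or assigned to one knapsack $j$, which satisfies the set packing constraints by construction.</p>

```python
from itertools import product

# Instance data copied from the scripts in this post.
b = [15, 15, 15]
c = [[6, 10, 1], [12, 12, 5], [15, 4, 3], [10, 3, 9], [8, 9, 5]]
a = [[5, 7, 2], [14, 8, 7], [10, 6, 12], [8, 4, 15], [6, 12, 5]]


def brute_force_gap(a, b, c):
    """Enumerate every assignment of items to knapsacks (or to none)."""
    choices = [None] + list(range(len(b)))
    best_value, best_assignment = 0, (None,) * len(c)
    for assignment in product(choices, repeat=len(c)):
        load = [0] * len(b)
        value = 0
        for i, j in enumerate(assignment):
            if j is not None:
                load[j] += a[i][j]
                value += c[i][j]
        # Keep the assignment only if every knapsack respects its capacity.
        feasible = all(l_j <= b_j for l_j, b_j in zip(load, b))
        if feasible and value > best_value:
            best_value, best_assignment = value, assignment
    return best_value, best_assignment


best_value, best_assignment = brute_force_gap(a, b, c)
print(best_value, best_assignment)
```

<p>For this instance the enumeration covers $4^5 = 1024$ assignments, so its result can be compared directly against the solver output reported below.</p>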
<h3 id="naive-model">Naive model</h3>
<p>A naive version of this model using Gurobi might look like the following.</p>
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic">#!/usr/bin/env python3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># This is the GAP per Wolsey, pg 208.</span>
</span></span><span style="display:flex;"><span><span style="color:#ff7b72">from</span> <span style="color:#ff7b72">gurobipy</span> <span style="color:#ff7b72">import</span> Model, GRB, quicksum <span style="color:#ff7b72">as</span> qsum
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>m <span style="color:#ff7b72;font-weight:bold">=</span> Model(<span style="color:#a5d6ff">&#34;GAP per Wolsey&#34;</span>)
</span></span><span style="display:flex;"><span>m<span style="color:#ff7b72;font-weight:bold">.</span>modelSense <span style="color:#ff7b72;font-weight:bold">=</span> GRB<span style="color:#ff7b72;font-weight:bold">.</span>MAXIMIZE
</span></span><span style="display:flex;"><span>m<span style="color:#ff7b72;font-weight:bold">.</span>setParam(<span style="color:#a5d6ff">&#34;OutputFlag&#34;</span>, <span style="color:#79c0ff">False</span>)  <span style="color:#8b949e;font-style:italic"># turns off solver chatter</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>b <span style="color:#ff7b72;font-weight:bold">=</span> [<span style="color:#a5d6ff">15</span>, <span style="color:#a5d6ff">15</span>, <span style="color:#a5d6ff">15</span>]
</span></span><span style="display:flex;"><span>c <span style="color:#ff7b72;font-weight:bold">=</span> [
</span></span><span style="display:flex;"><span>    [<span style="color:#a5d6ff">6</span>, <span style="color:#a5d6ff">10</span>, <span style="color:#a5d6ff">1</span>],
</span></span><span style="display:flex;"><span>    [<span style="color:#a5d6ff">12</span>, <span style="color:#a5d6ff">12</span>, <span style="color:#a5d6ff">5</span>],
</span></span><span style="display:flex;"><span>    [<span style="color:#a5d6ff">15</span>, <span style="color:#a5d6ff">4</span>, <span style="color:#a5d6ff">3</span>],
</span></span><span style="display:flex;"><span>    [<span style="color:#a5d6ff">10</span>, <span style="color:#a5d6ff">3</span>, <span style="color:#a5d6ff">9</span>],
</span></span><span style="display:flex;"><span>    [<span style="color:#a5d6ff">8</span>, <span style="color:#a5d6ff">9</span>, <span style="color:#a5d6ff">5</span>],
</span></span><span style="display:flex;"><span>]
</span></span><span style="display:flex;"><span>a <span style="color:#ff7b72;font-weight:bold">=</span> [
</span></span><span style="display:flex;"><span>    [<span style="color:#a5d6ff">5</span>, <span style="color:#a5d6ff">7</span>, <span style="color:#a5d6ff">2</span>],
</span></span><span style="display:flex;"><span>    [<span style="color:#a5d6ff">14</span>, <span style="color:#a5d6ff">8</span>, <span style="color:#a5d6ff">7</span>],
</span></span><span style="display:flex;"><span>    [<span style="color:#a5d6ff">10</span>, <span style="color:#a5d6ff">6</span>, <span style="color:#a5d6ff">12</span>],
</span></span><span style="display:flex;"><span>    [<span style="color:#a5d6ff">8</span>, <span style="color:#a5d6ff">4</span>, <span style="color:#a5d6ff">15</span>],
</span></span><span style="display:flex;"><span>    [<span style="color:#a5d6ff">6</span>, <span style="color:#a5d6ff">12</span>, <span style="color:#a5d6ff">5</span>],
</span></span><span style="display:flex;"><span>]
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># x[i][j] = 1 if i is assigned to j</span>
</span></span><span style="display:flex;"><span>x <span style="color:#ff7b72;font-weight:bold">=</span> [[m<span style="color:#ff7b72;font-weight:bold">.</span>addVar(vtype<span style="color:#ff7b72;font-weight:bold">=</span>GRB<span style="color:#ff7b72;font-weight:bold">.</span>BINARY) <span style="color:#ff7b72">for</span> _ <span style="color:#ff7b72;font-weight:bold">in</span> row] <span style="color:#ff7b72">for</span> row <span style="color:#ff7b72;font-weight:bold">in</span> c]
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># sum j: x_ij &lt;= 1 for all i</span>
</span></span><span style="display:flex;"><span><span style="color:#ff7b72">for</span> x_i <span style="color:#ff7b72;font-weight:bold">in</span> x:
</span></span><span style="display:flex;"><span>    m<span style="color:#ff7b72;font-weight:bold">.</span>addConstr(sum(x_i) <span style="color:#ff7b72;font-weight:bold">&lt;=</span> <span style="color:#a5d6ff">1</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># sum i: a_ij * x_ij &lt;= b[j] for all j</span>
</span></span><span style="display:flex;"><span><span style="color:#ff7b72">for</span> j, b_j <span style="color:#ff7b72;font-weight:bold">in</span> enumerate(b):
</span></span><span style="display:flex;"><span>    m<span style="color:#ff7b72;font-weight:bold">.</span>addConstr(qsum(a[i][j] <span style="color:#ff7b72;font-weight:bold">*</span> x_i[j] <span style="color:#ff7b72">for</span> i, x_i <span style="color:#ff7b72;font-weight:bold">in</span> enumerate(x)) <span style="color:#ff7b72;font-weight:bold">&lt;=</span> b_j)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># max sum i,j: c_ij * x_ij</span>
</span></span><span style="display:flex;"><span>m<span style="color:#ff7b72;font-weight:bold">.</span>setObjective(
</span></span><span style="display:flex;"><span>    qsum(qsum(c_ij <span style="color:#ff7b72;font-weight:bold">*</span> x_ij <span style="color:#ff7b72">for</span> c_ij, x_ij <span style="color:#ff7b72;font-weight:bold">in</span> zip(c_i, x_i)) <span style="color:#ff7b72">for</span> c_i, x_i <span style="color:#ff7b72;font-weight:bold">in</span> zip(c, x))
</span></span><span style="display:flex;"><span>)
</span></span><span style="display:flex;"><span>m<span style="color:#ff7b72;font-weight:bold">.</span>optimize()
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># Pull solution out of m.</span>
</span></span><span style="display:flex;"><span>print(<span style="color:#79c0ff">f</span><span style="color:#a5d6ff">&#34;z = </span><span style="color:#a5d6ff">{</span>m<span style="color:#ff7b72;font-weight:bold">.</span>objVal<span style="color:#a5d6ff">}</span><span style="color:#a5d6ff">&#34;</span>)
</span></span><span style="display:flex;"><span>print(<span style="color:#a5d6ff">&#34;x = [&#34;</span>)
</span></span><span style="display:flex;"><span><span style="color:#ff7b72">for</span> x_i <span style="color:#ff7b72;font-weight:bold">in</span> x:
</span></span><span style="display:flex;"><span>    print(<span style="color:#79c0ff">f</span><span style="color:#a5d6ff">&#34;  </span><span style="color:#a5d6ff">{</span>[<span style="color:#a5d6ff">1</span> <span style="color:#ff7b72">if</span> x_ij<span style="color:#ff7b72;font-weight:bold">.</span>x <span style="color:#ff7b72;font-weight:bold">&gt;=</span> <span style="color:#a5d6ff">0.5</span> <span style="color:#ff7b72">else</span> <span style="color:#a5d6ff">0</span> <span style="color:#ff7b72">for</span> x_ij <span style="color:#ff7b72;font-weight:bold">in</span> x_i]<span style="color:#a5d6ff">}</span><span style="color:#a5d6ff">&#34;</span>)
</span></span><span style="display:flex;"><span>print(<span style="color:#a5d6ff">&#34;]&#34;</span>)
</span></span></code></pre></div><p>The solver quickly finds the following optimal solution of this toy problem.</p>
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-text" data-lang="text"><span style="display:flex;"><span>z = 46.0
</span></span><span style="display:flex;"><span>x = [
</span></span><span style="display:flex;"><span>  [0, 1, 0]
</span></span><span style="display:flex;"><span>  [0, 1, 0]
</span></span><span style="display:flex;"><span>  [1, 0, 0]
</span></span><span style="display:flex;"><span>  [0, 0, 1]
</span></span><span style="display:flex;"><span>  [0, 0, 0]
</span></span><span style="display:flex;"><span>]
</span></span></code></pre></div><h3 id="lagrangian-model">Lagrangian model</h3>
<p>There are two sets of constraints we can dualize. It can be beneficial to apply Lagrangian Relaxation against problems composed of knapsack constraints, so we will dualize the set packing ones.</p>
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># sum j: x_ij &lt;= 1 for all i</span>
</span></span><span style="display:flex;"><span><span style="color:#ff7b72">for</span> x_i <span style="color:#ff7b72;font-weight:bold">in</span> x:
</span></span><span style="display:flex;"><span>    m<span style="color:#ff7b72;font-weight:bold">.</span>addConstr(sum(x_i) <span style="color:#ff7b72;font-weight:bold">&lt;=</span> <span style="color:#a5d6ff">1</span>)
</span></span></code></pre></div><p>We replace these with a new set of variables, <code>penalties</code>, which take the values of the slacks on the set packing constraints. We then modify the objective function, adding Lagrangian multipliers times these penalties.</p>
<p>Instead of optimizing once, we do so iteratively. An important consideration is that this process may yield nothing more than a dual bound. A solution of the relaxed problem is only guaranteed to be optimal for the original problem if it satisfies the dualized constraints and the complementary slackness conditions &ndash; for each dualized constraint, either its multiplier or its penalty must be zero.</p>
<p>We then set the initial multiplier values to 2 and use sub-gradient optimization with a step size of <code>1 / (iteration #)</code> to adjust them.</p>
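<p>In symbols, and using the names from the script below, iteration $k$ solves the relaxed problem for fixed multipliers $u^k$, subject to the knapsack constraints, and then moves each multiplier against its penalty. This is a sketch of what the code computes, with $p_i$ standing for <code>penalties[i]</code> and $p_i^k$ for its value at the optimum of iteration $k$:</p>
<p>$$
\begin{align*}
z(u^k)    &amp;= \max_x \sum_i \sum_j c_{ij} x_{ij} + \sum_i u_i^k p_i, \qquad p_i = 1 - \sum_j x_{ij} \\
u_i^{k+1} &amp;= \max\left(u_i^k - \frac{1}{k} p_i^k,\ 0\right)
\end{align*}
$$</p>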
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic">#!/usr/bin/env python3</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># This is the GAP per Wolsey, pg 208, using Lagrangian Relaxation.</span>
</span></span><span style="display:flex;"><span><span style="color:#ff7b72">from</span> <span style="color:#ff7b72">gurobipy</span> <span style="color:#ff7b72">import</span> Model, GRB, quicksum <span style="color:#ff7b72">as</span> qsum
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>m <span style="color:#ff7b72;font-weight:bold">=</span> Model(<span style="color:#a5d6ff">&#34;GAP per Wolsey with Lagrangian Relaxation&#34;</span>)
</span></span><span style="display:flex;"><span>m<span style="color:#ff7b72;font-weight:bold">.</span>modelSense <span style="color:#ff7b72;font-weight:bold">=</span> GRB<span style="color:#ff7b72;font-weight:bold">.</span>MAXIMIZE
</span></span><span style="display:flex;"><span>m<span style="color:#ff7b72;font-weight:bold">.</span>setParam(<span style="color:#a5d6ff">&#34;OutputFlag&#34;</span>, <span style="color:#79c0ff">False</span>)  <span style="color:#8b949e;font-style:italic"># turns off solver chatter</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>b <span style="color:#ff7b72;font-weight:bold">=</span> [<span style="color:#a5d6ff">15</span>, <span style="color:#a5d6ff">15</span>, <span style="color:#a5d6ff">15</span>]
</span></span><span style="display:flex;"><span>c <span style="color:#ff7b72;font-weight:bold">=</span> [
</span></span><span style="display:flex;"><span>    [<span style="color:#a5d6ff">6</span>, <span style="color:#a5d6ff">10</span>, <span style="color:#a5d6ff">1</span>],
</span></span><span style="display:flex;"><span>    [<span style="color:#a5d6ff">12</span>, <span style="color:#a5d6ff">12</span>, <span style="color:#a5d6ff">5</span>],
</span></span><span style="display:flex;"><span>    [<span style="color:#a5d6ff">15</span>, <span style="color:#a5d6ff">4</span>, <span style="color:#a5d6ff">3</span>],
</span></span><span style="display:flex;"><span>    [<span style="color:#a5d6ff">10</span>, <span style="color:#a5d6ff">3</span>, <span style="color:#a5d6ff">9</span>],
</span></span><span style="display:flex;"><span>    [<span style="color:#a5d6ff">8</span>, <span style="color:#a5d6ff">9</span>, <span style="color:#a5d6ff">5</span>],
</span></span><span style="display:flex;"><span>]
</span></span><span style="display:flex;"><span>a <span style="color:#ff7b72;font-weight:bold">=</span> [
</span></span><span style="display:flex;"><span>    [<span style="color:#a5d6ff">5</span>, <span style="color:#a5d6ff">7</span>, <span style="color:#a5d6ff">2</span>],
</span></span><span style="display:flex;"><span>    [<span style="color:#a5d6ff">14</span>, <span style="color:#a5d6ff">8</span>, <span style="color:#a5d6ff">7</span>],
</span></span><span style="display:flex;"><span>    [<span style="color:#a5d6ff">10</span>, <span style="color:#a5d6ff">6</span>, <span style="color:#a5d6ff">12</span>],
</span></span><span style="display:flex;"><span>    [<span style="color:#a5d6ff">8</span>, <span style="color:#a5d6ff">4</span>, <span style="color:#a5d6ff">15</span>],
</span></span><span style="display:flex;"><span>    [<span style="color:#a5d6ff">6</span>, <span style="color:#a5d6ff">12</span>, <span style="color:#a5d6ff">5</span>],
</span></span><span style="display:flex;"><span>]
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># x[i][j] = 1 if i is assigned to j</span>
</span></span><span style="display:flex;"><span>x <span style="color:#ff7b72;font-weight:bold">=</span> [[m<span style="color:#ff7b72;font-weight:bold">.</span>addVar(vtype<span style="color:#ff7b72;font-weight:bold">=</span>GRB<span style="color:#ff7b72;font-weight:bold">.</span>BINARY) <span style="color:#ff7b72">for</span> _ <span style="color:#ff7b72;font-weight:bold">in</span> row] <span style="color:#ff7b72">for</span> row <span style="color:#ff7b72;font-weight:bold">in</span> c]
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># As stated, the GAP has the set packing constraints below. We dualize these into</span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># penalties instead, using variables so we can easily extract their values.</span>
</span></span><span style="display:flex;"><span>penalties <span style="color:#ff7b72;font-weight:bold">=</span> [m<span style="color:#ff7b72;font-weight:bold">.</span>addVar() <span style="color:#ff7b72">for</span> _ <span style="color:#ff7b72;font-weight:bold">in</span> x]
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># Dualized constraints: sum j: x_ij &lt;= 1 for all i</span>
</span></span><span style="display:flex;"><span><span style="color:#ff7b72">for</span> p, x_i <span style="color:#ff7b72;font-weight:bold">in</span> zip(penalties, x):
</span></span><span style="display:flex;"><span>    m<span style="color:#ff7b72;font-weight:bold">.</span>addConstr(p <span style="color:#ff7b72;font-weight:bold">==</span> <span style="color:#a5d6ff">1</span> <span style="color:#ff7b72;font-weight:bold">-</span> sum(x_i))
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># sum i: a_ij * x_ij &lt;= b[j] for all j</span>
</span></span><span style="display:flex;"><span><span style="color:#ff7b72">for</span> j, b_j <span style="color:#ff7b72;font-weight:bold">in</span> enumerate(b):
</span></span><span style="display:flex;"><span>    m<span style="color:#ff7b72;font-weight:bold">.</span>addConstr(qsum(a[i][j] <span style="color:#ff7b72;font-weight:bold">*</span> x_i[j] <span style="color:#ff7b72">for</span> i, x_i <span style="color:#ff7b72;font-weight:bold">in</span> enumerate(x)) <span style="color:#ff7b72;font-weight:bold">&lt;=</span> b_j)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># u[i] = Lagrangian Multiplier for the set packing constraint i</span>
</span></span><span style="display:flex;"><span>u <span style="color:#ff7b72;font-weight:bold">=</span> [<span style="color:#a5d6ff">2.0</span>] <span style="color:#ff7b72;font-weight:bold">*</span> len(x)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># Re-optimize until either we have run a certain number of iterations</span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># or complementary slackness conditions apply.</span>
</span></span><span style="display:flex;"><span><span style="color:#ff7b72">for</span> k <span style="color:#ff7b72;font-weight:bold">in</span> range(<span style="color:#a5d6ff">1</span>, <span style="color:#a5d6ff">101</span>):
</span></span><span style="display:flex;"><span>    <span style="color:#8b949e;font-style:italic"># max sum i,j: c_ij * x_ij + sum i: u_i * p_i</span>
</span></span><span style="display:flex;"><span>    m<span style="color:#ff7b72;font-weight:bold">.</span>setObjective(
</span></span><span style="display:flex;"><span>        qsum(
</span></span><span style="display:flex;"><span>            <span style="color:#8b949e;font-style:italic"># Original objective function</span>
</span></span><span style="display:flex;"><span>            sum(c_ij <span style="color:#ff7b72;font-weight:bold">*</span> x_ij <span style="color:#ff7b72">for</span> c_ij, x_ij <span style="color:#ff7b72;font-weight:bold">in</span> zip(c_i, x_i))
</span></span><span style="display:flex;"><span>            <span style="color:#ff7b72">for</span> c_i, x_i <span style="color:#ff7b72;font-weight:bold">in</span> zip(c, x)
</span></span><span style="display:flex;"><span>        )
</span></span><span style="display:flex;"><span>        <span style="color:#ff7b72;font-weight:bold">+</span> qsum(
</span></span><span style="display:flex;"><span>            <span style="color:#8b949e;font-style:italic"># Penalties for dualized constraints</span>
</span></span><span style="display:flex;"><span>            u_i <span style="color:#ff7b72;font-weight:bold">*</span> p_i
</span></span><span style="display:flex;"><span>            <span style="color:#ff7b72">for</span> u_i, p_i <span style="color:#ff7b72;font-weight:bold">in</span> zip(u, penalties)
</span></span><span style="display:flex;"><span>        )
</span></span><span style="display:flex;"><span>    )
</span></span><span style="display:flex;"><span>    m<span style="color:#ff7b72;font-weight:bold">.</span>optimize()
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    print(
</span></span><span style="display:flex;"><span>        <span style="color:#79c0ff">f</span><span style="color:#a5d6ff">&#34;iteration </span><span style="color:#a5d6ff">{</span>k<span style="color:#a5d6ff">}</span><span style="color:#a5d6ff">: z = </span><span style="color:#a5d6ff">{</span>m<span style="color:#ff7b72;font-weight:bold">.</span>objVal<span style="color:#a5d6ff">}</span><span style="color:#a5d6ff">, u = </span><span style="color:#a5d6ff">{</span>u<span style="color:#a5d6ff">}</span><span style="color:#a5d6ff">, penalties = </span><span style="color:#a5d6ff">{</span>[p<span style="color:#ff7b72;font-weight:bold">.</span>x <span style="color:#ff7b72">for</span> p <span style="color:#ff7b72;font-weight:bold">in</span> penalties]<span style="color:#a5d6ff">}</span><span style="color:#a5d6ff">&#34;</span>
</span></span><span style="display:flex;"><span>    )
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#8b949e;font-style:italic"># Test for complementary slackness</span>
</span></span><span style="display:flex;"><span>    stop <span style="color:#ff7b72;font-weight:bold">=</span> <span style="color:#79c0ff">True</span>
</span></span><span style="display:flex;"><span>    eps <span style="color:#ff7b72;font-weight:bold">=</span> <span style="color:#a5d6ff">10e-6</span>
</span></span><span style="display:flex;"><span>    <span style="color:#ff7b72">for</span> u_i, p_i <span style="color:#ff7b72;font-weight:bold">in</span> zip(u, penalties):
</span></span><span style="display:flex;"><span>        <span style="color:#ff7b72">if</span> abs(u_i) <span style="color:#ff7b72;font-weight:bold">&gt;</span> eps <span style="color:#ff7b72;font-weight:bold">and</span> abs(p_i<span style="color:#ff7b72;font-weight:bold">.</span>x) <span style="color:#ff7b72;font-weight:bold">&gt;</span> eps:
</span></span><span style="display:flex;"><span>            stop <span style="color:#ff7b72;font-weight:bold">=</span> <span style="color:#79c0ff">False</span>
</span></span><span style="display:flex;"><span>            <span style="color:#ff7b72">break</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#ff7b72">if</span> stop:
</span></span><span style="display:flex;"><span>        print(<span style="color:#a5d6ff">&#34;primal feasible &amp; optimal&#34;</span>)
</span></span><span style="display:flex;"><span>        <span style="color:#ff7b72">break</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#ff7b72">else</span>:
</span></span><span style="display:flex;"><span>        s <span style="color:#ff7b72;font-weight:bold">=</span> <span style="color:#a5d6ff">1.0</span> <span style="color:#ff7b72;font-weight:bold">/</span> k
</span></span><span style="display:flex;"><span>        <span style="color:#ff7b72">for</span> i <span style="color:#ff7b72;font-weight:bold">in</span> range(len(x)):
</span></span><span style="display:flex;"><span>            u[i] <span style="color:#ff7b72;font-weight:bold">=</span> max(u[i] <span style="color:#ff7b72;font-weight:bold">-</span> s <span style="color:#ff7b72;font-weight:bold">*</span> (penalties[i]<span style="color:#ff7b72;font-weight:bold">.</span>x), <span style="color:#a5d6ff">0.0</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># Pull solution out of m.</span>
</span></span><span style="display:flex;"><span>print(<span style="color:#79c0ff">f</span><span style="color:#a5d6ff">&#34;z = </span><span style="color:#a5d6ff">{</span>m<span style="color:#ff7b72;font-weight:bold">.</span>objVal<span style="color:#a5d6ff">}</span><span style="color:#a5d6ff">&#34;</span>)
</span></span><span style="display:flex;"><span>print(<span style="color:#a5d6ff">&#34;x = [&#34;</span>)
</span></span><span style="display:flex;"><span><span style="color:#ff7b72">for</span> x_i <span style="color:#ff7b72;font-weight:bold">in</span> x:
</span></span><span style="display:flex;"><span>    print(<span style="color:#79c0ff">f</span><span style="color:#a5d6ff">&#34;  </span><span style="color:#a5d6ff">{</span>[<span style="color:#a5d6ff">1</span> <span style="color:#ff7b72">if</span> x_ij<span style="color:#ff7b72;font-weight:bold">.</span>x <span style="color:#ff7b72;font-weight:bold">&gt;=</span> <span style="color:#a5d6ff">0.5</span> <span style="color:#ff7b72">else</span> <span style="color:#a5d6ff">0</span> <span style="color:#ff7b72">for</span> x_ij <span style="color:#ff7b72;font-weight:bold">in</span> x_i]<span style="color:#a5d6ff">}</span><span style="color:#a5d6ff">&#34;</span>)
</span></span><span style="display:flex;"><span>print(<span style="color:#a5d6ff">&#34;]&#34;</span>)
</span></span></code></pre></div><p>Again, the example converges very quickly to an optimal solution.</p>
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-txt" data-lang="txt"><span style="display:flex;"><span>iteration 1: z = 48.0, u = [2.0, 2.0, 2.0, 2.0, 2.000], penalties = [0.0, 0.0, 0.0, 0.0, 1.0]
</span></span><span style="display:flex;"><span>iteration 2: z = 47.0, u = [2.0, 2.0, 2.0, 2.0, 1.000], penalties = [0.0, 0.0, 0.0, 0.0, 1.0]
</span></span><span style="display:flex;"><span>iteration 3: z = 46.5, u = [2.0, 2.0, 2.0, 2.0, 0.500], penalties = [0.0, 0.0, 0.0, 0.0, 1.0]
</span></span><span style="display:flex;"><span>iteration 4: z = 46.2, u = [2.0, 2.0, 2.0, 2.0, 0.167], penalties = [0.0, 0.0, 0.0, 0.0, 1.0]
</span></span><span style="display:flex;"><span>iteration 5: z = 46.0, u = [2.0, 2.0, 2.0, 2.0, 0.000], penalties = [0.0, 0.0, 0.0, 0.0, 1.0]
</span></span><span style="display:flex;"><span>primal feasible &amp; optimal
</span></span><span style="display:flex;"><span>z = 46.0
</span></span><span style="display:flex;"><span>x = [
</span></span><span style="display:flex;"><span>  [0, 1, 0]
</span></span><span style="display:flex;"><span>  [0, 1, 0]
</span></span><span style="display:flex;"><span>  [1, 0, 0]
</span></span><span style="display:flex;"><span>  [0, 0, 1]
</span></span><span style="display:flex;"><span>  [0, 0, 0]
</span></span><span style="display:flex;"><span>]
</span></span></code></pre></div><p>Exercise for the reader: change the script to dualize the knapsack constraints instead of the set packing constraints. What is the result of this change in terms of convergence?</p>
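<p>The multiplier sequence in the output above can be reproduced by hand. The following is a minimal sketch that applies the projected subgradient update from the script to the single multiplier that changes. The step size $s = 1/k$ at iteration $k$ is an assumption: it is consistent with the printed values, but it is not shown in the excerpt above.</p>
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"># Reproduce the last entry of u from the log above.
u = 2.0        # initial multiplier, as printed in iteration 1
penalty = 1.0  # penalty value printed for that constraint in every iteration

history = []
for k in range(1, 6):
    history.append(round(u, 3))
    s = 1.0 / k  # assumed diminishing step size (not confirmed by the post)
    # Projected subgradient step: move against the penalty, then clip at zero.
    u = max(u - s * penalty, 0.0)

print(history)  # [2.0, 1.0, 0.5, 0.167, 0.0]
</code></pre></div>
<p>Once the multiplier reaches zero, it no longer contributes to the objective even though its penalty variable is still nonzero, which matches the bound settling at $z = 46$.</p>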
<h2 id="resources">Resources</h2>
<ul>
<li><a href="/files/2012-09-22-lagrangian-relaxation-with-gurobi/gap.py"><code>gap.py</code></a></li>
<li><a href="/files/2012-09-22-lagrangian-relaxation-with-gurobi/gap-lagrangian.py"><code>gap-lagrangian.py</code></a></li>
</ul>
]]></content:encoded>
    </item>
    <item>
      <title>🔲 Normal Magic Squares</title>
      <link>https://ryanjoneil.dev/posts/2012-01-13-normal-magic-squares/</link>
      <pubDate>Fri, 13 Jan 2012 00:00:00 +0000</pubDate>
      <guid>https://ryanjoneil.dev/posts/2012-01-13-normal-magic-squares/</guid>
      <description>An integer programming formulation of the normal magic squares problem.</description>
      <content:encoded><![CDATA[<p><em>Note: This post was updated to work with Python 3 and <a href="https://github.com/scipopt/PySCIPOpt">PySCIPOpt</a>. The original version used Python 2 and <a href="https://pythonhosted.org/python-zibopt/">python-zibopt</a>. It has also been edited for clarity.</em></p>
<p>As a follow-up to the <a href="../2012-01-12-magic-squares-and-big-ms/">last post</a>, I created <a href="/files/2012-01-13-normal-magic-squares/normal-magic-square.py">another SCIP example</a> for finding Normal Magic Squares. This is similar to <a href="https://github.com/CPMpy/cpmpy/blob/master/examples/quickstart_sudoku.ipynb">solving a Sudoku problem</a>, except that here the number of binary variables depends on the square size. In the case of Sudoku, each cell has 9 binary variables &ndash; one for each potential value it might take. For a normal magic square, there are $n^2$ possible values for each cell, $n^2$ cells, and one variable representing the row, column, and diagonal sums. This makes a total of $n^4$ binary variables and one continuous variable in the model.</p>
<p>However, there are no big-Ms.</p>
<p>I think the neat part of this code is in this section:</p>
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># Construct an expression for each cell that is the sum of</span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># its binary variables with their associated coefficients.</span>
</span></span><span style="display:flex;"><span>sums <span style="color:#ff7b72;font-weight:bold">=</span> []
</span></span><span style="display:flex;"><span><span style="color:#ff7b72">for</span> row <span style="color:#ff7b72;font-weight:bold">in</span> matrix:
</span></span><span style="display:flex;"><span>    sums_row <span style="color:#ff7b72;font-weight:bold">=</span> []
</span></span><span style="display:flex;"><span>    <span style="color:#ff7b72">for</span> cell <span style="color:#ff7b72;font-weight:bold">in</span> row:
</span></span><span style="display:flex;"><span>        sums_row<span style="color:#ff7b72;font-weight:bold">.</span>append(sum((i <span style="color:#ff7b72;font-weight:bold">+</span> <span style="color:#a5d6ff">1</span>) <span style="color:#ff7b72;font-weight:bold">*</span> x <span style="color:#ff7b72">for</span> i, x <span style="color:#ff7b72;font-weight:bold">in</span> enumerate(cell)))
</span></span><span style="display:flex;"><span>    sums<span style="color:#ff7b72;font-weight:bold">.</span>append(sums_row)
</span></span></code></pre></div><p>It creates sums of the $n^2$ variables for each cell with their appropriate coefficients ($1$ to $n^2$) and stores those expressions to make the subsequent constraint creation simpler.</p>
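<p>To see what these expressions evaluate to, here is a small sketch in plain Python that needs no solver. It fixes the binary variables to encode a known $3 \times 3$ normal magic square (the Lo Shu square, chosen here purely for illustration) and evaluates the same sum on those 0/1 values. The names <code>square</code>, <code>num_binaries</code>, and <code>value_counts</code> are hypothetical and do not come from the linked script.</p>
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">n = 3
square = [[2, 7, 6], [9, 5, 1], [4, 3, 8]]  # a known normal magic square

# matrix[r][c][k] is 1 when cell (r, c) holds the value k + 1, else 0.
matrix = [
    [[1 if square[r][c] == k + 1 else 0 for k in range(n * n)] for c in range(n)]
    for r in range(n)
]

# Same expression as in the post, evaluated on fixed 0/1 values.
sums = [
    [sum((i + 1) * x for i, x in enumerate(cell)) for cell in row]
    for row in matrix
]

num_binaries = sum(len(cell) for row in matrix for cell in row)
row_sums = [sum(row) for row in sums]
col_sums = [sum(sums[r][c] for r in range(n)) for c in range(n)]
diag_sum = sum(sums[i][i] for i in range(n))

# Each value 1..n^2 is used by exactly one cell.
value_counts = [
    sum(matrix[r][c][k] for r in range(n) for c in range(n))
    for k in range(n * n)
]

print(sums == square)                # True
print(num_binaries)                  # 81, which is n^4
print(row_sums, col_sums, diag_sum)  # [15, 15, 15] [15, 15, 15] 15
</code></pre></div>
<p>In the actual model the entries of <code>matrix</code> are solver variables rather than fixed numbers, so each entry of <code>sums</code> is a linear expression instead of an integer.</p>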
<p>Another interesting exercise for the reader: Change <a href="/files/2012-01-13-normal-magic-squares/normal-magic-square.py">the code</a> to minimize the sum of each column. How does that impact the solution time?</p>
]]></content:encoded>
    </item>
    <item>
      <title>🔲 Magic Squares and Big-Ms</title>
      <link>https://ryanjoneil.dev/posts/2012-01-12-magic-squares-and-big-ms/</link>
      <pubDate>Thu, 12 Jan 2012 00:00:00 +0000</pubDate>
      <guid>https://ryanjoneil.dev/posts/2012-01-12-magic-squares-and-big-ms/</guid>
      <description>An integer programming formulation of the magic squares problem.</description>
      <content:encoded><![CDATA[<p><em>Note: This post was updated to work with Python 3 and <a href="https://github.com/scipopt/PySCIPOpt">PySCIPOpt</a>. The original version used Python 2 and <a href="https://pythonhosted.org/python-zibopt/">python-zibopt</a>. It has also been edited for clarity.</em></p>
<p>Back in October of 2011, I started toying with a model for finding <a href="https://en.wikipedia.org/wiki/Magic_square">magic squares</a> using SCIP. This is a fun modeling exercise and a challenging problem. First one constructs a square matrix of integer-valued variables.</p>
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#ff7b72">from</span> <span style="color:#ff7b72">pyscipopt</span> <span style="color:#ff7b72">import</span> Model
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#8b949e;font-style:italic"># [...snip...]</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>m <span style="color:#ff7b72;font-weight:bold">=</span> Model()
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>matrix <span style="color:#ff7b72;font-weight:bold">=</span> []
</span></span><span style="display:flex;"><span><span style="color:#ff7b72">for</span> i <span style="color:#ff7b72;font-weight:bold">in</span> range(size):
</span></span><span style="display:flex;"><span>    row <span style="color:#ff7b72;font-weight:bold">=</span> [m<span style="color:#ff7b72;font-weight:bold">.</span>addVar(vtype<span style="color:#ff7b72;font-weight:bold">=</span><span style="color:#a5d6ff">&#34;I&#34;</span>, lb<span style="color:#ff7b72;font-weight:bold">=</span><span style="color:#a5d6ff">1</span>) <span style="color:#ff7b72">for</span> _ <span style="color:#ff7b72;font-weight:bold">in</span> range(size)]
</span></span><span style="display:flex;"><span>    <span style="color:#ff7b72">for</span> x <span style="color:#ff7b72;font-weight:bold">in</span> row:
</span></span><span style="display:flex;"><span>        m<span style="color:#ff7b72;font-weight:bold">.</span>addCons(x <span style="color:#ff7b72;font-weight:bold">&lt;=</span> M)
</span></span><span style="display:flex;"><span>    matrix<span style="color:#ff7b72;font-weight:bold">.</span>append(row)
</span></span></code></pre></div><p>Then one adds the following constraints:</p>
<ul>
<li>All variables ≥ 1.</li>
<li>All rows, columns, and the diagonal sum to the same value.</li>
<li>All variables take different values.</li>
</ul>
<p>The first two constraints are trivial to implement, and relatively easy for the solver. I add a single extra variable and set it equal to the sum of each row, each column, and the diagonal.</p>
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span>sum_val <span style="color:#ff7b72;font-weight:bold">=</span> m<span style="color:#ff7b72;font-weight:bold">.</span>addVar(vtype<span style="color:#ff7b72;font-weight:bold">=</span><span style="color:#a5d6ff">&#34;M&#34;</span>)
</span></span><span style="display:flex;"><span><span style="color:#ff7b72">for</span> i <span style="color:#ff7b72;font-weight:bold">in</span> range(size):
</span></span><span style="display:flex;"><span>    m<span style="color:#ff7b72;font-weight:bold">.</span>addCons(sum(matrix[i]) <span style="color:#ff7b72;font-weight:bold">==</span> sum_val)
</span></span><span style="display:flex;"><span>    m<span style="color:#ff7b72;font-weight:bold">.</span>addCons(sum(matrix[j][i] <span style="color:#ff7b72">for</span> j <span style="color:#ff7b72;font-weight:bold">in</span> range(size)) <span style="color:#ff7b72;font-weight:bold">==</span> sum_val)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>m<span style="color:#ff7b72;font-weight:bold">.</span>addCons(sum(matrix[i][i] <span style="color:#ff7b72">for</span> i <span style="color:#ff7b72;font-weight:bold">in</span> range(size)) <span style="color:#ff7b72;font-weight:bold">==</span> sum_val)
</span></span></code></pre></div><p>It&rsquo;s the third that messes things up. You can think of this as saying, for every possible pair of integer-valued variables $x$ and $y$:</p>
<p>$$ x \ge y + 1 \quad \text{or} \quad x \le y - 1 $$</p>
<p>Why is this hard? Because we can&rsquo;t add both constraints to the model. That would make it infeasible. Instead, we write them in such a way that exactly one will be active for any given solution. This requires, for each pair of variables, an additional binary variable $z$ and a (possibly big) constant $M$. Thus we reformulate the above as:</p>
<p>$$
\begin{gathered}
x \ge (y + 1) - M z \\
x \le (y - 1) + M (1 - z) \\
z \in \{0, 1\}
\end{gathered}
$$</p>
<p>In code this looks like:</p>
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#ff7b72">from</span> <span style="color:#ff7b72">itertools</span> <span style="color:#ff7b72">import</span> chain
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>all_vars <span style="color:#ff7b72;font-weight:bold">=</span> list(chain(<span style="color:#ff7b72;font-weight:bold">*</span>matrix))
</span></span><span style="display:flex;"><span><span style="color:#ff7b72">for</span> i, x <span style="color:#ff7b72;font-weight:bold">in</span> enumerate(all_vars):
</span></span><span style="display:flex;"><span>    <span style="color:#ff7b72">for</span> y <span style="color:#ff7b72;font-weight:bold">in</span> all_vars[i<span style="color:#ff7b72;font-weight:bold">+</span><span style="color:#a5d6ff">1</span>:]:
</span></span><span style="display:flex;"><span>        z <span style="color:#ff7b72;font-weight:bold">=</span> m<span style="color:#ff7b72;font-weight:bold">.</span>addVar(vtype<span style="color:#ff7b72;font-weight:bold">=</span><span style="color:#a5d6ff">&#34;B&#34;</span>)
</span></span><span style="display:flex;"><span>        m<span style="color:#ff7b72;font-weight:bold">.</span>addCons(x <span style="color:#ff7b72;font-weight:bold">&gt;=</span> y <span style="color:#ff7b72;font-weight:bold">+</span> <span style="color:#a5d6ff">1</span> <span style="color:#ff7b72;font-weight:bold">-</span> M<span style="color:#ff7b72;font-weight:bold">*</span>z)
</span></span><span style="display:flex;"><span>        m<span style="color:#ff7b72;font-weight:bold">.</span>addCons(x <span style="color:#ff7b72;font-weight:bold">&lt;=</span> y <span style="color:#ff7b72;font-weight:bold">-</span> <span style="color:#a5d6ff">1</span> <span style="color:#ff7b72;font-weight:bold">+</span> M<span style="color:#ff7b72;font-weight:bold">*</span>(<span style="color:#a5d6ff">1</span><span style="color:#ff7b72;font-weight:bold">-</span>z))
</span></span></code></pre></div><p>However, <a href="https://orinanobworld.blogspot.com/2011/07/perils-of-big-m.html">here be dragons</a>. We may not know how big (or small) to make $M$. Generally we want it as small as possible to make the LP relaxation of our integer programming model tighter. Different values of $M$ have unpredictable effects on solution time.</p>
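<p>For a concrete feel of how small $M$ can be, the following brute-force sketch checks the pair of constraints above over every combination of values, without a solver. It assumes both variables are integers between 1 and some upper bound. The helper names <code>feasible</code> and <code>encodes_not_equal</code> are hypothetical.</p>
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python">from itertools import product


def feasible(x, y, M):
    # True if some z in {0, 1} satisfies both big-M constraints.
    return any(
        x &gt;= y + 1 - M * z and x &lt;= y - 1 + M * (1 - z)
        for z in (0, 1)
    )


def encodes_not_equal(upper, M):
    # True if the constraints admit exactly the pairs with x != y,
    # for all integer x and y between 1 and upper.
    return all(
        feasible(x, y, M) == (x != y)
        for x, y in product(range(1, upper + 1), repeat=2)
    )


upper = 9  # assumed largest cell value, e.g. a 3x3 square with values 1..9
print(encodes_not_equal(upper, M=upper))      # True
print(encodes_not_equal(upper, M=upper - 1))  # False: x = 9, y = 1 is cut off
</code></pre></div>
<p>Under these assumptions, $M$ equal to the largest value a cell can take is enough, and anything smaller removes valid assignments.</p>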
<p>Which brings us to an interesting idea:</p>
<p>SCIP now supports bilinear constraints. This means that I can make $M$ a variable in the above model.</p>
<div class="highlight"><pre tabindex="0" style="color:#e6edf3;background-color:#0d1117;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#ff7b72">import</span> <span style="color:#ff7b72">sys</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#ff7b72">try</span>:
</span></span><span style="display:flex;"><span>    M <span style="color:#ff7b72;font-weight:bold">=</span> int(sys<span style="color:#ff7b72;font-weight:bold">.</span>argv[<span style="color:#a5d6ff">2</span>])
</span></span><span style="display:flex;"><span><span style="color:#ff7b72">except</span> <span style="color:#f0883e;font-weight:bold">IndexError</span>:
</span></span><span style="display:flex;"><span>    M <span style="color:#ff7b72;font-weight:bold">=</span> m<span style="color:#ff7b72;font-weight:bold">.</span>addVar(vtype<span style="color:#ff7b72;font-weight:bold">=</span><span style="color:#a5d6ff">&#34;M&#34;</span>, lb<span style="color:#ff7b72;font-weight:bold">=</span>size <span style="color:#ff7b72;font-weight:bold">*</span> size)
</span></span><span style="display:flex;"><span><span style="color:#ff7b72">else</span>:
</span></span><span style="display:flex;"><span>    <span style="color:#ff7b72">assert</span> M <span style="color:#ff7b72;font-weight:bold">&gt;=</span> size <span style="color:#ff7b72;font-weight:bold">*</span> size
</span></span></code></pre></div><p>The magic square model linked to in this post provides both options. The first command line argument it requires is the matrix size. The second one, $M$, is optional. If not given, it leaves $M$ up to the solver.</p>
<p>An interesting exercise for the reader: Change <a href="/files/2012-01-12-magic-squares-and-big-ms/magic-square.py">the code</a> to search for a <em>minimal</em> magic square, which minimizes either the value of $M$ or the sums of the columns, rows, and diagonal.</p>
]]></content:encoded>
    </item>
  </channel>
</rss>
