<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.9.2">Jekyll</generator><link href="https://diyago.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://diyago.github.io/" rel="alternate" type="text/html" /><updated>2022-05-21T07:15:14+00:00</updated><id>https://diyago.github.io/feed.xml</id><title type="html">Machine &amp;amp; Deep Learning Blog by Insaf Ashrapov</title><subtitle>Machine &amp; Deep Learning Blog by Insaf Ashrapov, Senior Data Scientist 
</subtitle><entry><title type="html">Hackathon: Who is better to spot generated image?</title><link href="https://diyago.github.io/2021/01/12/gan-hackaton.html" rel="alternate" type="text/html" title="Hackathon: Who is better to spot generated image?" /><published>2021-01-12T00:00:00+00:00</published><updated>2021-01-12T00:00:00+00:00</updated><id>https://diyago.github.io/2021/01/12/gan-hackaton</id><content type="html" xml:base="https://diyago.github.io/2021/01/12/gan-hackaton.html">&lt;p&gt;The online hackathon by Digital Leader was held from 19.10 to 27.11.2020. I have shown that a trained neural network distinguishes generated face images better than humans do.
As a result, I took 4th place and won swag prizes.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/gan_hack/poster.png&quot; alt=&quot;poster.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The work of other participants can be found here:
&lt;a href=&quot;https://hackathon.digitalleader.org/?fv-page=2#contest&quot;&gt;https://hackathon.digitalleader.org/?fv-page=2#contest&lt;/a&gt;&lt;/p&gt;</content><author><name></name></author><summary type="html">The online hackathon by Digital Leader was held from 19.10 to 27.11.2020. I have shown that a trained neural network distinguishes generated face images better than humans do. As a result, I took 4th place and won swag prizes.</summary></entry><entry><title type="html">Graph classification by computer vision</title><link href="https://diyago.github.io/2020/11/07/graphs-vs-cv.html" rel="alternate" type="text/html" title="Graph classification by computer vision" /><published>2020-11-07T00:00:00+00:00</published><updated>2020-11-07T00:00:00+00:00</updated><id>https://diyago.github.io/2020/11/07/graphs-vs-cv</id><content type="html" xml:base="https://diyago.github.io/2020/11/07/graphs-vs-cv.html">&lt;p&gt;Graph analysis is becoming more and more popular, but how does it perform compared to a computer vision approach? We will show that while computer vision models are much slower to train, they perform comparably to graph-based methods.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Github repo with all code, &lt;a href=&quot;https://github.com/Diyago/Graph-clasification-by-computer-vision&quot;&gt;link&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Originally posted &lt;a href=&quot;https://towardsdatascience.com/graph-classification-by-computer-vision-286572aaa750&quot;&gt;on Medium&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;graph-analysis&quot;&gt;Graph analysis&lt;/h2&gt;

&lt;p&gt;In general, graph theory represents pairwise relationships between objects. We won’t go into much detail here, but you may think of a graph as some kind of network, like the one below:
&lt;img src=&quot;/images/graphs/title.jpg&quot; alt=&quot;title.jpg&quot; /&gt;
&lt;em&gt;Network. Photo by Alina Grubnyak on Unsplash&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The main point to know here is that by connecting objects with edges we can visualize graphs, and then use classic computer vision models on the resulting images. Unfortunately, we may lose some of the initial information: for example, a graph may contain different types of objects and connections, which may be impossible to visualize in 2D.&lt;/p&gt;

&lt;h3 id=&quot;libraries&quot;&gt;Libraries&lt;/h3&gt;

&lt;p&gt;There are plenty of libraries to look at if you are willing to start working with graphs:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;networkx&lt;/strong&gt; — classical algorithms, visualizations&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;pytorch_geometric&lt;/strong&gt; — SOTA graph algorithms, a framework on top of PyTorch&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;graph-tool&lt;/strong&gt; — classical algorithms, visualizations&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;scikit-network&lt;/strong&gt; — classical algorithms, sklearn like API&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;TensorFlow Graphics&lt;/strong&gt; — SOTA graph algorithms, a framework on top of TensorFlow&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each of them is aimed at its own specific role, so which one to use depends on your task.&lt;/p&gt;
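&lt;p&gt;As a quick illustration (a toy sketch of my own, not code from the original project), here is how you might build and inspect a small graph with networkx, the first library above:&lt;/p&gt;

```python
# A toy sketch (not from the article): build a small graph with networkx
# and inspect it before rendering it for the computer vision pipeline.
import networkx as nx

# a tiny molecule-like graph: nodes are atoms, edges are bonds
G = nx.Graph()
G.add_edges_from([(0, 1), (1, 2), (2, 3), (3, 0), (1, 4)])

print(G.number_of_nodes())  # 5
print(G.number_of_edges())  # 5
# the degree distribution is part of what gets lost when rendering to 2D
print(sorted(dict(G.degree()).values()))  # [1, 2, 2, 2, 3]
```

&lt;p&gt;Once the graph is built, networkx drawing functions (as in the image-generation script later in this post) turn it into an image.&lt;/p&gt;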

&lt;h3 id=&quot;theory&quot;&gt;Theory&lt;/h3&gt;

&lt;p&gt;This article is aimed more at practical usage, so for the theory I will only leave some links:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Hands-on Graph Neural Networks with PyTorch &amp;amp; PyTorch Geometric&lt;/li&gt;
  &lt;li&gt;CS224W: Machine Learning with Graphs&lt;/li&gt;
  &lt;li&gt;Graph classification will be based on Graph Convolutional Networks (GCN), &lt;a href=&quot;https://arxiv.org/abs/1609.02907&quot;&gt;arxiv link&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;model-architecture&quot;&gt;Model architecture&lt;/h3&gt;

&lt;p&gt;We will be using the following architecture as a baseline:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;* GCNConv - 6 blocks
* JumpingKnowledge for aggregating convolutions
* global_add_pool with relu
* Final layer is softmax
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;SimpleGNN&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;torch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Module&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot;Original from http://pages.di.unipi.it/citraro/files/slides/Landolfi_tutorial.pdf&quot;&quot;&quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dataset&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hidden&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;layers&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;nb&quot;&gt;super&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SimpleGNN&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dataset&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dataset&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;convs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;torch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ModuleList&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;convs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GCNConv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;in_channels&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dataset&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;num_node_features&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                                  &lt;span class=&quot;n&quot;&gt;out_channels&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hidden&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;layers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
            &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;convs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;GCNConv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;in_channels&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hidden&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out_channels&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hidden&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;jk&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;JumpingKnowledge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mode&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;cat&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;jk_lin&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;torch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Linear&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;in_features&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hidden&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;layers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out_features&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hidden&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lin_1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;torch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Linear&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;in_features&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hidden&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out_features&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hidden&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lin_2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;torch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Linear&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;in_features&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hidden&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out_features&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dataset&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;num_classes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;forward&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Batch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from_data_list&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dataset&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;xs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;conv&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;convs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;F&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;relu&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;conv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;edge_index&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;edge_index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;xs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;jk&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;xs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;F&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;relu&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;jk_lin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;global_add_pool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;batch&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;batch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;F&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;relu&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lin_1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;F&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;softmax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lin_2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dim&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;The code is based on this &lt;a href=&quot;http://pages.di.unipi.it/citraro/files/slides/Landolfi_tutorial.pdf&quot;&gt;tutorial&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;computer-vision&quot;&gt;Computer vision&lt;/h3&gt;
&lt;p&gt;You will get all the required theory and technical skills by following the article “Guide how to learn and master computer vision in 2020”.
Besides, you should be familiar with the following topics:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;EfficientNet, &lt;a href=&quot;https://arxiv.org/abs/1905.11946&quot;&gt;arxiv link&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Focal Loss, &lt;a href=&quot;https://arxiv.org/abs/1708.02002&quot;&gt;arxiv link&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;albumentations — augmentation library&lt;/li&gt;
  &lt;li&gt;pytorch-lightning — pytorch framework&lt;/li&gt;
&lt;/ul&gt;
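&lt;p&gt;To make the Focal Loss idea concrete, here is a minimal sketch in plain Python using the paper’s default alpha and gamma; this is my own illustration, not the exact implementation used in the project:&lt;/p&gt;

```python
# A rough sketch of the binary Focal Loss (arXiv:1708.02002) in plain
# Python; an illustration, not the project's actual loss code.
import math

def focal_loss(p, target, alpha=0.25, gamma=2.0):
    """Focal loss for one prediction: down-weights easy, confident examples."""
    # probability assigned to the true class
    p_t = p if target == 1 else 1.0 - p
    # the modulating factor (1 - p_t) ** gamma shrinks the loss on easy examples
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)

easy = focal_loss(0.9, 1)  # confident and correct: tiny loss
hard = focal_loss(0.1, 1)  # confident and wrong: much larger loss
print(easy, hard)
```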

&lt;h3 id=&quot;model-architecture-1&quot;&gt;Model architecture&lt;/h3&gt;

&lt;p&gt;We will be using the following model without any hyper-parameter tuning:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;efficientnet_b2b as encoder&lt;/li&gt;
  &lt;li&gt;FocalLoss and average precision as early stopping criteria&lt;/li&gt;
  &lt;li&gt;TTA with left-right and up-down flips&lt;/li&gt;
  &lt;li&gt;Augmentation with albumentation&lt;/li&gt;
  &lt;li&gt;Pytorch-lightning as training model framework&lt;/li&gt;
  &lt;li&gt;4-fold ensembling&lt;/li&gt;
  &lt;li&gt;mixup&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The code is available at this &lt;a href=&quot;https://github.com/Diyago/Graph-clasification-by-computer-vision/blob/main/fit_predict_graph.py#L48&quot;&gt;link&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;experiment&quot;&gt;Experiment&lt;/h2&gt;

&lt;h3 id=&quot;data&quot;&gt;Data&lt;/h3&gt;

&lt;p&gt;We will predict the activity (against COVID?) of different molecules.
Dataset sample:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;smiles, activity
OC=1C=CC=CC1CNC2=NC=3C=CC=CC3N2, 1
CC(=O)NCCC1=CNC=2C=CC(F)=CC12, 1
O=C([C@@H]1[C@H](C2=CSC=C2)CCC1)N, 1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;To generate images for the computer vision approach, we first convert the graph to the networkx format and then get the desired images by calling the draw_kamada_kawai function:&lt;/p&gt;
&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;s&quot;&gt;&quot;&quot;&quot; Full code link 
https://github.com/Diyago/Graph-clasification-by-computer-vision/blob/main/generate_images.py&quot;&quot;&quot;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;__name__&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;__main__&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ohd&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;transforms&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;OneHotDegree&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;max_degree&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;covid&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;COVID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;root&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'./data/COVID/'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;transform&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ohd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;graph&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;torch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arange&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;covid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)).&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;long&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;G&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;utils&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to_networkx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;covid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;graph&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)])&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;draw_kamada_kawai&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;G&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;plt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;savefig&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;./train/id_{}_y_{}.jpg&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;graph&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
                                                    &lt;span class=&quot;n&quot;&gt;covid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;graph&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]),&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;jpg&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
                                            
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;img src=&quot;/images/graphs/exmps.png&quot; alt=&quot;salts.png&quot; /&gt;
&lt;em&gt;Different molecules visualization will be used for the computer vision approach. Image by Insaf Ashrapov&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Link to the &lt;a href=&quot;https://github.com/yangkevin2/coronavirus_data/raw/master/data/mpro_xchem.csv&quot;&gt;dataset&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;experiment-results&quot;&gt;Experiment results&lt;/h2&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;TEST metrics
### Computer vision
* ROC AUC 0.697
* MAP 0.183

### Graph method
* ROC AUC 0.702
* MAP 0.199
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;As you can see, the results are practically the same, with the graph method scoring slightly higher. Besides, it takes only 1 minute to train the GNN versus 30 minutes for the CNN.
I have to say that this is mostly a proof-of-concept project with many simplifications. In other words, you may visualize graphs and train well-known computer vision models instead of fancy new GNNs.&lt;/p&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;
&lt;ul&gt;
  &lt;li&gt;Github repo with all the code, &lt;a href=&quot;https://github.com/Diyago/Graph-clasification-by-computer-vision&quot;&gt;link&lt;/a&gt;, by Insaf Ashrapov&lt;/li&gt;
  &lt;li&gt;GNN tutorial, &lt;a href=&quot;http://pages.di.unipi.it/citraro/files/slides/Landolfi_tutorial.pdf&quot;&gt;link&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</content><author><name></name></author><summary type="html">Graph analysis is becoming more and more popular, but how does it perform compared to a computer vision approach? We will show that while computer vision models are much slower to train, they perform comparably to graph-based methods.</summary></entry><entry><title type="html">Talk: Automatic satellite building construction monitoring</title><link href="https://diyago.github.io/2020/09/20/datafest-satt.html" rel="alternate" type="text/html" title="Talk: Automatic satellite building construction monitoring" /><published>2020-09-20T00:00:00+00:00</published><updated>2020-09-20T00:00:00+00:00</updated><id>https://diyago.github.io/2020/09/20/datafest-satt</id><content type="html" xml:base="https://diyago.github.io/2020/09/20/datafest-satt.html">&lt;p&gt;&lt;em&gt;This weekend there was a big Data Science event, DataFest. I gave a talk on “Automatic satellite building construction monitoring” (in Russian).&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;At Data Fest Online 2020 I spoke about one of our research projects: automated satellite monitoring of residential building construction.&lt;/p&gt;

&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;https://www.youtube.com/embed/MOWbWHTgnng&quot; frameborder=&quot;0&quot; allow=&quot;autoplay; encrypted-media&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;</content><author><name></name></author><summary type="html">This weekend there was a big Data Science event, DataFest. I gave a talk on “Automatic satellite building construction monitoring” (in Russian).</summary></entry><entry><title type="html">GANs for tabular data</title><link href="https://diyago.github.io/2020/03/26/gans-tabular.html" rel="alternate" type="text/html" title="GANs for tabular data" /><published>2020-03-26T00:00:00+00:00</published><updated>2020-03-26T00:00:00+00:00</updated><id>https://diyago.github.io/2020/03/26/gans-tabular</id><content type="html" xml:base="https://diyago.github.io/2020/03/26/gans-tabular.html">&lt;p&gt;&lt;em&gt;GANs are well known for their success in realistic image generation. However, they can also be applied to tabular data generation. We will review and examine some recent papers about tabular GANs in action.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;h4 id=&quot;originally-posted-on-medium&quot;&gt;Originally posted &lt;a href=&quot;https://towardsdatascience.com/review-of-gans-for-tabular-data-a30a2199342&quot;&gt;on Medium&lt;/a&gt;.&lt;/h4&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;h4 id=&quot;github-repo&quot;&gt;&lt;a href=&quot;https://github.com/Diyago/GAN-for-tabular-data&quot;&gt;Github repo&lt;/a&gt;&lt;/h4&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;what-is-gan&quot;&gt;What is GAN&lt;/h2&gt;

&lt;p&gt;“GAN composes of two deep networks: the &lt;strong&gt;generator&lt;/strong&gt; and the &lt;strong&gt;discriminator”&lt;/strong&gt; [1]. Both of them are trained simultaneously. Generally, the model structure and training process are presented this way:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/tabular-gan/gan.jpeg&quot; alt=&quot;gan.jpeg&quot; /&gt;
&lt;em&gt;GAN training pipeline. By Jonathan Hui — What is Generative Adversarial Networks GAN? [1]&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The task of the generator is to generate samples which won’t be distinguished from real samples by the discriminator. I won’t give much detail here, but if you would like to dive deeper, you can read the medium post and the original paper by Ian J. Goodfellow.
Recent architectures such as StyleGAN 2 can produce outstanding photo-realistic images.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/tabular-gan/faces.png&quot; alt=&quot;png.jpeg&quot; /&gt;
&lt;em&gt;Hand-picked examples of human faces generated by StyleGAN 2, Source arXiv:1912.04958v2 [7]&lt;/em&gt;&lt;/p&gt;

&lt;h3 id=&quot;problems&quot;&gt;Problems&lt;/h3&gt;

&lt;p&gt;While face generation seems to be not a problem anymore, there are plenty of issues we need to resolve:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Training speed&lt;/strong&gt;. Training StyleGAN 2 takes about a week on a DGX-1 (8x NVIDIA Tesla V100).&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Image quality&lt;/strong&gt; in specific domains. Even state-of-the-art networks still fail on other tasks.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;img src=&quot;/images/tabular-gan/cars.png&quot; alt=&quot;cars.png&quot; /&gt;
&lt;em&gt;Hand-picked examples of cars and cats generated by StyleGAN 2, Source arXiv:1912.04958v2 [7]&lt;/em&gt;&lt;/p&gt;

&lt;h2 id=&quot;tabular-gans&quot;&gt;Tabular GANs&lt;/h2&gt;

&lt;p&gt;Even generating cats and dogs remains a heavy task for GANs because of non-trivial data distributions and high object-type variety. In such domains the image background also matters, and GANs often fail to generate it convincingly.
Therefore, I’ve been wondering what GANs can achieve on tabular data. Unfortunately, there aren’t many articles; the following two appear to be the most promising.&lt;/p&gt;

&lt;h3 id=&quot;tgan-synthesizing-tabular-data-using-generative-adversarial-networks-arxiv181111264v1-3&quot;&gt;TGAN: Synthesizing Tabular Data using Generative Adversarial Networks arXiv:1811.11264v1 [3]&lt;/h3&gt;

&lt;p&gt;First, they point out why generating tabular data poses its own challenges:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;mixed data types (int, decimal, categorical, time, text);&lt;/li&gt;
  &lt;li&gt;different shapes of distribution (multi-modal, long-tailed, non-Gaussian…);&lt;/li&gt;
  &lt;li&gt;sparse one-hot-encoded vectors and highly imbalanced categorical columns.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;task-formalizing&quot;&gt;Task formalizing&lt;/h4&gt;

&lt;p&gt;Let us say table T contains n_c continuous variables and n_d discrete (categorical) variables, and each row is a vector C. These variables have an unknown joint distribution P, and each row is sampled independently from P. The objective is to train a generative model M that generates a new synthetic table T_synth with a distribution similar to P. A machine learning model trained on T_synth should achieve accuracy on a real test table T_test similar to that of a model trained on T.&lt;/p&gt;
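&lt;p&gt;The “similar accuracy” criterion can be checked directly: train one model on real data and one on a synthetic stand-in, and compare their scores on the same held-out test table. A toy sketch with scikit-learn (the data generator and model here are assumptions for illustration, not the paper’s setup):&lt;/p&gt;

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def make_table(n, shift=0.0):
    # stand-in for rows drawn from the unknown joint distribution P;
    # T_synth would instead come from the trained generator M
    x = rng.normal(loc=shift, size=(n, 4))
    y = (np.sign(x[:, 0] + 0.5 * x[:, 1]) + 1.0) // 2
    return x, y.astype(int)

x_train, y_train = make_table(2000)        # T_train
x_synth, y_synth = make_table(2000, 0.05)  # imitates T_synth
x_test, y_test = make_table(1000)          # T_test

acc_real = accuracy_score(
    y_test, LogisticRegression().fit(x_train, y_train).predict(x_test))
acc_synth = accuracy_score(
    y_test, LogisticRegression().fit(x_synth, y_synth).predict(x_test))
# the closer acc_synth is to acc_real, the better M captured P
```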

&lt;h4 id=&quot;preprocessing-numerical-variables&quot;&gt;Preprocessing numerical variables&lt;/h4&gt;

&lt;p&gt;“Neural networks can effectively generate values with a distribution centered over (−1, 1) using tanh” [3]. However, they show that networks fail to generate suitable values for multi-modal data. Therefore, they cluster each numerical variable by training a Gaussian mixture model (GMM) with m (m = 5) components for each column C.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/tabular-gan/equa_1.png&quot; alt=&quot;equa_1.png&quot; /&gt;
&lt;em&gt;Normalizing using GMM using mean and standard deviation. Source arXiv:1811.11264v1 [3]&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Finally, the GMM is used to normalize C to obtain V. Besides, they compute the probability of C coming from each of the m Gaussian distributions as a vector U.&lt;/p&gt;
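&lt;p&gt;The clustering step can be sketched with scikit-learn (an illustration of the idea, not TGAN’s exact code; the paper additionally scales and clips the normalized value per its equation above):&lt;/p&gt;

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# a bimodal numerical column C, the case where plain tanh scaling fails
c = np.concatenate([rng.normal(-4.0, 1.0, 500),
                    rng.normal(3.0, 0.5, 500)]).reshape(-1, 1)

m = 5  # number of GMM components, as in the paper
gmm = GaussianMixture(n_components=m, random_state=0).fit(c)

# U: probability of each value coming from each of the m Gaussians
u = gmm.predict_proba(c)

# V: value normalized by the mean and std of its most likely component
k = u.argmax(axis=1)
v = (c[:, 0] - gmm.means_[k, 0]) / np.sqrt(gmm.covariances_[k, 0, 0])
```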

&lt;h4 id=&quot;preprocessing-categorical-variables&quot;&gt;Preprocessing categorical variables&lt;/h4&gt;

&lt;p&gt;Because the cardinality of categorical variables is usually low, they found the probability distribution can be generated directly using softmax. However, it is necessary to convert categorical variables to a one-hot-encoded representation and add noise to the binary variables.
After preprocessing, they convert T with n_c + n_d columns to the vectors V, U, and D. These vectors are the output of the generator and the input of the discriminator in the GAN. “GAN does not have access to GMM parameters” [3].&lt;/p&gt;

&lt;h4 id=&quot;generator&quot;&gt;Generator&lt;/h4&gt;

&lt;p&gt;They generate a numerical variable in two steps: first the value scalar V, then the cluster vector U, finally applying tanh. Categorical features are generated as a probability distribution over all possible labels with softmax. An LSTM with an attention mechanism is used to generate the desired row. The input to the LSTM at each step is the random variable z, the weighted context vector, the previous hidden state, and the embedding vector.&lt;/p&gt;

&lt;h4 id=&quot;discriminator&quot;&gt;Discriminator&lt;/h4&gt;

&lt;p&gt;A Multi-Layer Perceptron (MLP) with LeakyReLU and BatchNorm is used. The first layer takes the concatenated vectors (V, U, D) together with a mini-batch diversity feature vector from the LSTM. The loss function is the ordinary log loss with an added KL divergence term over the input variables.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/tabular-gan/disc.png&quot; alt=&quot;disc.png&quot; /&gt;
&lt;em&gt;Example of using TGAN to generate a simple census table. The generator generates T features one by one. The discriminator concatenates all features together, then uses a Multi-Layer Perceptron (MLP) with LeakyReLU to distinguish real and fake data. Source arXiv:1811.11264v1 [3]&lt;/em&gt;&lt;/p&gt;

&lt;h4 id=&quot;results&quot;&gt;Results&lt;/h4&gt;

&lt;p&gt;&lt;img src=&quot;/images/tabular-gan/results.png&quot; alt=&quot;results.png&quot; /&gt;
&lt;em&gt;Accuracy of machine learning models trained on the real and synthetic training set. (BN — Bayesian networks, Gaussian Copula). Source arXiv:1811.11264v1 [3]&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;They evaluate the model on two datasets, KDD99 and covertype. For some reason, they used weak models without boosting (xgboost, etc.). Anyway, TGAN performs reasonably well and robustly, outperforming Bayesian networks. The average performance gap between real and synthetic data is 5.7%.&lt;/p&gt;

&lt;h2 id=&quot;modeling-tabular-data-using-conditional-gan-ctgan-arxiv190700503v2-4&quot;&gt;Modeling Tabular Data using Conditional GAN (CTGAN) arXiv:1907.00503v2 [4]&lt;/h2&gt;

&lt;p&gt;The key improvements over the earlier TGAN are mode-specific normalization, which handles non-Gaussian and multimodal distributions, and a conditional generator with training-by-sampling, which deals with imbalanced discrete columns.&lt;/p&gt;

&lt;h4 id=&quot;task-formalizing-1&quot;&gt;Task formalizing&lt;/h4&gt;

&lt;p&gt;The initial data remain the same as in TGAN. However, they address different problems.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Likelihood of fitness&lt;/strong&gt;. Do columns in T_syn follow the same joint distribution as T_train?&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Machine learning efficacy&lt;/strong&gt;. When training a model to predict one column from the others, can a model learned from T_syn achieve performance on T_test similar to that of a model learned on T_train?&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;preprocessing&quot;&gt;Preprocessing&lt;/h4&gt;

&lt;p&gt;Preprocessing for discrete columns stays the same.
For continuous variables, a variational Gaussian mixture model (VGM) is used. It first estimates the number of modes m and then fits a Gaussian mixture. The initial vector C is then normalized almost as in TGAN, except that the value is normalized within each mode. The mode is represented as a one-hot vector beta ([0, 0, …, 1, 0]), and alpha is the normalized value of C.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/tabular-gan/exm.png&quot; alt=&quot;exm.png&quot; /&gt;
&lt;em&gt;An example of mode-specific normalization. Source arXiv:1907.00503v2 [4]&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;As a result, each initial row is represented as the concatenation of the one-hot-encoded discrete columns with the representation of the continuous variables discussed above:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/tabular-gan/equa_2.png&quot; alt=&quot;equa_2.png&quot; /&gt;
&lt;em&gt;Preprocessed row. Source arXiv:1907.00503v2 [4]&lt;/em&gt;&lt;/p&gt;

&lt;h4 id=&quot;training&quot;&gt;Training&lt;/h4&gt;

&lt;p&gt;“The final solution consists of three key elements, namely: the conditional vector, the generator loss, and the training-by-sampling method” [4].&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/tabular-gan/ctgan.png&quot; alt=&quot;ctgan.png&quot; /&gt;
&lt;em&gt;&lt;strong&gt;CTGAN&lt;/strong&gt; model. The conditional generator can generate synthetic rows conditioned on one of the discrete columns. With training-by-sampling, the cond and training data are sampled according to the log-frequency of each category, thus CTGAN can evenly explore all possible discrete values. Source arXiv:1907.00503v2 [4]&lt;/em&gt;&lt;/p&gt;

&lt;h4 id=&quot;conditional-vector&quot;&gt;Conditional vector&lt;/h4&gt;

&lt;p&gt;The conditional vector is the concatenation of the one-hot vectors of all discrete columns, with only the selected category set. “For instance, for two discrete columns, D1 = {1, 2, 3} and D2 = {1, 2}, the condition (D2 = 1) is expressed by the mask vectors m1 = [0, 0, 0] and m2 = [1, 0]; so cond = [0, 0, 0, 1, 0]” [4].&lt;/p&gt;
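&lt;p&gt;This construction is easy to reproduce in code (a small illustrative helper; the function name is mine, not from the paper):&lt;/p&gt;

```python
def cond_vector(column_sizes, selected_column, selected_category):
    # one mask vector m_i per discrete column; only the selected
    # column gets a 1, at the index of the selected category
    masks = []
    for i, size in enumerate(column_sizes):
        m = [0] * size
        if i == selected_column:
            m[selected_category] = 1
        masks.append(m)
    # cond is the concatenation of all mask vectors
    return [bit for m in masks for bit in m]

# the paper's example: D1 = {1, 2, 3}, D2 = {1, 2}, condition D2 = 1
cond = cond_vector([3, 2], selected_column=1, selected_category=0)
```

&lt;p&gt;Here the call reproduces cond = [0, 0, 0, 1, 0] from the quoted example: category 1 of D2 sits at index 0 of its one-hot vector.&lt;/p&gt;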

&lt;h4 id=&quot;generator-loss&quot;&gt;Generator loss&lt;/h4&gt;

&lt;p&gt;“During training, the conditional generator is free to produce any set of one-hot discrete vectors” [4]. To enforce that the generator produces d_i (the generated discrete one-hot column) equal to m_i (the mask vector), they penalize its loss by adding the cross-entropy between them, averaged over all instances of the batch.&lt;/p&gt;
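&lt;p&gt;A sketch of that penalty term in NumPy (illustrative only, not the authors’ implementation; the toy batch below is an assumption):&lt;/p&gt;

```python
import numpy as np

def cond_loss(d_gen, masks):
    # cross-entropy between generated one-hot columns d_i and masks m_i,
    # averaged over the batch; this term is added to the generator loss
    eps = 1e-8
    total = 0.0
    for d, m in zip(d_gen, masks):
        total += -np.sum(m * np.log(d + eps), axis=1)
    return float(np.mean(total))

# toy batch of 2 rows, one discrete column with 3 categories
d = [np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])]
m = [np.array([[1, 0, 0], [0, 1, 0]])]
loss = cond_loss(d, m)
```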

&lt;h4 id=&quot;training-by-sampling&quot;&gt;Training-by-sampling&lt;/h4&gt;

&lt;p&gt;“Specifically, the goal is to resample efficiently in a way that all the categories from discrete attributes are sampled evenly during the training process, as a result, to get real data distribution during the test” [4].
In other words, the output produced by the conditional generator must be assessed by the critic, which estimates the distance between the learned conditional distribution P_G(row|cond) and the conditional distribution on real data P(row|cond). “The sampling of real training data and the construction of cond vector should comply to help critics estimate the distance” [4]. Properly sampling the cond vector and the training data helps the model evenly explore all possible values in the discrete columns.
The model structure is given below. As opposed to TGAN, there is no LSTM layer; the model is trained with the WGAN loss with gradient penalty.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/tabular-gan/disc_2.png&quot; alt=&quot;disc_2.png&quot; /&gt;
&lt;em&gt;Generator. Source arXiv:1907.00503v2 [4]&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/tabular-gan/dd.png&quot; alt=&quot;dd.png&quot; /&gt;
&lt;em&gt;Discriminator. Source arXiv:1907.00503v2 [4]&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;They also propose a model based on a variational autoencoder (VAE), but it is out of the scope of this article.&lt;/p&gt;

&lt;h4 id=&quot;results-1&quot;&gt;Results&lt;/h4&gt;

&lt;p&gt;The proposed networks CTGAN and TVAE outperform other methods. As the authors note, TVAE outperforms CTGAN in several cases, but GANs have several favorable properties: the generator in a GAN does not have access to real data during the entire training process, unlike TVAE.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/tabular-gan/reults2.png&quot; alt=&quot;reults2.png&quot; /&gt;
&lt;em&gt;Benchmark results over three sets of experiments, namely Gaussian mixture simulated data (GM Sim.), Bayesian network simulated data (BN Sim.), and real data. They report the average of each metric. For real datasets (f1, etc). Source arXiv:1907.00503v2 [4]&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Besides, they published the source code on GitHub; with slight modifications it is used further in this article.&lt;/p&gt;

&lt;h2 id=&quot;applying-ctgan-to-generating-data-for-increasing-train-semi-supervised&quot;&gt;Applying CTGAN to generating data for increasing train (semi-supervised)&lt;/h2&gt;

&lt;p&gt;This is something I have long wanted to examine. After a brief familiarization with recent developments in GANs, I’ve been thinking about how to apply them to the problems I solve at work daily. So here is my idea.&lt;/p&gt;

&lt;h4 id=&quot;task-formalization&quot;&gt;Task formalization&lt;/h4&gt;

&lt;p&gt;Let’s say we have T_train and T_test (train and test sets respectively). We need to train a model on T_train and make predictions on T_test. However, we will enlarge the train set with new GAN-generated data that is somehow similar to T_test, without using its ground-truth labels.&lt;/p&gt;

&lt;h4 id=&quot;experiment-design&quot;&gt;Experiment design&lt;/h4&gt;

&lt;p&gt;Let’s say we have T_train and T_test (train and test sets respectively). T_train is smaller and might have a different data distribution. First, we train CTGAN on T_train with ground-truth labels (step 1), then generate additional data T_synth (step 2). Second, we train boosting in an adversarial way on the concatenation of T_train and T_synth (target set to 0) with T_test (target set to 1) (steps 3 &amp;amp; 4). The goal is to apply this adversarial boosting model to find rows that look more like T_test. Note that the original ground-truth labels aren’t used for adversarial training. As a result, we take the top rows from T_train and T_synth sorted by correspondence to T_test (steps 5 &amp;amp; 6). Finally, we train a new boosting model on them and check the results on T_test.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/tabular-gan/exp.png&quot; alt=&quot;exp.png&quot; /&gt;
&lt;em&gt;Experiment design and workflow&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Of course, for benchmark purposes we will also test ordinary training without these tricks, as well as the same pipeline but without CTGAN (in step 3 we won’t use T_synth).&lt;/p&gt;
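&lt;p&gt;Steps 3 to 6 boil down to adversarial filtering, which can be sketched like this (toy data; scikit-learn’s gradient boosting stands in for CatBoost, and all names and sizes are illustrative):&lt;/p&gt;

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)

# toy stand-ins: the pool T_train + T_synth vs. a slightly shifted T_test
train_pool = rng.normal(0.0, 1.0, size=(600, 5))  # T_train + T_synth
test = rng.normal(0.4, 1.0, size=(400, 5))        # T_test (shifted)

# steps 3-4: adversarial model separates the pool (0) from test (1)
x = np.vstack([train_pool, test])
y = np.array([0] * 600 + [1] * 400)
adv = GradientBoostingClassifier(random_state=0).fit(x, y)

# steps 5-6: keep the pool rows that look most like T_test
p_test_like = adv.predict_proba(train_pool)[:, 1]
top = np.argsort(p_test_like)[-300:]  # indices of the top-scoring rows
filtered_train = train_pool[top]
```

&lt;p&gt;A new model is then trained on filtered_train (with the original labels restored) and evaluated on T_test.&lt;/p&gt;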

&lt;h4 id=&quot;code&quot;&gt;Code&lt;/h4&gt;

&lt;p&gt;The experiment code and results are released as a GitHub repo here. The pipeline and data preparation are based on the Benchmarking Categorical Encoders article and its repo. We follow almost the same pipeline, but for speed only single validation and the CatBoost encoder were chosen. Due to the lack of GPU memory, some of the datasets were skipped.&lt;/p&gt;

&lt;h4 id=&quot;datasets&quot;&gt;Datasets&lt;/h4&gt;

&lt;p&gt;All datasets come from different domains. They have different numbers of observations and of categorical and numerical features. The aim of all datasets is binary classification. Preprocessing was simple: all time-based columns were removed; the remaining columns were either categorical or numerical. In addition, during training T_train was sampled at 5%, 10%, 25%, 50%, and 75%.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th style=&quot;text-align: left&quot;&gt;Name&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Total points&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Train points&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Test points&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Number of features&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Number of categorical features&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;Short description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;a href=&quot;https://www.kaggle.com/blastchar/telco-customer-churn&quot;&gt;Telecom&lt;/a&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;7.0k&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;4.2k&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;2.8k&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;20&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;16&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;Churn prediction for telecom data&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;a href=&quot;https://www.kaggle.com/wenruliu/adult-income-dataset&quot;&gt;Adult&lt;/a&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;48.8k&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;29.3k&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;19.5k&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;15&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;8&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;Predict if a person’s income is bigger than 50k&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;a href=&quot;https://www.kaggle.com/c/amazon-employee-access-challenge/data&quot;&gt;Employee&lt;/a&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;32.7k&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;19.6k&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;13.1k&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;10&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;9&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;Predict an employee’s access needs, given his/her job role&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;a href=&quot;https://www.kaggle.com/c/home-credit-default-risk/data&quot;&gt;Credit&lt;/a&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;307.5k&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;184.5k&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;123k&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;121&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;18&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;Loan repayment&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;a href=&quot;https://www.crowdanalytix.com/contests/propensity-to-fund-mortgages&quot;&gt;Mortgages&lt;/a&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;45.6k&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;27.4k&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;18.2k&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;20&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;9&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;Predict if a house mortgage is funded&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;a href=&quot;https://www.crowdanalytix.com/contests/mckinsey-big-data-hackathon&quot;&gt;Taxi&lt;/a&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;892.5k&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;535.5k&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;357k&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;8&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;5&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;Predict the probability of an offer being accepted by a certain driver&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;&lt;a href=&quot;https://www.drivendata.org/competitions/50/worldbank-poverty-prediction/page/99/&quot;&gt;Poverty_A&lt;/a&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;37.6k&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;22.5k&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;15.0k&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;41&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;38&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;Predict whether a given household in a given country is poor&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;em&gt;Datasets properties&lt;/em&gt;&lt;/p&gt;

&lt;h3 id=&quot;results-2&quot;&gt;Results&lt;/h3&gt;

&lt;p&gt;At first sight, in terms of both the metric and stability (std), GAN shows the worst results. However, by sampling the initial train set and then applying adversarial training, we obtain the best metric results and stability (sample_original). To determine the best sampling strategy, the ROC AUC scores of each dataset were scaled (min-max scale) and then averaged across datasets.&lt;/p&gt;
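&lt;p&gt;The scaling step looks like this (a sketch using three of the scores from Table 1.2; the full experiment averages over all datasets and sample fractions):&lt;/p&gt;

```python
import numpy as np

# ROC AUC scores: rows are datasets, columns are sampling strategies
strategies = ['None', 'gan', 'sample_original']
scores = np.array([
    [0.997, 0.998, 0.997],  # credit
    [0.986, 0.966, 0.972],  # employee
    [0.984, 0.964, 0.988],  # mortgages
])

# min-max scale within each dataset, then average per strategy
lo = scores.min(axis=1, keepdims=True)
hi = scores.max(axis=1, keepdims=True)
scaled = (scores - lo) / (hi - lo)
avg = scaled.mean(axis=0)
best = strategies[int(avg.argmax())]
```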

&lt;h2 id=&quot;results-3&quot;&gt;Results&lt;/h2&gt;

&lt;p&gt;To determine the best sampling strategy, I compared the top score of each dataset for each type of sampling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Table 1.2&lt;/strong&gt; Different sampling results across the dataset, higher is better (100% - maximum per dataset ROC AUC)&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th style=&quot;text-align: left&quot;&gt;dataset_name&lt;/th&gt;
      &lt;th style=&quot;text-align: right&quot;&gt;None&lt;/th&gt;
      &lt;th style=&quot;text-align: right&quot;&gt;gan&lt;/th&gt;
      &lt;th style=&quot;text-align: right&quot;&gt;sample_original&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;credit&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0.997&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;&lt;strong&gt;0.998&lt;/strong&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0.997&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;employee&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;&lt;strong&gt;0.986&lt;/strong&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0.966&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0.972&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;mortgages&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0.984&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0.964&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;&lt;strong&gt;0.988&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;poverty_A&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0.937&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;&lt;strong&gt;0.950&lt;/strong&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0.933&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;taxi&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0.966&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0.938&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;&lt;strong&gt;0.987&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;adult&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0.995&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0.967&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;&lt;strong&gt;0.998&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;telecom&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;&lt;strong&gt;0.995&lt;/strong&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0.868&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0.992&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;strong&gt;Table 1.3&lt;/strong&gt; Different sampling results, higher is better for a mean (ROC AUC), lower is better for std (100% - maximum per dataset ROC AUC)&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th style=&quot;text-align: left&quot;&gt;sample_type&lt;/th&gt;
      &lt;th style=&quot;text-align: right&quot;&gt;mean&lt;/th&gt;
      &lt;th style=&quot;text-align: right&quot;&gt;std&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;None&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0.980&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0.036&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;gan&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0.969&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0.06&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;sample_original&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;&lt;strong&gt;0.981&lt;/strong&gt;&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;&lt;strong&gt;0.032&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;We can see that GAN outperformed the other sampling types in 2 datasets, whereas sampling from the original outperformed the other methods in 3 of 7 datasets. Of course, there isn’t much difference, but these types of sampling might be an option.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Table 1.4&lt;/strong&gt; same_target_prop is equal to 1 when the target rates for train and test differ by no more than 5%. Higher is better.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th style=&quot;text-align: left&quot;&gt;sample_type&lt;/th&gt;
      &lt;th style=&quot;text-align: right&quot;&gt;same_target_prop&lt;/th&gt;
      &lt;th style=&quot;text-align: right&quot;&gt;prop_test_score&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;None&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0.964&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;None&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0.985&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;gan&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0.966&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;gan&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0.945&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;sample_original&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0.973&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td style=&quot;text-align: left&quot;&gt;sample_original&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;1&lt;/td&gt;
      &lt;td style=&quot;text-align: right&quot;&gt;0.984&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Let’s define same_target_prop as equal to 1 when the target rates for train and test differ by no more than 5%. When train and test have almost the same target rate, None and sample_original perform better. However, gan starts performing noticeably better when the target distribution changes.&lt;/p&gt;


&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;p&gt;[1] Jonathan Hui. GAN — What is Generative Adversarial Networks GAN? (2018), medium article
[2] Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio. Generative Adversarial Networks (2014). arXiv:1406.2661
[3] Lei Xu LIDS, Kalyan Veeramachaneni. Synthesizing Tabular Data using Generative Adversarial Networks (2018). arXiv:1811.11264v1 [cs.LG]
[4] Lei Xu, Maria Skoularidou, Alfredo Cuesta-Infante, Kalyan Veeramachaneni. Modeling Tabular Data using Conditional GAN (2019). arXiv:1907.00503v2 [cs.LG]
[5] Denis Vorotyntsev. Benchmarking Categorical Encoders (2019). Medium post
[6] Insaf Ashrapov. GAN-for-tabular-data (2020). Github repository.
[7] Tero Karras, Samuli Laine, Miika Aittala, Janne Hellsten, Jaakko Lehtinen, Timo Aila. Analyzing and Improving the Image Quality of StyleGAN (2019) arXiv:1912.04958v2 [cs.CV]&lt;/p&gt;</content><author><name></name></author><summary type="html">We well know GANs for success in the realistic image generation. However, they can be applied in tabular data generation. We will review and examine some recent papers about tabular GANs in action.</summary></entry><entry><title type="html">Guide how to learn and master computer vision in 2020</title><link href="https://diyago.github.io/2019/12/15/guide-cv.html" rel="alternate" type="text/html" title="Guide how to learn and master computer vision in 2020" /><published>2019-12-15T00:00:00+00:00</published><updated>2019-12-15T00:00:00+00:00</updated><id>https://diyago.github.io/2019/12/15/guide-cv</id><content type="html" xml:base="https://diyago.github.io/2019/12/15/guide-cv.html">&lt;p&gt;&lt;em&gt;This post will focus on resources, which I believe will boost your knowledge in computer vision the most and mainly based on my own experience.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;h4 id=&quot;original-medium-post&quot;&gt;Original &lt;a href=&quot;https://towardsdatascience.com/guide-to-learn-computer-vision-in-2020-36f19d92c934&quot;&gt;Medium post&lt;/a&gt;&lt;/h4&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Before starting to learn computer vision, it helps to get the basics of machine learning and Python.&lt;/p&gt;

&lt;h2 id=&quot;frameworks&quot;&gt;Frameworks&lt;/h2&gt;
&lt;p&gt;&lt;img src=&quot;/images/guide_cv/keras_vs_torch.png&quot; alt=&quot;keras_vs_torch.png&quot; /&gt;
&lt;em&gt;Star Wars: Luke Skywalker &amp;amp; Darth Vader&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;You don’t have to choose one from the beginning, but applying newly gained knowledge is necessary.
There are not many options: PyTorch or Keras (TensorFlow). PyTorch may require more code but gives much more flexibility in return, so use it. Besides, most researchers in deep learning have started to use PyTorch.
Albumentations (image augmentation) and Catalyst (a framework with a high-level API on top of PyTorch) might be useful as well; use them, especially the first one.&lt;/p&gt;

&lt;h2 id=&quot;hardware&quot;&gt;Hardware&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Nvidia GPU 10xx+ will be more than enough ($300+)&lt;/li&gt;
  &lt;li&gt;Kaggle kernels — only 30 hours/week (free)&lt;/li&gt;
  &lt;li&gt;Google Colab — 12-hour session limit, unknown weekly limits (free)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;theory--practise&quot;&gt;Theory &amp;amp; Practise&lt;/h2&gt;

&lt;h3 id=&quot;online-courses&quot;&gt;Online courses&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CS231n&lt;/code&gt; is the top online course, covering all the necessary fundamentals of computer vision, with lecture videos on YouTube. They even have exercises, but I can’t advise solving them. (free)&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Fast.ai&lt;/code&gt; is the next course you should watch. Fast.ai is also a high-level framework on top of PyTorch, but they change their API too frequently, and the lack of documentation makes it unreliable to use. However, the theory and useful tricks make the course well worth the time. (free)
While taking these courses, I encourage you to put the theory into practice by applying it in one of the frameworks.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;articles-and-code&quot;&gt;Articles and code&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;ArXiv.org — all recent papers appear here. (free)&lt;/li&gt;
  &lt;li&gt;https://paperswithcode.com/sota — the state of the art in most common deep learning tasks, not only computer vision. (free)&lt;/li&gt;
  &lt;li&gt;Github — if something has been implemented, you will find it here. (free)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;books&quot;&gt;Books&lt;/h3&gt;

&lt;p&gt;There is not much to read, but these two books will be useful no matter whether you choose PyTorch or Keras:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Deep Learning with Python by Keras creator and Google AI researcher François Chollet. Easy to follow, and you may get insights you didn’t have before. (not free)&lt;/li&gt;
  &lt;li&gt;Deep Learning with PyTorch by the PyTorch team’s Eli Stevens &amp;amp; Luca Antiga (free)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;kaggle&quot;&gt;Kaggle&lt;/h3&gt;

&lt;p&gt;Competitions — Kaggle is a well-known online platform for a wide variety of machine learning competitions, many of them about computer vision. You can start participating even before finishing the courses, because from the beginning of a competition there will be many open kernels (end-to-end code) which you can run directly in the browser. (free)&lt;/p&gt;

&lt;h2 id=&quot;tough-jedi-way&quot;&gt;Tough (jedi) way&lt;/h2&gt;

&lt;p&gt;&lt;img src=&quot;/images/guide_cv/jedi.png&quot; alt=&quot;jedi.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Star Wars’ Jedi: Yoda&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;An alternative path, from Sergei Belousov aka bes, is tough, but you will gain the knowledge needed not only to do fit-predict but to perform your own research.
You just need to read and implement all the articles below (free). Just reading them will also be great.&lt;/p&gt;

&lt;h2 id=&quot;architectures&quot;&gt;Architectures&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;AlexNet: &lt;a href=&quot;https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks&quot;&gt;https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;ZFNet: &lt;a href=&quot;https://arxiv.org/abs/1311.2901&quot;&gt;https://arxiv.org/abs/1311.2901&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;VGG16: &lt;a href=&quot;https://arxiv.org/abs/1409.1556&quot;&gt;https://arxiv.org/abs/1409.1556&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;ResNet: &lt;a href=&quot;https://arxiv.org/abs/1512.03385&quot;&gt;https://arxiv.org/abs/1512.03385&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;GoogLeNet: &lt;a href=&quot;https://arxiv.org/abs/1409.4842&quot;&gt;https://arxiv.org/abs/1409.4842&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Inception: &lt;a href=&quot;https://arxiv.org/abs/1512.00567&quot;&gt;https://arxiv.org/abs/1512.00567&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Xception: &lt;a href=&quot;https://arxiv.org/abs/1610.02357&quot;&gt;https://arxiv.org/abs/1610.02357&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;MobileNet: &lt;a href=&quot;https://arxiv.org/abs/1704.04861&quot;&gt;https://arxiv.org/abs/1704.04861&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;semantic-segmentation&quot;&gt;Semantic Segmentation&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;FCN: &lt;a href=&quot;https://arxiv.org/abs/1411.4038&quot;&gt;https://arxiv.org/abs/1411.4038&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;SegNet: &lt;a href=&quot;https://arxiv.org/abs/1511.00561&quot;&gt;https://arxiv.org/abs/1511.00561&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;UNet: &lt;a href=&quot;https://arxiv.org/abs/1505.04597&quot;&gt;https://arxiv.org/abs/1505.04597&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;PSPNet: &lt;a href=&quot;https://arxiv.org/abs/1612.01105&quot;&gt;https://arxiv.org/abs/1612.01105&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;DeepLab: &lt;a href=&quot;https://arxiv.org/abs/1606.00915&quot;&gt;https://arxiv.org/abs/1606.00915&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;ICNet: &lt;a href=&quot;https://arxiv.org/abs/1704.08545&quot;&gt;https://arxiv.org/abs/1704.08545&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;ENet: &lt;a href=&quot;https://arxiv.org/abs/1606.02147&quot;&gt;https://arxiv.org/abs/1606.02147&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;generative-adversarial-networks&quot;&gt;Generative adversarial networks&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;GAN: &lt;a href=&quot;https://arxiv.org/abs/1406.2661&quot;&gt;https://arxiv.org/abs/1406.2661&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;DCGAN: &lt;a href=&quot;https://arxiv.org/abs/1511.06434&quot;&gt;https://arxiv.org/abs/1511.06434&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;WGAN: &lt;a href=&quot;https://arxiv.org/abs/1701.07875&quot;&gt;https://arxiv.org/abs/1701.07875&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Pix2Pix: &lt;a href=&quot;https://arxiv.org/abs/1611.07004&quot;&gt;https://arxiv.org/abs/1611.07004&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;CycleGAN: &lt;a href=&quot;https://arxiv.org/abs/1703.10593&quot;&gt;https://arxiv.org/abs/1703.10593&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;object-detection&quot;&gt;Object detection&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;RCNN: &lt;a href=&quot;https://arxiv.org/abs/1311.2524&quot;&gt;https://arxiv.org/abs/1311.2524&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Fast-RCNN: &lt;a href=&quot;https://arxiv.org/abs/1504.08083&quot;&gt;https://arxiv.org/abs/1504.08083&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Faster-RCNN: &lt;a href=&quot;https://arxiv.org/abs/1506.01497&quot;&gt;https://arxiv.org/abs/1506.01497&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;SSD: &lt;a href=&quot;https://arxiv.org/abs/1512.02325&quot;&gt;https://arxiv.org/abs/1512.02325&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;YOLO: &lt;a href=&quot;https://arxiv.org/abs/1506.02640&quot;&gt;https://arxiv.org/abs/1506.02640&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;YOLO9000: &lt;a href=&quot;https://arxiv.org/abs/1612.08242&quot;&gt;https://arxiv.org/abs/1612.08242&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;instance-segmentation&quot;&gt;Instance Segmentation&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Mask-RCNN: &lt;a href=&quot;https://arxiv.org/abs/1703.06870&quot;&gt;https://arxiv.org/abs/1703.06870&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;YOLACT: &lt;a href=&quot;https://arxiv.org/abs/1904.02689&quot;&gt;https://arxiv.org/abs/1904.02689&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;pose-estimation&quot;&gt;Pose estimation&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;PoseNet: &lt;a href=&quot;https://arxiv.org/abs/1505.07427&quot;&gt;https://arxiv.org/abs/1505.07427&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;DensePose: &lt;a href=&quot;https://arxiv.org/abs/1802.00434&quot;&gt;https://arxiv.org/abs/1802.00434&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</content><author><name></name></author><summary type="html">This post will focus on resources, which I believe will boost your knowledge in computer vision the most and mainly based on my own experience.</summary></entry><entry><title type="html">Severstal Steel Defect Detection Challenge on Kaggle</title><link href="https://diyago.github.io/2019/11/20/kaggle-severstal.html" rel="alternate" type="text/html" title="Severstal Steel Defect Detection Challenge on Kaggle" /><published>2019-11-20T00:00:00+00:00</published><updated>2019-11-20T00:00:00+00:00</updated><id>https://diyago.github.io/2019/11/20/kaggle-severstal</id><content type="html" xml:base="https://diyago.github.io/2019/11/20/kaggle-severstal.html">&lt;p&gt;&lt;em&gt;&lt;strong&gt;Top 2% (31/2431) solution write-up&lt;/strong&gt;. Steel is one of the most important building materials of modern times. Steel buildings are resistant to natural and man-made wear which has made the material ubiquitous around the world. To help make production of steel more efficient, this competition will help identify defects.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;h3 id=&quot;31-place-solution-on-github-&quot;&gt;31st place &lt;a href=&quot;https://github.com/Diyago/Severstal-Steel-Defect-Detection&quot;&gt;solution on Github&lt;/a&gt;&lt;/h3&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Can you detect and classify defects in steel? Segmentation in Pytorch
https://www.kaggle.com/c/severstal-steel-defect-detection/overview&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/kaggle-severstal/input_data.png&quot; alt=&quot;input_data.png&quot; /&gt;
&lt;em&gt;Input data&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Team - [ods.ai] stainless&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Insaf Ashrapov&lt;/li&gt;
  &lt;li&gt;Igor Krashenyi&lt;/li&gt;
  &lt;li&gt;Pavel Pleskov&lt;/li&gt;
  &lt;li&gt;Anton Zakharenkov&lt;/li&gt;
  &lt;li&gt;Nikolai Popov&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Models&lt;/strong&gt;
We tried almost every type of model from qubvel’s segmentation models library (Unet, FPN, PSPNet) with different encoders from resnet to senet152. FPN with se-resnext50 outperformed the other models. Lighter models like resnet34 didn’t perform well enough on their own but were useful in the final blend. Se-resnext101 could possibly perform much better with more training time, but we didn’t test that.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Augmentations and Preprocessing&lt;/strong&gt;
From the &lt;strong&gt;Albumentations&lt;/strong&gt; library:
HFlip, VFlip, RandomBrightnessContrast – training speed was not too fast, so these basic augmentations performed well enough. In addition, we used big crops for training and/or finetuning on the full image size, because attention blocks in image tasks rely on the same input size at training and inference time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Training&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;We used both pure pytorch and Catalyst framework for training.&lt;/li&gt;
  &lt;li&gt;Losses: BCE and BCE with dice performed quite well, but Lovasz loss dramatically outperformed them in terms of validation and public score. However, combined with the classification model, BCE with dice gave a better result; that could be because Lovasz helped the model filter out false-positive masks. Focal loss performed quite poorly due to the imperfect labeling.&lt;/li&gt;
  &lt;li&gt;Optimizer: Adam and RAdam. LookAhead and Over9000 didn’t work well for us.&lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Crops with a mask, BalanceClassSampler with upsampler mode from catalyst significantly increased training speed.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;We tried our own classification model (resnet34 with CBAM), setting the goal to improve F1 for each class. The optimal threshold was disappointingly unstable, but we reached an averaged F1 of 95.1+. As a result, Cheng’s classification was used.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Validation: kfold with 10 folds. Despite the shake-up, local, public and private scores correlated surprisingly well.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Pseudolabeling: we did two rounds of pseudo labeling by training on the best public submit and validating on the out-of-fold predictions. It didn’t work the third time but gave us a huge improvement.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Postprocessing: filling holes, removing small masks below a threshold. We tried removing small objects via connected components, with no improvement.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;Hardware: a bunch of Nvidia cards&lt;/li&gt;
&lt;/ul&gt;
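For reference, the BCE-plus-dice combination mentioned above can be sketched in plain numpy (an illustration of the loss itself; in training it is computed on GPU tensors):

```python
import numpy as np

def bce_dice_loss(probs, target, dice_weight=1.0, eps=1e-7):
    """Combined binary cross-entropy + soft-dice loss.

    probs  : predicted mask probabilities in [0, 1]
    target : ground-truth binary mask
    """
    probs = probs.clip(eps, 1 - eps)
    bce = -np.mean(target * np.log(probs) + (1 - target) * np.log(1 - probs))
    intersection = (probs * target).sum()
    dice = (2 * intersection + eps) / (probs.sum() + target.sum() + eps)
    return bce + dice_weight * (1 - dice)  # dice term penalises poor overlap
```

The dice term rewards overlap between prediction and mask, which is why adding it to BCE helps on segmentation tasks where the positive class is rare.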

&lt;p&gt;&lt;strong&gt;Ensembling&lt;/strong&gt;
Simple averaging of segmentation models with different encoders, both FPN and Unet, applied to images classified as having a mask. One of the unchosen submits would have given us 16th place.&lt;/p&gt;</content><author><name></name></author><summary type="html">Top 2% (31/2431) solution write-up. Steel is one of the most important building materials of modern times. Steel buildings are resistant to natural and man-made wear which has made the material ubiquitous around the world. To help make production of steel more efficient, this competition will help identify defects.</summary></entry><entry><title type="html">Talk: Banking models interpretation</title><link href="https://diyago.github.io/2019/11/09/banking-inter.html" rel="alternate" type="text/html" title="Talk: Banking models interpretation" /><published>2019-11-09T00:00:00+00:00</published><updated>2019-11-09T00:00:00+00:00</updated><id>https://diyago.github.io/2019/11/09/banking-inter</id><content type="html" xml:base="https://diyago.github.io/2019/11/09/banking-inter.html">&lt;p&gt;&lt;em&gt;This talk was given at the AI Journey Conference in Moscow, a conference featuring leading international and Russian experts in AI and data analysis and top companies developing and applying AI in business&lt;/em&gt;&lt;/p&gt;

&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;https://www.youtube.com/embed/hnr4pkxUMpk&quot; frameborder=&quot;0&quot; allow=&quot;autoplay; encrypted-media&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;</content><author><name></name></author><summary type="html">Talk was given at AI Journey Conference in Moscow. Conference with leading international and Russian experts in AI and data analysis, top companies in the development and application of AI in business</summary></entry><entry><title type="html">Kaggle APTOS 2019 Blindness Detection Challenge</title><link href="https://diyago.github.io/2019/10/04/kaggle-blindness.html" rel="alternate" type="text/html" title="Kaggle APTOS 2019 Blindness Detection Challenge" /><published>2019-10-04T00:00:00+00:00</published><updated>2019-10-04T00:00:00+00:00</updated><id>https://diyago.github.io/2019/10/04/kaggle-blindness</id><content type="html" xml:base="https://diyago.github.io/2019/10/04/kaggle-blindness.html">&lt;p&gt;&lt;em&gt;&lt;strong&gt;Top 3% (76/2943) solution write-up&lt;/strong&gt; for the &lt;a href=&quot;https://www.kaggle.com/c/aptos2019-blindness-detection&quot;&gt;Kaggle APTOS 2019 Blindness Detection&lt;/a&gt;. Imagine being able to detect blindness before it happened. Millions of people suffer from diabetic retinopathy, the leading cause of blindness among working aged adults&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This repository consists of code and configs that were used to train our best single model. The solution is powered by awesome &lt;a href=&quot;https://github.com/catalyst-team/catalyst&quot;&gt;Catalyst&lt;/a&gt; library.&lt;/p&gt;

&lt;h3 id=&quot;data&quot;&gt;Data&lt;/h3&gt;

&lt;p&gt;&lt;img src=&quot;/images/kaggle_blindness/input.png&quot; alt=&quot;input.png&quot; /&gt;
&lt;em&gt;Input Data&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;2015 competition data was used for pretraining all our models; without it our models performed much worse. We used different techniques: first training on the old data, then finetuning on the new train set; alternatively, training on both datasets, then finetuning on the new train data. Besides, starting the finetuning by freezing all layers and training only the last FC layer gave us more stable results.&lt;/p&gt;

&lt;h3 id=&quot;models-and-preprocessing&quot;&gt;Models and Preprocessing&lt;/h3&gt;

&lt;p&gt;From the beginning, EfficientNet outperformed other models. Using fp16 (available in Kaggle kernels) allowed a bigger batch size, which sped up training and inference.&lt;/p&gt;

&lt;p&gt;Models used in the final submission:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;EfficientNet-B5 (best single model): 224x224 (tta with Hflip, preprocessing - crop_from_gray, circle_crop, ben_preprocess=10)&lt;/li&gt;
  &lt;li&gt;EfficientNet-B4: 256x256 (tta with Hflip, preprocessing - crop_from_gray, circle_crop, ben_preprocess=20)&lt;/li&gt;
  &lt;li&gt;EfficientNet-B5: 256x256 (tta with Hflip, preprocessing - crop_from_gray, circle_crop, ben_preprocess=30)&lt;/li&gt;
  &lt;li&gt;EfficientNet-B5: (256x256) without specific preprocess, two models with different augmentations.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We tried bigger image sizes, but they gave worse results. EfficientNet-B2 and EfficientNet-B6 gave worse results as well.&lt;/p&gt;

&lt;h3 id=&quot;augmentations&quot;&gt;Augmentations&lt;/h3&gt;
&lt;p&gt;From the &lt;a href=&quot;https://github.com/albu/albumentations&quot;&gt;Albumentations&lt;/a&gt; library:
Hflip, VFlip, RandomScale, CenterCrop, RandomBrightnessContrast, ShiftScaleRotate, RandomGamma, JpegCompression, HueSaturationValue, RGBShift, ChannelShuffle, ToGray, Cutout&lt;/p&gt;

&lt;h3 id=&quot;training&quot;&gt;Training&lt;/h3&gt;
&lt;p&gt;First 3 models were trained using &lt;a href=&quot;https://github.com/catalyst-team/catalyst&quot;&gt;Catalyst&lt;/a&gt; library and the last one with FastAi, both of them work on top of Pytorch.&lt;/p&gt;

&lt;p&gt;We used both ordinal regression and plain regression. Models trained for classification weren’t good enough to use.
Adam with OneCycle was used for training. WarmUp helped to get more stable results. RAdam and label smoothing didn’t help to improve the score.&lt;/p&gt;

&lt;p&gt;We tried to use the leak investigated &lt;a href=&quot;https://www.kaggle.com/miklgr500/leakage-detection-about-8-test-dataset&quot;&gt;here&lt;/a&gt; and &lt;a href=&quot;https://www.kaggle.com/konradb/adversarial-validation-quick-fast-ai-approach&quot;&gt;here&lt;/a&gt; by overriding the output results. Almost 10% of the public test data were part of the train set. Results dropped significantly, which means the train data annotation was pretty bad.&lt;/p&gt;

&lt;p&gt;We tried kappa coefficient optimization; it didn’t give a reliable improvement on public, but it could have helped us on private by almost +0.003.&lt;/p&gt;
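For context, kappa optimization on regression outputs usually means tuning the cut-points that map continuous predictions to the 0..4 grades. The sketch below is a reconstruction; the threshold values and search strategy are assumptions, not our exact code:

```python
import numpy as np

def to_grades(preds, thresholds=(0.5, 1.5, 2.5, 3.5)):
    """Map continuous regression outputs to ordinal grades 0..4."""
    return np.digitize(preds, thresholds)

def quadratic_weighted_kappa(rater_a, rater_b, n_classes=5):
    """Plain-numpy QWK; sklearn's cohen_kappa_score(weights='quadratic')
    computes the same metric."""
    conf = np.zeros((n_classes, n_classes))
    for i, j in zip(rater_a, rater_b):
        conf[i, j] += 1
    idx = np.arange(n_classes)
    w = (idx[:, None] - idx[None, :]) ** 2 / (n_classes - 1) ** 2
    expected = np.outer(conf.sum(axis=1), conf.sum(axis=0)) / conf.sum()
    return 1 - (w * conf).sum() / (w * expected).sum()

# The thresholds themselves are then tuned (e.g. with scipy.optimize.minimize
# or a coordinate-wise grid search) to maximise QWK on validation folds.
```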

&lt;h3 id=&quot;hardware&quot;&gt;Hardware&lt;/h3&gt;
&lt;p&gt;We used 1x 2080, 1x Tesla V40, and 1x 1070ti. The final submission was an ensemble of the models listed above.&lt;/p&gt;

&lt;h3 id=&quot;team&quot;&gt;Team&lt;/h3&gt;
&lt;p&gt;&lt;a href=&quot;https://www.kaggle.com/mamatml&quot;&gt;Mamat Shamshiev&lt;/a&gt;, &lt;a href=&quot;https://www.kaggle.com/insaff&quot;&gt;Insaf Ashrapov&lt;/a&gt;, &lt;a href=&quot;https://www.kaggle.com/mnikita&quot;&gt;Mishunyayev Nikita&lt;/a&gt;&lt;/p&gt;</content><author><name></name></author><summary type="html">Top 3% (76/2943) solution write-up for the Kaggle APTOS 2019 Blindness Detection. Imagine being able to detect blindness before it happened. Millions of people suffer from diabetic retinopathy, the leading cause of blindness among working aged adults</summary></entry><entry><title type="html">Road detection using segmentation models and albumentations libraries on Keras</title><link href="https://diyago.github.io/2019/08/25/road-detection.html" rel="alternate" type="text/html" title="Road detection using segmentation models and albumentations libraries on Keras" /><published>2019-08-25T00:00:00+00:00</published><updated>2019-08-25T00:00:00+00:00</updated><id>https://diyago.github.io/2019/08/25/road-detection</id><content type="html" xml:base="https://diyago.github.io/2019/08/25/road-detection.html">&lt;p&gt;&lt;em&gt;In this article, I will show how to write own data generator and how to use albumentations as augmentation library. Along with segmentation_models library, which provides dozens of pretrained heads to Unet and other unet-like architectures. For the full code go to Github. Link to dataset.&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;h4 id=&quot;original-medium-post&quot;&gt;Original &lt;a href=&quot;https://towardsdatascience.com/road-detection-using-segmentation-models-and-albumentations-libraries-on-keras-d5434eaf73a8&quot;&gt;Medium post&lt;/a&gt;&lt;/h4&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;theory&quot;&gt;Theory&lt;/h2&gt;

&lt;p&gt;The task of semantic image segmentation is to label each pixel of an image with a corresponding class of what is being represented. For such a task, the Unet architecture, with a variety of improvements, has shown the best results. The core idea behind it is just a few convolution blocks, which extract deep image features of different types, followed by so-called deconvolution or upsampling blocks, which restore the initial shape of the input image. Besides, after the convolution layers we have skip-connections, which help the network remember the initial image and fight vanishing gradients. For more detailed information you can read the arxiv article or another article.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/segmentation_road/unet.png&quot; alt=&quot;unet.png&quot; /&gt;
&lt;em&gt;Vanilla U-Net https://arxiv.org/abs/1505.04597&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;We came for practice, so let’s get to it.&lt;/p&gt;

&lt;h2 id=&quot;datasetsatellite-images&quot;&gt;Dataset—satellite images&lt;/h2&gt;

&lt;p&gt;For segmentation we don’t need much data to start getting a decent result; even 100 annotated photos will be enough. For now, we will be using the Massachusetts Roads Dataset from https://www.cs.toronto.edu/~vmnih/data/; there are about 1100+ annotated train images, and they even provide validation and test datasets. Unfortunately, there is no download button, so we have to use a script. This script will get the job done (it might take some time to complete).
Let’s take a look at some image examples:
&lt;img src=&quot;/images/segmentation_road/input_data.png&quot; alt=&quot;input_data.png&quot; /&gt;
&lt;em&gt;Massachusetts Roads Dataset image and ground truth mask ex.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Annotation and image quality seem to be pretty good, so the network should be able to detect roads.&lt;/p&gt;

&lt;h2 id=&quot;libraries-installation&quot;&gt;Libraries installation&lt;/h2&gt;

&lt;p&gt;First of all, you need Keras with TensorFlow installed. For Unet construction, we will be using Pavel Yakubovskiy’s library called segmentation_models, and for data augmentation the albumentations library. I will write about them in more detail later. Both libraries get updated pretty frequently, so I prefer to install them directly from git.&lt;/p&gt;

&lt;div class=&quot;language-console highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;go&quot;&gt;conda install -c conda-forge keras
pip install git+https://github.com/qubvel/efficientnet
pip install git+https://github.com/qubvel/classification_models.git
pip install git+https://github.com/qubvel/segmentation_models
pip install git+https://github.com/albu/albumentations
pip install tta-wrapper
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;defining-data-generator&quot;&gt;Defining data generator&lt;/h2&gt;

&lt;p&gt;As a data generator, we will be using our custom generator. It should inherit from keras.utils.Sequence and define the following methods:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__init__&lt;/code&gt; (class initializing)&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__len__&lt;/code&gt; (return the length of the dataset)&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;on_epoch_end&lt;/code&gt; (behavior at the end of epochs)&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__getitem__&lt;/code&gt; (generate a batch for feeding into the network)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One main advantage of using a custom generator is that you can work with data in any format you have and do whatever you want; just don’t forget to generate the desired output (batch) for Keras.&lt;/p&gt;

&lt;p&gt;Here we define the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__init__&lt;/code&gt; method. The main part of it is setting the paths for images (self.image_filenames) and mask names (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;self.mask_names&lt;/code&gt;). Don’t forget to sort them, because the mask corresponding to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;self.image_filenames[i]&lt;/code&gt; should be &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;self.mask_names[i]&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;root_dir&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'../data/val_test'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;image_folder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'img/'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mask_folder&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'masks/'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; 
             &lt;span class=&quot;n&quot;&gt;batch_size&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;image_size&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;768&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nb_y_features&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; 
             &lt;span class=&quot;n&quot;&gt;augmentation&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
             &lt;span class=&quot;n&quot;&gt;suffle&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;image_filenames&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;listdir_fullpath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;root_dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;image_folder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mask_names&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;listdir_fullpath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;root_dir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mask_folder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;batch_size&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;batch_size&lt;/span&gt;
    &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;augmentation&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;augmentation&lt;/span&gt;
    &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;image_size&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;image_size&lt;/span&gt;
    &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nb_y_features&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nb_y_features&lt;/span&gt;
    &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;suffle&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;suffle&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;listdir_fullpath&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sort&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;os&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;listdir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The next important thing is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__getitem__&lt;/code&gt;. Usually, we cannot store all images in RAM, so every time we generate a new batch of data we should read the corresponding images. Below we define the method for training. For that, we create an empty numpy array (np.empty), which will store the images and masks. Then we read the images with the read_image_mask method and apply augmentation to each image-mask pair. Eventually, we return a batch (X, y), which is ready to be fed into the network.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;__getitem__&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;data_index_min&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;batch_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;data_index_max&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;min&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;batch_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;image_filenames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)))&lt;/span&gt;

      &lt;span class=&quot;n&quot;&gt;indexes&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;image_filenames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data_index_min&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;data_index_max&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;this_batch_size&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;indexes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;# The last batch can be smaller than the others
&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;X&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;empty&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;this_batch_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;image_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;image_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dtype&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;float32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;empty&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;this_batch_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;image_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;image_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nb_y_features&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dtype&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;np&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;uint8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

      &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sample_index&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;enumerate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;indexes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;

          &lt;span class=&quot;n&quot;&gt;X_sample&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y_sample&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read_image_mask&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;image_filenames&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;index&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;batch_size&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; 
                                                  &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mask_names&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;index&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;batch_size&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;

          &lt;span class=&quot;c1&quot;&gt;# if augmentation is defined, we assume its a train set
&lt;/span&gt;          &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;augmentation&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;is&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;not&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;

              &lt;span class=&quot;c1&quot;&gt;# Augmentation code
&lt;/span&gt;              &lt;span class=&quot;n&quot;&gt;augmented&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;augmentation&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;image_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;image&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;X_sample&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mask&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y_sample&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
              &lt;span class=&quot;n&quot;&gt;image_augm&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;augmented&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'image'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
              &lt;span class=&quot;n&quot;&gt;mask_augm&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;augmented&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'mask'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reshape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;image_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;image_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;nb_y_features&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
              &lt;span class=&quot;c1&quot;&gt;# divide by 255 to normalize images from 0 to 1
&lt;/span&gt;              &lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;...]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;image_augm&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;255&lt;/span&gt;
              &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;...]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mask_augm&lt;/span&gt;
          &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
              &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;test_generator&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataGeneratorFolder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;root_dir&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'./data/road_segmentation_ideal/training'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; 
                           &lt;span class=&quot;n&quot;&gt;image_folder&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'input/'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; 
                           &lt;span class=&quot;n&quot;&gt;mask_folder&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'output/'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; 
                           &lt;span class=&quot;n&quot;&gt;nb_y_features&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;train_generator&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataGeneratorFolder&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;root_dir&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'./data/road_segmentation_ideal/training'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; 
                                      &lt;span class=&quot;n&quot;&gt;image_folder&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'input/'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; 
                                      &lt;span class=&quot;n&quot;&gt;mask_folder&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'output/'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; 
                                      &lt;span class=&quot;n&quot;&gt;batch_size&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                                      &lt;span class=&quot;n&quot;&gt;image_size&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;512&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                                      &lt;span class=&quot;n&quot;&gt;nb_y_features&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;augmentation&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;aug_with_crop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
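&lt;p&gt;One detail these generators rely on: a Keras Sequence also needs a __len__ method reporting the number of batches per epoch, which is just a ceiling division (a minimal sketch with hypothetical names, mirroring the min() clipping in __getitem__ above):&lt;/p&gt;

```python
import math

def num_batches(n_images, batch_size):
    # Batches per epoch; the last batch may be smaller,
    # which is why __getitem__ clips data_index_max with min(...)
    return math.ceil(n_images / batch_size)

print(num_batches(10, 4))  # 3 batches: sizes 4, 4 and 2
```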

&lt;h2 id=&quot;data-augmentation--albumentations&quot;&gt;Data augmentation — albumentations&lt;/h2&gt;

&lt;p&gt;Data augmentation is a strategy that lets you significantly increase the diversity of data available for training models without actually collecting new data. It helps to prevent over-fitting and makes the model more robust.
There are plenty of libraries for this task: imgaug, Augmentor, solt, the built-in methods of Keras/PyTorch, or you can write custom augmentations with the OpenCV library. However, I highly recommend the albumentations library. It’s super fast and convenient to use. For usage examples, go to the official repository or take a look at the example notebooks.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/segmentation_road/segmentation_output.png&quot; alt=&quot;segmentation_output.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;In our task, we will be using basic augmentations such as flips and contrast changes, along with non-trivial ones such as ElasticTransform. Examples of them are shown in the image above.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;aug_with_crop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;image_size&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;256&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;crop_prob&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Compose&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;RandomCrop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;width&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;image_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;height&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;image_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;crop_prob&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;HorizontalFlip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;VerticalFlip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;RandomRotate90&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;Transpose&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ShiftScaleRotate&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;shift_limit&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.01&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;scale_limit&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.04&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rotate_limit&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.25&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;RandomBrightnessContrast&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;RandomGamma&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.25&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;IAAEmboss&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.25&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;Blur&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.01&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;blur_limit&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;OneOf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;ElasticTransform&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;alpha&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;120&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sigma&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;120&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.05&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;alpha_affine&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;120&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.03&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;GridDistortion&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;OpticalDistortion&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;distort_limit&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;shift_limit&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;After defining the desired augmentation, you can easily get your augmented output like this:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;augmented&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;aug_with_crop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;image_size&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1024&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;image&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mask&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mask&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;image_aug&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;augmented&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'image'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;mask_aug&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;augmented&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'mask'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;callbacks&quot;&gt;Callbacks&lt;/h2&gt;

&lt;p&gt;We will be using common callbacks:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;ModelCheckpoint — saves the weights of the model during training&lt;/li&gt;
  &lt;li&gt;ReduceLROnPlateau — reduces the learning rate if the validation metric stops improving&lt;/li&gt;
  &lt;li&gt;EarlyStopping — stops training once the validation metric has stopped improving for several epochs&lt;/li&gt;
  &lt;li&gt;TensorBoard — a great way to monitor training progress&lt;/li&gt;
&lt;/ul&gt;
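&lt;p&gt;To make the plateau behaviour concrete, here is a simplified pure-Python model of the reduce-on-plateau idea (an illustration of the logic only, not Keras’s actual ReduceLROnPlateau implementation, which additionally supports a cooldown period):&lt;/p&gt;

```python
def reduce_on_plateau(metric_history, lr=1e-3, factor=0.1, patience=10, min_lr=1e-6):
    # Scale lr down by `factor` each time the monitored metric fails
    # to improve for `patience` consecutive epochs (floored at min_lr).
    best = float('-inf')
    wait = 0
    for value in metric_history:
        if value > best:
            best, wait = value, 0
        else:
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)
                wait = 0
    return lr

# one improvement, then 12 stagnant epochs: lr drops once, from 1e-3 to about 1e-4
history = [0.5] + [0.4] * 12
final_lr = reduce_on_plateau(history, lr=1e-3, patience=10)
```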

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;keras.callbacks&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ModelCheckpoint&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ReduceLROnPlateau&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;EarlyStopping&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TensorBoard&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# reduces learning rate on plateau
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lr_reducer&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ReduceLROnPlateau&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;factor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                               &lt;span class=&quot;n&quot;&gt;cooldown&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                               &lt;span class=&quot;n&quot;&gt;patience&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;verbose&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                               &lt;span class=&quot;n&quot;&gt;min_lr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.1e-5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;# model autosave callbacks
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mode_autosave&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ModelCheckpoint&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;./weights/road_crop.efficientnetb0imgsize.h5&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; 
                                &lt;span class=&quot;n&quot;&gt;monitor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'val_iou_score'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; 
                                &lt;span class=&quot;n&quot;&gt;mode&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'max'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;save_best_only&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;verbose&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;period&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;# stop training when the validation metric stops improving
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;early_stopping&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;EarlyStopping&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;patience&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;verbose&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mode&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;'auto'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; 

&lt;span class=&quot;c1&quot;&gt;# tensorboard for monitoring logs
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tensorboard&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TensorBoard&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;log_dir&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;'./logs/tenboard'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;histogram_freq&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                          &lt;span class=&quot;n&quot;&gt;write_graph&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;write_images&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;False&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;callbacks&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mode_autosave&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lr_reducer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tensorboard&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;early_stopping&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;training&quot;&gt;Training&lt;/h2&gt;

&lt;p&gt;As the model, we will be using Unet. The easiest way to use it is to take it from the segmentation_models library.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;backbone_name: the name of the classification model to use as the encoder. EfficientNet is currently state-of-the-art among classification models, so let us try it. While it should give faster inference and has fewer trainable parameters, it consumes more GPU memory than the well-known ResNet models. There are many other options to try&lt;/li&gt;
  &lt;li&gt;encoder_weights — using ImageNet weights speeds up training&lt;/li&gt;
  &lt;li&gt;encoder_freeze: if True, sets all layers of the encoder (backbone model) as non-trainable. It can be useful to freeze the encoder at first, train the model, and then unfreeze it&lt;/li&gt;
  &lt;li&gt;decoder_filters — you can specify the number of filters in the decoder blocks. In some cases, a heavier encoder with a simplified decoder might be useful.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After initializing the Unet model, you should compile it. We also set IOU (intersection over union) as the metric to monitor and bce_jaccard_loss (binary cross-entropy plus Jaccard loss) as the loss to optimize. I gave links above, so I won’t go into further detail on them here.
&lt;img src=&quot;/images/segmentation_road/tens_logs.png&quot; alt=&quot;tens_logs.png&quot; /&gt;
&lt;em&gt;Tensorboard logs&lt;/em&gt;&lt;/p&gt;
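&lt;p&gt;Since the post links out for the details, a minimal NumPy sketch of the two ingredients may still help (an illustrative definition, not the segmentation_models implementation; the IOU here is the soft/continuous variant):&lt;/p&gt;

```python
import numpy as np

def bce(y_true, y_pred, eps=1e-7):
    # binary cross-entropy averaged over pixels
    p = np.clip(y_pred, eps, 1 - eps)
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

def jaccard_loss(y_true, y_pred, eps=1e-7):
    # 1 minus the soft IOU (intersection over union)
    inter = np.sum(y_true * y_pred)
    union = np.sum(y_true) + np.sum(y_pred) - inter
    return float(1 - (inter + eps) / (union + eps))

def bce_jaccard(y_true, y_pred):
    # the combined objective: binary cross-entropy plus Jaccard loss
    return bce(y_true, y_pred) + jaccard_loss(y_true, y_pred)
```

A perfect prediction drives the Jaccard term to zero, while the cross-entropy term keeps gradients well-behaved for individual pixels.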

&lt;p&gt;After starting training, you can watch the TensorBoard logs. As we can see, the model trains pretty well; even after 50 epochs we haven’t reached a global/local optimum.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/segmentation_road/metrics.png&quot; alt=&quot;metrics.png&quot; /&gt;
&lt;em&gt;Loss and IOU metric history&lt;/em&gt;&lt;/p&gt;

&lt;h3 id=&quot;inference&quot;&gt;Inference&lt;/h3&gt;

&lt;p&gt;So we have 0.558 IOU on validation, but every pixel with a predicted probability higher than 0 is counted as mask. By picking an appropriate threshold, we can further improve the result by 0.039 (7%).
&lt;img src=&quot;/images/segmentation_road/inference_code.png&quot; alt=&quot;inference_code.png&quot; /&gt;
&lt;em&gt;Validation threshold adjusting&lt;/em&gt;&lt;/p&gt;
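&lt;p&gt;The threshold sweep itself is a one-liner once you have predictions; a sketch with synthetic stand-ins for the validation masks and predicted probabilities (all names and data here are hypothetical):&lt;/p&gt;

```python
import numpy as np

def iou(y_true, y_pred, eps=1e-7):
    inter = np.sum(y_true * y_pred)
    union = np.sum(y_true) + np.sum(y_pred) - inter
    return (inter + eps) / (union + eps)

# Hypothetical validation set: 4 ground-truth masks and noisy probabilities
# that are higher on true-mask pixels.
rng = np.random.default_rng(0)
y_true = (rng.random((4, 64, 64)) > 0.5).astype(float)
y_prob = y_true * 0.6 + rng.random((4, 64, 64)) * 0.4

# Sweep thresholds and keep the one with the best validation IOU.
thresholds = np.arange(0.1, 0.9, 0.05)
scores = [iou(y_true, (y_prob > t).astype(float)) for t in thresholds]
best_t = thresholds[int(np.argmax(scores))]
```

&lt;p&gt;On real predictions, the same loop over the validation set is what produces the gain reported above.&lt;/p&gt;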

&lt;p&gt;&lt;img src=&quot;/images/segmentation_road/finish.png&quot; alt=&quot;finish.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Metrics are certainly interesting, but the model's predictions are much more insightful. From the images we see that our network has picked up the task quite well, which is great. For the inference code and the metric calculation, you can read the full code.&lt;/p&gt;

&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;@phdthesis{MnihThesis,
    author = {Volodymyr Mnih},
    title = {Machine Learning for Aerial Image Labeling},
    school = {University of Toronto},
    year = {2013}
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;</content><author><name></name></author><summary type="html">In this article, I will show how to write own data generator and how to use albumentations as augmentation library. Along with segmentation_models library, which provides dozens of pretrained heads to Unet and other unet-like architectures. For the full code go to Github. Link to dataset.</summary></entry><entry><title type="html">Poster: Automatic salt deposits segmentation: A deep learning approach</title><link href="https://diyago.github.io/2019/06/20/salt-poster.html" rel="alternate" type="text/html" title="Poster: Automatic salt deposits segmentation: A deep learning approach" /><published>2019-06-20T00:00:00+00:00</published><updated>2019-06-20T00:00:00+00:00</updated><id>https://diyago.github.io/2019/06/20/salt-poster</id><content type="html" xml:base="https://diyago.github.io/2019/06/20/salt-poster.html">&lt;p&gt;&lt;em&gt;Being honored to present a poster about image segmentation at the last international summit, Machines Can See 2019 , Moscow, Russia #deeplearning #cv #poster&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/kaggle-salt/poster.png&quot; alt=&quot;poster.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;-&amp;gt; &lt;strong&gt;10 th&lt;/strong&gt;
&lt;img src=&quot;/images/kaggle-salt/plan.png&quot; alt=&quot;plan.png&quot; /&gt;&lt;/p&gt;</content><author><name></name></author><summary type="html">Being honored to present a poster about image segmentation at the last international summit, Machines Can See 2019 , Moscow, Russia #deeplearning #cv #poster</summary></entry></feed>