<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Johncomonitski on Medium]]></title>
        <description><![CDATA[Stories by Johncomonitski on Medium]]></description>
        <link>https://medium.com/@johncomonitski?source=rss-e8cb1575753c------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/1*njdiYA2djz3EmPpUwmYZAQ.jpeg</url>
            <title>Stories by Johncomonitski on Medium</title>
            <link>https://medium.com/@johncomonitski?source=rss-e8cb1575753c------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Wed, 15 Apr 2026 19:05:26 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@johncomonitski/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[Building a Simple EPV Model to Evaluate Player Decision Making]]></title>
            <link>https://medium.com/@johncomonitski/building-a-simple-epv-model-to-evaluate-player-decision-making-9fe3b73d1b46?source=rss-e8cb1575753c------2</link>
            <guid isPermaLink="false">https://medium.com/p/9fe3b73d1b46</guid>
            <category><![CDATA[soccer-analytics]]></category>
            <category><![CDATA[football-data-analysis]]></category>
            <category><![CDATA[premier-league]]></category>
            <category><![CDATA[soccer]]></category>
            <category><![CDATA[data-analytics]]></category>
            <dc:creator><![CDATA[Johncomonitski]]></dc:creator>
            <pubDate>Wed, 01 Apr 2026 11:45:35 GMT</pubDate>
            <atom:updated>2026-04-01T11:45:35.821Z</atom:updated>
            <content:encoded><![CDATA[<p><em>Building an Expected Pass Value (EPV) model using the Football Match Analysis Library in only 12 lines of code!</em></p><p>In this article, we will build an EPV model to evaluate player decision-making when passing. With the combination of Tracking Data, Event Data, Expected Threat (xT), and Passing Probability, we will be able to grade a player’s decision-making when completing passes. The goal is to examine a pass and answer the question: <em>Was the right pass made?</em></p><p>We are going to use this moment from a match. After receiving the ball, Player 5 drove from the edge of the pitch to the center. Once there, he had multiple passing options. Ultimately, he decided to pass the ball to Player 8, but was this the right choice? Using EPV, we can evaluate a player’s decision-making and decide whether the right pass was made.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/800/1*PCOCs1_IfsOqSf0RB4zgHg.gif" /><figcaption>Player 5 passing to Player 8</figcaption></figure><p><strong>Step 1: Calculate the xT Generated by a Pass</strong></p><p>To calculate the xT generated by a pass, we start by calculating the xT of the ball’s current location. This will be as simple as getting the moment the pass was played, grabbing the start and end points of the pass, running them through our xT model and getting the difference.</p><pre># Get Moment<br>moment = match.get_moment(27942)<br>pass_event = moment.event<br><br># Get Start and End of Pass<br>start = (pass_event[&quot;Start X&quot;], pass_event[&quot;Start Y&quot;])<br>end = (pass_event[&quot;End X&quot;], pass_event[&quot;End Y&quot;])<br><br># Get xT and xT Generated From The Pass<br>start_xt = get_xt(start)<br># &gt;&gt;&gt; 0.0092<br>end_xt = get_xt(end)<br># &gt;&gt;&gt; 0.0128<br>xt_gained = end_xt - start_xt<br># &gt;&gt;&gt; 0.0036</pre><p>In this case, we start the pass with an xT of 0.0092. Then we look at the ball’s final location and see that having the ball at that location yields an xT of 0.0128. 
Subtracting the xT before the pass from the xT after it gives the xT generated by that pass, which was 0.0036.</p><p><strong>Step 2: Calculating the Probability of a Successful Pass</strong></p><p>To calculate the probability of a pass succeeding, there are several factors at play, such as the current position, trajectory, speed and distance from the target of the ball and of every player on the pitch. This is a lot to consider, but thankfully, the FMA Library leverages the heavy lifting done by <a href="https://www.youtube.com/@friendsoftracking755">Friends of Tracking</a>. Built on top of the <a href="https://github.com/Friends-of-Tracking-Data-FoTD/LaurieOnTracking">Friends of Tracking’s Metrica Analysis library</a>, the FMA Library can estimate the probability of a pass’s success with just one line of code.</p><pre># Probability of a Completed Pass<br>p = moment.pass_probability(end)<br># &gt;&gt;&gt; 0.9830343653386462</pre><p>In this example, a pass from Player 5 to Player 8 has a ~98.3% chance of success.</p><p><strong>Step 3: We Must Also Consider Failure</strong></p><p>When valuing a pass, it’s important to consider what we gain from a successful pass, but it’s just as important to consider what value the opponent may gain from intercepting the pass. This is our risk of attempting the pass.</p><pre># Probability of a Failed Pass<br>p_failure = 1 - p<br># &gt;&gt;&gt; 0.01696563466<br><br># xT of Opponent, If They Intercept the Pass<br>opp_end_xt = get_xt(end, invert=True)<br># &gt;&gt;&gt; 0.0064<br><br># Change in xT If Pass is Intercepted<br>xt_lossed = (-1 * opp_end_xt) - start_xt<br># &gt;&gt;&gt; -0.0156</pre><p>The pass has a ~1.7% chance of failure and if the opponent receives the ball there, their xT will be 0.0064 (-0.0064 for us). 
Considering our current xT is 0.0092, that intercepted pass represents a -0.0156 change in xT.</p><p><strong>Step 4: Calculating Our Final EPV</strong></p><p>Putting everything together, our final EPV formula is as follows:</p><p><em>EPV = ( P(Success) * △ xT ) + ( (1 - P(Success)) * Opp △ xT )</em></p><p>The EPV of a pass will be defined as the probability of a pass succeeding times the <em>△ </em>xT of the pass. We then add the probability of the pass failing times the opponent’s <em>△</em> xT, which is the (negative) change in xT we suffer if the opponent intercepts the ball at that location.</p><p><em>EPV = ( 0.983 * 0.0036 ) + ( 0.017 * -0.0156 )</em></p><pre># Final EPV<br>epv = (p * xt_gained) + (p_failure * xt_lossed)<br># &gt;&gt;&gt; 0.003274259814502008</pre><p>In this example, the EPV of the pass made was 0.00327, meaning that a pass from Player 5 to Player 8 increased our chances of scoring by ~0.327%. It’s nothing incredible, but this now provides us with a framework to evaluate decision-making.</p><p><strong>Step 5: Identifying Passing Options</strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/950/1*MiL86eZemHktXidX8tqc4w.png" /><figcaption>The exact moment Player 5 plays a pass to Player 8.</figcaption></figure><p>Let’s assume Player 5 is playing a pass at this exact moment. There is no time to turn around, so the next pass must be played within the peripheral vision of Player 5. Thankfully, as indicated by the arrows, we have Player 5’s velocity, so we know which way they’re likely facing. 
If we assume a player is in front of him when they are within 90 degrees of his velocity, we can, with a little geometry, identify Players 6, 8, 9 and 10 as possible passing targets.</p><pre># Collect Teammates Within Player 5's Field of View<br>home_team = moment.home_team()<br>player_5 = moment.possession()<br>in_sight = []<br>for player in home_team:<br>    if player_5.in_peripheral_vision(player.x, player.y):<br>        in_sight.append(player)</pre><p><strong>Step 6: Finding the Optimal Pass</strong></p><p>A player can receive the ball in multiple locations. We can play the ball squarely to them, we can play it behind them, or we can play it in front of them to run onto. To find where we should play the ball, we are going to look at each player’s velocity arrow and imagine it like a radius. With that radius, we’ll make a circle around the player and scan each point in that circle. We will, however, exclude points that are not directly in front of the player (points not within 45 degrees of the player’s velocity). After calculating all eligible points’ EPV, the point with the highest EPV will be the optimal passing location for that player.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/950/1*97DF-y-lXPlGUFtcDXNEjQ.png" /><figcaption>We will scan the circles around the player and calculate the EPV of points within 45 degrees of their velocity to find the optimal passing location</figcaption></figure><pre># Calculate the EPV of a pass based on the code we previously wrote<br>def calculate_epv(moment, target):<br>    # Get Start and End of Pass<br>    ball = moment.ball<br>    start = (ball.x, ball.y)<br><br>    # Get xT and xT Generated From The Pass<br>    start_xt = get_xt(start)<br>    end_xt = get_xt(target)<br>    xt_gained = end_xt - start_xt<br><br>    # Probability of a Completed Pass<br>    p = moment.pass_probability(target)<br>    # Probability of a Failed Pass<br>    p_failure = 1 - p<br><br>    # xT of Opponent, If They Intercept the Pass<br>    opp_end_xt = get_xt(target, invert=True)<br>    # Change in xT If Pass is Intercepted<br>    xt_lossed = (-1 * opp_end_xt) - start_xt<br><br>    # Final EPV<br>    epv = (p * xt_gained) + (p_failure * xt_lossed)<br>    return epv<br><br>passes = {}<br>for player in in_sight:<br>    epv, best_pass = None, None<br><br>    # Scan Radius Around Player<br>    for target in player.coords_in_radius(radius=player.speed):<br>        # Ensure Ball is Played In Front of Player<br>        if player.in_direct_view(target[0], target[1]):<br>            # Calculate EPV<br>            pot_epv = calculate_epv(moment, target)<br><br>            # Find Best EPV<br>            if epv is None or pot_epv &gt; epv:<br>                epv = pot_epv<br>                best_pass = target<br><br>    # Store Results<br>    passes[f&quot;Player{player.name}&quot;] = { &quot;EPV&quot; : epv, &quot;Pass&quot; : best_pass}</pre>
<p><strong>Step 7: Reviewing the Results</strong></p><p>After running our script, we can now compare the optimal pass to each player. The pass with the highest EPV is our best passing option:</p><p>EPV of a Pass to Player 8: 0.00327 (0.327%)<br>EPV of a Pass to Player 6: 0.00095 (0.095%)<br>EPV of a Pass to Player 9: -0.00021 (-0.021%)<br>EPV of a Pass to Player 10: 0.00225 (0.225%)</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*FXjT1W-YVLvgnN0I5u_HYg.png" /><figcaption>Evaluating the best possible passing options for Player 5 at this given moment.</figcaption></figure><p>As it turns out, the optimal pass was played. As previously mentioned, a pass from Player 5 to Player 8 increased our chances of scoring by ~0.327%, which edges out a pass to Player 10 (our second-best option) by ~0.102%.</p><p><strong>Why Does This Matter?</strong></p><p>I will admit that a difference in the likelihood of scoring between ~0.225% and ~0.327% feels like splitting hairs. 
It’s a marginal difference and this individual moment and decision will likely mean very little in the grand scheme of a match. That’s why you might want to think about EPV in terms of the whole match.</p><p>This EPV model provides a means to evaluate player decision-making and can be functionalized down to just 12 lines of code. We can then, quite easily, run that function over every pass a player plays during a match and aggregate the results to score a player’s decision-making over the entire match. Why stop there, though? We can run this decision-making aggregation over every pass played by every player and identify who the best passing decision maker is on the pitch.</p><p>From a scouting perspective, this is hugely important. In the past, you could say “this player sees passes others don’t”, but it’s hard to prove. You might have one or two great passing clips from their last match, but how do you know you’re not falling into recency bias? That’s the problem we aim to solve. With EPV and this model for player evaluation, you can point to a hard number that says this player is quantitatively the best passing decision maker on the pitch!</p><p><strong>Run The Code Yourself!</strong></p><p><a href="https://github.com/JohnComonitski/FMATutorials/blob/main/advanced_models/3%20epv.ipynb">FMATutorials/advanced_models/3 epv.ipynb at main · JohnComonitski/FMATutorials</a></p><p><em>If you’re interested in playing with EPV calculations and the function I wrote in this article, check out the FMA Library and the tutorial notebook I wrote based on this article.</em></p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Generating Footballing Event Data From Match Tracking Data]]></title>
            <link>https://medium.com/@johncomonitski/generating-footballing-event-data-from-match-tracking-data-1730c58ad598?source=rss-e8cb1575753c------2</link>
            <guid isPermaLink="false">https://medium.com/p/1730c58ad598</guid>
            <category><![CDATA[premier-league]]></category>
            <category><![CDATA[football-analytics]]></category>
            <category><![CDATA[soccer]]></category>
            <category><![CDATA[soccer-analytics]]></category>
            <category><![CDATA[machine-learning]]></category>
            <dc:creator><![CDATA[Johncomonitski]]></dc:creator>
            <pubDate>Sat, 14 Mar 2026 14:55:45 GMT</pubDate>
            <atom:updated>2026-03-14T14:55:45.512Z</atom:updated>
            <content:encoded><![CDATA[<p><em>Automatically generating football event data from match tracking data with an eye towards a future where tracking data from any football match will be ubiquitous.</em></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Ma5aPN8wTi4_pRKhj2548Q.png" /><figcaption>Event data generated from tracking data.</figcaption></figure><p><strong>Tracking Data Is Set to Explode</strong></p><p>Last summer, I published Soccer Vision, a prototype of a football tracking data generator. The aim was to take television match footage and use computer vision to generate tracking data.</p><iframe src="https://cdn.embedly.com/widgets/media.html?src=https%3A%2F%2Fwww.youtube.com%2Fembed%2FQPDWg-aExpo%3Ffeature%3Doembed&amp;display_name=YouTube&amp;url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DQPDWg-aExpo&amp;image=https%3A%2F%2Fi.ytimg.com%2Fvi%2FQPDWg-aExpo%2Fhqdefault.jpg&amp;type=text%2Fhtml&amp;schema=youtube" width="854" height="480" frameborder="0" scrolling="no"><a href="https://medium.com/media/61855e87ef0a3c7a870a5e982b66bc40/href">https://medium.com/media/61855e87ef0a3c7a870a5e982b66bc40/href</a></iframe><p>This project was motivated in part by delusions of grandeur. I believed (and still do) that this technology is the first step to any team being able to have tracking and event data from their matches. This word gets thrown around a lot in tech, but I think we’re on the cusp of a genuine democratization of tracking data. So much so that, in theory, even a Sunday league team could get its hands on this type of data.</p><p><a href="https://medium.com/@johncomonitski/machine-learning-computer-vision-a-better-match-tactician-than-pep-guardiola-1b44924968e9">Machine Learning &amp; Computer Vision, A Better Match Tactician Than Pep Guardiola</a></p><p>Of course, Sunday leagues probably don’t want or need this data, but I do think this is the path we’re on. 
Every day, I’m seeing new examples of tracking data all over Twitter. In my opinion, this is where the future of sports technology is going. It’s an exciting future! There is just one hitch. It’s not enough.</p><iframe src="https://cdn.embedly.com/widgets/media.html?type=text%2Fhtml&amp;key=a19fcc184b9711e1b4764040d3dc5c07&amp;schema=twitter&amp;url=https%3A//x.com/Gautam_A_k/status/2031452541939335621&amp;image=" width="500" height="281" frameborder="0" scrolling="no"><a href="https://medium.com/media/920dcbfbcc350bdfd137a43c582056dd/href">https://medium.com/media/920dcbfbcc350bdfd137a43c582056dd/href</a></iframe><p>Tracking data is only so useful. You might be able to generate physical statistics for players, but a series of X and Y coordinates does not tell the story of a match. There are major gaps in your ability to analyze a match with tracking data alone. You need event data. That is what we’re setting out to generate today. My goal is to fill in the gap left by Soccer Vision and generate event data from tracking data!</p><p><strong>Event Data from Tracking Data</strong></p><p>The first step to converting tracking data to event data was to track possessions. My goal was to create a list of possessions that define who is on the ball and when. This process wasn’t exactly simple, but in a nutshell, a possession began when the ball was within 0.75 units of a player. Once the ball left that player, the possession ended. 
There are of course exceptions for instances of failed tackles or the ball just temporarily moving past a player, but once we sorted through all that noise, we had a list of every individual player’s possession through the whole match.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*UM3025NpSxoCoOVfPfzGlQ.png" /><figcaption>Example of a possession object generated from reviewing tracking data.</figcaption></figure><p>With our big list of possessions, the plan was to evaluate these possessions using relatively simple rules to generate events. For example, Player 15 is on the ball. The ball leaves him and goes to his teammate Player 19. We can generate a PASS event from Player 15 to Player 19.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/950/1*-78bDsqFM9QJWAjednOLDQ.png" /><figcaption>Capturing of a pass event.</figcaption></figure><p>Another example would be Player 17 on the ball. The ball leaves him and goes to Player 2 from the opposing team. We can generate a BALL LOST event from Player 17 and a RECOVERY INTERCEPTION event from Player 2.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/950/1*NAPhqjC38OQ5WeRKAeGs2A.png" /><figcaption>Capturing of an interception event</figcaption></figure><p>These events are relatively straightforward to detect. A more complicated example would be the goal. In this match, Player 2 very briefly has the ball and redirects it. We note that the ball, after leaving Player 2, is moving in the direction of the net. We see that the ball went in the net and is suddenly seen in the center of the pitch. 
That’s a Shot (and Goal) by Player 2, followed by a kickoff.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/950/1*aR2EWpo6dJ2QT5DPHSCZtg.png" /><figcaption>Capturing of a shot event</figcaption></figure><p>With enough work, I was eventually able to keep building on these rules until I had an algorithm that converts tracking data to possession data and then turns possession data into event data.</p><p><strong>Event Data Generation Algorithm</strong></p><p>At the moment, my algorithm evaluates possessions in groups of 3. The idea is to evaluate events by looking at the previous possession, the current possession and the next possession. Evaluating before, during and after has so far been enough to identify what happened to the ball and generate events.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*wcpBly0k0zTBRyffJ-po8w.png" /><figcaption>The review_events method takes 3 possessions (before, current and after) and attempts to parse if a footballing event has occurred during these possessions.</figcaption></figure><p>As we are generating possessions, every possession (along with the possessions before and after it) is passed to <em>review_events</em>. It looks like this in action.</p><iframe src="https://cdn.embedly.com/widgets/media.html?src=https%3A%2F%2Fwww.youtube.com%2Fembed%2FS5FcKecB5bU%3Ffeature%3Doembed&amp;display_name=YouTube&amp;url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DS5FcKecB5bU&amp;image=https%3A%2F%2Fi.ytimg.com%2Fvi%2FS5FcKecB5bU%2Fhqdefault.jpg&amp;type=text%2Fhtml&amp;schema=youtube" width="640" height="480" frameborder="0" scrolling="no"><a href="https://medium.com/media/7693291b1d50b46efdfb962d4de47b3e/href">https://medium.com/media/7693291b1d50b46efdfb962d4de47b3e/href</a></iframe><p><strong>The Final Results</strong></p><p>I tested my algorithm by running it through the first 4,000 frames (roughly 2 1/2 minutes) of a match. 
This 4,000-frame sample had a nice variety of events (passes, interceptions, a challenge, a goal and a corner kick) to help prove this project is viable.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*54LCnYuzZYKrKanUMhd4DA.png" /><figcaption>Final list of events generated over 4,000 frames of match tracking data</figcaption></figure><p>Overall, I was satisfied with the results. The official Metrica event data for the first 4,000 frames included 39 events. My algorithm was able to capture 37 of those events.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*mhXdGpMxOSKh3d_SpMj-GA.png" /><figcaption>Official Metrica event data compared to event data generated by my algorithm. Missing events or details are highlighted in yellow and some of their corresponding data generated by my algorithm are highlighted in red.</figcaption></figure><p>After reviewing the results, I discovered that the two events we missed were two CHALLENGE events (AERIAL-LOST and AERIAL-WON). Capturing AERIAL subtype events was always going to be a challenge. When we flatten match footage onto a 2D plane, aerial information will be lost. Additionally, if you watch the moment back, the tracking data isn’t very clear. Given the inconsistencies in the data we’re working with, I think this is an acceptable set of events to miss.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/950/1*rH5pXK25IiY3rJWuC61EHw.gif" /><figcaption>Given the way the ball moves past Players 2 and 18, snaps to them, and then bounces away, it would be difficult to detect, even for a human, that this was a challenge.</figcaption></figure><p>Overall, I was pleased to see my algorithm capture every pass and interception, pick up on the goal and differentiate between set pieces such as a corner and a goal kick. There are still shortcomings, however. My algorithm currently does not attempt to parse passing subtypes. 
I do think passing subtypes like CROSS and LONG BALL have clear paths to detection and I will likely attempt to detect them in a future iteration of this project, but the HEAD passing subtype will remain difficult. Much like the AERIAL subtypes, it simply cannot be detected until additional information is encoded into the tracking data. You see this pitfall again with the HEAD-INTERCEPTION that was labeled just an INTERCEPTION and the HEAD-ON TARGET GOAL that was labeled an ON TARGET GOAL.</p><p>The next steps of this project include refining shot detection, encoding other set pieces, and, as previously mentioned, adding subtypes to passing based on where the pass was made from and how far it traveled. I’m excited to see this project grow. Feel free to follow along as I update the project as part of this repo!</p><p><a href="https://github.com/JohnComonitski/TrackingDataToEventData">GitHub - JohnComonitski/TrackingDataToEventData: Converting Football Tracking Data to Event Data</a></p><p>We are rapidly heading towards a world where tracking data will be everywhere. There will then be a need to turn that tracking data into event data. I hope to have this project ready for that day!</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[An Ecological Solution to Opta Killing FBRef’s Football Statistics]]></title>
            <link>https://medium.com/@johncomonitski/an-ecological-solution-to-opta-killing-fbrefs-football-statistics-1b5231dcc076?source=rss-e8cb1575753c------2</link>
            <guid isPermaLink="false">https://medium.com/p/1b5231dcc076</guid>
            <category><![CDATA[sport-analytics]]></category>
            <category><![CDATA[football-analytics]]></category>
            <category><![CDATA[soccer]]></category>
            <category><![CDATA[soccer-analytics]]></category>
            <category><![CDATA[premier-league]]></category>
            <dc:creator><![CDATA[Johncomonitski]]></dc:creator>
            <pubDate>Tue, 17 Feb 2026 14:12:38 GMT</pubDate>
            <atom:updated>2026-02-17T14:12:38.424Z</atom:updated>
            <content:encoded><![CDATA[<p><em>Utilizing the concept of indicator species from ecology to evolve as an amateur football analyst in a post-FBRef footballing environment.</em></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*sXmAo8hawFXoTx5_sk_d1w.png" /><figcaption>A typical FBRef statistics table.</figcaption></figure><p><strong>FBRef is dead! And Opta killed it!</strong></p><p>FBRef is dead! Well, it’s not exactly dead, but it’s been severely handicapped. Opta pulled the rug out from under FBRef and with that went its advanced analytics. It’s tragic, because this data was such a gateway drug into football analysis. Personally, FBRef was one of the backbones of <a href="https://soccerapipy.com/">Soccer API</a>. Additionally, many of my past articles could not have been written without FBRef’s advanced analytics.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*ymroiNgpbruzOiiqrHZQCw.png" /><figcaption>My Football Player Profile was an app that allowed you to build radar charts of players with FBRef data. It has been broken since Opta severed ties with FBRef.</figcaption></figure><p>It’s important to remember that this loss of footballing data doesn’t kill amateur football analysis, but it does mean we need to start getting clever. For a while, I’ve been sitting on the idea of “Footballing Indicator Species”. Indicator species come from the world of ecology and are utilized in research situations where direct measurements are either unavailable, expensive, or impractical. The WWF defines an indicator species as “a species or group of species chosen as an indicator of, or proxy for, the state of an ecosystem or of a certain process within that ecosystem”. 
Essentially, we are using patterns from known data to draw conclusions about conditions and data we can’t directly observe.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*wBDOYIqfxw6Ji5hTT7Fbaw.png" /><figcaption>Indicator species in ecology.</figcaption></figure><p>For example, imagine we’re at a body of water and we want to determine if the water is fresh water. For some reason, we don’t have the tools on hand required to test the water ourselves. We need to get creative! We do see there is a large population of Mayfly larvae. Mayfly larvae are highly sensitive to pollution and thrive in cold, well-oxygenated, unpolluted freshwater. Given the large presence of Mayfly larvae (our indicator species), there’s a high probability this body of water is fresh and clean (data we can’t directly measure). This works in the inverse as well. If the Mayfly larvae population begins to decrease, it could be a sign that the body of water has become polluted.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*kMC1F63AcmsbZNlRex7uPw.png" /><figcaption>Mayfly larvae</figcaption></figure><p><strong>Indicator Species in Football</strong></p><p>How are ecology and indicator species related to football? Footballing statistics don’t happen in a vacuum, so imagine the football pitch like an ecosystem. A team with a high number of goals likely has a high number of shots. The presence of one statistic likely means there are tactics, player roles and team behaviors that either drive or result from other statistics. For this reason, we can, as football analysts, find ourselves in a similar situation as the ecologist looking at Mayfly larvae to identify fresh water.</p><p>Let’s say, for example, we’re looking for a player who can play Through Balls or Line-Breaking Passes. Before we lost our access to advanced statistics, this was a statistic available on FBRef. 
We could look at both Passes into the Final Third and Through Balls in our scouting process. Sadly, that data is now gone, so we need to look elsewhere. Interestingly, FBRef does provide us with the number of Offsides each team has committed. Considering one of the most common ways to become offside is by mistiming your run onto a line-breaking pass, there’s potential to use it as an indicator species to identify teams with players who frequently make Passes into the Final Third or Through Balls.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Y3aW8zKaWehddyiqnojG4w.png" /><figcaption>A Newcastle attacker is caught offside running onto a Through Ball.</figcaption></figure><p>If we review FBRef for teams that are frequently offside, we can be more targeted with our player recruitment. Once we find a team with a large number of offsides per match, we can dig deeper into their squad to identify who is likely playing those passes. The players we pick out from this search could then be shortlisted for a further deep dive, where we review their match footage.</p><p>This type of recruitment is not as elegant a solution as jumping right to players with a large number of Passes into the Final Third or Through Balls, but by using Offsides as an indicator species, we are still able to find players worth reviewing without having to wander aimlessly through a list of thousands of players.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*goopL1580sd_8QYICkX9Iw.png" /><figcaption><em>Note: For this example, a large number of offsides could also mean some attackers just have poor timing. That’s absolutely true, but we would not be doing this type of scouting if we had more information to work with. We can always eliminate players with poor timing, who popped up on our radar, during our review of their match footage. 
This is where judgement calls may need to be made.</em></figcaption></figure><p>As you can see, we’ve lost a lot of fantastic statistics, but not all hope is lost. Let’s look at some other statistics that we may be able to find indicator species for. To do so, we are going to utilize all the footballing data resources we can. In this case, we’re looking at FBRef, Fotmob and API Football.</p><p><strong>Pass Switches</strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*8DzPkWCe3biawDwopgl57w.png" /></figure><p>If we’re looking for a player who switches the field a lot, we need to get creative with our indicator species. Looking at Fotmob, we’re able to search for players with a high number of Long Balls, a good Accurate Long Balls percentage and a low number of Crosses and Through Balls.</p><p>What we’re doing here is stacking our indicator species to improve our confidence. Typically, switches are long balls, but not all long balls are switches. They can, for example, be Crosses and Through Balls. So we start by selecting players who frequently complete Long Balls and then filter out the players whose Long Balls are primarily Crosses and Through Balls. Finally, by the nature of when and where switches are played, they are often far less contested than Crosses and Through Balls. We can refine our search by filtering out players with inaccurate Long Balls. This improves our chances of identifying players who consistently use long balls to switch the field.</p><p><strong>Expected Assists</strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*iYy9m6n7A8yuH4-xqg6J8w.png" /></figure><p>Finding players with a high Expected Assists total is straightforward. We can simply use a high number of Key Passes and Chances Created as our indicator species.</p><p>Losing the raw Expected Assists number is a shame, but Chances Created and Key Passes are good signals for Expected Assists. 
More often than not, either of these two actions will result in a decent Expected Assists value and a player with a large number of these stats will, in turn, have a large total Expected Assists.</p><p><strong>Average Shot Distance</strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*4_xKjM7f-BzIvc_L_58AUg.png" /></figure><p>If we’re looking for a distance shooter (a high Average Shot Distance), they can potentially be identified by their high volume of Shots and a low total xG.</p><p>This is fairly intuitive when you consider that the farther out you shoot, the lower your shot’s xG will be. If someone has accumulated a low xG, but is shooting fairly regularly, there is a high probability that those shots are likely from positions on the pitch that are not advantageous, such as around or past the 18-yard box.</p><p><strong>Carries Into the Final Third</strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*qMcCgnwuR0I5PwWAY1GI7Q.png" /></figure><p>How can we identify players who create deep offensive penetration with the ball at their feet? Let’s consider the environment. The final third is often more contested than the defensive third or the midfield. It’s also a region of the pitch where teams may find it worthwhile to cut off attacks with tactical fouls. With that in mind, two stats we want to pay attention to are Dribbles Won and Fouls Drawn.</p><p>If a midfielder’s or attacker’s statistics exhibit both high Dribbles Won and high Fouls Drawn relative to their position group, there’s a high probability that a significant portion of these events occur as they carry the ball into the final third. 
This is because the region between the middle third and final third of the pitch is where take-ons have their highest risk/reward potential for attackers, and tactical fouls have the highest risk/reward potential for defenders.</p><p><strong>It’s Time for Amateur Analysts to Evolve</strong></p><p>I think it wasn’t until Opta’s advanced metrics were taken away from us that we realized how blessed we were to have FBRef as a resource, but losing them doesn’t mean that the community of amateur football analysis will suddenly go extinct. To lean on the ecology metaphor one last time, catastrophic events leave mass extinction and destruction in their wake, but they also force innovation and adaptation. It’s time for our community to innovate and adapt!</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Key Passes and Disrupting Crucial Build Up Patterns]]></title>
            <link>https://medium.com/@johncomonitski/key-passes-and-disrupting-crucial-build-up-patterns-3a30e274dc5f?source=rss-e8cb1575753c------2</link>
            <guid isPermaLink="false">https://medium.com/p/3a30e274dc5f</guid>
            <category><![CDATA[football-analysis]]></category>
            <category><![CDATA[football-data-analytics]]></category>
            <category><![CDATA[soccer]]></category>
            <category><![CDATA[premier-league]]></category>
            <category><![CDATA[football]]></category>
            <dc:creator><![CDATA[Johncomonitski]]></dc:creator>
            <pubDate>Wed, 04 Feb 2026 03:21:51 GMT</pubDate>
            <atom:updated>2026-02-04T03:21:51.275Z</atom:updated>
            <content:encoded><![CDATA[<p><em>Analyzing a team’s passing patterns via their event data with the goal of understanding which players are most influential in the generation of Key Passes and identifying crucial patterns in a build-up, which we can disrupt to stifle an opponent’s attacks.</em></p><p><strong>Let’s Play Opposition Scout Over Sample Data!</strong></p><p>Imagine I am an opposition scout for a club that’s playing Red Team FC. For some reason (unimportant to this exercise), we don’t have time to watch and analyze their match footage. Thankfully, our data provider has supplied us with tracking and event data from their recent victory over Blue Team United. I’ve been assigned to analyze this data with the aim of identifying who their most influential players are in attack.</p><p>To do so, we are going to analyze Key Passes and the passing chains that lead to those Key Passes. A Key Pass is a pass in which the next action on the ball is a shot. In the example below, Player 6 passes to Player 10, Player 10 passes to Player 8, who then shoots. Player 8 will be credited with a shot and Player 10, although they won’t get an assist, will be rewarded with a Key Pass.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/800/1*GShOv82wXLwO3kLx-hP9kQ.gif" /><figcaption>Passing sequence from our sample match data of Red Team FC vs Blue Team United. In this example, Player 10’s Key Pass to Player 8 leads to a shot on target.</figcaption></figure><p>In this analysis, we will be analyzing the build-up to all Key Passes by Red Team FC in their match against Blue Team United to understand which players’ passing pairs are most common in the build-up to a Key Pass. Additionally, we’ll be looking at which players are most influential in the generation of Key Passes. 
In doing so, I hope to pick out crucial moments in Red Team FC’s build-up, which we can disrupt to stifle their attacks.</p><p><strong>Ingesting Our Match Data and Finding Key Passes</strong></p><p>We’ll start by importing my Football Match Analysis library. This is a library I built atop the fantastic work of Friends of Tracking. This library is designed to ingest tracking and event data and simplify match analysis by providing a number of programmable objects featuring several helper functions.</p><p><a href="https://github.com/JohnComonitski/FootballMatchAnalysis">GitHub - JohnComonitski/FootballMatchAnalysis: Football match analysis using football event and tracking data</a></p><p>Once we’ve imported our library and match data, we’ll filter the event data to pick out all passes by Red Team FC. This gives us 437 passes to review. To find Key Passes, we need to check if the next event by the receiving player on the ball is a shot. To simplify the process, in the Football Match Analysis library, I’ve written a method to do exactly that. <strong><em>is_key_pass</em></strong> will take an event and return <strong><em>True</em></strong> if that event leads to a shot. We can use this function as a filter over our 437 passes to find 14 Key Passes.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*_Ee57rRrUZIG7qDZf95i9A.png" /><figcaption>is_key_pass takes a Match object and an event (Pandas dataframe) and returns true if the event is a Key Pass. 
As seen below, this function can be used as a filter over event data to isolate Key Passes.</figcaption></figure><pre>from analysis.events import *<br>from objects.match import Match<br><br># Load the sample match (tracking + event data)<br>match = Match(DATADIR=&#39;./data&#39;, game_id=1)<br><br># All passes by Red Team FC (the home side)<br>passes = match.get_events(&quot;PASS&quot;)<br>passes = passes[passes[&quot;Team&quot;] == &quot;Home&quot;]<br>print(len(passes))  # 437<br><br># Keep only the passes that are immediately followed by a shot<br>key_passes = passes[passes.apply(is_key_pass, axis=1)]<br>print(len(key_passes))  # 14</pre><p>The next step is to retrace our steps. We will iterate over each Key Pass’s preceding pass to uncover the exact chain of passes that led to each Key Pass. With all 14 passing chains that lead to a shot, we can now begin our analysis.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*lrZdr56zwG76dYbSmATrIA.png" /><figcaption>All Key Pass passing sequences as a CSV from Red Team FC in their match against Blue Team United.</figcaption></figure><p><strong>Review the Results, How Can We Stop Red Team FC?</strong></p><p>My first thought was to perform some kind of Key Pass visualization. I hoped visualizing the Key Passes would unlock some visually obvious secrets in Red Team FC’s build-up. I started by creating this simple sketch on paper.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Fnvg2JchfQZsrf43upLXkA.png" /><figcaption>My original attempt to visualize the Key Pass passing sequences. Multiple arrows indicate the number of times we saw a specific passing combination.</figcaption></figure><p>It was interesting to see the flow of passes. You do get a feel for where passing traffic concentrates, but it didn’t jump out at me as particularly useful. I later toyed with some other ideas, such as a Sankey Diagram, but wasn’t thrilled with the results. I ultimately settled on running a series of aggregations on the raw passing data. 
The first aggregation examined which passing combinations were most common.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/717/1*YlTq7kYJGwOnf2W-suGuCw.png" /><figcaption>The most common passing combinations in our passing chains.</figcaption></figure><p>I then ranked the players by total involvement in passing chains (passes and shots).</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*st76alQTOK7LARuviyxgZA.png" /><figcaption>The players with the most involvement in our 14 Key Pass passing sequences. Involvement includes shots or passes.</figcaption></figure><p>Finally, I wanted to see what percentage of the passing chains that led to a shot each player was involved in.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*rYaFFWD_kBHZtuZDToqcMg.png" /><figcaption>Percentage of Key Pass passing sequences each player of Red Team FC was involved in (blue).</figcaption></figure><p>Reviewing the results, Player 10 jumps off the page as being one of the more influential players on the pitch. Player 10 had 3 shots and contributed a pass in 4 of the passing chains that led to a Key Pass. This makes sense. Player 10 is quite literally playing in the “Number 10” role for Red Team FC.</p><p>I do think, however, if our goal is to disrupt Red Team FC’s Key Pass passing chains, we need to take a step back (in the chain) and set our sights on Players 5 and 6. When Player 10 received the ball, 75% of the time it came from Player 5 or 6. On top of that, they both lead the team in passes that lead to a shot with 5 apiece. If you focus on disrupting Players 5 and 6, not only do you keep a dangerous player like Player 10 off the ball, but you disrupt 57% of their eventual Key Passes. You’d cut their shots roughly in half.</p><p><strong>Taking this Beyond Red Team FC</strong></p><p>You’re not going to win the match solely by shutting down Players 5 and 6 and leaving Player 10 isolated, but it’s a good start to a defensive game plan. 
Consider this a taste of the kinds of analysis we can perform by identifying Key Passes amongst an entire match’s event data. With more time and data aggregated over a season, one could very easily and very quickly provide a comprehensive report of an opposition team’s attacking patterns.</p><p>Coaching staffs can utilize these reports to save significant time. They can lean heavily into the report findings to reduce their hours spent watching opposition film or they can use them as a primer that makes film study more efficient and effective. Either way, as tracking data and event data become more ubiquitous, teams’ abilities to leverage that data will open the door to massive advantages.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Younger, Better & Cheaper Scouting Analysis]]></title>
            <link>https://medium.com/@johncomonitski/younger-better-cheaper-scouting-analysis-479372917373?source=rss-e8cb1575753c------2</link>
            <guid isPermaLink="false">https://medium.com/p/479372917373</guid>
            <category><![CDATA[football]]></category>
            <category><![CDATA[soccer]]></category>
            <category><![CDATA[football-analytics]]></category>
            <category><![CDATA[premier-league]]></category>
            <category><![CDATA[scouting]]></category>
            <dc:creator><![CDATA[Johncomonitski]]></dc:creator>
            <pubDate>Sat, 16 Aug 2025 15:23:56 GMT</pubDate>
            <atom:updated>2025-08-16T15:24:11.616Z</atom:updated>
            <content:encoded><![CDATA[<p><em>Using Soccer API to perform the 21st Club’s Younger Better &amp; Cheaper style of scouting analysis and identify for Union Berlin a potential replacement for Benedict Hollerbach.</em></p><p>I read too many football books. I’m not kidding, it’s getting to the point that my significant other rolls her eyes every time a new football book arrives in the mail.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*OVWXZVPzBIDhm_qTLFGCyQ.png" /><figcaption>A sad attempt to stitch all 31 of my football books into one photo</figcaption></figure><p>Many of these books are marked with sticky notes. I’ll be reading a chapter, it clicks in my brain, and inspiration strikes! Suddenly, I have a new project idea and that sticky note serves as a reminder to come back to it. That was at least the intention. In reality, I rarely ever started, let alone completed, the project ideas.</p><p>However, one sticky note was different. Last March I read the <a href="https://www.twentyfirstgroup.com/">21st Club</a>’s Changing the Conversation, a collection of football essays detailing “fresh perspectives and creative approaches to crucial topics including strategy, succession planning, recruitment and performance”. Each essay takes you into the mind of a football boardroom member. As someone who aspires to work in football, I devoured this series.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/333/1*TlgCt6HLRe1-XqwvTgrnyg.png" /><figcaption>Changing the Conversation Volume 1</figcaption></figure><h4><strong>Younger, Better, Cheaper</strong></h4><p>The essay “Younger, Better, Cheaper” particularly hijacked my brain. It discusses a simple premise: If you had to replace a player, in theory, the best possible player you can sign is someone younger, better, and cheaper. In a perfect world, every signing would check all three boxes. Unfortunately, we do not live in a perfect world. 
If your club has limited resources or information, finding a player who checks all three boxes is like finding a needle in a haystack.</p><p>Perhaps your club has lost its star. You have some replacement options who are younger and cheaper, but they’re so inexperienced that they’re too raw to be better. You may also have an option who’s better and younger, but they likely already have a ton of resale potential, so they’re probably not cheaper. Checking all three boxes (younger, better, and cheaper) feels like an impossible matrix where you can have at most two, but not all three.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1000/1*a-S6XWqXUm-kkyQ8iH3usg.png" /><figcaption>21st Club’s attempt to find a Younger Better and Cheaper replacement for Ivan Perišić</figcaption></figure><p>Does this player even exist? Thankfully, with data, a lack of resources or information is no excuse! In just minutes, we can sift through tens of thousands of players and quantify exactly how many players check all three boxes, narrowing an ocean of players down to maybe a more manageable dozen possible recruits.</p><h4><strong>Implementing Younger, Better, Cheaper Into Soccer API</strong></h4><p>The 21st Club wrote, “In recruitment, we don’t sign the best player available; we sign the best player we know. When scouting resources are scarce and noise from agents is high, a full perspective on market options is essential”.</p><p>I believe this quote was a significant reason why this sticky note project idea, unlike many before it, was actually completed. I had recently published Soccer API, a Python library designed to solve the challenges presented when “scouting resources are scarce” and provide amateur football analysts (like myself) “a full perspective on market options”. I read that sentence and I knew right away, Soccer API was the perfect tool to cut through the noise. 
I got to work!</p><p><a href="https://github.com/JohnComonitski/SoccerAPI">GitHub - JohnComonitski/SoccerAPI: Python library that allows for the easier collection of league, team and player data for use by amateur football data analysts.</a></p><p>I began by adding a new scouting module to Soccer API, which can be simply instantiated in a couple of lines of code.</p><pre>class Scouting:<br>    def __init__(self):<br>        pass</pre><pre>from SoccerAPI.soccerapi.soccerapi import SoccerAPI<br><br>config = {<br>    &quot;fapi_host&quot; : &quot;XXXXXXXX&quot;,<br>    &quot;fapi_key&quot; : &quot;XXXXXXXX&quot;<br>}<br>api = SoccerAPI(config)<br><br>api.scouting</pre><p>Inside this new scouting class, I wrote a method, <strong><em>younger_better_cheaper</em></strong>, which takes a player we’d like to replace, a list of players to scout from and a target statistic. With those parameters, we then perform a younger, better, cheaper analysis over those players.</p><pre>def younger_better_cheaper(self, player: Player, players_to_scout: list[Player], stat_key: str, year: str = None) -&gt; dict:<br>    if not year:<br>        current_date = datetime.now()<br>        year = str(current_date.year)<br><br>    stat = player.statistic(stat_key, year).value<br>    age = player.profile()[&quot;age&quot;]<br>    mv = player.market_value()<br><br>    younger_better_cheaper = []<br>    cheaper = []<br>    younger = []<br>    better = []<br>    for scouted_player in players_to_scout:<br>        scouted_player_stat = scouted_player.statistic(stat_key, year).value<br>        scouted_player_age = scouted_player.profile()[&quot;age&quot;]<br>        scouted_player_mv = scouted_player.market_value()<br><br>        res = {<br>            &quot;cheaper&quot; : 0,<br>            &quot;better&quot; : 0,<br>            &quot;younger&quot; : 0<br>        }<br><br>        if( scouted_player_stat &gt;= stat ):<br>            res[&quot;better&quot;] = 1<br>            
better.append(scouted_player)<br><br>        if( scouted_player_age &lt;= age ):<br>            res[&quot;younger&quot;] = 1<br>            younger.append(scouted_player)<br><br>        if( scouted_player_mv &lt;= mv ):<br>            res[&quot;cheaper&quot;] = 1<br>            cheaper.append(scouted_player)<br><br>        if( res[&quot;cheaper&quot;] and res[&quot;younger&quot;] and res[&quot;better&quot;]):<br>            younger_better_cheaper.append(scouted_player)<br><br>    cheaper = sorted(cheaper, key=lambda x: x.market_value())<br>    cheaper.reverse()<br>    younger = sorted(younger, key=lambda x: x.market_value())<br>    younger.reverse()<br>    better = sorted(better, key=lambda x: x.market_value())<br>    better.reverse()<br><br>    return {<br>        &quot;Cheaper&quot; : cheaper,<br>        &quot;Younger&quot; : younger,<br>        &quot;Better&quot; : better,<br>        &quot;younger_better_cheaper&quot; :younger_better_cheaper<br>    }</pre><p>The method simply groups the <strong><em>players_to_scout</em></strong> into the three respective sets using the <strong><em>player</em></strong> parameter as the reference to measure against. “Younger” is simply identified by comparing the players’ ages. “Better” is determined by comparing the value of <strong><em>player.statistic()</em></strong> for a given statistic. Finally, “Cheaper” is determined by comparing <strong><em>player.market_value()</em></strong>, which grabs the latest Transfermarkt Market Value. 
Once divided into three sets, we can use Set Theory and Matplotlib to report our findings using a Venn Diagram just like the 21st Club did.</p><pre>def venn_diagram(self, sets: dict, params: dict):<br>    sets_list = []<br>    labels = []<br>    for key in sets.keys():<br>        labels.append(key)<br>        sets_list.append(sets[key])<br><br>    if(params and &quot;title&quot; not in params):<br>        params[&quot;title&quot;] = &quot;Comparing Players by &quot; + &quot;, &quot;.join(labels) <br><br>    self.__set_up_vis(params)<br><br>    if(len(labels) &gt; 3 or len(labels) == 1):<br>        return { &quot;success&quot; : 0, &quot;res&quot; : {}, &quot;error_string&quot; : &quot;Error: Can only generate Venn diagrams for 2 or 3 sets&quot; }<br><br>    venn = None<br>    region_map = {}<br><br>    if(len(labels) == 2):<br>        venn = venn2(sets_list, set_labels=(labels), ax=self.ax2)<br>        region_map = {<br>            &#39;100&#39;: (0,),     # A only<br>            &#39;010&#39;: (1,),     # B only<br>            &#39;110&#39;: (0, 1)   # A ∩ B<br>        }<br>    else:<br>        venn = venn3(sets_list, set_labels=(labels), ax=self.ax2)<br>        region_map = {<br>            &#39;100&#39;: (0,),     # A only<br>            &#39;010&#39;: (1,),     # B only<br>            &#39;001&#39;: (2,),     # C only<br>            &#39;110&#39;: (0, 1),   # A ∩ B<br>            &#39;101&#39;: (0, 2),   # A ∩ C<br>            &#39;011&#39;: (1, 2),   # B ∩ C<br>            &#39;111&#39;: (0, 1, 2) # A ∩ B ∩ C<br>        }<br><br>    for patch in venn.patches:<br>        if patch:<br>            patch.set_edgecolor(self.tertiary_color)<br>            patch.set_linewidth(1) <br><br>    for text in venn.set_labels:<br>        if text:<br>            text.set_color(self.tertiary_color)<br><br>    for text in venn.subset_labels:<br>        if text:<br>            text.set_color(self.tertiary_color)<br><br>    def get_label_for_region(indices):<br>        included = 
set.intersection(*(sets_list[i] for i in indices))<br>        excluded = set.union(*[sets_list[i] for i in range(len(labels)) if i not in indices], set())<br>        return included - excluded<br><br>    for code, indices in region_map.items():<br>        label = venn.get_label_by_id(code)<br>        if label:<br>            items = get_label_for_region(indices)<br>            label_text = &quot;&quot;<br>            if(len(items) &gt; 3):<br>                label_text = f&quot;{len(items)} total, including...\n&quot; + &quot;\n&quot;.join(list(items)[0:2])<br>            elif(len(items) &gt; 0):<br>                label_text = &quot;\n&quot;.join(items)<br>            label.set_text(label_text)<br><br>    if( &quot;filename&quot; not in params):<br>        params[&quot;filename&quot;] = &quot;_&quot;.join(labels) + &quot;_venndiagram.png&quot;<br><br>    self.__output_vis(params)</pre><h4><strong>Putting Soccer API to the Test</strong></h4><p>With a new scouting library at our disposal, we now need a challenge to put Soccer API to the test. I’ve decided to draw inspiration from a club I know well, Union Berlin. In May, Union transferred Benedict Hollerbach to Mainz for a reported 7.5 million euros. With 9 goals last season (Union didn’t score many goals last season…), Hollerbach was a massive reason Union avoided relegation. He was an offensive spark whose tireless motor and courageous attacks were a constant threat. His departure leaves a huge hole in Union’s attack.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/708/1*NcQaZIlG_nLSLc7xj1NPLg.jpeg" /><figcaption>Benedict Hollerbach with Union Berlin</figcaption></figure><p>Union has likely found its answer to losing Hollerbach. This summer they signed attacking options Ilyas Ansah and Oliver Burke and they signed Andrej Ilic, who was brought in on loan last season, to a permanent deal. 
Let’s pretend, however, that they didn’t make these signings and we want to use Soccer API to identify potential replacements.</p><p><strong>Hollerbach’s Younger, Better, Cheaper Replacement</strong></p><p>The first step is to understand Hollerbach as a player and identify which statistics we can use to profile his playing style. As an attacker, Hollerbach is quite versatile. Union deployed him as a lone striker, alongside another striker in a strike duo and as a more traditional winger.</p><p>His greatest strength is when he has the ball at his feet. If Hollerbach can play direct with the ball and get up the pitch as fast as possible, he’s likely thriving. This pairs well with the fact that he’s not afraid to take on players at high speeds. He’ll use his speed, creativity, and skill to beat a man, which then opens up space elsewhere on the pitch and creates opportunities for others or himself 1 to 2 passes later. An additional thing to note is his motor. Hollerbach’s a high-energy relentless player who’s not afraid to do the dirty work as an attacker. He’s not the most effective presser, but he will be a pest.</p><p>For this exercise, we’re going to define five key statistics to summarize Hollerbach’s scouting profile…</p><ul><li><strong>Goals</strong>: We need a player with offensive output.</li><li><strong>Progressive Carries</strong>, <strong>Take-Ons Won</strong> &amp; <strong>Passes into Penalty Area</strong>: These three statistics describe his ability to create chances and opportunities with the ball at his feet.</li><li><strong>Tackles</strong>: We need an attacker who’s willing to put a defensive shift in.</li></ul><p>Our next step is to run a Younger, Better &amp; Cheaper analysis for each of these five statistics and use the results to identify a short list of players to recruit from. For this experiment, we’re going to be scouting the Bundesliga, MLS and the Championship. 
This is because these are three leagues that Soccer API can access advanced analytics for. In total, we have 2043 players to scout. Before we run our analysis, though, we are going to narrow this list down to only attacking or midfield players who made more than 10 appearances last season, scored more than 5 goals and have a market value above €500,000. This narrows 2043 players down to only 168 that we need to perform a Younger, Better &amp; Cheaper analysis over.</p><p><em>Note: I understand, given the difference in league quality, there is a flaw in directly comparing MLS statistics to Championship statistics or Championship statistics to Bundesliga statistics. This is just an experiment. In theory, a proper scout could use their better judgement to determine if a player’s lower league statistical output can translate into the Bundesliga.</em></p><pre>from SoccerAPI.soccerapi.soccerapi import SoccerAPI<br><br>config = {<br>    &quot;fapi_host&quot; : &quot;XXXXXXXX&quot;,<br>    &quot;fapi_key&quot; : &quot;XXXXXXXX&quot;<br>}<br><br>api = SoccerAPI(config)<br><br># Hollerbach &amp; Leagues to Scout<br>hollerbach = api.db.get(&quot;players&quot;, 93766)<br>buli = api.db.get(&quot;leagues&quot;, 13)<br>mls = api.db.get(&quot;leagues&quot;, 134)<br>champ = api.db.get(&quot;leagues&quot;, 117)<br>leagues = [buli, mls, champ]<br><br># Getting Players to Scout<br>players_to_scout = []<br>for league in leagues:<br>    for team in league.teams():<br>        for player in team.players():<br>            # Keep players who match Hollerbach&#39;s position and clear the appearance/goal thresholds<br>            if api.scouting.same_position(hollerbach, player):<br>                if(player.statistic(&quot;Matches&quot;).value &gt; 10 and player.statistic(&quot;Goals&quot;).value &gt; 5):<br>                    players_to_scout.append(player)<br>    <br># Performing Younger, Better, Cheaper Analysis           <br>outputs = {}<br>outputs[&quot;Goals&quot;] = api.scouting.younger_better_cheaper(hollerbach, players_to_scout, stat_key=&quot;Goals&quot;, 
year=&quot;2024&quot;)<br>outputs[&quot;Progressive Carries&quot;] = api.scouting.younger_better_cheaper(hollerbach, players_to_scout, stat_key=&quot;Progressive Carries&quot;, year=&quot;2024&quot;)<br>outputs[&quot;Take Ons Won&quot;] = api.scouting.younger_better_cheaper(hollerbach, players_to_scout, stat_key=&quot;Take Ons Won&quot;, year=&quot;2024&quot;)<br>outputs[&quot;Passes Into Penalty Area&quot;] = api.scouting.younger_better_cheaper(hollerbach, players_to_scout, stat_key=&quot;Passes Into Penalty Area&quot;, year=&quot;2024&quot;)<br>outputs[&quot;Tackles&quot;] = api.scouting.younger_better_cheaper(hollerbach, players_to_scout, stat_key=&quot;Tackles&quot;, year=&quot;2024&quot;)<br><br># Generating Venn Diagrams<br>for stat in outputs:<br>    params = {<br>        &quot;title&quot; : f&quot;Hollerbach Cheaper, Better, Younger - {stat}&quot;,<br>        &quot;description&quot; : f&quot;An analysis of midfielders &amp; attacker&#39;s {stat} from the Bundesliga, MLS and Championship who have +10 matches played, +5 Goals and a TM Market Value above €500,000.&quot; ,<br>        &quot;signature&quot; : &quot;@JohnComFootball&quot;,<br>        &quot;filename&quot; : f&quot;{stat}_YBC_Analysis.png&quot;<br>    }<br>    <br>    output = outputs[stat]<br>    sets = {}<br>    for set_key in output.keys():<br>        if set_key != &quot;younger_better_cheaper&quot;:<br>            objects = output[set_key]<br>            labels = []<br>            for object in objects:<br>                labels.append(object.short_name())<br>            sets[set_key] = set(labels)<br><br>    api.visualize.venn_diagram(sets, params)<br>    <br># Listing the &quot;Younger, Better Cheaper&quot; Shortlists<br>for stat in outputs:<br>    top_10_list = api.scouting.top_10_by_stat(outputs[stat][&quot;younger_better_cheaper&quot;], stat, &quot;2024&quot;)<br>    params = {<br>        &quot;title&quot; : f&quot;Top Players by {stat}&quot;,<br>        &quot;signature&quot; : 
&quot;@JohnComFootball&quot;,<br>        &quot;exclude_photo&quot; : 1,<br>    }<br>    api.visualize.top_10_list(top_10_list, params)</pre><p><strong>Hollerbach’s Younger, Better &amp; Cheaper Replacement</strong></p><p>I wrote and ran the script above and within a few minutes, I had a series of Venn diagrams and lists that can help us identify Hollerbach’s replacement.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*oM0lwjMjscVyuFuvgCJFJQ.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*A1QJZPcglCcdSAJhzYwXrQ.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Qk1C5PI-GccRncnTfLdEdw.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*8AevxBiyIv7M-TmddaQeMA.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*bx2we5aO-CNuDbM4wtJCpQ.png" /></figure><p>Given that we are analyzing these players across five different statistics, I felt it was best to review the visualizations to see if any players continued to appear in multiple “Younger, Better, Cheaper” lists. LA Galaxy’s Gabriel Chaves (Pec) and (then) Sheffield Wednesday’s Djeidi Gassama appeared in 4/5 lists, while Jack Rudoni appeared in 5/5 lists.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*oejxOcnSIIg9yc7F3KAKKg.png" /><figcaption>Jack Rudoni, Djeidi Gassama and Gabriel Pec</figcaption></figure><p>Of these three players, I love Djeidi Gassama, but he just completed a move to Rangers and is likely off the table. Meanwhile, Jack Rudoni is being heavily pursued by Southampton and other Premier League clubs. He could be a difficult recruitment job for Union, but if Premier League interest fizzles, perhaps they can offer the allure of “Big 5 League Football” to beat out Southampton. Finally, Gabriel Pec, with 35 Goals + Assists in 38 matches, was vital to LA Galaxy’s 2024 MLS Cup championship. 
There are always massive question marks around an MLS player’s ability to make such a big jump in leagues, but his MLS production and playing profile, in my opinion, warrant a look.</p><p>Across the 2,000+ players analyzed, we quickly found three players who appear to be younger, better, and cheaper options worth scouting to see if they can replace Hollerbach. Remember, the goal of this exercise was not to find the perfect replacement using only a single magical Python script. The goal was to provide market clarity and cut through the noise, because after all, that’s when the best decisions can be made.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Machine Learning & Computer Vision, A Better Match Tactician Than Pep Guardiola]]></title>
            <link>https://medium.com/@johncomonitski/machine-learning-computer-vision-a-better-match-tactician-than-pep-guardiola-1b44924968e9?source=rss-e8cb1575753c------2</link>
            <guid isPermaLink="false">https://medium.com/p/1b44924968e9</guid>
            <category><![CDATA[computer-vision]]></category>
            <category><![CDATA[machine-learning]]></category>
            <category><![CDATA[premier-league]]></category>
            <category><![CDATA[football-analysis]]></category>
            <category><![CDATA[football-analytics]]></category>
            <dc:creator><![CDATA[Johncomonitski]]></dc:creator>
            <pubDate>Mon, 09 Jun 2025 01:33:30 GMT</pubDate>
            <atom:updated>2025-06-09T01:33:30.285Z</atom:updated>
            <content:encoded><![CDATA[<h4>Machine learning and computer vision have brought tracking data to the football masses, opening the door to a new data revolution in football</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*NEp8ozjqtzhDjVNPB4jxGg.png" /><figcaption>Player tracking data generated using Python and Roboflow’s computer vision libraries</figcaption></figure><p>For a long time, I had the idea to use machine learning, computer vision, and object tracking to generate player tracking data from match footage. With rapidly evolving open-source models like YOLO enabling accurate object detection, the technology seemed available. In theory, a YOLO model could track players on the pitch. Then, using linear algebra, their locations could be mapped onto a 2D plane, and suddenly you’ve created player tracking data.</p><p>The player tracking data utilized by many clubs at the top level of football is expensive and requires extensive infrastructure. Clubs are either outfitting all of their players with sophisticated GPS trackers or installing dozens of specialized and costly cameras around their stadiums (<a href="https://www.hudl.com/blog/machine-vision-the-future-of-player-tracking-systems-with-hudl">Hudl</a>). For this reason, player tracking data is monopolized, inaccessible and too expensive for most clubs. However, if someone completes this project idea using open-source software, suddenly even a Sunday league team could generate professional-level player tracking data with just a camera and a laptop.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/800/1*k6IhGVXmOaFNqf1YHcYMlg.jpeg" /><figcaption>Specialized player tracking cameras being installed on Brøndby’s training pitch (Hudl)</figcaption></figure><p>This is monopoly-shattering work, so the idea tempted me for years and I kept toying with it. 
I studied machine learning at university, I’ve built projects training custom models, and I’ve experimented with object detection. I felt capable of prototyping this project, but I lacked time (and, truthfully, probably the experience). As a result, this project idea lay dormant. That was until one tweet changed everything.</p><h4><strong>One Tweet Inspires a Challenge</strong></h4><p>Piotr Skalski, the Open-source Lead at Roboflow (<a href="https://x.com/skalskip92">@skalskip92</a> on Twitter), pulled off what I could only dream of. Piotr tweeted a video of his AI Computer Vision model that was able to take match footage, track the players, and then perform meaningful match analysis with the generated tracking data. He had the experience and the time (200 hours to be exact) that I lacked, and the project was impressive. Best of all, he published a tutorial on Roboflow’s YouTube channel as well as several notebooks with his training and tracking code.</p><iframe src="https://cdn.embedly.com/widgets/media.html?type=text%2Fhtml&amp;key=a19fcc184b9711e1b4764040d3dc5c07&amp;schema=twitter&amp;url=https%3A//x.com/skalskip92/status/1826693515189125433%3Fs%3D46%26t%3DSzgNUSDYManw_IwswDQ8zA&amp;image=" width="500" height="281" frameborder="0" scrolling="no"><a href="https://medium.com/media/7b2c8f82b6249d77a8057103d22d4757/href">https://medium.com/media/7b2c8f82b6249d77a8057103d22d4757/href</a></iframe><p>I was inspired and I took this as a sign from the algorithmic gods. I set myself a challenge to dedicate October to studying this project. Every day I’d spend 30 to 45 minutes studying Piotr’s tutorials, building an understanding of how his project worked, exploring Roboflow’s libraries, and building my own Football Computer Vision model. As part of my daily learning challenge, I would post what I learned, my updates, and my latest progress to a daily Twitter thread.</p><p>With my challenge set, I spent the next month learning. 
The first few days of the challenge were spent studying. I was watching tutorials and taking notes.</p><iframe src="https://cdn.embedly.com/widgets/media.html?type=text%2Fhtml&amp;key=a19fcc184b9711e1b4764040d3dc5c07&amp;schema=twitter&amp;url=https%3A//x.com/JohnComFootball/status/1842763543134793860&amp;image=" width="500" height="281" frameborder="0" scrolling="no"><a href="https://medium.com/media/9347ee7854be875c01570ba303fd507b/href">https://medium.com/media/9347ee7854be875c01570ba303fd507b/href</a></iframe><p>After a few days, I was doing simple tasks to build my familiarity with Roboflow’s tools. We had basic tracking and I was exporting 3d tracking data to a CSV.</p><iframe src="https://cdn.embedly.com/widgets/media.html?type=text%2Fhtml&amp;key=a19fcc184b9711e1b4764040d3dc5c07&amp;schema=twitter&amp;url=https%3A//x.com/JohnComFootball/status/1843818181439225915&amp;image=" width="500" height="281" frameborder="0" scrolling="no"><a href="https://medium.com/media/cf7a54dd49e0ebbeb28694bb62ff31ae/href">https://medium.com/media/cf7a54dd49e0ebbeb28694bb62ff31ae/href</a></iframe><p>There were also days I got my hands dirty with Roboflow’s libraries. 
I was writing code to take CSV tracking data and use that data to inflate their library’s tracking objects.</p><iframe src="https://cdn.embedly.com/widgets/media.html?type=text%2Fhtml&amp;key=a19fcc184b9711e1b4764040d3dc5c07&amp;schema=twitter&amp;url=https%3A//x.com/JohnComFootball/status/1849293845239923030&amp;image=" width="500" height="281" frameborder="0" scrolling="no"><a href="https://medium.com/media/2d6f53f08a8371524baec0c67d135de9/href">https://medium.com/media/2d6f53f08a8371524baec0c67d135de9/href</a></iframe><p>I even dove into the research of others who had built on top of Piotr’s work for better ball tracking.</p><iframe src="https://cdn.embedly.com/widgets/media.html?type=text%2Fhtml&amp;key=a19fcc184b9711e1b4764040d3dc5c07&amp;schema=twitter&amp;url=https%3A//x.com/JohnComFootball/status/1850541539115540489&amp;image=" width="500" height="281" frameborder="0" scrolling="no"><a href="https://medium.com/media/ae316424cca89b9e5bfeda6e8db56c67/href">https://medium.com/media/ae316424cca89b9e5bfeda6e8db56c67/href</a></iframe><p>Eventually, I had some tracking clips I was proud to present!</p><iframe src="https://cdn.embedly.com/widgets/media.html?url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DQPDWg-aExpo&amp;type=text%2Fhtml&amp;schema=youtube&amp;display_name=YouTube&amp;src=https%3A%2F%2Fwww.youtube.com%2Fembed%2FQPDWg-aExpo%3Ffeature%3Doembed" width="854" height="480" frameborder="0" scrolling="no"><a href="https://medium.com/media/c2e4a8aa2058b31b27d90d7aaf098065/href">https://medium.com/media/c2e4a8aa2058b31b27d90d7aaf098065/href</a></iframe><p>As October was ending my daily learning challenge was concluding. 
Not only had my ability and confidence with computer vision leapt forward, but I had written an application that generates clean and accurate player tracking data from match footage and then returns the data in a CSV format that follows a data standard set by Metrica, an industry leader in player tracking data.</p><p><a href="https://github.com/JohnComonitski/FootballTrackingDataGeneration">GitHub - JohnComonitski/FootballTrackingDataGeneration: Using machine learning and computer vision to generate football tracking data from match footage</a></p><p><em>If you’re interested in working with my code or following my latest developments, all the code used to generate the visualization above can be found in this Git repository!</em></p><p><strong>Football Computer Vision’s Big Picture Idea</strong></p><p>This will sound silly, but the big win is not the flashy tracking video. That’s the icing on the cake. Exporting the tracking data to a CSV that follows an industry standard such as Metrica’s is the cake. Just as my player tracking application stands on the shoulders of giants, so can my football analysis.</p><p>Professional analysts have been working with tracking data for years, and many of them have been incredibly generous and have published their analysis scripts. The <a href="https://www.youtube.com/@friendsoftracking755">Friends of Tracking</a> YouTube channel has a wealth of scripts you can run with minimal setup. Maybe you’re concerned about player fatigue and want to analyze how much distance a player has run in a match? There’s a script for that!</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*qsH1YKB1TAkZgKQn3aGvbA.png" /><figcaption>Friends of Tracking’s code to generate distance covered for each player from Metrica data.</figcaption></figure><p>Perhaps you want to understand the movements that allowed your team to concede a goal. 
Luckily, they have a script to visualize how every player on the pitch is moving moments before a goal is scored!</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*pMKuCjFXj80yVJOHaUrsWg.png" /><figcaption>Friends of Tracking’s code to plot the movement and events leading up to a goal from Metrica data.</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/962/1*euTlUsZM1y8IoOSylFGn1g.png" /><figcaption>Results of the goal event code.</figcaption></figure><p>Finally, let’s say you’re concerned about pitch control and want a Voronoi Diagram to describe pitch control at any given moment of the match. Again, Friends of Tracking has you covered!</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*jGX_emDCdVPPQWm-FzcKNw.png" /><figcaption>Friends of Tracking’s code to generate a Voronoi Diagram detailing pitch control from Metrica data.</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/950/1*PieLkBpsnd6M88u8IRrjxQ.png" /><figcaption>Results of the Voronoi Diagram code</figcaption></figure><p>By following a format that’s a common industry standard, there is a chance a professional analyst somewhere has already solved the football data problem you’re trying to solve, and if you’re lucky, there’s a script online to help you.</p><p>Imagine what a tool like this looks like in the hands of a small club. Perhaps this club invests the salary of one player in a small team of data analysts. Potentially for the price of a single player’s wage, they could hold an advantage over every other team in the league. It’s like buying a superstar for the wage of an average player.</p><p>They can now implement this tech and leverage the publicly available work of some of the best analysts in the world and then perform around-the-clock team and opposition analysis. 
Let’s say most clubs in your league have 1 or 2 opposition scouts spending 3 to 4 hours a day analyzing opposition match footage to prepare tactics for a match on the weekend. Meanwhile, your club’s machine learning match analyst system spends every second of every day analyzing opposition footage. That system then continuously sends your 1 or 2 opposition scouts new reports with the latest opponent tendencies and vulnerabilities it has found. Your human scouts can dive into these findings with a human eye and help your training staff build a world-class tactical plan.</p><p>The word “democratization” gets thrown around a lot in tech, but it is genuinely true that never in football history has such a large advantage been up for grabs. All it takes is for the right club to invest in it.</p><p><em>Interested in the match analysis scripts mentioned in this article? Check out my Match Analysis repository, where you’ll find a series of scripts analyzing match event data in conjunction with player tracking data. Note that these scripts are based on and utilize Friends of Tracking’s </em><a href="https://github.com/Friends-of-Tracking-Data-FoTD/LaurieOnTracking"><em>Metrica Data Analysis libraries</em></a><em>.</em></p><p><a href="https://github.com/JohnComonitski/FootballMatchAnalysis">GitHub - JohnComonitski/FootballMatchAnalysis: Football match analysis using football event and tracking data</a></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=1b44924968e9" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Data Driven Scouting With Python and Soccer API]]></title>
            <link>https://medium.com/@johncomonitski/data-driven-scouting-with-python-and-soccer-api-88570c59f592?source=rss-e8cb1575753c------2</link>
            <guid isPermaLink="false">https://medium.com/p/88570c59f592</guid>
            <category><![CDATA[soccer]]></category>
            <category><![CDATA[premier-league]]></category>
            <category><![CDATA[mls]]></category>
            <category><![CDATA[football]]></category>
            <category><![CDATA[football-analytics]]></category>
            <dc:creator><![CDATA[Johncomonitski]]></dc:creator>
            <pubDate>Sun, 23 Feb 2025 19:11:07 GMT</pubDate>
            <atom:updated>2025-02-23T19:11:07.317Z</atom:updated>
            <content:encoded><![CDATA[<p>In 2025, wannabe sports data analysts like myself are spoiled. Websites like FBRef, Transfermarkt, and Understat allow us to research players from almost any league in the world. It’s great for random research or digging into your club’s latest signing, but if you want to use these sites for proper data analysis, you need to dive headfirst into the world of programming and web scraping.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*-erwlgb0pI-t_tzYQumLtQ.png" /><figcaption>An FBRef player profile and Python code to scrape it.</figcaption></figure><p>For most people, that’s a bit too much to ask. Even as a professional software developer myself, I find web scraping tedious. Then, after an hour or two of teeth-pulling web scraping, there may be the additional challenge of combining the collected data with data you scraped from other sites. That’s why over the last year I created Soccer API, a Python library that seamlessly connects API-Football to several online data sources including FBRef, Transfermarkt, and Understat.</p><p>Soccer API’s backbone is a database that maps the API-Football, Transfermarkt, FBRef, and Understat IDs of over 82,000 players, 4,200 teams, and 200 leagues across world football. If you want to see Soccer API in action, I recommend you check out <a href="https://johncomonitski.com/mlsvstheworld/">MLS vs the World</a> or my <a href="https://johncomonitski.com/playerprofiles/">Player Profiler</a>. Soccer API has been a super powerful tool in my arsenal and over the past few months, I’ve made it even more powerful and easier to use by refactoring the API to be object-oriented. 
Most importantly, I’m making it publicly available!</p><p><a href="https://github.com/JohnComonitski/SoccerAPI">GitHub - JohnComonitski/SoccerAPI: Python library that allows for the easier collection of league, team and player data for use by amateur football data analysts.</a></p><h4><strong>We Need a Challenge!</strong></h4><p>As I prepare to publish Soccer API, I need a project that demonstrates its value. After some consideration, I settled on performing some data-driven scouting inspired by fellow Medium writer <a href="https://medium.com/@dhill19930104">Dhillon</a>. Using a data-driven approach, Dhillon scouted the Under 17 Euros to identify top prospects at key positions.</p><p><a href="https://medium.com/@dhill19930104/top-5-central-defensive-midfielders-from-euro-under-17-data-driven-de0948dd8ea5">Top 5 Central Defensive Midfielders from Euro Under 17 — Data Driven</a></p><p>In each article of Dhillon’s series, he would focus on a given position. The process began by identifying key metrics that define a high-quality performance at the given position. Then, he would graph those metrics across several scatter plots for all players who played that position at the Under 17 Euros. Finally, using these plots, he’d identify five players who consistently stood out. To conclude, he would provide a brief scouting report on those standout players.</p><p>My goal is to mimic Dhillon’s data-driven style of player scouting in as few lines of code as possible using Soccer API. The API will first use API-Football to collect every player from a series of leagues. Then, it’ll use their Transfermarkt market value data and FBRef statistics to analyze and visualize these players.</p><p>With our challenge set, we now need a player or position to scout for! I decided to fall back on the team I know the best, the Philadelphia Union. Last summer the Union sold their top striker, Julian Carranza, to Feyenoord Rotterdam. 
Six months later, the Union’s 2025 season is fast approaching and they’ve yet to replace their talisman, who registered 63 goal contributions in 95 matches with the club. Given Sporting Director Ernst Tanner is busy scouting the Serbian League for a new midfielder, I’ve decided Soccer API will take this job off his hands.</p><p><em>On a side note, it has become apparent that Ernst Tanner does not need my help. As I was getting ready to publish this article, the Union announced that they signed Uruguayan youth international </em><a href="https://x.com/MLS/status/1891869496689713611"><em>Bruno Damiani</em></a><em> for a club record fee.</em></p><h4><strong>What Kind of Player is Julian Carranza?</strong></h4><p>Before we can begin coding, it’s important to understand what kind of player we’re scouting for. So who is Julian Carranza as a player?</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*xG3a5qs1gZnxebUiDvbKwA.png" /><figcaption>Julian Carranza with the Philadelphia Union.</figcaption></figure><p>I’d sum up Carranza by saying he is a gegenpressing machine who constantly works to win back the ball. Reviewing his statistics, the data supports that. Carranza sits in the 94th, 99th, and 85th percentiles among attackers for Tackles, Interceptions, and Blocks, respectively. This defensive work rate is then converted into offensive productivity, as he finds himself in the 86th percentile for shots from defensive action.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*bE5ktFrmn1xgoauZ-QU88A.png" /><figcaption>Julian Carranza’s defensive statistics as a percentile in comparison to other attackers (FBRef).</figcaption></figure><p>Carranza isn’t an attacker who just remains high up the pitch eager to press keepers and centerbacks. Carranza is in the 90th percentile for touches in the defensive 3rd and averages roughly 17 touches per match in both the middle 3rd and attacking 3rd. He’s all over the pitch. 
It is important to note, however, that Carranza is neither a playmaker nor a major factor in the build-up. His passing, key passes per match, and carrying statistics are not particularly remarkable.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*7skaA-g25WHlNyZqR0vbEw.png" /><figcaption>Additional Julian Carranza statistics as a percentile in comparison to other attackers (FBRef).</figcaption></figure><p>In my opinion, Carranza is the kind of striker who’s a pest throughout the entire pitch. He’s constantly working to wear down the defense and win the ball back. Once the ball is won, he will participate in the build-up by nature of being in that part of the field, but is more focused on being ready to impact play in the final 3rd once the ball gets there. If I had to define Carranza’s game using metrics, these are the statistics I would build his profile around.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*uRsCUOcJvclnf-4iRPUxoA.png" /><figcaption>The key metrics I would use to identify Julian Carranza’s style of player.</figcaption></figure><h4><strong>Soccer API Recruitment</strong></h4><p>At this stage, we have a scouting challenge and we’ve identified the type of player we’re looking for. It’s time to open up VS Code and write some Python!</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*XjuKmB2Tr1h6Qw9zINht8g.png" /><figcaption>Setting up Soccer API and prepping leagues to be scraped.</figcaption></figure><p>I started by importing the Soccer API library and fetching the two leagues I planned on scouting. I opted to scout MLS and the Championship for Carranza’s replacement. The decision was made because FBRef has advanced analytics for these leagues and an MLS team could realistically recruit from teams in these leagues. 
Together these two leagues give us 1,485 players to sift through, so we’re going to need to filter some players out.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*0TtbmiBYFKZzs4otdKt6kA.png" /><figcaption>Getting players from MLS and the Championship and filtering out players we are not interested in.</figcaption></figure><p>The filter I wrote looks for players who are forwards. From there, I decided to only look at players who are under 24 years old and have a market value below 3 million Euros. This decision was informed by the Union’s usual MO of buying players cheap with an eye towards developing and potentially flipping them. Finally, I excluded all players who had played fewer than 15 matches. After running our filter, we were left with 49 candidates who could potentially replace Carranza.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*mYssmiru052zJIwF8Qy9cQ.png" /><figcaption>Visualizing our 49 recruitment candidates in a series of scatter plots based on a series of metric pairs.</figcaption></figure><p>Our final step is to take these players and start visualizing their metrics. I took our metrics and grouped them into a series of metric pairings. I then generated a scatter plot for each pairing of metrics using our potential recruits. With our plots generated in only 32 lines of code, we can dig through them to find our Carranza replacement.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*swRm9kxdBpMfm_swKxX0JQ.png" /><figcaption>After a few minutes, Soccer API has generated 9 scatter plots we can dig into to scout our Julian Carranza replacement.</figcaption></figure><h4>Data Driven Scouting</h4><p>Reviewing our visualizations, the first thing we need to consider is which players have a respectable Non-Penalty xG. Remember, we are looking for an attacker. I don’t care how many Tackles, Interceptions, and Pressures they have; if an attacker can’t score goals, they’re not very useful to us. 
The next thing we need to consider is which players have a high defensive work rate while being involved in winning the ball back. With this in mind, these 3 scatter plots felt the most relevant to what we’re looking for.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*gHxFxW9yswfLVJf7n7-_Tw.png" /><figcaption>Three scatter plots I felt were most relevant in identifying who could replace Julian Carranza.</figcaption></figure><p>Reviewing the plots, the first name to stick out to me was Sunderland’s Eliezer Mayenda. He’s young, only 19 years old, and quite affordable (1 Million Euro Transfermarkt Market Value). Of all the players we plotted (excluding Carranza), he has the strongest Non-Penalty xG per match. Additionally, he produces tackles at a similar rate to Carranza, while being more involved in the penalty box.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*v7ZNH75zLKfKFDV_EN1KBA.png" /><figcaption>Eliezer Mayenda proved to have one of the best Non-Penalty xG figures in this player dataset while recovering the ball more often than Carranza</figcaption></figure><p>The most glaring deficiency in his game (in comparison to Carranza) is where he’s applying his defensive work rate. It’s certainly there, but only in the attacking third. Carranza was putting in tackles and pressing all over the pitch. He didn’t care where the ball was. He wanted to win it back. Perhaps, being young, Mayenda could be coached up to start pressing more in the middle third of the pitch and eventually drop farther down the pitch and press even deeper. 
In my opinion, a 19-year-old who has proven he can contribute goals in the Championship and is willing to put in a hard tackle is a promising prospect that the Union could potentially mold and develop to fit their system.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*5z3eB9dCaQehAxne5luH8Q.png" /><figcaption>While Mayenda does recover the ball frequently and is active in the penalty box, he generally stays high and is not involved defensively.</figcaption></figure><p>Another player the Union could also consider is Charlotte FC’s Kerwin Vargas. From a pressing and defensive perspective, Vargas is a machine. He’s going to be giving centerbacks and midfielders headaches all game. No attacker is more involved defensively and he’s involved in every third of the pitch.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*hDjif7QOc5XWm-wHCAmsxQ.png" /><figcaption>Vargas is active all over the pitch and putting in tackles at an incredibly high rate for an attacker.</figcaption></figure><p>Unfortunately, as I mentioned prior, strikers are not valued for their defensive output. They are valued for their goals. His offensive output leaves a lot to be desired: 10 goals and 6 assists in 75 matches. It’s not great, but then again, Carranza’s attacking output was a major disappointment at Inter Miami (3 goals in 42 matches), before they loaned him to the Philadelphia Union. Perhaps, if he has someone like Gazdag feeding him opportunities, he could step up his production.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*12aYHgpKQzXggnfQd5SBJA.png" /><figcaption>Vargas’ work rate is there, but his production per match needs to be improved.</figcaption></figure><p>In many ways, I see a lot of Carranza in Vargas. 
Tons of work rate, underperforming at his current club (perhaps Charlotte are willing to dump him or loan him for cheap), and maybe a new location where he’s supported by the league’s best chance creator will do him wonders.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*_C8L18KfiPCxUugKAaHwog.png" /><figcaption>Kerwin Vargas and Eliezer Mayenda</figcaption></figure><p>Looking at both Vargas and Mayenda, neither is a perfect fit, but I think these are two solid candidates within the Union’s budget that match the player profile of Carranza.</p><h4><strong>Soccer API</strong></h4><p>In just 32 lines of code, Soccer API enabled me to scour Transfermarkt, FBRef, and 1,485 players so I could scout a single player’s replacement. If I wanted to do this challenge by hand, we’re talking maybe 10 to 20 tedious hours of digging through these websites myself and manually entering data into a spreadsheet. Soccer API didn’t reinvent the wheel here and I understand professional scouts probably have much more sophisticated tools, but for the average amateur scout or soccer analytics enthusiast like myself, Soccer API may prove to be an invaluable tool.</p><p>If anyone’s interested in working with a simple Python API that marries the data of 82,000 players, 4,200 teams, and 200 leagues across many of the most popular football analytics websites, feel free to install it and give it a try! Simply install git and run…</p><pre>git clone https://github.com/JohnComonitski/SoccerAPI.git</pre><p><em>It is worth mentioning you will need </em><a href="https://rapidapi.com/api-sports/api/api-football"><em>API-Football</em></a><em> for many of the features, including player profiles and fixture statistics, but they do offer a free tier!</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=88570c59f592" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[The USMNT’s 2024 DVpC and Why the Pulisic Might Be Overrated]]></title>
            <link>https://medium.com/@johncomonitski/the-usmnts-2024-dvpc-and-why-the-pulisic-might-be-overrated-f7bf17ee1ac3?source=rss-e8cb1575753c------2</link>
            <guid isPermaLink="false">https://medium.com/p/f7bf17ee1ac3</guid>
            <category><![CDATA[premier-league]]></category>
            <category><![CDATA[us-soccer]]></category>
            <category><![CDATA[mls]]></category>
            <category><![CDATA[pulisic]]></category>
            <category><![CDATA[usmnt]]></category>
            <dc:creator><![CDATA[Johncomonitski]]></dc:creator>
            <pubDate>Sun, 23 Jun 2024 17:50:05 GMT</pubDate>
            <atom:updated>2024-06-23T17:50:05.944Z</atom:updated>
            <content:encoded><![CDATA[<h4>A dive into USMNT’s 2024 Defensive Value statistics and the truths it reveals about this record-setting season for US Internationals</h4><p><strong>Looking Back On the 2023/24 Season for</strong></p><p>The 2023/24 season has concluded and it was a year with many ups and downs for the US Men’s National Team (USMNT). For Christian Pulisic and Weston McKennie, it was a comeback year where they both proved they belonged at the highest level of the game. For other players such as Folarin Balogun, who was previously skyrocketing in market value, it was a humbling season.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*JPLPzr2vhJKPhbJwkwnWuw.png" /><figcaption>American Internationals Christian Pulisic, Weston McKennie and Folarin Balogun.</figcaption></figure><p>With a whole season to look back on, it’s only natural for fans to compare and debate the performances of our players across various European leagues. The problem is that cross-league comparisons feel like an impossible challenge. For example, how do we compare the production of Balogun’s 12 Ligue 1 Goal Contributions versus Haji Wright’s 23 Championship Goal Contributions?</p><p>One year ago, Balogun was the US’s clear number one up top. After a regressive and difficult 2023/24 season, has that changed? He’s playing in a high-quality league, but Haji Wright was incredibly productive in the Championship. Haji’s stock is at an all-time high and many fans feel he should be rewarded by becoming the US’s new number one up top. 
That feels fair, go with the hot hand, but if we plopped Haji into Ligue 1, would we see a similar fall-off to Balogun’s?</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*q7YyDe_OIBSlkIenAodLXg.png" /><figcaption>Comparing Folarin Balogun’s Ligue 1 performances to Haji Wright’s Championship performances.</figcaption></figure><p>This is a difficult question to answer and these debates get even more difficult when the perceived gap in league quality gets smaller. It’s easy to say a goal in the Eredivisie is more valuable than an MLS goal, but how much more? 2 times more? 3 times more? I would call this an impossible question, but I think the answer may lie in calculating a player’s Defensive Value per Contribution (DVpC).</p><p><strong>What is Defensive Value per Contribution?</strong></p><p>Defensive Value per Contribution (DVpC) is a statistic I created to enable the cross-league comparison of goals and assists. I’ve explained it in great depth in this article:</p><p><a href="https://medium.com/@johncomonitski/an-introduction-to-defensive-value-per-contribution-bf8caec9f3f0">An Introduction to Defensive Value per Contribution</a></p><p>To find a player’s DVpC, you must first understand Defensive Value (DV). When a goal or assist is recorded, if you sum the Transfermarkt (TM) Market Value of the goalkeeper and all the defenders on the field at the time of the goal contribution, you have the DV of that goal. DV can be used to argue, for example, that a hypothetical player’s goal against Manchester City’s €222 million defense is more impressive than a player’s goal against Sheffield United’s €33.2 million defense. This argument could of course be missing a whole tactical breakdown worth of nuance and context, but I think DV shines when averaged over the total goal contributions from a player in a given season. 
Doing so will yield a player’s DVpC.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Zcb9I2CqmXTyFzuVsTEM5g.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*RDH2_OPPAEOGTJPFlpye1w.png" /><figcaption>Manchester City’s Defensive Value and Sheffield United’s Defensive Value on the final match day of the 2023/2024 season. Data was provided by Transfermarkt.com</figcaption></figure><p>To calculate DVpC, you must sum the DV of every goal or assist a player has in a season. Then divide that sum by the player’s total number of goal contributions from that season. By averaging the DV of a player’s goal contributions, you can assign a quantifiable value to the type of opposition a player is capable of scoring or assisting against. The average can serve as a useful cross-league comparison.</p><p>A quick note on using TM Market Values in player analysis. I’m sure many will be quick to point out that TM Market Values are completely arbitrary. Don’t worry, I’ve also explained my reasoning for using TM Market Values in a previous article:</p><p><a href="https://medium.com/@johncomonitski/a-defense-of-transfermarkt-market-values-in-football-analytics-c819aa954eb6">A Defense of Transfermarkt Market Values in Football Analytics</a></p><p>Yes, I thought ahead!</p><p><strong>Introducing! 
A New and Improved Dashboard!</strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*JIsphELCW94NGppIDAvdqw.png" /><figcaption>The <a href="https://public.tableau.com/app/profile/john.comonitski/viz/2024USMNTDVpC/AllPlayers?publish=yes">2024 USMNT DVpC Dashboard</a>.</figcaption></figure><p>Now that we’re all (hopefully 🤞) up to speed on DVpC, I am proud to present the 2024 USMNT DVpC statistics via a new and improved Tableau dashboard!</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*qCSqdGGm5vD9Sd81lsikJg.png" /><figcaption>Last season’s USMNT DVpC Dashboard alongside this season’s USMNT DVpC Dashboard</figcaption></figure><p>Last year’s dashboard was specifically created to teach me Tableau. It’s… it’s um… a dashboard. Is it a good dashboard? Well, let’s just move on from that question and look at <em>this year’s</em> dashboard! Over the past year, I’ve gotten much more comfortable working in Tableau and now I have a dashboard I’m actually proud of. I must of course give credit to Thanoshaan Thayalan’s <a href="https://public.tableau.com/app/profile/thanoshaan.thayalan/viz/CarloAncelottisRealMadridAnalysis/CornerBreakdownDashboard">Real Madrid Analysis</a> dashboard, which was a major inspiration.</p><p>The Defensive Value data presented in this dashboard was collected using a Python script. My script first used <a href="https://rapidapi.com/api-sports/api/api-football">API-Football</a> to find the match history of every American international. Once I had the match history, I iterated over it to find the matches in which US internationals scored or assisted. If a player contributed to a goal, I used API-Football’s match line-up data to identify which players were in defense and on the pitch at the time that goal was scored. With the defense identified, I web-scraped the TM Market Values of those defenders and the goalkeeper to calculate the DV of that goal contribution. 
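</p><p>The last step of that pipeline can be sketched in a few lines of Python. This is a minimal illustration, not the actual script: the <code>GoalContribution</code> class, its field names, and every number below are hypothetical, and it assumes the line-up lookup and the market-value scraping have already produced a list of values for the opposing goalkeeper and defenders.</p>
<pre>
```python
from dataclasses import dataclass


@dataclass
class GoalContribution:
    """One goal or assist, plus the opposing back line on the pitch at that moment."""
    player: str
    opponent: str
    # Hypothetical input: market values (in € millions) of the opposing
    # goalkeeper and defenders, as they might come out of the scraping step.
    defender_values: list


def defensive_value(contribution):
    """DV of one contribution: summed value of the goalkeeper and defenders."""
    return sum(contribution.defender_values)


def dvpc(contributions):
    """DVpC: total DV divided by the number of goal contributions."""
    if not contributions:
        raise ValueError("DVpC is undefined without any goal contributions")
    total_dv = sum(defensive_value(c) for c in contributions)
    return total_dv / len(contributions)


# Made-up numbers for illustration only, not taken from the dashboard.
season = [
    GoalContribution("Player A", "Club X", [20.0, 35.0, 40.0, 30.0, 25.0]),
    GoalContribution("Player A", "Club Y", [2.0, 5.0, 8.0, 4.0, 1.0]),
]
print(dvpc(season))  # (150.0 + 20.0) / 2 = 85.0
```
</pre>
<p>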
I did this for every goal contribution by every US international this season and in the end, I had a CSV file perfect for Tableau to run with.</p><p>The 2024 USMNT DVpC Dashboard is broken into three sections. The first section is a big-picture look at the offensive productivity of US Internationals this season. Here you’ll find a breakdown of how each American performed this season in their various competitions, as well as Defensive Value statistics to provide context for goal contributions.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*JIsphELCW94NGppIDAvdqw.png" /><figcaption>Page 1 of the 2024 <a href="https://public.tableau.com/app/profile/john.comonitski/viz/2024USMNTDVpC/AllPlayers?publish=yes">USMNT DVpC Dashboard</a> presenting the entire offensive productivity of American internationals.</figcaption></figure><p>Moving on to section two, you will find the individual statistics of various players. You can select which player you would like to break down and get a more detailed look into their performances this season.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*e-90jHN5H3vZC1BdueCKIg.png" /><figcaption>Page 2 of the 2024 <a href="https://public.tableau.com/app/profile/john.comonitski/viz/2024USMNTDVpC/AllPlayers?publish=yes">USMNT DVpC Dashboard</a> providing a look into the productivity and defensive value statistics of individual Americans.</figcaption></figure><p>Finally, the third section is simply a data table with the raw data powering this dashboard. 
For the data nerds who want all this data in a single table, this is the page for you.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*LMvbEF8M11y8AGiGbHmFGg.png" /><figcaption>Page 3 of the <a href="https://public.tableau.com/app/profile/john.comonitski/viz/2024USMNTDVpC/AllPlayers?publish=yes">2024 USMNT DVpC Dashboard</a>, which features a raw table with all the statistics powering this dashboard.</figcaption></figure><p>All of this is publicly available for you to explore on my <a href="https://public.tableau.com/app/profile/john.comonitski/viz/2024USMNTDVpC/AllPlayers?publish=yes">Tableau</a>!</p><p><strong>Exploring the 2023/24 USMNT DVpC Data</strong></p><p>When exploring this season’s USMNT Defensive Value statistics, the first player of note is Burnley’s Luca Koleosho. Luca had the highest DVpC of all US-eligible players at €133.45 million. This would be impressive, but when you dig into the data, you quickly discover this statistic is composed of a single assist in a 5–2 loss to Spurs and a goal against bottom-of-the-table Sheffield United. Despite what the data says, Koleosho did not have the most impressive goal-scoring record of all US-eligible players this season. This is where DVpC falls flat on its face.</p><p>Luca’s assist against Spurs was very much a shooting star moment that catapulted him to the top of the DVpC charts. Brenden Aaronson topped this same statistic last year, because he scored a goal against Chelsea and did very little after that. I’m highlighting Koleosho (at the peril of undermining my own statistic), because I think it’s important to note that this data is best used within the correct context. DVpC is most useful when analyzing players who have a large body of goal contributions. A great example of this is Pulisic.</p><p>With a total Defensive Value of €707.5 million across all competitions, Pulisic has topped the 2023/24 Total Defensive Value Rankings. This was to be expected. 
Pulisic had a great season and only Haji Wright contributed more goals than him.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*8-183WiLP22_fKEyuKCPag.png" /><figcaption>Pulisic’s Serie A statistics.</figcaption></figure><p>Pulisic deserves his flowers for this great season, but I think there is a second story to be told here. A lot of Pulisic’s production appears to have come from beating up on the lesser Serie A sides. Pulisic only had 1 goal and 1 assist against a top 6 team in Italy. This isn’t to write off this season’s accomplishments, but Pulisic’s DVpC did fall below most of his American peers in the top 5 leagues. When you consider Pulisic’s relatively low DVpC over a large body of goal contributions, it suggests that perhaps the upper level of Serie A defenses is his ceiling.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*4Qy5U4aGJby3Uv3Wt-KJGA.png" /><figcaption>The DVpC of American internationals in Top 5 Leagues</figcaption></figure><p>One player who had a higher DVpC than Pulisic is Folarin Balogun. This could speak to Pulisic smurfing on weaker Serie A squads or the quality of the French league being better than fans think. Regardless, I find Balogun’s high DVpC interesting, because many have described this season as a slump. Maybe this slump is not as dramatic as we think.</p><p>I’d like to conclude with two players who I think deserve praise: Malik Tillman and Haji Wright. Both Malik and Haji made jumps to stronger leagues and improved on their previous season’s productivity. Last season Malik was a constant threat for Rangers in the Scottish Premiership. There’s a lot of prestige in playing for the Old Firm, but despite playing for Rangers, Malik’s DVpC statistics weren’t particularly impressive. He had 14 Goal Contributions at a meager DVpC of €1.5 million. For this reason, I was concerned about his jump to PSV. 
The Eredivisie is far more competitive top to bottom than the Scottish Premiership. I, thankfully, will eat my words. Malik not only rose to the challenge, but raised his level. In the Eredivisie, Malik had 9 Goals and 10 Assists at a DVpC of €14.3 million.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Pcxxi2MkHxkfMiAvkVSx5g.png" /><figcaption>Malik Tillman’s offensive productivity this season compared to last season.</figcaption></figure><p>Haji Wright found himself in a similar situation. He spent the previous season in the Turkish Super League, where he had 18 contributions at a DVpC of €3 million. In the summer, Haji made the move to Coventry City in the Championship. After a slow start, Haji finished the season red hot, scored an FA Cup semi-final goal and contributed to 23 goals at a DVpC of €16.2 million. Not only did Haji see a massive leap in both goal productivity and DVpC, but he was also 2nd in this season’s Total Defensive Value Rankings. This is inflated by 3 FA Cup goals against Premier League opposition (including a goal against Manchester United), but that sort of accomplishment from a Championship player deserves its praise!</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*CfsekPOGGqjf3lqSHuBsNA.png" /><figcaption>Haji Wright’s offensive productivity this season compared to last season.</figcaption></figure><p>I don’t have the historical data to back this statement up, but this was likely the most productive season Americans have ever had playing abroad. Americans playing in Europe had the 3rd most goal contributions out of any nationality not from Europe! 
If you want to dive further into this season’s output from Americans, I encourage you to dig into my <a href="https://public.tableau.com/app/profile/john.comonitski/viz/2024USMNTDVpC/AllPlayers?publish=yes">2024 USMNT DVpC Dashboard</a>!</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=f7bf17ee1ac3" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Using K-Means Clustering to Tier List World Football]]></title>
            <link>https://medium.com/@johncomonitski/using-k-means-clustering-to-tier-list-world-football-e57e9ca4e7b7?source=rss-e8cb1575753c------2</link>
            <guid isPermaLink="false">https://medium.com/p/e57e9ca4e7b7</guid>
            <dc:creator><![CDATA[Johncomonitski]]></dc:creator>
            <pubDate>Sat, 11 May 2024 17:01:45 GMT</pubDate>
            <atom:updated>2024-05-13T12:23:21.694Z</atom:updated>
            <content:encoded><![CDATA[<h4>How I used K-Means Clustering and Transfermarkt Market Values to cluster 26 leagues across the world and sort them into a Tier List.</h4><p><strong>Is This League Quality? Or Is The League A Farmers League?</strong></p><p>There are too many leagues! With nearly every country having multiple professional football leagues, how does one even begin to compare and rank various leagues? Most football fans would probably point to there being the “Big Four Leagues” (Premier League, La Liga, Bundesliga &amp; Serie A) followed by everyone else. Some may even include Ligue 1 and call it the “Big Five Leagues”. Let’s also consider the Premier League on its own. With all the money circulating around the Premier League, should it be in a class of its own? Or, maybe, given the Premier League’s recent European failures, is it a “farmers league”?</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*PESwrQTjLO8l65BhKR4HQQ.jpeg" /><figcaption>A meme mocking the Premier League’s 2024 results in European Competitions</figcaption></figure><p>These are the kinds of questions football fans could debate endlessly in a bar. Then, after several pints, a lot of shouting, and a vociferous defense of MLS by yours truly, we’d probably be nowhere near an answer.</p><p>Luckily, statistics offers us a solution! K-Means clustering is a statistical technique used to group similar items into K clusters, where K represents the desired number of clusters.</p><p>K-Means clustering allows us to cluster a set of leagues into distinct groups and identify which leagues statistically belong in a class together. All we need is a metric that can be used to directly compare two leagues. 
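</p><p>To make the idea concrete, here is a minimal sketch of K-Means on a single metric, written in plain Python. The function name <code>kmeans_1d</code>, the way the starting centroids are picked, and the ten values are all assumptions made for illustration: they are not real league market values, and a library implementation would likely initialize its clusters differently.</p>
<pre>
```python
def kmeans_1d(values, k, iterations=100):
    """Lloyd's algorithm on a list of numbers; returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic start (an assumption of this sketch): spread the
    # initial centroids evenly across the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        if new_labels == labels:
            break  # no value changed cluster, so the result is stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return centroids, labels


# Invented values (think "billions"), one per hypothetical league.
league_values = [10.0, 5.0, 4.5, 4.0, 3.5, 1.5, 1.2, 1.0, 0.5, 0.3]
centroids, labels = kmeans_1d(league_values, k=3)
print(labels)  # [2, 1, 1, 1, 1, 0, 0, 0, 0, 0]
```
</pre>
<p>With these invented values and K = 3, the single largest value ends up in a cluster of its own, while the next four values group together.</p>
<p>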
This is where one of my favorite metrics, Transfermarkt Market Values, comes into play.</p><p><strong>Scraping League Market Values</strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/354/1*NaD5thUK186-GuCHs5vDbg.png" /><figcaption>Transfermarkt.com is a popular site for tracking the rises and falls of player market values.</figcaption></figure><p>Finding a data point to compare one league to every other league in the world is a tough ask. The best effort, to my knowledge, is the Transfermarkt (TM) Market Value. A league’s TM Market Value is the sum of the estimated market values of every player in that league. I will admit that TM Market Value is a flawed metric, but anyone who has read my work before will know that I believe TM Market Values, when used in the right context, can serve as a useful cross-league analysis metric. I’ve defended it already in a previous article, but to summarize my stance, TM Market Values serve as a useful representation of the general perceived quality of a league in a global context.</p><p><a href="https://medium.com/@johncomonitski/a-defense-of-transfermarkt-market-values-in-football-analytics-c819aa954eb6">A Defense of Transfermarkt Market Values in Football Analytics</a></p><p>In order to organize leagues using K-Means Clustering, I needed to scrape the TM Market Value of as many leagues as possible. A big factor in which leagues I collected market values from was ease of access. I already maintain a private database that maps player IDs, team IDs, and league IDs across various sites and APIs. Using this database and Python, I wrote a script to scrape the TM Market Value of all the leagues in my database that have a mapping between <a href="https://www.api-football.com/">API-Football</a>’s League IDs and TM league IDs.</p><p>In total, I collected market values for 26 different leagues across UEFA, Concacaf, CONMEBOL, and the AFC. It must be stated that this list is not comprehensive nor was it scientifically produced. 
The CAF leagues are a glaring omission from my list. The Argentine Primera División also, somehow, escaped my data scraping. With the market values of 26 different leagues scraped and collected, my next step was to graph all 26 leagues’ market values across a line.</p><p><strong>Visualizing the Wealth of 26 Leagues</strong></p><p>Viewing all 26 leagues on a graph very quickly gives you a feel for the distribution of wealth across world football. The Premier League exists in a class of its own on the far right of the line. Meanwhile, the remaining three of your “Big 4 Leagues” (and Ligue 1) occupy the center of the line. Finally, the remaining leagues are spread out across the left side of the line.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Fn2znM1628cl0LbluvtADg.png" /><figcaption>26 Leagues plotted on a line based on their total Transfermarkt Market Value.</figcaption></figure><p>With every league graphed onto a line, it was time to begin clustering our leagues. I graphed the same data along 8 different whole-number Y-axis values, where the Y-axis value is the number of clusters (K) that we will group our data into.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/600/1*Ax9QTMVQhtGLOyIe01uq5Q.png" /><figcaption>The 26 Leagues Plot repeated over 8 y-axis values</figcaption></figure><p>After applying K-Means clustering to each set of data along a given Y-axis value, we are left with eight different ways to cluster 26 leagues. Note that the color of the data points (leagues) denotes the cluster that league belongs to.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/600/1*LPmeNjDYqzQb9KrCm12ysQ.png" /><figcaption>26 Leagues plotted and clustered with 8 different applications of K-Means Clustering</figcaption></figure><p><strong>Clustering 26 Leagues in K Groups</strong></p><p>What you are looking at is all 26 Leagues with K-Means Clustering applied eight times. 
Y-axis value 1 (K = 1) is the data clustered into 1 group (essentially no clusters), while Y-axis value 8 (K = 8) is the same data clustered into 8 groups.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/600/1*LPmeNjDYqzQb9KrCm12ysQ.png" /><figcaption>26 Leagues plotted and clustered with 8 different applications of K-Means Clustering</figcaption></figure><p>Viewing all the various clusterings together reveals interesting insights. For starters, it’s clear as day that the Premier League is in a class of its own. Except for K = 1 and K = 2, every value of K places the Premier League in a cluster of one. I knew the Premier League was the best league in the world, but seeing exactly how far it is from its closest competitor, La Liga, is shocking. It should still be noted, though, that K = 2 clusters did not separate the Premier League from every other league. It was instead clustered together with the other “Big 5 Leagues”. This is interesting, because it means that K-Means Clustering agrees with the “Conventional Wisdom” among football fans, who support the notion of there being 5 main top-class leagues.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*qddtOWS6reavVm-NoXggmQ.png" /><figcaption>26 Leagues plotted and clustered using K-Means Clustering with K = 3 and K = 4</figcaption></figure><p>In my opinion, clustering the data into K = 3 and K = 4 clusters was the most interesting way to divide the data. K = 3 clusters presents the Premier League as the top dog, but also reinforces the notion of the “Big 5 Leagues”, which is inclusive of Ligue 1. Next, when we expand to K = 4 clusters, one might expect Ligue 1 and Serie A to be separated from the Bundesliga and La Liga and placed into different clusters. Instead, the Premier League remains in a cluster of one, while the remaining four “Big Five Leagues” remain in a cluster together. 
K-Means Clustering would prefer to divide the remaining smaller 21 leagues further, rather than break up the remaining four “Big Five Leagues”.</p><p>I believe the takeaway here is that the “Big Five Leagues” isn’t something Ligue 1 fans say to make themselves feel good. The data backs them up. Anecdotally, there does seem to be a pipeline of obscure Ligue 1 players (such as N’Golo Kanté) succeeding in the Premier League. Perhaps Ligue 1 does not get enough credit.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/600/1*NUQzbitoTzpcGfvOZvrBaw.png" /><figcaption>21 Leagues (“Big 5 Leagues” excluded) plotted on a line based on their total Transfermarkt Market Value.</figcaption></figure><p>Outside the top 5 leagues, every other league is quite far behind. In fact, nearly 2 billion dollars in player value behind. With such a large gap, I felt it was worth re-clustering the leagues outside the top 5 leagues. After reviewing the results, my first takeaway was that the Portuguese league is the “best of the rest”. This came as a surprise, but the league has been producing great talent of late.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/600/1*XYxw4JZjqWogFWay5JAhZQ.png" /><figcaption>17 Leagues plotted and clustered with 8 different applications of K-Means Clustering</figcaption></figure><p>These leagues unfortunately do not fit themselves nicely into new clusters. If we do break these leagues into K = 2 clusters, it’s interesting that the Championship is the only second division league in the best cluster. We can call this cluster the “Best of the Not Top 5 Leagues”. Meanwhile, Liga MX and MLS just miss out on being in the “Best of the Not Top 5 Leagues”.</p><p><strong>Building a League Tier List</strong></p><p>After looking at all the various clusterings, I believe the data is best divided into 5 clusters. For fun, I’ve gone ahead and broken it down into a tier list. 
It looks like this: <strong><em>S — The Premier League</em></strong>, <strong><em>A — The (Next) Big 4</em></strong>, <strong><em>B — Mid Tier Top Flights (and the Championship)</em></strong>, <strong><em>C — 2nd Division Leagues (and Friends)</em></strong> and finally <strong><em>D — The Rest</em></strong>.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*OOgRjnOWZV4XiIIj6mPGbg.png" /><figcaption>My final Tier Ranking of 26 football leagues based on a K-Means Clustering of K = 5</figcaption></figure><p>This tier list is not the law of the land. It’s missing leagues such as Argentina’s top flight, CAF representation and broader AFC representation (I’d be very interested to see where the Saudi Pro League falls). Not to mention, TM Market Values are an imprecise and imperfect metric for valuing leagues.</p><p>Sadly, a tier list like this falls flat in the face of recent anecdotal evidence. The Premier League has struggled in European competition this year, which would argue against the Premier League’s placement in a class of its own. A possible explanation could be that a squad’s wealth has a logarithmic impact on its strength.</p><p>In my opinion, this data serves as an additional perspective on how to compare the quality of leagues. There are very few available resources to quantify, for example, how much better the Bundesliga is than Australia’s A-League. We know it’s better, but by how much? Using classes, statistically generated through K-Means Clustering, can at least provide context by finding similar-level leagues that Bundesliga fans may be more used to. They’ve maybe never seen an A-League match, but they can probably seek out a Ligue 2 match for reference. 
When statistically generated, a silly tier list like the one I made could actually be of some value to a scout trying to convince an executive to take a chance on a player from a more obscure league.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=e57e9ca4e7b7" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Can Re-signing A Top Player Doom A Club?]]></title>
            <link>https://medium.com/@johncomonitski/can-resigning-a-top-player-doom-a-club-167a4f90c786?source=rss-e8cb1575753c------2</link>
            <guid isPermaLink="false">https://medium.com/p/167a4f90c786</guid>
            <category><![CDATA[soccer]]></category>
            <category><![CDATA[philadelphia-union]]></category>
            <category><![CDATA[mls]]></category>
            <category><![CDATA[scouting]]></category>
            <dc:creator><![CDATA[Johncomonitski]]></dc:creator>
            <pubDate>Sat, 09 Mar 2024 22:37:18 GMT</pubDate>
            <atom:updated>2024-03-09T22:37:18.926Z</atom:updated>
            <content:encoded><![CDATA[<h4>Is it possible to have a player that’s too good for your team? Perhaps even too good for the league? As the salary cap looms ominously over every MLS team, the Philadelphia Union’s Kai Wagner may have boxed himself and his club into a corner by being too good and deserving too much money.</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*o3q5-0jl4LlEof4K1Kj3cg.jpeg" /><figcaption>Kai Wagner, left fullback for the Philadelphia Union and one of the top wingbacks in MLS</figcaption></figure><p><strong>Should the Union Have Re-signed Kai Wagner?</strong></p><p>After joining MLS as an unheard-of, unproven, 3. Bundesliga defender, Kai Wagner emerged as one of the most productive wingbacks in MLS. The r/MLS thread that announced his transfer from the Würzburg Kickers to the Philadelphia Union shows that fans were less than impressed at the time.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/824/1*quaRLF3K10rACQnngUgeEw.png" /><figcaption>Comments from the original r/MLS thread announcing the Union had signed Kai Wagner</figcaption></figure><p>Kai quickly silenced doubt among the Union faithful. With two MLS All-Star honors to his name, inclusion in the 2022 MLS Best XI, and a remarkable MLS record of 15 assists by a defender in a single season, he became a signing some sporting directors only make in their wildest dreams.</p><p>With a meteoric rise like Kai’s, it was only natural he’d be the subject of constant transfer rumors and speculation. As the club entered the 2023 season, Kai also entered the final year of his contract with Philadelphia. Fans were shocked to learn that, while the Union did offer Kai a new contract, he rejected that contract. Then, rather than return with a new offer, Philadelphia decided to let him walk. Fans’ shock quickly turned to outrage as they then protested what they believed to be another sign that the Union ownership prioritizes frugality over success. 
They chanted “Pay Kai Wagner” game after game (<a href="https://twitter.com/JTansey90/status/1710809332634333426">Pay Kai Wagner chant</a>).</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*XfwH-u7KVDEZpRJ1hobkFA.png" /><figcaption>For nearly every game after it was announced that Kai Wagner, who wears #27, would not be returning, fans chanted “Pay Kai Wagner” in the 27th minute of the match.</figcaption></figure><p>An important aside: support for Kai quickly diminished among fans after Kai was suspended by the MLS Disciplinary Committee for three games due to an incident involving New England’s Bobby Wood, where he violated MLS’s on-field anti-discrimination policy. While his racial abuse of Bobby Wood is not the subject of this article, ignoring it would be a disservice when discussing Kai’s future with the Union, MLS, and other teams around the world.</p><p>In a pleasantly mature and non-partisan move by Union fans, many decided that Kai’s actions drew a line in the sand they would not dare cross. Messages of “Thank you and goodbye” turned to “Good riddance”. Behind the scenes, contract negotiations continued to stall and rumors of moves to sides like Red Star Belgrade and Marseille swirled around. Kai seemed destined for departure.</p><p>It turned out to be a quiet offseason for Kai and ultimately Kai Wagner signed a new three-year deal with the Philadelphia Union. The specifics of this agreement remain undisclosed, but it’s reasonable to infer that Kai received a raise from his previous salary of $700,000 per season. I assume one of two things happened: (1) the European interest was not as concrete or as lucrative as Kai hoped, or (2) the Union budged and met Kai’s salary expectations. Let’s assume for the sake of argument it’s the latter and they were able to agree on a salary that came closer to Kai’s ideal salary.</p><p>Undoubtedly, in a vacuum, Kai strengthens the Union. I genuinely believe he could secure a starting spot on nearly every team in MLS. 
We must, however, consider how MLS’s strict salary cap complicates matters. If financial constraints were a sticking point in their previous negotiations, there’s a looming question about the threshold at which Kai’s salary might outweigh his on-field contributions. In a salary-cap league, every dollar allocated to Kai takes away from resources elsewhere on the roster. It raises the uncomfortable question: Should the Philadelphia Union have re-signed Kai Wagner?</p><p><strong>Valuing a Fullback with Bad Branding</strong></p><p>Kai Wagner is a fullback with a branding problem. Some wingers would kill to deliver 8 to 15 assists per season and if you rebranded Kai a “defensive winger”, he’d probably double his salary overnight. Tactically, the Union employs a 4–4–2 with a diamond in the midfield. They don’t use wingers and this gives Kai freedom to get high up the pitch as well as be involved in the build-up. Once the ball reaches the final 3rd, Kai can often be found waiting atop the box or in the wide left channel just outside the box, ready to help recycle a ball and whip it back into the box. Among MLS fullbacks in 2023, Kai found himself in the 99th percentile for crosses and 96th percentile for through-balls. Accurate deliveries and picking out that final pass define Kai’s offensive abilities.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*eYvuSaUuSAJrZrUDgAtnbQ.png" /><figcaption>Kai Wagner’s passing statistics for the 2023 MLS Season as a percentile when compared to other fullbacks in MLS (Datasource: FBRef)</figcaption></figure><p>This offensive involvement also does not come at the expense of defensive responsibilities. I don’t think Kai is a lights-out defender, but his statistics suggest that he’s a top defender. These numbers may be slightly bolstered by playing on one of the most defensively solid and well-structured teams in MLS, but it’s lazy to assume that because he’s so offensively talented, his defense must be a weak link. 
That does not appear to be the case.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*CmZqZ9yIqqa3nTJwum6lsQ.png" /><figcaption>Kai Wagner’s defensive statistics for the 2023 MLS Season as a percentile when compared to other fullbacks in MLS (Datasource: FBRef)</figcaption></figure><p>How do you value a fullback like this? According to the MLSPA’s 2023 Salary Report, Kai was the 41st highest-paid defender and 12th highest-paid fullback in MLS after earning $701,000. For comparison, in 2023 Richie Laryea was the highest-paid fullback in MLS. Laryea earned $1.437 million, which was more than twice as much as Kai. Laryea can also be classified as a wingback and earned 9 G+A in 2023. Kai’s total was 8 G+A. Technically Kai has a lower haul, but his Price per Contribution was $87,625, which blows Laryea’s Price per Contribution of $159,593 out of the water. The Union got incredible value from Kai.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*6-z26TzuUbh9CL1clBHccQ.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*RYfEIjT0NN_e85tD4ifKHw.png" /><figcaption>Kai Wagner vs Richie Laryea’s passing statistics for the 2023 MLS Season as a percentile when compared to other fullbacks in MLS (Datasource: FBRef)</figcaption></figure><p>Let’s zoom out to a league-wide perspective, where Kai emerges as the 181st highest-paid player in MLS (out of 809 players). An interesting comparison to Kai is his teammate Daniel Gazdag, who is renowned as one of the league’s top attackers. Gazdag epitomizes salary efficiency. Earning $1.355 million in 2023, Gazdag generated a remarkably low Price per Contribution of $64,523. It’s a testament to Kai’s offensive efficiency that he can, as a fullback, be mentioned in the same breath as one of the league’s premier attackers. 
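</p><p>Price per Contribution is simply salary divided by goals plus assists. Here is a quick sketch of the arithmetic, assuming a hypothetical helper named <code>price_per_contribution</code> and the rounded salaries quoted above for Wagner and Laryea. Because Laryea’s salary is rounded to $1.437 million here, his result lands slightly off the $159,593 figure, which presumably comes from the unrounded salary.</p>
<pre>
```python
def price_per_contribution(salary, contributions):
    """Salary in dollars divided by goal contributions (G+A)."""
    if contributions == 0:
        raise ValueError("no goal contributions to divide by")
    return salary / contributions


# Salaries and G+A totals as quoted in the text (Laryea's salary is rounded).
wagner = price_per_contribution(701_000, 8)
laryea = price_per_contribution(1_437_000, 9)

print(wagner)         # 87625.0
print(round(laryea))  # close to, but not exactly, the quoted figure
```
</pre>
<p>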
We don’t generally think of fullbacks as offensive pieces, but his offensive efficiency put him well above his MLS fullback colleagues and into conversations with the best attackers in MLS. For that reason, a tag like “defensive winger” almost feels more accurate than “fullback”.</p><p>I don’t know how much to pay Kai, but there’s no doubt he’s undervalued. Unfortunately for Kai, the Philadelphia Union do not traditionally spend big money and even if the Union did finally decide to open the purse strings, they can’t just “give him everything he asks for and some”. The MLS salary cap must be navigated carefully and there exists a salary where Kai is no longer worth his offensive contributions. If Kai makes Richie Laryea money, is that good for the rest of the team? More money to Kai means a thinner, weaker team overall.</p><p>This runs contrary to the Union’s needs. Depth was one of their Achilles heels in 2023, when a 51-game season left the team exhausted. Union beat reporter José Roberto Nuñez recently asked Union goalkeeper and captain Andre Blake if the Union did enough to improve squad depth during the 2023 offseason. His response was a telling “Um, I don’t want to get into any trouble” (<a href="https://twitter.com/JoserNunez91/status/1765217106595840383">Original Tweet</a>). The Union need depth badly.</p><p>Hypothetically, if the Union found a new fullback at 1/3 of Kai’s current salary who is still offensively minded but less productive, could they then reinvest those savings in another attacker and perhaps a depth piece or two? The team might just be better for it overall. These are the challenges MLS general managers wrestle with every transfer window. The data suggests that the savvy general managers who acknowledge this harsh reality, rather than endlessly throwing cash around at big DPs, are likely coming out on top. 
2023 Price per Point statistics suggest that it is not the biggest spenders who win the most, but the teams that are most thoughtful with their money (<a href="https://www.reddit.com/r/MLS/comments/1auqc67/2023_mls_teams_ranked_by_per_point/">2023 Price per Point data</a>).</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*jhh3KZLEUXMQhA6-60hiRQ.png" /><figcaption>Top 5 teams in MLS by points, with their Price per Point (PPP) and total spend shown as a rank in comparison to other MLS teams.</figcaption></figure><p>None of the league’s 10 biggest spenders finished in the top 5. It’s for this reason that I can’t help but look at the Price per Point data and feel in my gut that Kai may be worth “Richie Laryea money” at $1.437 million, but just not in Philadelphia. Perhaps the smartest thing to do is to find a replacement.</p><p><strong>Finding the Next Kai Wagner</strong></p><p>Let’s go back to the fall of 2023 and pretend I am Ernst Tanner, the Philadelphia Union’s Sporting Director. I’ve decided that while Kai is a wonderful player, we just can’t pay him what he wants. Who should replace him?</p><p>We can accept a drop in production from his replacement, but we’d need to see some offensive contributions. This player will also have to be young and cheap. After all, we’re running the Philadelphia Union.</p><p>In my opinion, the best leagues to look at for this challenge would be the 2. or 3. Bundesliga. These are smaller leagues, but not so small that an amateur scout like myself can’t find data to work with. These leagues also have a history of their players succeeding in MLS and MLS players succeeding in those leagues. It’s an established path in both directions. In my hunt, I was able to come up with an interesting alternative to Kai Wagner from the 2. Bundesliga.</p><p>Allow me to introduce 24-year-old English left-fullback Derry-John Murkin from Schalke. 
Traditionally, he’s been used as a left-fullback in a 4–4–2 and has even been a part of a 4–4–2 with a midfield diamond. Given this is the Union’s preferred setup, he’d already be familiar with aspects of their system. Like Kai, Murkin tends to get involved in final-third play; however, his passing isn’t on Kai’s level. What Murkin lacks in passing, he makes up for in direct play. He’s willing to carry the ball down the flanks, and with 5 assists in 16 matches, there’s production to show for it. Since he’s producing nearly 1 assist every 3 matches on a bad 2. Bundesliga team, perhaps he can achieve a similar level of production in MLS.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Fo2vhEM2Yrk1T0Y9TSiG7Q.png" /><figcaption>Derry John Murkin’s passing statistics for the 2023 MLS Season as a percentile when compared to other fullbacks in MLS (Datasource: FBRef)</figcaption></figure><p>Defensively, Murkin is hard to rate. His defensive statistics are not terrible. I question, however, whether this is a product of playing on a bad Schalke team. Would his statistics flourish in a more defensively sound team? In the past, Schalke has used him as a 3rd left-sided centerback. What that shows me is that he’s versatile, but more importantly, that Schalke have a high level of defensive trust in him. Given the chaos and the potential relegation to the 3. Bundesliga that Schalke face, there’s reason to believe he’d be available for relatively cheap. 
Much like the Union acquiring Julian Carranza from Miami, acquiring Murkin would be another classic example of the Union taking advantage of another team’s dysfunction.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*i_24iy9mq9J88WnVwLzaWw.png" /><figcaption>Kai Wagner’s defensive statistics for the 2023 MLS Season as a percentile when compared to other fullbacks in MLS (Datasource: FBRef)</figcaption></figure><p>I have no delusions of grandeur that the Union will read this article, drop all their plans, and sign Murkin. That’s not the point. If I can, with minimal scouting experience, find a player to replace Kai that strikes me as a great fit, imagine what someone like Ernst Tanner can do with decades of experience and a proven track record of discovering MLS talent in obscure places. Believing a player is irreplaceable is a dangerous place to be as a team. Kai Wagner is wonderful, but he’s not irreplaceable.</p><p><strong>Should the Union Have Re-signed Kai Wagner?</strong></p><p>The decision on Kai Wagner is already settled. He’s here for the next 3 years, and as a fan, I hope he’s worth every penny. Plucking him from the 3. Bundesliga was like finding a diamond in the rough, and perhaps asking the Union to go and find a second Kai Wagner is like asking lightning to strike the same place twice. I could describe finding a Kai Wagner replacement with cliches for days, so maybe conventional wisdom is right and maybe Kai’s best has yet to come.</p><p>I just can’t help but notice, however, that the Union was not built with conventional wisdom. Leading the team are a 6th-round draft pick at CB, a Venezuelan midfielder who had 0 international caps before joining, a failed striker from Miami with 3 goals in 41 appearances, and a 3. Bundesliga fullback. 
Somehow, this unconventionally built team has earned more points in MLS than any other team over the last 4 seasons.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*dWfFAhqS3witw0ZNIkiJTg.png" /><figcaption>The Philadelphia Union celebrating winning the 2022 Eastern Conference Final</figcaption></figure><p>My gut tells me that the Union should not have re-signed Kai Wagner. I do believe there are other fullbacks out there who can do the job while freeing up cap space elsewhere on the team. It’s an unpopular, unconventional decision, but it’s the kind of unconventional decision that got this team where it is now. My heart is happy he’s back. My gut feels differently. Now, I must do something that Philadelphia sports fans are all too familiar with: sit back and “Trust the Process.”</p>]]></content:encoded>
        </item>
    </channel>
</rss>