toLower is slow #459

@noughtmare

See this Discourse post.

Take this program:

import qualified Data.Text as T
import qualified Data.Text.IO as T
import System.IO
import Control.Monad

main :: IO ()
main = isEOF >>= \b -> unless b $
  foldr seq main . T.words . T.toLower =<< T.getLine

It is almost 2x slower than the equivalent Python code:

import sys

for line in sys.stdin:
    words = line.lower().split()

The benchmark input I used is kjvbible.txt.

Results (using hyperfine):

Benchmark 1: ./simple-hs <kjvbible.txt
  Time (mean ± σ):     156.7 ms ±   4.9 ms    [User: 141.2 ms, System: 6.9 ms]
  Range (min … max):   149.1 ms … 164.8 ms    19 runs
 
Benchmark 1: python3 simple.py <kjvbible.txt
  Time (mean ± σ):      81.0 ms ±   2.4 ms    [User: 73.5 ms, System: 5.4 ms]
  Range (min … max):    78.3 ms …  87.8 ms    36 runs

Removing the toLower step from both programs makes the running times approximately equal:

Benchmark 1: ./simpler-hs <kjvbible.txt
  Time (mean ± σ):      70.2 ms ±   6.9 ms    [User: 54.9 ms, System: 7.0 ms]
  Range (min … max):    61.4 ms …  92.7 ms    44 runs
 
Benchmark 1: python3 simpler.py <kjvbible.txt
  Time (mean ± σ):      70.6 ms ±   1.3 ms    [User: 62.8 ms, System: 5.3 ms]
  Range (min … max):    69.0 ms …  77.0 ms    40 runs
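
The source of simpler.py is not shown above. As a rough sketch only, the two Python variants could be folded into one function with a switch for the lowercase step. The names `split_words` and `process` and the `lowercase` parameter are hypothetical and do not come from the original scripts:

```python
import sys


def split_words(line, lowercase=True):
    """Split one line into words, optionally lowercasing it first."""
    if lowercase:
        line = line.lower()
    return line.split()


def process(lines, lowercase=True):
    """Run the split over every line, discarding the results like the benchmark does."""
    for line in lines:
        split_words(line, lowercase=lowercase)
```

Under these assumptions, `process(sys.stdin)` would correspond to the version with the lowercase step and `process(sys.stdin, lowercase=False)` to the version without it. The extra function call per line adds some overhead, so timings from this sketch would not be directly comparable to the numbers above.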
