Importing `cfdm` is quite slow: on my local computer it takes 1.4 seconds.
Another system I have tried (JASMIN) can take between 6 and 10 seconds!
Having had a look at this, there seem to be two main culprits for the slow import:
- Docstring rewriting
  - At import time, every docstring (of which there are currently 5569) is inspected for docstring substitutions, and any that are found are replaced.
- Importing external modules that themselves have a slow import
  - The main problems here are `dask`, `scipy`, `s3fs`, `zarr`, `h5netcdf`, `uritools`, `netCDF4`.
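As a rough way to see which dependencies dominate, the first-import cost of each module can be timed individually. The sketch below is only an illustration, not how the numbers in this issue were obtained: `time_import` is a hypothetical helper, and standard-library modules stand in for the real dependencies so that it runs anywhere.

```python
import importlib
import time


def time_import(name):
    """Return the seconds taken by importlib to import `name`.

    Only the first import in a process is meaningful: a module that is
    already in sys.modules returns almost immediately.
    """
    start = time.perf_counter()
    importlib.import_module(name)
    return time.perf_counter() - start


# Stand-ins for the real candidates (dask, scipy, s3fs, ...), chosen so
# that the sketch runs without any third-party packages installed.
timings = {name: time_import(name) for name in ("json", "decimal")}

# Report the slowest imports first.
for name, seconds in sorted(timings.items(), key=lambda item: -item[1]):
    print(f"{name}: {seconds:.4f} s")
```

For a full breakdown without writing any code, `python -X importtime -c "import cfdm"` prints the cumulative time of every module that gets imported.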
These can be improved on:
- Docstring rewriting
  - Only apply substitutions when necessary, rather than trying every possible substitution for every docstring. Only 3009 of the docstrings need rewriting, and each of those uses only a small number of the 110 possible substitutions.
- Importing external modules that themselves have a slow import
  - Move the imports of `dask`, `scipy`, `s3fs`, `zarr`, `h5netcdf`, `uritools` to run time, rather than import time. Many will never get imported, and when they are, the import time is usually negligible compared to the operation being run.
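A minimal sketch of both ideas, under stated assumptions: placeholders are taken to look like `{{name}}` (real ones may be more complex), `SUBSTITUTIONS`, `rewrite_docstring` and `parse_json` are hypothetical names invented for illustration, and `json` stands in for a heavy dependency. This is not the actual `cfdm` implementation.

```python
import re

# Hypothetical substitution table; the real one has about 110 entries.
SUBSTITUTIONS = {
    "{{package}}": "cfdm",
    "{{class}}": "Field",
}

# Matches any placeholder of the assumed simple form {{name}}.
_PLACEHOLDER = re.compile(r"\{\{\w+\}\}")


def rewrite_docstring(doc):
    """Replace only the placeholders that actually occur in `doc`."""
    if not doc or "{{" not in doc:
        # Cheap early exit: nothing to substitute in this docstring.
        return doc
    # A single pass over the text, looking up only the placeholders that
    # are present. Unknown placeholders are left unchanged.
    return _PLACEHOLDER.sub(
        lambda match: SUBSTITUTIONS.get(match.group(0), match.group(0)),
        doc,
    )


def parse_json(text):
    """Defer an import until the function that needs it is called."""
    # `json` stands in for a slow dependency such as dask or scipy. The
    # import cost is paid on the first call, not when the package loads.
    import json

    return json.loads(text)
```

With this shape, the cost of rewriting scales with the number of placeholders actually present rather than with the size of the substitution table, and a dependency that is never used is never imported.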
Results
By applying these two changes, my local import time reduces to 0.2 seconds (from 1.4 seconds - a factor of 7 speed-up). On the other system I tried, the time reduces to between 1.5 and 2.5 seconds (from between 6 and 10 seconds).
These are good enough improvements for a PR, I think ...