Showing posts with label QA. Show all posts

Sunday, March 21, 2010

Results Of Aqua Satellite Project QA Checks

The big QA session for the Aqua Satellite project is done. The result: there were some major bugs in the code. With these bugs fixed, the data from NASA looks the way you'd expect it to. The unusual readings occurring for each footprint mentioned in this post are still there, but they're not enough to throw off the averages.

Major Bug Fixes
We'll take a look at all this data, but first, here's a list of the major bugs that were fixed:

● The Limb Check had values that were too small for some of the footprints in its lookup table. This is what was causing entire footprints to be dropped from the old QA checks.

● A subtle bug in the GNU C++ template generator was generating integers rather than floating point variables when working with floating point data. This chopped off the decimal places, which threw off things like average values. It was fixed by getting rid of the offending templates and replacing them with a series of overloaded functions.

● The order in which I was grabbing data for daily and monthly summaries was wrong, causing data for one footprint to be displayed as another footprint. This has been fixed.

● The concept of daily and monthly minimum and maximum values produced by AMSUSummary was and is fundamentally flawed. For reasons why, see my post on Noise. Therefore, the concept has been replaced with averages for the lower third of the data and the upper third of the data. There's enough data that goes into these calculations that noise shouldn't be a problem.
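
As a rough illustration of the lower-third and upper-third averages described in the last bullet, here is a minimal Python sketch. It is not the AMSUSummary code. The function name `third_averages` is made up for this example, and the post doesn't spell out how the data is split, so the sketch assumes the readings are sorted and the lowest and highest thirds (by count) are averaged.

```python
def third_averages(values):
    """Return (lower_third_avg, overall_avg, upper_third_avg).

    Assumption: "lower third" and "upper third" mean the lowest and
    highest third of the readings by count, after sorting.
    """
    if not values:
        raise ValueError("need at least one reading")
    ordered = sorted(values)
    # Size of each third; use at least one reading so tiny samples still work.
    k = max(1, len(ordered) // 3)
    lower = sum(ordered[:k]) / k
    upper = sum(ordered[-k:]) / k
    overall = sum(ordered) / len(ordered)
    return lower, overall, upper


# Made-up readings in kelvin, including one low and one high outlier.
readings = [250.0, 251.5, 249.0, 252.0, 180.0, 250.5, 251.0, 249.5, 310.0]
lower, overall, upper = third_averages(readings)
```

Because each third is an average over many readings rather than a single extreme value, one noisy reading moves the result much less than it would move a plain minimum or maximum.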

What The Data Looks Like Now
The data displayed here covers January 2008, plus January 2009 through January 2010. That's 14 months of data in total. Letting the pictures speak for themselves:

Image
Monthly Averages, By Footprint

Image
Monthly Averages, By Date

Image
Monthly Lower Third Averages, By Footprint

Image
Monthly Lower Third Averages, By Date

Image
Monthly Upper Third Averages, By Footprint

Image
Monthly Upper Third Averages, By Date

Image
Daily Averages, By Footprint

Image
Daily Averages, By Date

Image
Daily Lower Third Averages, By Footprint

Image
Daily Lower Third Averages, By Date

Image
Daily Upper Third Averages, By Footprint

Image
Daily Upper Third Averages, By Date

The upper and lower values look a little blocky, so I'll do some more QA on those. I also still need to QA AMSUNormalize. However, none of those things are enough to prevent this code from being released as part of Update 4. Update 4 will be released later today.

Previous Posts In This Series
QAing AMSUExtract
What's Wrong With This Picture?

Thursday, March 18, 2010

QAing AMSUExtract

In A Nutshell
So, I've been QAing AMSUExtract the last couple of days. The end result is that several bugs were found and fixed in my code. But even with these fixes, a lot of data was still failing limb validation. Even after increasing the limb validation tolerance from the original 0.5 to 10.0 (giving over 100 degrees tolerance in some cases), there were still more failures than I would consider acceptable.

This told me that having a limb validation in AMSUExtract was pointless. To get a decent number of records to pass this QA check, I'd need to make the check so loose that nearly any imaginable value would pass.

So limb validation, the -q switch of AMSUExtract, has been taken out. It's been replaced with a -FLOOR switch that lets you set a lower bound for a valid temperature. The default lower bound when you use -FLOOR is 150 degrees K (-123.15° C, -189.67° F), but you can pass any numeric value you want. Scans that are marked as bad by NASA or that fail the optional -FLOOR check are the only ones that will now fail in AMSUExtract.
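
To make the floor check concrete, here is a small Python sketch of the idea. It is not the AMSUExtract code. The names `DEFAULT_FLOOR_K` and `passes_floor` are invented for this example, and it assumes a scan fails if any of its temperatures falls below the floor (the post doesn't say whether the real check is per scan or per value, or whether a value exactly at the floor passes).

```python
DEFAULT_FLOOR_K = 150.0  # default lower bound from the post, in kelvin


def kelvin_to_celsius(kelvin):
    return kelvin - 273.15


def kelvin_to_fahrenheit(kelvin):
    return kelvin_to_celsius(kelvin) * 9.0 / 5.0 + 32.0


def passes_floor(temps_k, floor_k=DEFAULT_FLOOR_K):
    """Assumed rule: a scan passes only if every temperature is at or above the floor."""
    return all(t >= floor_k for t in temps_k)


# Made-up scans: the second one contains an implausibly cold value.
good_scan = [251.2, 250.8, 249.9]
bad_scan = [251.2, 12.5, 249.9]
results = [passes_floor(good_scan), passes_floor(bad_scan)]
```

The two conversion helpers reproduce the Celsius and Fahrenheit equivalents quoted above for the 150 K default.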

Limb validation is still available as an option for the AMSUQA program.

I want to thank Malaga View for suggesting the feature to save not only scans that pass QA, but also scans that fail QA. This feature is now part of AMSUExtract and was very useful for this QA check. BTW, should anyone else have ideas for features, please feel free to suggest them. If I have the time, I'll add them.

The end result of all this is that AMSUExtract is now working and has new features to boot. This will be released in Update 4. Now I move on to QAing AMSUSummary.



Image
Final Results With -FLOOR switch. 
The file is clean like this all the way through.

The Painful Details

This part of the post can be skipped. It's the details on how the QA was carried out. It's here for people who just love to read about QA procedures on blogs or who may need to know the details for something they're doing on their own.

Setup For QA
Image
I set up the QA runs by removing all but one folder from the directory I was working in. That folder contained .hdf files for a single month. That folder, in turn, contained a text folder that contained the text extract files created by ncdump from the .hdf files.

As always, ncdump was run with no command line arguments, just the file name of the file being converted.

You can download ncdump from here.


Running Release 3 Version Of AMSUExtract
The amsu_extract script was used to extract data from all the files for the month. The exact command line was:

amsu_extract -extension C5F1-30_QA_TEST -c 5 -f 1-30 -q

This extracts channel 5, footprints 1 through 30, and performs QA checks on them. The -extension argument tells the script how to name the output file. In this case the output extract file will be named 200912_extract_C5F1-30_QA_TEST.csv, where 200912 is the name of the folder storing the monthly data. This file is created in the same directory from which the script is run.
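
The naming pattern can be sketched in a couple of lines of Python. This is only an illustration inferred from the one example file name above. The function `extract_file_name` is hypothetical and is not part of the amsu_extract script.

```python
def extract_file_name(month_folder, extension):
    """Build the output name as <folder>_extract_<extension>.csv.

    Assumption: the pattern is inferred from the single example in the post.
    """
    return f"{month_folder}_extract_{extension}.csv"


name = extract_file_name("200912", "C5F1-30_QA_TEST")
```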

The version of AMSUExtract called by the amsu_extract script was the same as the one in Update 3 with one exception: scans that fail QA are sent to stderr. The amsu_extract script, in turn, saves stderr to a separate file.
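
The idea of splitting passed and failed scans across stdout and stderr looks roughly like this in Python. This is a sketch of the general technique, not the AMSUExtract code. The function `route_scan` and the sample lines are invented for illustration.

```python
import sys


def route_scan(line, passed, out=sys.stdout, err=sys.stderr):
    """Write a scan line to `out` if it passed QA, otherwise to `err`."""
    target = out if passed else err
    target.write(line.rstrip("\n") + "\n")


# Made-up scan lines: one that passed QA and one that failed.
route_scan("scan 1,251.2,250.8", passed=True)
route_scan("scan 2,12.5,249.9", passed=False)
```

With the output split this way, a wrapper script can redirect stdout and stderr to two different files and end up with one file of passed scans and one of failed scans.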


Image
Results Of Running Release 3 Version Of AMSUExtract
Step 1: Check Values Being Extracted
The first thing to do is verify that AMSUExtract is pulling the correct data. To do this, use HDFView and open up the original .hdf files downloaded from NASA. You can download HDFView from here.

Use the File/Open command to open the .hdf files you want to look at. Use the tree control on the left to navigate to brightness_temp. Double click on brightness_temp. At the top of the grid window that opens up, use the page scroller to move to page 4. Page 4 holds the data for channel 5, because the numbering starts at zero.

You'll notice the rows in the window are numbered 0-44. These are the scan lines. The columns are numbered 0-29. These are the footprints. 
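
Since HDFView counts from zero and the command line counts from one, it's easy to be off by one when comparing values. Here is a small Python helper for the conversion. The name `grid_position` is made up for this sketch, and it assumes footprints 1 through 30 line up with columns 0 through 29 in the same way channel 5 lines up with page 4.

```python
FOOTPRINTS = 30  # columns 0-29 in the HDFView grid


def grid_position(channel, footprint):
    """Convert a 1-based channel and footprint to a zero-based (page, column)."""
    if channel < 1:
        raise ValueError("channel numbers start at 1")
    if not 1 <= footprint <= FOOTPRINTS:
        raise ValueError("footprint must be between 1 and 30")
    return channel - 1, footprint - 1


# Channel 5, footprint 1 should be on page 4, column 0.
page, column = grid_position(5, 1)
```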

Open up the extract file (200912_extract_C5F1-30_QA_TEST.csv in my case) in a text editor. Visually compare the values in the extract file to the values in HDFView. You don't have to check every value, but you want to check enough cases to convince yourself the correct values are being pulled.

Step 2: Compare File Sizes Of Passed And Failed Data Files
Image
Checking the file sizes of the passed and failed data shows that the failed file is about two-thirds the size of the passed file. Now, a fair portion of that is header information, but most of it is actually failed data. This means a lot of data is failing QA.
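
If you'd rather script this check than eyeball it in a file browser, a few lines of Python will do it. The function `size_ratio` is made up for this sketch, and the file names in the commented usage line are placeholders, not the names the amsu_extract script actually produces.

```python
import os


def size_ratio(failed_path, passed_path):
    """Return the failed file's size as a fraction of the passed file's size."""
    passed_size = os.path.getsize(passed_path)
    if passed_size == 0:
        raise ValueError("passed file is empty")
    return os.path.getsize(failed_path) / passed_size


# Example (placeholder file names):
# ratio = size_ratio("failed_scans.txt", "passed_scans.csv")
```

Keep in mind this compares bytes, so header lines count toward the ratio just like they do in the comparison above.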

Step 3: Look At Failed Data
Image
Opening up the failed data file, I saw a lot of data that looked reasonable to me. So I was failing data that should have passed. 

To fix this problem, I tracked down a couple of bugs in the QA checks, expanded the number of footprints that don't get a limb validation check, and increased the tolerance of the limb validation from 0.5 to 10.0.

Even with these changes, a lot of data was still failing. I'll talk about that in the next section.

Results Of Running Modified Version Of AMSUExtract
To try to cut down on the number of scans failing, I gradually increased the tolerance. Eventually, I had it all the way up to 10.0 and still had more records failing than I expected. Screenshots of running with a 10.0 tolerance are below.

Image
The error file looks a lot better, but further down the file there are still quite a few scans listed as failing.
At this point I realized it was hopeless to try and salvage the concept of running a limb validation in the extract. So I took it out and replaced it with the -FLOOR option.

Image


References:
Download ncdump
Download HDFView