Entering the Metaweb

May 30, 2007

I am two weeks into a new position as Senior QA Engineer at Metaweb. The current plan is for me to lead testing of Freebase, our client application (currently in an invite-only Alpha). It was a surreal decision to make. It meant leaving the most fun work environment I’ve ever had – and after only 6 months, something I’ve never done before – but the chance to be a part of what Metaweb is trying to do was too rich an opportunity for me to pass up.

And what are we doing that’s so exciting? In short, we aim to organize the world’s knowledge – in ways that are more meaningful to machines than pages of text. We hope that Freebase will complement the fantastic ways that Google and Wikipedia already help us to access the information we need – solving certain problems that are challenging using current options.

You can read more about what we’re doing at the NY Times, on Tim O’Reilly of O’Reilly Media’s blog here and here, or listen to an in-depth interview with our “Minister of Information”.

I am only a few weeks into the job. One of my first impressions (besides the usual excited blur as I learn a new application and explore a new domain) is that it’s very refreshing to come to an environment where there are already experienced testers. At the last two places I’ve worked, I enjoyed getting to create the test processes from scratch. Now I’m enjoying learning from what others have established…and offering thoughts and ideas from the perspective of an outsider coming into an existing organization.

Limeuristics

May 11, 2007

We’ve read James’, Mike’s and Jonathan’s compelling arguments for using mnemonics to keep useful testing heuristics at the tip of our tongue. But where, I ask you, can one find heuristics that rhyme?

In anticipation of National Limerick Day (May 12th), I humbly suggest using Limeuristics.

I’ve written a handful to get us started:

There are those who say testing’s a bore
And the moment they start it they snore.
Now my secret I’ll tell:
If you learn to test well
You’ll discover a thirst to explore.

A tester I know (who loves biscuits) –
Her peers thought her gifted by mystics.
“You’ve got it all wrong,”
She’d explain to the throng.
“It’s just that I use good heuristics.”

Certifications are always a-fruitin’
To make resumes highfalutin’.
If they don’t certify
What they try to imply –
Our craft, I’m afraid, they’re pollutin’.

Every tester at times gets contusions
When their findings leave egos with bruisin’s.
But don’t say we’re unkind
For the things that we find
We must test to dispel our illusions.

…and finally, with thanks to Chris McMahon and “the man from Japan” for the inspiration:

A limerick ’bout testing sublime
Should have meter and rhythm and rhyme
But I might bend a rule
While a bug I pursue
So don’t be surprised if I decide to add a boundary test into the last line.

Have a very happy National Limerick Day. If you have a Limeuristic of your own, I’d happily add it to the list.

Cheers!

Why Do You Enjoy Testing?

April 30, 2007

My first career was as an educator. I worked for 10 years with youth of all ages, in and out of the school system. One of my favorite jobs was running leadership training programs for teenagers at Hidden Villa. I had some very gratifying times, but was feeling ready for a change just as a grant-funded position of mine was drying up. That was September of 2000, at the tail end of that wacky boom when someone could get a testing job just by being bright and willing to learn.

When I started testing, I decided to try it out for a year – pay off my tenacious student loan debt – and then decide what was next. I knew very little about what to expect, and what I discovered surprised me. I’ve now been a software tester for five and a half years and my satisfaction with the work continues to grow. Many folks in my life see that I seem pretty pleased with my work, but continue to be perplexed as to why exactly this is so stimulating for me. Recently a friend who doesn’t work in technology asked “…But don’t you miss interacting with people?”

To this friend, I started by saying that a great deal of the work I do is interactive – I commonly spend much of the day talking to other testers, to software developers, to the folks who asked for the software or who represent our users…and in fact just about everyone in the company. In my current position, I may well have regular interactions with more folks in the company than anyone else. Along the way, I get to ask myself and others thorny questions. I challenge myself to seek out and illuminate unaware assumptions (my own first, and then those of everyone else connected to the project). I imagine what the potential risks are in the product we are building. (What might be broken? What might have unintended consequences? What proposed solution might not really solve our users’ problems?) I think as creatively and as strategically as I’m able about how to explore those risks (What else haven’t I considered yet? What’s another angle this could be approached from?) and then as I explore those risks I keep thinking, generating new test ideas and refining my strategy. Along the way I am learning constantly – about the product, underlying technologies, the users we want to serve, etc.

To me, this is in many ways a dream job. My friend clearly didn’t understand. She is both a voracious reader and a writer, so my next tack was describing the books I’m currently reading to learn and grow as a tester. While I’ve learned a good deal from testing and programming books, that’s not what I’m reading at the moment. I recently finished The Logic Of Failure (mentioned in an earlier post) – a fascinating study of how our thinking can break down in the face of complex systems, often leading to dire results. The next (barely begun) book is Jerry Weinberg’s An Introduction To General Systems Thinking…which I can already tell is one for me to read slowly and to reread – it is dense with insight into how complex systems work.

This meant a bit more to her. She still couldn’t quite picture what I did (which is fine) but was intrigued that social psychology and general systems theory were on a tester’s reading list, and decided based on that that whatever-it-is-I-do must be more than she thought it was.

I’ve been thinking about the job of a tester a lot recently, partly because I’ve been hiring (or attempting to hire) testers…and having a hard time. I know there’s a marketing problem here, because (a) so few folks (outside of tech companies) seem to have even heard of testing as a job, and (b) those who have heard of it tend to have heard either that it’s “a job for programmers” or that it’s “boring and repetitive”. Now, programming skills will almost always help (and sometimes are necessary), but frankly I think that the technical skills involved in testing are often easier to train folks on than the just-as-crucial creativity, organization, communication, and strategy. As for repetitiveness: I know that every job contains repetitive elements, but I would suggest that testing well minimizes the repetitive aspects while maximizing coverage of new territory…because covering new territory (or finding ways to cover old territory in a new way) tends to provide more useful information about the state of the product to its stakeholders.

All that said, do you enjoy testing? If you do, why? And if you’re feeling bold enough, how do we get the word out to smart, creative, organized folks that exploring software is a fascinating and lucrative way to make a living?

A Rose By Any Other Name

April 6, 2007

There was an interesting conversation about test terminology a few weeks back on the Agile Testing list. It started with Chris McMahon forwarding an amusing post looking for a Non-Functional Tester…and led to an interesting conversation about variation amongst test terminology, and whether we should be trying to standardize it. I’m feeling the urge to sum up and synthesize what I took away from it.

First, let me go on record stating that I think trying to hire a “non-functional tester” is a painful misuse of language. It reminds me of a white fellow I knew in college who commented after being at a black event that “It was interesting to be the only majority in the room”.

Regarding what language to use: there are no meaningful standard definitions for testing terms that I am aware of. There was an interesting conversation about this on the software-testing yahoo group a few weeks back. Matt posted a bit about it here: http://xndev.blogspot.com/2007/02/non-functional-testing.html

When folks talk of non-functional or parafunctional testing, I think they tend to mean “all forms of testing other than testing a particular function of the software.” This tends to include some combination of: performance/load/stress, scalability, integration, usability, and security testing…and probably a good deal more.

Someone on the Agile Testing list suggested we find a way to name it positively rather than negatively. It’s a good challenge. For a term that’s used as a catch-all for “everything other than ____”, can there be a way to state it positively, other than to use a list? I tend to think that the only thing that ties these classes of tests together is that they aren’t functional tests. That in turn makes me wonder if it’s a concept with much value, whatever name we give it.

I think the real issue is that, like many terms for “everything other than ____”, it’s a funny bucket to try to define. People seem to want the bucket, though, and given that, I think using a term that describes it as accurately as possible is a good thing. “Para-” can mean “beside” or “in addition to” (as in paramedic). For that reason, plus the fact that I have yet to hear a more descriptive term suggested, and because it’s starting to get (at least a bit of) acceptance amongst testers, parafunctional works for me.

If we go back to the description above, Parafunctional is (to me) at least as clear as non-functional, and has the additional virtue of not sounding foolish.

The Logic of Failure, Part 1

April 5, 2007

I’ve been reading Dietrich Dorner’s The Logic of Failure recently. I am not quite done and will post more soon, but there are a few quotes that I keep coming back to. Here’s one:

An individual’s reality model can be right or wrong, complete or incomplete. As a rule it will be both incomplete and wrong, and one would do well to keep that probability in mind. But this is easier said than done. People are most inclined to insist that they are right when they are wrong and when they are beset by uncertainty. (It even happens that people prefer their incorrect hypotheses to correct ones and will fight tooth and nail rather than abandon an idea that is demonstrably false.) The ability to admit ignorance or mistaken assumptions is indeed a sign of wisdom, and most individuals in the thick of complex situations are not, or not yet, wise. (p. 42)

Dorner gives a fascinating analysis of the reasonable decisions which led to the Chernobyl disaster. It’s interesting to note that there was no glaring mistake here – no one fell asleep on the job or did something ‘insane’. There were a number of places where folks acted in violation of safety regulations. Why? In most cases it was an experienced operator who knew that the rule was a bit more conservative than it needed to be in this case, for an operator of his or her skill, and each operator had broken that safety rule before with positive results (e.g. they were able to avoid overtime without causing any problems). In fact, Dorner points out that the nature of safety regulations is that we tend to be rewarded for violating them most of the time. If I choose not to wear my bike helmet, the main effects are that I can enjoy the wind in my hair and don’t look quite as silly as usual…most of the time. If we break internal rules before releasing a product, we may get similarly positive results…most of the time. Happily, on the projects I’ve been a part of, the potential negative consequences of releasing buggy software tend to be less frightening than a nuclear meltdown, or a bike accident without a helmet. There are plenty of bugs in the world, though, that have cost lives, and many more that have cost jobs and money.

I am a big believer in having a group of informed stakeholders assess the relative risks of releasing vs. waiting. I believe that my job as a tester is to make sure that discussion is as informed as it can be, not to assert control by blocking the release. Dorner is a humbling reminder for me that while I attempt to shed light on the current state of the application under test, my model can be assumed to be incomplete and incorrect.

I believe that one of the major challenges a tester faces is how to communicate the perceived quality of the software, including a map of what is not known, and what may be incomplete and incorrect.

Creating Methods On the Fly…And Bugs in Google Phonebook

January 27, 2007

I wrote the following Watir script as a demonstration of how to generate new methods on the fly in ruby.

You see, at the time of this writing there’s a bug in Google Phonebook where, if you do a search that generates pages of results (e.g. “rphonebook: j smith, ny”) and then quickly click to some of the later pages, you will see:

Google Server Error

I wanted to write a script that would randomly generate search strings – some that would yield many results, like “j smith, ca” and others that would yield few if any, like “z glinkiewicz, ak”.

I wanted the script to be able to easily run an arbitrary user-defined number of iterations…and I wanted each iteration to have its own assertion, so that if one failed the rest would still run. In Ruby’s Test::Unit (which Watir gets its assertions from) each assertion needs its own method…and that led me to generating methods on the fly.
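To illustrate just the method-generation trick in isolation, here’s a minimal sketch stripped of Watir and Test::Unit (the class and method names here are mine, purely for illustration):

```ruby
# Each loop iteration captures its own data in a closure and becomes a
# distinct instance method - the same pattern the full script below uses
# to give each randomized search its own test method.
class PhonebookSearches
  names = %w[smith jones woo]

  names.each_with_index do |name, i|
    define_method(:"search_#{i}_#{name}") do
      "rphonebook: #{name}"
    end
  end
end

s = PhonebookSearches.new
puts s.search_1_jones                                 # => "rphonebook: jones"
puts PhonebookSearches.instance_methods(false).length # => 3
```

Because `define_method` takes a block, each generated method remembers the `name` it was created with; a plain `def` inside the loop would not close over the loop variables that way.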

My inspirations to play with semi-random automated tests were Chris McMahon and Paul Carvalho, and I got help with my syntax through a speedy answer to my question from Bret Pettichord on wtr-general.

Here’s the script I wrote. Feel free to question me about why I did what I did or to propose improvements (I consider myself a beginning automator). In the mean time, without further ado…here’s the code:

#
#   This script creates any number of randomized Google phonebook searches,
#   then quickly cycles through each page of results, looking for server errors.
#
#   Written to reproduce and explore a bug that I found in the Google phonebook,
#   and to show a watir script of mine that's completely non-proprietary and can
#   be run against publicly available software.
#

#$LOAD_PATH.unshift File.join(File.dirname(__FILE__), '..') if $0 == __FILE__
require 'test/unit'
require 'watir'

class TC_GooglePhoneBook < Test::Unit::TestCase
  include Watir
  $count = 15   # set number of iterations here

  def setup
    $ie = Watir::IE.new
    $ie.bring_to_front
    $ie.maximize
    $ie.set_fast_speed
  end

  # data arrays
  first_initial = ('a'..'z').to_a
  last_name = ["allen","brown","glinkiewicz","johnson","jones","mason","ross",
                    "sanchez","smith","wieczorek","williams","woo","wolfe"]
  state = ["ak","az","ca","fl","ma","mi","mt","nv","ny","wa"]

  $count.times do |count|

    fi = first_initial[rand(first_initial.length)]
    ln = last_name[rand(last_name.length)]
    st = state[rand(state.length)]
    method_name = :"test_#{count}_#{fi}_#{ln}_#{st}"  #dynamically create test method
    define_method method_name do
      search_string = fi +" "+ ln +" "+ st
      $ie.goto("http://www.google.com")
      $ie.form( :name, "f").text_field( :name, "q").set("rphonebook: #{search_string}")
      $ie.button( :name, "btnG").click
      i = 1
      while $ie.link( :text, 'Next').exists?
        $ie.link( :text, 'Next').click
        i += 1
        assert_no_match( /Server Error/, $ie.text, "Page #{i} contains a server error." )
      end #while
    end #method
  end #N.times do count

  def teardown
    $ie.close
  end

end #TC_GooglePhoneBook

Incentives for Developers and Testers

January 27, 2007

There’s a new Google Testing Blog. In the comments for their first post, Michael asks innocently:

I have a simple question as a former-programmer, now business guy, between dev and QA.

Why not put pay performance targets on both sides as incentives per testing release?

In other words, each QA person gets $10 for each bug they find (up to 10, or whatever). Each development person gets $10 for the number of bugs not found under a certain target (10, or whatever).

Seems like an easy way to get people more motivated about the whole testing process.

Sounds almost common-sensical…but it’s a disastrous idea.

Why?

Be Careful What You Wish For
Incentives often work in perverse ways. Tell Tony Tester that you’ll pay him for every bug he logs, and he’ll clog your bug tracking system with niggling items and take pains to turn one bug into ten bug reports. Tell Connie Coder that you’ll pay her for not having bugs logged against her, and she’ll spend half her day arguing why these three issues are features, not bugs, and those four are really in Carry’s code, not hers, and…you get the idea.

Do I think that everyone is really this petty? No, but you are incentivizing pettiness here, and incentives can have a frightening power. Which leads to the next problem with this suggestion:

Replacing Intrinsic Motivation with Extrinsic Motivation Harms Your Team
This may not be true for everyone – I’ve met a few folks who swear that they work best when there is a dollar goal they are shooting for. Still, a good deal of research shows that:

  • Intrinsic motivation is more valuable to an organization than extrinsic motivation, and that
  • Extrinsic motivation tends to erode intrinsic motivation.

My experience as a worker, team member, and manager has been that the best folks work for some combination of:

  • Personal pride in a job well done,
  • Commitment to a vision,
  • Commitment to the team (or to someone on the team),
  • The enjoyment of the task itself, and
  • A desire to learn and grow.

That’s not to say that salary is irrelevant…but I think its relevance is often misunderstood. If folks believe in the project, like the team, and understand that there’s just not much money in the company right now, many will happily give their all, knowing full well that they could make more money somewhere else. If you don’t believe me, look at how hard school teachers and non-profit workers tend to work.

On the other hand, if that same person finds out that the person next to them is making 20% more for roughly the same job, their intrinsic motivation may be completely destroyed. Why? It’s not the amount they are being paid per se, because that hasn’t changed. It’s that the salary differential is a sign of disrespect for their work, of unfairness in the workplace, or of dishonesty in management.

I would also add that I suspect salary is a strong factor in employee retention, but I doubt that it plays much of a role in employee motivation…other than the negative impact of undercutting intrinsic motivation by making someone feel undervalued or taken advantage of.

Bug Databases Are Tools to Understand Projects, Not to Evaluate Workers
Lastly, I believe that bug tracking databases are at their best when they help to provide insight into the state of a software development project. I believe that if Bug DBs are used to evaluate employee performance, they will necessarily begin to be gamed…and before long the information that they should provide will be obscured and distorted by folks trying to protect their jobs, to save face with management, or to maximize that next bonus.

And one of the last incentives I want to create is for someone to manipulate the data I use to get a handle on how the project is doing.

Another Testing Blog

January 18, 2007

I’ve read and been inspired by many great testing blogs for a while now, but have held off so far on contributing myself. Why start now?

Recently I’ve:

  • Had a very fun job search,
  • Chosen a job as the first test engineer at an Agile startup, and now
  • Just got back from a very stimulating AWTA,

…All of which have finally gotten the thoughts bouncing about in my head to boil over into this blog.

I’m looking forward to the regular practice of writing, and especially to hearing your thoughts, comments, and suggestions.

Cheers!

Thoughts About All Pairs Testing

January 18, 2007

Danny Faught started a thread on All Pairs testing on the Software Testing list. In it, Jared Quinert asked

In what kind of situations have people found all-pairs useful? When can we be comfortable that the theory of error that underlies all-pairs is likely to hold true? I find that it’s not that often for me. Am I just over-exercising my tester sense of doubt?

Here’s an example of how I applied All Pairs at a previous company. This was a C++ app that ran on a handful of OSs and DBs. It’s a fairly small and straightforward matrix, but it illustrates some of my thinking about All Pairs testing. The matrix below is NOT what we really did, but from memory I think it’s not far off:

Sample All Pairs Matrix

A few things to note:

  1. Tools to randomly generate an All Pairs matrix are A Good Thing, but I would generally use the generated matrix as a starting point, modifying it as appropriate. In this case, I had a small enough set of options that I had no reason to auto-generate it. Instead, it made sense to think about my context (what combinations do more of our customers use? What do our highest paying customers use?) rather than fill it randomly.
  2. Many of the combos we didn’t test at all were unsupported or impossible combinations (e.g. Linux + MS SQL Server), but
  3. One glaring exception was Solaris 10 on Oracle — a platform we supported but never tested on.
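With only two factors in play (OS and database), all-pairs coverage degenerates to the full cross product; the interesting work is pruning the unsupported combinations. A minimal sketch of that filtering – the OS and DB lists here are illustrative, not our real platform list:

```ruby
# Two configuration factors: every OS/DB pair must be considered,
# then unsupported combinations are removed from the test matrix.
oses = ["windows", "linux", "solaris9", "solaris10"]
dbs  = ["oracle", "mssqlserver", "postgres"]

# Combinations we don't support and therefore won't test.
unsupported = [
  ["linux",     "mssqlserver"],
  ["solaris9",  "mssqlserver"],
  ["solaris10", "mssqlserver"],
]

matrix = oses.product(dbs).reject { |pair| unsupported.include?(pair) }
matrix.each { |os, db| puts "#{os} + #{db}" }
puts matrix.length # => 9
```

With three or more factors (add, say, browser or locale), the cross product explodes and a real all-pairs generator earns its keep by picking a much smaller set of rows that still covers every two-factor pairing.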

Not testing on one of my supported platforms? Why would I do such a thing – and then go ahead and admit it in public? For a long time we were behind the times on Solaris, supporting older versions but not the latest. There wasn’t much clamor for it – not enough to be losing sales – but a few adventurous customers asked about upgrading to it. In reading Sun’s docs, I found the strongest statement I’ve ever seen about backwards compatibility – guaranteeing that if any software that ran on 9 didn’t work on 10, Sun would release a patch to fix the incompatibility in 10. At first we just told a few eager customers, “We don’t support 10 yet, but here’s a link to what Sun says. If you want to experiment, feel free, and let us know what happens.” Several of them chose to try it. Six months later we still didn’t have a Solaris 10 server in house, but we had a good deal of feedback from those customers that all was well from their perspective. We did some further research, which led to more encouraging reports about Solaris 10’s backwards compatibility, and finally we decided we were confident enough to declare public support.

Now, I’m sure that there are folks who would say that that was a foolish risk to take. Looking back I’m still happy with the decision…for our particular product, in our particular market, with our particular time and budget limitations.

It’s important to remember that deciding what (not) to test is always risk management. All Pairs is often a useful technique, but of course it doesn’t guarantee that an application will work on all the other platforms. In contrast to the application I’m describing, a friend of mine tests a tool that works at a very low protocol level on a staggering list of OSs, and he’s learned from experience that he really has to run every test on every platform. In my context All Pairs (and Sun’s track record of backward compatibility) mitigated the risk sufficiently that I chose to put my limited time into other tests – and I believe we caught more significant bugs as a result.

