September 16, 2013

Proven Practices for Year-Round Testing

As someone who loves seeing nonprofits succeed, I have to say this is my favorite time of year. It is when nonprofit organizations seem most open to testing, because even small improvements can have a significant impact during a season in which organizations raise up to 50% of their revenue. But successful testing programs don’t hinge on a few weeks of one-offs. They require a robust approach that builds on itself year-round.

Fundamentally, testing is all about math and data: what happens after the segments are created, the creative is done, the packages are sent out and the send button is pushed. It’s a way to analyze the effectiveness of your communication with constituents, your practices for inspiring donors and so on — but this post is not about what you should be testing. Instead, it covers proven practices for testing execution itself. It’s easy to let go of testing fundamentals, which in turn leads to inconsistent program growth, unclear next steps or even a lack of basic insight into what appears to be working.

1) Every test needs a hypothesis or “theory” it is aiming to confirm or disprove.

Write your hypothesis for each test idea and make it as mutually exclusive and collectively exhaustive as possible. “Which subject line works best?” is not good enough. Ask yourself: Does this test serve a larger learning objective, such as increasing open rates? Increasing clicks? Re-activating a very specific segment? One-off testing is like driving your car without an ultimate destination and asking “Where am I going?” at every intersection.
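To make this concrete, here is a minimal sketch of what a written-down hypothesis can look like when captured as a structured test plan. The field names and values are illustrative assumptions, not a standard:

```python
# A hypothetical test-plan record; every field name and value here is illustrative.
test_plan = {
    "hypothesis": "A first-person subject line will lift open rates among new subscribers",
    "learning_objective": "Increase open rates for subscribers acquired in the last 90 days",
    "audience": "new_subscribers_90d",
    "primary_metric": "open_rate",
    "secondary_metrics": ["click_rate"],
    "minimum_detectable_lift": 0.02,   # two percentage points
    "decision_date": "2013-10-15",
    "next_test_if_confirmed": "Repeat the subject-line test on lapsed donors",
}
```

Writing the plan down this way forces the larger learning objective, the metric and the decision point to be explicit before the test ever goes out.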

2) Tests should be cumulative when possible, informing the next set of tests to fully leverage the findings of the first set.

“Extraordinary success is sequential, not simultaneous.”* Translation: Trying to capture lightning in a bottle is a 99.9% losing exercise. The odds are against you. But making a 3% incremental improvement once a month, every month for a year will compound to roughly a 43% improvement over where you started — and that’s success by any standard.
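The arithmetic behind that claim is simple compounding, which a few lines of Python make plain:

```python
# Compound a 3% improvement each month for 12 months.
rate = 1.0
for month in range(12):
    rate *= 1.03

print(f"Cumulative improvement after one year: {rate - 1:.0%}")  # roughly 43%
```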

Defining a testing plan on an annual or semi-annual basis allows for the kind of learning that moves programs forward. So you must set up your test to get the learning you are seeking and plan how tests will build upon each other. This may seem obvious, but it is actually a very complex issue.

Let’s take a welcome series as an example. Welcome series are considered an industry-wide proven practice — once someone raises their hand and says, “I want to hear from you,” it makes sense to welcome them and create mission stickiness. But how do you tell, after investing the time and effort into a series, if it’s actually working?

First, that is too broad a question. Clarify what “working” means. Is it overall series opens that indicate engagement? Overall improved long-term value to the organization from the donor or supporter? This must be defined up front, because it will determine not only the length of the test and its evaluation period but also how you have to set up the test to capture that learning.

So if we ask, “Is it working?” there is a parallel question: “Compared to what?” When testing a welcome series, that might mean creating a holdout group that doesn’t get the same welcome experience. It also may mean planning ahead so that future communications to both groups are consistent, and tracking how both groups subsequently respond to fundraising and engagement outreach. That way, you can compare and contrast.
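As a rough illustration, here is one way a holdout split might be assigned in code. This is a minimal sketch, and the group names and email addresses are assumptions for illustration; a deterministic hash of the email address keeps each subscriber in the same group no matter how often the assignment runs:

```python
import hashlib

def assign_group(email):
    """Split subscribers roughly 50/50 into welcome series and holdout, deterministically."""
    digest = hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()
    return "welcome_series" if int(digest, 16) % 2 == 0 else "holdout"

# Hypothetical new subscribers; store the assignment so both groups can be
# tracked consistently through all future fundraising and engagement outreach.
new_subscribers = ["ana@example.org", "ben@example.org", "carla@example.org"]
assignments = {email: assign_group(email) for email in new_subscribers}
print(assignments)
```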

Similarly, deciding at what point to take stock of how these audiences are performing compared to each other is crucial to determine ahead of time. Twelve months in? After three opportunities to donate? Planning and test set-up can be daunting, tedious and complicated to track both offline and online, and, to be honest, it might not be possible to execute perfectly. But understanding the fundamental test set-up is key to reporting back on “Did it work?” and to identifying any factors that may have influenced the result.
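Once that pre-defined evaluation point arrives, the comparison itself can be simple. A minimal sketch, assuming gift records have been tagged with each supporter’s test group (the data layout and numbers are hypothetical):

```python
# Hypothetical gift records collected over the 12-month evaluation window: (group, gift amount).
gifts = [
    ("welcome_series", 25.0), ("welcome_series", 50.0), ("holdout", 25.0),
    ("welcome_series", 10.0), ("holdout", 100.0), ("holdout", 15.0),
]

# Number of supporters originally assigned to each group.
group_sizes = {"welcome_series": 500, "holdout": 500}

revenue = {"welcome_series": 0.0, "holdout": 0.0}
for group, amount in gifts:
    revenue[group] += amount

for group, size in group_sizes.items():
    print(f"{group}: ${revenue[group] / size:.2f} per supporter over 12 months")
```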

3) Avoid sample bias. Make sure the control and test cells are reasonably balanced across all relevant variables and that they are representative of your target audience.

What kinds of results are valid, and how does segment size affect statistical validity? There are loads of tools out there to make test set-up and segmentation easier, but they cannot fix some of the challenges that many in our industry face.

If you have a smaller file, it may not be possible to get a valid result in one test, even by splitting the file 50/50. Nine responses compared to 11 is not a statistically meaningful difference, even if it happens four times over. Clients in such a situation often ask, “Well, what should we do if we don’t have a big enough file? Not test at all?” No — everyone should test! But your approach might have to be broader and more creative.
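You can see why nine versus 11 doesn’t hold up by running the numbers. A quick sketch using SciPy’s Fisher exact test, with a hypothetical split of 1,000 recipients per cell:

```python
from scipy.stats import fisher_exact

# Hypothetical small-file result: 9 vs. 11 responses out of 1,000 recipients per cell.
cell_size = 1000
responders_a, responders_b = 9, 11

table = [
    [responders_a, cell_size - responders_a],
    [responders_b, cell_size - responders_b],
]

odds_ratio, p_value = fisher_exact(table)
print(f"p-value: {p_value:.2f}")  # far above 0.05, so the 9-vs.-11 gap could easily be noise
```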

For example, if you don’t get a valid number of responses to measure donation rates, expand to the next most measurable metric, such as click-through. Consider expanding the testing period — if your site has little traffic, you might run a donation form test for three months to get a valid result. Yes, this means having discipline. And yes, it means results might come slowly. But it also means you’ll have reliable results that will be the basis of future tests.
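One rough way to gauge how long such a test needs to run is a standard sample-size rule of thumb (roughly 80% power at a 5% significance level). The traffic and conversion numbers below are made up purely for illustration:

```python
# Rule of thumb: visitors needed per version ~= 16 * p * (1 - p) / delta**2,
# where p is the baseline conversion rate and delta is the lift you want to detect.
baseline_rate = 0.03       # hypothetical 3% donation-form conversion rate
detectable_lift = 0.01     # want to detect a one percentage-point improvement
daily_visitors = 100       # hypothetical low-traffic form, split across two versions

n_per_version = 16 * baseline_rate * (1 - baseline_rate) / detectable_lift ** 2
days_needed = (2 * n_per_version) / daily_visitors

print(f"Visitors needed per version: {n_per_version:,.0f}")
print(f"Approximate test duration: {days_needed:.0f} days")  # roughly three months at this traffic level
```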

The list above is by no means a comprehensive catalog of testing proven practices, but with these points in mind, fundraisers and marketers can arm themselves with a data-driven approach to learning what’s working and what’s not.

And one last note, a reminder I often give not only to clients but also to myself: Testing is, frequently, all about failure. Every test is based on an educated guess, an indicator in the data or something that worked for someone else — in other words, on what might work for your constituents and program. It’s OK if a test doesn’t work. Or if a series of tests doesn’t work. That gets back to the whole point of testing. If the outcome were certain, we wouldn’t need to test! So don’t get discouraged. Try to understand why the test may not have worked, then either improve it or move on to the next one.

* “The ONE Thing” by Gary Keller. http://www.amazon.com/The-ONE-Thing-Surprisingly-Extraordinary/dp/1885167776

Miriam Kagan
Miriam Kagan is senior fundraising consultant at Kimbia. Reach her on Twitter at @MiriamKagan.
Interest Categories: Evaluation
Tags: email campaigns, email marketing, measurement