Previously Unpublished: An Analysis of Eric Peterson's “Engagement” Calculation

by Joseph Carrabis on July 1st, 2009

This post contains the analysis of Eric Peterson's “Engagement” calculation that I mentioned in From TheFutureOf: Analyzing Eric's Calculation or “Back into the fray, picking up at Eric's 'The purported complexity of my calculation…'” (Note to readers — this was previously unpublished). I took the text for my comments, analysis and quotes from How to measure visitor engagement, redux.

As I noted in From TheFutureOf: Analyzing Eric's Calculation or “Back into the fray, picking up at Eric's 'The purported complexity of my calculation…'” and further down in this post, this is an analysis. I went through this exercise because and as suggested in the original premise of TheFutureOf (and continued here in The Analytics Ecology), there is more power in different disciplines coming together than in any single discipline attempting to answer what it doesn't have the tools to question.

So again to state what I wrote in my caveat to From TheFutureOf: Analyzing Eric's Calculation or “Back into the fray, picking up at Eric's 'The purported complexity of my calculation…'”: would anyone care to join me in helping Eric reformulate this pseudo-logic so that it does something closer to what is intended?

Let me rebuild the calculation (above) from the ground up using the definitions supplied in How to measure visitor engagement, redux. I'm going to do my best to mathematize the definitions in order to disambiguate meanings. I'll use “==” for “is defined as”, “|” for “is formularized as” and place what I believe are necessary modifications to the original statements in italics in what follows.

Ci == “the percent of sessions having more than 'n' page views divided by all sessions” | = (sessions with more than n page views)/(all sessions with countable page views)

First, the above change serves the purpose of borrowing a lesson from physics, engineering, basically any discipline where metrics have meaning; make sure the units always match and that the final units are relevant to the original proposition. I don't think the modification adversely affects the definition and it does add a little rigor, something that will be required as I progress through the explanation of the calculation's logic.

Second, I am confused by the definition, “the percent of sessions having more than 'n' page views divided by all sessions”. I think what is meant is either “the percent of sessions having more than ‘n’ page views” or “the number of sessions having more than ‘n’ page views divided by all sessions” (these are mathematically equivalent statements). A “percent” takes the form of “x/y”. My reading of the definition or expression fails because I read it as “(x/?)/y” where “?” is a missing element of the calculation. I considered the material in the preface to the formula, “…to calculate 'percent of sessions having more than 5 page views' you need to examine all of the visitor's sessions during the time frame under examination and determine which had more than five page views,” and if I understand that model correctly, then I've written the formulation as I think it was intended. If I'm mistaken, this modification greatly affects the calculation because the original calculation becomes unusable otherwise.
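A minimal sketch of the reformulated Click-Depth Index may make the intended reading concrete. The function name, the list-of-page-view-counts input, and the "countable means at least one page view" reading are my assumptions, not part of the original calculation:

```python
def click_depth_index(page_views_per_session, n=5):
    """Ci, as reformulated above: sessions with more than n page views,
    divided by all sessions with countable page views."""
    # Treat a session as "countable" if it registered at least one page view.
    countable = [pv for pv in page_views_per_session if pv > 0]
    if not countable:
        return 0.0
    deep = sum(1 for pv in countable if pv > n)
    return deep / len(countable)
```

Written this way the numerator is a strict subset of the denominator, so the ratio stays between 0 and 1 and the units (sessions over sessions) cancel cleanly.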

Ri == “the percent of sessions having more than 'n' page views that occurred in the past 'n' weeks divided by all sessions. The Recency Index captures recent sessions that were also deep enough to be measured in the Click-Depth Index.”

This one gives me some real challenges. There's a similar challenge to the one I mentioned above regarding “percent”. The use of “captures” implies that Ci is a subset of Ri. This means that a simple “Ci + Ri” has the potential of yielding over 100% (as I believe negatives are disallowed by the definitions).

The use of “n” for both page views and weeks concerns me. The introductory material states that the calculation isn't absolute for all sites. That I can accept and I would need much more explanation of why the index is identical for page views and weeks. Even allowing one index to be “n” and another to be “m” presents some challenges that I'll address later.

The “…that occurred in the past 'n' weeks…” in itself poses a challenge. Time — as implied by this statement — is a funny variable. The use here indicates that a temporal variable isn't being introduced as a subset of something else, such as “all sessions that lasted more than x minutes” as is done with Di below. Stating “…sessions having more than 'n' page views…” clearly (to me, anyway) indicates that “page views” are being introduced to define a subset of “sessions”. This isn't the case with “…that occurred in the past 'n' weeks…” because determining the correct meaning greatly affects “…divided by all sessions”. The prefatory material includes “…examine all of the visitor's sessions during the time frame under examination…” and leads me to conclude that the same time frame is applied to numerator and denominator. This introduces challenges of its own. Now the value of 'n' must be carefully determined based on data not yet demonstrated in the model; what's the average length of time it takes for “Recency” to occur? It's possible with the current model to determine that someone was extremely “engaged” over a one-hour period. But what if they are never heard from again, web-wise? Or someone could be extremely “engaged” over a one-year period and still not fall under the predicate of “truly qualified opportunities”. The fact that a time frame is specified for this variable and not others indicates that no time frame exists for other variables unless so noted.

In any case, my understanding of what is written causes me to guess (where m may equal n)

| = (sessions with more than n page views in some m-week interval)/(all sessions with countable page views in the identical m-week interval)
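A short sketch of this guessed formulation, with both numerator and denominator restricted to the same m-week window. The session structure (a `start` timestamp and a `page_views` count) is my assumption for illustration:

```python
from datetime import datetime, timedelta

def recency_index(sessions, n=5, m_weeks=4, now=None):
    """Ri, as guessed above: sessions with more than n page views in the
    last m weeks, over all countable sessions in that same window."""
    now = now or datetime.now()
    cutoff = now - timedelta(weeks=m_weeks)
    # Apply the identical time frame to numerator and denominator.
    window = [s for s in sessions
              if s["start"] >= cutoff and s["page_views"] > 0]
    if not window:
        return 0.0
    deep = sum(1 for s in window if s["page_views"] > n)
    return deep / len(window)
```

Note that the result still carries an implicit time element — the m-week window — which matters for the units argument that follows.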

If I have understood the calculation correctly thus far (big “if”, I know), then it's already violating the conservation-of-units concept I identified above. Ci has no time element, therefore it can't be added to or summed with Ri because Ri does have a time element. For those less mathematically inclined, the problem here is that the sum of numbers with dissimilar units provides less information than the individual values. If I have a Weather Index that sums the likelihood of rain (1-10 scale with 10 being a flood) and expected temperature (1-12 scale with 12 allowing egg frying on the sidewalk) and I told you tomorrow’s Weather Index is 12, how would you dress for the day? Much more informative is simply saying the precipitation scale is a 2 and the expected temperature is a 10. This concern grows exponentially as we traverse the various measures.
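The Weather Index illustration can be run directly. The point is that once dissimilar scales are summed, very different underlying states collapse to the same number and the components cannot be recovered:

```python
def weather_index(rain, temp):
    """Toy index from the text: rain scale (1-10) plus temperature
    scale (1-12). Summing dissimilar units destroys information."""
    return rain + temp

# A drizzly, mild day and a rainy, cool day are indistinguishable:
assert weather_index(2, 5) == weather_index(5, 2)
```

Reporting the two components separately (rain = 2, temp = 5) is strictly more informative than reporting their sum; the same objection applies to summing Ci (unitless) with Ri (which carries a time window).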

Di == “the percent of sessions longer than 'n' minutes divided by all sessions.”

The use of “percent” continues to be a challenge. Ditto “all sessions”. Tritto unit conservation as Di has no temporal referents. Quartto the repeated use of 'n' as an index.

| = (sessions lasting longer than n minutes)/(all sessions)

Li == “1 if the visitor has come to the site more than 'n' times during the time-frame under examination (and otherwise scored '0').”

That 'n' is going to be very carefully chosen, yes? Also much the same concerns and challenges as noted previously.

| = {1,0} || {V > n, V <= n}, where V is the visitor's visit count during the time frame
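As an indicator variable, Li is trivial to write down; the sketch below is my reading of the definition (strictly "more than n" visits scores 1). What it makes visible is how sensitive the score is to the choice of n — a visitor one visit either side of the threshold flips between 0 and 1:

```python
def loyalty_index(visits, n):
    """Li: 1 if the visitor came to the site more than n times during
    the time frame under examination, otherwise 0."""
    return 1 if visits > n else 0
```

Because the output is binary rather than a ratio, Li contributes a full unit to the later sum whenever the threshold is crossed, which matters for the averaging discussed further down.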

Bi == “the percent of sessions that either begin directly (i.e, have no referring URL) or are initiated by an external search for a 'branded' term divided by all sessions (see additional explanation below)”

Same challenges as noted previously.

| = (sessions that are branded)/(all sessions)

Fi == “the percent of sessions where the visitor gave direct feedback via a Voice of Customer technology like ForeSee Results or OpinionLab divided by all sessions (see additional explanation below)”

I'll leave the logic errors presented by VoC for some links in my bibliography. I will offer that this is the variable proposed thus far that comes closest to either the dictionary or NextStage definitions of “engagement” (also in the bibliography), as it is a direct representation of the visitor's psyche. I don't believe it's a particularly useful representation because VoC as referenced here fails on several social-science grounds, though that's just my opinion. Also noting the same concerns as before.

| = (sessions with direct feedback)/(all sessions)

Ii == “the percent of sessions where the visitor completed one of any specific, tracked events divided by all sessions (see additional explanation below)”

This is something I recognize as a Visitor Action Metric. Otherwise same challenges as noted before.

| = (sessions with selected events)/(all sessions)

Si == “scored as '1' if the visitor is a known content subscriber (i.e., subscribed to my blog) during the time-frame under examination (and otherwise scored '0')”

Same challenges as before, and with a twist. We're adding a logic referent disguised as a temporal referent. “If during the three weeks we monitored, person A became a subscriber” == 1. But it doesn't matter when during the three-week period person A became a subscriber. Nor is there a recognition of unsubscription. Nor is there any consideration of person A's interaction with the subscription vehicle. I just finished a lot of work on understanding how and why people subscribe to and unsubscribe from things, so I really have a challenge with this being a metric as laid out here.

| = {1,0} || {S,~S}

The use of

[image: “eric's sum over all visitors”, i.e. the summation of (Ci + Ri + Di + Li + Bi + Fi + Ii + Si) over all visitors]

confuses me. Is something calculated that applies to all visitors to a site, or is the result unique to each visitor? The extra level of ambiguity adds to the challenges of forming the calculation correctly if the former, not if the latter. In the expression “Ci + Ri + Di + Li + Bi + Fi + Ii + Si”, and using 'i' as the index, is the meaning that the index “i” for F is the same as the index “i” for C, etc.? In other words, there exists a set “C1 + R1 + D1 + L1 + B1 + F1 + I1 + S1” and a set “C2 + R2 + D2 + L2 + B2 + F2 + I2 + S2” and a set “C3 + R3 + D3 + L3 + B3 + F3 + I3 + S3” and so on for each visitor in the calculation?

This means for every “C” there is a matching “R”, “D”, … for each visitor. Having 20 Cs and 10 Ds, for example, violates the calculation. Unless there's a lot of fast and loose play to make things fit. I also think that some of the explanatory comments in How to measure visitor engagement, redux require a rewrite of the calculation.
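The per-visitor completeness requirement can be expressed as a simple validation: every visitor record must carry exactly one value for each of the eight indices. The record structure and function name here are my own illustration:

```python
# One value per index per visitor; 20 Cs against 10 Ds would fail this.
REQUIRED = {"C", "R", "D", "L", "B", "F", "I", "S"}

def indices_complete(visitors):
    """True only if every visitor record has a matching R, D, L, ...
    for every C -- the condition argued for above."""
    return all(set(v) == REQUIRED for v in visitors)
```

Under this reading, the overall calculation is only well-formed when the check passes for every visitor in the data set.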

“You take the value of each of the component indices, sum them, and then divide by '8' (the total number of indices in my model) to get a very clean value between '0' and '1' that is easily converted into a percentage.” Okay. I'm not going to argue semantics and I'll do my best to incorporate this terminology into my understanding.

So: take the separate ratios (because I don't think the Cs, Rs, Ds, etc., are percentages yet), add them together (there is no summing yet, as indicated by the calculation), then divide by the number of ratios you've gathered (8). This returns an average ratio that could be between 0 and 1 and doesn't have to be, as noted above. If L and S are both 1, there's a real possibility of 25% becoming a false attractor (final calculated value). I would suggest removing these variables or calculating their values differently to provide more meaning to the calculation.

In any case, this average ratio is converted into a percentage.
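The false-attractor concern can be made concrete with a short sketch of the quoted recipe (sum the eight component indices, divide by 8). The index values below are hypothetical:

```python
def engagement(indices):
    """Average of the eight component indices, per the quoted recipe."""
    return sum(indices) / len(indices)

# Six near-zero ratios plus the two binary indices both scoring 1:
ratios = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   # Ci, Ri, Di, Bi, Fi, Ii
score = engagement(ratios + [1, 1])        # Li = Si = 1
assert score == 0.25                       # the 25% "false attractor"
```

A visitor who merely crosses the two binary thresholds lands at 25% regardless of how shallow the six ratio indices are, which is why the binary variables dominate the average.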

I think there are flaws in either the explanation or the construction of the calculation. One way or another conservation of units doesn't occur and that pretty much invalidates the calculation from an arithmetic perspective.

The brand index B is also interesting and I agree, complex. We've found that people will find client sites in ingenious ways and one of the reports we offer clients is “Search Term by Level of Interest”. We've found that some of the most qualified visitors find client sites via search terms that appear out of the blue to site owners yet make extreme sense when one considers the cultural concepts the searcher is applying.

The feedback index F is also interesting. The statement “…anyone willing to provide direct feedback is engaged.” is troublesome. Ignoring the use of “engaged”, I will offer that anyone willing to provide direct feedback is willing to provide direct feedback, nothing more until their reasons for providing direct feedback are known. We've already demonstrated and published in detail that (the majority of) people who participate in such methodologies are not motivated by the site (they are not highly qualified and may not even be qualified visitors).

At this point I think evaluating the arithmetic is a lost cause. There are too many logic flaws (my opinion) to make the exercise worthwhile. Instead I'll focus on how well what is offered matches what is intended.

What I'm willing to suggest is that the calculation offered represents some pseudo-logic. I'm not ready to accept that the pseudo-logic represents “…the degree and depth of visitor interaction on the site against a clearly defined set of goals.”

Does the pseudo-logic represent “…new ways to examine and evaluate visitor interaction.”? Of course it does. Is it relevant, does it do what it claims to do and does it demonstrate both reproducible and functionally operable results (from Starting the discussion: Attention, Engagement, Authority, Influence, …, “Can you repeatedly measure what you mean by them so that there's a reasonable surety that what you're measuring is what you mean by the terms you've used?”)?

That's the real question for me. I've created lots of mathematical models that are danged impressive and completely useless (economically) when I create them. An example is the maths I created for our first patent. Nobody gave a rat's patootie about them when I developed them because few people understood what the mathematics was doing. Now companies are calling, asking for applications based on the math.

So the fact that I believe the calculation offered doesn't do what is claimed is only relevant now and probably has more to do with my ignorance than anything else.

Is the pseudo-logic “…time-consuming…”? I suppose, if you're as anal as I am about understanding things before I respond to, use or accept them. And in its present form? Definitely time-consuming, due to the flaws in the calculation itself as documented above. But calculations (flawed or otherwise) that are time-consuming can be made much less so by a variety of mathematical tools, so this isn't a real concern to me unless the desire is to perpetually do everything by hand. Two of NextStage's developers are remarkably adept at taking the math in my head and turning it into working code. Thank goodness! My inability to create legible blog posts is an indication of my lack of programming skill.

Does this pseudo-logic “…identify truly qualified opportunities.”? I would need to know what “truly qualified” and “identify” means. The calculation provided is arithmetically unstable, hence any values derived from it are questionable, hence it fails the test suggested in the quote a few paragraphs above and several others that would normally be applied. “Does a value of x here mean the same thing as the value x applied there?” that type of thing. Until repeatable and actionable the pseudo-logic is an interesting exercise and nothing more (my opinion again).

Regarding “…this model fully supports both quantitative and qualitative data,” … I'll agree that there are elements in the pseudo-logic that have quantitative origins and others that have qualitative origins, not that the model supports them. Making arithmetic use of something — including it in a calculation without logical reasons for doing so — is not a basis of support.

Lastly “…there is no single calculation of engagement useful for all sites,”… I'll agree that the pseudo-logic presented is not useful for all sites because it is flawed. Let that slide for a second and do some simple lexical substitution using the definition of “engagement” supplied in How to measure visitor engagement, redux, “…there is no single calculation of [the degree and depth of visitor interaction on the site against a clearly defined set of goals] useful for all sites…”.

I'm thinking I don't understand what is meant because (to me) of course there are such calculations available. A simple example is the NextStage Gauge. Interestingly, this tool came about when several university PR and marketing offices asked for something that would allow them to get away from traditional web analytics because, as you point out, “Web analytics is hard…”. They wanted something that would look at all visitors and determine what needed to be done to achieve specific business goals.

Regarding the section of How to measure visitor engagement, redux entitled “How Does This All Work in Practice?”… Let me first offer that the economic value of a metric is directly proportional to

1) the information value of what the metric reports on

2) as that information value is defined by some group with an interest in it and

3) that same group's ability to change environmental factors so that

4) the metric changes report value (not information value) in direct and obvious response to that same group's intentional changes in environmental factors.

Thus the proposition (and using the same lexical substitution as above) “…someone coming from a Google search for “web analytics demystified” who looks at 10 pages over the course of 7 minutes, downloads a white paper and then returns to my site the next day will have a higher visitor [degree and depth of visitor interaction on the site against a clearly defined set of goals] value than someone coming from a blog post who looks at 2 pages and leaves 2 minutes later, never to return.” is obviously true for items 1 and 2 above (and with the caveats and concerns already documented above) and I have no evidence regarding items 3 and 4.

Does it have anything to do with the dictionary definition of engagement? Not that I can determine.

So, if the final offering is that some pseudo-logic has been created and the term “engagement” has been used as a symbol for that pseudo-logic, then great and good. But the value of this semanticism will only go as far as a business case can be made according to the four items above, methinks. And it's not “engagement” by any definition other than the one supplied in How to measure visitor engagement, redux.

A couple of final thoughts. One of my mentors, a brilliant mathematician, often told me “If you're going to make mistakes, make them at the beginning. They're much easier to find and fix that way.” I think this is the case here. The logic and arithmetic mistakes are fairly simple to fix although the fixes might cause some radical rethinking of what the metric is really reporting.

Second, I didn't go through this simply because I enjoy understanding such things (I do. Ask my wife about my reading math texts to relax). I went through this exercise because and as suggested in the original premise of TheFutureOf, there is more power in different disciplines coming together than in any single discipline attempting to answer what it doesn't have the tools to question.

So again to state what I wrote at the beginning: would anyone care to join me in helping Eric reformulate this pseudo-logic so that it does something closer to what is intended? I'm game for it. Remember, though, it might take me a while to respond.
