Electric Insights: Simulation for Better Decisions
See & Change the Future

Light bulb moments are few and far between. But what if they weren't? What if they occurred at the flip of a switch and let you see and change the future? To make that happen, we've invented a wonderful way—think of it as supercharged simulation—to solve an endemic problem in the $90 billion "insights" industry: a massive over-reliance on scorekeeping at the expense of prediction.

Scorekeeping
(Where you are)
  1. "39% of Americans approve of how Joe Biden is handling his job as president."
  2. "Your market share is 12%."
Prediction
(A road map to where you want to be)
  1. "If all Americans (vs. 12%) felt economic conditions were excellent, 67% would approve of Joe Biden's performance."
  2. "If you eliminate all concerns about your product's safety, you'll increase your market share from 12 to 14%."
The Scorekeeping Problem is Not New

In a 1976 interview, pollster and consultant Lou Harris derided his peers in market research and polling, whom he dubbed eunuchs, for believing their job was done once they had tallied scores, such as presidential approval ratings and market share estimates. He targeted George Gallup, who had insisted that pollsters, in particular, should be "fact-finders and scorekeepers, nothing else." Like Gallup, Harris took great care to "take society's pulse via the thermometer of survey research." In contrast to Gallup, Harris believed they also had to show presidents, prime ministers, the C-suite, and "insights" professionals exactly how to move the needle.

Prediction is Paramount

An example of how Harris operated is easy to find. Before the 1960 presidential election, he predicted the vote-share increase his client John F. Kennedy would attain by emphasizing certain issues, such as civil rights, in his battle with Richard M. Nixon. Specifically, Harris determined the vital few issues that would produce the biggest increase, and the size of the increase in percentage points. To make that happen, Harris broke new ground by transforming typical scorekeeping surveys into platforms for prediction. It was an enormous challenge, which is why he worked "until 3 or 4 a.m." (See our surveys as experiments FAQ.)

Phenomenal Foresight

Percentage point effects simulation, our invention's name, supercharges Harris's approach, determining instantly how changes in key opinions, attitudes, and experiences will raise scores, grow groups (e.g., JFK likely voters), and lift probabilities (e.g., a specific person's probability of voting for JFK). It also empowers you to like, rate, and comment on actions you might take to achieve your goals, and it makes ROI measurement straightforward. So you'll know what to do, who to target, and what to expect in return. See our simulations involving Trust in Pollsters, NBA 3-Pointers, and Steph Curry 3-Pointers.

Percentage Point Effects

A crystal ball like linear regression is exceptional for determining how a change in any predictor variable's value (e.g., if a hotel guest checks in remotely rather than traditionally) will affect a continuous outcome variable (e.g., the number of minutes required to check in), all else the same. But when the outcome is binary (e.g., whether or not a guest recommends the hotel), and the aim is to identify the best way to grow one group (e.g., recommenders), logistic regression "dominates all other methods in the social and biomedical sciences," producing "a more accurate description of the world."

Speaking in Tongues

Although binary outcomes are ubiquitous, "insights" industry researchers rely on logistic regression only rarely. They have been unable to translate logits and odds (the conventional measures for describing a predictor variable's likely effect) into clear findings and recommendations. Here is what a researcher might say to a hotel client after using logistic regression to analyze guest data: "If all five million of your guests check in remotely, their logit of recommending the hotel will increase by 0.51." Or maybe that researcher would say, "Guests who check in remotely have 33% higher odds of recommending the hotel than those who check in traditionally." Both statements would leave most people confused.

Change the Game

Percentage point effects simulation changes the game, enabling the researcher to say: "If all five million of your guests check in remotely, the percentage who will recommend the hotel will rise from 20 to 24. That four-point effect translates to 200,000 more recommenders (from 1 to 1.2 million), 300,000 additional guests (1.5 per recommender), and a revenue increase (given a two-day average stay at $150 per day) of $90 million (300,000*2*150). It also will reduce the time you need to process each check-in by five minutes per guest, saving you 416,667 labor hours ([5 million guests*5 minutes per check-in]/60)."
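The arithmetic in that statement is easy to check. The short Python sketch below simply re-derives the quoted figures from the inputs given in the example; the function and parameter names are hypothetical and chosen for illustration only.

```python
def hotel_projection(guests, base_pct, new_pct, guests_per_recommender,
                     nights, rate_per_night, minutes_saved):
    """Re-derive the figures in the hotel example from its stated inputs."""
    # A rise from base_pct to new_pct, applied to every guest.
    extra_recommenders = guests * (new_pct - base_pct) // 100
    # Each new recommender is assumed to bring in additional guests.
    extra_guests = round(extra_recommenders * guests_per_recommender)
    # Revenue from those additional guests.
    extra_revenue = extra_guests * nights * rate_per_night
    # Minutes saved per check-in, converted to hours.
    labor_hours_saved = round(guests * minutes_saved / 60)
    return {
        "extra_recommenders": extra_recommenders,
        "extra_guests": extra_guests,
        "extra_revenue": extra_revenue,
        "labor_hours_saved": labor_hours_saved,
    }


result = hotel_projection(
    guests=5_000_000, base_pct=20, new_pct=24,
    guests_per_recommender=1.5, nights=2, rate_per_night=150,
    minutes_saved=5,
)
for name, value in result.items():
    print(f"{name}: {value:,}")
```

Changing any single input (say, a three-point effect instead of four) shows how sensitive the revenue figure is to the size of the percentage point effect.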

The researcher also can predict remote check-in's effect on key segments (e.g., business travelers) and specific guests. And if the hotel client prefers to bypass the researcher and take a DIY path, it can do so easily—it's a benefit of Electric Insights' one-of-a-kind algorithm for producing probabilities (and percentage point effects) instantly. Check out our simulations involving Trust in Pollsters, NBA 3-Pointers, and Steph Curry 3-Pointers.

Applicability

Is percentage point effects simulation right for you? Opportunities abound to figure out how to raise scores, grow groups, and change the future.

Illustrative Group-Growing Opportunities
  • Market Size & Share Tracking: Buyers rather than Non-Buyers (See our market share FAQ.)
  • Usage & Attitudes: Cannabis Users rather than Non-Users
  • Targeting & Segmentation: Chocolate Lovers vs. All Others
  • Customer Experience Monitoring: Promoters vs. All Others (See our CX monitoring systems FAQ.)
  • Brand Tracking: Brand Lovers vs. All Others (See our continuous tracking surveys FAQ.)
  • Concept Testing: Definite/Probable Buyers vs. All Others (See our concept testing FAQ.)
  • Ad Testing: Ad Lovers vs. All Others
  • Pricing: Definite/Probable Buyers at specific price points vs. All Others
  • Performance: Made vs. Missed NBA 3-point shots
Data Needs

Percentage point effects simulation works with any data type—it can enrich or even resuscitate what you've got already.

Starting anew can be a good option, too: the right design can enable optimal simulation. (See our research design FAQ.)

Working Together
FAQs

Binary Dependent Variables

There are four main reasons:

  1. They are following the lead of academic researchers, who often focus on the sign and statistical significance of logit coefficients, with "little emphasis on the substantive and practical significance of the findings," according to sociologist Richard Williams.
  2. Most major statistical software packages don’t produce percentage point effects through pre-packaged procedures.
  3. It's not possible to represent a percentage point effect with a single number in a summary regression equation (in contrast to logits and linear regression coefficients). The size of the effect will depend on how close the predicted probability (i.e., y-hat) is to 0 or 1 and the values of the model's other variables (i.e., the x's). The effect is slimmer at the top and bottom of the probability scale than in the middle. Alfred DeMaris, a sociologist and statistician, called this an "intractable" problem.
  4. Producing probabilities (and percentage point effects) in real time is complicated. As an example, a -1.5 logit translates to a 0.18 probability formulaically, but the predicted probabilities of the individual cases (e.g., 4,000 survey respondents, 35,000 3-point shots) might average to 0.26, with the latter regarded as the correct number. An implication is that simulation (i.e., What if…?) won’t work properly without looping through all cases and changing various values. To do so in real time requires a custom algorithm and a diverse skill set.
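Reasons 3 and 4 can be illustrated with a few lines of Python. This is a minimal sketch, not Electric Insights' algorithm: the coefficients, the five cases, and every name in it are hypothetical. It shows (a) that the same logit shift produces different percentage point effects depending on the starting probability, (b) that the probability at the average logit differs from the average of the individual probabilities, and (c) a "What if…?" simulation that loops through all cases.

```python
import math


def inv_logit(z):
    """Convert a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Convert a probability to a logit."""
    return math.log(p / (1.0 - p))


# (a) The same logit shift has a smaller effect near 0 or 1 than near 0.5.
SHIFT = 0.5  # hypothetical logit coefficient
effects = {}
for base_p in (0.02, 0.50, 0.98):
    effects[base_p] = 100 * (inv_logit(logit(base_p) + SHIFT) - base_p)
    print(f"start {base_p:.2f}: {effects[base_p]:.2f} percentage points")

# (b) Probability at the mean logit vs. mean of the case-level probabilities.
case_logits = [-4.0, -3.0, -1.5, 0.0, 1.0]  # hypothetical; the mean is -1.5
p_at_mean_logit = inv_logit(sum(case_logits) / len(case_logits))
mean_of_p = sum(inv_logit(z) for z in case_logits) / len(case_logits)
print(f"at mean logit: {p_at_mean_logit:.2f}, mean of cases: {mean_of_p:.2f}")

# (c) Simulation: change a value for every case, re-predict, then average.
COEF = {"intercept": -1.8, "remote": 0.5, "business": 0.9}  # hypothetical


def predict(case):
    z = (COEF["intercept"]
         + COEF["remote"] * case["remote"]
         + COEF["business"] * case["business"])
    return inv_logit(z)


def mean_probability(rows):
    return sum(predict(r) for r in rows) / len(rows)


def simulate(rows, **changes):
    """Copy each case, overwrite the chosen values, and re-predict."""
    return mean_probability([{**r, **changes} for r in rows])


cases = [
    {"remote": 0, "business": 0},
    {"remote": 0, "business": 1},
    {"remote": 1, "business": 0},
    {"remote": 0, "business": 1},
    {"remote": 1, "business": 1},
]
baseline = mean_probability(cases)
all_remote = simulate(cases, remote=1)
effect_points = 100 * (all_remote - baseline)
print(f"what if all check in remotely: {effect_points:+.1f} percentage points")
```

Because the effect is computed case by case and then averaged, it reflects the values of each case's other variables, which is why no single number in the regression equation can stand in for it.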

George Terhanian asked a team of marketing scientists from Harris Interactive that same question a decade ago. They returned a list of 31 methods.

Compared to other methods (and models), logistic regression produces evidence that’s often more credible and trustworthy. That's one reason experts, such as sociologist Paul Allison, describe it as the "dominant" method for predicting binary dependent variables.

Logistic regression is central to some applications, such as discrete choice modeling (e.g., statistically modeling the choice to stay at one hotel rather than five alternatives) and attribution modeling (e.g., determining how different advertising "touch points" contribute to a desired action, such as purchasing a product).

It's also the go-to method of survey researchers who use propensity score approaches to enhance sample representativeness and improve accuracy. (George Terhanian introduced propensity scoring to the "insights" industry community in the late 1990s, as described in Public Opinion Quarterly.) But "insights" industry researchers turn to logistic regression only rarely for core work (e.g., brand tracking, concept testing), despite the ubiquity of binary dependent variables.

Multinomial logit modeling would be the method of choice. The downside is that it complicates reporting and analysis.

If the starting probability is close to 0 or 1 (as in online advertising where click-through rates are often below 1 percent), the percentage point effect of a change in an otherwise important predictor variable could fall under the radar. An analysis might suggest, for example, that the use of active-voice language in call-to-action display ads, controlling for other variables' effects, increases the click-through probability from .0025 to .0067, a tiny number easy to overlook.

Conventional logistic regression reporting would supply the information (e.g., logits, z scores, odds ratios) needed to reduce the risk of making a mistake: the 0.42 percentage point increase (from .0025 to .0067) would translate to a 172% odds increase, and a one-point logit increase, both of which are substantial. It is a good example of what the statistician Frederick Mosteller termed "balancing biases," or letting "weaknesses from one method...be buttressed by strength from another."
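The three ways of describing the click-through example can be compared directly. The sketch below uses only the two probabilities quoted above (.0025 and .0067); the variable names are hypothetical. Because those probabilities are rounded, the computed odds increase (about 169%) and logit change (about 0.99) land slightly below the 172% and one-point figures, which correspond to a logit increase of exactly 1.

```python
import math


def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


p_before, p_after = 0.0025, 0.0067

# Percentage point effect: tiny, and easy to overlook.
point_effect = 100 * (p_after - p_before)

# Odds ratio and logit change: substantial.
odds_ratio = odds(p_after) / odds(p_before)
odds_increase_pct = 100 * (odds_ratio - 1)
logit_change = math.log(odds_ratio)

print(f"percentage point effect: {point_effect:.2f}")
print(f"odds increase: {odds_increase_pct:.0f}%")
print(f"logit change: {logit_change:.2f}")
```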

Absolutely. If 40% of Americans say they are enthusiastic about the development of driverless vehicles, then any single person's probability of being enthusiastic is .40 (assuming we know nothing else about that person).

A crosstabulation ("crosstab") does not change the underlying (i.e., recorded, observed, original) data. It tells you how things are. Percentage point effects simulation tells you how things would be if the underlying data were to change. It produces predictions.

Researchers have explored the possibility of estimating causal effects through survey research for decades. The belief is that a high-quality survey, coupled with an analytic method like percentage point effects simulation, can generate estimates of effect equivalent to those from a randomized controlled experiment. Through an experimenter's lens, the approach would be akin to quasi-experimentation, a method for estimating causal effects without random assignment. (See our research design FAQ.)

Benefits of Percentage Point Effects Simulation

Although many different research methods and modules answer specific questions, they don't necessarily deepen stakeholders' understanding of markets, customers, prospects, and competitors. That's a big problem for companies looking to build and sustain a competitive advantage. Percentage point effects simulation makes it easy to dig beneath the surface to develop that kind of deep understanding.

Let's say a specific action you take grows a key group by 30 percentage points. You'd then need to estimate the total cost of that action. If it's $3 million, then the cost per percentage point increase (or ROI) would be $100,000 (i.e., $3 million/30 percentage points). See also page 645 of The Possible Benefits of Reporting Percentage Point Effects.

Most concept tests (e.g., BASES) estimate the percentage (and number) of definite/probable buyers. We'd reframe that percentage as the probability of buying the eventual product. With that in hand, we'd develop a model to predict it, using diagnostic and other data from the concept test. We'd then package it in a simulator (akin to those on this site). You then could refine the concept (e.g., by emphasizing features that increase purchase probability—the simulator will report the size of any/every feature's increase)—before re-testing it. The simulator also will enable you to identify desirable groups (e.g., 30-49-year-old women living in the Northeast) and individuals based on their purchase probability.

Proponents of CX systems market their systems aggressively (e.g., "software to help turn customers into fanatics, products into obsessions, employees into ambassadors, and brands into religions."). And in fact, good CX systems possess many attributes. Through analysis of customer data (e.g., customer experience surveys, customer transactions), for instance, they make it easy to assign scores to individual customers, with those scores representing membership within, or proximity to, a critical group, such as brand promoters or repeat customers. They also make it easy to notify stakeholders of issues that may require attention. But they're missing at least one feature: the ability to produce precise predictions of the likely impact of potential actions on, say, group membership/size. That's where percentage point effects simulation fits in—it's all about producing those predictions.

There are many ways to estimate a variable's importance. One way, stated importance, involves direct questioning (e.g., “How important—not at all important, not too important, somewhat important, very important—is it for a hotel to offer guests a mobile check-in option?"). A second way, derived importance, uses correlation or regression analysis to estimate the degree to which a response to a stated importance question is associated with a response to a key outcome question (e.g., “On a scale of 0-10 where 0 represents not at all satisfied and 10 represents completely satisfied, how would you rate your overall satisfaction with your hotel stay?").

Many research agencies will create a matrix comparing these two importance measures. But that won't show you how a higher rating on a particular variable (e.g., use of a mobile check-in option) would increase the size of a critical group (e.g., the percentage of people who give you a 9 or 10 on the overall satisfaction question). Percentage point effects simulation would report the increase.

Percentage point effects simulation works with all data, including B2B survey data (e.g., an annual customer experience survey) and CRM data (e.g., Salesforce data).

Yes, it works with any data that can be coded, not just survey data.

It’s similar but typing tools don’t allow you to hold constant the values of predictor variables, in contrast to percentage point effects simulation. Typing tools also don’t make it easy for you to understand how individuals within the same segment differ from one another—percentage point effects simulation will generate a unique prediction (of belonging to the group of interest) for every member of every segment (thereby enabling 1-to-1 communication).

Through Nielsen, IRI, NPD, GfK, or STR, you already may know the size of your market, your share, and the average selling price of products (or services) in particular categories. It should be straightforward, as a next step, to convert those market-wide measures to per-customer spend estimates. Through linkage to other data sources (e.g., brand health data, customer experience, loyalty card data), synchronization, and percentage point effects simulation, you then could estimate how changes you could make would grow (or shrink) the size of your customer base, customer spending, and your market share. (See our prediction accuracy FAQ.) You also could learn more (e.g., socio-demographics, attitudes, behaviors, beliefs, proclivities) about the customers and prospects most likely to contribute to your growth.

Design & Analysis

It depends on the character and quality of the available data. Ideally, that data will include at least one binary dependent variable and several potential drivers, as in our simulations.

We think of percentage point effects simulation as a quasi-experiment. An implication from a design perspective is that you should conceptualize your survey or information system as a platform for estimating causal effects. The platform should include the variables needed to enable percentage point effects simulation. In principle, the dependent variable—the one you’re trying to grow or shrink—should be a true dichotomy from the start (rather than, say, a 10-point scale you transform to a dichotomy post hoc). And the predictor variables, aside from any socio-demographic ones, should be potential levers.

If you're trying to figure out what question types to include as predictor variables in a survey, follow best practice. Recently conducted research suggests that unipolar, four-category, fully-anchored scales work well (in terms of usability, reliability, and validity) across modes (e.g., mobile, online, telephone, face-to-face, mail) and languages. Think about using two-point scales, too—they ease interpretation. You also should keep in mind the concept of linking and syncing. It involves enhancing the extent to which the data sources on which you rely share common factors (e.g., sampling frames, data collection dates, questions). (See our data linkage FAQ.)

Data linkage is the act of putting together different data sources to enhance the usefulness of the combined information. It also can involve using data from one source to adjust data from another—we call that a link and sync process. For instance, a brand might use key data from market size and share reporting as a check and basis for adjusting survey information (e.g., self-reports of purchase behavior) from a brand health or customer experience survey. It would be akin to how Pew, Gallup, Ipsos, or YouGov use Census data (e.g., population percentages for key demographics) to adjust survey data. To describe that process, they would say something like this: results were weighted for age within sex, region, and race-ethnicity…to align them with their population proportions. Although data linkage may seem sensible in theory, it can be difficult to apply, particularly when the targeted data sources were neither designed nor conceived of as neatly-fitting puzzle pieces.
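As a rough illustration of the adjustment described above, the Python sketch below applies simple cell weighting: each group's weight is its population share divided by its sample share. The age groups, population shares, and purchase counts are all hypothetical, and polling organizations often use more elaborate procedures (such as raking across several variables) than this one-variable version.

```python
from collections import Counter


def cell_weights(sample_groups, population_shares):
    """Weight for each group = population share / sample share."""
    counts = Counter(sample_groups)
    n = len(sample_groups)
    return {g: population_shares[g] / (counts[g] / n) for g in counts}


# Hypothetical survey: 100 respondents, skewed toward older people.
sample = ["18-34"] * 20 + ["35-54"] * 30 + ["55+"] * 50
# Hypothetical benchmark shares (e.g., from a census-style source).
population = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}

weights = cell_weights(sample, population)

# Hypothetical self-reported purchasers per group.
buyers = {"18-34": 10, "35-54": 9, "55+": 10}

unweighted = sum(buyers.values()) / len(sample)
weighted = sum(buyers[g] * weights[g] for g in buyers) / len(sample)

print(f"unweighted purchase rate: {unweighted:.3f}")
print(f"weighted purchase rate:   {weighted:.3f}")
```

The same idea extends to a link and sync process: the benchmark could come from market size and share reporting rather than Census data.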

Those systems lack built-in modules for producing and reporting percentage point effects, so custom programming would be needed.

After you assess all plausible scenarios, develop a plan to increase the target population's probability of belonging to the group of interest (e.g., customers). A successfully executed plan will increase the size (and, where applicable, spend) of the key group. You'll also need to think through how difficult it may be to make a particular change (or changes).

We would need to identify a handful of plausible values for the continuous variable and include them in the simulator. We also could convert a categorical predictor variable into a continuous one and take the same steps.

Miscellaneous

Our simulator produces predictions with 95% confidence intervals, though some clients prefer that we not show them. With that said, those intervals can be an important safety check. In general, predictions based on larger sample sizes are more trustworthy than those based on smaller ones. By reporting 95% confidence intervals, or lower and upper estimates, we quantify the trustworthiness of our predictions. Here's how you should interpret 95% confidence intervals: "If the study had been fielded (or had the same data been collected), say, 100 times, and nothing else had changed, then our predictions would lie within the upper and lower estimates 95 times out of 100."

That statement assumes that there were no biases other than those associated with how cases (e.g., people) were selected (i.e., sampled). That may not be realistic. In survey research, potential biases include non-coverage error, non-response error, question wording, and question order. Other data types (e.g., point-of-sale) may suffer from different biases.
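To see why larger samples give more trustworthy predictions, consider one textbook approximation for a single percentage: the normal-approximation (Wald) interval. This is only a sketch under that assumption. The simulator's own interval method is not described here, and the 24% prediction and sample sizes below are hypothetical.

```python
import math


def wald_interval(p, n, z=1.96):
    """Approximate 95% confidence interval for a proportion p from n cases."""
    se = math.sqrt(p * (1.0 - p) / n)  # standard error of the proportion
    return max(0.0, p - z * se), min(1.0, p + z * se)


prediction = 0.24  # hypothetical predicted share of recommenders
intervals = {}
for n in (250, 1000, 4000):
    intervals[n] = wald_interval(prediction, n)
    low, high = intervals[n]
    print(f"n={n:>5}: {100 * low:.1f}% to {100 * high:.1f}%")
```

Quadrupling the sample size halves the width of the interval, which is why lower and upper estimates tighten as more cases are collected.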

In logistic regression analysis, those terms are synonyms for a variable (i.e., “an attribute that describes a person, place, thing, or idea") on the right-hand side of the equation. They explain or predict the binary dependent variable, or y. Sometimes, they can be thought of as levers, actions, linchpins, or drivers (e.g., a mobile check-in option at a hotel)—it depends partly on what they describe (e.g., a person, place, thing, or idea).

The model is unlikely to change meaningfully from month to month. Our suggestion, absent additional information, is to update it once or twice a year. That will give you time to implement the actions you've identified to affect the key outcome(s) from the tracker.

Electric Insights is a new start-up, so we don't have many clients yet. With that said, we've worked with data sets covering several broad areas, including patient outcomes, sports performance, hospitality, public relations, public opinion, and sales performance.

We'd start by reviewing your objectives and the data available to support them. If you have what you need, we'd work with you to identify the group(s) (e.g., customers) of interest. We'd then build one or more logistic regression models to understand which predictor variables move the needle. Once we decide on the ultimate model(s), we'd generate all predicted probabilities before building the simulator(s). We'd also make sure you understand how to use them. If you don't have the data you need, we'd advise you on how to produce it. Topics we typically touch on include research design, data linkage, and any other topic that contributes to better data, reporting, analysis, and decisions.


If the future is like the past, our predictions will be accurate.

About

George Terhanian

Words like simplify, predict, and advise resonate with George Terhanian, founder of Electric Insights. The experiences he had as a basketball player and coach shape his view. Every player wanted to know how to increase their shooting percentage. They were looking for simple numbers: the expected increase in percentage points associated with any change they could make. Then they'd decide whether to put in the effort to make a particular change. Terhanian knows that CEOs, CMOs, brand managers, insights professionals, and others want that same thing: trustworthy predictions of how their actions will affect critical outcomes.

Experience

Terhanian has held C-level roles for The NPD Group, Toluna, and Harris Interactive, as his curriculum vitae shows. He has also served as a board or advisory group member for the National Academy of Sciences, the US Department of Education, the Advertising Research Foundation, the Insights Association, and the British Polling Council.

Earlier in his career, he taught and coached in public and private schools. The basketball teams he helped coach at the Episcopal Academy in Philadelphia won three league titles, with an overall record of 75 wins and 6 losses.

Academic Background

Terhanian holds a Ph.D. from the University of Pennsylvania, a master’s degree from Harvard, and an undergraduate degree from Haverford College. He's known for conceiving the idea of using propensity score matching to make survey data more accurate.

His work is published in several refereed journals. The UK’s Market Research Society named The Possible Benefits of Reporting Percentage Point Effects a finalist for "best paper in 2019's International Journal of Market Research."

Hit? Stand? Double? Master "Likely Effects" to Make the Right Call is Terhanian's most recent work. It is published in Quirk's.

Contact
info@electricinsights.com
+1-646-430-3420
Las Vegas, NV