Light bulb moments are often elusive, reserved for rare occasions. But what if they didn't have to be? Imagine that these moments of clarity occurred at the flip of a switch—enabling you to see the future and shape it to your liking. If this prospect excites you, you'll be thrilled to learn about our groundbreaking invention. We've developed a solution that turns this vision into reality, addressing a longstanding issue in the $90 billion insights industry: an excessive emphasis on scorekeeping at the expense of predictive modeling. With our innovation, we're reshaping the landscape of insight generation, providing a seamless transition from mere observation and description to proactive decision-making. Say goodbye to guesswork and hello to a future where foresight is not just a luxury but a standard practice.
In a candid 1976 interview, renowned pollster and consultant Lou Harris didn't hold back his criticism of counterparts in market research and polling. He famously referred to them as "eunuchs," implying that they lacked the vigor and gumption required for their profession. Harris took issue with the prevailing mindset among pollsters, epitomized by figures like George Gallup, who viewed their role primarily as
Lou Harris refined his approach to polling and predictive modeling in the run-up to the 1960 presidential election. He developed a model to predict the vote-share increase his client, John F. Kennedy, could achieve by focusing on specific issues, including civil rights, in his campaign against Richard M. Nixon. Harris's approach involved (a) identifying the vital factors that would significantly impact voter sentiment and (b) predicting the potential increase in vote share were Kennedy to emphasize these issues. This innovative approach represented a fusion of traditional scorekeeping methods with forward-looking predictive modeling techniques. The task was no small feat, requiring Harris to work tirelessly into the early morning hours, sometimes until
Linear regression, often described as a statistical crystal ball, is a powerful tool for predicting how changes in the values of predictor variables can impact a continuous outcome variable under consistent conditions. For example, in a hotel check-in process, linear regression can predict how remote check-in versus traditional check-in might influence the time required to check in, assuming all other relevant factors remain constant. By analyzing historical data and identifying relationships between predictor variables (including the check-in method) and the outcome variable, linear regression enables researchers and analysts to quantify the effect of each predictor variable on that outcome. This predictive capability allows decision-makers to estimate the potential impact of changes they're considering and make informed decisions to optimize processes and outcomes. But when the outcome is binary (e.g., whether or not a guest recommends the hotel), and the aim is to identify the best way to grow one group (e.g., recommenders), logistic regression "
Although binary outcomes are ubiquitous, "insights" industry researchers rely on logistic regression only rarely. They have been unable to translate logits and odds (conventional measures for describing a predictor variable's likely effect) into clear findings and recommendations. Here is what a researcher might say to a hotel client after using logistic regression to analyze guest data: "If all five million of your guests check in remotely, their logit of recommending the hotel will increase by 0.51." Or maybe that researcher would say, "Guests who check in remotely have 33% higher odds of recommending the hotel than those who check in traditionally." Both statements would leave most people confused.
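To see why percentage points are easier to read than logits, the conversion can be sketched in a few lines of Python. This is only an illustration, not Electric Insights' algorithm: the 0.51 logit increase comes from the example above, while the 60% baseline probability of recommending the hotel is a hypothetical value chosen for the sketch.

```python
import math

def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))

def inverse_logit(x):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def percentage_point_effect(baseline_prob, logit_change):
    """Percentage point change in probability implied by a logit change."""
    new_prob = inverse_logit(logit(baseline_prob) + logit_change)
    return 100.0 * (new_prob - baseline_prob)

# Hypothetical baseline: 60% of guests recommend the hotel today.
baseline = 0.60
effect = percentage_point_effect(baseline, 0.51)
print(f"Recommenders would grow by about {effect:.1f} percentage points")
```

The same logit change produces a different percentage point effect at a different baseline, which is one reason a logit on its own is hard for a client to act on.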
Change the Game
The researcher also can predict remote check-in's effect on key segments (e.g., business travelers) and specific guests. Then again, if the hotel client preferred to bypass the researcher altogether, it could do so easily—it's a benefit of Electric Insights' one-of-a-kind algorithm for producing probabilities (and percentage point effects) instantly. Check out our simulations involving JFK Approval, Trust in Pollsters, NBA 3-Pointers, and Steph Curry 3-Pointers.
Is percentage point effects simulation right for you? Opportunities abound to figure out how to raise scores, grow groups, and change the future.
Percentage point effects simulation works with any data type—it can enrich or even resuscitate what you've got already.
Starting anew can be a good option, too: the right design can enable optimal simulation. (See our research design FAQ.)
Send us your data. We'll return simulators in as little as a day.
We'll handle everything from survey (or information-system) design to data collection, reporting, analysis, and consulting.
We'll give you the keys to the factory.
We'd be happy to work on retainer.
There are four main reasons:
Compared to other methods (and models), logistic regression produces evidence that’s often more credible and trustworthy. That's one reason experts, such as sociologist Paul Allison, describe it as the
Logistic regression is central to some applications, such as discrete choice modeling (e.g., statistically modeling the choice to stay at one hotel rather than five alternatives) and attribution modeling (e.g., determining how different advertising "touch points" contribute to a desired action, such as purchasing a product).
It's also the go-to method of survey researchers who use propensity score approaches to enhance sample representativeness and improve accuracy. (
Multinomial logit modeling would be the method of choice. The downside is that it complicates reporting and analysis.
If the starting probability is close to 0 or 1 (as in online advertising where click-through rates are often below 1 percent), the percentage point effect of a change in an otherwise important predictor variable could fall under the radar. An analysis might suggest, for example, that the use of active-voice language in call-to-action display ads, controlling for other variables' effects, increases the click-through probability from .0025 to .0067, a tiny number easy to overlook.
Conventional logistic regression reporting would supply the information (e.g., logits, z scores, odds ratios) needed to reduce the risk of making a mistake: the 0.42 percentage point increase (from .0025 to .0067) would translate to an odds increase of roughly 172% and a logit increase of about one point, both of which are substantial. It is a good example of what the statistician Frederick Mosteller termed "balancing biases," or letting "weaknesses from one method...be buttressed by strength from another."
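The relationship between these measures can be checked with a short Python sketch. It uses only the two click-through probabilities quoted above; the function names are illustrative. Because those probabilities are rounded, the results land close to, rather than exactly on, a one-point logit increase.

```python
import math

def odds(p):
    """Odds corresponding to a probability."""
    return p / (1.0 - p)

def compare(p_before, p_after):
    """Express one change in probability on three different scales."""
    odds_ratio = odds(p_after) / odds(p_before)
    return {
        "percentage_point_change": 100.0 * (p_after - p_before),
        "odds_increase_pct": 100.0 * (odds_ratio - 1.0),
        "logit_change": math.log(odds_ratio),
    }

result = compare(0.0025, 0.0067)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

A change that looks negligible in percentage points can still be large in odds or logits, which is why reporting both views guards against overlooking it.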
Absolutely. If 40% of Americans say they are enthusiastic about the development of driverless vehicles, then any single person's probability of being enthusiastic is .40 (assuming we know nothing else about that person).
A crosstabulation ("crosstab") does not change the underlying (i.e., recorded, observed, original) data. It tells you how things are. Percentage point effects simulation tells you how things would be if the underlying data were to change. It produces predictions.
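The difference can be sketched in Python. The example below is a simplified illustration under stated assumptions, not the Electric Insights algorithm: the intercept, the coefficients, and the four guest records are all hypothetical, and a real model would be estimated from data. The sketch averages predicted probabilities for the data as observed, then again after setting every guest to remote check-in while holding the other variable constant.

```python
import math

# Hypothetical logistic regression coefficients (in logits), not fitted to real data.
INTERCEPT = -0.20
COEFS = {"remote_check_in": 0.51, "business_traveler": 0.30}

# Hypothetical observed records: 1 = yes, 0 = no.
guests = [
    {"remote_check_in": 0, "business_traveler": 0},
    {"remote_check_in": 0, "business_traveler": 1},
    {"remote_check_in": 1, "business_traveler": 0},
    {"remote_check_in": 1, "business_traveler": 1},
]

def predict(row):
    """Predicted probability of recommending the hotel for one guest."""
    z = INTERCEPT + sum(COEFS[name] * value for name, value in row.items())
    return 1.0 / (1.0 + math.exp(-z))

def mean_probability(rows):
    """Average predicted probability across a set of guest records."""
    return sum(predict(row) for row in rows) / len(rows)

# How things are: predictions for the data as recorded.
observed = mean_probability(guests)

# How things would be: copy each record and switch remote check-in on,
# leaving business_traveler (and the original records) unchanged.
what_if = [dict(row, remote_check_in=1) for row in guests]
simulated = mean_probability(what_if)

effect_points = 100.0 * (simulated - observed)
print(f"Observed: {observed:.3f}, simulated: {simulated:.3f}, "
      f"effect: {effect_points:.1f} percentage points")
```

Each guest also gets an individual prediction from `predict`, so the same approach extends to segments or single records.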
Researchers have explored the possibility of estimating causal effects through survey research for decades. The belief is that a high-quality survey, coupled with an analytic method like percentage point effects simulation, can generate estimates of effect equivalent to those from a randomized controlled experiment. Through an experimenter's lens, the approach would be akin to quasi-experimentation, a method for estimating causal effects without random assignment. (See our
Although many different research methods and modules answer specific questions, they don't necessarily deepen stakeholders' understanding of markets, customers, prospects, and competitors. That's a big problem for companies looking to build and sustain a competitive advantage. Percentage point effects simulation makes it easy to dig beneath the surface to develop that kind of deep understanding.
Let's say a specific action you take grows a key group by 30 percentage points. You'd then need to estimate the total cost of that action. If it's $3 million, then the cost per percentage point increase (or ROI) would be $100,000 (i.e., $3 million/30 percentage points). See also page 645 of
Most concept tests (e.g., BASES) estimate the percentage (and number) of definite/probable buyers. We'd reframe that percentage as the probability of buying the eventual product. With that in hand, we'd develop a model to predict it, using diagnostic and other data from the concept test. We'd then package it in a simulator (akin to those on this site). You then could refine the concept (e.g., by emphasizing features that increase purchase probability—the simulator will report the size of any/every feature's increase)—before re-testing it. The simulator also will enable you to identify desirable groups (e.g., 30-49-year-old women living in the Northeast) and individuals based on their purchase probability.
Proponents of CX systems market their systems aggressively (e.g., "software to help turn customers into fanatics, products into obsessions, employees into ambassadors, and brands into religions"). And in fact, good CX systems have many strengths. Through analysis of customer data (e.g., customer experience surveys, customer transactions), for instance, they make it easy to assign scores to individual customers, with those scores representing membership within, or proximity to, a critical group, such as brand promoters or repeat customers. They also make it easy to notify stakeholders of issues that may require attention. But they're missing at least one feature: the ability to produce precise predictions of the likely impact of potential actions on, say, group membership or size. That's where percentage point effects simulation fits in: it's all about producing those predictions.
There are many ways to estimate a variable's importance. One way, stated importance, involves direct questioning (e.g., "How important (not at all important, not too important, somewhat important, very important) is it for a hotel to offer guests a mobile check-in option?"). A second way, derived importance, uses correlation or regression analysis to estimate the degree to which a response to a stated importance question is associated with a response to a key outcome question (e.g., "On a scale of 0-10 where 0 represents not at all satisfied and 10 represents completely satisfied, how would you rate your overall satisfaction with your hotel stay?").
Many research agencies will create a matrix comparing these two importance measures. But that won't show you how a higher rating on a particular variable (e.g., use of a mobile check-in option) would increase the size of a critical group (e.g., the percentage of people who give you a 9 or 10 on the overall satisfaction question). Percentage point effects simulation would report the increase.
Percentage point effects simulation works with all data, including B2B survey data (e.g., an annual customer experience survey) and CRM data (e.g., Salesforce data).
Yes, it works with any data that can be coded, not just survey data.
It's similar, but typing tools, unlike percentage point effects simulation, don't allow you to hold constant the values of predictor variables. Typing tools also don't make it easy for you to understand how individuals within the same segment differ from one another. Percentage point effects simulation will generate a unique prediction (of belonging to the group of interest) for every member of every segment, thereby enabling 1-to-1 communication.
Through Nielsen, IRI, NPD, GfK, or STR, you already may know the size of your market, your share, and the average selling price of products (or services) in particular categories. It should be straightforward, as a next step, to convert those market-wide measures to per-customer spend estimates. Through linkage to other data sources (e.g., brand health data, customer experience, loyalty card data), synchronization, and percentage point effects simulation, you then could estimate how changes you could make would grow (or shrink) the size of your customer base, customer spending, and your market share. (See our
It depends on the character and quality of the available data. Ideally, that data will include at least one binary dependent variable and several potential drivers, as in our simulations.
We think of percentage point effects simulation as a quasi-experiment. An implication from a design perspective is that you should conceptualize your survey or information system as a platform for estimating causal effects. The platform should include the variables needed to enable percentage point effects simulation. In principle, the dependent variable—the one you’re trying to grow or shrink—should be a true dichotomy from the start (rather than, say, a 10-point scale you transform to a dichotomy post hoc). And the predictor variables, aside from any socio-demographic ones, should be potential levers.
If you're trying to figure out what question types to include as predictor variables in a survey, follow best practice. Recently conducted research suggests that unipolar, four-category, fully-anchored scales work well (in terms of usability, reliability, and validity) across modes (e.g., mobile, online, telephone, face-to-face, mail) and languages. Think about using two-point scales, too—they ease interpretation.
You also should keep in mind the concept of linking and syncing. It involves enhancing the extent to which the data sources on which you rely share common factors (e.g., sampling frames, data collection dates, questions). (See our
Data linkage is the act of putting together different data sources to enhance the usefulness of the combined information. It also can involve using data from one source to adjust data from another—we call that a link and sync process. For instance, a brand might use key data from market size and share reporting as a check and basis for adjusting survey information (e.g., self-reports of purchase behavior) from a brand health or customer experience survey. It would be akin to how Pew, Gallup, Ipsos, or YouGov use Census data (e.g., population percentages for key demographics) to adjust survey data. To describe that process, they would say something like this: results were weighted for age within sex, region, and race-ethnicity…to align them with their population proportions. Although data linkage may seem sensible in theory, it can be difficult to apply, particularly when the targeted data sources were neither designed nor conceived of as neatly-fitting puzzle pieces.
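The weighting step mentioned above can be sketched in Python. This is a deliberately simplified, single-variable version (cell weighting on one age variable), not the multi-variable procedure those organizations use; the sample counts, benchmark shares, and purchase rates are hypothetical.

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight for each group = population share / sample share."""
    total = sum(sample_counts.values())
    return {
        group: population_shares[group] / (count / total)
        for group, count in sample_counts.items()
    }

# Hypothetical survey sample sizes and benchmark population shares by age group.
sample_counts = {"18-34": 200, "35-54": 300, "55+": 500}
population_shares = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}
weights = poststratification_weights(sample_counts, population_shares)

# Hypothetical share of each group reporting a purchase in the survey.
purchase_rate = {"18-34": 0.50, "35-54": 0.40, "55+": 0.20}

unweighted = (
    sum(sample_counts[g] * purchase_rate[g] for g in sample_counts)
    / sum(sample_counts.values())
)
weighted = (
    sum(sample_counts[g] * weights[g] * purchase_rate[g] for g in sample_counts)
    / sum(sample_counts[g] * weights[g] for g in sample_counts)
)
print(f"Unweighted: {unweighted:.2f}, weighted: {weighted:.2f}")
```

In this sketch the sample over-represents the oldest group, so weighting raises the estimated purchase rate. The same logic applies when an external market-size or share figure is used as the benchmark.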
Those systems lack built-in modules for producing and reporting percentage point effects, so custom programming would be needed.
After you assess all plausible scenarios, develop a plan to increase the target population's probability of belonging to the group of interest (e.g., customers). A successfully executed plan will increase the size (and, where applicable, spend) of the key group. You'll also need to think through how difficult it may be to make a particular change (or changes).
We would need to identify a handful of plausible values for the continuous variable and include them in the simulator. We also could convert a categorical predictor variable into a continuous one and take the same steps.
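As a rough Python sketch of that idea, the example below evaluates a logistic model at a handful of plausible values of one continuous predictor. Everything in it is hypothetical: the predictor (check-in wait time in minutes), the intercept, the coefficient, and the chosen values are assumptions made for illustration only.

```python
import math

# Hypothetical model: probability that a guest recommends the hotel
# as a function of check-in wait time in minutes.
INTERCEPT = 0.90
COEF_WAIT_MINUTES = -0.08

# A handful of plausible values to expose in the simulator.
PLAUSIBLE_WAITS = [2, 5, 10, 15]

def recommend_probability(wait_minutes):
    """Predicted probability of recommending at a given wait time."""
    z = INTERCEPT + COEF_WAIT_MINUTES * wait_minutes
    return 1.0 / (1.0 + math.exp(-z))

scenarios = {wait: recommend_probability(wait) for wait in PLAUSIBLE_WAITS}
for wait, prob in scenarios.items():
    print(f"{wait:>2} minutes: {100 * prob:.1f}% predicted to recommend")
```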
Our simulator produces predictions with 95% confidence intervals, though some clients prefer that we not show them. With that said, those intervals can be an important safety check. In general, predictions based on larger sample sizes are more trustworthy than those based on smaller ones. By reporting 95% confidence intervals, or lower and upper estimates, we quantify the trustworthiness of our predictions. Here's how you should interpret 95% confidence intervals: "If the study had been fielded (or had the same data been collected), say, 100 times, and nothing else had changed, then our predictions would lie within the upper and lower estimates 95 times out of 100."
That statement assumes that there were no biases other than those associated with how cases (e.g., people) were selected (i.e., sampled). That may not be realistic. In survey research, potential biases include non-coverage error, non-response error, question wording, and question order. Other data types (e.g., point-of-sale) may suffer from different biases.
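The link between sample size and interval width can be illustrated with a short Python sketch. It uses the textbook normal-approximation (Wald) interval for a simple proportion, which is an assumption made for illustration and not necessarily how the simulator computes its intervals; the 40% estimate and the sample sizes are hypothetical.

```python
import math

def wald_interval(p_hat, n, z=1.96):
    """Approximate 95% confidence interval for a proportion (sampling error only)."""
    standard_error = math.sqrt(p_hat * (1.0 - p_hat) / n)
    lower = max(0.0, p_hat - z * standard_error)
    upper = min(1.0, p_hat + z * standard_error)
    return lower, upper

# Same hypothetical estimate (40%) at three hypothetical sample sizes.
for n in (100, 400, 1600):
    low, high = wald_interval(0.40, n)
    print(f"n={n:>4}: {100 * low:.1f}% to {100 * high:.1f}%")
```

Quadrupling the sample size halves the width of the interval. The interval reflects sampling error only, so the other biases listed above are not captured by it.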
In logistic regression analysis, those terms are synonyms for a variable (i.e., "an attribute that describes a person, place, thing, or idea") on the right-hand side of the equation. They explain or predict the binary dependent variable, or y. Sometimes, they can be thought of as levers, actions, linchpins, or drivers (e.g., a mobile check-in option at a hotel); it depends partly on what they describe (e.g., a person, place, thing, or idea).
The model is unlikely to change meaningfully from month to month. Our suggestion, absent additional information, is to update it once or twice a year. That will give you time to implement the actions you've identified to affect the key outcome(s) from the tracker.
We'd start by reviewing your objectives and the data available to support them. If you have what you need, we'd work with you to identify the group(s) (e.g., customers) of interest. We'd then build one or more logistic regression models to understand which predictor variables move the needle. Once we decide on the ultimate model(s), we'd generate all predicted probabilities before building the simulator(s). We'd also make sure you understand how to use them.
If you don't have the data you need, we'd advise you on how to produce it. Topics we typically touch on include
If the future is like the past, our predictions will be accurate.
The words simplify, predict, and advise resonate with George Terhanian, founder of Electric Insights. The experiences he had as a basketball player and coach shape his view. Every player wanted to know how to increase their shooting percentage. They were looking for simple numbers: the expected increase in percentage points associated with any change they could make. Then they would decide whether to put in the effort to make a specific change. Terhanian knows that CEOs, CMOs, brand managers, insights professionals, and others want that same thing: trustworthy predictions of how their actions will affect critical outcomes.
Terhanian has held C-level roles for The NPD Group, Toluna, and Harris Interactive, as his
Earlier in his career, he taught and coached in public and private schools. The basketball teams he helped coach at the Episcopal Academy in Philadelphia won three league titles, with an overall record of 75 wins and 6 losses.
Terhanian holds a Ph.D. from the University of Pennsylvania, a master’s degree from Harvard, and his undergraduate degree from Haverford College. He's known for conceiving of the idea of using propensity score matching to make survey data more accurate.
His work is published in several refereed journals. The UK’s Market Research Society named The Possible Benefits of Reporting Percentage Point Effects a finalist for "best paper in 2019's International Journal of Market Research."
Advice on Making the Most of Basketball Three-Point Shot Data is Terhanian's most recent work. It is published in The Sport Journal.