As a market segmentation method, CHAID (Chi-square Automatic Interaction Detection) is more sophisticated than many other multivariate analysis techniques. It is a decision tree technique; see Magidson, Jay, “The CHAID approach to segmentation modeling: chi-squared automatic interaction detection”, in Bagozzi, Richard P. (ed.). Studies of the segmentation of tourism markets have also applied CHAID, a more complex method.

Author: Aragul Shakara
Country: Togo
Language: English (Spanish)
Genre: Travel
Published (Last): 7 February 2007
Pages: 271
ISBN: 836-8-45591-639-7

Popular Decision Tree: CHAID Analysis, Automatic Interaction Detection

CHAID is one of the oldest tree classification methods, originally proposed by Kass. CHAID will “build” non-binary trees, i.e., trees in which more than two branches can attach to a single node. Tree algorithms of this kind can be applied to analyze both regression-type and classification-type problems.

This name derives from the basic algorithm that is used to construct (non-binary) trees, which for classification problems (where the dependent variable is categorical in nature) relies on the Chi-square test to determine the best next split at each step; for regression-type problems (where the dependent variable is continuous) the program will actually compute F-tests.
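For the regression case, the quantity being compared is an F statistic. The following is a minimal sketch of a one-way ANOVA F statistic in plain Python; the function name `f_statistic` and the sample data are illustrative assumptions, and turning the statistic into a p-value (which requires the F distribution) is left out.

```python
def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups of continuous responses.

    Assumes at least two groups and some variation within the groups,
    otherwise the within-group variance is zero and the ratio is undefined.
    """
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    # Variation of the group means around the grand mean.
    between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (between / df_between) / (within / df_within)


# Hypothetical responses for two predictor categories.
print(f_statistic([[1, 2, 3], [4, 5, 6]]))  # 13.5
```

A larger F value means the categories differ more strongly in their mean response, so a pair with a small F value is a candidate for merging.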

Specifically, the algorithm proceeds as follows. The first step is to create categorical predictors out of any continuous predictors by dividing the respective continuous distributions into a number of categories with an approximately equal number of observations.
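The binning step can be sketched as follows. This is one simple way to form equal-frequency categories, not necessarily how any particular CHAID program does it; the function name `equal_frequency_bins` and the income figures are made up for illustration.

```python
def equal_frequency_bins(values, n_bins):
    """Assign each value a bin index 0..n_bins-1 with roughly equal counts.

    Bins are formed by rank, so tied values can land in different bins;
    a production implementation would usually keep ties together.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


incomes = [12, 95, 40, 33, 58, 71, 20, 64]  # hypothetical data
print(equal_frequency_bins(incomes, 4))     # [0, 3, 1, 1, 2, 3, 0, 2]
```

Each of the four bins receives two of the eight observations, and the bin index then serves as the category of the new categorical predictor.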

For categorical predictors, the categories (classes) are “naturally” defined. The next step is to cycle through the predictors to determine for each predictor the pair of (predictor) categories that is least significantly different with respect to the dependent variable; for classification problems (where the dependent variable is categorical as well), it will compute a Chi-square test (Pearson Chi-square); for regression problems (where the dependent variable is continuous), F tests.
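For the classification case, the search for the least significantly different pair can be sketched in plain Python. The helper names (`chi2_sf`, `pair_p_value`, `least_different_pair`) and the counts are illustrative assumptions; the sketch treats the predictor as nominal, so any two categories may be compared, and assumes every class of the dependent variable occurs in at least one of the two categories being compared.

```python
import math
from itertools import combinations


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        return math.exp(-y) * sum(y ** i / math.factorial(i) for i in range(df // 2))
    tail = math.erfc(math.sqrt(y))
    for k in range(1, (df - 1) // 2 + 1):
        tail += math.exp(-y) * y ** (k - 0.5) / math.gamma(k + 0.5)
    return tail


def pair_p_value(row_a, row_b):
    """Pearson chi-square p-value for a 2 x k table of class counts."""
    col_totals = [a + b for a, b in zip(row_a, row_b)]
    n = sum(col_totals)
    stat = 0.0
    for row in (row_a, row_b):
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            stat += (observed - expected) ** 2 / expected
    return chi2_sf(stat, len(col_totals) - 1)


def least_different_pair(counts):
    """counts maps category -> class counts; returns (p_value, (cat_a, cat_b))."""
    return max((pair_p_value(counts[a], counts[b]), (a, b))
               for a, b in combinations(sorted(counts), 2))


# Hypothetical counts of (non-buyers, buyers) per income category.
counts = {"low": [30, 10], "mid": [28, 12], "high": [10, 30]}
p, pair = least_different_pair(counts)
print(pair, round(p, 3))
```

The pair with the largest p-value is the one whose class distributions are most alike, which makes it the first candidate for merging.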


If the respective test for a given pair of predictor categories is not statistically significant, as defined by an alpha-to-merge value, then it will merge the respective predictor categories and repeat this step, i.e., search again for the least significantly different pair among the categories that remain.
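The merge loop can be sketched as below. To keep the example short it assumes a binary dependent variable (so each test has one degree of freedom), a nominal predictor (any two categories may merge), and nonzero column totals; the function names and the counts are hypothetical.

```python
import math
from itertools import combinations


def pair_p_value(a, b):
    """Pearson chi-square p-value for two categories given as (count_0, count_1).

    Assumes a binary dependent variable, hence df = 1.
    """
    n = sum(a) + sum(b)
    stat = 0.0
    for row in (a, b):
        for j in (0, 1):
            expected = sum(row) * (a[j] + b[j]) / n
            stat += (row[j] - expected) ** 2 / expected
    return math.erfc(math.sqrt(stat / 2.0))


def merge_categories(counts, alpha_to_merge=0.05):
    """counts maps a tuple of original category names -> (count_0, count_1)."""
    groups = dict(counts)
    while len(groups) > 1:
        # Find the least significantly different pair (largest p-value).
        p, a, b = max((pair_p_value(groups[a], groups[b]), a, b)
                      for a, b in combinations(sorted(groups), 2))
        if p <= alpha_to_merge:
            break  # every remaining pair differs significantly
        merged = tuple(x + y for x, y in zip(groups.pop(a), groups.pop(b)))
        groups[a + b] = merged
    return groups


counts = {("low",): (30, 10), ("mid",): (28, 12), ("high",): (10, 30)}
print(merge_categories(counts))
```

With these counts, "low" and "mid" are merged because their difference is not significant, while "high" stays separate. For an ordinal predictor, the pair search would be restricted to adjacent categories.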

If the test for the respective pair of predictor categories is statistically significant (its p-value is less than the respective alpha-to-merge value), then optionally it will compute a Bonferroni adjusted p-value for the set of categories for the respective predictor.
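The text does not state the adjustment formula. A minimal sketch, assuming the multipliers commonly attributed to Kass's formulation: the p-value is multiplied by the number of ways in which the c original categories could have been reduced to the r merged groups. The function names are hypothetical.

```python
from math import comb, factorial


def bonferroni_multiplier(c, r, ordinal):
    """Number of ways to reduce c original categories to r merged groups."""
    if ordinal:
        # Only adjacent categories may merge: choose r - 1 cut points
        # among the c - 1 gaps between categories.
        return comb(c - 1, r - 1)
    # Nominal predictor: Stirling number of the second kind, S(c, r).
    total = sum((-1) ** i * comb(r, i) * (r - i) ** c for i in range(r + 1))
    return total // factorial(r)


def adjusted_p_value(p, c, r, ordinal=False):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * bonferroni_multiplier(c, r, ordinal))


print(bonferroni_multiplier(4, 2, ordinal=False))  # 7
print(bonferroni_multiplier(4, 2, ordinal=True))   # 3
```

The adjustment penalizes predictors with many original categories, since they offer more ways of producing a significant-looking grouping by chance.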

Selecting the split variable. The next step is to choose as the split variable the predictor with the smallest adjusted p-value, i.e., the predictor that yields the most significant split. Continue this process until no further splits can be performed, given the alpha-to-merge and alpha-to-split values.
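The selection step can be sketched as a small function. The name `select_split`, the default threshold, and the choice of `<=` for the comparison are assumptions made for illustration; the adjusted p-values are taken as given.

```python
def select_split(adjusted_p_values, alpha_to_split=0.05):
    """Return the predictor with the smallest adjusted p-value.

    Returns None if there are no candidates or if even the best candidate
    does not meet the alpha-to-split threshold, in which case the node is
    not split any further.
    """
    if not adjusted_p_values:
        return None
    best = min(adjusted_p_values, key=adjusted_p_values.get)
    return best if adjusted_p_values[best] <= alpha_to_split else None


# Hypothetical adjusted p-values for three predictors at one node.
print(select_split({"income": 0.003, "age": 0.04, "region": 0.6}))  # income
print(select_split({"age": 0.2}))                                   # None
```

Applying this function recursively to each child node, after re-running the merge step on the observations in that node, grows the tree until every node returns None.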

A modification of this basic algorithm, known as Exhaustive CHAID, merges more thoroughly. Specifically, the merging of categories continues (without reference to any alpha-to-merge value) until only two categories remain for each predictor. The algorithm then proceeds as described above in the Selecting the split variable step, and selects among the predictors the one that yields the most significant split.

For large datasets with many continuous predictor variables, this modification of the simpler CHAID algorithm may require significant computing time.


A general issue that arises when applying tree classification or regression methods is that the final trees can become very large. In practice, when the input data are complex and, for example, contain many different categories for classification problems, and many possible predictors for performing the classification, then the resulting trees can become very large.

This is not so much a computational problem as it is a problem of presenting the trees in a manner that is easily accessible to the data analyst, or for presentation to the “consumers” of the research.

However, it is easy to see how the use of coded predictor designs expands these powerful classification and regression techniques to the analysis of data from experimental designs. For classification-type problems (categorical dependent variable), all three algorithms can be used to build a tree for prediction.


QUEST is generally faster than the other two algorithms; however, for very large datasets, its memory requirements are usually larger, so using the QUEST algorithm for classification with very large input data sets may be impractical. CHAID will build non-binary trees that tend to be “wider”. CHAID often yields many terminal nodes connected to a single branch, which can be conveniently summarized in a simple two-way table with multiple categories for each variable or dimension of the table.

This type of display matches well the requirements for research on market segmentation; for example, it may yield a split on a variable Income, dividing that variable into 4 categories and groups of individuals belonging to those categories that are different with respect to some important consumer-behavior related variable.

As far as predictive accuracy is concerned, it is difficult to derive general recommendations, and this issue is still the subject of active research.


As a practical matter, it is best to apply different algorithms, perhaps compare them with user-defined, interactively derived trees, and decide on the most reasonable and best performing model based on the prediction errors. For a discussion of various schemes for combining predictions from different models, see, for example, Witten and Frank.