Positioning a product relative to the competition and identifying a product’s key features and attributes are among the central issues facing marketing, promotion, and product managers. Emphasizing a product’s key advantages and features helps adapt the promotional message to the expectations of potential customers, and knowing the strengths of your competition makes it easier to plan an effective promotional strategy.
The question of how customers perceive brands immediately brings the perceptual map to mind. It belongs to a broad set of visualisations built with techniques such as correspondence analysis, discussed earlier. These work well with qualitative variables, but they handle maps based on means just as well. Another option, very popular among researchers, is principal component analysis. Today, I will discuss a further positioning technique based on perceptual maps: multidimensional scaling using the ALSCAL algorithm.
Distance and the distance matrix
Multidimensional scaling uses a distance matrix – with the analysed attributes or objects serving as the input data. Note that distance can be thought of as the reverse of correlation (the basis of factor analysis) or of concurrence: the more similar two objects are, the smaller the distance between them. Put simply, distance is the difference between objects calculated from the variables available for analysis. For example, you can calculate the age distance between two people as the simple difference in their number of years.
If two variables are considered, the distance can be calculated using various formulas, the most popular being the Euclidean distance. It is calculated as the square root of the sum of squared differences between the objects in the individual dimensions (variables). Have a look at the picture below. The squared distance between objects A and B is the sum of the squared differences in the X dimension (16) and the Y dimension (36), i.e., 52; the distance itself is the square root of 52. With more than two variables or dimensions, the distance is calculated the same way: the squared differences in each dimension are summed, and the square root is taken.
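The calculation above can be sketched in a few lines of Python. The function name and the coordinates are illustrative; the points are chosen so that the squared differences are 16 and 36, matching the picture's values.

```python
import math

# Euclidean distance between two objects described by several variables:
# square the difference in each dimension, sum the squares, take the root.
def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Points chosen so the squared differences are 16 (X) and 36 (Y):
A = (0, 0)
B = (4, 6)
print(euclidean_distance(A, B))  # square root of 52 ≈ 7.211
```

The same function works unchanged for any number of dimensions, since `zip` pairs up however many coordinates the objects have.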
This is how distance is calculated for quantitative variables, but it can be calculated for ordinal variables as well, using the so-called city block distance. Values of ordinal variables express ranks that order the objects, and the distance between two ranks is simply their difference. Note that ranks are not metric, so multiple ordinal variables cannot be treated as quantitative variables. For ranks, the absolute differences between positions are summed across all the variables (dimensions) used; the final distance is the sum of these component distances. If the illustration above showed differences in ranks (such as standings after a sports competition), the distance between A and B would be 4 (difference in discipline X) + 6 (difference in discipline Y) = 10.
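The city block (Manhattan) distance is, if anything, even simpler to compute. A minimal sketch, with illustrative ranks chosen to reproduce the 4 + 6 = 10 example:

```python
# City block distance: sum of absolute rank differences across variables.
def city_block_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

# Ranks of A and B in two disciplines; differences of 4 and 6 give 10.
rank_A = (1, 2)
rank_B = (5, 8)
print(city_block_distance(rank_A, rank_B))  # 4 + 6 = 10
```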
Remember that for a difference between values to be called a distance, it has to meet certain criteria. They are:
- it is never negative, and it equals 0 only when an object is compared with itself;
- it is symmetric: the distance from A to B equals the distance from B to A;
- it satisfies the triangle inequality: the distance from A to C is never greater than the distance from A to B plus the distance from B to C.
The distance matrix used by the multidimensional scaling algorithms is the set of distances between all pairs of objects. The diagonal of the matrix is 0, which is the distance from an object to itself. The greater the value in the matrix, the greater the distance between the compared objects.
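Building such a matrix from raw coordinates is straightforward. A minimal sketch (function name and points are illustrative; `math.dist` computes the Euclidean distance):

```python
import math

def distance_matrix(objects):
    # Pairwise Euclidean distances; the diagonal is 0 by definition,
    # and the matrix is symmetric.
    n = len(objects)
    return [
        [math.dist(objects[i], objects[j]) for j in range(n)]
        for i in range(n)
    ]

points = [(0, 0), (4, 6), (3, 0)]
for row in distance_matrix(points):
    print([round(d, 3) for d in row])
```

Only the upper (or lower) triangle carries information, which is why many tools accept a condensed, triangular form of the matrix.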
Fortunately, the multidimensional scaling algorithm has an option to use raw data and compute the matrix, which makes data exploration much more convenient. This post will focus on that very option. Other available functions will be discussed in future texts on multidimensional scaling.
Brand positioning using the ALSCAL algorithm: Basic settings
The aim of multidimensional scaling is to simplify the picture of distances between the analysed objects. Distances expressed across many variables should be reflected as faithfully as possible on a two-dimensional perceptual map: similar objects should lie close to each other, and objects assessed as different should lie far apart.
Let’s take a look at an example. A US self-defence equipment manufacturer researched how customers perceive its products. Respondents scored, on a scale of 1 to 10, how well certain statements fit individual pieces of equipment. Additionally, they assessed a model – a hypothetical product with the most desired set of features. The scores were multiplied by 100 to yield a percentage scale. The table below, which contains mean scores, is the dataset to be analysed: rows contain statements about features, and columns show the evaluated objects or brands. This arrangement may be counter-intuitive (objects are usually in rows and features in columns), but it is more convenient here. Scaling can be performed on either rows or columns, so the data arrangement does not matter to the algorithm; it just has to be declared appropriately in the analysis options.
The table shows how well, measured as X%, respondents judge a certain feature to fit the brand in question. For example, brand A scores 80% for instinctive activation and 100% for difficult to neutralise; brand B meets both criteria at 100%. Calculating the Euclidean distance between brands is intuitive: square the differences in each row, sum them, and take the square root. For example, the distance between brands A and B is 70.711, and the distance between A and C is 128.062. Brand B is clearly closer to brand A than brand C is. Distances between all the objects can be calculated the same way.
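The same calculation applies column-wise to a table of mean scores. A small sketch – the score table here is hypothetical (the article's full table is not reproduced), so the resulting number differs from the distances quoted above; only the mechanics matter:

```python
import math

# Hypothetical mean scores (%); keys are brands, each list holds that
# brand's scores on the same set of statements (table rows).
scores = {
    "A": [80, 100, 60],
    "B": [100, 100, 90],
}

def brand_distance(x, y):
    # Square the per-row differences, sum, take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

print(round(brand_distance(scores["A"], scores["B"]), 3))  # ≈ 36.056
```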
The multidimensional scaling procedure has several steps. First, a distance matrix for the objects is calculated. Next, the distances are optimised (multiplied by an optimally selected value). Then, the algorithm decomposes the distance matrix into vectors and singular values; this identifies the two dimensions that best reflect the initial distances, and from them a coordinate system is generated in which points representing the analysed objects are placed. Once coordinates are ready, the algorithm iteratively adjusts the solution, at each step comparing the scaled distance matrix with the matrix reconstructed from the objects’ coordinates, until the best map representation is achieved.
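The steps above can be sketched with classical (Torgerson) scaling, which also recovers coordinates by decomposing a transformed distance matrix. This is a simplified relative of ALSCAL, not the ALSCAL algorithm itself, and the brand score table below is hypothetical:

```python
import numpy as np

# Hypothetical brand columns: each list holds one brand's mean scores
# on the same set of statements.
brands = {
    "A": [80, 100, 60, 70],
    "B": [100, 100, 90, 75],
    "C": [20, 40, 30, 10],
    "D": [30, 45, 25, 20],
}
names = list(brands)
X = np.array([brands[n] for n in names], dtype=float)

# Step 1: Euclidean distance matrix between brands.
D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))

# Steps 2-3: double-centre the squared distances and decompose;
# the two largest eigenvalues yield the 2-D map coordinates.
n = len(names)
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ (D ** 2) @ J
eigvals, eigvecs = np.linalg.eigh(B)
order = np.argsort(eigvals)[::-1][:2]
coords = eigvecs[:, order] * np.sqrt(np.maximum(eigvals[order], 0))

for name, (x, y) in zip(names, coords):
    print(f"{name}: ({x:8.2f}, {y:8.2f})")
```

On the resulting map, A and B land near each other and C and D form their own cluster, mirroring how perceived similarity translates into proximity.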
Now, to its practical use. The ALSCAL procedure is found in the menu ANALYSE -> SCALE -> MULTIDIMENSIONAL SCALING (ALSCAL).
After selecting the analysed variables (here, each variable represents the scores of one brand), define the data layout in the Distances section. The Data are distances option allows you to use a ready-made distance matrix (of any shape), and the Create distances from data option prepares the matrix from raw data.
In the Create distances from data window, select the distance measure, standardisation options, and whether distances between cases (rows) or variables (columns) should be compared.
The next window, accessed with the Model button, sets the detailed parameters of the model. These mostly concern data supplied as complex distance matrices. If you have raw data, just select the level of measurement for the variables; for the brands in question, it is Ratio.
The Options window controls which results are displayed and sets the convergence criteria for the algorithm. To obtain a perceptual map and a measure of fit, select Group plots and Data matrix in the Display section.
Multidimensional scaling: Results interpretation and model diagnostics
Using the perceptual map, you can identify groups of products that the respondents believe to have similar features.
When interpreting perceptual maps resulting from multidimensional scaling, focus on interpreting distances between objects. You can see that brands C and D are perceived similarly. Brands A, B, and H also form a separate, clearly distinct group, which is close to the perfect brand. Other products are unrelated and far from each other on the map. They are hard to classify into a single cluster. Additionally, if you know the features that differentiate the brands, you can try to interpret the dimensions the objects are scattered in.
The model can be generally diagnosed by assessing the relationship between distances from coordinates and actual distances between objects.
RSQ is the R-squared we know from linear regression: the square of the correlation between reconstructed distances and actual distances. The closer it is to 1, the better; its value can be interpreted as the percentage of variability accounted for by the model. The STRESS value works the other way: it measures lack of fit and ranges from 0 to 1, so the lower, the better. Stress is the square root of the sum of squared differences between actual and reconstructed distances, divided by the sum of squared actual distances. Because Stress is calculated from distances and RSQ from the correlation between them, the two measures do not add up to 1.
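Both measures are easy to compute from the two sets of distances. A minimal sketch (function name and distance values are illustrative; this uses the Stress formula as described above, while ALSCAL itself reports a variant computed on squared distances):

```python
import numpy as np

def stress_and_rsq(d_actual, d_model):
    d = np.asarray(d_actual, dtype=float)
    dh = np.asarray(d_model, dtype=float)
    # Stress: root of squared residuals over squared actual distances
    # (lower is better, 0 means a perfect fit).
    stress = np.sqrt(((d - dh) ** 2).sum() / (d ** 2).sum())
    # RSQ: squared correlation between actual and model distances.
    rsq = np.corrcoef(d, dh)[0, 1] ** 2
    return stress, rsq

# Illustrative actual vs. reconstructed distances for six object pairs.
actual = [36.4, 108.2, 96.2, 133.5, 123.2, 15.8]
model  = [35.0, 107.0, 97.5, 131.0, 125.0, 16.5]
s, r = stress_and_rsq(actual, model)
print(round(s, 4), round(r, 4))
```

A perfect reconstruction gives Stress = 0 and RSQ = 1; note that, as the text explains, the two values generally do not sum to 1.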
This is where we conclude the general presentation of multidimensional scaling and the basics of model interpretation. Future posts on this technique will focus on more advanced options and detailed diagnostics.
This blog is devoted to data collection and analysis with articles that aim to inspire data analysts from across the business world, academia and public sector. Our articles endeavor to inform, educate and entertain with one goal in mind: to show how to transform data into clear, attractive and usable information. We invite you to read and share.