The following section summarizes the methodology we used to select keywords for our study, extract data from Amazon’s search engine results pages (SERPs), and measure the strength of each ranking factor.
Our keyword selection combined suggestions from the Google Keyword Tool with manual pruning to ensure our search terms matched logical queries for Amazon.
To start, we identified the search nodes (or top-level categories) that we wanted to include in the study. To ensure our findings on Prime eligibility and the seller/vendor relationship were valid, we restricted our study to categories that sell physical products.
For each category, we manually selected 30 “head terms.” Next, we fed those 30 terms into the Google Keyword Tool to get 20 mid-to-long tail suggestions for each term.
We then skimmed the resulting list to remove informational queries (which wouldn’t reflect Amazon’s on-site search) as well as potential duplicates. In some categories, a relatively high number of duplicate keywords meant we could not meet our self-imposed quota of 20 mid-to-long tail suggestions per head term.
Though this was a painstaking process, it ensured we were working with the best keyword sample possible. Here is our keyword count for each category.
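The pruning step above was done by hand, but its logic can be sketched in code. The following is a minimal illustration, not the tooling used in the study: the list of informational openers, the function names, and the normalization rule are all assumptions made for the example.

```python
import re

# Hypothetical openers that suggest an informational query. The study
# pruned these by hand; this list is only an illustrative stand-in.
INFORMATIONAL_OPENERS = ("how to", "what is", "what are", "why", "when", "where", "who")


def normalize(keyword):
    """Lowercase and collapse whitespace so near-identical strings compare equal."""
    return re.sub(r"\s+", " ", keyword.strip().lower())


def is_informational(keyword):
    """Return True when a normalized keyword opens with a question-style phrase."""
    padded = keyword + " "  # pad so "why" does not match a word like "whynter"
    return any(padded.startswith(opener + " ") for opener in INFORMATIONAL_OPENERS)


def prune_suggestions(head_term, suggestions, quota=20):
    """Keep up to `quota` suggestions, dropping duplicates and informational queries."""
    seen = {normalize(head_term)}
    kept = []
    for raw in suggestions:
        keyword = normalize(raw)
        if keyword in seen or is_informational(keyword):
            continue
        seen.add(keyword)
        kept.append(keyword)
        if len(kept) == quota:
            break
    return kept
```

When duplicates are common, `prune_suggestions` returns fewer than `quota` keywords, which mirrors the shortfall described above for some categories.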
| Category | Keywords | Category | Keywords |
|---|---|---|---|
| Appliances | 543 | Home & Kitchen | 612 |
| Arts, Crafts & Sewing | 569 | Industrial & Scientific | 596 |
| Automotive | 551 | Luggage & Travel Gear | 578 |
| Baby | 558 | Movies & TV | 550 |
| CDs & Vinyl | 566 | Patio, Lawn & Garden | 581 |
| Cell Phones & Accessories | 582 | Pet Supplies | 560 |
| Clothing, Shoes & Jewelry | 460 | Sports & Outdoors | 594 |
| Collectibles & Fine Art | 332 | Tools & Home Improvement | 588 |
| Computers | 593 | Toys & Games | 602 |
| Grocery & Gourmet Food | 571 | Wine | 254 |
| Health & Personal Care | 598 | | |
Once our keywords were collected, we queried Amazon’s API to collect data on search results and products. We collected the data using Amazon’s relevance rank, the default sort order on the site. Each search query was assigned to its appropriate search node to bring back the correct results.
For each keyword, we retrieved 50 search results. In instances where we found fewer than 50 search results, we omitted the keyword from our study.
Enforcing a search result minimum ensured that every correlation was computed over the same number of results, so keywords with too few results to yield a reliable correlation did not influence our summary of data correlations and subsequent key findings.
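The minimum-results rule can be expressed as a simple filter. This is a sketch under assumed data shapes: `results_by_keyword` is a hypothetical mapping from each keyword to its ordered list of results, not a structure taken from the study.

```python
MIN_RESULTS = 50  # the per-keyword result count required by the study


def keep_complete_keywords(results_by_keyword, min_results=MIN_RESULTS):
    """Drop keywords with fewer than `min_results` results.

    `results_by_keyword` is assumed to map a keyword to its results in
    ranked order. Surviving lists are trimmed to the first `min_results`.
    """
    return {
        keyword: results[:min_results]
        for keyword, results in results_by_keyword.items()
        if len(results) >= min_results
    }
```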
Our primary method for measuring the strength of each ranking signal was the mean of Spearman correlation coefficients.
In statistics, Spearman's rank correlation coefficient is a nonparametric measure of statistical dependence between two variables.
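For reference, the standard textbook form of the coefficient is given here. The symbols are our own notation rather than the report's: for one keyword with $n$ results, $d_i$ is the difference between result $i$'s rank by SERP position and its rank by the value of the ranking factor. When no two results share a rank:

```latex
\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^{2}}{n\,(n^{2} - 1)}
```

When ties occur (for example, a yes/no factor), this shortcut no longer holds exactly, and the coefficient is instead computed as the Pearson correlation of the two sets of tie-averaged ranks. In this study, $n = 50$ for every keyword.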
Specifically, for each keyword we computed the Spearman correlation between each ranking factor and SERP position. Then, for each factor, we took the mean of those correlations across all keywords.
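The calculation can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the study's code: the function names and the input shape (one `(positions, factor_values)` pair per keyword) are hypothetical. Skipping keywords where a factor is constant is our own choice, since the coefficient is undefined there. The report does not say whether signs were flipped so that a positive value means "ranks better"; this sketch correlates against raw position, where 1 is the top result.

```python
from statistics import mean


def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of the positions they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at sorted index i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation, or None when either input has zero variance."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


def mean_spearman(serps):
    """Average the per-keyword Spearman coefficients for one ranking factor.

    `serps` is an assumed list of (positions, factor_values) pairs, one per
    keyword. Keywords with an undefined coefficient are skipped (assumption).
    """
    rhos = [spearman(positions, values) for positions, values in serps]
    rhos = [rho for rho in rhos if rho is not None]
    return mean(rhos) if rhos else None
```

Averaging per-keyword coefficients, rather than pooling all results into one correlation, keeps each keyword's SERP on its own scale, so a factor value that is high in one category is never compared directly with values from another.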
For information on the strength of each ranking factor, check out the correlations section of this report.