Post by account_disabled on Feb 27, 2024 4:49:55 GMT -5
By counting how often each tag was used on the site, we could bias our final set of trusted tags in favor of the more popular terms.

Benefits
This was a great tiebreaker metric when we had two very similar tags but needed to choose just one. For example, sometimes two variants of a phrase were both completely acceptable, such as a version with and a version without a hyphen. We could simply defer to the one with the higher tag count.

Limitations
The clear limitation of tag frequency is that many of the most frequent tags were too generic to be useful.
The tag "blue" isn't particularly useful when it just helps people find blue t-shirts. The term is too generic and too competitive to warrant inclusion. Additionally, including too broad a tag would simply create a very large crawl vs. traffic-potential ratio. A common tag will have hundreds if not thousands of matching products, creating many pages of products for the single tag. If a tag produces paginated product listings but has little potential to drive visitors over a year, it might not be worth it.

Porter stemming

Method
Stemming is a method used to identify the root word from a tag by scanning the word right to left and using various pattern-matching rules to remove characters (suffixes) until you arrive at the word's stem.
There are a couple of popular stemmers available, but we found Porter stemming useful as a tool for seeing alternative word forms. You can geek out by looking at the Porter stemming algorithm in Snowball, or you can play with a JS version.

Benefits
Plural and possessive terms can be grouped by their stem for further analysis. Running Porter stemming on the terms "pony" and "ponies" will return "poni" as the stem, which can then be used to group the terms. You can also run Porter stemming on phrases.
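As a rough illustration of grouping tags by stem, here is a minimal Python sketch. It is not the full Porter algorithm: simple_stem is a hypothetical, deliberately simplified stand-in that applies only Porter's plural rules (step 1a) and the y-to-i rule (step 1c). The possessive stripping and the word-by-word handling of phrases are assumptions made for this example, not something the post specifies. For real work, an established stemmer implementation would be the safer choice.

```python
from collections import defaultdict

VOWELS = set("aeiou")


def _has_vowel(stem: str) -> bool:
    """True if the stem contains a vowel; 'y' counts when it follows a consonant."""
    for i, ch in enumerate(stem):
        if ch in VOWELS:
            return True
        if ch == "y" and i > 0 and stem[i - 1] not in VOWELS:
            return True
    return False


def simple_stem(word: str) -> str:
    """Simplified stand-in for Porter stemming (steps 1a and 1c only)."""
    w = word.lower()
    # Assumption: strip possessive markers first; classic Porter has no apostrophe rule.
    for suffix in ("'s", "'"):
        if w.endswith(suffix):
            w = w[: -len(suffix)]
            break
    # Step 1a: plural suffixes (sses -> ss, ies -> i, ss -> ss, s -> "").
    if w.endswith("sses"):
        w = w[:-2]
    elif w.endswith("ies"):
        w = w[:-2]
    elif w.endswith("ss"):
        pass
    elif w.endswith("s"):
        w = w[:-1]
    # Step 1c: a final y becomes i when the rest of the word contains a vowel.
    if w.endswith("y") and _has_vowel(w[:-1]):
        w = w[:-1] + "i"
    return w


def stem_phrase(phrase: str) -> str:
    """Assumption: a phrase is stemmed word by word, split on whitespace."""
    return " ".join(simple_stem(token) for token in phrase.split())


def group_by_stem(tags):
    """Map each stemmed form to the list of original tags that share it."""
    groups = defaultdict(list)
    for tag in tags:
        groups[stem_phrase(tag)].append(tag)
    return dict(groups)


tags = ["pony", "ponies", "pony's", "blue pony", "blue ponies", "t-shirt"]
groups = group_by_stem(tags)
for stem, members in groups.items():
    print(stem, members)
```

Each group can then be reviewed as a set of variants of one term, for instance by keeping the variant with the highest tag count.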