The FindWAtt Blog

Google Shopping, Product Feed Management & Product Data Optimization

Context is king — object-oriented analysis is the future

Tim Gilbert 2017-09-20

The biggest difference between a computer analyzing a title and a human analyzing a title is context.

Human minds operate as pattern recognition engines. In everything we do, we are constantly searching for the underlying order in the world and circumstances around us. Sometimes our instinctual need to find a pattern to explain everything actually creates problems when we invent patterns out of randomness. Optical illusions and superstition are both examples where we see patterns that aren’t actually there.

Computers have the opposite problem. They ignore context and patterns unless specifically instructed otherwise, and it takes considerable effort to get them to “learn” from what they have seen before.

The end goal is to increase accuracy and save human effort

The two worst things a quality analysis can do are miss potential errors and flag non-issues. Unfortunately, every extra check you program to find new errors also has the potential to flag something as an error when it isn't.

Consider the previous titles. Is “Mulege” a misspelled word or is it a properly spelled trademark? A simple program would use a dictionary file, discover that “Mulege” isn’t present, and mark it as a spelling problem. A human would look at the titles around it, or look for where else “Mulege” appears in the data feed and determine that it only shows up in the products of a single brand, and in a consistent pattern with other words that aren’t in the dictionary either, and conclude that the word isn’t misspelled but is probably a product line.

To analyze potential problems in context requires a different, non-linear approach.

Typically a program runs through a list of products one at a time, analyzing each one separately. It will not consider what it has seen in previous titles to help it with the next one, nor take what it learns in the current title and use that to filter out false-positive errors clarified by new information.
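To make the limitation concrete, here is a minimal sketch of that one-at-a-time approach. The tiny `DICTIONARY` set, the `check_title` helper, and the sample titles are all made up for illustration; a real checker would load a full dictionary file.

```python
# Hypothetical stand-in for a real dictionary file.
DICTIONARY = {"mens", "womens", "leather", "sandal", "boot"}


def check_title(title):
    """Return the words in one title that are missing from the dictionary."""
    return [word for word in title.lower().split() if word not in DICTIONARY]


# Made-up sample titles.
titles = [
    "Mulege Mens Leather Sandal",
    "Mulege Womens Leather Sandal",
    "Mens Lether Boot",
]

# Each title is judged in isolation; nothing carries over between iterations.
flags = {title: check_title(title) for title in titles}

for title, words in flags.items():
    print(title, "->", words)
```

Both “mulege” and “lether” come back as spelling problems, because the loop has no way to notice that one of them repeats consistently and the other does not.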

The solution to this problem is an object-oriented approach where each item is kept in memory during the analysis and grouped together with others that share the same problem indicator. Then, after each title is analyzed separately, the program runs through each group looking for patterns common to its products as evidence to prove or disprove its theory that a problem actually exists.
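One way to sketch this two-pass idea is shown below. The `Product` and `IssueGroup` classes, the brand names, and the rule inside `is_confirmed` are assumptions made for illustration, not a description of any particular production system.

```python
from dataclasses import dataclass, field


@dataclass
class Product:
    brand: str
    title: str


@dataclass
class IssueGroup:
    indicator: str  # the shared problem indicator, e.g. an unknown word
    products: list = field(default_factory=list)

    def is_confirmed(self):
        """Hypothetical rule: an indicator shared by several products of a
        single brand is dismissed; anything else stays flagged."""
        brands = {p.brand for p in self.products}
        return not (len(self.products) > 1 and len(brands) == 1)


def analyze(products, dictionary):
    groups = {}
    # Pass 1: analyze each title separately, keeping flagged products
    # in memory, grouped by the indicator they share.
    for product in products:
        for word in set(product.title.lower().split()):
            if word not in dictionary:
                groups.setdefault(word, IssueGroup(word)).products.append(product)
    # Pass 2: review each group as a whole for evidence for or against.
    return {ind: group.is_confirmed() for ind, group in groups.items()}


# Made-up sample data.
dictionary = {"mens", "womens", "leather", "sandal", "boot"}
products = [
    Product("Acme", "Mulege Mens Leather Sandal"),
    Product("Acme", "Mulege Womens Leather Sandal"),
    Product("Other", "Mens Lether Boot"),
]
results = analyze(products, dictionary)
print(results)
```

With this sample data, “mulege” is dismissed as a non-issue while “lether” remains flagged for human review. A real implementation would weigh more kinds of evidence in the second pass than this single brand rule.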

For example, if we find a set of 6 products that all share the phrase “engagement setting” and one product has the phrase “this engagement setting is a” in its description, the program can guess that “engagement setting” could be a product type that it hasn’t seen before.
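A rough sketch of that check follows, assuming products arrive as `(title, description)` pairs. The `candidate_product_types` function, the two-word phrase limit, and the sample titles are illustrative assumptions; the threshold of 6 simply mirrors the example above.

```python
import re
from collections import Counter


def candidate_product_types(products, min_shared=6):
    """Return two-word phrases shared by at least `min_shared` titles that
    some description also introduces with "this <phrase> is a/an"."""
    counts = Counter()
    for title, _ in products:
        words = title.lower().split()
        # Count each two-word phrase once per title.
        counts.update({" ".join(pair) for pair in zip(words, words[1:])})

    candidates = []
    for phrase, shared in counts.items():
        if shared < min_shared:
            continue
        pattern = re.compile(r"\bthis " + re.escape(phrase) + r" is an?\b")
        if any(pattern.search(desc.lower()) for _, desc in products):
            candidates.append(phrase)
    return sorted(candidates)


# Made-up sample data.
products = [
    ("Halo Engagement Setting", "This engagement setting is a classic halo design."),
    ("Solitaire Engagement Setting", ""),
    ("Vintage Engagement Setting", ""),
    ("Pave Engagement Setting", ""),
    ("Cathedral Engagement Setting", ""),
    ("Bezel Engagement Setting", ""),
]
found = candidate_product_types(products)
print(found)
```

The result is only a guess for a human to confirm, which matches the spirit of the example: shared phrasing plus one supporting description is evidence, not proof.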

Or if we see a word that could be a trademark, but it is spelled in slightly different ways across multiple titles, then at least some of those spellings are likely to be misspellings.
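The spelling-variant case could be sketched with Python's standard `difflib` module. The `likely_misspellings` function, the 0.8 similarity threshold, and the sample words are assumptions, and so is the idea that the most frequent spelling is the intended one; in practice a human would still make the final call.

```python
from collections import Counter
from difflib import SequenceMatcher


def likely_misspellings(words, threshold=0.8):
    """Map each rare spelling to the more common spelling it resembles.

    Assumes (for illustration) that the most frequent variant is correct.
    """
    counts = Counter(word.lower() for word in words)
    ranked = [word for word, _ in counts.most_common()]
    suspects = {}
    for i, common in enumerate(ranked):
        if common in suspects:
            continue  # already treated as a variant of something else
        for rare in ranked[i + 1:]:
            if rare in suspects:
                continue
            if SequenceMatcher(None, common, rare).ratio() >= threshold:
                suspects[rare] = common
    return suspects


# Made-up candidate trademark words collected from many titles.
words = ["Mulege"] * 4 + ["Mulage", "Muleje"] + ["Baja"] * 2
suspects = likely_misspellings(words)
print(suspects)
```

Here the two rare variants are paired with the dominant spelling “mulege”, while “baja” is left alone because nothing else resembles it.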

By adding this extra step in the analysis, the program can do a better job of ignoring false positives and highlighting real errors for human review.

To evaluate the quality of your own titles and get recommendations for how to optimize them, check out our Free Shopping Feed Product Title Evaluation