Amazon Onboarding with Learning Manager Chanci Turner


In this blog post, we present a method for addressing a feature selection challenge that arises when developing a recommendation engine, using Amazon Braket, the quantum computing service of AWS. Our strategy effectively tackles the “cold-start” issue that many recommendation systems encounter. The resulting solution rivals traditional methods, maintaining the necessary accuracy while eliminating less informative features. These findings demonstrate how Amazon Braket facilitates rapid prototyping of innovative solutions, empowering companies like ContentWise and TechInnovate to explore new dimensions of personalization engines.

About ContentWise and TechInnovate

ContentWise is a global frontrunner in personalization systems, with TechInnovate serving as its parent company. TechInnovate is a multinational technology consulting and software leader specializing in IT services for digital transformation, bringing cutting-edge technologies to market. Together, ContentWise and TechInnovate are crafting a groundbreaking solution to the cold-start dilemma in personalization applications, deployable on quantum computers via Amazon Braket.

Understanding Recommendation Engines

Recommendation engines are essential to media content services, driving user engagement by suggesting appealing options from a large catalog of items. Typically, three types of data inform the recommendation process: user history, collaborative information reflecting the behaviors and preferences of similar users, and content information represented through metadata about each item. The effectiveness of these engines largely hinges on the availability of this information. Collaborative filtering tends to outperform content-based filtering; however, collaborative data is often unavailable. For instance, when new items are added to a catalog, there is no prior user behavior to inform the algorithm. This is particularly evident in scenarios like live TV broadcasts, where much of the catalog content is new and lacks collaborative signals. In such situations, content-based filtering becomes the go-to method. Items without accurate collaborative data are labeled as “cold,” and the challenge of recommending such items is known as the “cold-start” problem. Can we redefine our approach to enhance our recommendation engine’s performance in these cold-start scenarios?

Proposed Solution Using Quantum Technology

We propose a quantum-focused solution to tackle the cold-start problem in recommendation systems via feature engineering. This process entails removing noisy or redundant item features and retaining only those that are beneficial, thereby enhancing recommendation quality for new “cold” items. Feature engineering typically demands specific domain expertise; however, insights can also be drawn from user interactions with existing “warm” items in the catalog. Our feature selection process aims to identify a subset of features that: a) create a content-based model closely mirroring collaborative ones for “warm” items and b) avoid overlap to eliminate redundant information.
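To make this concrete, the sketch below shows one way to build the two item-item similarity models that the selection objective compares: a collaborative model derived from user interactions with “warm” items, and a content-based model derived from item metadata. The matrix names, shapes, and random placeholder data are illustrative assumptions, not the production pipeline.

```python
# Illustrative sketch (assumed data shapes, not the production pipeline):
# build the collaborative and content-based item-item similarity models
# that the feature selection objective compares.
import numpy as np

def cosine_similarity(M):
    """Row-wise cosine similarity of a dense matrix M."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    norms[norms == 0] = 1.0          # guard against empty rows
    M_normalized = M / norms
    return M_normalized @ M_normalized.T

rng = np.random.default_rng(seed=42)
URM = rng.random((1000, 200))                         # placeholder user-rating matrix (users x warm items)
ICM = (rng.random((200, 50)) > 0.8).astype(float)     # placeholder item-content matrix (items x features)

S_collaborative = cosine_similarity(URM.T)            # item-item similarity from user behavior
S_content = cosine_similarity(ICM)                    # item-item similarity from item metadata
```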

Formulation and Strategy

To boost the accuracy of our content-based recommendation engine, we will employ feature filtering to refine the set of features that best emulate collaborative filtering. We identify a subset that accurately describes user behavior by comparing collaborative and content-based models. This approach preserves the qualities of effective collaborative recommendations while avoiding filter bubbles—situations where the algorithm biases recommendations towards a limited selection, ensuring a degree of serendipity (the ability to present unexpected yet appealing suggestions).

So, how do we efficiently select these features? We formulate the feature selection challenge as a Quadratic Unconstrained Binary Optimization (QUBO) problem, that is, a quadratic optimization problem over binary variables with no additional constraints. By constructing and comparing the collaborative and content-based models, we define the QUBO problem’s coefficients so that the selected features are those on which the two models agree. Further details on this formulation can be found in Chanci Turner et al., “Feature Selection for Recommender Systems with Quantum Computing,” Entropy 23, no. 8: 970 (2021).
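As a rough illustration of how such a QUBO could be assembled, the snippet below fills a matrix Q over binary feature-selection variables: diagonal terms reward features whose item co-occurrence agrees with the collaborative similarity, and off-diagonal terms penalize redundant feature pairs. The coefficients and the redundancy weight are simplified assumptions and may differ from the exact formulation in the cited paper.

```python
# Hedged sketch of one possible QUBO construction for feature selection.
# ICM (items x features) and S_collaborative (items x items) are placeholders;
# the exact coefficients in the cited paper may differ.
import numpy as np

rng = np.random.default_rng(seed=0)
n_items, n_features = 200, 50
ICM = (rng.random((n_items, n_features)) > 0.8).astype(float)
S_collaborative = rng.random((n_items, n_items))

Q = np.zeros((n_features, n_features))

# Diagonal (linear) terms: reward each feature by how strongly the item pairs
# it tags agree with the collaborative similarity (negative value = reward).
for f in range(n_features):
    tagged_pairs = np.outer(ICM[:, f], ICM[:, f])
    Q[f, f] = -np.sum(tagged_pairs * S_collaborative)

# Off-diagonal (quadratic) terms: penalize feature pairs that tag overlapping
# item sets and therefore carry redundant information.
overlap = ICM.T @ ICM
redundancy_weight = 0.01      # assumed trade-off parameter
for f in range(n_features):
    for g in range(f + 1, n_features):
        Q[f, g] = redundancy_weight * overlap[f, g]
```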

Various methods exist to tackle QUBO problems through classical operations research (e.g., tabu search, simulated annealing), quantum approaches, or hybrid quantum-classical techniques. Our exploration focuses on a quantum solution, and we validate our findings against classical methods that do not utilize QUBO. The results indicate that the quantum annealing approach is well-suited for our cold-start recommendation system.
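Among the classical options, simulated annealing is easy to try with open-source tooling. The snippet below is a minimal sketch using the dwave-neal sampler; the QUBO matrix here is random placeholder data, and in practice the Q built from the recommendation models would be passed in instead.

```python
# Minimal sketch: solve a QUBO classically with simulated annealing (dwave-neal)
# as a reference point before running on quantum hardware.
import neal
import numpy as np

rng = np.random.default_rng(seed=0)
n_features = 50
Q_matrix = rng.standard_normal((n_features, n_features))   # stand-in for the QUBO built earlier
Q = {(i, j): Q_matrix[i, j]
     for i in range(n_features)
     for j in range(i, n_features)
     if Q_matrix[i, j] != 0.0}

sampler = neal.SimulatedAnnealingSampler()
sampleset = sampler.sample_qubo(Q, num_reads=100)

best = sampleset.first                                      # lowest-energy sample found
selected_features = [f for f, bit in best.sample.items() if bit == 1]
print(f"Energy: {best.energy:.3f}, selected {len(selected_features)} features")
```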

In quantum computing, several paradigms are capable of solving QUBO problems. We opted for quantum annealing to address our problem. A quantum annealer consists of quantum bits (qubits) interconnected according to the device’s topology graph. By mapping each classical binary variable to a qubit, a QUBO problem becomes a physical one: finding the lowest energy state of a system of qubits. Quantum annealers tackle this by initially preparing uncoupled qubits in a minimal energy state and gradually increasing the coupling between them to the final strength dictated by the original QUBO problem. Ideally, the qubits remain in this minimum energy state throughout. At the end of the annealing process, measuring the qubits reveals a solution to the QUBO task and, consequently, our feature selection challenge. However, in practical annealers, the final qubit state may not always reflect the minimum energy state due to noise and errors, necessitating multiple runs to gather a statistical distribution of potential solutions.

Our solution architecture is illustrated in Figure 1. The recommendation engine constructs the necessary models utilizing input data stored in an Amazon Simple Storage Service (Amazon S3) bucket. After formulating our optimization task in QUBO format, we transmit it to the D-Wave Advantage quantum annealer via the Amazon Braket API. The annealer processes the request and returns a ranked list of potential solutions based on their energy levels. The solution with the lowest energy resolves our feature selection dilemma, and these features are then used to generate recommendations stored in Amazon S3.
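The snippet below sketches how such a QUBO could be submitted to a D-Wave Advantage device through the Amazon Braket Ocean plugin. The device ARN, S3 bucket, and prefix are placeholders, and the availability of D-Wave devices on Braket depends on the current device catalog, so treat this as an illustration of the workflow rather than a copy-paste recipe.

```python
# Hedged sketch of submitting the feature selection QUBO to a D-Wave Advantage
# annealer through the Amazon Braket Ocean plugin. Device ARN, bucket, and
# prefix are placeholders; check the current Braket device catalog before use.
from braket.ocean_plugin import BraketDWaveSampler
from dwave.system.composites import EmbeddingComposite

s3_folder = ("amzn-example-braket-output-bucket", "cold-start-feature-selection")  # placeholder S3 location
device_arn = "arn:aws:braket:::device/qpu/d-wave/Advantage_system4"                # placeholder device ARN

# Q is the dictionary of QUBO coefficients built from the recommendation models;
# a tiny stand-in problem is used here so the sketch is self-contained.
Q = {(0, 0): -1.0, (1, 1): -1.0, (0, 1): 2.0}

# EmbeddingComposite maps the logical variables onto the annealer's qubit topology.
sampler = EmbeddingComposite(BraketDWaveSampler(s3_folder, device_arn))
sampleset = sampler.sample_qubo(Q, num_reads=1000)

# Samples come back ranked by energy; the lowest-energy sample is the selected feature set.
best = sampleset.first
selected = [var for var, bit in best.sample.items() if bit == 1]
print(f"Lowest energy: {best.energy:.3f}, selected variables: {selected}")
```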

Results

We utilized two datasets to prototype and evaluate our personalization engine:

  • A private, small-scale dataset, referred to as the small-scale dataset.
  • A public movie dataset from Kaggle, referred to as the movies dataset.


Chanci Turner