Our accuracy tests were designed to give insight into how well the system works in a hypermarket environment. To get a sample that is as representative of the population as possible, most of the test persons were hired through a third-party company. The tests were performed using mobile devices such as self-scanners and personal phones running Android or iOS. All the devices in this test were provided by tt2.
The aim of the test was to mimic natural shopping behaviour as closely as possible. Each test person conducted a series of “sessions” (mock shopping rounds). Each session starts at the entrance of the store; the user then walks to and scans various products before ending the session in the checkout area. A list of products randomly scattered throughout the store was provided for each session.
At each product scan we measured the distance between the item’s location according to a planogram and the position output by the tt2 system. To test different use cases of the tt2 system, the provided device showed either an empty screen or a blue dot on top of a map of the venue. Read more about the use cases and how they affect the system in the section “User modes”.
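The per-scan measurement described above can be sketched as a plain distance calculation. This is a minimal illustration, not the tt2 implementation: it assumes that both the planogram location and the system output are available as 2D coordinates in metres in the same floorplan frame, and the function name `sync_distance` is hypothetical.

```python
import math


def sync_distance(planogram_xy, estimated_xy):
    """Straight-line distance in metres between two 2D floorplan points.

    planogram_xy: (x, y) of the scanned product according to the planogram.
    estimated_xy: (x, y) reported by the positioning system at scan time.
    Both are assumed to share one coordinate frame (an assumption of this sketch).
    """
    dx = planogram_xy[0] - estimated_xy[0]
    dy = planogram_xy[1] - estimated_xy[1]
    return math.hypot(dx, dy)
```

A straight-line distance ignores shelves and walls between the two points. Whether the reported figures use a straight-line or a walkable distance is not stated here, so the choice above is only one possible reading.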
The data has been recorded at various hypermarkets in Europe. No external signals have been used. The data provided for the system to use is limited to:
● Signals from internal sensors
  - On Android devices we use the accelerometer and gyroscope.
  - On iOS devices we only use the accelerometer and gyroscope explicitly. However, in some instances the sensor values provided to us additionally use the magnetometer for error correction.
● A start location and angle.
● EAN codes from the scanned product.
● A floorplan and a planogram.
A sync is conducted when the test user has walked to a predefined item and scans the shelf label for that product. We measure the distance of the person, as positioned by the system, relative to the predefined location of that product. There are a few inaccuracies in this testing methodology which are not related to the accuracy of the system. The tt2 system measures the position of the individual holding the device. To get an accurate measurement, the distance from the actual location of the user to the output of tt2 would need to be measured. This is not possible with our method. Instead, we measure the location of the scanned article, according to a planogram, relative to the output of tt2. In addition:
● The planogram is only able to position a product on a shelf. The granularity of the position is therefore often 90 cm, the standard width of a shelf. This means that the product may not be positioned exactly where the planogram states.
● When performing the sync, the device is only within proximity of the product. The distance between the product’s location and the device can vary depending on the behaviour of the test person.
● In the end the system is designed to track the person walking, but the scan is performed by the device. A natural position to hold the device is a few decimetres away from the body.
We estimate that the inaccuracies arising from our testing methodology amount to approximately 1.5 m. To provide a result that is as accurate as possible without including errors from the testing methodology, we present a result which is 1.5 m less than the recorded sync distance. If the recorded distance is less than 1.5 m we set the error to 0 m, as it is indistinguishable from measurement error.
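The correction described above amounts to subtracting a fixed offset and clamping at zero. The sketch below illustrates that rule; the 1.5 m value comes from the text, while the names `METHODOLOGY_OFFSET_M` and `corrected_error` are hypothetical and chosen for this example only.

```python
METHODOLOGY_OFFSET_M = 1.5  # estimated methodology inaccuracy in metres (from the text)


def corrected_error(recorded_distance_m, offset_m=METHODOLOGY_OFFSET_M):
    """Return the reported error for one sync.

    The offset is subtracted from the recorded sync distance, and any
    result below zero is reported as 0 m.
    """
    if recorded_distance_m < 0:
        raise ValueError("a distance cannot be negative")
    return max(0.0, recorded_distance_m - offset_m)
```

For example, a recorded sync distance of 4.0 m is reported as 2.5 m, and a recorded distance of 0.8 m is reported as 0 m.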
Knowing where certain products are located can be used as an enhancing factor for our system. This allows the tt2 system to recalibrate if it detects any errors. Usually, retailers have information about their products’ locations, although they often run promotional offers, which makes this information somewhat unreliable. Warehouses, on the other hand, often have complete information about the placement of products.
We have chosen to present our result in two different configurations, one where we have full information about the placement of products, and one where we have none. The chosen configurations are the two extremes. The expected accuracy for a retailer is somewhere in this range, where the accuracy correlates with the correctness of the planogram.
All the results presented here are what we can achieve in real time; for some use cases this is the only accuracy that matters. However, there are use cases where a higher accuracy is achievable if real-time processing is not necessary. For everything related to statistics, the accuracy is expected to be higher.
Synchronisation at scan
No map mode
● The user has no map or navigation on their device.
● The user tends to be more careless with the device, putting it in their pocket, swinging it, and putting it aside in carts, trolleys, etc.
● This mode tends to be the most difficult for positioning systems.
Use cases:
● Gathering statistics.
● Sending location-based messages.
Map mode
● The tester is using the tt2 navigation system on their device to find their designated products.
● The user tends to have their device in front of them, and not in their pocket, by their side, or elsewhere.
● This mode tends to be more forgiving for positioning systems.
Use cases:
● Gathering statistics.
● Sending location-based messages.
● Finding locations.
● Route guidance (with directions relative to the mobile device).
Self scanner
● A handheld shopping device manufactured by Zebra Technologies.
Mobile phone
● A smartphone manufactured within the last 5 years. Either an Android or iOS device.
Median distance
● The median distance of all recorded syncs.
Average distance
● The average distance of all recorded syncs.
● The average distance is highly impacted by a few users with large errors*. Therefore, it might not be representative of the experience of most users.
Nr of test sessions
● A session is one shopping round. A session starts at the entrance, consists of a handful of scans and ends at the checkout location of that specific store. This measurement gives the number of shopping rounds conducted in this test set.
Nr of unique test persons
● The number of unique individuals included in this test set.
Average session duration
● The average duration of all test sessions, given in minutes.
Total nr. of measurements (scans)
● The total number of measurements/scans that have been recorded in this test set. For a scan to be included it must be of a new item. If, for example, a customer would like to purchase three of the same item, only the first scan is recorded.
Scans per minute
● The average number of measurements/scans per minute.
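The summary figures defined above can be computed from a list of scans as in the following sketch. It is an illustration under stated assumptions rather than the tt2 evaluation code: the `Scan` record, its field names and the `summarise` function are hypothetical, errors are assumed to be already corrected sync distances in metres, and scans per minute is assumed to mean the total number of scans divided by the total session time.

```python
import statistics
from dataclasses import dataclass


@dataclass
class Scan:
    session_id: str  # hypothetical identifier of the shopping round
    ean: str         # EAN code of the scanned product
    error_m: float   # corrected sync distance in metres


def unique_scans(scans):
    """Keep only the first scan of each EAN within a session."""
    seen = set()
    kept = []
    for scan in scans:
        key = (scan.session_id, scan.ean)
        if key not in seen:
            seen.add(key)
            kept.append(scan)
    return kept


def summarise(scans, session_minutes):
    """Compute the summary metrics for one test set.

    session_minutes maps each session id to its duration in minutes.
    """
    kept = unique_scans(scans)
    errors = [scan.error_m for scan in kept]
    total_minutes = sum(session_minutes.values())
    return {
        "median_m": statistics.median(errors),
        "average_m": statistics.mean(errors),
        "sessions": len(session_minutes),
        "avg_session_min": total_minutes / len(session_minutes),
        "scans": len(kept),
        "scans_per_min": len(kept) / total_minutes,
    }
```

With made-up errors of 0, 1, 2 and 40 m, the median is 1.5 m while the average is 10.75 m, which shows how a single lost position can dominate the average without affecting the median.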
* A fundamental flaw of inertial start systems is that they can deteriorate after a certain time or distance. When this happens, it can be hard to correct a lost position. These errors can have a very large impact on the average score even though they are quite rare. We believe that it is possible to reduce these errors by using features that we are soon to develop. The results presented here have none of our internally proposed features and are of the rawest form.