The Passio SDK can recognise anything from simple ingredients like blueberries and almonds to complex cooked dishes like a beef burrito with salad and fries.
There are two types of food recognition, each with its own strengths and weaknesses: Remote image recognition and the Local neural network model.
The Local model is good with single food items but struggles with complex cooked foods.
Its recognition time ranges between 50 and 300 ms, depending on the hardware of the device.
The Remote image recognition approach is good when the accuracy of the results is the top priority and waiting for the response is not an issue. This use case is implemented by taking static images from the camera or the library of the device and sending them for recognition asynchronously.
The Local model approach is good when speed is of the essence. This use case is implemented using continuous frame recognition from the camera. A callback is registered to capture the results as they are coming from the camera stream.
Remote image recognition
This API sends a base64-encoded image to an LLM on Passio's backend and returns a list of recognised items. By default, the function resizes the image to 512 pixels (the longer dimension is resized, and the other is calculated to keep the aspect ratio). Using the PassioImageResolution enum, the image can be resized to 512 or 1080 pixels, or kept at its original resolution.
```kotlin
val bitmap = loadBitmapFromAssets(assets, "image1.png")
PassioSDK.instance.recognizeImageRemote(bitmap) { result ->
    // display the list
}

data class PassioAdvisorFoodInfo(
    val recognisedName: String,
    val portionSize: String,
    val weightGrams: Double,
    val foodDataInfo: PassioFoodDataInfo?
)
```
The response, presented as a list of PassioAdvisorFoodInfo objects, contains the name, portion and weight in grams recognised by the LLM, as well as a PassioFoodDataInfo reference from Passio's nutritional database. To fetch the nutritional data for the PassioFoodDataInfo object, use the fetchFoodItemForDataInfo function.
To match the portion size of the PassioAdvisorFoodInfo object and the fetched PassioFoodItem, pass the weightGrams to the fetchFoodItemForDataInfo method. The SDK can also send 7 concurrent image requests to the backend, so multiple images can be processed in parallel.
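The resizing rule described for recognizeImageRemote (the longer side is scaled to the target, the other side follows from the aspect ratio) can be illustrated with a small calculation. This is only a sketch: `scaledSize` is a hypothetical helper, not an SDK function, and both the rounding mode and the choice to scale images that are already smaller than the target are assumptions.

```kotlin
import kotlin.math.roundToInt

// Hypothetical helper, not part of the Passio SDK: computes the output size
// when the longer side is scaled to `target` and the aspect ratio is kept.
fun scaledSize(width: Int, height: Int, target: Int): Pair<Int, Int> {
    require(width > 0 && height > 0 && target > 0) { "dimensions must be positive" }
    val scale = target.toDouble() / maxOf(width, height)
    // Rounding to the nearest pixel is an assumption; the SDK may truncate.
    val w = if (width >= height) target else (width * scale).roundToInt()
    val h = if (height >= width) target else (height * scale).roundToInt()
    return Pair(w, h)
}

fun main() {
    println(scaledSize(4000, 3000, 512)) // (512, 384)
    println(scaledSize(1080, 1920, 512)) // (288, 512)
}
```

A calculation like this can help estimate upload size before choosing between the 512, 1080 and original-resolution options.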
UI Example
Create a screen where the user can snap one or multiple images using the camera of the device.
When the user clicks next, recognizeImageRemote is invoked on each of the images in the list.
Wait for all of the responses to arrive, adding each list of results to a final list. When the last asynchronous call has completed, present the final list to the user.
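The "wait for all responses, then present one list" step can be sketched as follows. `recognizeAll` is a hypothetical helper, and its `recognize` parameter is a stand-in for any callback-based call such as recognizeImageRemote; the callback shape and result type are assumptions made so that the example runs without the SDK. The sketch does not throttle requests.

```kotlin
// Hypothetical helper, not part of the Passio SDK. `recognize` stands in for
// a callback-based recognition call; its shape is an assumption.
fun <I, R> recognizeAll(
    images: List<I>,
    recognize: (I, (List<R>) -> Unit) -> Unit,
    onAllDone: (List<R>) -> Unit
) {
    if (images.isEmpty()) {
        onAllDone(emptyList())
        return
    }
    val lock = Any()
    // One slot per image so the final list keeps the original image order.
    val perImage = MutableList<List<R>?>(images.size) { null }
    var pending = images.size

    images.forEachIndexed { index, image ->
        recognize(image) { results ->
            // Callbacks may arrive on different threads, so guard shared state.
            val finished = synchronized(lock) {
                perImage[index] = results
                pending -= 1
                pending == 0
            }
            if (finished) {
                onAllDone(perImage.flatMap { it ?: emptyList() })
            }
        }
    }
}

fun main() {
    // Fake recognizer used only for the demo: returns one label per image.
    val fake: (String, (List<String>) -> Unit) -> Unit = { image, callback ->
        callback(listOf("item from $image"))
    }
    recognizeAll(listOf("a.png", "b.png"), fake) { all ->
        println(all) // [item from a.png, item from b.png]
    }
}
```

Storing each response in its own slot keeps the final list in capture order even when responses arrive out of order.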
Local neural network model
To set up the local model and the continuous scanning mode, the camera preview and the recognition session need to be defined.
Camera preview
Before it can use camera detection, the app must obtain the user's permission to open the camera. This permission is not handled by the SDK.
Add the UI element that is responsible for rendering the camera frames, the PreviewView.
To start using the camera in your Activity/Fragment, implement the PassioCameraViewProvider interface. When a component implements this interface, the SDK uses it as the lifecycle owner of the camera (when the component calls onPause(), the camera stops) and as the provider of the Context needed to start the camera. The component implementing the interface must have a PreviewView in its view hierarchy.
Start by adding the PreviewView to your view hierarchy in your layout.xml.
This approach is more manual but gives you more flexibility. You need to implement the PassioCameraViewProvider interface and supply the needed LifecycleOwner and the PreviewView added in the initial step.
After the user has granted permission to use the camera, start the SDK camera:
```kotlin
override fun onStart() {
    super.onStart()
    if (!hasPermissions()) {
        ActivityCompat.requestPermissions(this, REQUIRED_PERMISSIONS, REQUEST_CODE_PERMISSIONS)
        return
    } else {
        PassioSDK.instance.startCamera(this /* reference to the PassioCameraViewProvider */)
    }
}
```
Using the PassioCameraFragment
PassioCameraFragment is an abstract class that handles the camera permission at runtime and starts the camera process of the SDK. To use it, extend PassioCameraFragment in your own fragment and supply the PreviewView that was added to the view hierarchy in the previous step.
```kotlin
class MyFragment : PassioCameraFragment() {

    override fun getPreviewView(): PreviewView {
        return myPreviewView
    }

    override fun onCameraReady() {
        // Proceed with initializing the recognition session
    }

    override fun onCameraPermissionDenied() {
        // Explain to the user that the camera is needed for this feature to
        // work and ask for permission again
    }
}
```
The SDK can detect 3 different categories: VISUAL, BARCODE and PACKAGED. The VISUAL recognition is powered by Passio's neural network and is used to recognize over 4000 food classes. BARCODE, as the name suggests, can be used to scan a barcode located on a branded food. Finally, PACKAGED can detect the name of a branded food. To choose one or more types of detection, a FoodDetectionConfiguration object is defined and the corresponding fields are set. The VISUAL recognition works automatically.
The type of food detection is defined by the FoodDetectionConfiguration object. To start the Food Recognition process a FoodRecognitionListener also has to be defined. The listener serves as a callback for all the different food detection processes defined by the FoodDetectionConfiguration. When the app is done with food detection, it should clear out the listener to avoid any unwanted UI updates.
```typescript
const config: FoodDetectionConfig = {
  /**
   * Detect barcodes on packaged food products. Results will be returned
   * as `BarcodeCandidates` in the `FoodCandidates` property of `FoodDetectionEvent`
   */
  detectBarcodes: true,
  /**
   * Results will be returned as `DetectedCandidate` in the `FoodCandidates`
   * property of `FoodDetectionEvent`
   */
  detectPackagedFood: true,
};

useEffect(() => {
  if (!isReady) {
    return;
  }
  const subscription = PassioSDK.startFoodDetection(
    config,
    async (detection: FoodDetectionEvent) => {
      const { candidates, nutritionFacts } = detection;
      if (candidates?.barcodeCandidates?.length) {
        // show barcode candidates to the user
      } else if (candidates?.packagedFoodCode?.length) {
        // show OCR candidates to the user
      } else if (candidates?.detectedCandidates?.length) {
        // show visually recognized candidates to the user
      }
    },
  );
  // stop food detection when component unmounts
  return () => subscription.remove();
}, [isReady]);
```
Add a call to startFoodDetection() and register a FoodRecognitionListener.
The FoodCandidates object that is returned in the recognition callbacks contains three lists:
detectedCandidates detailing the result of VISUAL detection
barcodeCandidates detailing the result of BARCODE detection
packagedFoodCandidates detailing the result of PACKAGED detection
Only the corresponding candidate lists will be populated (e.g. if you define detection types VISUAL and BARCODE, you will never receive a packagedFoodCandidates list in this callback).
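A listener typically has to decide which of the three lists to act on. The sketch below shows one way to do that, checking barcode results first, then packaged food, then visual results. `FoodCandidatesSketch` is a stand-in type with plain strings in place of the SDK's candidate classes, and whether an unused list arrives as null or as an empty list is an assumption, so the code handles both.

```kotlin
// Stand-in type for illustration only; the real SDK classes carry more data
// and use dedicated candidate types rather than strings.
data class FoodCandidatesSketch(
    val detectedCandidates: List<String>? = null,
    val barcodeCandidates: List<String>? = null,
    val packagedFoodCandidates: List<String>? = null
)

// Picks the first available result, preferring barcode, then packaged, then visual.
fun describe(candidates: FoodCandidatesSketch): String {
    val barcode = candidates.barcodeCandidates?.firstOrNull()
    val packaged = candidates.packagedFoodCandidates?.firstOrNull()
    val visual = candidates.detectedCandidates?.firstOrNull()
    return when {
        barcode != null -> "barcode: $barcode"
        packaged != null -> "packaged: $packaged"
        visual != null -> "visual: $visual"
        else -> "nothing detected"
    }
}

fun main() {
    // Sample values are made up for the demo.
    println(describe(FoodCandidatesSketch(detectedCandidates = listOf("apple"))))
    println(describe(FoodCandidatesSketch(barcodeCandidates = emptyList())))
}
```

The priority order is a design choice for this sketch; an app could equally show all populated lists at once.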
Visual detection
A DetectedCandidate represents the result from running Passio's neural network, specialized for detecting foods like apples, salads, burgers etc. The properties of a detected candidate are:
name
passioID (unique identifier used to query the nutritional database)
confidence (measure of how accurate the candidate is, ranges from 0 to 1)
boundingBox (a rectangle detailing the bounds of the recognised item within the image dimensions)
alternatives (list of alternative foods that are visually or contextually similar to the recognised food)
croppedImage (the image that the recognition was run on)
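Because confidence ranges from 0 to 1, a common pattern is to drop weak candidates before showing results. The following is a minimal sketch under stated assumptions: `CandidateSketch` mirrors only three of the properties listed above and is not the SDK class, the numeric type of confidence is assumed, and the 0.5 threshold is an arbitrary illustrative value.

```kotlin
// Stand-in mirroring a subset of the listed properties; not the SDK class.
data class CandidateSketch(
    val name: String,
    val passioID: String,
    val confidence: Float
)

// Keeps candidates at or above `minConfidence`, most confident first.
// The default threshold is illustrative, not an SDK recommendation.
fun confidentCandidates(
    candidates: List<CandidateSketch>,
    minConfidence: Float = 0.5f
): List<CandidateSketch> =
    candidates
        .filter { it.confidence >= minConfidence }
        .sortedByDescending { it.confidence }

fun main() {
    // Names, IDs and scores are made up for the demo.
    val sample = listOf(
        CandidateSketch("salad", "id-salad", 0.67f),
        CandidateSketch("pear", "id-pear", 0.32f),
        CandidateSketch("apple", "id-apple", 0.91f)
    )
    println(confidentCandidates(sample).map { it.name }) // [apple, salad]
}
```

A suitable threshold depends on the app; a lower value shows more suggestions at the cost of more false positives.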