Food recognition
The Passio SDK can recognise anything from simple ingredients like blueberries and almonds to complex cooked dishes like a beef burrito with salad and fries.
There are two types of food recognition, each with its own strengths and weaknesses.
|  | Remote image recognition | Local neural network model |
| --- | --- | --- |
| Works offline? | No, the image is sent to the backend for recognition | Yes, the recognition is done on the device |
| Accuracy | Very precise with all types of foods | Good with single food items, struggles with complex cooked foods |
| Speed | On average 4-7 seconds | Depending on the hardware of the device, ranges between 50-300 ms |
The Remote image recognition approach is good when accuracy of the results is the top priority and waiting for the response is not an issue. This use case is implemented by taking static images from the camera or the photo library of the device and sending them for recognition asynchronously.
The Local model approach is good when speed is of the essence. This use case is implemented using continuous frame recognition from the camera. A callback is registered to capture the results as they arrive from the camera stream.
Remote image recognition
This API sends an image in base64 format to an LLM on Passio's backend and returns a list of recognised items.
The API can recognise raw or prepared foods, barcodes, and nutrition facts tables. The type of recognition is indicated by the resultType enum.
The default behaviour of this function is to resize the image to 512 pixels (the longer dimension is resized to 512; the other is calculated to keep the aspect ratio). Using the PassioImageResolution enum, the image can be resized to 512 or 1080 pixels, or kept at its original resolution.
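As a rough sketch, sending the image at a higher resolution might look like the following; the resolution parameter and the .res_1080 case name are assumptions here (as is the image variable, a UIImage), so check the PassioImageResolution declaration in your SDK version:
// Sketch: request recognition at 1080 px instead of the default 512 px.
// The `resolution` parameter and `.res_1080` case are assumed names.
PassioNutritionAI.shared.recognizeImageRemote(image: image,
                                              resolution: .res_1080) { results in
    print("Recognised \(results.count) items")
}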
// Load the image to recognise (recognizeImageRemote expects a UIImage)
guard let path = Bundle.main.path(forResource: "image1", ofType: "png"),
      let image = UIImage(contentsOfFile: path) else { return }

PassioNutritionAI.shared.recognizeImageRemote(image: image) { passioAdvisorFoodInfo in
    print("Food Info: \(passioAdvisorFoodInfo)")
}
public struct PassioAdvisorFoodInfo: Codable {
    public let recognisedName: String
    public let portionSize: String
    public let weightGrams: Double
    public let foodDataInfo: PassioFoodDataInfo
}
The response, presented as a list of PassioAdvisorFoodInfo objects, contains the name, portion and weight in grams recognised by the LLM. These attributes can be used for debugging, but the data from the nutritional database is contained either in foodDataInfo if the result type is a Food Item, or in packagedFoodItem if it is a Barcode or Nutrition Facts. To fetch the nutritional data for a PassioFoodDataInfo object, use the fetchFoodItemForDataInfo function.
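As a rough sketch of how the pieces fit together, assuming the Swift form of that call is fetchFoodItemFor(foodDataInfo:completion:) (check the SDK headers for the exact signature), the nutrition data for each recognised item could be resolved like this:
// Sketch: resolve full nutrition data for every recognised item.
// fetchFoodItemFor(foodDataInfo:completion:) is an assumed spelling of the
// fetchFoodItemForDataInfo call described above.
PassioNutritionAI.shared.recognizeImageRemote(image: image) { foodInfos in
    for info in foodInfos {
        PassioNutritionAI.shared.fetchFoodItemFor(foodDataInfo: info.foodDataInfo) { foodItem in
            guard let foodItem = foodItem else { return }
            print("\(info.recognisedName): \(foodItem)")
        }
    }
}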
UI Example
1. Create a screen where the user can snap one or multiple images using the camera of the device.
2. Upon tapping next, invoke recognizeImageRemote on each of the images in the list.
3. Wait for all of the responses, adding each result list to a final list of results. When the last asynchronous call completes, present the final list to the user (see the sketch below).
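A minimal sketch of that flow, using a DispatchGroup to wait for all of the asynchronous calls; capturedImages and showResults(_:) are hypothetical names used only for illustration:
// Sketch: run recognizeImageRemote on every captured image and collect the results.
// `capturedImages` and `showResults(_:)` are hypothetical.
func recognizeAll(capturedImages: [UIImage]) {
    let group = DispatchGroup()
    var allResults: [PassioAdvisorFoodInfo] = []

    for image in capturedImages {
        group.enter()
        PassioNutritionAI.shared.recognizeImageRemote(image: image) { results in
            DispatchQueue.main.async {
                allResults.append(contentsOf: results)
                group.leave()
            }
        }
    }

    // Runs after the last asynchronous call has finished.
    group.notify(queue: .main) {
        showResults(allResults)
    }
}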


Local neural network model
To set up the local model and the continuous scanning mode, the camera preview and the recognition session need to be defined.
Camera preview
To start using camera detection, the app must first obtain the user's permission to access the camera. This permission is not handled by the SDK.
Add the UI element that is responsible for rendering the camera frames:
var videoLayer: AVCaptureVideoPreviewLayer?

func setupPreviewLayer() {
    guard videoLayer == nil else { return }
    if let videoLayer = passioSDK.getPreviewLayer() {
        self.videoLayer = videoLayer
        videoLayer.frame = view.bounds
        view.layer.insertSublayer(videoLayer, at: 0)
    }
}
Start food detection
The type of food detection is defined by the FoodDetectionConfiguration object. To start the food recognition process, a FoodRecognitionDelegate also has to be defined. The delegate serves as a callback for all the different food detection processes defined by the FoodDetectionConfiguration. When the app is done with food detection, it should clear out the delegate to avoid any unwanted UI updates.
Implement the delegate FoodRecognitionDelegate:
extension PassioQuickStartViewController: FoodRecognitionDelegate {

    func recognitionResults(candidates: FoodCandidates?,
                            image: UIImage?) {
        if let candidates = candidates?.barcodeCandidates,
           let candidate = candidates.first {
            print("Found barcode: \(candidate.value)")
        }
        if let candidates = candidates?.packagedFoodCandidates,
           let candidate = candidates.first {
            print("Found packaged food: \(candidate.packagedFoodCode)")
        }
        if let candidates = candidates?.detectedCandidates,
           let candidate = candidates.first {
            print("Found detected food: \(candidate.name)")
        }
    }
}
Add the method startFoodDetection()
func startFoodDetection() {
    setupPreviewLayer()
    let config = FoodDetectionConfiguration(detectVisual: true,
                                            volumeDetectionMode: .none,
                                            detectBarcodes: true,
                                            detectPackagedFood: true)
    passioSDK.startFoodDetection(detectionConfig: config,
                                 foodRecognitionDelegate: self) { ready in
        if !ready {
            print("SDK was not configured correctly")
        }
    }
}
In viewWillAppear, request authorisation to use the camera and start the recognition:
override func viewWillAppear(_ animated: Bool) {
    super.viewWillAppear(animated)
    if AVCaptureDevice.authorizationStatus(for: .video) == .authorized {
        startFoodDetection()
    } else {
        AVCaptureDevice.requestAccess(for: .video) { (granted) in
            if granted {
                DispatchQueue.main.async {
                    self.startFoodDetection()
                }
            } else {
                print("The user didn't grant access to use camera")
            }
        }
    }
}
Stop food detection in viewWillDisappear:
override func viewWillDisappear(_ animated: Bool) {
    super.viewWillDisappear(animated)
    passioSDK.stopFoodDetection()
    videoLayer?.removeFromSuperlayer()
    videoLayer = nil
}
The FoodCandidates object that is returned in the recognition callbacks contains three lists:
detectedCandidates, detailing the result of VISUAL detection
barcodeCandidates, detailing the result of BARCODE detection
packagedFoodCandidates, detailing the result of PACKAGED detection
Only the corresponding candidate lists will be populated (e.g. if you define detection types VISUAL and BARCODE, you will never receive a packagedFoodCandidates list in this callback).
Visual detection
A DetectedCandidate represents the result from running Passio's neural network, specialized for detecting foods like apples, salads, burgers etc. The properties of a detected candidate are:
name
passioID (unique identifier used to query the nutritional database)
confidence (measure of how accurate the candidate is, ranging from 0 to 1)
boundingBox (a rectangle detailing the bounds of the recognised item within the image dimensions)
alternatives (list of alternative foods that are visually or contextually similar to the recognised food)
croppedImage (the image that the recognition was run on)
To fetch the full nutrition data of a detected candidate use:
public func fetchFoodItemFor(passioID: PassioNutritionAISDK.PassioID, completion: @escaping (PassioNutritionAISDK.PassioFoodItem?) -> Void)
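For example, inside the recognitionResults callback the top visual candidate could be resolved to a full food item like this (a minimal sketch):
// Sketch: fetch full nutrition data for the best visual candidate.
if let detected = candidates?.detectedCandidates,
   let candidate = detected.first {
    PassioNutritionAI.shared.fetchFoodItemFor(passioID: candidate.passioID) { foodItem in
        guard let foodItem = foodItem else { return }
        print("Nutrition data for \(candidate.name): \(foodItem)")
    }
}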
UI Example
1. Implement the camera screen using the steps above.
2. Create a result view that can have two states: scanning and result.
3. If the callback returns an empty list, show the scanning state. If it returns a result, display the name from detectedCandidate.name (see the sketch below).
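A minimal sketch of that state handling inside the delegate callback; resultLabel and showScanningState() are hypothetical UI members:
// Sketch: toggle between the scanning and result states from the callback.
// `resultLabel` and `showScanningState()` are hypothetical UI members.
func recognitionResults(candidates: FoodCandidates?, image: UIImage?) {
    DispatchQueue.main.async {
        if let detected = candidates?.detectedCandidates,
           let candidate = detected.first {
            self.resultLabel.text = candidate.name   // result state
        } else {
            self.showScanningState()                 // scanning state
        }
    }
}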


Example of an image that produces a DetectedCandidate:
