Food recognition

The Passio SDK can recognise anything from simple ingredients like blueberries and almonds to complex cooked dishes like a beef burrito with salad and fries.

There are two types of food recognition, each of them with their own strengths and weaknesses.

| Type | Works offline | Precision | Response time |
| --- | --- | --- | --- |
| Remote image recognition | No, the image is sent to the backend for recognition | Very precise with all types of foods | On average 4-7 seconds |
| Local neural network model | Yes, the recognition is done on the device | Good with single food items, struggles with complex cooked foods | Depending on the hardware of the device, ranges between 50-300ms |

The remote image recognition approach is the better choice when accuracy of the results is the top priority and waiting for the response is not an issue. This use case is implemented by taking static images from the camera or the photo library of the device and sending them for recognition asynchronously.

The local model approach is the better choice when speed is of the essence. This use case is implemented using continuous frame recognition from the camera. A callback is registered to capture the results as they arrive from the camera stream.

Remote image recognition

This API sends an image, encoded as base64, to an LLM on Passio's backend and returns a list of recognised items.

The API can recognise raw or prepared foods, barcodes, and nutrition facts tables. The type of recognition is indicated by the resultType enum.

The default behaviour of this function is to resize the image to 512 pixels on its longer dimension (the shorter dimension is calculated to keep the aspect ratio). Using the PassioImageResolution enum, the image can be resized to 512 or 1080 pixels, or kept at its original resolution.

// Load the image to recognise (from the bundle, camera or photo library).
guard let image = UIImage(named: "image1") else { return }

PassioNutritionAI.shared.recognizeImageRemote(image: image) { passioAdvisorFoodInfo in
    print("Food Info: \(passioAdvisorFoodInfo)")
}

public struct PassioAdvisorFoodInfo: Codable {
    public let recognisedName: String
    public let portionSize: String
    public let weightGrams: Double
    public let foodDataInfo: PassioFoodDataInfo
}

The response, presented as a list of PassioAdvisorFoodInfo objects, contains the name, portion and weight in grams recognised by the LLM. These attributes can be used for debugging, but the data from the nutritional database is contained either in foodDataInfo if the result type is a Food Item, or in packagedFoodItem if it's a Barcode or Nutrition Facts. To fetch the nutritional data for a PassioFoodDataInfo object, use the fetchFoodItemForDataInfo function.
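As a sketch, the fetch might look like this (the parameter label on fetchFoodItemForDataInfo and the name property on PassioFoodItem are assumptions; check the SDK reference for the exact signature):

```swift
// Sketch: resolve a remotely recognised item into full nutrition data.
// The parameter label and PassioFoodItem's `name` property are assumed,
// not confirmed by this document.
func loadNutrition(for info: PassioAdvisorFoodInfo) {
    PassioNutritionAI.shared.fetchFoodItemForDataInfo(foodDataInfo: info.foodDataInfo) { foodItem in
        guard let foodItem else { return }
        print("Fetched nutrition data for \(foodItem.name)")
    }
}
```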

UI Example

  1. Create a screen where the user can snap one or multiple images using the camera of the device

  2. Upon clicking next, the recognizeImageRemote is invoked on each of the images in the list

  3. Wait for all of the responses to come, add each results list to a final list of results. When the last asynchronous function is executed, present the final list to the user.
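The steps above can be sketched with a DispatchGroup that waits for every asynchronous response before presenting the combined list (the completion of recognizeImageRemote is assumed to deliver an array of PassioAdvisorFoodInfo, as shown earlier):

```swift
import UIKit

// Sketch: run remote recognition on several images and collect all results.
// Assumes recognizeImageRemote calls its completion with [PassioAdvisorFoodInfo]
// and that completions arrive on a serial (e.g. main) queue.
func recognize(images: [UIImage],
               completion: @escaping ([PassioAdvisorFoodInfo]) -> Void) {
    let group = DispatchGroup()
    var allResults: [PassioAdvisorFoodInfo] = []

    for image in images {
        group.enter()
        PassioNutritionAI.shared.recognizeImageRemote(image: image) { results in
            allResults.append(contentsOf: results)
            group.leave()
        }
    }

    // Present the final list only after the last response has arrived.
    group.notify(queue: .main) {
        completion(allResults)
    }
}
```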

Local neural network model

To set up the local model and the continuous scanning mode, the camera preview and the recognition session need to be defined.

Camera preview

To start using camera detection, the app must first obtain the user's permission to access the camera. This permission is not handled by the SDK.

Add the UI element that is responsible for rendering the camera frames:

var videoLayer: AVCaptureVideoPreviewLayer?

func setupPreviewLayer() {
    guard videoLayer == nil else { return }
    if let videoLayer = passioSDK.getPreviewLayer() {
        self.videoLayer = videoLayer
        videoLayer.frame = view.bounds
        view.layer.insertSublayer(videoLayer, at: 0)
    }
}

Start food detection

The SDK can detect three different categories: VISUAL, BARCODE and PACKAGED. VISUAL recognition is powered by Passio's neural network and is used to recognise over 4000 food classes. BARCODE, as the name suggests, can be used to scan a barcode located on a branded food. Finally, PACKAGED can detect the name of a branded food. To choose one or more types of detection, a FoodDetectionConfiguration object is defined and the corresponding fields are set. VISUAL recognition runs automatically.

The type of food detection is defined by the FoodDetectionConfiguration object. To start the Food Recognition process a FoodRecognitionListener also has to be defined. The listener serves as a callback for all the different food detection processes defined by the FoodDetectionConfiguration. When the app is done with food detection, it should clear out the listener to avoid any unwanted UI updates.

Implement the delegate FoodRecognitionDelegate:

extension PassioQuickStartViewController: FoodRecognitionDelegate {

    func recognitionResults(candidates: FoodCandidates?,
                            image: UIImage?) {
        if let barcodeCandidates = candidates?.barcodeCandidates,
           let candidate = barcodeCandidates.first {
            print("Found barcode: \(candidate.value)")
        }

        if let packagedFoodCandidates = candidates?.packagedFoodCandidates,
           let candidate = packagedFoodCandidates.first {
            print("Found packaged food: \(candidate.packagedFoodCode)")
        }

        if let detectedCandidates = candidates?.detectedCandidates,
           let candidate = detectedCandidates.first {
            print("Found detected food: \(candidate.name)")
        }
    }
}

Add the startFoodDetection() method:

func startFoodDetection() {
    setupPreviewLayer()
                
    let config = FoodDetectionConfiguration(detectVisual: true,
                                            volumeDetectionMode: .none,
                                            detectBarcodes: true,
                                            detectPackagedFood: true)
    passioSDK.startFoodDetection(detectionConfig: config,
                                 foodRecognitionDelegate: self) { ready in
        if !ready {
            print("SDK was not configured correctly")
        }
    }
}

In viewWillAppear request authorisation to use the camera and start the recognition:

override func viewWillAppear(_ animated: Bool) {
    super.viewWillAppear(animated)
    if AVCaptureDevice.authorizationStatus(for: .video) == .authorized {
        startFoodDetection()
    } else {
        AVCaptureDevice.requestAccess(for: .video) { (granted) in
            if granted {
                DispatchQueue.main.async {
                    self.startFoodDetection()
                }
            } else {
                print("The user didn't grant access to the camera")
            }
        }
    }
}

Stop Food Detection in viewWillDisappear:

override func viewWillDisappear(_ animated: Bool) {
    super.viewWillDisappear(animated)
    passioSDK.stopFoodDetection()
    videoLayer?.removeFromSuperlayer()
    videoLayer = nil
}

The FoodCandidates object that is returned in the recognition callbacks contains three lists:

  • detectedCandidates detailing the result of VISUAL detection

  • barcodeCandidates detailing the result of BARCODE detection

  • packagedFoodCandidates detailing the result of PACKAGED detection

Only the corresponding candidate lists will be populated (e.g. if you define detection types VISUAL and BARCODE, you will never receive a packagedFoodCandidates list in this callback).

Visual detection

A DetectedCandidate represents the result of running Passio's neural network, which is specialised in detecting foods like apples, salads and burgers. The properties of a detected candidate are:

  • name

  • passioID (unique identifier used to query the nutritional database)

  • confidence (a measure of how accurate the candidate is, ranging from 0 to 1)

  • boundingBox (a rectangle detailing the bounds of the recognised item within the image dimensions)

  • alternatives (list of alternative foods that are visually or contextually similar to the recognised food)

  • croppedImage (the image that the recognition was run on)

To fetch the full nutrition data of a detected candidate use:

public func fetchFoodItemFor(passioID: PassioNutritionAISDK.PassioID, completion: @escaping (PassioNutritionAISDK.PassioFoodItem?) -> Void)
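A hedged usage sketch, built on the signature above (the name property on PassioFoodItem is an assumption for illustration):

```swift
// Sketch: fetch full nutrition data for the top visual candidate.
func showNutrition(for candidate: DetectedCandidate) {
    PassioNutritionAI.shared.fetchFoodItemFor(passioID: candidate.passioID) { foodItem in
        guard let foodItem else { return }
        // `name` on PassioFoodItem is assumed, not confirmed by this document.
        print("Fetched nutrition data for \(foodItem.name)")
    }
}
```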

UI Example

  1. Implement the camera screen using the steps above

  2. Create a result view that can have two states: scanning and result

  3. If the callback returns an empty list, show the scanning state. If it returns the result, display the name from the detectedCandidate.name
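Steps 2 and 3 can be sketched with a hypothetical two-state view model (ResultViewState and updateResultView are illustrative names, not part of the SDK):

```swift
// Hypothetical two-state result view model; names are illustrative only.
enum ResultViewState {
    case scanning
    case result(name: String)
}

// Call this from recognitionResults(candidates:image:), e.g. on the main queue.
func updateResultView(_ state: inout ResultViewState,
                      with candidates: FoodCandidates?) {
    if let detectedCandidates = candidates?.detectedCandidates,
       let candidate = detectedCandidates.first {
        state = .result(name: candidate.name)
    } else {
        state = .scanning
    }
}
```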

Example of an image that produces a DetectedCandidate:
