Speech recognition

The recognizeSpeechRemote function is primarily used to fetch food items using a voice command, although it can also be used to extract food items from a recipe in text form.

The input to this function is a free-form string, so the implementing app needs to perform speech-to-text conversion before calling the API.

This function can extract several pieces of data:

  • meal action: whether the food is being added to or removed from the logs

  • meal time (breakfast, dinner, lunch or snack)

  • date of log

  • recognised name from the LLM

  • portion and weight from the LLM

  • nutritional data reference as PassioFoodDataInfo

Example of logging breakfast:

let speech = "I had some scrambled egg whites, turkey bacon, whole grain toast, and a black coffee for breakfast"
PassioNutritionAI.shared.recognizeSpeechRemote(from: speech) { recognitionResult in
    print("Result:- \(recognitionResult)")
}

public struct PassioSpeechRecognitionModel {
    public let action: PassioLogAction?
    public let meal: PassioMealTime?
    public let date: String!
    public let advisorFoodInfo: PassioAdvisorFoodInfo
}

public enum PassioLogAction: String, Codable, CaseIterable {
    case add
    case remove
    case none
}
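
A minimal sketch of consuming the results is shown below. It assumes the completion handler delivers an array of PassioSpeechRecognitionModel; the exact callback signature may differ between SDK versions.

// `speech` is the transcript string from the example above.
PassioNutritionAI.shared.recognizeSpeechRemote(from: speech) { recognitionResults in
    for model in recognitionResults {
        // Skip items the model marked for removal from the logs.
        guard model.action != .remove else { continue }
        print("Meal time: \(String(describing: model.meal))")
        print("Log date:  \(model.date ?? "")")
        print("Food info: \(model.advisorFoodInfo)")
    }
}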

The SDK does not provide functionality to record the voice session; that has to be handled by the app.
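
As a starting point, below is a minimal sketch of speech-to-text using Apple's Speech and AVFoundation frameworks. VoiceLogRecorder is a hypothetical helper, not part of the SDK, and it requires the NSMicrophoneUsageDescription and NSSpeechRecognitionUsageDescription keys in the app's Info.plist.

import Speech
import AVFoundation

// Hypothetical helper that streams microphone audio into Apple's speech recognizer
// and reports the transcript, which can then be passed to recognizeSpeechRemote.
final class VoiceLogRecorder {

    private let speechRecognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))
    private let audioEngine = AVAudioEngine()
    private var recognitionRequest: SFSpeechAudioBufferRecognitionRequest?
    private var recognitionTask: SFSpeechRecognitionTask?

    // Ask for speech recognition permission before the first recording.
    func requestAuthorization(completion: @escaping (Bool) -> Void) {
        SFSpeechRecognizer.requestAuthorization { status in
            DispatchQueue.main.async { completion(status == .authorized) }
        }
    }

    // Start capturing microphone audio; `onTranscript` receives the latest transcription.
    func startRecording(onTranscript: @escaping (String) -> Void) throws {
        let session = AVAudioSession.sharedInstance()
        try session.setCategory(.record, mode: .measurement, options: .duckOthers)
        try session.setActive(true, options: .notifyOthersOnDeactivation)

        let request = SFSpeechAudioBufferRecognitionRequest()
        recognitionRequest = request

        let inputNode = audioEngine.inputNode
        let format = inputNode.outputFormat(forBus: 0)
        inputNode.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
            request.append(buffer)
        }

        audioEngine.prepare()
        try audioEngine.start()

        recognitionTask = speechRecognizer?.recognitionTask(with: request) { result, _ in
            if let result = result {
                onTranscript(result.bestTranscription.formattedString)
            }
        }
    }

    // Stop capturing; the last transcript delivered to `onTranscript` is the final one.
    func stopRecording() {
        audioEngine.stop()
        audioEngine.inputNode.removeTap(onBus: 0)
        recognitionRequest?.endAudio()
        recognitionTask?.cancel()
        recognitionRequest = nil
        recognitionTask = nil
    }
}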

UI Example

  1. Create a screen where the user can record the voice-logging command. Make sure to add the appropriate microphone (and, if applicable, speech recognition) usage descriptions to the app's Info.plist.

  2. The UI should let the user start and stop voice recording with a tap of a button. When recording is done, collect the transcribed string and pass it to the SDK to extract the food items (see the sketch after this list).

  3. Once the SDK returns the results, show the list to the user with the option to deselect incorrect predictions.
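
Putting these steps together, the sketch below outlines one possible view model. It assumes the hypothetical VoiceLogRecorder above and an array result from recognizeSpeechRemote; SelectableFoodResult, toggleRecording and the other names are illustrative, not part of the SDK.

import Combine
import Foundation

// Hypothetical view model for the voice-logging screen described above.
final class VoiceLoggingViewModel: ObservableObject {

    // Wraps a prediction so the user can deselect incorrect ones in the list.
    struct SelectableFoodResult: Identifiable {
        let id = UUID()
        let model: PassioSpeechRecognitionModel
        var isSelected = true
    }

    @Published var isRecording = false
    @Published var transcript = ""
    @Published var results: [SelectableFoodResult] = []

    private let recorder = VoiceLogRecorder()

    // Single button handler: first tap starts recording, second tap stops and recognizes.
    func toggleRecording() {
        if isRecording {
            recorder.stopRecording()
            isRecording = false
            recognize(text: transcript)
        } else {
            try? recorder.startRecording { [weak self] text in
                DispatchQueue.main.async { self?.transcript = text }
            }
            isRecording = true
        }
    }

    // Called when the user taps a row to deselect an incorrect prediction.
    func toggleSelection(_ id: UUID) {
        if let index = results.firstIndex(where: { $0.id == id }) {
            results[index].isSelected.toggle()
        }
    }

    // Hand the final transcript to the SDK and keep the predictions selectable.
    private func recognize(text: String) {
        guard !text.isEmpty else { return }
        PassioNutritionAI.shared.recognizeSpeechRemote(from: text) { [weak self] models in
            DispatchQueue.main.async {
                self?.results = models.map { SelectableFoodResult(model: $0) }
            }
        }
    }
}

The items that remain selected can then be logged with whatever food-logging flow the app already uses.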
