The recognizeSpeechRemote function is primarily used to fetch food items from a voice command, although it can also be used to extract food items from a recipe in text form.
The input to this function is a free-format string, so the implementing app needs to perform speech-to-text before calling the API.
This function extracts several pieces of data:
- meal action: whether the food is being added to or removed from the logs
- meal time (breakfast, lunch, dinner, or snack)
- date of the log
- food name recognised by the LLM
- portion and weight recognised by the LLM
- nutritional data reference as PassioFoodDataInfo
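To make these fields concrete, the sketch below mirrors the documented model as a local TypeScript type and groups results by meal time. The type and helper names (`SpeechResult`, `groupByMeal`) are illustrative app-side names, not SDK APIs.

```typescript
// Local mirror of the documented result fields -- illustrative only,
// not the SDK's own type declarations.
type LogAction = 'add' | 'remove' | 'none'
type MealTime = 'breakfast' | 'lunch' | 'dinner' | 'snack'

interface SpeechResult {
  action: LogAction | null   // is the food being added or removed
  mealTime: MealTime | null  // breakfast, lunch, dinner or snack
  date: string               // date of the log
  foodName: string           // name recognised by the LLM
  portion: string            // portion and weight from the LLM
}

// Hypothetical helper: bucket recognised items by meal time so the app
// can render one section per meal.
function groupByMeal(results: SpeechResult[]): Map<MealTime | null, SpeechResult[]> {
  const groups = new Map<MealTime | null, SpeechResult[]>()
  for (const r of results) {
    const bucket = groups.get(r.mealTime) ?? []
    bucket.push(r)
    groups.set(r.mealTime, bucket)
  }
  return groups
}
```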
Example of logging a morning breakfast:
iOS (Swift):

```swift
let speech = "I had some scrambled egg whites, turkey bacon, whole grain toast, and a black coffee for breakfast"

PassioNutritionAI.shared.recognizeSpeechRemote(from: speech) { recognitionResult in
    print("Result:- \(recognitionResult)")
}

public struct PassioSpeechRecognitionModel {
    public let action: PassioLogAction?
    public let meal: PassioMealTime?
    public let date: String!
    public let advisorFoodInfo: PassioAdvisorFoodInfo
}

public enum PassioLogAction: String, Codable, CaseIterable {
    case add
    case remove
    case none
}
```
Android (Kotlin):

```kotlin
PassioSDK.instance.recognizeSpeechRemote("I had some scrambled egg whites, turkey bacon, whole grain toast, and a black coffee for breakfast") {
    // Display results
}

data class PassioSpeechRecognitionModel(
    val action: PassioLogAction?,
    val mealTime: PassioMealTime?,
    val date: String,
    val advisorInfo: PassioAdvisorFoodInfo
)

enum class PassioLogAction {
    ADD,
    REMOVE,
    NONE
}
```
React Native (TypeScript):

```typescript
import {
  PassioSDK,
  type PassioSpeechRecognitionModel,
  type PassioFoodDataInfo,
} from '@passiolife/nutritionai-react-native-sdk-v3'

const recognizeSpeech = useCallback(
  async (text: string) => {
    try {
      // Fetch food results from the PassioSDK based on the query
      const recognizedModel = await PassioSDK.recognizeSpeechRemote(
        "I had some scrambled egg whites, turkey bacon, whole grain toast, and a black coffee for breakfast"
      )
      setPassioSpeechRecognitionModel(recognizedModel)
    } catch (error) {
      // Handle errors, e.g., network issues or API failures
    } finally {
      // Reset loading state to indicate the end of the search
    }
  },
  [cleanSearch]
)
```
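Once a result comes back, the app still has to turn the recognised action into an actual change to its food log. The sketch below shows one way to do that as a pure function; `LoggedFood` and `applyAction` are hypothetical app-side names, not SDK APIs.

```typescript
// Illustrative sketch: apply a recognised add/remove action to an
// in-memory food log. Not part of the Passio SDK.
type LogAction = 'add' | 'remove' | 'none'

interface LoggedFood {
  name: string
  mealTime: string
}

function applyAction(log: LoggedFood[], action: LogAction, food: LoggedFood): LoggedFood[] {
  switch (action) {
    case 'add':
      return [...log, food]
    case 'remove': {
      // Remove the first matching entry for the same meal.
      const idx = log.findIndex(f => f.name === food.name && f.mealTime === food.mealTime)
      return idx === -1 ? log : [...log.slice(0, idx), ...log.slice(idx + 1)]
    }
    case 'none':
      return log
  }
}
```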
The SDK does not include functionality to record the voice session; recording has to be handled by the app.
UI Example
Create a screen where the user can record the voice logging command. Make sure to add the appropriate microphone and speech-recognition permissions.
The UI should let the user start and stop voice recording with the tap of a button. When the recording is done, collect the resulting string and pass it to the SDK to extract the food items.
Once the SDK returns the results, show the list to the user with the option to deselect incorrect predictions.
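The deselect flow above can be sketched as plain list state: each prediction carries a selected flag the user can toggle, and only the items still selected are logged. The `Prediction`, `toggleSelection`, and `selectedForLogging` names are illustrative, not SDK APIs.

```typescript
// Illustrative selection state for the results screen -- not SDK code.
interface Prediction {
  foodName: string
  selected: boolean
}

// Toggle one prediction when the user taps it, without mutating the array.
function toggleSelection(predictions: Prediction[], index: number): Prediction[] {
  return predictions.map((p, i) =>
    i === index ? { ...p, selected: !p.selected } : p
  )
}

// Only the items still selected are sent to the food log.
function selectedForLogging(predictions: Prediction[]): Prediction[] {
  return predictions.filter(p => p.selected)
}
```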