Breaking
February 11, 2025

I pushed an AI to make recipes from photos. It pushed back | usagoldmines.com

Yes, AIs can write recipes and sometimes they’re pretty good! (And sometimes not so much.) But for my latest challenge, I wanted to build an AI that would compose recipes from iPhone snapshots and put them in the proper format for my recipe app. Sound easy? Not really, as it turned out.

Now, it’s not all that tricky to have, say, ChatGPT write on-the-fly recipes based on photos–you can even do it using Apple Intelligence on an iPhone. Just take a snap of a meal with Visual Intelligence, ask for a description (Siri will hand that task off to ChatGPT), then follow up with a request for a recipe.

So, how good are these recipes? That’s a topic for a whole different story, but in my experience, they’re a hit and miss. A ChatGPT recipe that called for cornstarch in a salmon honey glaze turned out to be rather dull and chalky while a Thai curry chicken recipe was so tasty that we’re making it for a third time this weekend.

(Of course, one could argue that ChatGPT is stealing these recipes rather than creating them–again, that’s another story.)

Anyway, while it would be relatively easy to craft a recipe-focused GPT (plenty of premade versions are available in OpenAI’s GPT library, or you can simply make your own), I wanted to try something different: a locally-hosted photo-to-recipe AI chatbot.

The setup

For background, I have Ollama (an application for running LLMs on local hardware) installed on a Mac mini M4 souped up with 64GB of RAM, along with Open WebUI on a Raspberry Pi. The latter acts as a ChatGPT-like front end for the Ollama models.

I have a variety of local LLMs (Google’s Gemma 2, Alibaba’s Qwen 2.5, and Microsoft’s Phi-4, for starters) that I use for various tasks, but for my photo-to-recipe experiment, I downloaded a new one: Llama 3.2 Vision, a Meta multimodal model that can “see” images and describe them.

Besides simply writing recipes based on food photos, I also wanted my AI bot to put the recipes in a format that could be smoothly ingested by a recipe app. That requires the recipe to be shaped into JSON format (a language that helps machines trade data) while also being marked up in the proper schema for web recipes. This ensures that search engines and recipe apps know that this item is an ingredient, this item is a cooking step, and so on.

Further reading: How not to get bamboozled by AI content on the web

The plan

Now, a quick and dirty way to get started with this setup is to just take a photo with your iPhone, upload it to the Open WebUI chat window for Llama 3.2 Vision (my “seeing” LLM), and give it a prompt, like: “Examine this food photo and write a recipe, putting it in JSON format and using the proper Schema.org markup for recipes.”

The problem there is two-fold: One, typing out that prompt each time you want a photo recipe gets tedious, and two, the results can be sketchy. Sometimes, Llama would surprise me with a perfectly formatted JSON recipe, other times, I’d get the recipe, but no JSON, or malformed JSON that didn’t work with my self-hosted Mealie recipe application.

What I needed was a custom system prompt. That is, a prompt that serves as an overall guiding light for an LLM, telling it what to do and how to act during every interaction. With the right system prompt, an AI model can do your bidding with a minimum of extra prompting.

Ben Patterson/Foundry

I’m no prompt engineer, but luckily I have an expert at my beck and call: Google’s Gemini. (I could have used ChatGPT too, but my wallet and I are taking a break from OpenAI’s paid Plus tier.)

I asked the “thinking” version of Gemini 2.0 Flash (“thinking” means the model ponders its answer before giving it to you) to craft a suitable system prompt for my photo-to-recipe AI, and it came up with a 700-word wall of text, complete with explicit instructions and lots of phrases in ALL CAPS. Here’s a taste:

You are an expert culinary assistant specializing in recipe generation from food photographs. Your task is to analyze a user-submitted photo of a food dish, create a complete recipe, and output it in **COMPLETE and VALID JSON format**, including tags, categories, and recipe time information. **AVOID ANY TRUNCATION OF THE JSON OUTPUT.**

(The full system prompt is at the very end of the story, and suggestions are welcome.)

I fed this massive tome into Open WebUI’s system prompt field for my Llama 3.2 model, and then the iterations began.

The push-back

I found an old food snapshot from my iPhone’s Photos app and gave it to Llama with the simple prompt, “Make a recipe from this food photo.” The result? A decent JSON recipe with all the ingredients, but only two cooking steps (the rest had been truncated). A second try got the steps right but lost the ingredients, while another attempt brought the ingredients back but (again) chopped off the cooking steps.

Back and forth we went, with me pasting Llama’s output into Gemini, Gemini making tweaks to the system prompt, me putting the adjusted prompt back into Llama, Llama coughing up outputs with new errors, rinse, repeat. (Yes, this went on for a few hours. Welcome to self-hosting.)

Finally, I came to the conclusion that while the smaller, 11 billion-parameter version of Llama 3.2-Vision that I was using (my hardware isn’t powerful enough for the 90B version) was good at describing photos, it couldn’t cut the mustard when it came to recipe formatting. Llama needed a buddy.

Enter DeepSeek.

The team

Now, before anyone reports me to Congress, I should note that I’m not referring to the full-on, 671-billion parameter version of DeepSeek R1, the industry-shaking LLM that’s keeping Sam Altman up at night. Instead, I’m using a much smaller, self-hosted DeepSeek that’s “distilled” from Alibaba’s Qwen models. This hybrid LLM has the DeepSeek name and uses similar “thinking” methodologies, but it’s not the DeepSeek that everyone’s so excited about.

Anyway, I tried a new workflow by getting a food photo description from Llama and feeding it to “little” DeepSeek for the recipe crafting and formatting.

With my new Llama-and-DeepSeek duo, my recipe results were looking much better. The recipes themselves were reasonably meaty (both figuratively and literally), the ingredients looked good, the cooking steps were all there, and I even got recipe tags (“Stir Fry,” “Shrimp,” “Savory,” and “Sweet Sauce”), cook and prep times, and colorful descriptions (“A flavorful stir-fry featuring shrimp, red bell pepper, broccoli, and cauliflower tossed in a savory brown sauce. Served over white rice and garnished with green onions and sesame seeds.”)  

The final dish (well, final-ish)

To be clear, my photo-to-recipe AI bot has a long ways to go. Cutting and pasting food photo descriptions from Llama to my mini DeepSeek model is hardly an elegant solution, a “pipeline” between the two models is likely required, and from what Gemini’s telling me, the process ain’t easy.

But clunky though it is, my photo recipe AI is—kinda?—up and running. Will it whip up decent recipes from the food photos I’m snapping at a Manhattan restaurant this weekend? Stay tuned.

Extra: The system prompt

You are an expert recipe generator. Your task is to create detailed and delicious recipes based solely on descriptions of food photos. Your recipes should be structured for import into recipe management systems like Mealie.

**Instructions:**

1. **Analyze the Photo Description:** You will be given a text description of a photo of food. Carefully analyze this description to understand:
* **The dish being depicted:** Identify the type of food (e.g., pasta, cake, soup, stir-fry).
* **Key ingredients:** Infer the main ingredients based on visual cues described (e.g., "red sauce," "green vegetables," "sprinkling of cheese").
* **Cooking style:** Deduce the likely cooking method (e.g., "grilled," "baked," "fried," "raw") from the description.
* **Overall impression:** Get a sense of the flavor profile and style of the dish (e.g., "rustic," "elegant," "spicy," "sweet").

2. **Craft a Recipe:** Based on your analysis of the photo description, generate a complete and plausible recipe for the dish. Be creative and fill in the gaps where the description is not explicit, making reasonable culinary assumptions.

3. **Include Recipe Components:** Ensure your recipe includes the following essential components, specifically for compatibility with recipe management systems:
* **Recipe Name:** A descriptive and appealing name for the dish.
* **Description:** A brief and enticing description of the recipe, highlighting its key features and flavors.
* **Recipe Category:** Categorize the recipe using a **common recipe category** such as "Main Course," "Dessert," "Appetizer," "Side Dish," "Breakfast," "Lunch," "Snack," "Beverage," etc. This is important for organization in recipe managers.
* **Cuisine:** Identify the likely cuisine or style of cooking (e.g., "Italian," "Mexican," "American," "Vegan").
* **Prep Time:** Estimate the preparation time in ISO 8601 duration format (e.g., "PT15M" for 15 minutes).
* **Cook Time:** Estimate the cooking time in ISO 8601 duration format.
* **Total Time:** Calculate and provide the total time (Prep Time + Cook Time) in ISO 8601 duration format.
* **Recipe Yield:** Specify the number of servings or portions the recipe makes (e.g., "Serves 4," "Makes 12 cookies").
* **Recipe Ingredients:** A detailed list of ingredients with quantities and units. Be specific and list ingredients in a logical order.
* **Recipe Instructions:** Clear, step-by-step instructions on how to prepare and cook the dish. Use action verbs and be concise but thorough.
* **Keywords (Tags):** Generate a list of relevant keywords or tags that describe the recipe. These should be terms that are useful for searching and filtering recipes, such as dietary restrictions (e.g., "Vegetarian," "Gluten-Free"), cooking style (e.g., "Easy," "Quick," "Slow Cooker"), flavor profiles (e.g., "Spicy," "Sweet," "Savory"), or occasions (e.g., "Weeknight Dinner," "Party Food").

4. **Output in JSON Schema.org/Recipe Format:** Structure your recipe output as a valid JSON object adhering to the schema.org/Recipe schema (https://schema.org/Recipe). **Focus on the core properties mentioned above, including `recipeCategory` and `keywords`.** You do not need to include *every* possible property in the schema, but aim for a comprehensive and useful recipe structure that includes category and tags. Use `keywords` to represent tags.

5. **Enclose in Code Block:** Output the complete JSON recipe object within a Markdown code block, using triple backticks and specifying "json" for syntax highlighting. This is crucial for easy copying and parsing.

**Example (Illustrative - You will generate the full recipe based on the description, including `keywords`):**

**Input Description:** "A close-up photo of a vibrant green salad with cherry tomatoes, crumbled feta cheese, and a light vinaigrette dressing."

**Output (Example Structure - You will generate the full JSON):**

```json
{
"@context": "https://schema.org",
"@type": "Recipe",
"name": "Vibrant Green Salad with Feta and Cherry Tomatoes",
"description": "A refreshing and colorful green salad featuring crisp greens, juicy cherry tomatoes, and salty feta cheese, lightly dressed with a tangy vinaigrette.",
"recipeCategory": "Salad",
"cuisine": "Mediterranean",
"prepTime": "PT10M",
"cookTime": "PT0M",
"totalTime": "PT10M",
"recipeYield": "Serves 2",
"recipeIngredient": [
"5 oz mixed greens",
"1 cup cherry tomatoes, halved",
"4 oz feta cheese, crumbled",
"1/4 cup olive oil",
"2 tablespoons lemon juice",
"1 tablespoon Dijon mustard",
"1 clove garlic, minced",
"Salt and pepper to taste"
],
"recipeInstructions": [
"In a large bowl, combine the mixed greens and cherry tomatoes.",
"Sprinkle the crumbled feta cheese over the salad.",
"In a small bowl, whisk together the olive oil, lemon juice, Dijon mustard, and minced garlic.",
"Season the dressing with salt and pepper to taste.",
"Pour the dressing over the salad and toss gently to combine.",
"Serve immediately."
],
"keywords": ["salad", "vegetarian", "easy", "quick", "fresh", "healthy", "lunch", "side dish"]
}

 

This articles is written by : Nermeen Nabil Khear Abdelmalak

All rights reserved to : USAGOLDMIES . www.usagoldmines.com

You can Enjoy surfing our website categories and read more content in many fields you may like .

Why USAGoldMines ?

USAGoldMines is a comprehensive website offering the latest in financial, crypto, and technical news. With specialized sections for each category, it provides readers with up-to-date market insights, investment trends, and technological advancements, making it a valuable resource for investors and enthusiasts in the fast-paced financial world.

Recent:

Report: Massive Batteries for Galaxy S26 Series in the Works Tim | usagoldmines.com

Elon Musk Offered to Buy OpenAI For an Absurd Amount of Money Michelle Ehrhardt | usagoldmines.com

Siri Provides Stroke Victim With Life Saving Help Juli Clover | usagoldmines.com

Meta can turn your thoughts into words typed on a screen if you don't mind lugging a machine the siz...

Best Thunderbolt docks 2025: Extend your laptop’s capabilities | usagoldmines.com

February Google Play Updates: More New Features to Play With Tim | usagoldmines.com

The Out-of-Touch Adults' Guide to Kid Culture: Kendrick Lamar's Super Bowl Win Stephen Johnson | usa...

Apple CEO Tim Cook Visited New Orleans for Super Bowl Juli Clover | usagoldmines.com

AMD fast-tracks its most powerful AI GPU ever as it seeks to steal market sharefrom Nvidia's Blackwe...

Max unveils first look at Euphoria season 3, but I'm not excited about the hit show's long-awaited r...

Challenger laptop brand says you can shove 26TB of superfast SSD storage in its laptop - and I want ...

I Tested Nvidia’s AI Tool for Making Your Webcam Better, and Oof Mark Knapp | usagoldmines.com

Everything the Department of Actually Labor Does Meredith Dietz | usagoldmines.com

The Safe (and Unsafe) Flowers to Buy Your Valentine If They Have Pets Amanda Blum | usagoldmines.com

I Made the Perfect Boiled Egg, According to Science Allie Chanthorn Reinmann | usagoldmines.com

Samsung spin-off wants to break away from the tyranny of 16:9 aspect ratio for displays | usagoldmi...

OpenAI’s secret weapon against Nvidia dependence takes shape Benj Edwards | usagoldmines.com

Twenty-two states sue to block new NIH funding policy John Timmer | usagoldmines.com

Best DVR for cord-cutters: Tablo vs Zapperbox vs Channels vs the rest | usagoldmines.com

Fastest VPN 2025: We identify the speediest performers | usagoldmines.com

The Two Best Ways to Remove Rings From Your Bathtub Lindsey Ellefson | usagoldmines.com

Will Apple Release New iPhone 16 Colors This Year? Hartley Charlton | usagoldmines.com

Apple's Rumored Smart Home Hub Still 'Months Away' From Shipping Joe Rossignol | usagoldmines.com

NYT Connections hints and answers for Tuesday, February 11 (game #611) | usagoldmines.com

NYT Strands hints and answers for Tuesday, February 11 (game #345) | usagoldmines.com

Quordle hints and answers for Tuesday, February 11 (game #1114) | usagoldmines.com

After Trump killed a report on nature, researchers push ahead with release Ashley Belanger | usagold...

SD cards, demystified: How to decipher the confusing jumble of specs | usagoldmines.com

Get this Baseus power bank for 60% off with our special code | usagoldmines.com

Amazon’s newest Kindle Paperwhite is on sale for just $135 right now | usagoldmines.com

Eight Unexpected Places You Can Add a Pop of Color to Your Home Jeff Somers | usagoldmines.com

25 Movies for Lovers Who Love Love Ross Johnson | usagoldmines.com

Make Sure to Update: iOS 18.3.1 Includes Fix for Actively Exploited Vulnerability Juli Clover | usag...

M4 MacBook Air Release Continues to Appear Imminent Juli Clover | usagoldmines.com

Arm's Japanese owner is rumored to be buying Arm's only independent server chip vendor but I don't u...

"A tracking cookie farm for profit" - report claims reCAPTCHA has caused 819 million hours of wasted...

Handful of users claim new Nvidia GPUs are melting power cables again Andrew Cunningham | usagoldmin...

T-Mobile expands Starlink texting service to cover Verizon and AT&T users Jon Brodkin | usagoldm...

10 killer smart home gadgets that were left for dead | usagoldmines.com

Official: OnePlus Watch 3 Launches February 18 Tim | usagoldmines.com

DEAL: Pixel 9 Pro, Pro XL Prices Reduced on Amazon (Up to 20% Off) Tim | usagoldmines.com

My Favorite Amazon Deal of the Day: The Google Pixel 9 Pro Daniel Oropeza | usagoldmines.com

The Beats Solo 4 Are 50% Off Right Now Daniel Oropeza | usagoldmines.com

Wearing an Apple Watch on the Ankle? New Report Explains the Trend Joe Rossignol | usagoldmines.com

Apple Releases watchOS 11.3.1 Juli Clover | usagoldmines.com

Apple Releases visionOS 2.3.1 Juli Clover | usagoldmines.com

Apple Releases iOS 18.3.1 With Bug Fixes Juli Clover | usagoldmines.com

Apple Releases macOS Sequoia 15.3.1 Juli Clover | usagoldmines.com

Microsoft could give Windows 11 PCs a new option for the Copilot key –but don't get too excited just...

If you want to know who will win the AI wars, just watch these two Super Bowl ads from Google and Ch...

New cheaper blue OLED material breakthrough could be great news for OLED TVs – and every other devic...

'It's news to us': Black Panther 3 producer responds to rumors that the Marvel movie's titular hero ...

Cisco Live! 2025 - all the news and updates as they happen benedict.collins@futurenet.com (Benedict ...

A new Facebook phishing campaign looks to trick you with emails sent from Salesforce | usagoldmines...

Netflix's new crime drama Apple Cider Vinegar is the latest female fraudster series that I can't get...

Dragonsweeper is my favorite game of 2025 (so far) Kyle Orland | usagoldmines.com

Tesla turns to Texas to test its autonomous “Cybercab” Jonathan M. Gitlin | usagoldmines.com

Intel’s Core Ultra 200 laptop CPUs deliver shocking performance gains | usagoldmines.com

Zotac fights GPU scalpers by selling RTX 50-series cards on Discord | usagoldmines.com

Samsung’s speedy 2TB portable SSD is a massive 48% off right now | usagoldmines.com

Whoa! Get this Asus Chromebook with 8GB RAM for just $109 | usagoldmines.com

Get this 27-inch Alienware 1440p IPS gaming monitor for just $180 | usagoldmines.com

Want YouTube Premium features for free? This app makes it happen | usagoldmines.com

T-Mobile Starlink Beta Now Open to All, Including Verizon and AT&T Customers Tim | usagoldmines....

This Is the Standard Deduction for the 2024 Tax Year Meredith Dietz | usagoldmines.com

How to Easily Search the Internet With ChatGPT Search Khamosh Pathak | usagoldmines.com

Liquid cool your PC with a home air conditioner | usagoldmines.com

Like the Eagles, Tubi surprised its Super Bowl doubters | usagoldmines.com

This free European AI chatbot is 13 times faster than ChatGPT | usagoldmines.com

Hotspot Shield review: Feeling the need, the need for speed | usagoldmines.com

12 of the Best Modern Movies With Little or No Dialogue Jason Keil | usagoldmines.com

How to Upgrade Your 'Unsupported' PC to Windows 11 Pranay Parab | usagoldmines.com

Best Buy Presidents' Day Sale Includes Major iPad Discounts, Get Up to $200 Off iPad Pro, iPad Air, ...

Powerbeats Pro 2 Given to Customer Early, Expected to Debut Tomorrow Joe Rossignol | usagoldmines.co...

Apple Sports App Updated With NASCAR Support Ahead of Daytona 500 Joe Rossignol | usagoldmines.com

Sam Altman says AI is progressing faster than Moore’s law as he predicts AGI is ‘coming into view’, ...

I don't care what the haters say, the Nintendo Switch 2's rumored mouse mode is by far the most exci...

Top US health provider tells 882,000 patients they were hit in August 2023 breach | usagoldmines.co...

Citing EV “rollercoaster” in US, BMW invests in internal combustion Kana Inagaki and Patricia Nilsso...

You can now generate AI images of people in Google Docs, Gmail, and more | usagoldmines.com

Sick of Elon’s antics? Here’s a beginner’s guide to Bluesky | usagoldmines.com

This Shopping List Always Saves Me Money at the Grocery Store Allie Chanthorn Reinmann | usagoldmine...

This iRobot Roomba Combo Is at Its Lowest Price Pradershika Sharma | usagoldmines.com

Apple Promotes MLS Season Pass: 'When Football Ends, Fútbol Begins' Joe Rossignol | usagoldmines.com

Zotac has a plan to keep RTX 5090 and 5080 GPUs away from the clutches of scalpers – and it sounds l...

Securing 5G edge network – what companies should know before stepping on the edge of tech | usagold...

'Our goal is to make the best movie possible': Captain America: Brave New World director and produce...

Hackers are hijacking government software to access sensitive servers | usagoldmines.com

Three tactics to creating a more secure supply chain | usagoldmines.com

Celebrate NordVPN’s birthday with 72% off and an extra year for free | usagoldmines.com

Today’s best laptop deals: Save big on work, school, home use, and gaming | usagoldmines.com

TikTok's 'Sunday Reset' Trend Is Actually a Great Way to Prepare for the Work Week Lindsey Ellefson ...

Amazon Discounts USB-C AirPods Max to $479.99 ($69 Off) Mitchel Broussard | usagoldmines.com

The Samsung Galaxy S24 Ultra is still full price and I don't know what Samsung's playing at jamie.ri...

Huge cyber attack under way - 2.8 million IPs being used to target VPN devices | usagoldmines.com

IT unemployment hits new high as AI threat continues | usagoldmines.com

Here's Why Eggs Are so Expensive Right Now Meredith Dietz | usagoldmines.com

Make Any File a Template Using This Hidden macOS Tool Tim Hardwick | usagoldmines.com

Exclusive: OnePlus Watch 3 revealed, with an Apple Watch-style rotating digital crown and Galaxy Wat...

A near-complete Samsung Galaxy S25 Edge specs list has leaked, pointing to an even slimmer design th...

Leave a Reply