Comparing Google's Gemini Pro vs OpenAI's ChatGPT4: Which model has better visual reasoning?

DeepLearning Hero · Beginner · 🧠 Large Language Models · 2y ago
In this video, we test Gemini Pro and ChatGPT4 side by side on a gamut of tasks requiring various levels of visual reasoning and specialized skills. These tasks include common sense reasoning, aesthetic understanding, data analysis, and math, among others. Watch the full video to see how the two models perform.

00:00 - Introduction
03:18 - Common sense reasoning
04:51 - Aesthetic understanding
05:55 - Entity recognition
08:12 - Data analysis
10:53 - Design analysis & Text extraction
12:35 - Misc labelling
15:06 - Math
16:08 - Final thoughts
