Comparing Google's Gemini Pro vs OpenAI's ChatGPT4: Which model has better visual reasoning?

DeepLearning Hero · Beginner · 🧠 Large Language Models · 2y ago

Skills: LLM Foundations 80% · Multimodal LLMs 70% · Prompting Basics 60% · Advanced Prompting 50%

In this video, we test Gemini Pro and ChatGPT4 side by side on a range of tasks that require varying levels of visual reasoning and specialized skills, including common-sense reasoning, aesthetic understanding, data analysis, and math. Watch the full video to see how the two models perform on each task.

00:00 - Introduction
03:18 - Common sense reasoning
04:51 - Aesthetic understanding
05:55 - Entity recognition
08:12 - Data analysis
10:53 - Design analysis & Text extraction
12:35 - Misc labelling
15:06 - Math
16:08 - Final thoughts

Chapters (9)

0:00 Introduction

3:18 Common sense reasoning

4:51 Aesthetic understanding

5:55 Entity recognition

8:12 Data analysis

10:53 Design analysis & Text extraction

12:35 Misc labelling

15:06 Math

16:08 Final thoughts
