Hi everyone! Do you want to build an image captcha solver, but you don’t know machine learning and don’t want to spend a lot of time learning ML and training models? Don’t worry, we’ve got you covered. In this post, I’m going to show you how to make a captcha solver in Python that solves image-based captchas in a few seconds without any ML magic 🙂

How Will it Work?

Basically, we’ll use Google Lens for this. We’ll write a Python script that automatically enters our captcha image URL on the Google Lens image search page and then fetches the scanned text.

I’m going to use one particular type of captcha in this tutorial, but this solution should work for any image-based captcha as long as the image isn’t heavily distorted or very blurry.

Type of captcha used in this tutorial:

Prerequisites

1. You need Python 3 installed on your system.

2. You need Google Chrome installed on your system.

Coding Time

First of all, create a new folder somewhere and open your code editor in it. I personally prefer VS Code, but you can use any code editor you want.

Then open a terminal in the same folder. VS Code’s integrated terminal works too.

Run this command to install the “selenium” Python package. We’re going to use it to automate entering the image into Google Lens.

pip install selenium

Now create a Python file named “app.py” inside this directory, and let’s begin coding.

First of all, import the required dependencies.

app.py

from selenium import webdriver # Importing webdriver
from selenium.webdriver.common.by import By # To get elements
from selenium.webdriver.common.keys import Keys # To press buttons
import time # Importing time module to wait for page load

Then we’ll store the captcha link in a variable and go to the Google Lens image upload page.

app.py

CAPTCHA_URL = "https://i.ibb.co/3hnBz4f/69670.jpg" # Specifying captcha url

driver = webdriver.Chrome() # Initialising webdriver
driver.get("https://www.google.com/?olud") # Going to google lens upload page

time.sleep(1) # 1 second delay to make sure that the image uploader is available

Now we have to send our image URL to the input box. As of now, all the attributes of the input box are randomly generated except the placeholder, so we’re going to use the placeholder to find the element.

app.py

# Getting the input element by placeholder and sending the captcha url
input_elem = driver.find_element(By.XPATH, '//input[@placeholder="Paste image link"]')
input_elem.send_keys(CAPTCHA_URL)
input_elem.send_keys(Keys.ENTER)
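To see why matching on the placeholder is the stable choice, here is a small standalone sketch that runs the same kind of XPath attribute match against a made-up snippet of markup. It uses Python’s built-in xml.etree.ElementTree instead of a browser, and the id and class values in SAMPLE are invented purely to mimic auto-generated attributes; they are not taken from the real Google page.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the id and class values are made up to mimic
# auto-generated attributes. Only the placeholder text is stable.
SAMPLE = """
<form>
  <input id="a9Xk2" class="q1Zr" placeholder="Search" />
  <input id="f03Lp" class="Tt7w" placeholder="Paste image link" />
</form>
"""

root = ET.fromstring(SAMPLE)

# Same idea as the Selenium XPath above: select the input by its placeholder.
matches = root.findall(".//input[@placeholder='Paste image link']")

# Show how many inputs matched and the (random-looking) id of the match.
print(len(matches), matches[0].get("id"))
```

Even if the id and class change on every page load, the placeholder match keeps pointing at the right input. If Google ever changes the placeholder text, the XPath in app.py will need to be updated to match.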

Then we’ll wait around 4 seconds to let the image get uploaded and scanned by Google Lens.

app.py

time.sleep(4) # Waiting for the image to get uploaded and scanned
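A fixed delay can be too short on a slow connection and wastes time on a fast one. As an optional alternative, here is a minimal sketch of a polling helper that retries a check until it succeeds or a timeout passes. The name wait_until and its parameters are my own and are not part of Selenium; Selenium also ships its own WebDriverWait class in selenium.webdriver.support.ui, which is built around the same idea.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.25):
    """Call condition() until it returns a truthy value or timeout seconds pass.

    Returns the truthy value, or raises TimeoutError if it never arrives.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            value = condition()
            if value:
                return value
        except Exception:
            # The element may not exist yet, so treat any error as "not ready".
            pass
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)
```

In app.py you could then write something like wait_until(lambda: driver.find_element(By.ID, 'text')).click() so the script moves on as soon as the text tab is available. Swallowing every exception keeps the sketch short; in a real script you would probably want to catch only Selenium’s "element not found" errors.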

After the delay, we’ll go to the text tab of Google Lens.

app.py

driver.find_element(By.ID, 'text').click() # Going to text tab
time.sleep(2) # Delay to ensure text tab has loaded

As of now, all the attributes of the “Select all text” button are randomly generated, so we’ll have to use its text to find it and click on it.

app.py

driver.find_element(By.XPATH, "//span[contains(text(), 'Select all text')]").click() # Finding the "Select all text" button by its inner text

Then, after a minor delay, we’ll find the result’s parent element by its unique class name and get the text of the result element inside it.

app.py

time.sleep(1) # Minor delay to ensure text load

result = driver.find_element(By.XPATH, '//div[@class="VIH6Y AbOGud "]').find_element(By.TAG_NAME, 'h1').text # Getting result
print(result) # Printing result
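The scanned text may come back with stray spaces or line breaks, depending on how the image is read. Here is a small optional sketch that tidies it up before you use it. The helper name clean_captcha_text is my own, and it assumes your captcha contains only ASCII letters and digits; adjust the pattern if yours allows other characters.

```python
import re


def clean_captcha_text(raw):
    """Drop whitespace and any character that is not an ASCII letter or digit."""
    return re.sub(r"[^A-Za-z0-9]", "", raw)
```

You would call it as clean_captcha_text(result) right after reading the result from the page.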

Final Code

app.py

from selenium import webdriver # Importing webdriver
from selenium.webdriver.common.by import By # To get elements
from selenium.webdriver.common.keys import Keys # To press buttons
import time # Importing time module to wait for page load

CAPTCHA_URL = "https://i.ibb.co/3hnBz4f/69670.jpg" # Specifying captcha url

driver = webdriver.Chrome() # Initialising webdriver
driver.get("https://www.google.com/?olud") # Going to google lens upload page

time.sleep(1) # 1 second delay to make sure that the image uploader is available

# Getting the input element by placeholder and sending the captcha url
input_elem = driver.find_element(By.XPATH, '//input[@placeholder="Paste image link"]')
input_elem.send_keys(CAPTCHA_URL)
input_elem.send_keys(Keys.ENTER)

time.sleep(4) # Waiting for the image to get uploaded and scanned

driver.find_element(By.ID, 'text').click() # Going to text tab
time.sleep(2) # Delay to ensure text tab has loaded

driver.find_element(By.XPATH, "//span[contains(text(), 'Select all text')]").click() # Finding the "Select all text" button by its inner text

time.sleep(1) # Minor delay to ensure text load

result = driver.find_element(By.XPATH, '//div[@class="VIH6Y AbOGud "]').find_element(By.TAG_NAME, 'h1').text # Getting result
print(result) # Printing result
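One thing the script above doesn’t do is close the browser when it finishes or when a step fails. Here is a minimal sketch of one way to handle that with a context manager. The name managed_driver is my own, and it only assumes that the object returned by the factory has a quit() method, which Selenium’s Chrome driver does.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver with factory() and always call quit() on exit."""
    driver = factory()
    try:
        yield driver
    finally:
        # Runs whether the body finished normally or raised an error.
        driver.quit()
```

In app.py this would look like "with managed_driver(webdriver.Chrome) as driver:" with the rest of the steps indented inside the block, so the Chrome window is closed even if one of the find_element calls throws.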

And there you go! You’ve successfully created an image-based text captcha solver with Python without using machine learning. Thanks for reading. I hope you liked it, and if you did, consider sharing it with your friends. Bye for now, and I’ll see you in the next post 🙂
