Selenium Automation Tutorial
Selenium Automation Introduction
This blog is written for beginners and intermediate learners who want to learn Selenium automation using Python step by step. The language is kept simple, and every line of code is explained clearly. By the end, you will be able to build real automation scripts confidently.
What is Selenium?
Selenium is an automation tool used to control a web browser using code. The official Selenium documentation explains WebDriver setup, element locators, and advanced features in detail. With Selenium you can:
- Open a browser automatically
- Click buttons and links
- Fill forms
- Log in to websites
- Test web applications
Selenium is free and open source, and it supports multiple programming languages.
Prerequisites
- Basic Python (variables, functions)
- Basic HTML (input, button, id, class)
Selenium 4.6 and later include Selenium Manager, so there is no need to download chromedriver manually. This tutorial uses Selenium 4.39.0.
Selenium Setup and Basic Script In Python
Install Selenium
Install Selenium using pip:
pip install selenium
Understanding How Selenium Sees Your Webpage
To control your website, Selenium needs to know where to click or type. It does this using “Locators.”
The Problem: You see a “Login” button. Selenium sees HTML code.
The Solution: You must find the unique address of that button in the HTML (usually an id, name, or class).
How to do it:
- Open your website in Chrome.
- Right-click on the button or text box you want to automate.
- Select Inspect.
- Look for attributes such as id="username" or name="submit". These are the identifiers we can use to locate the element.
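To see a page the way Selenium does, it can help to list the attributes that could serve as locators. The sketch below uses only Python's standard library `html.parser`. The HTML snippet and the names `LocatorLister` and `SAMPLE_HTML` are made up for illustration and are not part of Selenium.

```python
from html.parser import HTMLParser

# Hypothetical login form, similar to what you would see in the Inspect panel.
SAMPLE_HTML = """
<form>
  <input id="username" name="user" class="field">
  <input id="password" name="pass" class="field">
  <button name="submit">Login</button>
</form>
"""


class LocatorLister(HTMLParser):
    """Collect the id, name and class attributes of every tag."""

    def __init__(self):
        super().__init__()
        self.found = []  # list of (tag, attribute, value) tuples

    def handle_starttag(self, tag, attrs):
        for attribute, value in attrs:
            if attribute in ("id", "name", "class"):
                self.found.append((tag, attribute, value))


lister = LocatorLister()
lister.feed(SAMPLE_HTML)
for tag, attribute, value in lister.found:
    print(f"<{tag}> can be located by {attribute}={value!r}")
```

Each printed line corresponds to one candidate locator, for example an `id` that could be used with `By.ID`.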
First Selenium Script
Let us write our first automation code.
from selenium import webdriver
- selenium is the library for browser automation.
- webdriver is the module that helps control the browser.
driver = webdriver.Chrome()
- webdriver.Chrome() opens the Chrome browser.
- driver is a Python variable that controls the browser.
driver.get("https://www.google.com")
print(driver.title)
driver.close()
- get() is a method of driver that takes a URL and opens that page in the browser.
- title is a property of driver that returns the page title.
- close() closes the current browser window. To end the whole session and shut down the driver process, use quit(), as the complete script does.
Complete First Script Of Browser Automation
from selenium import webdriver
driver = webdriver.Chrome()
driver.get("https://www.google.com")
print(driver.title)
driver.quit()
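If an error occurs between opening the browser and calling quit(), the browser window stays open. One common pattern, shown here as a sketch, is to wrap the work in try/finally. The helper name `read_title` is hypothetical. It accepts any driver object, so the Chrome-specific part lives in `main()`.

```python
def read_title(driver, url):
    """Open url, return the page title, and always end the session."""
    try:
        driver.get(url)
        return driver.title
    finally:
        # Runs even if get() raises, so no browser is left behind.
        driver.quit()


def main():
    # Imported here so read_title() stays usable with any driver object.
    from selenium import webdriver

    print(read_title(webdriver.Chrome(), "https://www.google.com"))
```

Calling `main()` runs the helper against a real Chrome browser.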
Locate Web Elements
To interact with elements, Selenium needs to find them.
Common locators:
- ID : This is the preferred method. IDs are supposed to be unique within a page, making this the fastest and most reliable locator.
- Name : Often used for form elements (inputs, checkboxes). If multiple elements share the same name, Selenium returns the first one found.
- Class Name : Used to locate elements based on their CSS class.
- XPath : It allows you to navigate the HTML structure (up and down) and locate elements by their visible text.
- CSS Selector : It is usually at least as fast as XPath and very readable. It uses standard CSS syntax to find elements.
Summary: Which one should I use?
| Priority | Locator | Speed | Reliability | Why? |
|---|---|---|---|---|
| 1 | ID | Fast | High | Unique and optimized. |
| 2 | CSS Selector | Fast | High | Flexible and usually faster than XPath. |
| 3 | XPath | Slow | Medium | Essential for complex navigation or text matching. |
| 4 | Name/Class | Medium | Medium | Good fallbacks if ID is missing. |
Implicit And Explicit Wait In Selenium Automation
Think of “Waits” in Selenium like telling a robot how much patience it should have. Sometimes websites are slow, and if the robot (Selenium) tries to click a button before it appears, it will crash.
1. Implicit Wait (The “General Rule”)
What it is: A global setting. You set it once, and it applies to every element you try to find in your script.
How it works: If Selenium can’t find an element immediately, it will wait for the time you specified (e.g., 10 seconds) before failing. If the element appears in 2 seconds, it moves on immediately.
Simple Syntax:
# Tell the driver: "Always wait up to 10 seconds if you can't find something."
driver.implicitly_wait(10)
2. Explicit Wait (The “Specific Rule”)
What it is: A specific setting for one specific element. You use it when you know one part of the page takes longer to load than others.
How it works: You tell Selenium to pause and wait for a specific condition to happen (like “Wait until the Login button is clickable” or “Wait until the loading spinner disappears”).
Simple Syntax:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
# "Wait up to 10 seconds, BUT only for this specific button to show up."
wait = WebDriverWait(driver, 10)
button = wait.until(EC.presence_of_element_located((By.ID, "submit")))
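Conceptually, an explicit wait is a polling loop: check a condition, sleep briefly, and repeat until the condition holds or the timeout expires. The sketch below imitates that idea with the standard library only. `wait_until` is a hypothetical helper, not Selenium's implementation. Its 0.5-second default mirrors the default poll interval of WebDriverWait.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value as soon as it appears; raises TimeoutError if
    the timeout passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


# Simulate a button that "appears" on the third check.
checks = {"count": 0}


def button_is_ready():
    checks["count"] += 1
    return "button" if checks["count"] >= 3 else None


print(wait_until(button_is_ready, timeout=2, poll=0.1))
```

The loop returns on the third check instead of sleeping for the full timeout, which is the same "move on as soon as it is ready" behavior described for both kinds of wait.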
Handling Common Web Elements For Browser Automation
Handle Input Form
username = driver.find_element(By.ID, "username")
password = driver.find_element(By.ID, "password")
username.send_keys("admin")
password.send_keys("123456")
login_btn = driver.find_element(By.ID, "login")
login_btn.click()
Imagine driver is a robot sitting at a computer. This code automates the login process by giving the robot three specific instructions:
- Look for the boxes named “username” and “password”.
- Type “admin” and “123456” into them.
- Click the button named “login”.
Code Explanation:
1. Locating the Input Boxes
username = driver.find_element(By.ID, "username")
password = driver.find_element(By.ID, "password")
- find_element(...): This command tells Selenium to scan the web page's code (HTML) to find a specific item.
- By.ID: This is the strategy used to find the item. In HTML, elements often have a unique ID (like a social security number). This is the most accurate way to find an element.
2. Typing the Credentials
username.send_keys("admin")
password.send_keys("123456")
- .send_keys(...): This command mimics a physical keyboard.
- It takes the element we found in the previous step (username) and “types” the string "admin" into it.
- It then does the same for the password field with "123456".
3. Clicking the Button
login_btn = driver.find_element(By.ID, "login")
login_btn.click()
- login_btn = ...: The robot scans the page for an element with the unique ID "login". This is usually the “Submit” or “Sign In” button.
- .click(): This mimics a left-click on a mouse. It activates the button, sending the form data to the server.
HTML for this code
For this code to work, the website’s source code (HTML) must contain elements with the IDs username, password, and login.
If the website uses different IDs (e.g., id="user_email" instead of id="username"), the script will fail with a NoSuchElementException.
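To try the login script without a real website, a small local page is enough. The following sketch writes a hypothetical page whose IDs match the script (`username`, `password`, `login`) to a temporary file and prints a `file://` URL that could be passed to `driver.get()`. The markup itself is an assumption, since any page with those three IDs would do.

```python
import tempfile
from pathlib import Path

# Hypothetical page: only the three id values matter to the script.
LOGIN_PAGE = """<!DOCTYPE html>
<html>
  <body>
    <form>
      <input type="text" id="username">
      <input type="password" id="password">
      <button type="button" id="login">Login</button>
    </form>
  </body>
</html>
"""


def write_login_page(folder):
    """Save LOGIN_PAGE in folder and return its file:// URL."""
    page = Path(folder) / "login.html"
    page.write_text(LOGIN_PAGE, encoding="utf-8")
    return page.resolve().as_uri()


page_url = write_login_page(tempfile.mkdtemp())
print(page_url)  # pass this to driver.get(page_url)
```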
Scroll Automation
Why would you use this?
“Lazy Loading” (Infinite Scroll): Many modern sites (like Twitter, Instagram, or YouTube) don’t load all content at once. You use Scroll Automation to trigger the website to load more items.
Accessing the Footer: Sometimes you need to click a “Contact Us” link in the footer, but the element is technically “hidden” because it’s off-screen. Selenium usually auto-scrolls to elements it clicks, but scrolling explicitly can help when that is not enough.
1. Scroll to bottom
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
- window.scrollTo(x, y): This is a standard JavaScript function that moves the screen to a specific pixel coordinate.
- y = document.body.scrollHeight: The vertical position. scrollHeight is a dynamic property that gives the total height of the entire webpage (even the parts you can’t see yet).
2. Scroll element into view
element = driver.find_element(By.ID, "footer")
driver.execute_script("arguments[0].scrollIntoView();", element)
- scrollIntoView(): This is the native JavaScript function that moves the viewport to show a specific HTML tag.
- arguments[0]: This is a placeholder. It is a variable that says, “I am waiting for you to pass me an object from Python.” It refers to the first extra argument you pass to execute_script after the script string.
- When execute_script runs, it takes this Python element and “passes” it into the JavaScript code, so arguments[0] refers to the actual footer element.
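For infinite-scroll pages, a common approach is to keep scrolling to the bottom until the page height stops growing. The sketch below assumes a `driver` object with `execute_script()`. The helper name `scroll_until_stable` and the `pause` and `max_rounds` parameters are hypothetical and would need tuning for a real site.

```python
import time


def scroll_until_stable(driver, pause=1.0, max_rounds=20):
    """Scroll to the bottom until the page height stops changing.

    Returns the number of scrolls performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight;")
    for round_number in range(1, max_rounds + 1):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the site time to load more items
        new_height = driver.execute_script("return document.body.scrollHeight;")
        if new_height == last_height:
            return round_number  # nothing new was loaded
        last_height = new_height
    return max_rounds
```

The `max_rounds` limit keeps the loop from running forever on feeds that never end.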
Handle Dropdown
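HTML Code Of Our Example
The example assumes a select element with id="country" that contains an option whose visible text is "India".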
from selenium.webdriver.support.ui import Select
dropdown = Select(driver.find_element(By.ID, "country"))
dropdown.select_by_visible_text("India")
- Imports the Select class, which is required to handle select tag (dropdown) HTML elements in Selenium.
- First, Selenium locates the dropdown element using its id="country".
- Then, the Select wrapper is applied, enabling dropdown-specific actions.
- Finally, Selenium selects the option whose visible text is exactly "India".
- This works only if "India" is present as a dropdown option.
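If the requested text is missing, `select_by_visible_text()` raises `NoSuchElementException`. One way to avoid that, sketched below, is to check the available options first. The helper `select_if_present` is hypothetical. It expects a Selenium `Select` object, or anything with the same `options` and `select_by_visible_text` members.

```python
def select_if_present(dropdown, text):
    """Select the option with this visible text if the dropdown has it.

    Returns True when the option was selected, False when it is missing.
    """
    available = [option.text for option in dropdown.options]
    if text not in available:
        return False
    dropdown.select_by_visible_text(text)
    return True
```

With the dropdown from the example, the call would be `select_if_present(dropdown, "India")`.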
Handling Multiple Tabs in Selenium Automation
🔹 Launch Browser and Open Website
driver = webdriver.Chrome()
driver.get("https://www.python.org")
- First, webdriver.Chrome() starts the Chrome browser
- Then, get() opens the Python official website.
- This is the main window in selenium automation.
🔹 Store Main Window Handle
main_window = driver.current_window_handle
- current_window_handle stores the unique ID of the current window.
- Therefore, Selenium remembers which tab is the main window.
- This step is very important in window switching.
🔹 Find the Link to Open in New Tab
about_link = driver.find_element(By.LINK_TEXT, "About")
- find_element() locates the About link.
- By.LINK_TEXT finds the link using visible text.
- Selenium automation now has access to the link element.
🔹 Open Link in a New Tab Using JavaScript
driver.execute_script("window.open(arguments[0].href, '_blank');", about_link)
WebDriverWait(driver,10).until(lambda d: len(d.window_handles) > 1)
- execute_script() runs JavaScript inside the browser.
- window.open() opens the link in a new tab.
- _blank asks the browser to open the link in a new tab or window.
- WebDriverWait waits up to 10 seconds.
- window_handles returns the handles of all open tabs.
- len(...) > 1 confirms that a new tab has opened.
- As a result, the script avoids timing issues.
🔹 Switch to the New Tab
for w in driver.window_handles:
    if w != main_window:
        driver.switch_to.window(w)
        break
print(driver.title)
- window_handles gives a list of all window IDs.
- The loop checks which window is not the main window.
- switch_to.window() moves Selenium control to the new tab.
- This is the core logic of tab switching in Selenium automation.
- driver.title fetches the page title of the new tab.
🔹 Close the Current Tab And Switch Back to Main Tab
driver.close()
driver.switch_to.window(main_window)
print(driver.title)
- close() shuts only the current active tab.
- Selenium automation now switches control back.
- main_window was saved earlier.
- Finally, printing the title confirms Selenium is back.
- This ensures proper window handling in Selenium automation.
Complete Code
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
driver = webdriver.Chrome()
driver.get("https://www.python.org")
main_window = driver.current_window_handle
about_link = driver.find_element(By.LINK_TEXT, "About")
driver.execute_script("window.open(arguments[0].href, '_blank');", about_link)
WebDriverWait(driver,10).until(lambda d: len(d.window_handles) > 1)
for w in driver.window_handles:
    if w != main_window:
        driver.switch_to.window(w)
        break
print(driver.title)
driver.close()
driver.switch_to.window(main_window)
print(driver.title)
# End the session and close the remaining window.
driver.quit()
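The loop that finds the new tab can be wrapped in a small reusable function. This is a sketch: `switch_to_new_tab` is a hypothetical name, and it relies only on the `window_handles` and `switch_to.window()` members used above.

```python
def switch_to_new_tab(driver, known_handles):
    """Switch to the first window handle that is not in known_handles.

    Returns the new handle, or None if no new tab was found.
    """
    for handle in driver.window_handles:
        if handle not in known_handles:
            driver.switch_to.window(handle)
            return handle
    return None
```

In the script above, the call would be `switch_to_new_tab(driver, [main_window])`.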
Selenium Automation with Custom Chrome Profile
Benefits Of Using Custom Profile In Selenium Automation
- Cookies are stored inside the custom profile.
- You can stay logged in across runs
- This reduces repeated login steps
- Very useful for real-world selenium automation
- Custom profile supports long sessions.
- Suitable for scraping, monitoring, or testing.
- Improves reliability in advanced selenium automation.
- Extensions can be used in a custom profile.
Important Note: Create a fresh Chrome profile for Selenium automation to avoid conflicts.
🔹 Import Required Modules
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
import os
- First, webdriver controls the Chrome browser.
- Then, Options helps customize Chrome settings.
- By is used to locate web elements.
- os helps work with folders and files.
🔹 Create a Separate Chrome Profile
options = Options()
profile_path = r"C:\Users\\AppData\Local\Google\Chrome\User Data"
os.makedirs(profile_path, exist_ok=True)
- Options() creates an options object, so Selenium can modify browser behavior.
- profile_path is written without a trailing backslash, because a Python raw string cannot end with one.
- os.makedirs(...) creates the profile folder.
- exist_ok=True avoids an error if the folder already exists.
- As a result, Selenium automation runs independently.
- This prevents conflicts with your normal Chrome browser.
🔹 Attach Profile to Chrome And Add Stability Options
options.add_argument(f"--user-data-dir={profile_path}")
options.add_argument("--no-sandbox")
options.add_argument("--disable-dev-shm-usage")
- "--user-data-dir={profile_path}" tells Chrome to use the custom profile.
- Cookies, cache, and session data are saved there.
- Very useful in Selenium automation for login persistence.
- The second and third lines are commonly added to improve browser stability, mainly in containers and other restricted environments. Note that --no-sandbox turns off Chrome's security sandbox.
- --disable-dev-shm-usage helps on systems where shared memory (/dev/shm) is small.
🔹 Start Chrome Driver
driver = webdriver.Chrome(options=options)
print(driver.current_url)
driver.quit()
- Chrome browser launches with all settings applied.
- Selenium automation now controls this browser.
- driver.current_url shows the current page URL.
- quit() closes all browser windows and ends the Selenium session safely.
Complete Code For Custom Profile
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
import os
options = Options()
profile_path = r"C:\Users\\AppData\Local\Google\Chrome\User Data"
os.makedirs(profile_path, exist_ok=True)
options.add_argument(f"--user-data-dir={profile_path}")
options.add_argument("--no-sandbox")
options.add_argument("--disable-dev-shm-usage")
driver = webdriver.Chrome(options=options)
print(driver.current_url)
driver.quit()
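The hard-coded Windows path can be replaced with one built from the user's home folder, which also works on macOS and Linux. This is a sketch under assumptions: the folder name `selenium-profile` and the helper `profile_argument` are hypothetical, and any empty folder that your normal Chrome does not use would serve.

```python
from pathlib import Path


def profile_argument(base_dir, name="selenium-profile"):
    """Create base_dir/name if needed and return the Chrome argument for it."""
    profile_dir = Path(base_dir) / name
    profile_dir.mkdir(parents=True, exist_ok=True)
    return f"--user-data-dir={profile_dir}"


# Typical use: keep the profile under the current user's home folder.
# options.add_argument(profile_argument(Path.home()))
```

Keeping the profile in its own folder follows the note above about using a fresh profile that does not clash with everyday browsing.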