Automate LinkedIn Birthday Replies & Browser Tasks with a Python AI Agent (Langchain & Browser Use)
Build an AI agent to handle your LinkedIn birthday replies and other browser tasks! This easy Python tutorial uses Langchain and browser_use to automate repetitive web actions in minutes. Get the code and steps inside.

Ever feel overwhelmed by the flood of LinkedIn birthday notifications? Or find yourself manually checking stats on sites like GitHub (you'll be like "who does that? seriously, Ahsan" 😅)?
Repetitive browser tasks can eat up valuable time. But what if you could delegate them to an AI?
In this tutorial, we'll build a simple yet powerful AI-powered browser agent using Python, Langchain, and the fantastic browser_use library. This agent can handle tasks like checking GitHub follower counts and, yes, even automatically replying to those LinkedIn birthday wishes!
And just to be clear, I appreciate the birthday wishes; it's not that I look down on them. However, replying to every message takes time. If an AI can do it for me with the same result as a personal reply, why not?
Ready to automate the mundane? Let's dive in!

What You'll Build
We'll create a Python script capable of performing two distinct browser automation tasks:
- GitHub Follower Check: Navigate to a specified GitHub profile and retrieve the current follower count.
- LinkedIn Birthday Responder: Log in to LinkedIn, navigate to messages, identify simple birthday wishes, and send polite, randomized thank-you replies, while skipping other messages.
Why Automate This?
- Save Time: Reclaim time spent on repetitive social media pleasantries or data checking. Take THAT, procrastination!
- Ensure Consistency: Never forget to thank someone for birthday wishes again.
- Learn Cutting-Edge Tech: Get hands-on experience with AI agents, LLMs, and browser automation tools.
- Foundation for More: This project serves as a great starting point for building more complex browser automation agents.
The Tech Stack
- Python: Our programming language foundation.
- Langchain: A powerful framework for developing applications powered by language models. We'll use it to orchestrate our agent's logic and interact with LLMs. For this tutorial, Langchain can be used with either OpenAI or Google Gemini.
- browser_use: A library specifically designed to allow LLMs (like those managed by Langchain) to control and interact with a web browser. This is the magic that lets our AI "see" and "click".
- OpenAI / Google Gemini: We need a Large Language Model (LLM) to understand our instructions and decide what actions to take in the browser. The code supports both.
- python-dotenv: To securely manage our API keys and login credentials.
Prerequisites
- Python 3.11+ installed (browser_use requires Python 3.11 or newer).
- pip (Python package installer).
- A web browser (Chrome recommended).
- A LinkedIn account (for the birthday feature).
- API Key for either OpenAI or Google AI Studio (for Gemini models).
Setup & Installation
- Clone the Repository:
git clone https://github.com/AhsanAyaz/linkedin-birthday-wishes-agent
cd linkedin-birthday-wishes-agent
- Create and Activate a Virtual Environment:
macOS / Linux:
python3 -m venv .venv
source .venv/bin/activate
Windows:
python -m venv .venv
.venv\Scripts\activate
- Install Dependencies:
pip install -r requirements.txt
Configuration: The .env File
Sensitive information like API keys and login details should never be hardcoded! We use a .env file for this.
Copy the example file:
cp .env.example .env
Edit the newly created .env file with your actual details:
# Choose ONE API key to provide, based on the LLM you uncomment in agent.py
OPENAI_API_KEY=your_openai_api_key_here_if_using_openai
GOOGLE_API_KEY=your_google_api_key_here_if_using_gemini
# Your LinkedIn Login Credentials
USERNAME=your_linkedin_email@example.com
PASSWORD=your_super_secret_linkedin_password
# The GitHub profile URL for the follower check task
GITHUB_URL=https://github.com/ahsanayaz
Security Note: Ensure your .env file is listed in your .gitignore (it is in the provided example) to prevent accidentally committing secrets.
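A missing or blank value in .env tends to show up only after the browser is already open, so it can help to check the loaded settings up front. Here is a minimal sketch of one way to do that. It assumes the key names from the .env file above and takes the plain dictionary that dotenv_values(".env") returns. The helper missing_settings and the two lookup tables are hypothetical and not part of the repository.

```python
# Hypothetical helper (not part of the repository): report which settings
# are absent or blank for a given task and LLM provider.

# Key names mirror the .env file shown above.
API_KEY_FOR_PROVIDER = {
    "openai": "OPENAI_API_KEY",
    "gemini": "GOOGLE_API_KEY",
}
KEYS_FOR_TASK = {
    "github": ["GITHUB_URL"],
    "linkedin": ["USERNAME", "PASSWORD"],
}


def missing_settings(config, task, provider):
    """Return the names of required settings that are missing or blank."""
    required = [API_KEY_FOR_PROVIDER[provider]] + KEYS_FOR_TASK[task]
    # `or ""` covers keys that exist in the file but have no value (None).
    return [key for key in required if not (config.get(key) or "").strip()]


# An in-memory dict stands in for dotenv_values(".env") here.
example_config = {
    "GOOGLE_API_KEY": "abc123",
    "USERNAME": "me@example.com",
    "PASSWORD": "",
}
print(missing_settings(example_config, task="linkedin", provider="gemini"))
# prints ['PASSWORD']
```

If the returned list is not empty, the script could stop with a clear message before any browser window opens.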
Core Logic: agent.py Explained
Let's look at the heart of our project, agent.py.
from langchain_openai import ChatOpenAI
from langchain_google_genai import ChatGoogleGenerativeAI
from browser_use import Agent, Browser
import asyncio
from dotenv import dotenv_values # Import dotenv
# Load environment variables from .env file
config = dotenv_values(".env")
# Initialize the browser controlled by the agent
browser = Browser()
# --- LLM Selection ---
# Choose ONE LLM to use by uncommenting the desired line
# llm = ChatOpenAI(model="gpt-4o") # Requires OPENAI_API_KEY in .env
llm = ChatGoogleGenerativeAI(model="models/gemini-1.5-flash-latest") # Requires GOOGLE_API_KEY in .env
# --- Load Credentials & Config ---
USERNAME = config.get("USERNAME")
PASSWORD = config.get("PASSWORD")
GITHUB_URL = config.get("GITHUB_URL")
# --- Task Definitions (Prompts!) ---
# Task 1: GitHub Follower Check
task_github=f"""
Open browser, then go to {GITHUB_URL} and let me know how many followers this profile has
"""
# Task 2: LinkedIn Birthday Responder (Detailed Instructions)
task_linkedin=f"""
Open browser, wait for user to select user profile if needed.
Go to https://linkedin.com, login with username {USERNAME} and password {PASSWORD}.
Handle multi-factor authentication if prompted (user intervention may be required).
Wait for the user to finish logging in if manual steps are needed.
Once logged in, navigate to the main messaging page.
Carefully examine each unread message thread one by one.
If a message thread appears to be ONLY a simple birthday wish (like 'Happy birthday!', 'HBD!', 'Hope you have a great day!'), respond with a short, polite thank you message. Choose randomly from variants like: 'Thanks so much!', 'Appreciate the birthday wishes!', 'Thank you!', 'Thanks for thinking of me!'.
If a message thread contains MORE than just a simple birthday wish, OR is clearly not a birthday wish, DO NOT RESPOND. Just mark it as read (if possible by clicking into it) and move to the next.
Prioritize accuracy: It's better to miss replying to a birthday wish than to reply incorrectly to a non-birthday message.
Stop after checking/replying to about 10-15 unread messages or if there are no more unread messages.
Provide a summary of actions taken (e.g., 'Replied to 5 birthday messages, marked 3 other messages as read')
"""
# --- Async Functions to Run Tasks ---
async def run_github_task():
    print("--- Running GitHub Follower Check ---")
    agent = Agent(
        task=task_github,  # Pass the specific task prompt
        llm=llm,  # Pass the chosen LLM
        browser=browser,  # Pass the browser instance
    )
    result = await agent.run()  # Execute the task
    print(f"GitHub Result: {result}")
    await browser.close()  # Clean up
async def run_linkedin_task():
    print("--- Running LinkedIn Message Check ---")
    # IMPORTANT: May need manual interaction first time (CAPTCHA/2FA)
    # input("Agent will open browser. Handle login/2FA if needed, then press Enter...") # Optional pause
    agent = Agent(
        task=task_linkedin,
        llm=llm,
        browser=browser,
    )
    result = await agent.run()
    print(f"LinkedIn Result: {result}")
    await browser.close()
# --- Execution ---
# Choose which task to run by uncommenting the appropriate line:
# asyncio.run(run_linkedin_task())
asyncio.run(run_github_task())
Key Points:
- Environment Variables: We load credentials securely using dotenv.
- LLM Choice: You can easily switch between OpenAI and Google Gemini by uncommenting the respective line and ensuring the correct API key is in your .env.
- Browser Instance: browser = Browser() creates the browser window that the agent will control.
- Task Prompts: These are the crucial instructions given to the LLM. Notice how detailed task_linkedin is. This is effective Prompt Engineering:
  - Clear Steps: Login -> Navigate -> Examine -> Decide -> Act.
  - Specificity: Details on what constitutes a birthday wish, what to reply, and what to do with other messages.
  - Constraints/Safety: Explicitly tells the agent not to reply to non-birthday messages and prioritizes accuracy.
  - Implicit Examples: Providing response variants guides the LLM's output style.
- Agent Initialization: Agent(task=..., llm=..., browser=...) creates the agent instance, linking the instructions, the "brain" (LLM), and the "hands" (browser).
- Async Execution: asyncio.run() executes the chosen asynchronous task function (run_github_task or run_linkedin_task).
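The prompt-engineering points above can also be applied when assembling the prompt in code. As a rough sketch, the reply variants could live in a Python list and be joined into the instruction text, which makes them easier to edit in one place. The names REPLY_VARIANTS and build_reply_instruction are illustrative and do not appear in agent.py.

```python
# Illustrative only: these names are not in agent.py.
REPLY_VARIANTS = [
    "Thanks so much!",
    "Appreciate the birthday wishes!",
    "Thank you!",
    "Thanks for thinking of me!",
]


def build_reply_instruction(variants):
    """Build the 'how to reply' sentence of the LinkedIn prompt from a list."""
    quoted = ", ".join(f"'{text}'" for text in variants)
    return (
        "If a message thread appears to be ONLY a simple birthday wish, "
        "respond with a short, polite thank you message. "
        f"Choose randomly from variants like: {quoted}."
    )


print(build_reply_instruction(REPLY_VARIANTS))
```

The returned sentence could then be placed inside the task_linkedin f-string in place of the hand-written line.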
Running the Agent
- Select the Task: Open agent.py and make sure the correct asyncio.run() line at the very bottom is uncommented for the task you want to perform (either run_github_task() or run_linkedin_task()).
- Execute the Script: From your activated virtual environment in the terminal:
python agent.py
- Observe! A browser window should open.
  - For the LinkedIn task, you might need to manually handle login steps like CAPTCHA or 2-Factor Authentication the first time you run it or if your session expires. The script may pause waiting for you (especially if you add an input() prompt).
  - The agent will then attempt to follow the instructions in your chosen task prompt. You'll see output in your terminal detailing its thoughts and actions (if verbose mode is enabled in browser_use or Langchain, though not explicitly set here).
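Instead of commenting and uncommenting lines to select a task, the choice could be read from the command line. The following is a minimal sketch using argparse. The two coroutines are stand-ins for run_github_task and run_linkedin_task from agent.py, and the --task flag is an assumption of this example, not something the repository provides.

```python
import argparse
import asyncio


# Stand-ins for the real coroutines in agent.py, so the sketch runs on its own.
async def run_github_task():
    return "github task finished"


async def run_linkedin_task():
    return "linkedin task finished"


TASKS = {
    "github": run_github_task,
    "linkedin": run_linkedin_task,
}


def parse_task(argv):
    """Parse the task name from a list of command-line arguments."""
    parser = argparse.ArgumentParser(description="Run one browser agent task.")
    parser.add_argument("--task", choices=sorted(TASKS), default="github")
    return parser.parse_args(argv).task


def run_selected(argv):
    """Look up the chosen coroutine and run it to completion."""
    task_name = parse_task(argv)
    return asyncio.run(TASKS[task_name]())


# In a script you would pass sys.argv[1:]; a fixed list is used here.
print(run_selected(["--task", "linkedin"]))
```

With something like this in place, the script could be started as python agent.py --task linkedin, and an unknown task name would be rejected by argparse before the browser opens.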
How It Works (Behind the Scenes)
The browser_use library, integrated via the Agent class, acts as an intermediary.
- The Agent receives the task (your detailed prompt).
- It uses the LLM (Gemini/OpenAI via Langchain) to "read" the current state of the web page provided by Browser().
- The LLM interprets your prompt and the page content to decide the next best action (e.g., "type 'your_email' into the username field", "click the login button", "find the text of the first unread message", "type 'Thanks!' into the reply box").
- browser_use translates the LLM's desired action into actual browser commands (like Selenium or Playwright commands).
- This loop repeats: observe page -> LLM decides action -> execute action -> observe new page state.
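The loop described above can be pictured with a toy version in plain Python. This is a conceptual sketch only, not how browser_use is actually implemented: FakePage, decide_next_action, and run_loop are made-up names, and a hard-coded rule stands in for the LLM.

```python
from dataclasses import dataclass, field


@dataclass
class FakePage:
    """A stand-in for a web page: a login flag, an unread count, an action log."""
    logged_in: bool = False
    unread: int = 2
    history: list = field(default_factory=list)

    def observe(self):
        # What the "LLM" gets to see on each iteration.
        return {"logged_in": self.logged_in, "unread": self.unread}

    def execute(self, action):
        # Apply the chosen action and record it.
        self.history.append(action)
        if action == "log_in":
            self.logged_in = True
        elif action == "reply_to_message":
            self.unread -= 1


def decide_next_action(state):
    """Hard-coded rule standing in for the LLM's decision."""
    if not state["logged_in"]:
        return "log_in"
    if state["unread"] > 0:
        return "reply_to_message"
    return "done"


def run_loop(page, max_steps=10):
    """Observe the page, decide an action, execute it, and repeat until done."""
    for _ in range(max_steps):
        action = decide_next_action(page.observe())
        if action == "done":
            break
        page.execute(action)
    return page.history


print(run_loop(FakePage()))
# prints ['log_in', 'reply_to_message', 'reply_to_message']
```

The max_steps cap plays the same role as the "stop after about 10-15 unread messages" instruction in the LinkedIn prompt: it keeps the loop from running indefinitely.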
Important Considerations
- Website Changes: Web UIs change! If LinkedIn or GitHub updates their site structure, the agent's prompts might need adjustments to keep working correctly. This is the nature of browser automation.
- Security: Handle your .env file with care. Do not commit it to public repositories.
- LLM Costs & Reliability: Using LLM APIs incurs costs. Performance can also vary slightly between runs or models. Gemini Flash is often faster/cheaper for simpler tasks.
- Error Handling: This basic script doesn't include robust error handling. In a production scenario, you'd add try...except blocks.
- CAPTCHAs/2FA: Complex login mechanisms might require initial manual intervention in the browser window the agent opens.
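As a starting point for the error handling mentioned above, a task coroutine can be wrapped in try...except with a bounded number of retries and a finally clause for cleanup. The sketch below uses only asyncio. The names run_with_retries, flaky_task, and fake_cleanup are hypothetical; in agent.py the cleanup callback would presumably be something like browser.close.

```python
import asyncio


async def run_with_retries(make_task, cleanup, attempts=3, delay_seconds=0.0):
    """Await make_task() up to `attempts` times, then always await cleanup()."""
    last_error = None
    try:
        for attempt in range(1, attempts + 1):
            try:
                return await make_task()
            except Exception as error:  # Narrow this to expected errors in real code.
                last_error = error
                print(f"Attempt {attempt} failed: {error}")
                await asyncio.sleep(delay_seconds)
        raise RuntimeError(f"Task failed after {attempts} attempts") from last_error
    finally:
        # Runs on success and on failure, so the browser is not left open.
        await cleanup()


# Demo: a stand-in task that fails once and then succeeds.
calls = {"task": 0, "cleanup": 0}


async def flaky_task():
    calls["task"] += 1
    if calls["task"] < 2:
        raise ConnectionError("page did not load")
    return "ok"


async def fake_cleanup():
    calls["cleanup"] += 1


print(asyncio.run(run_with_retries(flaky_task, fake_cleanup)))
```

Catching a broad Exception keeps the sketch short; a real script would likely catch only the errors it knows how to recover from and let the rest surface.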
Conclusion
You've now built a functional AI browser agent capable of handling real-world tasks! By combining Python's flexibility, Langchain's LLM orchestration, and browser_use's browser interaction capabilities, you can automate a vast range of repetitive online activities.
This project is just the beginning. Think about what other tasks you could automate – checking prices, filling forms, scraping data? Experiment with different prompts and explore the possibilities!
Found this useful? Don't forget to subscribe to the newsletter 🙂 And join the discussion in the comments section below 👇🏽
As always, happy coding!