linkedin-spider
No description available
Ask AI about linkedin-spider
Powered by Claude Β· Grounded in docs
I know everything about linkedin-spider. Ask me about installation, configuration, usage, or troubleshooting.
0/500
Reviews
Documentation
linkedin-spider
A modern LinkedIn scraping library with built-in anti-detection, available as a Python library, CLI tool, and MCP server.
MCP Server Β· Python Library Β· CLI Β· Docker
Features
- Profile search with advanced filters (location, industry, company, connection degree)
- Post search by keywords with engagement metrics and comments
- Profile scraping with experience, education, skills, and contact details
- Company scraping with industry, size, specialties, and more
- Connection management β retrieve and send connection requests
- Conversations β list threads, read message history, and send messages
- Proxy support β route traffic through HTTP or SOCKS5 proxies
- Anti-detection β human-like behavior simulation, stealth mode, session persistence
Installation
pip install linkedin-spider # library only
pip install linkedin-spider[cli] # library + CLI
pip install linkedin-spider[mcp] # library + MCP server
pip install linkedin-spider[all] # everything
Authentication
linkedin-spider supports two authentication methods. Sessions are persisted in a Chrome profile so you typically only authenticate once.
LinkedIn Cookie (Recommended)
- Log in to LinkedIn in your browser
- Open DevTools (F12) β Application β Cookies β
linkedin.com - Copy the
li_atcookie value
scraper = LinkedinSpider(li_at_cookie="your_cookie_value")
Email & Password
scraper = LinkedinSpider(email="you@example.com", password="your_password")
For CLI and MCP, pass --cookie (or --email/--password) flags, or set the LINKEDIN_COOKIE (or LINKEDIN_EMAIL/LINKEDIN_PASSWORD) environment variables.
MCP Server
The MCP server exposes LinkedIn data to AI assistants like Claude.
Example prompts you can give Claude once connected:
Research the background of this candidate https://www.linkedin.com/in/johndoe
Find 10 product managers in San Francisco and summarize their experience
What has OpenAI been posting about recently? https://www.linkedin.com/company/openai
Show me my pending connection requests and summarize who they are
Send a message to John Doe saying "Thanks for connecting!"
It provides 12 tools:
| Tool | Description |
|---|---|
search_profiles | Search profiles with filters (location, industry, company, connections) |
scrape_profile | Extract complete profile data from a URL |
search_posts | Search posts by keywords with date filters |
scrape_company | Get company details from a URL |
scrape_incoming_connections | List pending connection requests received |
scrape_outgoing_connections | List connection requests you've sent |
send_connection_request | Send a connection request with optional note |
scrape_conversations_list | List messaging conversations |
scrape_conversation | Read messages from a conversation |
send_message | Send a message in an existing or new conversation |
get_session_status | Check if the browser session is active |
reset_session | Close and reset the browser session |
Start the server
# stdio (default β for Claude Desktop, Claude Code)
linkedin-spider-mcp serve --cookie your_li_at_cookie_value
# SSE or HTTP (for remote clients)
linkedin-spider-mcp serve --transport sse --host 127.0.0.1 --port 8000
linkedin-spider-mcp serve --transport http --host 0.0.0.0 --port 9000
Or configure via environment variables in a .env file:
LINKEDIN_COOKIE=your_li_at_cookie_value
HEADLESS=true
PROXY_URL=http://host:port # optional
Claude Desktop
Add to your Claude Desktop config file:
- macOS:
~/Library/Application Support/Claude/claude_desktop_config.json - Windows:
%APPDATA%\Claude\claude_desktop_config.json - Linux:
~/.config/Claude/claude_desktop_config.json
With Docker (Recommended)
No local dependencies required β the Docker image includes Python, Chrome, and everything needed.
{
"mcpServers": {
"linkedin-spider": {
"command": "docker",
"args": [
"run",
"--rm",
"-i",
"-e",
"LINKEDIN_COOKIE=your_li_at_cookie_value",
"-e",
"HEADLESS=true",
"-e",
"TRANSPORT=stdio",
"vertexcoverlabs/linkedin-mcp"
]
}
}
}
Without Docker
Requires pip install linkedin-spider[mcp] and Chrome installed locally.
{
"mcpServers": {
"linkedin-spider": {
"command": "linkedin-spider-mcp",
"args": ["serve", "--cookie", "your_li_at_cookie_value"]
}
}
}
Claude Code
With Docker
claude mcp add linkedin-spider -- \
docker run --rm -i \
-e LINKEDIN_COOKIE=your_li_at_cookie_value \
-e HEADLESS=true \
-e TRANSPORT=stdio \
vertexcoverlabs/linkedin-mcp
Without Docker
claude mcp add linkedin-spider -- \
linkedin-spider-mcp serve --cookie your_li_at_cookie_value
Or connect to a running SSE/HTTP server:
claude mcp add linkedin-spider --transport sse http://localhost:8000/sse
Troubleshooting
Authentication issues
- Cookie expired: LinkedIn cookies expire periodically. Grab a fresh
li_atvalue from your browser. - Email/password not working: LinkedIn may trigger a CAPTCHA or verification. Try cookie auth instead.
- Session reuse: Sessions are saved in a Chrome profile. If things break, delete the profile directory and re-authenticate.
Docker issues
- First-time pull is slow: The image is ~1.4GB. Pre-pull with
docker pull vertexcoverlabs/linkedin-mcpbefore configuring Claude Desktop to avoid timeout. - Port conflicts: If port 8080 is in use, map to a different host port:
-p 9090:8080.
Browser / Chrome issues
- Chrome not found (non-Docker): Set
CHROMEDRIVER_PATHin your.envor passchromedriver_pathinScraperConfig. - Page load timeouts: Increase
page_load_timeoutinScraperConfigor use a faster proxy. - Headless mode issues: Some LinkedIn pages behave differently in headless mode. Try
headless=Falsefor debugging.
Python Library
from linkedin_spider import LinkedinSpider, ScraperConfig
config = ScraperConfig(headless=True, page_load_timeout=30)
scraper = LinkedinSpider(
li_at_cookie="your_cookie_value",
config=config,
)
# Search profiles
results = scraper.search_profiles("software engineer", max_results=10)
# Search posts
posts = scraper.search_posts("artificial intelligence", max_results=10)
# Scrape a single profile
profile = scraper.scrape_profile("https://linkedin.com/in/someone")
# Scrape a company page
company = scraper.scrape_company("https://linkedin.com/company/openai")
# Connection requests
incoming = scraper.scrape_incoming_connections(max_results=20)
scraper.send_connection_request("https://linkedin.com/in/someone", note="Hi!")
# Conversations
threads = scraper.scrape_conversations_list(max_results=10)
messages = scraper.scrape_conversation_messages("John Doe")
scraper.send_message("Thanks for connecting!", participant_name="John Doe")
# Always clean up
scraper.close()
See the examples/ directory for more detailed usage.
Configuration
ScraperConfig accepts the following options:
| Option | Default | Description |
|---|---|---|
headless | False | Run browser without a visible window |
stealth_mode | True | Inject anti-detection scripts |
page_load_timeout | 30 | Page load timeout in seconds |
implicit_wait | 10 | Implicit wait timeout in seconds |
human_delay_range | (0.5, 2.0) | Random delay range between actions |
proxy | None | Proxy URL (http:// or socks5://) |
custom_user_agent | None | Override the default user agent |
Command Line Interface
# Search profiles
linkedin-spider-cli search -q "product manager" -n 10 -o results.json
# Search posts
linkedin-spider-cli search-posts -k "artificial intelligence" -n 10 -o posts.json
# Scrape a profile
linkedin-spider-cli profile -u "https://linkedin.com/in/johndoe" -o profile.json
# Scrape a company
linkedin-spider-cli company -u "https://linkedin.com/company/openai" -o company.json
# List connection requests
linkedin-spider-cli connections -n 20 -o connections.json
Pass --cookie on first use (or set LINKEDIN_COOKIE env var). You can also use --email/--password instead. The session is saved and reused for subsequent commands.
Output defaults to stdout. Use -o to write JSON or CSV files.
Docker
A pre-built Docker image is published on Docker Hub:
docker pull vertexcoverlabs/linkedin-mcp
Or build locally:
docker build -t linkedin-spider .
Run with different transports:
# stdio
docker run --rm -i -e TRANSPORT=stdio --env-file .env vertexcoverlabs/linkedin-mcp
# SSE
docker run -p 8080:8080 -e TRANSPORT=sse --env-file .env vertexcoverlabs/linkedin-mcp
# HTTP
docker run -p 8080:8080 -e TRANSPORT=http --env-file .env vertexcoverlabs/linkedin-mcp
Running in the Cloud
LinkedIn blocks requests from known datacenter IP ranges. To run linkedin-spider on a cloud server (AWS, GCP, Azure, etc.), route browser traffic through a residential or mobile proxy using the PROXY_URL environment variable.
LINKEDIN_COOKIE=your_li_at_cookie_value
HEADLESS=true
PROXY_URL=http://user:pass@proxy-host:port
Both HTTP and SOCKS5 proxies are supported:
PROXY_URL=http://user:pass@proxy-host:port
PROXY_URL=socks5://user:pass@proxy-host:port
When using Docker, pass it as an environment variable:
docker run --rm -i \
-e LINKEDIN_COOKIE=your_li_at_cookie_value \
-e PROXY_URL=http://user:pass@proxy-host:port \
-e HEADLESS=true \
-e TRANSPORT=stdio \
vertexcoverlabs/linkedin-mcp
When using the Python library, pass it via ScraperConfig:
config = ScraperConfig(headless=True, proxy="http://user:pass@proxy-host:port")
[!TIP] Residential proxies are recommended over datacenter proxies to avoid detection.
Development
git clone https://github.com/vertexcover-io/linkedin-spider
cd linkedin-spider
uv sync
cp .env.example .env # add your credentials
make check # lint + typecheck
make test # run tests
make build # build wheel
Disclaimer
This tool is for personal and educational use. Please respect LinkedIn's Terms of Service, use reasonable rate limits, and handle collected data responsibly.
