Advanced Topics

Deep dives into Xeepy internals and advanced usage patterns.

Topic            Description             Difficulty
---------------  ----------------------  ------------
Architecture     System design overview  Intermediate
Custom Scrapers  Build new scrapers      Advanced
Proxies          Proxy configuration     Intermediate
Stealth          Detection avoidance     Advanced
Distributed      Multi-machine scaling   Expert
Performance      Speed optimization      Intermediate
Docker           Container deployment    Intermediate
Testing          Testing strategies      Intermediate
Errors           Error handling          Beginner
Plugins          Plugin development      Advanced

Architecture Overview

┌─────────────────────────────────────────────────────────────────┐
│                         Xeepy                                   │
├─────────────────────────────────────────────────────────────────┤
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐        │
│  │ Scrapers │  │ Actions  │  │ Monitor  │  │ Analytics│        │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘        │
│       │             │             │             │               │
│  ┌────▼─────────────▼─────────────▼─────────────▼────┐         │
│  │                    Core                            │         │
│  │  ┌────────────┐  ┌────────────┐  ┌────────────┐  │         │
│  │  │  Browser   │  │    Auth    │  │Rate Limiter│  │         │
│  │  └────────────┘  └────────────┘  └────────────┘  │         │
│  └───────────────────────┬───────────────────────────┘         │
│                          │                                      │
│  ┌───────────────────────▼───────────────────────────┐         │
│  │                    Storage                         │         │
│  │  ┌────────┐  ┌────────┐  ┌────────┐  ┌────────┐  │         │
│  │  │ SQLite │  │  CSV   │  │  JSON  │  │  Excel │  │         │
│  │  └────────┘  └────────┘  └────────┘  └────────┘  │         │
│  └───────────────────────────────────────────────────┘         │
└─────────────────────────────────────────────────────────────────┘

Key Concepts

Browser Management

Xeepy uses Playwright for browser automation:

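# Import path: from xeepy.core.browser import BrowserManager
# Interface outline only; method bodies are omitted.
from playwright.async_api import Page  # assumed source of the Page type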

class BrowserManager:
    """Manages browser lifecycle and page pool."""

    async def start(self) -> None:
        """Launch browser with stealth configuration."""

    async def new_page(self) -> Page:
        """Create new page with rate limiting."""

    async def close(self) -> None:
        """Clean shutdown with session save."""

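The three methods above suggest a start/use/close lifecycle. A minimal sketch of one way to guarantee that close() always runs, assuming nothing about Xeepy beyond those method names: `FakeBrowserManager` and `managed` are hypothetical helpers written for this example, and no real browser is launched.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeBrowserManager:
    """Stand-in exposing the same three method names; records calls only."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    async def start(self) -> None:
        self.calls.append("start")

    async def new_page(self) -> str:
        self.calls.append("new_page")
        return "page-1"  # a real manager would return a Page object

    async def close(self) -> None:
        self.calls.append("close")


@asynccontextmanager
async def managed(manager):
    """Start the manager and make sure close() runs, even if the body raises."""
    await manager.start()
    try:
        yield manager
    finally:
        await manager.close()


async def main() -> list[str]:
    manager = FakeBrowserManager()
    async with managed(manager) as m:
        await m.new_page()
    return manager.calls


calls = asyncio.run(main())
print(calls)
```

Wrapping the lifecycle in an async context manager keeps the shutdown step (and, per the docstring above, the session save) from being skipped when a scrape fails midway.
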
Rate Limiting

An adaptive rate limiter throttles requests and backs off based on responses, which protects your account:

# Import path: from xeepy.core.rate_limiter import RateLimiter
# Interface outline only; method bodies are omitted.

class RateLimiter:
    """Adaptive rate limiter with backoff."""

    def __init__(
        self,
        requests_per_minute: int = 20,
        burst_limit: int = 5,
        backoff_factor: float = 2.0
    ): ...

    async def wait(self) -> None:
        """Wait for rate limit clearance."""

    def record_response(self, status: int) -> None:
        """Adjust limits based on response."""

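To make the interface above concrete, here is a minimal sketch of how such a limiter could behave. It is not Xeepy's implementation: `SketchRateLimiter` is a hypothetical class, the token-bucket approach is a swapped-in technique, and treating HTTP 429 as the backoff trigger is an assumption.

```python
import asyncio
import time


class SketchRateLimiter:
    """Illustrative token bucket with multiplicative backoff (not Xeepy's code)."""

    def __init__(
        self,
        requests_per_minute: int = 20,
        burst_limit: int = 5,
        backoff_factor: float = 2.0,
    ) -> None:
        self.base_interval = 60.0 / requests_per_minute  # seconds per token
        self.interval = self.base_interval
        self.burst_limit = burst_limit
        self.backoff_factor = backoff_factor
        self.tokens = float(burst_limit)  # start with a full burst allowance
        self.last_refill = time.monotonic()

    def _refill(self) -> None:
        """Credit tokens earned since the last refill, capped at burst_limit."""
        now = time.monotonic()
        earned = (now - self.last_refill) / self.interval
        self.tokens = min(float(self.burst_limit), self.tokens + earned)
        self.last_refill = now

    async def wait(self) -> None:
        """Sleep until one token is available, then consume it."""
        self._refill()
        while self.tokens < 1.0:
            await asyncio.sleep((1.0 - self.tokens) * self.interval)
            self._refill()
        self.tokens -= 1.0

    def record_response(self, status: int) -> None:
        """Slow down on HTTP 429 (assumed trigger); recover on 2xx responses."""
        self._refill()  # credit tokens at the old rate before changing it
        if status == 429:
            self.interval *= self.backoff_factor
        elif 200 <= status < 300:
            self.interval = max(
                self.base_interval, self.interval / self.backoff_factor
            )
```

In this sketch, a caller awaits `wait()` before each request and passes the resulting status code to `record_response()`, so the request interval grows after throttling and shrinks back toward the configured rate after successes.
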
Event System

Xeepy emits events for monitoring:

from xeepy.core.events import EventEmitter

class Xeepy(EventEmitter):
    """Emits events during operations."""

    # Events:
    # - "scrape:start", "scrape:complete", "scrape:error"
    # - "action:start", "action:complete", "action:error"
    # - "auth:login", "auth:logout", "auth:expired"
    # - "rate_limit:warning", "rate_limit:hit"

# Subscribe to events (x is an existing Xeepy instance)
x.on("scrape:complete", lambda data: print(f"Scraped {len(data)} items"))
x.on("rate_limit:warning", lambda: print("Slowing down..."))
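
The subscribe-and-emit pattern above can be sketched in a few lines. This is a hypothetical stand-in (`SketchEmitter`), not the actual `xeepy.core.events.EventEmitter`; only the `on(event, handler)` call shape comes from the example above, and `emit` is an assumed counterpart.

```python
from collections import defaultdict
from typing import Any, Callable


class SketchEmitter:
    """Minimal event emitter: handlers are stored per event name."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable[..., Any]]] = defaultdict(list)

    def on(self, event: str, handler: Callable[..., Any]) -> None:
        """Register a handler for an event name."""
        self._handlers[event].append(handler)

    def emit(self, event: str, *args: Any) -> int:
        """Call every handler for the event; return how many were called."""
        handlers = list(self._handlers.get(event, []))
        for handler in handlers:
            handler(*args)
        return len(handlers)


x = SketchEmitter()
scraped_counts: list[int] = []
x.on("scrape:complete", lambda data: scraped_counts.append(len(data)))
x.emit("scrape:complete", ["tweet-1", "tweet-2", "tweet-3"])
print(scraped_counts)  # → [3]
```

Note that a handler's signature has to match the arguments passed at emit time, which is why the "rate_limit:warning" handler above takes no parameters while the "scrape:complete" handler takes one.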

Configuration Hierarchy

Configuration is loaded in this order (later overrides earlier):

  1. Defaults - Built-in defaults
  2. System config - /etc/xeepy/config.toml
  3. User config - ~/.config/xeepy/config.toml
  4. Project config - ./xeepy.toml
  5. Environment variables - XEEPY_*
  6. CLI arguments - --option value
  7. Runtime - x.config.setting = value
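
The override order above amounts to merging layers from lowest to highest priority. A minimal sketch, assuming flat key/value settings: the helper names (`env_layer`, `resolve`), the setting names (`headless`, `timeout`), and the mapping from `XEEPY_TIMEOUT` to `timeout` are all hypothetical and chosen only for illustration.

```python
from typing import Any, Mapping


def env_layer(environ: Mapping[str, str], prefix: str = "XEEPY_") -> dict[str, str]:
    """Turn XEEPY_FOO=bar into {"foo": "bar"} (key naming is an assumption)."""
    return {
        key[len(prefix):].lower(): value
        for key, value in environ.items()
        if key.startswith(prefix)
    }


def resolve(*layers: Mapping[str, Any]) -> dict[str, Any]:
    """Merge layers left to right; later layers override earlier ones."""
    merged: dict[str, Any] = {}
    for layer in layers:
        merged.update(layer)
    return merged


defaults = {"headless": True, "timeout": 30}                   # 1. built-in defaults
project = {"timeout": 60}                                      # 4. ./xeepy.toml
env = env_layer({"XEEPY_TIMEOUT": "90", "PATH": "/usr/bin"})   # 5. environment
cli = {"headless": False}                                      # 6. CLI arguments

config = resolve(defaults, project, env, cli)
print(config)  # → {'headless': False, 'timeout': '90'}
```

Environment variables arrive as strings, so a real loader would also need to convert values such as `'90'` to the type the setting expects.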

Extension Points

Xeepy is extended by subclassing its base classes for scrapers, actions, and notifiers:

Custom Scrapers

from xeepy.scrapers.base import BaseScraper

class MyScraper(BaseScraper):
    """Custom scraper implementation."""

    async def scrape(self, target: str, **kwargs) -> ScrapeResult:
        # Your implementation
        pass

Custom Actions

from xeepy.actions.base import BaseAction

class MyAction(BaseAction):
    """Custom action implementation."""

    async def execute(self, **kwargs) -> ActionResult:
        # Your implementation
        pass

Custom Notifications

from xeepy.notifications.base import BaseNotifier

class MyNotifier(BaseNotifier):
    """Custom notification channel."""

    async def send(self, message: str, **kwargs) -> bool:
        # Your implementation
        pass
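
As an illustration of filling in one of these stubs, here is a sketch of a notifier that collects messages in memory, which can be handy in tests. The `BaseNotifier` defined here is a local stand-in with an assumed shape, not the real `xeepy.notifications.base.BaseNotifier`; `ListNotifier`, the `level` keyword, and the meaning of the boolean return value (True when delivered) are likewise assumptions.

```python
import asyncio
from abc import ABC, abstractmethod


class BaseNotifier(ABC):
    """Local stand-in for the real base class (assumed interface)."""

    @abstractmethod
    async def send(self, message: str, **kwargs) -> bool:
        """Deliver a message; return True on success."""


class ListNotifier(BaseNotifier):
    """Hypothetical channel that stores messages in a list."""

    def __init__(self) -> None:
        self.sent: list[str] = []

    async def send(self, message: str, **kwargs) -> bool:
        if not message.strip():
            return False  # nothing to deliver
        level = kwargs.get("level", "info").upper()  # hypothetical option
        self.sent.append(f"[{level}] {message}")
        return True


notifier = ListNotifier()
ok = asyncio.run(notifier.send("Scrape finished", level="warning"))
print(ok, notifier.sent)  # → True ['[WARNING] Scrape finished']
```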

Performance Characteristics

Operation       Typical Speed  Memory Usage
--------------  -------------  ------------
Profile scrape  1-2 sec        ~50 MB
100 tweets      10-20 sec      ~100 MB
1000 followers  60-120 sec     ~200 MB
Follow action   2-3 sec        ~50 MB
Like action     1-2 sec        ~50 MB

Security Considerations

  • Sessions are stored encrypted by default
  • Credentials are never logged
  • Rate limiting protects accounts
  • Proxy support enables IP rotation
  • Stealth mode helps avoid detection

Next Steps