A comprehensive guide to profiling and optimizing Python packages for real-world performance. Learn how to measure, analyze, and improve your package’s startup time, import overhead, and runtime performance.
Why Profile Your Python Package?
Performance matters. Users notice when your CLI takes 3 seconds to start or when your library adds 500ms to import time. This guide covers practical techniques to identify and fix performance bottlenecks.
Measuring Import Time
Python’s import system can be surprisingly slow. Here’s how to measure it:
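A quick way to get a number is to time the import with time.perf_counter(); here your_package is a placeholder for your real package name:

```python
import time

start = time.perf_counter()
import your_package  # placeholder: replace with your actual package name
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"import your_package: {elapsed_ms:.1f}ms")
```

For a per-module breakdown, CPython's `-X importtime` flag prints self and cumulative import times to stderr: `python -X importtime -c "import your_package"`.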
Eager imports at module level are the #1 cause of slow startup. Look for:
# BAD: Eager import at module level
from heavy_dependency import HeavyClass
# GOOD: Lazy import when needed
def get_heavy_class():
    from heavy_dependency import HeavyClass
    return HeavyClass
Using importlib.util.find_spec
Check if a module is available without importing it:
import importlib.util
# Fast availability check (no import)
HEAVY_AVAILABLE = importlib.util.find_spec("heavy_module") is not None
# Lazy import helper
def _get_heavy_module():
    if HEAVY_AVAILABLE:
        import heavy_module
        return heavy_module
    return None
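A usage sketch for the helper above; the optional module's load() API and the pure-Python fallback are hypothetical stand-ins:

```python
def load_data(path):
    heavy = _get_heavy_module()
    if heavy is not None:
        return heavy.load(path)  # hypothetical API of the optional dependency
    # pure-Python fallback when heavy_module is not installed
    with open(path) as f:
        return f.read()
```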
Lazy Loading with __getattr__
Python 3.7+ supports module-level __getattr__ for lazy loading:
# In your_package/__init__.py
_lazy_cache = {}
def __getattr__(name):
    if name in _lazy_cache:
        return _lazy_cache[name]
    if name == "HeavyClass":
        from .heavy_module import HeavyClass
        _lazy_cache[name] = HeavyClass
        return HeavyClass
    raise AttributeError(f"module has no attribute {name}")
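With the module-level __getattr__ in place, importing the package stays cheap and the heavy import happens on first attribute access (your_package is a placeholder):

```python
import your_package               # fast: heavy_module is not imported yet
obj = your_package.HeavyClass()   # first access triggers the deferred import
```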
Creating Timeline Diagrams
Visualize execution phases with ASCII timeline diagrams:
def create_timeline(phases):
    """
    phases: list of (name, duration_ms) tuples
    """
    total = sum(d for _, d in phases)
    scale = 50.0 / total

    # Top line
    line = "ENTER "
    for name, ms in phases:
        width = max(8, int(ms * scale))
        line += "─" * width
    line += "► END"
    print(line)

    # Phase names, aligned under the bar
    line = " " * len("ENTER ")
    for name, ms in phases:
        width = max(8, int(ms * scale))
        line += "│" + name.center(width - 1)
    line += "│"
    print(line)

    print(f"{'':>50} TOTAL: {total:.0f}ms")
Comparing SDK vs Wrapper Performance
When building wrappers around SDKs, measure the overhead:
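A minimal measurement sketch; sdk_call() and wrapper_call() are placeholders standing in for the raw SDK call and your wrapper around it:

```python
import statistics
import time

def time_call(fn, runs=10):
    """Return the mean wall-clock time of fn() in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.mean(samples)

def sdk_call():        # placeholder for the raw SDK code path
    time.sleep(0.02)

def wrapper_call():    # placeholder for your wrapper around the SDK
    time.sleep(0.001)  # wrapper work: validation, logging, ...
    sdk_call()

sdk_ms = time_call(sdk_call)
wrapper_ms = time_call(wrapper_call)
print(f"SDK: {sdk_ms:.1f}ms  wrapper: {wrapper_ms:.1f}ms  "
      f"overhead: {(wrapper_ms - sdk_ms) / sdk_ms * 100:.1f}%")
```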
Target: Keep wrapper overhead under 5% of SDK time.
CLI vs Python API Performance
CLI tools have additional overhead from subprocess spawning:
# Python API (faster)
from your_package import YourClass
result = YourClass().run()

# CLI (slower due to subprocess)
import subprocess
subprocess.run(['your-cli', 'command'])
Typical CLI overhead: 100-300ms for subprocess spawn + Python startup.
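You can measure that overhead directly by timing a trivial invocation; your-cli is a placeholder for your actual console script:

```python
import subprocess
import time

start = time.perf_counter()
subprocess.run(["your-cli", "--help"], capture_output=True)
print(f"CLI --help round trip: {(time.perf_counter() - start) * 1000:.0f}ms")
```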
Caching Strategies
Class-Level Caching
class MyClass:
    _cached_client = None

    @classmethod
    def get_client(cls):
        if cls._cached_client is None:
            cls._cached_client = ExpensiveClient()
        return cls._cached_client
Configuration Caching
_config_applied = False

def apply_config():
    global _config_applied
    if _config_applied:
        return
    # Expensive configuration
    _config_applied = True
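For zero-argument initializers like the ones above, functools.lru_cache from the standard library gives the same run-once behavior with less boilerplate; this is an alternative sketch, not what the snippets above use:

```python
from functools import lru_cache

@lru_cache(maxsize=1)
def get_client():
    # Runs once; subsequent calls return the cached instance
    return ExpensiveClient()  # ExpensiveClient as in the example above
```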
Profiling Pitfalls
1. Measuring in Development Mode
Debug mode, assertions, and development dependencies add overhead. Profile in production-like conditions.
2. Ignoring Variance
Always run multiple iterations and report standard deviation:
import statistics

times = [measure() for _ in range(10)]
print(f'{statistics.mean(times):.0f}ms (±{statistics.stdev(times):.0f}ms)')
3. Profiler Overhead
cProfile adds ~10-20% overhead. For accurate timing, use time.perf_counter() for wall-clock measurements.
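One way to keep the two concerns separate; do_work() is a placeholder workload:

```python
import cProfile
import pstats
import time

def do_work():
    return sum(i * i for i in range(100_000))  # placeholder workload

# Wall-clock timing: low overhead, use this for the numbers you report
start = time.perf_counter()
do_work()
print(f"do_work: {(time.perf_counter() - start) * 1000:.1f}ms")

# cProfile: use it to see where the time goes, not for absolute numbers
profiler = cProfile.Profile()
profiler.enable()
do_work()
profiler.disable()
pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)
```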
4. Network Variance
API calls have high variance. Separate network timing from local computation.
Performance Targets
Reasonable targets for Python packages:
| Metric | Target |
| --- | --- |
| CLI `--help` | < 500ms |
| Package import | < 100ms |
| Wrapper overhead vs SDK | < 5% |
| Profiling overhead | < 5% |
Summary
Key techniques for Python package performance:
- Measure first: Use time.perf_counter() and cProfile
- Separate phases: Import, init, network, execution
- Lazy load: Use __getattr__ and importlib.util.find_spec
- Cache wisely: Module-level caching for expensive operations
- Multiple runs: Report mean and standard deviation
- Timeline diagrams: Visualize where time is spent
Performance optimization is iterative. Measure, identify bottlenecks, fix, and measure again.
Agent Skills are a powerful way to extend AI agents with specialized capabilities. This tutorial walks you through creating skills at three complexity levels, from basic instructions to full-featured skills with scripts and resources.
What are Agent Skills?
Agent Skills are folders containing instructions, scripts, and resources that AI agents can discover and use to perform specific tasks more reliably. They follow the open agentskills.io specification and use progressive disclosure to manage context efficiently.
Level 1: Basic Skill
The simplest skill contains just a SKILL.md file with YAML frontmatter and instructions.
Example: Code Review Skill
Create the directory structure:
mkdir -p ~/.praison/skills/code-review
Create ~/.praison/skills/code-review/SKILL.md:
---
name: code-review
description: Review code for bugs, security issues, and best practices. Use when asked to review, audit, or check code quality.
---
# Code Review Skill
## When to Use
Activate this skill when the user asks to:
- Review code for bugs or issues
- Check code quality
- Audit security vulnerabilities
- Suggest improvements
## Review Checklist
### 1. Security
- [ ] No hardcoded secrets or API keys
- [ ] Input validation present
- [ ] SQL injection prevention
- [ ] XSS protection for web code
### 2. Code Quality
- [ ] Functions are single-purpose
- [ ] Variable names are descriptive
- [ ] No code duplication
- [ ] Error handling is comprehensive
### 3. Performance
- [ ] No unnecessary loops
- [ ] Efficient data structures used
- [ ] Database queries optimized
## Output Format
Provide findings in this format:
1. **Critical Issues** - Must fix before deployment
2. **Warnings** - Should address soon
3. **Suggestions** - Nice to have improvements
Using the Basic Skill
from praisonaiagents import Agent
# Agent automatically discovers skills from default directories
agent = Agent(
    instructions="You are a code assistant",
    skills_dirs=["~/.praison/skills"]
)
# The skill is available when relevant
response = agent.chat("Review this Python function for issues: def login(user, pwd): return db.query(f'SELECT * FROM users WHERE name={user}')")
print(response)
Level 2: Skill with Script
Add executable scripts for deterministic operations that benefit from code execution rather than LLM generation.
Example: Data Validation Skill
Create the directory structure:
mkdir -p ~/.praison/skills/data-validator/scripts
Create ~/.praison/skills/data-validator/SKILL.md:
---
name: data-validator
description: Validate CSV and JSON data files for schema compliance, data types, and required fields. Use when asked to validate, check, or verify data files.
---
# Data Validation Skill
## When to Use
Activate this skill when the user needs to:
- Validate CSV or JSON files
- Check data types and formats
- Verify required fields exist
- Find data quality issues
## Available Scripts
### validate_csv.py
Validates CSV files against expected schema.
Usage:
```bash
python scripts/validate_csv.py <file.csv> [--schema schema.json]
```
### validate_json.py
Validates JSON files against JSON Schema.
Usage:
```bash
python scripts/validate_json.py <file.json> --schema schema.json
```
## Workflow
1. Identify the file type (CSV or JSON)
2. Run the appropriate validation script
3. Report any validation errors found
4. Suggest fixes for common issues
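The SKILL.md above references scripts/validate_csv.py; here is a minimal sketch of what such a script could look like (illustrative only; it assumes a simple JSON schema mapping column names to expected types):

```python
#!/usr/bin/env python3
"""Validate a CSV file against an optional {column: type} JSON schema."""
import argparse
import csv
import json
import sys

def main():
    parser = argparse.ArgumentParser(description="Validate a CSV file")
    parser.add_argument("csv_file")
    parser.add_argument("--schema", help="JSON file mapping column name to expected type")
    args = parser.parse_args()

    schema = {}
    if args.schema:
        with open(args.schema) as f:
            schema = json.load(f)

    errors = []
    with open(args.csv_file, newline="") as f:
        reader = csv.DictReader(f)
        missing = set(schema) - set(reader.fieldnames or [])
        if missing:
            errors.append(f"Missing required columns: {sorted(missing)}")
        for line_no, row in enumerate(reader, start=2):  # line 1 is the header
            for column, expected in schema.items():
                value = row.get(column, "")
                if expected == "int" and value and not value.lstrip("-").isdigit():
                    errors.append(f"Line {line_no}: '{column}' is not an integer: {value!r}")

    for error in errors:
        print(error)
    sys.exit(1 if errors else 0)

if __name__ == "__main__":
    main()
```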
Level 3: Full Skill with Scripts, References, and Assets
A full skill adds reference documents and assets alongside its scripts. Create ~/.praison/skills/report-generator/SKILL.md:
---
name: report-generator
description: Generate professional reports in multiple formats (PDF, HTML, Markdown). Use when asked to create reports, summaries, or documentation from data.
license: Apache-2.0
compatibility: Requires Python 3.8+ with reportlab and jinja2
metadata:
author: PraisonAI
version: "1.0"
---
# Report Generator Skill
## Overview
This skill generates professional reports from data in multiple formats.
## When to Use
Activate when the user needs to:
- Generate PDF reports from data
- Create HTML documentation
- Build formatted Markdown reports
- Use branded report templates
## Available Scripts
### generate_report.py
Main report generation script.
```bash
python scripts/generate_report.py --input data.json --output report.pdf --template default
```
Options:
- `--input`: Input data file (JSON or CSV)
- `--output`: Output file path
- `--format`: Output format (pdf, html, md)
- `--template`: Template name from assets/
## References
- See [TEMPLATES.md](references/TEMPLATES.md) for template customization
- See [STYLING.md](references/STYLING.md) for styling options
## Assets
- `assets/default_template.html` - Default HTML template
- `assets/report_styles.css` - CSS styles for reports
- `assets/logo.png` - Default logo for headers
Create ~/.praison/skills/report-generator/references/TEMPLATES.md:
# Template Customization Guide
## Available Templates
### default
The default template provides a clean, professional layout suitable for most reports.
### executive
A condensed template for executive summaries with key metrics highlighted.
### technical
Detailed template with code blocks, tables, and technical formatting.
## Creating Custom Templates
1. Copy an existing template from `assets/`
2. Modify the HTML/CSS as needed
3. Reference your template with `--template your-template`
## Template Variables
Templates support these variables:
- `{{title}}` - Report title
- `{{date}}` - Generation date
- `{{sections}}` - Content sections
- `{{logo}}` - Logo path
Using the Full Skill
from praisonaiagents import Agent
from praisonaiagents.skills import SkillManager, SkillLoader
# Initialize and discover skills
manager = SkillManager()
manager.discover()
# Get skill info
skill = manager.get_skill("report-generator")
print(f"Skill: {skill.properties.name}")
print(f"Path: {skill.properties.path}")
# Load all resources (Level 3)
loader = SkillLoader()
loaded = loader.load(str(skill.properties.path), activate=True)
loader.load_all_resources(loaded)
# Access different resource types
print(f"\nScripts: {list(loaded.get_scripts().keys())}")
print(f"References: {list(loaded.get_references().keys())}")
print(f"Assets: {list(loaded.get_assets().keys())}")
# Use with an agent
agent = Agent(
    instructions="You are a report generation assistant",
    skills=["~/.praison/skills/report-generator"]
)
# Get the skills XML for system prompt
skills_xml = agent.get_skills_prompt()
print(f"\nSkills XML:\n{skills_xml}")
Progressive Disclosure in Action
Skills use three levels of loading to manage context efficiently:
- **Metadata (Level 1)** - Only each skill's name and description are injected into the system prompt, so discovery stays cheap.
- **Instructions (Level 2)** - The full SKILL.md body is loaded when the skill becomes relevant.
- **Resources (Level 3)** - Scripts, references, and assets are read or executed only on demand.
PraisonAI includes built-in code execution tools that work seamlessly with skill scripts:
from praisonai.code import execute_command, run_python
# Execute a skill script
result = execute_command(
    "python ~/.praison/skills/data-validator/scripts/validate_csv.py data.csv"
)
print(result['stdout'])
# Run Python code directly
result = run_python("""
import json
data = {"status": "success", "count": 42}
print(json.dumps(data))
""")
print(result['stdout'])
CLI Commands
PraisonAI provides CLI commands for skill management:
# List all discovered skills
praisonai skills list
# Validate a skill directory
praisonai skills validate --path ~/.praison/skills/report-generator
# Create a new skill from template
praisonai skills create --name my-new-skill
# Generate XML prompt for skills
praisonai skills prompt
Best Practices
- **Keep SKILL.md lean** - Move detailed documentation to references/
- **Use scripts for determinism** - Code execution is more reliable than LLM generation for specific operations
- **Be specific in descriptions** - Clear descriptions help agents know when to use skills
- **Follow naming conventions** - Lowercase, hyphens, max 64 characters
- **Test your skills** - Use `praisonai skills validate` before deployment
Conclusion
Agent Skills provide a powerful, standardized way to extend AI agents with specialized capabilities. Whether you need simple instructions or complex workflows with scripts and resources, the progressive disclosure model ensures efficient context usage while enabling sophisticated functionality.
PraisonAI's implementation is fully compliant with the agentskills.io specification, making your skills portable across compatible agent products.
The Model Context Protocol (MCP) is an open standard created by Anthropic that allows AI applications to connect to external tools and data sources in a standardized way. Converting your Python package to support MCP enables it to be used with Claude Desktop, Cursor, VS Code, and other MCP-compatible AI tools.
Prerequisites
- Python 3.10 or higher
- uv package manager (recommended) or pip
- Basic understanding of async Python
Step 1: Install the MCP Python SDK
The official Python SDK makes it easy to create MCP servers:
# Using uv (recommended)
uv add mcp
# Or using pip
pip install mcp
Step 2: Create Your MCP Server
Use FastMCP to quickly create an MCP server that exposes your package’s functionality as tools:
from mcp.server.fastmcp import FastMCP

# Create an MCP server
mcp = FastMCP("My Package Server")

# Expose a function as a tool
@mcp.tool()
def my_function(param1: str, param2: int = 10) -> str:
    """Description of what this tool does."""
    # Your existing package logic here
    return f"Result: {param1}, {param2}"

# Expose data as a resource
@mcp.resource("data://{item_id}")
def get_data(item_id: str) -> str:
    """Get data by ID."""
    return f"Data for {item_id}"

# Run the server
if __name__ == "__main__":
    mcp.run()  # Uses stdio transport by default
Step 3: Add Entry Point to Your Package
Update your pyproject.toml to include an entry point:
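A minimal sketch, assuming the server above lives in my_package/mcp_server.py and wraps mcp.run() in a main() function so a console script can launch it:

```toml
[project.scripts]
my-package-mcp = "my_package.mcp_server:main"
```

MCP clients can then start the server by running the my-package-mcp command.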