{"id":65803,"date":"2026-01-30T03:51:23","date_gmt":"2026-01-30T03:51:23","guid":{"rendered":"https:\/\/www.askpython.com\/?p=65803"},"modified":"2026-04-16T03:48:00","modified_gmt":"2026-04-16T03:48:00","slug":"openai-python-sdk-developer-guide","status":"publish","type":"post","link":"https:\/\/www.askpython.com\/python\/examples\/openai-python-sdk-developer-guide","title":{"rendered":"OpenAI Python SDK: Complete Developer Guide (2026)"},"content":{"rendered":"<p>Want to add ChatGPT, image generation, and AI capabilities to your Python apps? The OpenAI Python SDK makes this straightforward. In this guide, you&#8217;ll build AI-powered features-from chat interfaces to semantic search-using Python 3.13 and the latest SDK patterns.<\/p>\n<p><strong>What you&#8217;ll learn<\/strong>: Chat completions with streaming responses, function calling for API integration, embeddings for semantic search, vision analysis, and production deployment with proper error handling. All code is ready to copy and run.<\/p>\n<p><strong>Quick heads-up<\/strong>: If you&#8217;re using old code with <code>openai.ChatCompletion.create()<\/code>, that pattern broke in November 2023 when version 1.0 launched. The SDK now uses client instances, which is actually cleaner once you see it in action. Don&#8217;t worry-I&#8217;ll show you the new patterns step by step.<\/p>\n<h2 class=\"wp-block-heading\">Quick start: Your first AI-powered response in 5 minutes<\/h2>\n<p>Let&#8217;s get you making AI requests right away. Install the SDK and make your first API call.<\/p>\n<p><strong>Install the SDK<\/strong>:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: bash; title: ; notranslate\" title=\"\">\npip3 install openai\n<\/pre>\n<\/div>\n<p>The package works with Python 3.13 or higher. 
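<\/p>\n<p>If a script might run under several interpreters, you can make the version requirement explicit with a small guard (a sketch; the <code>check_version<\/code> helper is illustrative):<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport sys\n\ndef check_version(current, minimum=(3, 13)):\n    &quot;&quot;&quot;Return True when the running interpreter meets the minimum.&quot;&quot;&quot;\n    return current &gt;= minimum\n\nif not check_version(sys.version_info&#x5B;:2]):\n    print(&quot;warning: this guide assumes Python 3.13 or newer&quot;)\n<\/pre>\n<\/div>\n<p>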
If you&#8217;re on an older version, this is a good time to upgrade!<\/p>\n<p><strong>Get your API key<\/strong>: Head to <a href=\"https:\/\/platform.openai.com\/api-keys\" target=\"_blank\" rel=\"noopener\">platform.openai.com\/api-keys<\/a> and create a new key. Store it as an environment variable (never hardcode it in your source files-trust me, you&#8217;ll thank yourself later):<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: bash; title: ; notranslate\" title=\"\">\nexport OPENAI_API_KEY=&#039;sk-proj-...&#039;\n<\/pre>\n<\/div>\n<p><strong>Make your first request<\/strong>:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nfrom openai import OpenAI\nimport os\n\n# Create a client instance (reuse this, don&#039;t create per request)\nclient = OpenAI(api_key=os.environ.get(&quot;OPENAI_API_KEY&quot;))\n\n# Let&#039;s talk to GPT!\nresponse = client.chat.completions.create(\n    model=&quot;gpt-5-mini&quot;,\n    messages=&#x5B;{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;Hello!&quot;}]\n)\n\nprint(response.choices&#x5B;0].message.content)\n# Output: &quot;Hello! How can I help you today?&quot;\n<\/pre>\n<\/div>\n<p>&#x1f389; <strong>Nice work!<\/strong> You just made your first AI request. The client handles authentication, retries, and connection pooling automatically-one less thing to worry about.<\/p>\n<p><strong>Pro tip<\/strong>: Create one client instance per application and reuse it. 
The SDK uses connection pooling internally, which significantly improves performance.<\/p>\n<p>For async applications (FastAPI, asyncio), use the async client:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nfrom openai import AsyncOpenAI\n\nasync_client = AsyncOpenAI(api_key=os.environ.get(&quot;OPENAI_API_KEY&quot;))\n\nasync def get_completion(prompt):\n    response = await async_client.chat.completions.create(\n        model=&quot;gpt-5-mini&quot;,\n        messages=&#x5B;{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: prompt}]\n    )\n    return response.choices&#x5B;0].message.content\n<\/pre>\n<\/div>\n<h2 class=\"wp-block-heading\">Building conversations with chat completions<\/h2>\n<p>Chat completions power everything from customer support bots to code generation tools. You send a list of messages (think of it as the conversation history), and the API returns the model&#8217;s response. Let&#8217;s explore how this works.<\/p>\n<h3 class=\"wp-block-heading\">Understanding message roles<\/h3>\n<p>The messages array is like a conversation transcript. Each message has a <code>role<\/code> (who&#8217;s speaking) and <code>content<\/code> (what they said). The model reads this entire conversation to understand context and generate relevant responses.<\/p>\n<p><strong>System messages<\/strong> set the assistant&#8217;s behavior. They define personality, constraints, and output format. 
The model treats system messages as instructions that override default behavior.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nmessages = &#x5B;\n    {&quot;role&quot;: &quot;system&quot;, &quot;content&quot;: &quot;You are a Python expert who explains concepts concisely with code examples.&quot;},\n    {&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;What are Python decorators?&quot;}\n]\n<\/pre>\n<\/div>\n<p><strong>User messages<\/strong> represent input from the person using your application. These are questions, commands, or prompts.<\/p>\n<p><strong>Assistant messages<\/strong> store the model&#8217;s previous responses. Include them to maintain conversation context across multiple turns.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nmessages = &#x5B;\n    {&quot;role&quot;: &quot;system&quot;, &quot;content&quot;: &quot;You are a helpful assistant.&quot;},\n    {&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;What is RAG?&quot;},\n    {&quot;role&quot;: &quot;assistant&quot;, &quot;content&quot;: &quot;RAG (Retrieval Augmented Generation) combines information retrieval with text generation...&quot;},\n    {&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;How do I implement it in Python?&quot;}\n]\n<\/pre>\n<\/div>\n<p>The model sees the entire message history and generates a response based on all context. This enables multi-turn conversations where the assistant remembers previous exchanges.<\/p>\n<h3 class=\"wp-block-heading\">Model parameters<\/h3>\n<p>Control model behavior with parameters. These affect creativity, response length, and output diversity.<\/p>\n<p><strong>temperature<\/strong> (0.0-2.0) controls randomness. Lower values (0.0-0.3) produce deterministic, focused responses. Higher values (0.7-1.5) increase creativity and variation. 
Use low temperature for factual tasks (code generation, data extraction) and higher temperature for creative tasks (brainstorming, storytelling).<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\n# Deterministic code generation\nresponse = client.chat.completions.create(\n    model=&quot;gpt-5&quot;,\n    messages=&#x5B;{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;Write a Python function to calculate fibonacci&quot;}],\n    temperature=0.1\n)\n\n# Creative writing\nresponse = client.chat.completions.create(\n    model=&quot;gpt-5&quot;,\n    messages=&#x5B;{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;Write a sci-fi story opening&quot;}],\n    temperature=1.2\n)\n<\/pre>\n<\/div>\n<p><strong>max_tokens<\/strong> limits response length. The model stops generating after reaching this limit. Count both input and output tokens against context window limits. GPT-5.2 supports 200k tokens, GPT-5 supports 128k tokens.<\/p>\n<p><strong>top_p<\/strong> (0.0-1.0) implements nucleus sampling. The model considers only the top P probability mass of tokens. Use <code>top_p=0.1<\/code> for focused responses or <code>top_p=0.9<\/code> for diverse outputs. Don&#8217;t adjust both temperature and top_p simultaneously.<\/p>\n<p><strong>presence_penalty<\/strong> (-2.0 to 2.0) reduces repetition of topics. Positive values penalize tokens that already appeared, encouraging the model to explore new topics.<\/p>\n<p><strong>frequency_penalty<\/strong> (-2.0 to 2.0) reduces repetition of specific tokens. 
Positive values penalize tokens based on their frequency in the output, discouraging verbatim repetition.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nresponse = client.chat.completions.create(\n    model=&quot;gpt-5&quot;,\n    messages=&#x5B;{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;Explain Python async\/await&quot;}],\n    temperature=0.7,\n    max_tokens=500,\n    presence_penalty=0.6,\n    frequency_penalty=0.3\n)\n<\/pre>\n<\/div>\n<h3 class=\"wp-block-heading\">Streaming responses<\/h3>\n<p>Streaming sends response tokens as they&#8217;re generated instead of waiting for completion. This improves perceived latency in user-facing applications. Users see text appearing progressively rather than waiting 5-10 seconds for a full response.<\/p>\n<p>Enable streaming with <code>stream=True<\/code>. The API returns an iterator of chunk objects. Each chunk contains a delta with new content.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nresponse = client.chat.completions.create(\n    model=&quot;gpt-5&quot;,\n    messages=&#x5B;{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;Explain RAG in detail&quot;}],\n    stream=True\n)\n\nfor chunk in response:\n    if chunk.choices&#x5B;0].delta.content:\n        print(chunk.choices&#x5B;0].delta.content, end=&quot;&quot;, flush=True)\n<\/pre>\n<\/div>\n<p>The streaming response yields <code>ChatCompletionChunk<\/code> objects. Access content via <code>chunk.choices[0].delta.content<\/code>. The first chunk often has empty content, and the final chunk signals completion with <code>finish_reason<\/code>.<\/p>\n<p>Handle streaming errors carefully. Network failures mid-stream leave partial responses. 
Wrap streaming in try\/except and implement retry logic.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\ndef stream_completion(prompt):\n    try:\n        response = client.chat.completions.create(\n            model=&quot;gpt-5&quot;,\n            messages=&#x5B;{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: prompt}],\n            stream=True,\n            timeout=30\n        )\n        \n        full_response = &quot;&quot;\n        for chunk in response:\n            if chunk.choices&#x5B;0].delta.content:\n                content = chunk.choices&#x5B;0].delta.content\n                print(content, end=&quot;&quot;, flush=True)\n                full_response += content\n        \n        return full_response\n    except Exception as e:\n        print(f&quot;\\nStreaming error: {e}&quot;)\n        return None\n<\/pre>\n<\/div>\n<h3 class=\"wp-block-heading\">Model comparison<\/h3>\n<p>Choose models based on task complexity, latency requirements, and budget. GPT-5.2 excels at complex reasoning but costs more. GPT-5-mini handles simple tasks at 1\/10th the cost.<\/p>\n<figure class=\"wp-block-table\">\n<table>\n<thead>\n<tr>\n<th>Model<\/th>\n<th>Cost (1M tokens input\/output)<\/th>\n<th>Context Window<\/th>\n<th>Speed<\/th>\n<th>Use Case<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>GPT-5.2<\/td>\n<td>$8\/$24<\/td>\n<td>200k<\/td>\n<td>Medium<\/td>\n<td>Complex reasoning, research, analysis<\/td>\n<\/tr>\n<tr>\n<td>GPT-5<\/td>\n<td>$3\/$9<\/td>\n<td>128k<\/td>\n<td>Fast<\/td>\n<td>General tasks, content generation<\/td>\n<\/tr>\n<tr>\n<td>GPT-5-mini<\/td>\n<td>$0.30\/$0.90<\/td>\n<td>64k<\/td>\n<td>Fastest<\/td>\n<td>Simple tasks, high-volume processing<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/figure>\n<p>GPT-5.2 outperforms GPT-5 on benchmarks requiring multi-step reasoning (GPQA, MATH, HumanEval). 
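<\/p>\n<p>One way to act on this trade-off in code is a small router that picks a model tier per request type (a hedged sketch: the task labels and mapping are illustrative, and the model names follow this guide):<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\n# Illustrative routing table: simple tasks go to the cheapest tier\nMODEL_BY_TASK = {\n    &quot;classification&quot;: &quot;gpt-5-mini&quot;,\n    &quot;summarization&quot;: &quot;gpt-5-mini&quot;,\n    &quot;content&quot;: &quot;gpt-5&quot;,\n    &quot;reasoning&quot;: &quot;gpt-5.2&quot;,\n}\n\ndef pick_model(task, default=&quot;gpt-5&quot;):\n    &quot;&quot;&quot;Return the cheapest model considered adequate for the task.&quot;&quot;&quot;\n    return MODEL_BY_TASK.get(task, default)\n\nprint(pick_model(&quot;classification&quot;))\n# Output: gpt-5-mini\n<\/pre>\n<\/div>\n<p>In practice the routing signal can come from request metadata, a cheap classifier pass, or a per-endpoint policy.<\/p>\n<p>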
For tasks like classification, summarization, or simple Q&amp;A, GPT-5-mini provides 90% of the quality at 10% of the cost.<\/p>\n<p>Test your specific use case with all three models. Measure quality (human eval or automated metrics) and cost. Many applications can use GPT-5-mini for 80% of requests and route complex queries to GPT-5.2.<\/p>\n<h2 class=\"wp-block-heading\">Function calling and tool integration<\/h2>\n<p>Function calling lets the model decide when to call external functions. Instead of generating text, the model outputs structured JSON with function names and arguments. Your code executes the function and returns results to the model, which incorporates them into the response.<\/p>\n<p>This enables API integration, database queries, calculations, and external tool usage. The model determines which function to call based on the user&#8217;s request and available tools.<\/p>\n<h3 class=\"wp-block-heading\">What is function calling?<\/h3>\n<p>Function calling solves the problem of structured output and external data access. Without it, you&#8217;d need to parse unstructured text to extract API calls or database queries. Function calling provides a structured interface.<\/p>\n<p>The workflow:<\/p>\n<ol class=\"wp-block-list\">\n<li>Define available functions with JSON schemas<\/li>\n<li>Send user message with function definitions<\/li>\n<li>Model returns function call (if needed) or text response<\/li>\n<li>Execute function with provided arguments<\/li>\n<li>Send function result back to model<\/li>\n<li>Model generates final response using function output<\/li>\n<\/ol>\n<p>Use cases include weather APIs, database lookups, calculator functions, web searches, and any external data source the model needs to access.<\/p>\n<h3 class=\"wp-block-heading\">Tool definitions<\/h3>\n<p>Define functions using JSON schemas. Each function needs a name, description, and parameter specification. 
The model uses descriptions to decide when to call functions.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\ntools = &#x5B;\n    {\n        &quot;type&quot;: &quot;function&quot;,\n        &quot;function&quot;: {\n            &quot;name&quot;: &quot;get_weather&quot;,\n            &quot;description&quot;: &quot;Get current weather for a location. Use this when users ask about weather conditions.&quot;,\n            &quot;parameters&quot;: {\n                &quot;type&quot;: &quot;object&quot;,\n                &quot;properties&quot;: {\n                    &quot;location&quot;: {\n                        &quot;type&quot;: &quot;string&quot;,\n                        &quot;description&quot;: &quot;City name or coordinates (e.g., &#039;San Francisco&#039; or &#039;37.7749,-122.4194&#039;)&quot;\n                    },\n                    &quot;unit&quot;: {\n                        &quot;type&quot;: &quot;string&quot;,\n                        &quot;enum&quot;: &#x5B;&quot;celsius&quot;, &quot;fahrenheit&quot;],\n                        &quot;description&quot;: &quot;Temperature unit&quot;\n                    }\n                },\n                &quot;required&quot;: &#x5B;&quot;location&quot;]\n            }\n        }\n    },\n    {\n        &quot;type&quot;: &quot;function&quot;,\n        &quot;function&quot;: {\n            &quot;name&quot;: &quot;calculate&quot;,\n            &quot;description&quot;: &quot;Perform mathematical calculations. 
Use for arithmetic operations.&quot;,\n            &quot;parameters&quot;: {\n                &quot;type&quot;: &quot;object&quot;,\n                &quot;properties&quot;: {\n                    &quot;expression&quot;: {\n                        &quot;type&quot;: &quot;string&quot;,\n                        &quot;description&quot;: &quot;Mathematical expression to evaluate (e.g., &#039;2 + 2&#039; or &#039;sqrt(16)&#039;)&quot;\n                    }\n                },\n                &quot;required&quot;: &#x5B;&quot;expression&quot;]\n            }\n        }\n    }\n]\n<\/pre>\n<\/div>\n<p>Write clear descriptions. The model uses them to decide which function matches the user&#8217;s intent. Specify parameter types, constraints (enums), and whether parameters are required.<\/p>\n<h3 class=\"wp-block-heading\">Parallel function calls<\/h3>\n<p>The model can call multiple functions in a single request. This reduces latency when multiple operations are independent.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport json\n\ndef get_weather(location, unit=&quot;celsius&quot;):\n    # Simulate API call\n    return {&quot;temperature&quot;: 22, &quot;condition&quot;: &quot;sunny&quot;, &quot;unit&quot;: unit}\n\ndef calculate(expression):\n    # Safe eval for demo - use a proper parser in production\n    try:\n        result = eval(expression)\n        return {&quot;result&quot;: result}\n    except:\n        return {&quot;error&quot;: &quot;Invalid expression&quot;}\n\n# User asks: &quot;What&#039;s the weather in Tokyo and what&#039;s 15 * 24?&quot;\nresponse = client.chat.completions.create(\n    model=&quot;gpt-5&quot;,\n    messages=&#x5B;{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;What&#039;s the weather in Tokyo and what&#039;s 15 * 24?&quot;}],\n    tools=tools\n)\n\n# Model returns multiple tool calls\ntool_calls = 
response.choices&#x5B;0].message.tool_calls\n\nif tool_calls:\n    # Execute all function calls\n    messages = &#x5B;{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;What&#039;s the weather in Tokyo and what&#039;s 15 * 24?&quot;}]\n    messages.append(response.choices&#x5B;0].message)\n    \n    for tool_call in tool_calls:\n        function_name = tool_call.function.name\n        arguments = json.loads(tool_call.function.arguments)\n        \n        if function_name == &quot;get_weather&quot;:\n            result = get_weather(**arguments)\n        elif function_name == &quot;calculate&quot;:\n            result = calculate(**arguments)\n        \n        messages.append({\n            &quot;role&quot;: &quot;tool&quot;,\n            &quot;tool_call_id&quot;: tool_call.id,\n            &quot;content&quot;: json.dumps(result)\n        })\n    \n    # Get final response with function results\n    final_response = client.chat.completions.create(\n        model=&quot;gpt-5&quot;,\n        messages=messages\n    )\n    \n    print(final_response.choices&#x5B;0].message.content)\n    # Output: &quot;The weather in Tokyo is 22\u00b0C and sunny. 15 * 24 equals 360.&quot;\n<\/pre>\n<\/div>\n<p>The model executes both function calls in parallel (conceptually). Your code runs them and returns results. The model then generates a natural language response incorporating both answers.<\/p>\n<h3 class=\"wp-block-heading\">Tool choice control<\/h3>\n<p>Control whether the model must call functions or can respond with text. 
The <code>tool_choice<\/code> parameter accepts:<\/p>\n<ul class=\"wp-block-list\">\n<li><code>\"auto\"<\/code>: Model decides whether to call functions (default)<\/li>\n<li><code>\"required\"<\/code>: Model must call at least one function<\/li>\n<li><code>\"none\"<\/code>: Disable function calling for this request<\/li>\n<li><code>{\"type\": \"function\", \"function\": {\"name\": \"function_name\"}}<\/code>: Force specific function<\/li>\n<\/ul>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\n# Force the model to call get_weather\nresponse = client.chat.completions.create(\n    model=&quot;gpt-5&quot;,\n    messages=&#x5B;{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;Tell me about Tokyo&quot;}],\n    tools=tools,\n    tool_choice={&quot;type&quot;: &quot;function&quot;, &quot;function&quot;: {&quot;name&quot;: &quot;get_weather&quot;}}\n)\n<\/pre>\n<\/div>\n<p>Use <code>\"required\"<\/code> when you always need structured output. Use specific function selection when you know which function to call but want the model to extract parameters from natural language.<\/p>\n<h2 class=\"wp-block-heading\">Embeddings for semantic search and clustering<\/h2>\n<p>Embeddings convert text into dense vectors that capture semantic meaning. Similar texts produce similar vectors. 
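<\/p>\n<p>Vector similarity is usually measured with cosine similarity: 1.0 for vectors pointing the same way, near 0.0 for unrelated ones. A minimal illustration with toy 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions):<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport math\n\ndef cosine_similarity(a, b):\n    &quot;&quot;&quot;Cosine of the angle between two equal-length vectors.&quot;&quot;&quot;\n    dot = sum(x * y for x, y in zip(a, b))\n    norm_a = math.sqrt(sum(x * x for x in a))\n    norm_b = math.sqrt(sum(x * x for x in b))\n    return dot \/ (norm_a * norm_b)\n\ncat = &#x5B;0.9, 0.1, 0.0]\nkitten = &#x5B;0.8, 0.2, 0.0]\ncar = &#x5B;0.0, 0.1, 0.9]\n\nprint(cosine_similarity(cat, kitten) &gt; cosine_similarity(cat, car))\n# Output: True\n<\/pre>\n<\/div>\n<p>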
Use embeddings for semantic search, clustering, recommendation systems, and anomaly detection.<\/p>\n<h3 class=\"wp-block-heading\">text-embedding-3-large vs text-embedding-3-small<\/h3>\n<p>OpenAI provides two embedding models with different dimension sizes and costs.<\/p>\n<figure class=\"wp-block-table\">\n<table>\n<thead>\n<tr>\n<th>Model<\/th>\n<th>Dimensions<\/th>\n<th>Cost (1M tokens)<\/th>\n<th>Use Case<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>text-embedding-3-large<\/td>\n<td>3072<\/td>\n<td>$0.13<\/td>\n<td>High-quality semantic search, research<\/td>\n<\/tr>\n<tr>\n<td>text-embedding-3-small<\/td>\n<td>1536<\/td>\n<td>$0.02<\/td>\n<td>Cost-sensitive applications, prototyping<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/figure>\n<p>Higher dimensions capture more nuance but cost more to store and search. For most applications, <code>text-embedding-3-small<\/code> provides sufficient quality at 1\/6th the cost.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\n# Generate embeddings\nresponse = client.embeddings.create(\n    model=&quot;text-embedding-3-small&quot;,\n    input=&quot;Retrieval Augmented Generation combines information retrieval with text generation&quot;\n)\n\nembedding = response.data&#x5B;0].embedding\nprint(f&quot;Embedding dimensions: {len(embedding)}&quot;)\n# Output: Embedding dimensions: 1536\n<\/pre>\n<\/div>\n<p>The response contains a list of embedding objects. 
Each has an <code>embedding<\/code> field with the vector and an <code>index<\/code> field indicating position in the input batch.<\/p>\n<h3 class=\"wp-block-heading\">Semantic search with embeddings<\/h3>\n<p>Build semantic search by embedding documents, storing vectors in a database, and finding nearest neighbors for queries.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport numpy as np\nfrom sklearn.metrics.pairwise import cosine_similarity\n\n# Sample documents\ndocuments = &#x5B;\n    &quot;Python is a high-level programming language&quot;,\n    &quot;Machine learning models require training data&quot;,\n    &quot;Vector databases store embeddings for similarity search&quot;,\n    &quot;FastAPI is a modern web framework for Python&quot;,\n    &quot;Neural networks consist of layers of interconnected nodes&quot;\n]\n\n# Generate embeddings for all documents\ndoc_embeddings = &#x5B;]\nfor doc in documents:\n    response = client.embeddings.create(\n        model=&quot;text-embedding-3-small&quot;,\n        input=doc\n    )\n    doc_embeddings.append(response.data&#x5B;0].embedding)\n\n# Convert to numpy array\ndoc_embeddings = np.array(doc_embeddings)\n\n# Search function\ndef search(query, top_k=3):\n    # Embed query\n    query_response = client.embeddings.create(\n        model=&quot;text-embedding-3-small&quot;,\n        input=query\n    )\n    query_embedding = np.array(&#x5B;query_response.data&#x5B;0].embedding])\n    \n    # Calculate cosine similarity\n    similarities = cosine_similarity(query_embedding, doc_embeddings)&#x5B;0]\n    \n    # Get top k results\n    top_indices = np.argsort(similarities)&#x5B;::-1]&#x5B;:top_k]\n    \n    results = &#x5B;]\n    for idx in top_indices:\n        results.append({\n            &quot;document&quot;: documents&#x5B;idx],\n            &quot;similarity&quot;: similarities&#x5B;idx]\n        })\n    \n    return results\n\n# Test 
search\nresults = search(&quot;What is a web framework?&quot;)\nfor i, result in enumerate(results, 1):\n    print(f&quot;{i}. {result&#x5B;&#039;document&#039;]} (similarity: {result&#x5B;&#039;similarity&#039;]:.3f})&quot;)\n\n# Output:\n# 1. FastAPI is a modern web framework for Python (similarity: 0.842)\n# 2. Python is a high-level programming language (similarity: 0.721)\n# 3. Vector databases store embeddings for similarity search (similarity: 0.654)\n<\/pre>\n<\/div>\n<p>For production systems, use vector databases like <a href=\"https:\/\/github.com\/facebookresearch\/faiss\" target=\"_blank\" rel=\"noopener\">FAISS<\/a>, <a href=\"https:\/\/www.pinecone.io\/\" target=\"_blank\" rel=\"noopener\">Pinecone<\/a>, or <a href=\"https:\/\/weaviate.io\/\" target=\"_blank\" rel=\"noopener\">Weaviate<\/a> instead of computing similarities in Python. These databases provide approximate nearest neighbor search that scales to millions of vectors.<\/p>\n<h3 class=\"wp-block-heading\">Clustering and classification<\/h3>\n<p>Embeddings enable unsupervised clustering and few-shot classification. 
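<\/p>\n<p>Few-shot classification can be as simple as nearest centroid: average the embeddings of a few labeled examples per class, then assign each new vector to the closest centroid. A sketch with toy 2-dimensional vectors standing in for real embeddings (labels and values are illustrative):<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport math\n\ndef centroid(vectors):\n    &quot;&quot;&quot;Element-wise mean of a list of equal-length vectors.&quot;&quot;&quot;\n    return &#x5B;sum(dim) \/ len(vectors) for dim in zip(*vectors)]\n\ndef classify(vector, centroids):\n    &quot;&quot;&quot;Return the label whose centroid is nearest (Euclidean distance).&quot;&quot;&quot;\n    def dist(a, b):\n        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))\n    return min(centroids, key=lambda label: dist(vector, centroids&#x5B;label]))\n\n# Two labeled example &quot;embeddings&quot; per class\ncentroids = {\n    &quot;programming&quot;: centroid(&#x5B;&#x5B;0.9, 0.1], &#x5B;0.8, 0.2]]),\n    &quot;ai-ml&quot;: centroid(&#x5B;&#x5B;0.1, 0.9], &#x5B;0.2, 0.8]]),\n}\n\nprint(classify(&#x5B;0.85, 0.15], centroids))\n# Output: programming\n<\/pre>\n<\/div>\n<p>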
Similar documents cluster together in embedding space.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nfrom sklearn.cluster import KMeans\n\n# Cluster documents into 2 groups\nkmeans = KMeans(n_clusters=2, random_state=42)\nclusters = kmeans.fit_predict(doc_embeddings)\n\nfor i, doc in enumerate(documents):\n    print(f&quot;Cluster {clusters&#x5B;i]}: {doc}&quot;)\n\n# Output:\n# Cluster 0: Python is a high-level programming language\n# Cluster 1: Machine learning models require training data\n# Cluster 1: Vector databases store embeddings for similarity search\n# Cluster 0: FastAPI is a modern web framework for Python\n# Cluster 1: Neural networks consist of layers of interconnected nodes\n<\/pre>\n<\/div>\n<p>The model groups Python\/FastAPI together (programming) and ML\/vectors\/neural networks together (AI\/ML) without supervision.<\/p>\n<h2 class=\"wp-block-heading\">Image analysis with GPT-5.2 Vision<\/h2>\n<p>GPT-5.2 Vision analyzes images and answers questions about visual content. Use it for image captioning, OCR, visual question answering, and content moderation.<\/p>\n<h3 class=\"wp-block-heading\">Supported image formats<\/h3>\n<p>The API accepts images as URLs or base64-encoded data. Supported formats include JPEG, PNG, GIF, and WebP. 
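<\/p>\n<p>A quick pre-flight check catches unsupported files before you spend a request (a sketch; the helper and constants are illustrative, and the size cap matches the API&#8217;s 20MB limit):<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport os\n\nALLOWED_EXTENSIONS = {&quot;.jpg&quot;, &quot;.jpeg&quot;, &quot;.png&quot;, &quot;.gif&quot;, &quot;.webp&quot;}\nMAX_BYTES = 20 * 1024 * 1024  # 20MB API limit\n\ndef validate_image(path, size_bytes):\n    &quot;&quot;&quot;Return a list of problems; an empty list means the file looks OK.&quot;&quot;&quot;\n    problems = &#x5B;]\n    ext = os.path.splitext(path)&#x5B;1].lower()\n    if ext not in ALLOWED_EXTENSIONS:\n        problems.append(f&quot;unsupported format: {ext or &#039;none&#039;}&quot;)\n    if size_bytes &gt; MAX_BYTES:\n        problems.append(f&quot;file too large: {size_bytes} bytes&quot;)\n    return problems\n\nprint(validate_image(&quot;photo.png&quot;, 1_000_000))\n# Output: &#x5B;]\n<\/pre>\n<\/div>\n<p>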
Maximum file size is 20MB.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\n# Analyze image from URL\nresponse = client.chat.completions.create(\n    model=&quot;gpt-5.2&quot;,\n    messages=&#x5B;\n        {\n            &quot;role&quot;: &quot;user&quot;,\n            &quot;content&quot;: &#x5B;\n                {&quot;type&quot;: &quot;text&quot;, &quot;text&quot;: &quot;What&#039;s in this image?&quot;},\n                {\n                    &quot;type&quot;: &quot;image_url&quot;,\n                    &quot;image_url&quot;: {\n                        &quot;url&quot;: &quot;https:\/\/example.com\/image.jpg&quot;\n                    }\n                }\n            ]\n        }\n    ]\n)\n\nprint(response.choices&#x5B;0].message.content)\n<\/pre>\n<\/div>\n<p>For base64 images, encode the file and include it in the data URL:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport base64\n\ndef encode_image(image_path):\n    with open(image_path, &quot;rb&quot;) as image_file:\n        return base64.b64encode(image_file.read()).decode(&#039;utf-8&#039;)\n\nbase64_image = encode_image(&quot;diagram.png&quot;)\n\nresponse = client.chat.completions.create(\n    model=&quot;gpt-5.2&quot;,\n    messages=&#x5B;\n        {\n            &quot;role&quot;: &quot;user&quot;,\n            &quot;content&quot;: &#x5B;\n                {&quot;type&quot;: &quot;text&quot;, &quot;text&quot;: &quot;Describe this diagram&quot;},\n                {\n                    &quot;type&quot;: &quot;image_url&quot;,\n                    &quot;image_url&quot;: {\n                        &quot;url&quot;: f&quot;data:image\/png;base64,{base64_image}&quot;\n                    }\n                }\n            ]\n        }\n    ]\n)\n<\/pre>\n<\/div>\n<h3 class=\"wp-block-heading\">Multi-image analysis<\/h3>\n<p>Send multiple images in a single 
request. The model analyzes all images and answers questions about them.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nresponse = client.chat.completions.create(\n    model=&quot;gpt-5.2&quot;,\n    messages=&#x5B;\n        {\n            &quot;role&quot;: &quot;user&quot;,\n            &quot;content&quot;: &#x5B;\n                {&quot;type&quot;: &quot;text&quot;, &quot;text&quot;: &quot;Compare these two architecture diagrams&quot;},\n                {&quot;type&quot;: &quot;image_url&quot;, &quot;image_url&quot;: {&quot;url&quot;: &quot;https:\/\/example.com\/arch1.jpg&quot;}},\n                {&quot;type&quot;: &quot;image_url&quot;, &quot;image_url&quot;: {&quot;url&quot;: &quot;https:\/\/example.com\/arch2.jpg&quot;}}\n            ]\n        }\n    ]\n)\n<\/pre>\n<\/div>\n<p>Use cases include comparing before\/after images, analyzing multi-page documents, or processing image sequences.<\/p>\n<h3 class=\"wp-block-heading\">OCR and text extraction<\/h3>\n<p>GPT-5.2 Vision extracts text from images without dedicated OCR libraries. 
It handles handwriting, complex layouts, and multiple languages.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nresponse = client.chat.completions.create(\n    model=&quot;gpt-5.2&quot;,\n    messages=&#x5B;\n        {\n            &quot;role&quot;: &quot;user&quot;,\n            &quot;content&quot;: &#x5B;\n                {&quot;type&quot;: &quot;text&quot;, &quot;text&quot;: &quot;Extract all text from this image and format it as markdown&quot;},\n                {&quot;type&quot;: &quot;image_url&quot;, &quot;image_url&quot;: {&quot;url&quot;: &quot;https:\/\/example.com\/document.jpg&quot;}}\n            ]\n        }\n    ]\n)\n\nextracted_text = response.choices&#x5B;0].message.content\nprint(extracted_text)\n<\/pre>\n<\/div>\n<p>The model understands document structure and can format extracted text appropriately (preserving headings, lists, tables).<\/p>\n<h2 class=\"wp-block-heading\">Production error handling patterns<\/h2>\n<p>Production systems need robust error handling. The OpenAI API returns specific exceptions for different failure modes. Handle them appropriately to build reliable applications.<\/p>\n<h3 class=\"wp-block-heading\">Common errors<\/h3>\n<p>The SDK raises typed exceptions for different error conditions:<\/p>\n<p><strong>RateLimitError<\/strong>: You exceeded your rate limit (requests per minute or tokens per minute). This happens during traffic spikes or when processing large batches.<\/p>\n<p><strong>APIError<\/strong>: OpenAI&#8217;s servers returned a 500 error. This indicates a temporary server issue. Retry with exponential backoff.<\/p>\n<p><strong>AuthenticationError<\/strong>: Invalid API key or insufficient permissions. Check your API key and organization settings.<\/p>\n<p><strong>InvalidRequestError<\/strong>: Malformed request (invalid parameters, unsupported model, etc.). 
Fix the request parameters.<\/p>\n<p><strong>APIConnectionError<\/strong>: Network failure or timeout. Retry with backoff.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nfrom openai import OpenAI, RateLimitError, APIError, AuthenticationError\n\nclient = OpenAI()\n\ntry:\n    response = client.chat.completions.create(\n        model=&quot;gpt-5&quot;,\n        messages=&#x5B;{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;Hello&quot;}]\n    )\nexcept RateLimitError as e:\n    print(f&quot;Rate limit exceeded: {e}&quot;)\n    # Implement backoff and retry\nexcept APIError as e:\n    print(f&quot;Server error: {e}&quot;)\n    # Retry after delay\nexcept AuthenticationError as e:\n    print(f&quot;Authentication failed: {e}&quot;)\n    # Check API key\nexcept Exception as e:\n    print(f&quot;Unexpected error: {e}&quot;)\n<\/pre>\n<\/div>\n<h3 class=\"wp-block-heading\">Retry strategies with exponential backoff<\/h3>\n<p>Implement retries for transient errors (rate limits, server errors). 
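<\/p>\n<p>The delay schedule itself is simple: double the wait on each attempt and clamp it between a floor and a ceiling. A hand-rolled sketch of the schedule (the function is illustrative; the tenacity settings used in this section produce the same 2s, 4s, 8s pattern):<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\ndef backoff_delay(attempt, multiplier=1, min_s=2, max_s=10):\n    &quot;&quot;&quot;Seconds to wait before retry number attempt (1-based).&quot;&quot;&quot;\n    return max(min_s, min(max_s, multiplier * 2 ** attempt))\n\nprint(&#x5B;backoff_delay(n) for n in range(1, 5)])\n# Output: &#x5B;2, 4, 8, 10]\n<\/pre>\n<\/div>\n<p>Production retry loops usually add random jitter to these delays so that many clients don&#8217;t retry in lockstep.<\/p>\n<p>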
Use exponential backoff to avoid overwhelming the API during outages.<\/p>\n<p>The <code>tenacity<\/code> library provides decorators for retry logic:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nfrom tenacity import (\n    retry,\n    stop_after_attempt,\n    wait_exponential,\n    retry_if_exception_type\n)\nfrom openai import APIConnectionError, InternalServerError, RateLimitError\n\n@retry(\n    retry=retry_if_exception_type((RateLimitError, APIConnectionError, InternalServerError)),\n    stop=stop_after_attempt(3),\n    wait=wait_exponential(multiplier=1, min=2, max=10)\n)\ndef call_gpt5(prompt):\n    response = client.chat.completions.create(\n        model=&quot;gpt-5&quot;,\n        messages=&#x5B;{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: prompt}]\n    )\n    return response.choices&#x5B;0].message.content\n\n# Usage\ntry:\n    result = call_gpt5(&quot;Explain Python decorators&quot;)\n    print(result)\nexcept Exception as e:\n    print(f&quot;Failed after retries: {e}&quot;)\n<\/pre>\n<\/div>\n<p>This makes up to 3 attempts, waiting 2s and then 4s between them. The decorator retries only transient errors (<code>RateLimitError<\/code>, <code>APIConnectionError<\/code>, <code>InternalServerError<\/code>), not authentication or bad request errors.<\/p>\n<h3 class=\"wp-block-heading\">Timeout configuration<\/h3>\n<p>Set timeouts to prevent hanging requests. The SDK accepts a <code>timeout<\/code> parameter (in seconds).<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nclient = OpenAI(timeout=30.0)  # 30 second timeout\n\n# Or per-request\nresponse = client.chat.completions.create(\n    model=&quot;gpt-5&quot;,\n    messages=&#x5B;{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;Long task&quot;}],\n    timeout=60.0\n)\n<\/pre>\n<\/div>\n<p>Use shorter timeouts (10-30s) for user-facing applications. 
Use longer timeouts (60-120s) for batch processing or complex reasoning tasks.<\/p>\n<h2 class=\"wp-block-heading\">Production deployment and optimization<\/h2>\n<p>Production systems need cost tracking, logging, caching, and rate limiting. These patterns reduce costs and improve reliability.<\/p>\n<h3 class=\"wp-block-heading\">Cost tracking with token counting<\/h3>\n<p>Track token usage to monitor costs and optimize prompts. The <code>tiktoken<\/code> library counts tokens for OpenAI models.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport tiktoken\n\ndef count_tokens(text, model=&quot;gpt-5&quot;):\n    try:\n        encoding = tiktoken.encoding_for_model(model)\n    except KeyError:\n        # Fallback if tiktoken doesn&#039;t recognize the model name yet\n        encoding = tiktoken.get_encoding(&quot;o200k_base&quot;)\n    return len(encoding.encode(text))\n\n# Count tokens in a prompt\nprompt = &quot;Explain retrieval augmented generation in detail&quot;\ntoken_count = count_tokens(prompt)\nprint(f&quot;Prompt tokens: {token_count}&quot;)\n\n# Estimate cost\ninput_cost_per_1m = 3.00  # GPT-5\noutput_cost_per_1m = 9.00\nestimated_output_tokens = 500\n\ninput_cost = (token_count \/ 1_000_000) * input_cost_per_1m\noutput_cost = (estimated_output_tokens \/ 1_000_000) * output_cost_per_1m\ntotal_cost = input_cost + output_cost\n\nprint(f&quot;Estimated cost: ${total_cost:.6f}&quot;)\n<\/pre>\n<\/div>\n<p>Log token usage for every request. Aggregate by user, endpoint, or time period to identify cost drivers.<\/p>\n<h3 class=\"wp-block-heading\">Caching strategies<\/h3>\n<p>Cache responses for identical prompts. 
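For a single process, a plain in-memory memo goes a long way before you add Redis. Here is a minimal sketch; the <code>call_model<\/code> stub stands in for the real <code>client.chat.completions.create<\/code> call:<\/p>

```python
from functools import lru_cache

def make_cached(call_model, maxsize=1024):
    """Wrap a completion function with an in-process LRU cache."""
    @lru_cache(maxsize=maxsize)
    def cached(model, prompt):
        return call_model(model, prompt)
    return cached

# Stub standing in for the real API call; counts invocations for the demo
calls = {"n": 0}
def call_model(model, prompt):
    calls["n"] += 1
    return f"response to: {prompt}"

cached = make_cached(call_model)
first = cached("gpt-5", "Explain Python decorators")
second = cached("gpt-5", "Explain Python decorators")  # served from the cache
```

<p>Identical (model, prompt) pairs hit the cache, so the underlying call runs only once. 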
This eliminates redundant API calls and reduces costs.<\/p>\n<p>Use Redis for distributed caching:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport redis\nimport json\nimport hashlib\n\nredis_client = redis.Redis(host=&#039;localhost&#039;, port=6379, decode_responses=True)\n\ndef cached_completion(prompt, model=&quot;gpt-5&quot;, ttl=3600):\n    # Create cache key\n    cache_key = hashlib.sha256(f&quot;{model}:{prompt}&quot;.encode()).hexdigest()\n    \n    # Check cache\n    cached = redis_client.get(cache_key)\n    if cached:\n        return json.loads(cached)\n    \n    # Call API\n    response = client.chat.completions.create(\n        model=model,\n        messages=&#x5B;{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: prompt}]\n    )\n    \n    result = response.choices&#x5B;0].message.content\n    \n    # Store in cache\n    redis_client.setex(cache_key, ttl, json.dumps(result))\n    \n    return result\n<\/pre>\n<\/div>\n<p>Set appropriate TTLs based on content freshness requirements. Cache static content (documentation Q&amp;A) for hours or days. Cache dynamic content (news summaries) for minutes.<\/p>\n<h3 class=\"wp-block-heading\">Rate limiting<\/h3>\n<p>Respect OpenAI&#8217;s rate limits to avoid 429 errors. 
Implement client-side throttling for high-volume applications.<\/p>\n<p>GPT-5 tier limits (as of 2026):<\/p>\n<ul class=\"wp-block-list\">\n<li>Free tier: 200 requests\/day, 40k tokens\/day<\/li>\n<li>Tier 1: 500 requests\/minute, 200k tokens\/minute<\/li>\n<li>Tier 2: 5,000 requests\/minute, 2M tokens\/minute<\/li>\n<\/ul>\n<p>Use a token bucket algorithm for rate limiting:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport time\nfrom threading import Lock\n\nclass RateLimiter:\n    def __init__(self, requests_per_minute):\n        self.requests_per_minute = requests_per_minute\n        self.tokens = requests_per_minute\n        self.last_update = time.time()\n        self.lock = Lock()\n    \n    def acquire(self):\n        with self.lock:\n            now = time.time()\n            elapsed = now - self.last_update\n            \n            # Refill tokens\n            self.tokens = min(\n                self.requests_per_minute,\n                self.tokens + elapsed * (self.requests_per_minute \/ 60)\n            )\n            self.last_update = now\n            \n            if self.tokens &gt;= 1:\n                self.tokens -= 1\n                return True\n            else:\n                # Wait until the next token is available, then consume it\n                wait_time = (1 - self.tokens) * (60 \/ self.requests_per_minute)\n                time.sleep(wait_time)\n                self.last_update = time.time()  # don&#039;t re-credit the sleep on the next refill\n                self.tokens = 0\n                return True\n\n# Usage\nlimiter = RateLimiter(requests_per_minute=500)\n\ndef rate_limited_completion(prompt):\n    limiter.acquire()\n    return client.chat.completions.create(\n        model=&quot;gpt-5&quot;,\n        messages=&#x5B;{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: prompt}]\n    )\n<\/pre>\n<\/div>\n<h3 class=\"wp-block-heading\">Logging and monitoring<\/h3>\n<p>Log all requests and responses for debugging and analysis. 
Track latency, error rates, and token usage.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport logging\nimport time\n\nlogging.basicConfig(level=logging.INFO)\nlogger = logging.getLogger(__name__)\n\ndef monitored_completion(prompt, model=&quot;gpt-5&quot;):\n    start_time = time.time()\n    \n    try:\n        response = client.chat.completions.create(\n            model=model,\n            messages=&#x5B;{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: prompt}]\n        )\n        \n        latency = time.time() - start_time\n        \n        logger.info({\n            &quot;model&quot;: model,\n            &quot;prompt_length&quot;: len(prompt),\n            &quot;completion_tokens&quot;: response.usage.completion_tokens,\n            &quot;total_tokens&quot;: response.usage.total_tokens,\n            &quot;latency_ms&quot;: latency * 1000,\n            &quot;status&quot;: &quot;success&quot;\n        })\n        \n        return response.choices&#x5B;0].message.content\n        \n    except Exception as e:\n        latency = time.time() - start_time\n        \n        logger.error({\n            &quot;model&quot;: model,\n            &quot;error&quot;: str(e),\n            &quot;error_type&quot;: type(e).__name__,\n            &quot;latency_ms&quot;: latency * 1000,\n            &quot;status&quot;: &quot;error&quot;\n        })\n        \n        raise\n<\/pre>\n<\/div>\n<p>Integrate with observability platforms like <a href=\"https:\/\/www.langchain.com\/langsmith\" target=\"_blank\" rel=\"noopener\">LangSmith<\/a>, <a href=\"https:\/\/arize.com\/\" target=\"_blank\" rel=\"noopener\">Arize<\/a>, or <a href=\"https:\/\/www.datadoghq.com\/\" target=\"_blank\" rel=\"noopener\">Datadog<\/a> for production monitoring.<\/p>\n<h2 class=\"wp-block-heading\">Frequently asked questions<\/h2>\n<h3 class=\"wp-block-heading\">What&#8217;s the difference between GPT-5.2 and 
GPT-5?<\/h3>\n<p>GPT-5.2 is the flagship model with superior reasoning, longer context (200k vs 128k tokens), and better performance on complex tasks. It costs 2.7x more than GPT-5. Use GPT-5.2 for research, analysis, and tasks requiring multi-step reasoning. Use GPT-5 for general content generation, summarization, and conversational AI.<\/p>\n<h3 class=\"wp-block-heading\">How do I reduce API costs?<\/h3>\n<p>Use GPT-5-mini for simple tasks (classification, extraction, simple Q&amp;A). It costs 1\/10th of GPT-5 with 90% of the quality. Implement caching to avoid redundant API calls. Count tokens and optimize prompts to reduce input length. Use lower <code>max_tokens<\/code> limits to cap response length. Batch requests when possible.<\/p>\n<h3 class=\"wp-block-heading\">Can I use the SDK with Azure OpenAI?<\/h3>\n<p>Yes. Azure OpenAI provides the same models through a different endpoint. Initialize the client with Azure credentials:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nfrom openai import AzureOpenAI\n\nclient = AzureOpenAI(\n    api_key=&quot;your-azure-key&quot;,\n    api_version=&quot;2024-02-01&quot;,\n    azure_endpoint=&quot;https:\/\/your-resource.openai.azure.com&quot;\n)\n<\/pre>\n<\/div>\n<p>The API is identical except for authentication and endpoint configuration.<\/p>\n<h3 class=\"wp-block-heading\">What&#8217;s the rate limit for GPT-5?<\/h3>\n<p>Rate limits depend on your usage tier. Tier 1 (paid accounts) gets 500 requests\/minute and 200k tokens\/minute. Tier 2 gets 5,000 requests\/minute and 2M tokens\/minute. Check your limits at <a href=\"https:\/\/platform.openai.com\/account\/limits\" target=\"_blank\" rel=\"noopener\">platform.openai.com\/account\/limits<\/a>.<\/p>\n<h3 class=\"wp-block-heading\">How do I handle long conversations that exceed context limits?<\/h3>\n<p>Implement conversation summarization. 
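A sliding-window trim is the simplest building block; this hypothetical helper keeps the newest turns, using a rough 4-characters-per-token estimate in place of tiktoken:<\/p>

```python
def approx_tokens(text):
    """Very rough token estimate (~4 characters per token)."""
    return max(1, len(text) // 4)

def trim_history(messages, max_tokens=120_000):
    """Keep any system messages plus the newest turns that fit the budget."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    budget = max_tokens - sum(approx_tokens(m["content"]) for m in system)
    kept = []
    for msg in reversed(rest):  # walk newest -> oldest
        cost = approx_tokens(msg["content"])
        if cost > budget:
            break
        budget -= cost
        kept.append(msg)
    return system + list(reversed(kept))
```

<p>Turns that fall out of the window can be condensed into a one-message summary rather than dropped. 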
When the conversation approaches the context limit (128k tokens for GPT-5), summarize old messages and keep only recent context. Use a sliding window approach or hierarchical summarization. Alternatively, use the Assistants API which manages conversation state automatically.<\/p>\n<h3 class=\"wp-block-heading\">Should I use sync or async client?<\/h3>\n<p>Use <code>AsyncOpenAI()<\/code> for async applications (FastAPI, asyncio-based services). Use <code>OpenAI()<\/code> for synchronous scripts and applications. The async client provides better concurrency when making multiple API calls. Don&#8217;t use async unless your application is already async.<\/p>\n<h3 class=\"wp-block-heading\">How do I test OpenAI integrations?<\/h3>\n<p>Mock the OpenAI client in tests. Use dependency injection to swap the real client with a mock:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\nfrom unittest.mock import Mock\n\ndef test_completion():\n    mock_client = Mock()\n    mock_response = Mock()\n    mock_response.choices = &#x5B;Mock(message=Mock(content=&quot;Test response&quot;))]\n    mock_client.chat.completions.create.return_value = mock_response\n    \n    # Test your code with mock_client\n    result = your_function(mock_client)\n    assert result == &quot;Test response&quot;\n<\/pre>\n<\/div>\n<p>For integration tests, use a test API key with low rate limits and monitor costs.<\/p>\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n<p>The OpenAI Python SDK 1.x provides a modern, type-safe interface to GPT-5.2, GPT-5, embeddings, vision, and assistants. The SDK uses client instances, Pydantic response models, and proper async support. Production systems need error handling with retries, cost tracking with token counting, caching for redundant requests, and rate limiting to respect API quotas.<\/p>\n<p>Start with GPT-5-mini for prototyping and simple tasks. 
Use GPT-5 for general applications and GPT-5.2 for complex reasoning. Implement function calling to integrate external APIs and tools. Use embeddings for semantic search and clustering. Add vision capabilities for image analysis. Deploy assistants for stateful conversations with built-in code execution and file search.<\/p>\n<p>Test different models on your specific use case. Measure quality and cost. Optimize prompts to reduce token usage. Implement caching and rate limiting. Monitor latency and error rates. The SDK provides the building blocks for production AI applications in 2026.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Want to add ChatGPT, image generation, and AI capabilities to your Python apps? The OpenAI Python SDK makes this straightforward. In this guide, you&#8217;ll build AI-powered features-from chat interfaces to semantic search-using Python 3.13 and the latest SDK patterns. What you&#8217;ll learn: Chat completions with streaming responses, function calling for API integration, embeddings for semantic 
[&hellip;]<\/p>\n","protected":false},"author":6,"featured_media":65805,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[9],"tags":[],"class_list":["post-65803","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-examples"],"blocksy_meta":[],"_links":{"self":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts\/65803","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/comments?post=65803"}],"version-history":[{"count":0,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts\/65803\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/media\/65805"}],"wp:attachment":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/media?parent=65803"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/categories?post=65803"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/tags?post=65803"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}