To build production-ready LLM applications, developers need to understand the state of their systems. You want to track key LLM metrics such as request inputs and outputs, model-specific metadata, model parameters, token counts, and cost. You also want to track the functions that manipulate the output of one LLM call and chain it into another. Parea makes it easy to get this visibility into any LLM stack.

Prerequisites

  1. First, you’ll need a Parea API key. See Authentication to get started.
  2. For any model you want to use with the SDK, set up your Provider API keys.
  3. Install the Parea SDK.

Logging

Parea automatically logs all LLM requests when using the SDK, or when using OpenAI’s API.

Usage

Parea supports automatic logging for OpenAI, Anthropic, LangChain, and any other model called through Parea’s completion method (schema definition).
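To build intuition for what "automatic logging" through a client wrapper can look like, here is a minimal stdlib-only sketch. It is not Parea's implementation: `FakeChatClient`, `wrap_client`, and the shape of the log record are hypothetical names chosen for illustration. The idea shown is only that a wrapper can replace a client's call method with a version that records inputs, output, and latency.

```python
import time
from typing import Any, Callable


class FakeChatClient:
    """Hypothetical stand-in for an LLM client; not a real provider SDK."""

    def create(self, model: str, messages: list[dict[str, str]]) -> str:
        return f"[{model}] echo: {messages[-1]['content']}"


def wrap_client(client: FakeChatClient, sink: list[dict[str, Any]]) -> None:
    """Replace client.create with a version that records each call in sink."""
    original: Callable[..., str] = client.create

    def logged_create(*args: Any, **kwargs: Any) -> str:
        start = time.perf_counter()
        output = original(*args, **kwargs)
        sink.append(
            {
                "inputs": {"args": args, "kwargs": kwargs},
                "output": output,
                "latency_s": time.perf_counter() - start,
            }
        )
        return output

    client.create = logged_create  # patches this instance only


logs: list[dict[str, Any]] = []  # in-memory stand-in for a logging backend
client = FakeChatClient()
wrap_client(client, logs)

reply = client.create(model="toy-model", messages=[{"role": "user", "content": "Hi"}])
print(reply)
print(len(logs))
```

The calling code does not change after wrapping, which is why adding a wrapper takes only a couple of lines in the examples that follow.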

OpenAI API

If you want to use OpenAI directly, you can still get automatic logging using Parea’s wrap_openai_client helper.

openai.py
from openai import OpenAI
from parea import Parea

client = OpenAI(api_key="OPENAI_API_KEY")

# All you need to do is add these two lines
p = Parea(api_key="PAREA_API_KEY")  # replace with your API key
p.wrap_openai_client(client)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    temperature=0.5,
    messages=[
        {
            "role": "user",
            "content": "Write a Hello World program in Python using FastAPI.",
        }
    ],
)
print(response.choices[0].message.content)

# Also works with the Assistants API
instructions = "You are a personal math tutor."  # example instructions; replace with your own
assistant = client.beta.assistants.create(
    name="Math Tutor",
    instructions=instructions,
    tools=[{"type": "code_interpreter"}],
    model="gpt-4-turbo-preview",
)
print(assistant)

Anthropic API

If you want to use Anthropic’s Claude directly, you can still get automatic logging using Parea’s wrap_anthropic_client helper.

anthropic.py
import anthropic
from parea import Parea

p = Parea(api_key="PAREA_API_KEY")  # replace with your API key

client = anthropic.Anthropic()
p.wrap_anthropic_client(client)

message = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Write a Hello World program in Python using FastAPI.",
        }
    ],
)
print(message.content[0].text)

Parea Completion Method

The completion method allows you to call any LLM model you have access to on Parea with the same API interface.

You have granular control over what is logged via the parameters on Parea’s completion method.

  • log_omit_inputs: bool = field(default=False) # omit the inputs to the LLM call
  • log_omit_outputs: bool = field(default=False) # omit the outputs from the LLM call
  • log_omit: bool = field(default=False) # do not log anything

parea_completion.py
from parea import Parea
from parea.schemas import LLMInputs, Message, ModelParams, Role, Completion

p = Parea(api_key="PAREA_API_KEY")  # replace with your API key

response = p.completion(
    Completion(llm_configuration=LLMInputs(
        model="gpt-3.5-turbo",  # this can be any model enabled on Parea
        model_params=ModelParams(temp=0.5),
        messages=[Message(
            role=Role.user,
            content="Write a Hello World program in Python using FastAPI.",
        )],
    ))
)
print(response.content)
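The effect of the three log_omit flags listed above can be illustrated with a small stdlib-only sketch. The field names and defaults are taken from the list; everything else (`LogOptions`, `build_log_record`, and the record layout) is hypothetical and only models the described semantics, not how Parea stores logs.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class LogOptions:
    # Field names mirror the flags listed above; the class itself is hypothetical.
    log_omit_inputs: bool = field(default=False)
    log_omit_outputs: bool = field(default=False)
    log_omit: bool = field(default=False)


def build_log_record(inputs: Any, output: Any, opts: LogOptions) -> Optional[dict[str, Any]]:
    """Return the record that would be stored, or None if nothing is logged."""
    if opts.log_omit:
        return None  # log_omit wins over the other two flags
    record: dict[str, Any] = {}
    if not opts.log_omit_inputs:
        record["inputs"] = inputs
    if not opts.log_omit_outputs:
        record["output"] = output
    return record


full = build_log_record("prompt", "answer", LogOptions())
no_inputs = build_log_record("prompt", "answer", LogOptions(log_omit_inputs=True))
nothing = build_log_record("prompt", "answer", LogOptions(log_omit=True))
print(full)
print(no_inputs)
print(nothing)
```

Omitting only inputs or only outputs is useful when one side of the call contains sensitive data but the other side is still worth inspecting.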

LangChain Framework

Parea also supports frameworks such as LangChain. Pass PareaAILangchainTracer as a callback to automatically log all requests and responses.

langchain.py
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from parea import Parea
from parea.utils.trace_integrations.langchain import PareaAILangchainTracer

# All you need to do is add these two lines
p = Parea(api_key="PAREA_API_KEY")  # replace with your API key
handler = PareaAILangchainTracer()

llm = ChatOpenAI(openai_api_key="OPENAI_API_KEY")  # replace with your API key
prompt = ChatPromptTemplate.from_messages([("user", "{input}")])
chain = prompt | llm | StrOutputParser()

response = chain.invoke(
    {"input": "Write a Hello World program in Python using FastAPI."},
    config={"callbacks": [handler]}, # <- use the callback handler here
)
print(response)

Tracing

If your LLM application has complex abstractions such as chains, agents, retrieval, tool usage, or external functions that modify or connect prompts, then you will want a trace to associate all your related logs. A Trace captures the entire lifecycle of a request and consists of one or more spans, representing different sub-steps.

Usage

The @trace decorator allows you to associate multiple processes into a single parent trace. You only need to add the decorator to the top-level function and to any non-LLM function that you also want to track.

OpenAI API

When you call OpenAI directly, wrap the client with Parea’s wrap_openai_client helper so that each LLM call is logged automatically; the @trace decorator then groups those logs under a single trace.

openai_trace_decorator.py
from openai import OpenAI
from parea import Parea, trace

client = OpenAI(api_key="OPENAI_API_KEY")  # replace with your API key

# All you need to do is add these two lines
p = Parea(api_key="PAREA_API_KEY")  # replace with your API key
p.wrap_openai_client(client)


# We generally recommend creating a helper function to make LLM API calls.
def llm(messages: list[dict[str, str]]):
    response = client.chat.completions.create(model="gpt-3.5-turbo", temperature=0.5, messages=messages)
    return response.choices[0].message.content


# (Optional) You can add a trace decorator to each prompt.
# This will give the Span the name of the function.
# Without the decorator the default name for all LLM call logs is `llm-openai`
@trace
def hello_world(lang: str, framework: str):
    return llm([{"role": "user", "content": f"Write a Hello World program in {lang} using {framework}."}])

@trace
def critique_code(code: str):
    return llm([{"role": "user", "content": f"How can we improve this code: \n {code}"}])

# Our top level function is called chain. By adding the trace decorator here,
# all sub-functions will automatically be logged and associated with this trace.
# Note that you can also add metadata to the trace; we'll revisit this functionality later.
@trace(metadata={"purpose": "example"}, end_user_identifier="John Doe")
def chain(lang: str, framework: str) -> str:
    return critique_code(hello_world(lang, framework))

Parea Completion Method

trace_decorator.py
from parea import Parea, trace
from parea.schemas import LLMInputs, Message, ModelParams, Completion

p = Parea(api_key="PAREA_API_KEY") # replace with your API key

# We generally recommend creating a helper function to make LLM API calls.
def llm(messages: list[dict[str, str]]):
    return p.completion(
        Completion(
            llm_configuration=LLMInputs(
                model="gpt-3.5-turbo",
                model_params=ModelParams(temp=0.5),
                messages=[Message(**m) for m in messages],
            )
        )
    ).content

# (Optional) You can add a trace decorator to each prompt.
# This will give the Span the name of the function.
# Without the decorator the default name for all LLM call logs is `LLM`
@trace
def hello_world(lang: str, framework: str):
    return llm([{"role": "user", "content": f"Write a Hello World program in {lang} using {framework}."}])

@trace
def critique_code(code: str):
    return llm([{"role": "user", "content": f"How can we improve this code: \n {code}"}])

# Our top level function is called chain. By adding the trace decorator here,
# all sub-functions will automatically be logged and associated with this trace.
# Note that you can also add metadata to the trace; we'll revisit this functionality later.
@trace(metadata={"purpose": "example"}, end_user_identifier="John Doe")
def chain(lang: str, framework: str) -> str:
    return critique_code(hello_world(lang, framework))

Limitations

Python: Threading & Multi-processing

The trace decorator relies on Python’s contextvars to create traces. However, when you spawn threads or processes from inside a trace, the decorator will not work correctly because the context variables are not copied to the new threads or processes. An existing issue in Python’s standard library and an explanation in the FastAPI repo discuss this limitation.

For example, when a @trace-decorated function uses a ThreadPoolExecutor to make concurrent LLM requests, the context that holds the nesting hierarchy (“we are inside another trace”) is not copied to the child threads. The created generations are then not linked to the trace and end up ‘orphaned’: in the UI, you will see a trace that is missing those generations. The workaround, and the recommended approach when using threading or multi-processing in Python, is to copy the context to the new threads or processes manually via contextvars.copy_context.

from concurrent.futures import ThreadPoolExecutor
import contextvars

from parea import Parea, trace

p = Parea(api_key="PAREA_API_KEY")  # replace with your Parea API key

@trace
def llm_call(question):
    return f"I can't answer that question: {question}"

@trace
def multiple_llm_calls(question, n_calls: int = 2):
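    with ThreadPoolExecutor(max_workers=2) as executor:
        futures = []
        for _ in range(n_calls):
            # Copy the current context so the worker thread sees the parent trace.
            context = contextvars.copy_context()
            futures.append(executor.submit(context.run, llm_call, question))
        # Collect results only after all calls are submitted, so they run concurrently.
        answers = [future.result() for future in futures]
    return answers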

response = multiple_llm_calls("Who are you?")
print(response)
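The underlying behavior can be reproduced without Parea. The stdlib-only sketch below sets a context variable in the calling thread and reads it from a worker thread twice: once submitted directly and once through a copied context. The variable name `current_trace` and the value "trace-123" are made up for illustration.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical variable standing in for "which trace are we inside?"
current_trace = contextvars.ContextVar("current_trace", default=None)


def read_trace():
    """Return whatever trace id is visible in the context this runs in."""
    return current_trace.get()


current_trace.set("trace-123")  # set in the calling thread only

with ThreadPoolExecutor(max_workers=1) as executor:
    # Submitted directly: the worker thread runs in its own context, so on
    # standard CPython builds it typically sees the default (None).
    without_copy = executor.submit(read_trace).result()

    # Submitted through a copy of the caller's context: the value is visible.
    ctx = contextvars.copy_context()
    with_copy = executor.submit(ctx.run, read_trace).result()

print(without_copy)
print(with_copy)
```

Each submitted task should get its own copy of the context, as in the example above, because a single Context object cannot be entered by two threads at the same time.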

Disabling/sampling logging

You can either disable logging or only store a percentage of all logs in Parea.

In Python, you can disable logging by setting the environment variable TURN_OFF_PAREA_LOGGING to True. Alternatively, you can deactivate logging with the parea.helpers.TurnOffPareaLogging context manager. To reduce the number of logs stored in Parea, specify log_sample_rate in the trace decorator or completion function.
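To illustrate what a sample rate means in practice, here is a stdlib-only sketch of probabilistic sampling. It assumes the rate is a fraction between 0.0 and 1.0 and that each log is kept independently with that probability; `should_log` is a hypothetical helper, not part of the Parea SDK, and Parea's actual sampling logic may differ.

```python
import random


def should_log(sample_rate: float, rng: random.Random) -> bool:
    """Keep a log with probability sample_rate (0.0 keeps none, 1.0 keeps all)."""
    return rng.random() < sample_rate


rng = random.Random(0)  # seeded so repeated runs make the same decisions
total = 10_000
kept = sum(should_log(0.25, rng) for _ in range(total))
print(kept / total)  # roughly a quarter of the logs are kept
```

With a rate of 0.25, about one in four requests would be stored, which cuts volume while still giving a representative picture of traffic.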

Streaming

What’s Next

Now that you know how to create a trace you can enrich it with metadata or learn how to: