# SGLang

> Instrumenting your SGLang application with Parea AI

SGLang from [LMSYS](https://lmsys.org) is "a Structured Generation Language designed for LLMs".
Its main benefit is that it lets you structure complex LLM programs with multiple chained generation calls, control flow, multiple modalities, parallelism, and external interaction using plain Python.
Additionally, you can improve the performance of local LLMs with its RadixAttention mechanism, which automatically reuses the KV cache across multiple calls.

## Quickstart

```bash python theme={null}
pip install parea-ai "sglang[openai]"
```

First, create a Parea API key as shown [here](/welcome/setup).
Second, call `integrate_with_sglang()` on the Parea client to automatically instrument any OpenAI calls made through SGLang.
Finally, define a wrapper function such as `run_and_trace` that logs the outputs of the SGLang program to Parea and creates a single trace grouping all of its LLM calls.
The following code snippet demonstrates a simple multi-turn question-answering program that logs the outputs of the LLM calls to Parea.

```python theme={null}
import time

from parea import Parea, trace
from parea.schemas import UpdateTraceScenario
from parea.utils.trace_utils import fill_trace_data, get_current_trace_id
from sglang import OpenAI, SglFunction, assistant, function, gen, set_default_backend, system, user
from sglang.lang.interpreter import ProgramState


p = Parea(api_key="PAREA_API_KEY")  # Replace with your Parea API key
p.integrate_with_sglang()


@function
def multi_turn_question(s, question_1, question_2):
    s += system("You are a helpful assistant.")
    s += user(question_1)
    s += assistant(gen("answer_1", max_tokens=256))
    s += user(question_2)
    s += assistant(gen("answer_2", max_tokens=256))


@trace(log_omit_outputs=True)
def run_and_trace(func: SglFunction, *args, **kwargs) -> ProgramState:
    state: ProgramState = func.run(*args, **kwargs)
    while not state.stream_executor.is_finished:
        time.sleep(1)
    # the returned state does not carry the generated outputs as a return value,
    # but the executor's variables hold them, so we log those as the trace result
    fill_trace_data(get_current_trace_id(), {'result': state.stream_executor.variables}, UpdateTraceScenario.RESULT)
    return state

set_default_backend(OpenAI("gpt-3.5-turbo"))

run_and_trace(multi_turn_question, question_1="What is the capital of Sweden?", question_2="List two local attractions.")
```
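
The `while` loop in `run_and_trace` polls `is_finished` once per second and has no upper bound, so a stalled program would block forever. As a rough sketch of one alternative, the helper below polls with a configurable interval and gives up after a timeout. Note that `wait_until_finished` and `FakeExecutor` are hypothetical names introduced here for illustration; they are not part of SGLang or the Parea SDK. The only assumption about the real executor is the boolean `is_finished` attribute already used in the snippet above.

```python
import time


def wait_until_finished(executor, poll_interval: float = 0.1, timeout: float = 60.0) -> bool:
    """Poll `executor.is_finished` until it is true or `timeout` seconds have passed.

    Hypothetical helper (not part of SGLang or Parea).
    Returns True if the executor finished, False if the timeout was reached first.
    """
    deadline = time.monotonic() + timeout
    while not executor.is_finished:
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True


class FakeExecutor:
    """Hypothetical stand-in for a stream executor, used only to exercise the helper.

    Reports "not finished" for the first `pending_checks` reads of `is_finished`,
    then reports "finished" on every later read.
    """

    def __init__(self, pending_checks: int):
        self._pending = pending_checks

    @property
    def is_finished(self) -> bool:
        if self._pending > 0:
            self._pending -= 1
            return False
        return True


if __name__ == "__main__":
    finished = wait_until_finished(FakeExecutor(3), poll_interval=0.01, timeout=1.0)
    print("finished:", finished)
```

Under these assumptions, the loop inside `run_and_trace` could be replaced by a call such as `wait_until_finished(state.stream_executor)`, with the caller deciding how to handle a `False` result (for example, by logging the timeout instead of the variables).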

## Visualization of the Trace

This will produce the following trace:

<img src="https://mintcdn.com/pareaai/cxAhBMLitjWj5gEW/integrations/sglang-trace.png?fit=max&auto=format&n=cxAhBMLitjWj5gEW&q=85&s=9dffa926d6cf6770e115e1c45add3035" alt="SGLang Trace" width="1853" height="922" data-path="integrations/sglang-trace.png" />
