ell
Looks Good
I just saw a new repo called ell, and it can make an LLM return structured data defined by a Pydantic model. Taken from the docs:
```python
import ell
from pydantic import BaseModel, Field

class MovieReview(BaseModel):
    title: str = Field(description="The title of the movie")
    rating: int = Field(description="The rating of the movie out of 10")
    summary: str = Field(description="A brief summary of the movie")

@ell.complex(model="gpt-4o-2024-08-06", response_format=MovieReview)
def generate_movie_review(movie: str) -> MovieReview:
    """You are a movie review generator. Given the name of a movie, you need to return a structured review."""
    return f"generate a review for the movie {movie}"
```
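To get a feel for what a decorator like that might be doing, here is a hypothetical sketch (my own names, not ell's actual implementation) of the pattern the docs describe: the docstring becomes the system prompt and the function's return value becomes the user message.

```python
import inspect
from typing import Callable

def complex_sketch(model: str, response_format: type) -> Callable:
    """Hypothetical sketch of how a decorator like ell's @ell.complex
    might assemble a request from a prompt function."""
    def decorator(fn: Callable) -> Callable:
        def wrapper(*args, **kwargs):
            messages = [
                # the docstring becomes the system prompt
                {"role": "system", "content": inspect.getdoc(fn)},
                # the returned string becomes the user message
                {"role": "user", "content": fn(*args, **kwargs)},
            ]
            # A real implementation would now call the LLM client with
            # `model`, `messages`, and `response_format`; here we just
            # return the assembled request for illustration.
            return {"model": model,
                    "messages": messages,
                    "response_format": response_format.__name__}
        return wrapper
    return decorator

@complex_sketch(model="gpt-4o-2024-08-06", response_format=dict)
def generate_movie_review(movie: str) -> str:
    """You are a movie review generator."""
    return f"generate a review for the movie {movie}"

request = generate_movie_review("Dune")
```

This also explains the oddity above: the function body never "returns" the review; it only produces the user message that the wrapper sends.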
Besides the (IMHO) unsettling use of the docstring as the system prompt, and the fact that the returned string is not what the function actually returns, it's a lot simpler than the equivalent langchain and llamaindex invocations (granted, they also provide a lot more flexibility).
Anyway, I thought it would work with any LLM, but looking at the documentation it seems to rely on a new beta feature of the OpenAI client, i.e. `client.beta.chat.completions.parse`, which you can read more about in OpenAI's blog post from last August.
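Under the hood, that beta endpoint derives a JSON schema from the Pydantic model and constrains the model's output to it. A hedged sketch (the commented-out call follows OpenAI's structured-outputs docs; actually running it needs an API key):

```python
from pydantic import BaseModel

class MovieReview(BaseModel):
    title: str
    rating: int
    summary: str

# parse() derives a JSON schema from the Pydantic model and sends it
# along with the request; roughly this schema:
schema = MovieReview.model_json_schema()

# The actual call (needs an API key); the .parsed attribute holds a
# validated MovieReview instance:
#
# from openai import OpenAI
# client = OpenAI()
# completion = client.beta.chat.completions.parse(
#     model="gpt-4o-2024-08-06",
#     messages=[{"role": "user", "content": "generate a review for the movie Dune"}],
#     response_format=MovieReview,
# )
# review = completion.choices[0].message.parsed
```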
Not on the Shoulders of Instructor
And here I thought they were building on top of Instructor, another great library for getting structured data out of LLMs:
```python
import instructor
from pydantic import BaseModel
from openai import OpenAI

# Define your desired output structure
class UserInfo(BaseModel):
    name: str
    age: int

# Patch the OpenAI client
client = instructor.from_openai(OpenAI())

# Extract structured data from natural language
user_info = client.chat.completions.create(
    model="gpt-3.5-turbo",
    response_model=UserInfo,
    messages=[{"role": "user", "content": "John Doe is 30 years old."}],
)

print(user_info.name)
#> John Doe
print(user_info.age)
#> 30
```
As you can see, the pattern is similar, and Instructor actually supports a lot of models at the time of writing. It does this by patching the LLM client, as seen in the `instructor.from_openai` part of the code. And that brought me to another question:
What Actually is Patching?
TLDR; Adding This Piece of Code
From the docs, patching actually does several things to the LLM client. But what I want to know is: what actually gets sent to the client? Especially, what kind of magic helps make the LLM return properly structured data?
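As a rough mental model (hypothetical names, not Instructor's actual code), "patching" just means swapping a method on the client object at runtime for a wrapper that rewrites the request:

```python
from functools import wraps
from pydantic import BaseModel

class FakeCompletions:
    """Stand-in for a client's completions endpoint (hypothetical)."""
    def create(self, **kwargs):
        return kwargs  # echo the request back so we can inspect it

def patch_client(completions):
    """Replace completions.create with a wrapper that injects schema
    instructions. A sketch only: instructor's real from_openai handles
    many more modes and providers."""
    original = completions.create

    @wraps(original)
    def create(response_model=None, messages=None, **kwargs):
        if response_model is not None:
            schema = response_model.model_json_schema()
            system = {"role": "system",
                      "content": f"Return JSON matching this schema: {schema}"}
            messages = [system] + list(messages or [])
        return original(messages=messages, **kwargs)

    completions.create = create  # the "patch": swap the method at runtime
    return completions

class UserInfo(BaseModel):
    name: str
    age: int

client = patch_client(FakeCompletions())
request = client.create(
    model="gpt-3.5-turbo",
    response_model=UserInfo,
    messages=[{"role": "user", "content": "John Doe is 30 years old."}],
)
```

Note how `response_model` never reaches the underlying client; the wrapper consumes it and turns it into a system message instead.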
Well, the answer is that it's not magic. As seen in this function, it's just ifs and elses™️. Really, there are a lot of cases for all the supported models, and they all live inside one huge function full of if/else branches. But I especially love Google's Gemini system prompt:
```python
message = dedent(
    f"""
    As a genius expert, your task is to understand the content and provide
    the parsed objects in json that match the following json_schema:\n
    {json.dumps(response_model.model_json_schema(), indent=2, ensure_ascii=False)}
    Make sure to return an instance of the JSON, not the schema itself
    """
)
```
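Rendered with a small hypothetical `UserInfo` model (borrowing the shape from the instructor snippet above), the prompt simply inlines the model's JSON schema:

```python
import json
from textwrap import dedent
from pydantic import BaseModel

class UserInfo(BaseModel):
    name: str
    age: int

response_model = UserInfo
# Same f-string pattern as the Gemini prompt above
message = dedent(
    f"""
    As a genius expert, your task is to understand the content and provide
    the parsed objects in json that match the following json_schema:\n
    {json.dumps(response_model.model_json_schema(), indent=2, ensure_ascii=False)}
    Make sure to return an instance of the JSON, not the schema itself
    """
)
print(message)
```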
Just look at that encouraging "As a genius expert" part. I hope someday someone asks that of me ;)