Auxiliary tools

LLMs and data
Author: Cody Peterson

Published: October 16, 2023

Introduction

#| code-fold: true
import ibis
import marvin

from dotenv import load_dotenv

load_dotenv()

con = ibis.connect("duckdb://penguins.ddb")
t = ibis.examples.penguins.fetch()
t = con.create_table("penguins", t.to_pyarrow(), overwrite=True)
1. Import the libraries we need.
2. Load the environment variables so Marvin can call our OpenAI account.
3. Set up the demo data in an Ibis backend.
import ibis
import marvin

from ibis.expr.schema import Schema
from ibis.expr.types.relations import Table

ibis.options.interactive = True
marvin.settings.llm_model = "openai/gpt-4"

con = ibis.connect("duckdb://penguins.ddb")
t = con.table("penguins")
1. Import Ibis and Marvin.
2. Configure Ibis (interactive) and Marvin (GPT-4).
3. Connect to the data and load a table into a variable.
@marvin.ai_fn
def _generate_sql_select(
    text: str, table_name: str, table_schema: Schema
) -> str:
    """Generate SQL SELECT from text."""


def sql_from_text(text: str, t: Table) -> Table:
    """Run SQL from text."""
    return t.sql(_generate_sql_select(text, t.get_name(), t.schema()).strip(";"))
1. A non-deterministic, LLM-powered AI function.
2. A deterministic, human-authored function that calls the AI function.
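
Because the AI function is non-deterministic, the deterministic wrapper is a natural place to check its output before running it. The following is a minimal sketch of one such check, not part of Ibis or Marvin: `clean_select` is a hypothetical helper that trims the generated string and rejects anything that does not look like a single SELECT statement. It is a coarse textual check under those assumptions, not a SQL parser, so it would also reject a valid query that contains a semicolon inside a string literal.

```python
import re

# Hypothetical guard for LLM-generated SQL; an illustration, not an Ibis or Marvin API.
_ALLOWED_START = re.compile(r"^\s*(select|with)\b", re.IGNORECASE)


def clean_select(sql: str) -> str:
    """Normalize generated SQL and reject anything that is not a single SELECT."""
    # Drop surrounding whitespace and any trailing semicolons.
    cleaned = sql.strip().rstrip(";").strip()
    # Accept only text that starts with SELECT or WITH (for a CTE).
    if not _ALLOWED_START.match(cleaned):
        raise ValueError(f"expected a SELECT statement, got: {cleaned[:40]!r}")
    # A semicolon left in the middle suggests more than one statement.
    if ";" in cleaned:
        raise ValueError("expected exactly one statement")
    return cleaned
```

Under these assumptions, `sql_from_text` could pass the result of `_generate_sql_select(...)` through `clean_select` before handing it to `t.sql(...)`, in place of the bare `.strip(";")`.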
t2 = sql_from_text("the unique combination of species and islands", t)
t2
t3 = sql_from_text(
    "the unique combination of species and islands, with their counts, ordered from highest to lowest, and name that column just 'count'",
    t,
)
t3
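
To make the second prompt more concrete, here is a sketch of the kind of SQL the model might produce for it. The query is one plausible translation written by hand, not captured model output, and the model's actual SQL may differ between runs. It runs against Python's built-in `sqlite3` rather than the DuckDB backend used above, and the rows are made-up sample data, not the real penguins dataset.

```python
import sqlite3

# Made-up sample rows for illustration only; not the real penguins data.
rows = [
    ("Adelie", "Torgersen"),
    ("Adelie", "Torgersen"),
    ("Adelie", "Biscoe"),
    ("Gentoo", "Biscoe"),
    ("Gentoo", "Biscoe"),
    ("Gentoo", "Biscoe"),
]

demo_con = sqlite3.connect(":memory:")
demo_con.execute("CREATE TABLE penguins (species TEXT, island TEXT)")
demo_con.executemany("INSERT INTO penguins VALUES (?, ?)", rows)

# One plausible query for the prompt: unique species/island pairs,
# their counts, ordered from highest to lowest, in a column named "count".
query = """
SELECT species, island, COUNT(*) AS "count"
FROM penguins
GROUP BY species, island
ORDER BY "count" DESC
"""
result = demo_con.execute(query).fetchall()
print(result)
```

Comparing the generated SQL against a hand-written query like this is one way to spot-check the AI function on prompts where the expected answer is known.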

Summary

To summarize this post:

from rich import print
from pydantic import BaseModel, Field

with open("index.qmd", "r") as f:
    self_text = f.read()

# save some money and avoid rate limiting
marvin.settings.llm_model = "openai/gpt-3.5-turbo-16k"

@marvin.ai_model
class Summary(BaseModel):
    """Summary of text."""

    summary_line: str = Field(..., description="The one-line summary of the text.")
    summary_paragraph: str = Field(
        ..., description="The one-paragraph summary of the text."
    )
    conclusion: str = Field(
        ..., description="The conclusion the reader should draw from the text."
    )
    key_points: list[str] = Field(..., description="The key points of the text.")
    critiques: list[str] = Field(
        ..., description="Professional, fair critiques of the text."
    )
    suggested_improvements: list[str] = Field(
        ..., description="Suggested improvements for the text."
    )
    sentiment: float = Field(..., description="The sentiment of the text.")
    sentiment_label: str = Field(..., description="The sentiment label of the text.")
    author_bias: str = Field(..., description="The author bias of the text.")


print(Summary(self_text))

Next steps

You can get involved with Ibis Birdbrain, our open-source data & AI project for building next-generation natural language interfaces to data.
