{
"cells": [
{
"cell_type": "markdown",
"id": "5a01f3ac-5306-4a1b-9e47-a5d254bce93a",
"metadata": {},
"source": [
"# Validators"
]
},
{
"cell_type": "markdown",
"id": "9dcc78ac-ed6d-49e3-b71b-fb2fb25f16a8",
"metadata": {},
"source": [
"Instead of framing \"self-critique\" or \"self-reflection\" in AI as new concepts, we can view them as validation errors with clear error messages that the system can use to self correct.\n",
"\n",
"Pydantic offers an customizable and expressive validation framework for Python. Instructor leverages Pydantic's validation framework to provide a uniform developer experience for both code-based and LLM-based validation, as well as a reasking mechanism for correcting LLM outputs based on validation errors. To learn more check out the Pydantic [docs](https://docs.pydantic.dev/latest/) on validators.\n",
"\n",
"Note: For the majority of this notebook we won't be calling openai, just using validators to see how we can control the validation of the objects."
]
},
{
"cell_type": "markdown",
"id": "064c286b",
"metadata": {},
"source": [
"Validators will enable us to control outputs by defining a function like so:\n",
"\n",
"\n",
"```python\n",
"def validation_function(value):\n",
" if condition(value):\n",
" raise ValueError(\"Value is not valid\")\n",
" return mutation(value)\n",
"```\n",
"\n",
"Before we get started lets go over the general shape of a validator:"
]
},
{
"cell_type": "code",
"execution_count": 61,
"id": "d4bb6258-b03a-4621-8a73-29056a20ec0f",
"metadata": {},
"outputs": [
{
"ename": "ValidationError",
"evalue": "1 validation error for UserDetail\nname\n Value error, Name must contain a space. [type=value_error, input_value='Jason', input_type=str]\n For further information visit https://errors.pydantic.dev/2.4/v/value_error",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mValidationError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb Cell 4\u001b[0m line \u001b[0;36m1\n\u001b[1;32m 11\u001b[0m age: \u001b[39mint\u001b[39m\n\u001b[1;32m 12\u001b[0m name: Annotated[\u001b[39mstr\u001b[39m, AfterValidator(name_must_contain_space)]\n\u001b[0;32m---> 14\u001b[0m person \u001b[39m=\u001b[39m UserDetail(age\u001b[39m=\u001b[39;49m\u001b[39m29\u001b[39;49m, name\u001b[39m=\u001b[39;49m\u001b[39m\"\u001b[39;49m\u001b[39mJason\u001b[39;49m\u001b[39m\"\u001b[39;49m)\n",
"File \u001b[0;32m~/dev/instructor/.venv/lib/python3.11/site-packages/pydantic/main.py:164\u001b[0m, in \u001b[0;36mBaseModel.__init__\u001b[0;34m(__pydantic_self__, **data)\u001b[0m\n\u001b[1;32m 162\u001b[0m \u001b[39m# `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks\u001b[39;00m\n\u001b[1;32m 163\u001b[0m __tracebackhide__ \u001b[39m=\u001b[39m \u001b[39mTrue\u001b[39;00m\n\u001b[0;32m--> 164\u001b[0m __pydantic_self__\u001b[39m.\u001b[39;49m__pydantic_validator__\u001b[39m.\u001b[39;49mvalidate_python(data, self_instance\u001b[39m=\u001b[39;49m__pydantic_self__)\n",
"\u001b[0;31mValidationError\u001b[0m: 1 validation error for UserDetail\nname\n Value error, Name must contain a space. [type=value_error, input_value='Jason', input_type=str]\n For further information visit https://errors.pydantic.dev/2.4/v/value_error"
]
}
],
"source": [
"from pydantic import BaseModel\n",
"from typing import Annotated\n",
"from pydantic import AfterValidator\n",
"\n",
"\n",
"def name_must_contain_space(v: str) -> str:\n",
" if \" \" not in v:\n",
" raise ValueError(\"Name must contain a space.\")\n",
" return v.lower()\n",
"\n",
"\n",
"class UserDetail(BaseModel):\n",
" age: int\n",
" name: Annotated[str, AfterValidator(name_must_contain_space)]\n",
"\n",
"\n",
"person = UserDetail(age=29, name=\"Jason\")"
]
},
{
"cell_type": "markdown",
"id": "417fafe5-4616-4372-b9e9-78e89afff536",
"metadata": {},
"source": [
"**Validation Applications**\n",
"\n",
"Validators are essential in tackling the unpredictabile nature of LLMs.\n",
"\n",
"Straightforward examples include:\n",
"\n",
"* Flagging outputs containing blacklisted words.\n",
"* Identifying outputs with tones like racism or violence.\n",
"\n",
"For more complex tasks:\n",
"\n",
"* Ensuring citations directly come from provided content.\n",
"* Checking that the model's responses align with given context.\n",
"* Validating the syntax of SQL queries before execution."
]
},
{
"cell_type": "markdown",
"id": "1bd2104b-7eed-4619-a47d-c3d197f9d483",
"metadata": {},
"source": [
"## Setup and Dependencies"
]
},
{
"cell_type": "markdown",
"id": "e94449ab-50a9-4325-972c-f64fcdadee00",
"metadata": {},
"source": [
"Using the [instructor](https://github.com/jxnl/instructor) library, we streamline the integration of these validators. `instructor` manages the parsing and validation of outputs and automates retries for compliant responses. This simplifies the process for developers to implement new validation logic, minimizing extra overhead."
]
},
{
"cell_type": "markdown",
"id": "a7a84adc",
"metadata": {},
"source": [
"To use instructor in our api calls, we just need to patch the openai client:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "1aa2c503-82f8-4735-aae3-373b55fb1064",
"metadata": {},
"outputs": [],
"source": [
"import instructor\n",
"from openai import OpenAI\n",
"\n",
"client = instructor.patch(OpenAI())"
]
},
{
"cell_type": "markdown",
"id": "45cd244f-d59c-4431-be2d-aa356a6fefa0",
"metadata": {},
"source": [
"## Software 2.0: Rule-based validators"
]
},
{
"cell_type": "markdown",
"id": "3494e664-c5b3-42ea-9c19-aa301a041bdb",
"metadata": {},
"source": [
"Deterministic validation, characterized by its rule-based logic, ensures consistent outcomes for the same input. Let's explore how we can apply this concept through some examples."
]
},
{
"cell_type": "markdown",
"id": "717ecefd-0355-4ba4-a642-95d281b0f075",
"metadata": {},
"source": [
"### Flagging bad keywords"
]
},
{
"cell_type": "markdown",
"id": "3a15013e-42f3-4d3b-b395-d6edbdec34e5",
"metadata": {},
"source": [
"To begin with, we aim to prevent engagement in topics involving explicit violence."
]
},
{
"cell_type": "markdown",
"id": "13d61a81",
"metadata": {},
"source": [
"We will define a blacklist of violent words that cannot be mentioned in any messages:"
]
},
{
"cell_type": "code",
"execution_count": 63,
"id": "59330d7d-082a-4240-98c4-eaee18f02728",
"metadata": {},
"outputs": [],
"source": [
"blacklist = {\n",
" \"rob\",\n",
" \"steal\",\n",
" \"hurt\",\n",
" \"kill\",\n",
" \"attack\",\n",
"}"
]
},
{
"cell_type": "markdown",
"id": "7ce06bbf",
"metadata": {},
"source": [
"To validate if the message contains a blacklisted word we will use a [field_validator](https://jxnl.github.io/instructor/blog/2023/10/23/good-llm-validation-is-just-good-validation/#using-field_validator-decorator) over the 'message' field:"
]
},
{
"cell_type": "code",
"execution_count": 64,
"id": "9bb87f47-db98-4f1d-80cb-ad5f39df8793",
"metadata": {},
"outputs": [
{
"ename": "ValidationError",
"evalue": "1 validation error for Response\nmessage\n Value error, `hurt` was found in the message `I will hurt him` [type=value_error, input_value='I will hurt him', input_type=str]\n For further information visit https://errors.pydantic.dev/2.4/v/value_error",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mValidationError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb Cell 17\u001b[0m line \u001b[0;36m1\n\u001b[1;32m 11\u001b[0m \u001b[39mraise\u001b[39;00m \u001b[39mValueError\u001b[39;00m(\u001b[39mf\u001b[39m\u001b[39m\"\u001b[39m\u001b[39m`\u001b[39m\u001b[39m{\u001b[39;00mword\u001b[39m}\u001b[39;00m\u001b[39m` was found in the message `\u001b[39m\u001b[39m{\u001b[39;00mv\u001b[39m}\u001b[39;00m\u001b[39m`\u001b[39m\u001b[39m\"\u001b[39m)\n\u001b[1;32m 12\u001b[0m \u001b[39mreturn\u001b[39;00m v\n\u001b[0;32m---> 14\u001b[0m Response(message\u001b[39m=\u001b[39;49m\u001b[39m\"\u001b[39;49m\u001b[39mI will hurt him\u001b[39;49m\u001b[39m\"\u001b[39;49m)\n",
"File \u001b[0;32m~/dev/instructor/.venv/lib/python3.11/site-packages/pydantic/main.py:164\u001b[0m, in \u001b[0;36mBaseModel.__init__\u001b[0;34m(__pydantic_self__, **data)\u001b[0m\n\u001b[1;32m 162\u001b[0m \u001b[39m# `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks\u001b[39;00m\n\u001b[1;32m 163\u001b[0m __tracebackhide__ \u001b[39m=\u001b[39m \u001b[39mTrue\u001b[39;00m\n\u001b[0;32m--> 164\u001b[0m __pydantic_self__\u001b[39m.\u001b[39;49m__pydantic_validator__\u001b[39m.\u001b[39;49mvalidate_python(data, self_instance\u001b[39m=\u001b[39;49m__pydantic_self__)\n",
"\u001b[0;31mValidationError\u001b[0m: 1 validation error for Response\nmessage\n Value error, `hurt` was found in the message `I will hurt him` [type=value_error, input_value='I will hurt him', input_type=str]\n For further information visit https://errors.pydantic.dev/2.4/v/value_error"
]
}
],
"source": [
"from pydantic import BaseModel, field_validator\n",
"from pydantic.fields import Field\n",
"\n",
"\n",
"class Response(BaseModel):\n",
" message: str\n",
"\n",
" @field_validator(\"message\")\n",
" def message_cannot_have_blacklisted_words(cls, v: str) -> str:\n",
" for word in v.split():\n",
" if word.lower() in blacklist:\n",
" raise ValueError(f\"`{word}` was found in the message `{v}`\")\n",
" return v\n",
"\n",
"\n",
"Response(message=\"I will hurt him\")"
]
},
{
"cell_type": "markdown",
"id": "37e3a638-c9c9-44cd-bcd0-ad1a39f448db",
"metadata": {},
"source": [
"### Flagging using OpenAI Moderation"
]
},
{
"cell_type": "markdown",
"id": "88d0b816-7ec8-42b0-9b91-c9aab382c960",
"metadata": {},
"source": [
"To enhance our validation measures, we'll extend the scope to flag any answer that contains hateful content, harassment, or similar issues. OpenAI offers a moderation endpoint that addresses these concerns, and it's freely available when using OpenAI models."
]
},
{
"cell_type": "markdown",
"id": "65f46eb5",
"metadata": {},
"source": [
"With the `instructor` library, this is just one function edit away:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "b2ad8c19-6a94-4e4a-aa3e-dce149e8a479",
"metadata": {},
"outputs": [],
"source": [
"from typing import Annotated\n",
"from pydantic.functional_validators import AfterValidator"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "82521112-5301-4442-acce-82b495bd838f",
"metadata": {},
"outputs": [],
"source": [
"from instructor import openai_moderation\n",
"\n",
"\n",
"class Response(BaseModel):\n",
" message: Annotated[str, AfterValidator(openai_moderation(client=client))]"
]
},
{
"cell_type": "markdown",
"id": "90542190-a4f2-4242-8261-2f0ace323022",
"metadata": {},
"source": [
"Now we have a more comprehensive flagging for violence and we can outsource the moderation of our messages."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "54a9de1b-c6e7-4a5f-854c-506083a06a9d",
"metadata": {},
"outputs": [
{
"ename": "ValidationError",
"evalue": "1 validation error for Response\nmessage\n Value error, `I want to make them suffer the consequences` was flagged for harassment, harassment_threatening, violence, harassment/threatening [type=value_error, input_value='I want to make them suffer the consequences', input_type=str]\n For further information visit https://errors.pydantic.dev/2.5/v/value_error",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mValidationError\u001b[0m Traceback (most recent call last)",
"Cell \u001b[0;32mIn[7], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mResponse\u001b[49m\u001b[43m(\u001b[49m\u001b[43mmessage\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mI want to make them suffer the consequences\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m~/.virtualenvs/pampa-labs/lib/python3.10/site-packages/pydantic/main.py:164\u001b[0m, in \u001b[0;36mBaseModel.__init__\u001b[0;34m(__pydantic_self__, **data)\u001b[0m\n\u001b[1;32m 162\u001b[0m \u001b[38;5;66;03m# `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks\u001b[39;00m\n\u001b[1;32m 163\u001b[0m __tracebackhide__ \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mTrue\u001b[39;00m\n\u001b[0;32m--> 164\u001b[0m \u001b[43m__pydantic_self__\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m__pydantic_validator__\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mvalidate_python\u001b[49m\u001b[43m(\u001b[49m\u001b[43mdata\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mself_instance\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m__pydantic_self__\u001b[49m\u001b[43m)\u001b[49m\n",
"\u001b[0;31mValidationError\u001b[0m: 1 validation error for Response\nmessage\n Value error, `I want to make them suffer the consequences` was flagged for harassment, harassment_threatening, violence, harassment/threatening [type=value_error, input_value='I want to make them suffer the consequences', input_type=str]\n For further information visit https://errors.pydantic.dev/2.5/v/value_error"
]
}
],
"source": [
"Response(message=\"I want to make them suffer the consequences\")"
]
},
{
"cell_type": "markdown",
"id": "f138f9f8-495a-4a09-96a0-c71d01561855",
"metadata": {},
"source": [
"And as an extra, we get flagging for other topics like religion, race etc."
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "feb77670-afd7-4947-89f8-a9446f6fb12c",
"metadata": {},
"outputs": [
{
"ename": "ValidationError",
"evalue": "1 validation error for Response\nmessage\n Value error, `I will mock their religion` was flagged for ['harassment'] [type=value_error, input_value='I will mock their religion', input_type=str]\n For further information visit https://errors.pydantic.dev/2.5/v/value_error",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mValidationError\u001b[0m Traceback (most recent call last)",
"Cell \u001b[0;32mIn[26], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mResponse\u001b[49m\u001b[43m(\u001b[49m\u001b[43mmessage\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mI will mock their religion\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m~/.virtualenvs/pampa-labs/lib/python3.10/site-packages/pydantic/main.py:164\u001b[0m, in \u001b[0;36mBaseModel.__init__\u001b[0;34m(__pydantic_self__, **data)\u001b[0m\n\u001b[1;32m 162\u001b[0m \u001b[38;5;66;03m# `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks\u001b[39;00m\n\u001b[1;32m 163\u001b[0m __tracebackhide__ \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mTrue\u001b[39;00m\n\u001b[0;32m--> 164\u001b[0m \u001b[43m__pydantic_self__\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m__pydantic_validator__\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mvalidate_python\u001b[49m\u001b[43m(\u001b[49m\u001b[43mdata\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mself_instance\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m__pydantic_self__\u001b[49m\u001b[43m)\u001b[49m\n",
"\u001b[0;31mValidationError\u001b[0m: 1 validation error for Response\nmessage\n Value error, `I will mock their religion` was flagged for ['harassment'] [type=value_error, input_value='I will mock their religion', input_type=str]\n For further information visit https://errors.pydantic.dev/2.5/v/value_error"
]
}
],
"source": [
"Response(message=\"I will mock their religion\")"
]
},
{
"cell_type": "markdown",
"id": "886f122b-22c9-440e-99cf-2e594b3df99b",
"metadata": {},
"source": [
"### Filtering very long messages"
]
},
{
"cell_type": "markdown",
"id": "692b1164-4bd5-4943-b9ab-2edec00d4f7d",
"metadata": {},
"source": [
"In addition to content-based flags, we can also set criteria based on other aspects of the input text. For instance, to maintain user engagement, we might want to prevent the assistant from returning excessively long texts. \n",
"\n",
"Here, noticed that `Field` has built-in validators for `min_length` and `max_length`. to learn more checkout [Field Contraints](https://docs.pydantic.dev/latest/concepts/fields)"
]
},
{
"cell_type": "code",
"execution_count": 68,
"id": "45ffdbd4-deae-4a46-9637-1b5339904f53",
"metadata": {},
"outputs": [],
"source": [
"class AssistantMessage(BaseModel):\n",
" message: str = Field(..., max_length=100)"
]
},
{
"cell_type": "code",
"execution_count": 69,
"id": "66430dc5-b78c-45e2-a53b-ddc392b20583",
"metadata": {},
"outputs": [
{
"ename": "ValidationError",
"evalue": "1 validation error for AssistantMessage\nmessage\n String should have at most 100 characters [type=string_too_long, input_value=\"Certainly! Lorem ipsum i... on the actual content.\", input_type=str]\n For further information visit https://errors.pydantic.dev/2.4/v/string_too_long",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mValidationError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb Cell 29\u001b[0m line \u001b[0;36m1\n\u001b[0;32m----> 1\u001b[0m AssistantMessage(message\u001b[39m=\u001b[39;49m\u001b[39m\"\u001b[39;49m\u001b[39mCertainly! Lorem ipsum is a placeholder text commonly used in the printing and typesetting industry. Here\u001b[39;49m\u001b[39m'\u001b[39;49m\u001b[39ms a sample of Lorem ipsum text: Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam euismod velit vel tellus tempor, non viverra eros iaculis. Sed vel nisl nec mauris bibendum tincidunt. Vestibulum sed libero euismod, eleifend tellus id, laoreet elit. Donec auctor arcu ac mi feugiat, vel lobortis justo efficitur. Fusce vel odio vitae justo varius dignissim. Integer sollicitudin mi a justo bibendum ultrices. Quisque id nisl a lectus venenatis luctus. Please note that Lorem ipsum text is a nonsensical Latin-like text used as a placeholder for content, and it has no specific meaning. It\u001b[39;49m\u001b[39m'\u001b[39;49m\u001b[39ms often used in design and publishing to demonstrate the visual aspects of a document without focusing on the actual content.\u001b[39;49m\u001b[39m\"\u001b[39;49m)\n",
"File \u001b[0;32m~/dev/instructor/.venv/lib/python3.11/site-packages/pydantic/main.py:164\u001b[0m, in \u001b[0;36mBaseModel.__init__\u001b[0;34m(__pydantic_self__, **data)\u001b[0m\n\u001b[1;32m 162\u001b[0m \u001b[39m# `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks\u001b[39;00m\n\u001b[1;32m 163\u001b[0m __tracebackhide__ \u001b[39m=\u001b[39m \u001b[39mTrue\u001b[39;00m\n\u001b[0;32m--> 164\u001b[0m __pydantic_self__\u001b[39m.\u001b[39;49m__pydantic_validator__\u001b[39m.\u001b[39;49mvalidate_python(data, self_instance\u001b[39m=\u001b[39;49m__pydantic_self__)\n",
"\u001b[0;31mValidationError\u001b[0m: 1 validation error for AssistantMessage\nmessage\n String should have at most 100 characters [type=string_too_long, input_value=\"Certainly! Lorem ipsum i... on the actual content.\", input_type=str]\n For further information visit https://errors.pydantic.dev/2.4/v/string_too_long"
]
}
],
"source": [
"AssistantMessage(\n",
" message=\"Certainly! Lorem ipsum is a placeholder text commonly used in the printing and typesetting industry. Here's a sample of Lorem ipsum text: Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam euismod velit vel tellus tempor, non viverra eros iaculis. Sed vel nisl nec mauris bibendum tincidunt. Vestibulum sed libero euismod, eleifend tellus id, laoreet elit. Donec auctor arcu ac mi feugiat, vel lobortis justo efficitur. Fusce vel odio vitae justo varius dignissim. Integer sollicitudin mi a justo bibendum ultrices. Quisque id nisl a lectus venenatis luctus. Please note that Lorem ipsum text is a nonsensical Latin-like text used as a placeholder for content, and it has no specific meaning. It's often used in design and publishing to demonstrate the visual aspects of a document without focusing on the actual content.\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "050e72fe-4b13-4002-a1d0-94f7b88b784b",
"metadata": {},
"source": [
"### Avoiding hallucination with citations"
]
},
{
"cell_type": "markdown",
"id": "e3f2869e-c8a3-4b93-82e7-55eb70930900",
"metadata": {},
"source": [
"When incorporating external knowledge bases, it's crucial to ensure that the agent uses the provided context accurately and doesn't fabricate responses. Validators can be effectively used for this purpose. We can illustrate this with an example where we validate that a provided citation is actually included in the referenced text chunk:"
]
},
{
"cell_type": "code",
"execution_count": 70,
"id": "638fc368-5cf7-4ae7-9d3f-efea1b84eec0",
"metadata": {},
"outputs": [],
"source": [
"from pydantic import ValidationInfo\n",
"\n",
"\n",
"class AnswerWithCitation(BaseModel):\n",
" answer: str\n",
" citation: str\n",
"\n",
" @field_validator(\"citation\")\n",
" @classmethod\n",
" def citation_exists(cls, v: str, info: ValidationInfo):\n",
" context = info.context\n",
" if context:\n",
" context = context.get(\"text_chunk\")\n",
" if v not in context:\n",
" raise ValueError(f\"Citation `{v}` not found in text\")\n",
" return v"
]
},
{
"cell_type": "markdown",
"id": "3064b06b-7f85-40ec-8fe2-4fa2cce36585",
"metadata": {},
"source": [
"Here we assume that there is a \"text_chunk\" field that contains the text that the model is supposed to use as context. We then use the `field_validator` decorator to define a validator that checks if the citation is included in the text chunk. If it's not, we raise a `ValueError` with a message that will be returned to the user."
]
},
{
"cell_type": "code",
"execution_count": 71,
"id": "0f3030b6-e6cf-45bf-a366-12de996fea40",
"metadata": {},
"outputs": [
{
"ename": "ValidationError",
"evalue": "1 validation error for AnswerWithCitation\ncitation\n Value error, Citation `Blueberries contain high levels of protein` not found in text [type=value_error, input_value='Blueberries contain high levels of protein', input_type=str]\n For further information visit https://errors.pydantic.dev/2.4/v/value_error",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mValidationError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb Cell 34\u001b[0m line \u001b[0;36m1\n\u001b[0;32m----> 1\u001b[0m AnswerWithCitation\u001b[39m.\u001b[39;49mmodel_validate(\n\u001b[1;32m 2\u001b[0m {\n\u001b[1;32m 3\u001b[0m \u001b[39m\"\u001b[39;49m\u001b[39manswer\u001b[39;49m\u001b[39m\"\u001b[39;49m: \u001b[39m\"\u001b[39;49m\u001b[39mBlueberries are packed with protein\u001b[39;49m\u001b[39m\"\u001b[39;49m, \n\u001b[1;32m 4\u001b[0m \u001b[39m\"\u001b[39;49m\u001b[39mcitation\u001b[39;49m\u001b[39m\"\u001b[39;49m: \u001b[39m\"\u001b[39;49m\u001b[39mBlueberries contain high levels of protein\u001b[39;49m\u001b[39m\"\u001b[39;49m\n\u001b[1;32m 5\u001b[0m },\n\u001b[1;32m 6\u001b[0m context\u001b[39m=\u001b[39;49m{\u001b[39m\"\u001b[39;49m\u001b[39mtext_chunk\u001b[39;49m\u001b[39m\"\u001b[39;49m: \u001b[39m\"\u001b[39;49m\u001b[39mBlueberries are very rich in antioxidants\u001b[39;49m\u001b[39m\"\u001b[39;49m}, \n\u001b[1;32m 7\u001b[0m )\n",
"File \u001b[0;32m~/dev/instructor/.venv/lib/python3.11/site-packages/pydantic/main.py:503\u001b[0m, in \u001b[0;36mBaseModel.model_validate\u001b[0;34m(cls, obj, strict, from_attributes, context)\u001b[0m\n\u001b[1;32m 501\u001b[0m \u001b[39m# `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks\u001b[39;00m\n\u001b[1;32m 502\u001b[0m __tracebackhide__ \u001b[39m=\u001b[39m \u001b[39mTrue\u001b[39;00m\n\u001b[0;32m--> 503\u001b[0m \u001b[39mreturn\u001b[39;00m \u001b[39mcls\u001b[39;49m\u001b[39m.\u001b[39;49m__pydantic_validator__\u001b[39m.\u001b[39;49mvalidate_python(\n\u001b[1;32m 504\u001b[0m obj, strict\u001b[39m=\u001b[39;49mstrict, from_attributes\u001b[39m=\u001b[39;49mfrom_attributes, context\u001b[39m=\u001b[39;49mcontext\n\u001b[1;32m 505\u001b[0m )\n",
"\u001b[0;31mValidationError\u001b[0m: 1 validation error for AnswerWithCitation\ncitation\n Value error, Citation `Blueberries contain high levels of protein` not found in text [type=value_error, input_value='Blueberries contain high levels of protein', input_type=str]\n For further information visit https://errors.pydantic.dev/2.4/v/value_error"
]
}
],
"source": [
"AnswerWithCitation.model_validate(\n",
" {\n",
" \"answer\": \"Blueberries are packed with protein\",\n",
" \"citation\": \"Blueberries contain high levels of protein\",\n",
" },\n",
" context={\"text_chunk\": \"Blueberries are very rich in antioxidants\"},\n",
")"
]
},
{
"cell_type": "markdown",
"id": "06e54533-3304-4fa0-9828-9591d5dcdefd",
"metadata": {},
"source": [
"## Software 3.0: Probabilistic validators"
]
},
{
"cell_type": "markdown",
"id": "1907df5b-472f-45ac-9181-45235e3cd0c3",
"metadata": {},
"source": [
"For scenarios requiring more nuanced validation than rule-based methods, we use probabilistic validation. This approach incorporates LLMs into the validation workflow for a sophisticated assessment of outputs.\n",
"\n",
"The `instructor` library offers the `llm_validator` utility for this purpose. By specifying the desired directive, we can use LLMs for complex validation tasks. Let's explore some intriguing use cases enabled by LLMs.\n",
"\n",
"### Keeping an agent on topic\n",
"\n",
"When creating an agent focused on health improvement, providing answers and daily practice suggestions, it's crucial to ensure strict adherence to health-related topics. This is important because the knowledge base is limited to health topics, and veering off-topic could result in fabricated responses.\n",
"\n",
"To achieve this focus, we'll follow a similar process as before, but with an important addition: integrating an LLM into our validator."
]
},
{
"cell_type": "markdown",
"id": "546625ac",
"metadata": {},
"source": [
"This LLM will be tasked with determining whether the agent's responses are exclusively related to health topics. For this, we will use the `llm_validator` from `instructor` like so:"
]
},
{
"cell_type": "code",
"execution_count": 73,
"id": "8cf00cad-c4c0-49dd-9be5-fb02338a5a7f",
"metadata": {},
"outputs": [
{
"ename": "ValidationError",
"evalue": "1 validation error for AssistantMessage\nmessage\n Assertion failed, The statement is not related to health best practices or topics. [type=assertion_error, input_value='I would suggest you to v...is very nice in winter.', input_type=str]\n For further information visit https://errors.pydantic.dev/2.4/v/assertion_error",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mValidationError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb Cell 38\u001b[0m line \u001b[0;36m1\n\u001b[1;32m 5\u001b[0m \u001b[39mclass\u001b[39;00m \u001b[39mAssistantMessage\u001b[39;00m(BaseModel):\n\u001b[1;32m 6\u001b[0m message: Annotated[\u001b[39mstr\u001b[39m, \n\u001b[1;32m 7\u001b[0m AfterValidator(\n\u001b[1;32m 8\u001b[0m llm_validator(\u001b[39m\"\u001b[39m\u001b[39mdon\u001b[39m\u001b[39m'\u001b[39m\u001b[39mt talk about any other topic except health best practices and topics\u001b[39m\u001b[39m\"\u001b[39m, \n\u001b[1;32m 9\u001b[0m openai_client\u001b[39m=\u001b[39mclient))]\n\u001b[0;32m---> 11\u001b[0m AssistantMessage(message\u001b[39m=\u001b[39;49m\u001b[39m\"\u001b[39;49m\u001b[39mI would suggest you to visit Sicily as they say it is very nice in winter.\u001b[39;49m\u001b[39m\"\u001b[39;49m)\n",
"File \u001b[0;32m~/dev/instructor/.venv/lib/python3.11/site-packages/pydantic/main.py:164\u001b[0m, in \u001b[0;36mBaseModel.__init__\u001b[0;34m(__pydantic_self__, **data)\u001b[0m\n\u001b[1;32m 162\u001b[0m \u001b[39m# `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks\u001b[39;00m\n\u001b[1;32m 163\u001b[0m __tracebackhide__ \u001b[39m=\u001b[39m \u001b[39mTrue\u001b[39;00m\n\u001b[0;32m--> 164\u001b[0m __pydantic_self__\u001b[39m.\u001b[39;49m__pydantic_validator__\u001b[39m.\u001b[39;49mvalidate_python(data, self_instance\u001b[39m=\u001b[39;49m__pydantic_self__)\n",
"\u001b[0;31mValidationError\u001b[0m: 1 validation error for AssistantMessage\nmessage\n Assertion failed, The statement is not related to health best practices or topics. [type=assertion_error, input_value='I would suggest you to v...is very nice in winter.', input_type=str]\n For further information visit https://errors.pydantic.dev/2.4/v/assertion_error"
]
}
],
"source": [
"from instructor import llm_validator\n",
"\n",
"\n",
"class AssistantMessage(BaseModel):\n",
" message: Annotated[\n",
" str,\n",
" AfterValidator(\n",
" llm_validator(\n",
" \"don't talk about any other topic except health best practices and topics\",\n",
" client=client,\n",
" )\n",
" ),\n",
" ]\n",
"\n",
"\n",
"AssistantMessage(\n",
" message=\"I would suggest you to visit Sicily as they say it is very nice in winter.\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "1dce5a7a-024e-4742-a124-fe51973df5f2",
"metadata": {},
"source": [
"Important that for these examples we're not waiting for the messages, to get this message we would need to call the openai with `response_model=AssistantMessage`."
]
},
{
"cell_type": "markdown",
"id": "a6ec4afa-0be7-469e-93c0-5c729a06d4fc",
"metadata": {},
"source": [
"### Validating agent thinking with CoT"
]
},
{
"cell_type": "markdown",
"id": "424d915b-f332-48f3-a75e-6e1cd6d12075",
"metadata": {},
"source": [
"Using probabilistic validation, we can also assess the agent's reasoning process to ensure it's logical before providing a response. With [chain of thought](https://learnprompting.org/docs/intermediate/chain_of_thought) prompting, the model is expected to think in steps and arrive at an answer following its logical progression. If there are errors in this logic, the final response may be incorrect.\n",
"\n",
"Here we will use Pydantic's [model_validator](https://docs.pydantic.dev/latest/concepts/validators/#model-validators) which allows us to apply validation over all the properties of the `AIResponse` at once.\n",
"\n",
"To make this easier we'll make a simple validation class that we can reuse for all our validation:"
]
},
{
"cell_type": "code",
"execution_count": 74,
"id": "65340b8c-2ea3-4457-a6d4-f0e652c317b4",
"metadata": {},
"outputs": [],
"source": [
"from typing import Optional\n",
"\n",
"\n",
"class Validation(BaseModel):\n",
" is_valid: bool = Field(\n",
" ..., description=\"Whether the value is valid based on the rules\"\n",
" )\n",
" error_message: Optional[str] = Field(\n",
" ...,\n",
" description=\"The error message if the value is not valid, to be used for re-asking the model\",\n",
" )"
]
},
{
"cell_type": "markdown",
"id": "de2104f1",
"metadata": {},
"source": [
"The function we will call will integrate an LLM and will ask it to determine whether the answer the model provided follows from the chain of thought: "
]
},
{
"cell_type": "code",
"execution_count": 75,
"id": "e9ab3804-6962-4a48-83da-1f8360d8379a",
"metadata": {},
"outputs": [],
"source": [
"def validate_chain_of_thought(values):\n",
" chain_of_thought = values[\"chain_of_thought\"]\n",
" answer = values[\"answer\"]\n",
" resp = client.chat.completions.create(\n",
" model=\"gpt-4-1106-preview\",\n",
" messages=[\n",
" {\n",
" \"role\": \"system\",\n",
" \"content\": \"You are a validator. Determine if the value follows from the statement. If it is not, explain why.\",\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": f\"Verify that `{answer}` follows the chain of thought: {chain_of_thought}\",\n",
" },\n",
" ],\n",
" response_model=Validation,\n",
" )\n",
" if not resp.is_valid:\n",
" raise ValueError(resp.error_message)\n",
" return values"
]
},
{
"cell_type": "markdown",
"id": "b79b94cf-15c2-432b-b0d5-aad0c2997f91",
"metadata": {},
"source": [
"The use of the 'before' argument in this context is significant. It means that the validator will receive the complete dictionary of inputs in their raw form, before any parsing by Pydantic."
]
},
{
"cell_type": "code",
"execution_count": 76,
"id": "fbc9887a-df0d-4a4b-9ef5-ea450701d85b",
"metadata": {},
"outputs": [],
"source": [
"from typing import Any\n",
"from pydantic import model_validator\n",
"\n",
"\n",
"class AIResponse(BaseModel):\n",
" chain_of_thought: str\n",
" answer: str\n",
"\n",
" @model_validator(mode=\"before\")\n",
" @classmethod\n",
" def chain_of_thought_makes_sense(cls, data: Any) -> Any:\n",
" # here we assume data is the dict representation of the model\n",
" # since we use 'before' mode.\n",
" return validate_chain_of_thought(data)"
]
},
{
"cell_type": "code",
"execution_count": 77,
"id": "a38f2b28-f5b9-4a44-bfe5-9735726ec57d",
"metadata": {},
"outputs": [
{
"ename": "ValidationError",
"evalue": "1 validation error for AIResponse\n Value error, The statement about the user having a broken leg does not logically follow from the information provided about the user suffering from diabetes. These are two separate health conditions and one does not imply the other. [type=value_error, input_value={'chain_of_thought': 'The...user has a broken leg.'}, input_type=dict]\n For further information visit https://errors.pydantic.dev/2.4/v/value_error",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mValidationError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m/Users/jasonliu/dev/instructor/tutorials/5.validation.ipynb Cell 47\u001b[0m line \u001b[0;36m1\n\u001b[0;32m----> 1\u001b[0m AIResponse(chain_of_thought\u001b[39m=\u001b[39;49m\u001b[39m\"\u001b[39;49m\u001b[39mThe user suffers from diabetes.\u001b[39;49m\u001b[39m\"\u001b[39;49m, answer\u001b[39m=\u001b[39;49m\u001b[39m\"\u001b[39;49m\u001b[39mThe user has a broken leg.\u001b[39;49m\u001b[39m\"\u001b[39;49m)\n",
"File \u001b[0;32m~/dev/instructor/.venv/lib/python3.11/site-packages/pydantic/main.py:164\u001b[0m, in \u001b[0;36mBaseModel.__init__\u001b[0;34m(__pydantic_self__, **data)\u001b[0m\n\u001b[1;32m 162\u001b[0m \u001b[39m# `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks\u001b[39;00m\n\u001b[1;32m 163\u001b[0m __tracebackhide__ \u001b[39m=\u001b[39m \u001b[39mTrue\u001b[39;00m\n\u001b[0;32m--> 164\u001b[0m __pydantic_self__\u001b[39m.\u001b[39;49m__pydantic_validator__\u001b[39m.\u001b[39;49mvalidate_python(data, self_instance\u001b[39m=\u001b[39;49m__pydantic_self__)\n",
"\u001b[0;31mValidationError\u001b[0m: 1 validation error for AIResponse\n Value error, The statement about the user having a broken leg does not logically follow from the information provided about the user suffering from diabetes. These are two separate health conditions and one does not imply the other. [type=value_error, input_value={'chain_of_thought': 'The...user has a broken leg.'}, input_type=dict]\n For further information visit https://errors.pydantic.dev/2.4/v/value_error"
]
}
],
"source": [
"AIResponse(\n",
" chain_of_thought=\"The user suffers from diabetes.\",\n",
" answer=\"The user has a broken leg.\",\n",
")"
]
},
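{
"cell_type": "markdown",
"id": "c1d7e3a2-6b4f-4e0a-9a55-2f8b7d1e4c90",
"metadata": {},
"source": [
"To see the difference between validator modes without calling an LLM, here is a minimal sketch using purely deterministic checks. The `Example` model and both of its rules are hypothetical and exist only for illustration: the `before` validator receives the raw input dictionary, while the `after` validator receives the parsed model instance.\n",
"\n",
"```python\n",
"from typing import Any\n",
"\n",
"from pydantic import BaseModel, ValidationError, model_validator\n",
"\n",
"\n",
"class Example(BaseModel):\n",
"    chain_of_thought: str\n",
"    answer: str\n",
"\n",
"    @model_validator(mode=\"before\")\n",
"    @classmethod\n",
"    def check_raw(cls, data: Any) -> Any:\n",
"        # 'before' mode: keyword-argument construction hands us a plain dict.\n",
"        if isinstance(data, dict) and not str(data.get(\"answer\", \"\")).strip():\n",
"            raise ValueError(\"answer must not be empty\")\n",
"        return data\n",
"\n",
"    @model_validator(mode=\"after\")\n",
"    def check_parsed(self) -> \"Example\":\n",
"        # 'after' mode: fields are already parsed, so we can use attributes.\n",
"        if self.answer == self.chain_of_thought:\n",
"            raise ValueError(\"answer must not simply repeat the chain of thought\")\n",
"        return self\n",
"\n",
"\n",
"try:\n",
"    Example(chain_of_thought=\"2 + 2 = 4\", answer=\"  \")\n",
"except ValidationError as e:\n",
"    print(e)\n",
"```\n",
"\n",
"Because the `before` validator raises first, the `after` validator never runs for this input."
]
},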
{
"cell_type": "markdown",
"id": "5bbbaa11-32d2-4772-bc31-18d1d6d6c919",
"metadata": {},
"source": [
"## Reasking with validators\n",
"\n",
"For most of these examples all we've done we've mostly only defined the validation logic.\n",
"\n",
"We'eve covered field validators and model validators and even used LLMs to validate our outputs. But we haven't actually used the validators to reask the model! One of the most powerful features of `instructor` is that it will automatically reask the model when it receives a validation error. This means that we can use the same validation logic for both code-based and LLM-based validation.\n",
"\n",
"This also means that our 'prompt' is not only the prompt we send, but the code that runs the validator, and the error message we send back to the model."
]
},
{
"cell_type": "markdown",
"id": "39e642d9-0d20-4231-a694-baa0ea03f147",
"metadata": {},
"source": [
"Integrating these validation examples with the OpenAI API is streamlined using `instructor`. After patching the OpenAI client with `instructor`, you simply need to specify a `response_model` for your requests. This setup ensures that all the validation processes occur automatically.\n",
"\n",
"To enable reasking you can set a maximum number of retries. When calling the OpenAI client, the system can re-attempt to generate a correct answer. It does this by resending the original query along with feedback on why the previous response was rejected, guiding the LLM towards a more accurate answer in subsequent attempts."
]
},
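{
"cell_type": "markdown",
"id": "e8a4b0f6-3c2d-4f71-8b19-7d5a6c9e2b34",
"metadata": {},
"source": [
"Conceptually, reasking is a loop that feeds the validation error back into the next attempt. The sketch below is hand-rolled for illustration and is not how `instructor` implements retries internally. `fake_llm`, `ask_with_retries`, and the `Answer` model are hypothetical names, and the fake model only produces a valid answer once it has seen feedback.\n",
"\n",
"```python\n",
"from typing import Callable, Optional\n",
"\n",
"from pydantic import BaseModel, ValidationError, field_validator\n",
"\n",
"\n",
"class Answer(BaseModel):\n",
"    answer: str\n",
"\n",
"    @field_validator(\"answer\")\n",
"    @classmethod\n",
"    def no_shouting(cls, v: str) -> str:\n",
"        if v.isupper():\n",
"            raise ValueError(\"Answer must not be written in all caps.\")\n",
"        return v\n",
"\n",
"\n",
"def fake_llm(prompt: str, feedback: Optional[str]) -> str:\n",
"    # Hypothetical stand-in for a model call: it only behaves after feedback.\n",
"    return \"42\" if feedback else \"FORTY-TWO\"\n",
"\n",
"\n",
"def ask_with_retries(\n",
"    prompt: str,\n",
"    llm: Callable[[str, Optional[str]], str],\n",
"    max_retries: int = 2,\n",
") -> Answer:\n",
"    feedback = None\n",
"    for _ in range(max_retries + 1):\n",
"        raw = llm(prompt, feedback)\n",
"        try:\n",
"            return Answer(answer=raw)\n",
"        except ValidationError as e:\n",
"            # The error text is passed to the next attempt as feedback.\n",
"            feedback = str(e)\n",
"    raise RuntimeError(f\"No valid answer after {max_retries} retries: {feedback}\")\n",
"\n",
"\n",
"result = ask_with_retries(\"What is six times seven?\", fake_llm)\n",
"print(result.answer)\n",
"```\n",
"\n",
"The first attempt fails validation, and its error message is what steers the second attempt. This is why a clear error message matters as much as the prompt itself."
]
},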
{
"cell_type": "code",
"execution_count": 79,
"id": "97f544e7-2552-465c-89a9-a4820f00d658",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'a life of sin and debauchery'"
]
},
"execution_count": 79,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"class QuestionAnswer(BaseModel):\n",
" question: str\n",
" answer: str\n",
"\n",
"\n",
"question = \"What is the meaning of life?\"\n",
"context = (\n",
" \"The according to the devil the meaning of life is a life of sin and debauchery.\"\n",
")\n",
"\n",
"\n",
"resp = client.chat.completions.create(\n",
" model=\"gpt-4-1106-preview\",\n",
" response_model=QuestionAnswer,\n",
" messages=[\n",
" {\n",
" \"role\": \"system\",\n",
" \"content\": \"You are a system that answers questions based on the context. answer exactly what the question asks using the context.\",\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": f\"using the context: `{context}`\\n\\nAnswer the following question: `{question}`\",\n",
" },\n",
" ],\n",
")\n",
"\n",
"resp.answer"
]
},
{
"cell_type": "code",
"execution_count": 80,
"id": "0328bbc5",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'The meaning of life is a concept that varies depending on individual perspectives and beliefs.'"
]
},
"execution_count": 80,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from pydantic import BeforeValidator\n",
"\n",
"\n",
"class QuestionAnswer(BaseModel):\n",
" question: str\n",
" answer: Annotated[\n",
" str,\n",
" BeforeValidator(llm_validator(\"don't say objectionable things\", client=client)),\n",
" ]\n",
"\n",
"\n",
"resp = client.chat.completions.create(\n",
" model=\"gpt-3.5-turbo\",\n",
" response_model=QuestionAnswer,\n",
" max_retries=2,\n",
" messages=[\n",
" {\n",
" \"role\": \"system\",\n",
" \"content\": \"You are a system that answers questions based on the context. answer exactly what the question asks using the context.\",\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": f\"using the context: `{context}`\\n\\nAnswer the following question: `{question}`\",\n",
" },\n",
" ],\n",
")\n",
"\n",
"resp.answer"
]
},
{
"cell_type": "markdown",
"id": "a0c07b8b-ba6d-4e5d-a26c-ba72ca7d4f22",
"metadata": {},
"source": [
"# Conclusion"
]
},
{
"cell_type": "markdown",
"id": "344c623a-9b3b-4134-92d4-ad4eb9bb5f9e",
"metadata": {},
"source": [
"This guide explains how to use deterministic and probabilistic validation techniques with Large Language Models (LLMs). We discussed using an instructor to establish validation processes for content filtering, context relevance maintenance, and model reasoning verification. These methods enhance the performance of LLMs across different tasks.\n",
"\n",
"For those interested in further exploration, here's a to-do list:\n",
"\n",
"1. **SQL Syntax Checker**: Create a validator to check the syntax of SQL queries before executing them.\n",
"2. **Context-Based Response Validation**: Design a method to flag responses based on the model's own knowledge rather than the provided context.\n",
"3. **PII Detection**: Implement a mechanism to identify and handle Personally Identifiable Information in responses while prioritizing user privacy.\n",
"4. **Targeted Rule-Based Filtering**: Develop filters to remove specific content types, such as responses mentioning named entities.\n",
"\n",
"Completing these tasks will enable users to acquire practical skills in improving LLMs through advanced validation methods."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}