Quick Start
Set up¶
The following instructions assume that there is already a Foundation4.ai API Server successfully running and accepting connections. Please see the installation instructions for details on how to set up the API server.
Test the connection¶
In order to ensure that the server is reachable, connecting to the API endpoint directly identifies the server.
import json
import ssl
import httpx
import truststore
FOUNDATION4AI_API = 'https://api.foundation4ai.example.com' # replace with running server endpoint
FOUNDATION4AI_API = FOUNDATION4AI_API.strip('/')
ctx = truststore.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
client = httpx.Client(verify=ctx)
res = client.get(FOUNDATION4AI_API)
assert res.status_code == 200 and res.json()["message"] == "Foundation4.ai Server"
Set up LLM endpoints¶
The Foundation4.ai API Server can use any number of OpenAI-compatible endpoints to perform all the LLM tasks necessary to answer user questions. To set those up, use the /llms
api endpoint:
res = client.post(f"{FOUNDATION4AI_API}/llms", json={
'name': 'Quick Start LLM Endpoint',
'description': 'Short description of this LLM',
'endpoint': 'https://api.openai.com/v1',
'api_key': 'SECRET API KEY' # replace with the endpoint key
})
# note that this will fail if this code is executed twice as the LLM `name` must be unique
assert res.status_code == 200
The id
of the LLM just created can be obtained from the response:
llm_id = res.json()['id']
print('LLM id:', llm_id)
LLM id: b35904ae-054c-4ff9-bf45-d9bf5e38e057
Listing existing LLM endpoints or getting a specific LLM endpoint follow the traditional REST operations. For example,
res = client.get(f"{FOUNDATION4AI_API}/llms/{llm_id}")
# note that this will fail if this code is executed twice as the LLM `name` must be unique
assert res.status_code == 200
print(json.dumps(res.json(), indent=4))
{ "id": "b35904ae-054c-4ff9-bf45-d9bf5e38e057", "name": "Quick Start LLM Endpoint", "description": "Short description of this LLM", "endpoint": "https://api.openai.com/v1", "model": null }
Create a pipeline¶
Pipelines require an embedding model that cannot be changed after the pipeline creation, and a mechanism to split documents into smaller chunks. Additionally, Foundation4ai pipelines require documents and pipelines to have an associated document classification.
Embedding Providers and Models¶
To list available embeddings providers, use the /providers/embeddings
endpoint:
res = client.get(f"{FOUNDATION4AI_API}/providers/embeddings")
# note that this will fail if this code is executed twice as the LLM `name` must be unique
assert res.status_code == 200
print(json.dumps(res.json(), indent=4))
[ { "id": "0b31298a-27ba-4056-830d-37b6e95f3f02", "provider": "GPT4AllEmbeddings", "description": null } ]
Providers are bundled with the application and represent classes that are instantiated with particular parameters. In the example above, the GPT4AllEmbeddings
provider can be used by creating or using an already existing embedding model with parameters. For example,
res = client.get(f"{FOUNDATION4AI_API}/embedding-models")
# note that this will fail if this code is executed twice as the LLM `name` must be unique
assert res.status_code == 200
print(json.dumps(res.json(), indent=4))
[ { "id": "f48f032e-2883-4813-be57-d38cf06c5c42", "name": "MiniLM-L6-v2", "provider": "GPT4AllEmbeddings", "size": 384, "parameters": { "model_name": "all-MiniLM-L6-v2.gguf2.f16.gguf" }, "description": null } ]
Text Splitters¶
Pipelines also require a default text splitter mechanism to split documents into smaller chunks. The text splitter providers are bundled with the application and represent LangChain TextSplitter instances with specific parameters. To retrieve a list of available text splitters, use the /providers/text-splitters
endpoint.
res = client.get(f"{FOUNDATION4AI_API}/providers/text-splitters")
# note that this will fail if this code is executed twice as the LLM `name` must be unique
assert res.status_code == 200
print(json.dumps(res.json(), indent=4))
[ { "id": "019252e9-b4a0-7713-9a69-d701b4f4a2d1", "provider": "RecursiveCharacterTextSplitter", "parameters": {}, "description": null }, { "id": "019252e9-da22-7f12-8c2f-36f4025fb0af", "provider": "CharacterTextSplitter", "parameters": {}, "description": null }, { "id": "019252e9-f2b1-7482-930f-0cf9400bdd79", "provider": "NLTKTextSplitter", "parameters": {}, "description": null } ]
Pipeline¶
To create a new pipeline, use the /pipelines
endpoint:
res = client.post(
f"{FOUNDATION4AI_API}/pipelines",
json={
"name": "Quick Start Pipeline",
"embedding_model_id": "f48f032e-2883-4813-be57-d38cf06c5c42",
"default_text_splitter_id": "019252e9-b4a0-7713-9a69-d701b4f4a2d1",
"classifications": [
["secret", "classified"],
["classified", "public"],
],
},
)
# note that this will fail if this code is executed twice as the Pipeline `name` must be unique
assert res.status_code == 200
print(json.dumps(res.json(), indent=4))
pipeline_id = res.json()["id"]
{ "id": "a7f15787-639e-4e40-9127-bc92a8294594", "name": "Quick Start Pipeline", "embedding_model": { "id": "f48f032e-2883-4813-be57-d38cf06c5c42", "name": "MiniLM-L6-v2", "provider": "GPT4AllEmbeddings", "size": 384, "parameters": { "model_name": "all-MiniLM-L6-v2.gguf2.f16.gguf" } }, "default_text_splitter": { "id": "019252e9-b4a0-7713-9a69-d701b4f4a2d1", "provider": "RecursiveCharacterTextSplitter", "parameters": {} } }
Note that above we specified three document classifications on this pipeline, secret
, classified
, and public
. As explained in the Concepts page, the classifications represent a hierarchical system of access. To retrieve the classifications available for a specific pipeline:
res = client.get(f"{FOUNDATION4AI_API}/pipelines/{pipeline_id}/classifications")
assert res.status_code == 200
print(json.dumps(res.json(), indent=4))
{ "classifications": [ "secret", "classified", "public" ], "hierarchy": [ [ "secret", "classified" ], [ "classified", "public" ] ] }
That is, this pipeline has the document classifications from above and from the hierarchy we see that secret
users have access to classified
documents and anyone with a classified
permission can also read public
documents. Although not explicitly specified on the result above, because of the hierarchical nature os the classification system, a secret
user will also have access to public
documents.
Add documents to the pipeline¶
To add documents to a pipeline, use the /pipelines/{pipeline_id}/documents
endpoint. This is an asynchronous endpoint that ingests a document and sets it up for processing:
res = client.post(
f"{FOUNDATION4AI_API}/pipelines/{pipeline_id}/documents",
json={
"classification": "public",
"contents": "The secret number is between 10 and 20"
},
)
assert res.status_code == 200
print(json.dumps(res.json(), indent=4))
document_id = res.json()['id']
{ "id": "0193ac43-1361-7912-b896-11f85a6c6cf8", "name": "0193ac43-1361-7912-b896-11e11a442118", "pipeline": { "id": "a7f15787-639e-4e40-9127-bc92a8294594", "name": "Quick Start Pipeline", "embedding_model": { "id": "f48f032e-2883-4813-be57-d38cf06c5c42", "name": "MiniLM-L6-v2", "provider": "GPT4AllEmbeddings", "size": 384, "parameters": { "model_name": "all-MiniLM-L6-v2.gguf2.f16.gguf" }, "description": null }, "default_text_splitter": { "id": "019252e9-b4a0-7713-9a69-d701b4f4a2d1", "provider": "RecursiveCharacterTextSplitter", "parameters": {}, "description": null }, "description": null }, "classification": "public", "meta": {}, "text_splitter": null, "status": "pending", "message": null }
Checking the status of a document¶
Note that the response above gives us a result that tells us that the document is queued but not yet processed (pending state). To check whether the document was fully loaded and processed, just retrieve the document information again:
res = client.get(f"{FOUNDATION4AI_API}/documents/{document_id}")
assert res.status_code == 200
print(json.dumps(res.json(), indent=4))
print()
print("Document status is:", res.json()['status'])
{ "id": "0193ac43-1361-7912-b896-11f85a6c6cf8", "name": "0193ac43-1361-7912-b896-11e11a442118", "pipeline": { "id": "a7f15787-639e-4e40-9127-bc92a8294594", "name": "Quick Start Pipeline", "embedding_model": { "id": "f48f032e-2883-4813-be57-d38cf06c5c42", "name": "MiniLM-L6-v2", "provider": "GPT4AllEmbeddings", "size": 384, "parameters": { "model_name": "all-MiniLM-L6-v2.gguf2.f16.gguf" }, "description": null }, "default_text_splitter": { "id": "019252e9-b4a0-7713-9a69-d701b4f4a2d1", "provider": "RecursiveCharacterTextSplitter", "parameters": {}, "description": null }, "description": null }, "classification": "public", "meta": {}, "text_splitter": null, "status": "success", "message": null } Document status is: success
Note that if this endpoint is executed right after the document creation, the status may still be pending
. Subsequent calls to this endpoint will eventually switch to the success
status.
Creating an Agent¶
Agents represent prompt instructions that interact with LLMs and the vector stores. Please refer to the Agents page for a detailed explanation on how Agents operate and how to design prompts.
To create an Agent, use the /agents
endpoint.
Basic Prompt¶
A basic chat-completion style prompt can be achieved with something similar to the example below:
res = client.post(
f"{FOUNDATION4AI_API}/agents",
json={
"name": "Quick Start Basic Agent",
"description": "Basic example of a chat-completions ",
"prompt": [
{
"role": "system",
"template": """
Your job is to answer questions from users.
Use the following context to answer questions.
Be as detailed as possible, but don't make up any information
that's not from the context. If you don't know an answer, say
you don't know.
{context}
""",
},
{"role": "user", "template": "{query}"},
],
"placeholders": [
{
"name": "context",
"type": "similarity",
"params": {"k": 10},
"target": "query",
},
{"name": "query", "type": "query"},
],
},
)
# note that this will fail if this code is executed twice as the Pipeline `name` must be unique
assert res.status_code == 200
print(json.dumps(res.json(), indent=4))
basic_agent_id = res.json()["id"]
{ "id": "f5952625-c37f-47e9-88be-3773f5253b21", "name": "Quick Start Basic Agent", "description": "Basic example of a chat-completions ", "prompt": [ { "role": "system", "template": "\n Your job is to answer questions from users.\n Use the following context to answer questions.\n Be as detailed as possible, but don't make up any information\n that's not from the context. If you don't know an answer, say\n you don't know.\n\n {context}\n ", "include": true }, { "role": "user", "template": "{query}", "include": true } ], "placeholders": [ { "name": "context", "type": "similarity", "params": { "k": 10 }, "target": "query" }, { "name": "query", "type": "query", "params": {}, "target": null } ], "meta": {} }
Querying Agents¶
To query an agent, the user must specify the LLM and Pipeline to use in addition the required prompt variables. For example:
res = client.post(
f"{FOUNDATION4AI_API}/agents/{basic_agent_id}/execute",
headers={'X-LLM-ID': llm_id, 'X-Pipeline-ID': pipeline_id},
json={
"prompt": {"query": "What's the secret number?"},
"classification": "public"
},
)
assert res.status_code == 200
print(json.dumps(res.json(), indent=4))
"I'm sorry, but I cannot give you the exact secret number. However, I can tell you that the secret number is between 10 and 20."
Note that the prompt
parameter in the POST request above requires all the variables specified in the Agent prompt. If a variable is missed, the API will report an error:
res = client.post(
f"{FOUNDATION4AI_API}/agents/{basic_agent_id}/execute",
headers={'X-LLM-ID': llm_id, 'X-Pipeline-ID': pipeline_id},
json={
"prompt": {"wrong_variable_name": "What's the secret number?"},
"classification": "public"
},
)
assert res.status_code == 422
print(json.dumps(res.json(), indent=4))
{ "detail": { "msg": "Missing prompt variables: {'query'}" } }
Since inspecting an Agent object and analyzing the prompt for required variables can be time consuming, the endpoint /agents/{agent_id}/execute
offers a simple way to quickly determine which variables are needed for a particular Agent:
res = client.get(f"{FOUNDATION4AI_API}/agents/{basic_agent_id}/prompt")
assert res.status_code == 200
print("Required variables:", json.dumps(res.json()["input_variables"], indent=4))
Required variables: [ "query" ]
The results become more interesting when there are multiple documents with disparate classifications. As an example, we can add a document with a more restrictive classification:
res = client.post(
f"{FOUNDATION4AI_API}/pipelines/{pipeline_id}/documents",
json={
"classification": "classified",
"contents": "The secret number is between 5 and 15"
},
)
assert res.status_code == 200
Repeating the same query as above with a stricter classification will now look into documents from the classification hierarchy to assemble the context for the LLM. In our example, since classified
users can also read public
documents, the context consists of the two documents added above. From the response below we see that the LLM combined the information from both levels.
res = client.post(
f"{FOUNDATION4AI_API}/agents/{basic_agent_id}/execute",
headers={'X-LLM-ID': llm_id, 'X-Pipeline-ID': pipeline_id},
json={
"prompt": {"query": "What's the secret number?"},
"classification": "classified"
},
)
assert res.status_code == 200
print(json.dumps(res.json(), indent=4))
"The secret number is between 10 and 15."
Running the initial query with the public
permission results, as expected, with information that doesn't include knowledge in the classified category.
res = client.post(
f"{FOUNDATION4AI_API}/agents/{basic_agent_id}/execute",
headers={'X-LLM-ID': llm_id, 'X-Pipeline-ID': pipeline_id},
json={
"prompt": {"query": "What's the secret number?"},
"classification": "public"
},
)
assert res.status_code == 200
print(json.dumps(res.json(), indent=4))
"The secret number is between 10 and 20, but I don't have the exact number."