FLAN-T5
Navigate to the Model API side tab in the Stochastic Platform and select the FLAN-T5 model card.
You can now use the model either through our Playground or through the API. Choose the method by selecting the corresponding tab.
Playground usage
On this page you will find a text area where you can enter the prompt you want the model to generate text from. In the right-hand panel you can adjust the supported parameter values to your preference. Finally, press the Submit button to send the request. The output will be shown in the central panel.
- Max new tokens: The maximum number of tokens generated by the model. A token is roughly 4 characters, including alphanumerics and special characters.
- TopK: Top-K sampling sorts tokens by probability and zeroes out the probabilities of everything below the k-th token. A lower value improves quality by removing the tail, making it less likely for the model to go off topic.
- Penalty alpha: Regulates the trade-off between model confidence and the degeneration penalty in contrastive search decoding. When generating output, contrastive search jointly considers the probability predicted by the language model, to maintain semantic coherence between the generated text and the prefix, and the similarity with respect to the previous context, to avoid model degeneration.
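These are the same parameters the model API accepts (max_new_tokens, top_k, penalty_alpha). Purely for intuition, here is a minimal local sketch of how they drive contrastive search in the Hugging Face transformers library; running this is not required for the hosted API, and the google/flan-t5-base checkpoint is only an assumed example.
# Optional local sketch (not needed for the hosted API): the same three
# parameters drive contrastive search in Hugging Face transformers.
# "google/flan-t5-base" is an illustrative checkpoint, not the hosted model.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

inputs = tokenizer("A step by step recipe to make bolognese pasta:", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=64,   # Max new tokens
    top_k=4,             # TopK
    penalty_alpha=0.6,   # Penalty alpha (enables contrastive search)
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))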
Stochastic-x model API usage
To use the model API in your application there are two main steps. You also need a Stochastic account; sign up for a free account.
- Step-1: Submit an inference request.
- Step-2: Poll for the result.
Step-1
In this step we submit an inference request to the ApiUrl to get the responseUrl and the queuePosition.
The request and response specifications are given below.
Request
Method : POST
Header : In the request header, add a property called apiKey. (Get the apiKey from the Stochastic Platform.)
Body : The request body contains the following properties:
- prompt: Required. The prompt for text generation; either a single string or an array of strings.
- params: Required. The generation parameters (max_new_tokens, top_k, penalty_alpha).
Here is an example for the request body:
{
    "prompt": "The Inflation Reduction Act lowers prescription drug costs, health care costs, and energy costs. It's the most aggressive action on tackling the climate crisis in American history, which will lift up American workers and create good-paying, union jobs across the country. It'll lower the deficit and ask the ultra-wealthy and corporations to pay their fair share. And no one making under $400,000 per year will pay a penny more in taxes",
    "params": {
        "max_new_tokens": 64,
        "top_k": 4,
        "penalty_alpha": 0.6
    }
}
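As an illustration, a minimal Step-1 submission in Python could look like the sketch below; it reuses the submit URL and API-key header from the full Python example further down, and the commented lines show the array-of-strings form of prompt.
import requests

# Sketch of the Step-1 submission; replace "your API key" with a key from the
# Stochastic Platform. The URL matches the full Python example below.
response_step1 = requests.post(
    url="https://api-dev.stochastic.ai/v1/modelApi/submit/flan-t5",
    headers={"apiKey": "your API key"},
    json={
        "prompt": "A step by step recipe to make bolognese pasta:",
        # "prompt" may also be an array of strings, e.g.
        # "prompt": ["First prompt", "Second prompt"],
        "params": {"max_new_tokens": 64, "top_k": 4, "penalty_alpha": 0.6},
    },
)
response_step1.raise_for_status()
print(response_step1.json()["data"])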
Response
If the request is successful, you will receive the responseUrl and the queuePosition:
{
    "success": true,
    "data": {
        "id": "6389ce23460c900d80fa2290",
        "responseUrl": "https://api.stochastic.ai/v1/modelApi/inference/6389ce23460c900d80fa2290",
        "queuePosition": "0"
    }
}
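Continuing that sketch, the two fields you need from this response can be pulled out like this (response_step1 is the submission response from the sketch above):
# Pull the polling URL and queue position out of the Step-1 response.
data_step1 = response_step1.json()["data"]
response_url = data_step1["responseUrl"]      # poll this URL in Step-2
queue_position = data_step1["queuePosition"]  # your position in the queue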
Step-2
In this step we keep polling the responseUrl at a regular interval (say, every 15 seconds) until the response contains the completion.
The request and response specifications are given below.
Request
Method : GET
Header : In the request header, add a property called apiKey. (Get the apiKey from the Stochastic Platform.)
Body : No request body, as this is a GET request.
Response
{
    "success": true,
    "data": {
        "id": "6389b7a86c592e05217dbb20",
        "model": "flan-t5",
        "prompt": "The Inflation Reduction Act lowers prescription drug costs, health care costs, and energy costs. It's the most aggressive action on tackling the climate crisis in American history, which will lift up American workers and create good-paying, union jobs across the country. It'll lower the deficit and ask the ultra-wealthy and corporations to pay their fair share. And no one making under $400,000 per year will pay a penny more in taxes",
        "completion": [
            "The rich will pay more in taxes and will be able to buy a new car, or buy a new house."
        ]
    }
}
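While the request is still queued, the data object does not yet contain completion, so polling code can simply check for that key. A minimal sketch, reusing response_url and the apiKey header from the sketches above and the 15-second interval suggested earlier:
import time

import requests

# Poll the responseUrl until the "completion" key appears in the data object.
while True:
    data = requests.get(response_url, headers={"apiKey": "your API key"}).json()["data"]
    if "completion" in data:
        print(data["completion"])  # list of generated texts
        break
    time.sleep(15)  # interval suggested above; adjust as needed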
Python example
Below is a complete Python example that submits a request and polls until the completion is returned. Don't forget to replace the API key placeholder with your own key.
import requests
import time

# Step-1: submit the inference request.
response_step1 = requests.post(
    url="https://api-dev.stochastic.ai/v1/modelApi/submit/flan-t5",
    headers={
        "apiKey": "your API key"
    },
    json={
        "prompt": "A step by step recipe to make bolognese pasta:",
        "params": {
            "max_new_tokens": 64,
            "top_k": 4,
            "penalty_alpha": 0.6
        }
    }
)
response_step1.raise_for_status()
data_step1 = response_step1.json()["data"]

# Step-2: poll the responseUrl until the completion is available.
completed = False
while not completed:
    response_step2 = requests.get(
        data_step1["responseUrl"],
        headers={
            "apiKey": "your API key"
        }
    )
    response_step2.raise_for_status()
    data_step2 = response_step2.json()["data"]
    completion = data_step2.get("completion")
    completed = completion is not None
    if not completed:
        time.sleep(1)  # polling interval; increase (e.g. to 15 seconds) for long-running requests

print(completion)
Output
['In a large saucepan, combine the ground beef, tomato paste, tomato sauce, oregano, basil, salt, pepper, and thyme. Bring to a boil, then reduce the heat to low and simmer for 30 minutes. Meanwhile, cook the pasta in salted boiling water according to']