
GPT-J

In the Stochastic Platform, navigate to the Model API side tab and select the GPT-J model card.

Model card

You can now use the model either through our Playground or through the API. Choose the method by selecting the corresponding tab.

Playground usage

On this page you will find a text area in which you can enter the prompt you would like the model to generate text from. In the right-hand panel you can adjust the supported parameter values to your preference. Finally, press the Submit button to trigger a request. The output will be shown in the central panel.

  • Max new tokens: The maximum number of tokens the model will generate. A token is roughly 4 characters, including alphanumeric and special characters.
  • TopK: Top-K sampling sorts tokens by probability and zeroes out the probabilities of everything below the k-th token. A lower value improves quality by removing the tail, making it less likely for generation to go off topic.
  • Penalty alpha: Regulates the trade-off between model confidence and the degeneration penalty in contrastive search decoding. When generating output, contrastive search jointly considers the probability predicted by the language model, to maintain semantic coherence between the generated text and the prefix, and the similarity with respect to the previous context, to avoid model degeneration.
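To illustrate what the TopK parameter does, here is a minimal sketch of top-k filtering in plain Python. This is only an illustration of the sampling idea, not the platform's implementation; the function name `top_k_probs` and the example logits are made up.

```python
import math

def top_k_probs(logits, k):
    """Top-K filter: keep the k highest logits, zero out the rest,
    then renormalise with a softmax over the survivors."""
    kth = sorted(logits, reverse=True)[k - 1]      # value of the k-th largest logit
    kept = [x if x >= kth else None for x in logits]
    m = max(x for x in kept if x is not None)      # subtract max for numerical stability
    exps = [math.exp(x - m) if x is not None else 0.0 for x in kept]
    total = sum(exps)
    return [e / total for e in exps]

# Only the two highest-scoring tokens keep nonzero probability:
probs = top_k_probs([2.0, 1.0, 0.5, -1.0], k=2)
```

With `k=2`, the last two tokens get probability 0 and the remaining mass is redistributed over the two strongest candidates, which is why a lower value cuts off the unlikely tail.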


Stochastic-x model API usage

To use the model API in your application there are two main steps:

info

To use the model API you have to have a Stochastic account. Sign up for a free account.

In the first step, submit an inference request to the ApiUrl to get the responseUrl and the queuePosition. The request and response specifications are given below.

  • Request

    • Method: POST

    • Header: In the request header, add a property called apiKey. (Get the apiKey from the Stochastic Platform.)


    • Body: The request body can contain the following properties:

      • prompt: Required; the prompt for text generation, either a single string or an array of strings.
      • params: Required; the generation parameters.

      Here is an example for the request body:

      {
        "prompt": "A step by step recipe to make bolognese pasta:",
        "params": {
          "max_new_tokens": 64,
          "top_k": 4,
          "penalty_alpha": 0.6
        }
      }
  • Response

    If the request is successful, you will receive the responseUrl and the queuePosition:

    {
      "success": true,
      "data": {
        "id": "6389ce23460c900d80fa2290",
        "responseUrl": "https://api.stochastic.ai/v1/modelApi/inference/6389ce23460c900d80fa2290",
        "queuePosition": "0"
      }
    }

Python example

Below you can find an example request with Python to get the completion. Don't forget to add your API key in the example.

import requests
import time

# Step 1: submit the inference request
response_step1 = requests.post(
    url="https://api-dev.stochastic.ai/v1/modelApi/submit/gpt-j",
    headers={"apiKey": "your API key"},
    json={
        "prompt": "A step by step recipe to make bolognese pasta:",
        "params": {
            "max_new_tokens": 64,
            "top_k": 4,
            "penalty_alpha": 0.6,
        },
    },
)

response_step1.raise_for_status()
data_step1 = response_step1.json()["data"]

# Step 2: poll the responseUrl until the completion is ready
completed = False
while not completed:
    response_step2 = requests.get(
        data_step1["responseUrl"],
        headers={"apiKey": "your API key"},
    )
    response_step2.raise_for_status()

    data_step2 = response_step2.json()["data"]
    completion = data_step2.get("completion")
    completed = completion is not None
    time.sleep(1)

print(completion)
Output
['\n\nBolognese is a meat sauce that originates from Emilia Romagna, a region in Northern Italy.\n\nThe bolognese sauce is made with meat such as beef, veal, pork, chicken, rabbit, and game.\n\nIt’s a thick sauce that can']
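The polling loop in the example above will spin forever if the request never finishes. One way to bound it is to factor the polling into a generic helper with a timeout. This is only a sketch — `poll_for_completion` is an illustration, not part of any official Stochastic client:

```python
import time

def poll_for_completion(fetch, timeout=60.0, interval=1.0):
    """Call `fetch` (a callable returning the parsed `data` dict from the
    responseUrl) until it contains a non-null "completion", or raise
    TimeoutError after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        data = fetch()
        completion = data.get("completion")
        if completion is not None:
            return completion
        time.sleep(interval)
    raise TimeoutError("inference did not complete before the timeout")
```

In the example above, `fetch` could be `lambda: requests.get(data_step1["responseUrl"], headers={"apiKey": "your API key"}).json()["data"]`.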