
SQuARE Model Inference API (0.1.0)


API reference for model inference.





Response samples

Content type
  • "is_alive": true


Sequence Classification

path Parameters
string (Identifier)
string (Hf Username)
Request Body schema: application/json
Array of Input (strings) or Array of Input (strings) or Input (object) (Input)

Input for the model. Supports Huggingface Transformer inputs (i.e., list of sentences, or list of pairs of sentences), a dictionary with Transformer inputs, or a dictionary containing numpy arrays (as lists). For the numpy arrays, also set is_preprocessed=True.

Transformer/ Adapter:
Task 'question_answering' expects the input to be in the (question, context) format.
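
As a minimal sketch of the request body described above, the helper below assembles the documented fields for a list of sentences or a list of sentence pairs. The helper name `build_request` and the example sentences are hypothetical and not part of the API; the field names and defaults are the ones listed in this section.

```python
import json
from typing import Any, Dict, List, Sequence, Union

# A single input is either one sentence or a pair of sentences.
SentenceInput = Union[str, Sequence[str]]


def build_request(inputs: List[SentenceInput], **overrides: Any) -> Dict[str, Any]:
    """Assemble a request body with the documented fields and defaults.

    `build_request` is a hypothetical helper used for illustration only.
    """
    body: Dict[str, Any] = {
        # Pairs given as tuples are converted to lists so they serialize to JSON arrays.
        "input": [x if isinstance(x, str) else list(x) for x in inputs],
        "is_preprocessed": False,
        "preprocessing_kwargs": {},
        "model_kwargs": {},
        "task_kwargs": {},
        "explain_kwargs": {},
        "attack_kwargs": {},
        "adapter_name": "",
    }
    unknown = set(overrides) - set(body)
    if unknown:
        raise ValueError(f"Unknown request fields: {sorted(unknown)}")
    body.update(overrides)
    return body


# A list of single sentences.
single = build_request(["I love this movie.", "The plot was dull."])

# A list of sentence pairs, serialized to the JSON string that would be sent.
pairs = build_request([("A man is eating.", "Someone is having a meal.")])
payload = json.dumps(pairs)
```

Rejecting unknown field names early is a design choice of this sketch; it catches typos such as `task_kwarg` before a request is sent.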

boolean (Is Preprocessed)
Default: false

Flag indicating that the input already contains pre-processed numpy arrays (as lists) and needs no further pre-processing.

Transformer/ Adapter/ SentenceTransformer: 'is_preprocessed' is not supported.

object (Preprocessing Kwargs)
Default: {}

Optional dictionary containing additional parameters for the pre-processing step.

SentenceTransformer: This is ignored.
Transformer/ Adapter: See the Huggingface tokenizer for possible parameters.

object (Model Kwargs)
Default: {}

Optional dictionary containing parameters that are passed to the model for the forward pass to control what additional tensors are returned.

SentenceTransformer: This is ignored.
Transformer/ Adapter: See the forward method of the Huggingface models for possible parameters. For example, set 'output_attentions=True' to receive the attention results in the output. For adapter models, the following options are also available:
1. Set 'average_adapters' to True to average the adapter weights.

object (Task Kwargs)
Default: {}

Optional dictionary containing additional parameters for handling of the task and task-related post-processing.

SentenceTransformer: This is ignored.
Transformer/ Adapter:
- 'is_regression': Flag to treat the output of models with num_labels>1 as regression, too, i.e., no softmax is applied and no labels are returned.
- 'embedding_mode': One of 'mean', 'max', 'cls', 'pooler', 'token'. The pooling mode used (no pooling for 'token'). 'pooler' uses the pooler_output of a Transformer, i.e., the processed CLS token. Default 'mean'.
- 'topk': Return the top-k most likely spans. Default 1.
- 'max_answer_len': Maximum token length of answers. Default 128.
- 'clean_up_tokenization_spaces': See the parameter of the same name in Huggingface tokenizer.decode(). Default False.
- See Huggingface model.generate() for all possible parameters that can be used. Note that 'model_kwargs' and 'task_kwargs' are merged for generation.
- 'normalize': Boolean. Set to True to return normalized embeddings. Default False.
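
To make the effect of 'is_regression' concrete, the following sketch shows one way such post-processing could look. It is an illustration under assumptions, not the service's implementation: the function name `postprocess_logits` and the output keys `scores` and `labels` are hypothetical.

```python
import math
from typing import Dict, List


def postprocess_logits(logits: List[List[float]], is_regression: bool = False) -> Dict[str, list]:
    """Illustrative post-processing of classification outputs.

    With is_regression=True the raw scores are returned unchanged and no
    labels are produced. Otherwise a softmax is applied to each row and the
    index of the most likely class is returned as the label.
    """
    if is_regression:
        return {"scores": logits}

    probabilities = []
    labels = []
    for row in logits:
        shift = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(value - shift) for value in row]
        total = sum(exps)
        probs = [e / total for e in exps]
        probabilities.append(probs)
        labels.append(probs.index(max(probs)))
    return {"scores": probabilities, "labels": labels}


classified = postprocess_logits([[1.0, 3.0, 2.0]])
regressed = postprocess_logits([[1.5, -2.0]], is_regression=True)
```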

object (Explain Kwargs)
Default: {}

Optional dictionary containing additional parameters for explaining predictions:
- 'method': Explanation method such as 'simple_grads', 'integrated_grads', 'smooth_grads', 'attention' or 'scaled_attention'.
- 'top_k': Number of word attributions to return.
- 'mode': One of 'question', 'context', 'all'. Returns the respective attributions.

object (Attack Kwargs)
Default: {}

Optional dictionary containing additional parameters for attacking models:
- 'method': Attack method such as 'hotflip' or 'input_reduction'.
- 'saliency_method': Saliency method such as 'simple_grads', 'integrated_grads', 'smooth_grads', 'attention' or 'scaled_attention'.
- 'max_flips': Number of words to flip in hotflip.
- 'include_answer': Whether to remove the answer from the context while attacking the model.

Adapter Name (string) or Array of Adapter Name (strings) (Adapter Name)
Default: ""

Only necessary for Adapter. The fully specified name of the to-be-used adapter from


Request samples

Content type
  • "input": [
  • "is_preprocessed": false,
  • "preprocessing_kwargs": { },
  • "model_kwargs": { },
  • "task_kwargs": { },
  • "explain_kwargs": { },
  • "attack_kwargs": { },
  • "adapter_name": ""

Response samples

Content type
  • "message": "string",
  • "task_id": "string"
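
The response sample above carries two string fields, 'message' and 'task_id'. A minimal sketch of parsing such a response on the client side follows; the class and function names are hypothetical, and the example values "queued" and "abc123" are made up for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskResponse:
    """Client-side view of the documented response fields."""
    message: str
    task_id: str


def parse_task_response(raw: str) -> TaskResponse:
    """Parse a JSON response body and check that both documented fields exist."""
    data = json.loads(raw)
    missing = [key for key in ("message", "task_id") if key not in data]
    if missing:
        raise ValueError(f"Response is missing fields: {missing}")
    return TaskResponse(message=str(data["message"]), task_id=str(data["task_id"]))


# Example values are invented; the service defines the real contents.
example = parse_task_response('{"message": "queued", "task_id": "abc123"}')
```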

Sequence Classification

path Parameters
string (Identifier)
query Parameters
string (Hf Username)
Request Body schema: application/json
Array of Input (strings) or Array of Input (strings) or Input (object) (Input)

Input for the model. Supports Huggingface Transformer inputs (i.e., list of sentences, or list of pairs of sentences), a dictionary with Transformer inputs, or a dictionary containing numpy arrays (as lists). For the numpy arrays, also set is_preprocessed=True.

Transformer/ Adapter:
Task 'question_answering' expects the input to be in the (question, context) format.

boolean (Is Preprocessed)
Default: false

Flag indicating that the input already contains pre-processed numpy arrays (as lists) and needs no further pre-processing.

Transformer/ Adapter/ SentenceTransformer: 'is_preprocessed' is not supported.

object (Preprocessing Kwargs)
Default: {}

Optional dictionary containing additional parameters for the pre-processing step.

SentenceTransformer: This is ignored.
Transformer/ Adapter: See the Huggingface tokenizer for possible parameters.

object (Model Kwargs)
Default: {}

Optional dictionary containing parameters that are passed to the model for the forward pass to control what additional tensors are returned.

SentenceTransformer: This is ignored.
Transformer/ Adapter: See the forward method of the Huggingface models for possible parameters. For example, set 'output_attentions=True' to receive the attention results in the output. For adapter models, the following options are also available:
1. Set 'average_adapters' to True to average the adapter weights.

object (Task Kwargs)
Default: {}

Optional dictionary containing additional parameters for handling of the task and task-related post-processing.

SentenceTransformer: This is ignored.
Transformer/ Adapter:
- 'is_regression': Flag to treat the output of models with num_labels>1 as regression, too, i.e., no softmax is applied and no labels are returned.
- 'embedding_mode': One of 'mean', 'max', 'cls', 'pooler', 'token'. The pooling mode used (no pooling for 'token'). 'pooler' uses the pooler_output of a Transformer, i.e., the processed CLS token. Default 'mean'.
- 'topk': Return the top-k most likely spans. Default 1.
- 'max_answer_len': Maximum token length of answers. Default 128.
- 'clean_up_tokenization_spaces': See the parameter of the same name in Huggingface tokenizer.decode(). Default False.
- See Huggingface model.generate() for all possible parameters that can be used. Note that 'model_kwargs' and 'task_kwargs' are merged for generation.
- 'normalize': Boolean. Set to True to return normalized embeddings. Default False.

object (Explain Kwargs)
Default: {}

Optional dictionary containing additional parameters for explaining predictions:
- 'method': Explanation method such as 'simple_grads', 'integrated_grads', 'smooth_grads', 'attention' or 'scaled_attention'.
- 'top_k': Number of word attributions to return.
- 'mode': One of 'question', 'context', 'all'. Returns the respective attributions.
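
A small client-side check can catch misspelled explanation options before a request is sent. The sketch below assumes that the methods and modes named in this section are the accepted values; the text lists the methods with "such as", so the real service may accept more. The function name `make_explain_kwargs` is hypothetical.

```python
from typing import Any, Dict

# Values named in this reference; the service may support others.
EXPLAIN_METHODS = {"simple_grads", "integrated_grads", "smooth_grads", "attention", "scaled_attention"}
EXPLAIN_MODES = {"question", "context", "all"}


def make_explain_kwargs(method: str, top_k: int, mode: str) -> Dict[str, Any]:
    """Build an 'explain_kwargs' dictionary after validating its values."""
    if method not in EXPLAIN_METHODS:
        raise ValueError(f"Unknown explanation method: {method!r}")
    if mode not in EXPLAIN_MODES:
        raise ValueError(f"Unknown mode: {mode!r}")
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    return {"method": method, "top_k": top_k, "mode": mode}


explain_kwargs = make_explain_kwargs("attention", top_k=5, mode="question")
```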

object (Attack Kwargs)
Default: {}

Optional dictionary containing additional parameters for attacking models:
- 'method': Attack method such as 'hotflip' or 'input_reduction'.
- 'saliency_method': Saliency method such as 'simple_grads', 'integrated_grads', 'smooth_grads', 'attention' or 'scaled_attention'.
- 'max_flips': Number of words to flip in hotflip.
- 'include_answer': Whether to remove the answer from the context while attacking the model.

Adapter Name (string) or Array of Adapter Name (strings) (Adapter Name)
Default: ""

Only necessary for Adapter. The fully specified name of the to-be-used adapter from


Request samples

Content type
  • "input": [
  • "is_preprocessed": false,
  • "preprocessing_kwargs": { },
  • "model_kwargs": { },
  • "task_kwargs": { },
  • "explain_kwargs": { },
  • "attack_kwargs": { },
  • "adapter_name": ""

Response samples

Content type
  • "message": "string",
  • "task_id": "string"

Token Classification

path Parameters
string (Identifier)
string (Hf Username)
Request Body schema: application/json
Array of Input (strings) or Array of Input (strings) or Input (object) (Input)

Input for the model. Supports Huggingface Transformer inputs (i.e., list of sentences, or list of pairs of sentences), a dictionary with Transformer inputs, or a dictionary containing numpy arrays (as lists). For the numpy arrays, also set is_preprocessed=True.

Transformer/ Adapter:
Task 'question_answering' expects the input to be in the (question, context) format.

boolean (Is Preprocessed)
Default: false

Flag indicating that the input already contains pre-processed numpy arrays (as lists) and needs no further pre-processing.

Transformer/ Adapter/ SentenceTransformer: 'is_preprocessed' is not supported.

object (Preprocessing Kwargs)
Default: {}

Optional dictionary containing additional parameters for the pre-processing step.

SentenceTransformer: This is ignored.
Transformer/ Adapter: See the Huggingface tokenizer for possible parameters.

object (Model Kwargs)
Default: {}

Optional dictionary containing parameters that are passed to the model for the forward pass to control what additional tensors are returned.

SentenceTransformer: This is ignored.
Transformer/ Adapter: See the forward method of the Huggingface models for possible parameters. For example, set 'output_attentions=True' to receive the attention results in the output. For adapter models, the following options are also available:
1. Set 'average_adapters' to True to average the adapter weights.
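
For illustration, the two request bodies below show how the model keyword arguments mentioned here could be set. This is a sketch, not a verified request: the input sentence is made up, the adapter names are placeholders, and combining 'average_adapters' with a list in 'adapter_name' is an assumption based on the schema allowing an array of adapter names.

```python
import json

# Request body asking the model to also return attention tensors.
request_with_attentions = {
    "input": ["The service was excellent."],
    "model_kwargs": {"output_attentions": True},
}

# Assumed usage for adapter models: name several adapters and average their weights.
# "<adapter-a>" and "<adapter-b>" are placeholders, not real adapter names.
request_with_averaged_adapters = {
    "input": ["The service was excellent."],
    "model_kwargs": {"average_adapters": True},
    "adapter_name": ["<adapter-a>", "<adapter-b>"],
}

# JSON booleans are lowercase, so Python's True serializes to true.
body = json.dumps(request_with_attentions)
```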

object (Task Kwargs)
Default: {}

Optional dictionary containing additional parameters for handling of the task and task-related post-processing.

SentenceTransformer: This is ignored.
Transformer/ Adapter:
- 'is_regression': Flag to treat the output of models with num_labels>1 as regression, too, i.e., no softmax is applied and no labels are returned.
- 'embedding_mode': One of 'mean', 'max', 'cls', 'pooler', 'token'. The pooling mode used (no pooling for 'token'). 'pooler' uses the pooler_output of a Transformer, i.e., the processed CLS token. Default 'mean'.
- 'topk': Return the top-k most likely spans. Default 1.
- 'max_answer_len': Maximum token length of answers. Default 128.
- 'clean_up_tokenization_spaces': See the parameter of the same name in Huggingface tokenizer.decode(). Default False.
- See Huggingface model.generate() for all possible parameters that can be used. Note that 'model_kwargs' and 'task_kwargs' are merged for generation.
- 'normalize': Boolean. Set to True to return normalized embeddings. Default False.

object (Explain Kwargs)
Default: {}

Optional dictionary containing additional parameters for explaining predictions:
- 'method': Explanation method such as 'simple_grads', 'integrated_grads', 'smooth_grads', 'attention' or 'scaled_attention'.
- 'top_k': Number of word attributions to return.
- 'mode': One of 'question', 'context', 'all'. Returns the respective attributions.

object (Attack Kwargs)
Default: {}

Optional dictionary containing additional parameters for attacking models:
- 'method': Attack method such as 'hotflip' or 'input_reduction'.
- 'saliency_method': Saliency method such as 'simple_grads', 'integrated_grads', 'smooth_grads', 'attention' or 'scaled_attention'.
- 'max_flips': Number of words to flip in hotflip.
- 'include_answer': Whether to remove the answer from the context while attacking the model.

Adapter Name (string) or Array of Adapter Name (strings) (Adapter Name)
Default: ""

Only necessary for Adapter. The fully specified name of the to-be-used adapter from


Request samples

Content type
  • "input": [
  • "is_preprocessed": false,
  • "preprocessing_kwargs": { },
  • "model_kwargs": { },
  • "task_kwargs": { },
  • "explain_kwargs": { },
  • "attack_kwargs": { },
  • "adapter_name": ""

Response samples

Content type
  • "message": "string",
  • "task_id": "string"

Token Classification

path Parameters
string (Identifier)
query Parameters
string (Hf Username)
Request Body schema: application/json
Array of Input (strings) or Array of Input (strings) or Input (object) (Input)

Input for the model. Supports Huggingface Transformer inputs (i.e., list of sentences, or list of pairs of sentences), a dictionary with Transformer inputs, or a dictionary containing numpy arrays (as lists). For the numpy arrays, also set is_preprocessed=True.

Transformer/ Adapter:
Task 'question_answering' expects the input to be in the (question, context) format.

boolean (Is Preprocessed)
Default: false

Flag indicating that the input already contains pre-processed numpy arrays (as lists) and needs no further pre-processing.

Transformer/ Adapter/ SentenceTransformer: 'is_preprocessed' is not supported.

object (Preprocessing Kwargs)
Default: {}

Optional dictionary containing additional parameters for the pre-processing step.

SentenceTransformer: This is ignored.
Transformer/ Adapter: See the Huggingface tokenizer for possible parameters.

object (Model Kwargs)
Default: {}

Optional dictionary containing parameters that are passed to the model for the forward pass to control what additional tensors are returned.

SentenceTransformer: This is ignored.
Transformer/ Adapter: See the forward method of the Huggingface models for possible parameters. For example, set 'output_attentions=True' to receive the attention results in the output. For adapter models, the following options are also available:
1. Set 'average_adapters' to True to average the adapter weights.

object (Task Kwargs)
Default: {}

Optional dictionary containing additional parameters for handling of the task and task-related post-processing.

SentenceTransformer: This is ignored.
Transformer/ Adapter:
- 'is_regression': Flag to treat the output of models with num_labels>1 as regression, too, i.e., no softmax is applied and no labels are returned.
- 'embedding_mode': One of 'mean', 'max', 'cls', 'pooler', 'token'. The pooling mode used (no pooling for 'token'). 'pooler' uses the pooler_output of a Transformer, i.e., the processed CLS token. Default 'mean'.
- 'topk': Return the top-k most likely spans. Default 1.
- 'max_answer_len': Maximum token length of answers. Default 128.
- 'clean_up_tokenization_spaces': See the parameter of the same name in Huggingface tokenizer.decode(). Default False.
- See Huggingface model.generate() for all possible parameters that can be used. Note that 'model_kwargs' and 'task_kwargs' are merged for generation.
- 'normalize': Boolean. Set to True to return normalized embeddings. Default False.

object (Explain Kwargs)
Default: {}

Optional dictionary containing additional parameters for explaining predictions:
- 'method': Explanation method such as 'simple_grads', 'integrated_grads', 'smooth_grads', 'attention' or 'scaled_attention'.
- 'top_k': Number of word attributions to return.
- 'mode': One of 'question', 'context', 'all'. Returns the respective attributions.

object (Attack Kwargs)
Default: {}

Optional dictionary containing additional parameters for attacking models:
- 'method': Attack method such as 'hotflip' or 'input_reduction'.
- 'saliency_method': Saliency method such as 'simple_grads', 'integrated_grads', 'smooth_grads', 'attention' or 'scaled_attention'.
- 'max_flips': Number of words to flip in hotflip.
- 'include_answer': Whether to remove the answer from the context while attacking the model.

Adapter Name (string) or Array of Adapter Name (strings) (Adapter Name)
Default: ""

Only necessary for Adapter. The fully specified name of the to-be-used adapter from


Request samples

Content type
  • "input": [
  • "is_preprocessed": false,
  • "preprocessing_kwargs": { },
  • "model_kwargs": { },
  • "task_kwargs": { },
  • "explain_kwargs": { },
  • "attack_kwargs": { },
  • "adapter_name": ""

Response samples

Content type
  • "message": "string",
  • "task_id": "string"


path Parameters
string (Identifier)
string (Hf Username)
Request Body schema: application/json
Array of Input (strings) or Array of Input (strings) or Input (object) (Input)

Input for the model. Supports Huggingface Transformer inputs (i.e., list of sentences, or list of pairs of sentences), a dictionary with Transformer inputs, or a dictionary containing numpy arrays (as lists). For the numpy arrays, also set is_preprocessed=True.

Transformer/ Adapter:
Task 'question_answering' expects the input to be in the (question, context) format.

boolean (Is Preprocessed)
Default: false

Flag indicating that the input already contains pre-processed numpy arrays (as lists) and needs no further pre-processing.

Transformer/ Adapter/ SentenceTransformer: 'is_preprocessed' is not supported.

object (Preprocessing Kwargs)
Default: {}

Optional dictionary containing additional parameters for the pre-processing step.

SentenceTransformer: This is ignored.
Transformer/ Adapter: See the Huggingface tokenizer for possible parameters.

object (Model Kwargs)
Default: {}

Optional dictionary containing parameters that are passed to the model for the forward pass to control what additional tensors are returned.

SentenceTransformer: This is ignored.
Transformer/ Adapter: See the forward method of the Huggingface models for possible parameters. For example, set 'output_attentions=True' to receive the attention results in the output. For adapter models, the following options are also available:
1. Set 'average_adapters' to True to average the adapter weights.

object (Task Kwargs)
Default: {}

Optional dictionary containing additional parameters for handling of the task and task-related post-processing.

SentenceTransformer: This is ignored.
Transformer/ Adapter:
- 'is_regression': Flag to treat the output of models with num_labels>1 as regression, too, i.e., no softmax is applied and no labels are returned.
- 'embedding_mode': One of 'mean', 'max', 'cls', 'pooler', 'token'. The pooling mode used (no pooling for 'token'). 'pooler' uses the pooler_output of a Transformer, i.e., the processed CLS token. Default 'mean'.
- 'topk': Return the top-k most likely spans. Default 1.
- 'max_answer_len': Maximum token length of answers. Default 128.
- 'clean_up_tokenization_spaces': See the parameter of the same name in Huggingface tokenizer.decode(). Default False.
- See Huggingface model.generate() for all possible parameters that can be used. Note that 'model_kwargs' and 'task_kwargs' are merged for generation.
- 'normalize': Boolean. Set to True to return normalized embeddings. Default False.
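
The pooling modes and the 'normalize' option can be pictured with a small pure-Python sketch. It is a simplified illustration, not the service's code: it assumes the CLS token is the first token, ignores padding masks, and leaves out 'pooler' because that mode depends on the model's own pooling layer. The names `pool` and `l2_normalize` are hypothetical.

```python
import math
from typing import List, Union

Vector = List[float]


def pool(token_embeddings: List[Vector], mode: str = "mean") -> Union[Vector, List[Vector]]:
    """Reduce per-token embeddings to one vector, or return them unpooled for 'token'."""
    if not token_embeddings:
        raise ValueError("token_embeddings must not be empty")
    if mode == "token":
        return token_embeddings  # no pooling: one vector per token
    if mode == "cls":
        return token_embeddings[0]  # assumption: the CLS token comes first
    columns = list(zip(*token_embeddings))  # group values by embedding dimension
    if mode == "mean":
        return [sum(col) / len(col) for col in columns]
    if mode == "max":
        return [max(col) for col in columns]
    raise ValueError(f"Unsupported mode in this sketch: {mode!r}")


def l2_normalize(vector: Vector) -> Vector:
    """Scale a vector to unit Euclidean length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)


tokens = [[1.0, 0.0], [3.0, 4.0]]
mean_vector = pool(tokens, "mean")
unit_vector = l2_normalize(pool(tokens, "max"))
```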

object (Explain Kwargs)
Default: {}

Optional dictionary containing additional parameters for explaining predictions:
- 'method': Explanation method such as 'simple_grads', 'integrated_grads', 'smooth_grads', 'attention' or 'scaled_attention'.
- 'top_k': Number of word attributions to return.
- 'mode': One of 'question', 'context', 'all'. Returns the respective attributions.

object (Attack Kwargs)
Default: {}

Optional dictionary containing additional parameters for attacking models:
- 'method': Attack method such as 'hotflip' or 'input_reduction'.
- 'saliency_method': Saliency method such as 'simple_grads', 'integrated_grads', 'smooth_grads', 'attention' or 'scaled_attention'.
- 'max_flips': Number of words to flip in hotflip.
- 'include_answer': Whether to remove the answer from the context while attacking the model.

Adapter Name (string) or Array of Adapter Name (strings) (Adapter Name)
Default: ""

Only necessary for Adapter. The fully specified name of the to-be-used adapter from


Request samples

Content type
  • "input": [
  • "is_preprocessed": false,
  • "preprocessing_kwargs": { },
  • "model_kwargs": { },
  • "task_kwargs": { },
  • "explain_kwargs": { },
  • "attack_kwargs": { },
  • "adapter_name": ""

Response samples

Content type
  • "message": "string",
  • "task_id": "string"


path Parameters
string (Identifier)
query Parameters
string (Hf Username)
Request Body schema: application/json
Array of Input (strings) or Array of Input (strings) or Input (object) (Input)

Input for the model. Supports Huggingface Transformer inputs (i.e., list of sentences, or list of pairs of sentences), a dictionary with Transformer inputs, or a dictionary containing numpy arrays (as lists). For the numpy arrays, also set is_preprocessed=True.

Transformer/ Adapter:
Task 'question_answering' expects the input to be in the (question, context) format.

boolean (Is Preprocessed)
Default: false

Flag indicating that the input already contains pre-processed numpy arrays (as lists) and needs no further pre-processing.

Transformer/ Adapter/ SentenceTransformer: 'is_preprocessed' is not supported.

object (Preprocessing Kwargs)
Default: {}

Optional dictionary containing additional parameters for the pre-processing step.

SentenceTransformer: This is ignored.
Transformer/ Adapter: See the Huggingface tokenizer for possible parameters.

object (Model Kwargs)
Default: {}

Optional dictionary containing parameters that are passed to the model for the forward pass to control what additional tensors are returned.

SentenceTransformer: This is ignored.
Transformer/ Adapter: See the forward method of the Huggingface models for possible parameters. For example, set 'output_attentions=True' to receive the attention results in the output. For adapter models, the following options are also available:
1. Set 'average_adapters' to True to average the adapter weights.

object (Task Kwargs)
Default: {}

Optional dictionary containing additional parameters for handling of the task and task-related post-processing.

SentenceTransformer: This is ignored.
Transformer/ Adapter:
- 'is_regression': Flag to treat the output of models with num_labels>1 as regression, too, i.e., no softmax is applied and no labels are returned.
- 'embedding_mode': One of 'mean', 'max', 'cls', 'pooler', 'token'. The pooling mode used (no pooling for 'token'). 'pooler' uses the pooler_output of a Transformer, i.e., the processed CLS token. Default 'mean'.
- 'topk': Return the top-k most likely spans. Default 1.
- 'max_answer_len': Maximum token length of answers. Default 128.
- 'clean_up_tokenization_spaces': See the parameter of the same name in Huggingface tokenizer.decode(). Default False.
- See Huggingface model.generate() for all possible parameters that can be used. Note that 'model_kwargs' and 'task_kwargs' are merged for generation.
- 'normalize': Boolean. Set to True to return normalized embeddings. Default False.

object (Explain Kwargs)
Default: {}

Optional dictionary containing additional parameters for explaining predictions:
- 'method': Explanation method such as 'simple_grads', 'integrated_grads', 'smooth_grads', 'attention' or 'scaled_attention'.
- 'top_k': Number of word attributions to return.
- 'mode': One of 'question', 'context', 'all'. Returns the respective attributions.

object (Attack Kwargs)
Default: {}

Optional dictionary containing additional parameters for attacking models:
- 'method': Attack method such as 'hotflip' or 'input_reduction'.
- 'saliency_method': Saliency method such as 'simple_grads', 'integrated_grads', 'smooth_grads', 'attention' or 'scaled_attention'.
- 'max_flips': Number of words to flip in hotflip.
- 'include_answer': Whether to remove the answer from the context while attacking the model.

Adapter Name (string) or Array of Adapter Name (strings) (Adapter Name)
Default: ""

Only necessary for Adapter. The fully specified name of the to-be-used adapter from


Request samples

Content type
  • "input": [
  • "is_preprocessed": false,
  • "preprocessing_kwargs": { },
  • "model_kwargs": { },
  • "task_kwargs": { },
  • "explain_kwargs": { },
  • "attack_kwargs": { },
  • "adapter_name": ""

Response samples

Content type
  • "message": "string",
  • "task_id": "string"

Question Answering

path Parameters
string (Identifier)
string (Hf Username)
Request Body schema: application/json
Array of Input (strings) or Array of Input (strings) or Input (object) (Input)

Input for the model. Supports Huggingface Transformer inputs (i.e., list of sentences, or list of pairs of sentences), a dictionary with Transformer inputs, or a dictionary containing numpy arrays (as lists). For the numpy arrays, also set is_preprocessed=True.

Transformer/ Adapter:
Task 'question_answering' expects the input to be in the (question, context) format.
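
As a sketch of the (question, context) format, the snippet below pairs several questions with one context and places the pairs in the 'input' field. The helper name `qa_inputs`, the example texts and the chosen 'task_kwargs' values are assumptions made for illustration.

```python
import json
from typing import List


def qa_inputs(questions: List[str], context: str) -> List[List[str]]:
    """Pair every question with the same context, question first."""
    return [[question, context] for question in questions]


context = "Hamlet is a tragedy written by William Shakespeare."
request_body = {
    "input": qa_inputs(["Who wrote Hamlet?", "What kind of play is Hamlet?"], context),
    # 'topk' and 'max_answer_len' are the task options documented in this section.
    "task_kwargs": {"topk": 3, "max_answer_len": 30},
}
payload = json.dumps(request_body)
```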

boolean (Is Preprocessed)
Default: false

Flag indicating that the input already contains pre-processed numpy arrays (as lists) and needs no further pre-processing.

Transformer/ Adapter/ SentenceTransformer: 'is_preprocessed' is not supported.

object (Preprocessing Kwargs)
Default: {}

Optional dictionary containing additional parameters for the pre-processing step.

SentenceTransformer: This is ignored.
Transformer/ Adapter: See the Huggingface tokenizer for possible parameters.

object (Model Kwargs)
Default: {}

Optional dictionary containing parameters that are passed to the model for the forward pass to control what additional tensors are returned.

SentenceTransformer: This is ignored.
Transformer/ Adapter: See the forward method of the Huggingface models for possible parameters. For example, set 'output_attentions=True' to receive the attention results in the output. For adapter models, the following options are also available:
1. Set 'average_adapters' to True to average the adapter weights.

object (Task Kwargs)
Default: {}

Optional dictionary containing additional parameters for handling of the task and task-related post-processing.

SentenceTransformer: This is ignored.
Transformer/ Adapter:
- 'is_regression': Flag to treat the output of models with num_labels>1 as regression, too, i.e., no softmax is applied and no labels are returned.
- 'embedding_mode': One of 'mean', 'max', 'cls', 'pooler', 'token'. The pooling mode used (no pooling for 'token'). 'pooler' uses the pooler_output of a Transformer, i.e., the processed CLS token. Default 'mean'.
- 'topk': Return the top-k most likely spans. Default 1.
- 'max_answer_len': Maximum token length of answers. Default 128.
- 'clean_up_tokenization_spaces': See the parameter of the same name in Huggingface tokenizer.decode(). Default False.
- See Huggingface model.generate() for all possible parameters that can be used. Note that 'model_kwargs' and 'task_kwargs' are merged for generation.
- 'normalize': Boolean. Set to True to return normalized embeddings. Default False.
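
How 'topk' and 'max_answer_len' interact can be shown with a simplified span search. The sketch scores each candidate span by adding its start and end scores, a common approach in extractive question answering that is used here as a stand-in; the service may score spans differently. The function name `top_spans` and the example scores are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int, float]  # (start index, end index inclusive, score)


def top_spans(
    start_scores: List[float],
    end_scores: List[float],
    topk: int = 1,
    max_answer_len: int = 128,
) -> List[Span]:
    """Return the topk highest-scoring spans of at most max_answer_len tokens."""
    candidates: List[Span] = []
    for start, start_score in enumerate(start_scores):
        # A span covers end - start + 1 tokens, so end may reach start + max_answer_len - 1.
        last = min(start + max_answer_len, len(end_scores))
        for end in range(start, last):
            candidates.append((start, end, start_score + end_scores[end]))
    candidates.sort(key=lambda span: span[2], reverse=True)
    return candidates[:topk]


start_scores = [0.1, 2.0, 0.3]
end_scores = [0.2, 0.1, 3.0]
best_two = top_spans(start_scores, end_scores, topk=2)
single_token = top_spans(start_scores, end_scores, topk=1, max_answer_len=1)
```

With the example scores, the best span runs from token 1 to token 2; restricting answers to a single token selects token 2 instead.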

object (Explain Kwargs)
Default: {}

Optional dictionary containing additional parameters for explaining predictions:
- 'method': Explanation method such as 'simple_grads', 'integrated_grads', 'smooth_grads', 'attention' or 'scaled_attention'.
- 'top_k': Number of word attributions to return.
- 'mode': One of 'question', 'context', 'all'. Returns the respective attributions.

object (Attack Kwargs)
Default: {}

Optional dictionary containing additional parameters for attacking models:
- 'method': Attack method such as 'hotflip' or 'input_reduction'.
- 'saliency_method': Saliency method such as 'simple_grads', 'integrated_grads', 'smooth_grads', 'attention' or 'scaled_attention'.
- 'max_flips': Number of words to flip in hotflip.
- 'include_answer': Whether to remove the answer from the context while attacking the model.

Adapter Name (string) or Array of Adapter Name (strings) (Adapter Name)
Default: ""

Only necessary for Adapter. The fully specified name of the to-be-used adapter from


Request samples

Content type
  • "input": [
  • "is_preprocessed": false,
  • "preprocessing_kwargs": { },
  • "model_kwargs": { },
  • "task_kwargs": { },
  • "explain_kwargs": { },
  • "attack_kwargs": { },
  • "adapter_name": ""

Response samples

Content type
  • "message": "string",
  • "task_id": "string"

Question Answering

path Parameters
string (Identifier)
query Parameters
string (Hf Username)
Request Body schema: application/json
Array of Input (strings) or Array of Input (strings) or Input (object) (Input)

Input for the model. Supports Huggingface Transformer inputs (i.e., list of sentences, or list of pairs of sentences), a dictionary with Transformer inputs, or a dictionary containing numpy arrays (as lists). For the numpy arrays, also set is_preprocessed=True.

Transformer/ Adapter:
Task 'question_answering' expects the input to be in the (question, context) format.

boolean (Is Preprocessed)
Default: false

Flag indicating that the input already contains pre-processed numpy arrays (as lists) and needs no further pre-processing.

Transformer/ Adapter/ SentenceTransformer: 'is_preprocessed' is not supported.

object (Preprocessing Kwargs)
Default: {}

Optional dictionary containing additional parameters for the pre-processing step.

SentenceTransformer: This is ignored.
Transformer/ Adapter: See the Huggingface tokenizer for possible parameters.

object (Model Kwargs)
Default: {}

Optional dictionary containing parameters that are passed to the model for the forward pass to control what additional tensors are returned.

SentenceTransformer: This is ignored.
Transformer/ Adapter: See the forward method of the Huggingface models for possible parameters. For example, set 'output_attentions=True' to receive the attention results in the output. For adapter models, the following options are also available:
1. Set 'average_adapters' to True to average the adapter weights.

object (Task Kwargs)
Default: {}

Optional dictionary containing additional parameters for handling of the task and task-related post-processing.

SentenceTransformer: This is ignored.
Transformer/ Adapter:
- 'is_regression': Flag to treat the output of models with num_labels>1 as regression, too, i.e., no softmax is applied and no labels are returned.
- 'embedding_mode': One of 'mean', 'max', 'cls', 'pooler', 'token'. The pooling mode used (no pooling for 'token'). 'pooler' uses the pooler_output of a Transformer, i.e., the processed CLS token. Default 'mean'.
- 'topk': Return the top-k most likely spans. Default 1.
- 'max_answer_len': Maximum token length of answers. Default 128.
- 'clean_up_tokenization_spaces': See the parameter of the same name in Huggingface tokenizer.decode(). Default False.
- See Huggingface model.generate() for all possible parameters that can be used. Note that 'model_kwargs' and 'task_kwargs' are merged for generation.
- 'normalize': Boolean. Set to True to return normalized embeddings. Default False.

object (Explain Kwargs)
Default: {}

Optional dictionary containing additional parameters for explaining predictions:
- 'method': Explanation method such as 'simple_grads', 'integrated_grads', 'smooth_grads', 'attention' or 'scaled_attention'.
- 'top_k': Number of word attributions to return.
- 'mode': One of 'question', 'context', 'all'. Returns the respective attributions.

object (Attack Kwargs)
Default: {}

Optional dictionary containing additional parameters for attacking models:
- 'method': Attack method such as 'hotflip' or 'input_reduction'.
- 'saliency_method': Saliency method such as 'simple_grads', 'integrated_grads', 'smooth_grads', 'attention' or 'scaled_attention'.
- 'max_flips': Number of words to flip in hotflip.
- 'include_answer': Whether to remove the answer from the context while attacking the model.
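
The attack options can be collected in a small container before they are placed in the request body. The sketch below is one possible client-side shape; the class name `AttackKwargs`, the chosen option values and the example question and context are hypothetical.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class AttackKwargs:
    method: str                            # e.g. 'hotflip' or 'input_reduction'
    saliency_method: str                   # e.g. 'simple_grads'
    max_flips: Optional[int] = None        # number of words to flip in hotflip
    include_answer: Optional[bool] = None  # left unset unless given explicitly

    def to_dict(self) -> Dict[str, Any]:
        """Drop unset options so that only explicit choices are sent."""
        return {key: value for key, value in asdict(self).items() if value is not None}


hotflip = AttackKwargs(method="hotflip", saliency_method="simple_grads", max_flips=3)
request_body = {
    "input": [["Who wrote Hamlet?", "Hamlet was written by William Shakespeare."]],
    "attack_kwargs": hotflip.to_dict(),
}
```

Omitting unset options keeps the request small and avoids guessing at defaults that this reference does not state.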

Adapter Name (string) or Array of Adapter Name (strings) (Adapter Name)
Default: ""

Only necessary for Adapter. The fully specified name of the to-be-used adapter from


Request samples

Content type
  • "input": [
  • "is_preprocessed": false,
  • "preprocessing_kwargs": { },
  • "model_kwargs": { },
  • "task_kwargs": { },
  • "explain_kwargs": { },
  • "attack_kwargs": { },
  • "adapter_name": ""

Response samples

Content type
  • "message": "string",
  • "task_id": "string"


path Parameters
string (Identifier)
string (Hf Username)
Request Body schema: application/json
Array of Input (strings) or Array of Input (strings) or Input (object) (Input)

Input for the model. Supports Huggingface Transformer inputs (i.e., list of sentences, or list of pairs of sentences), a dictionary with Transformer inputs, or a dictionary containing numpy arrays (as lists). For the numpy arrays, also set is_preprocessed=True.

Transformer/ Adapter:
Task 'question_answering' expects the input to be in the (question, context) format.

boolean (Is Preprocessed)
Default: false

Flag indicating that the input contains already pre-processed numpy arrays as list and that it needs no further pre-processing.

Transformer/ Adapter/ SentenceTransformer: 'is_preprocessed' is not supported.

object (Preprocessing Kwargs)
Default: {}

Optional dictionary containing additional parameters for the pre-processing step.

SentenceTransformer: This is ignored.
Transformer/ Adapter: See the Huggingface tokenizer for possible parameters.

object (Model Kwargs)
Default: {}

Optional dictionary containing parameters that are passed to the model for the forward pass to control what additional tensors are returned.

SentenceTransformer: This is ignored.
Transformer/ Adapter: See the forward method of the Huggingface models for possible parameters. For example, set 'output_attentions=True' to receive the attention results in the output. For adapter models, the following options are also available:
1. Set average_adapters to True to average the adapter weights.

object (Task Kwargs)
Default: {}

Optional dictionary containing additional parameters for handling of the task and task-related post-processing.

SentenceTransformer: This is ignored.
Transformer/ Adapter:
- 'is_regression': Flag to treat output of models with num_labels>1 as regression, too, i.e., no softmax and no labels are returned
- 'embedding_mode': One of 'mean', 'max', 'cls', 'pooler', 'token'. The pooling mode used (or not used for 'token'). 'pooler' uses the pooler_output of a Transformer, i.e., the processed CLS token. Default: 'mean'.
- 'topk': Return the top-k most likely spans. Default: 1.
- 'max_answer_len': Maximum token length of answers. Default: 128.
- 'clean_up_tokenization_spaces': See the parameter of the same name in Huggingface tokenizer.decode(). Default: False.
- See Huggingface model.generate() for all possible parameters that can be used. Note that 'model_kwargs' and 'task_kwargs' are merged for generation.
- 'normalize': Boolean. Set to True to use normalized embeddings. Default: False.

object (Explain Kwargs)
Default: {}

Optional dictionary containing additional parameters for explaining predictions:
- 'method': Explanation method such as 'simple_grads', 'integrated_grads', 'smooth_grads', 'attention' or 'scaled_attention'.
- 'top_k': Number of word attributions to return.
- 'mode': One of 'question', 'context', 'all'. Returns the respective attributions.

object (Attack Kwargs)
Default: {}

Optional dictionary containing additional parameters for attacking models:
- 'method': Attack method such as 'hotflip' or 'input_reduction'.
- 'saliency_method': 'simple_grads', 'integrated_grads', 'smooth_grads', 'attention' or 'scaled_attention'.
- 'max_flips': Number of words to flip in hotflip.
- 'include_answer': Whether to remove the answer from the context while attacking the model.

Adapter Name (string) or Array of Adapter Name (strings) (Adapter Name)
Default: ""

Only necessary for Adapter. The fully specified name of the to-be-used adapter from


Request samples

Content type
  • "input": [
  • "is_preprocessed": false,
  • "preprocessing_kwargs": { },
  • "model_kwargs": { },
  • "task_kwargs": { },
  • "explain_kwargs": { },
  • "attack_kwargs": { },
  • "adapter_name": ""

Response samples

Content type
  • "message": "string",
  • "task_id": "string"


path Parameters
string (Identifier)
query Parameters
string (Hf Username)
Request Body schema: application/json
Array of Input (strings) or Array of Input (strings) or Input (object) (Input)

Input for the model. Supports Huggingface Transformer inputs (i.e., list of sentences, or list of pairs of sentences), a dictionary with Transformer inputs, or a dictionary containing numpy arrays (as lists). For the numpy arrays, also set is_preprocessed=True.

Transformer/ Adapter:
Task 'question_answering' expects the input to be in the (question, context) format.

boolean (Is Preprocessed)
Default: false

Flag indicating that the input contains already pre-processed numpy arrays as list and that it needs no further pre-processing.

Transformer/ Adapter/ SentenceTransformer: 'is_preprocessed' is not supported.

object (Preprocessing Kwargs)
Default: {}

Optional dictionary containing additional parameters for the pre-processing step.

SentenceTransformer: This is ignored.
Transformer/ Adapter: See the Huggingface tokenizer for possible parameters.

object (Model Kwargs)
Default: {}

Optional dictionary containing parameters that are passed to the model for the forward pass to control what additional tensors are returned.

SentenceTransformer: This is ignored.
Transformer/ Adapter: See the forward method of the Huggingface models for possible parameters. For example, set 'output_attentions=True' to receive the attention results in the output. For adapter models, the following options are also available:
1. Set average_adapters to True to average the adapter weights.

object (Task Kwargs)
Default: {}

Optional dictionary containing additional parameters for handling of the task and task-related post-processing.

SentenceTransformer: This is ignored.
Transformer/ Adapter:
- 'is_regression': Flag to treat output of models with num_labels>1 as regression, too, i.e., no softmax and no labels are returned
- 'embedding_mode': One of 'mean', 'max', 'cls', 'pooler', 'token'. The pooling mode used (or not used for 'token'). 'pooler' uses the pooler_output of a Transformer, i.e., the processed CLS token. Default: 'mean'.
- 'topk': Return the top-k most likely spans. Default: 1.
- 'max_answer_len': Maximum token length of answers. Default: 128.
- 'clean_up_tokenization_spaces': See the parameter of the same name in Huggingface tokenizer.decode(). Default: False.
- See Huggingface model.generate() for all possible parameters that can be used. Note that 'model_kwargs' and 'task_kwargs' are merged for generation.
- 'normalize': Boolean. Set to True to use normalized embeddings. Default: False.

object (Explain Kwargs)
Default: {}

Optional dictionary containing additional parameters for explaining predictions:
- 'method': Explanation method such as 'simple_grads', 'integrated_grads', 'smooth_grads', 'attention' or 'scaled_attention'.
- 'top_k': Number of word attributions to return.
- 'mode': One of 'question', 'context', 'all'. Returns the respective attributions.

object (Attack Kwargs)
Default: {}

Optional dictionary containing additional parameters for attacking models:
- 'method': Attack method such as 'hotflip' or 'input_reduction'.
- 'saliency_method': 'simple_grads', 'integrated_grads', 'smooth_grads', 'attention' or 'scaled_attention'.
- 'max_flips': Number of words to flip in hotflip.
- 'include_answer': Whether to remove the answer from the context while attacking the model.

Adapter Name (string) or Array of Adapter Name (strings) (Adapter Name)
Default: ""

Only necessary for Adapter. The fully specified name of the to-be-used adapter from


Request samples

Content type
  • "input": [
  • "is_preprocessed": false,
  • "preprocessing_kwargs": { },
  • "model_kwargs": { },
  • "task_kwargs": { },
  • "explain_kwargs": { },
  • "attack_kwargs": { },
  • "adapter_name": ""

Response samples

Content type
  • "message": "string",
  • "task_id": "string"

Get Task Results

path Parameters
string (Task Id)
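
The prediction endpoints return a 'task_id', which suggests that results are fetched later through this endpoint. A rough polling sketch under that assumption follows. The fetch_result callable stands in for the actual HTTP call, and the 'status' and 'result' keys in the fake replies are purely hypothetical, since the response schema is not shown here.

```python
import time


def wait_for_result(fetch_result, task_id, is_done, interval=1.0, max_tries=30):
    """Call fetch_result(task_id) repeatedly until is_done(response) is true."""
    for _ in range(max_tries):
        response = fetch_result(task_id)
        if is_done(response):
            return response
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} not finished after {max_tries} tries")


# Stand-in for the real HTTP call: reports "pending" twice, then a result.
# The reply shape is invented for this demo.
_replies = iter([
    {"status": "pending"},
    {"status": "pending"},
    {"status": "done", "result": 42},
])


def fake_fetch(task_id):
    return next(_replies)


final = wait_for_result(fake_fetch, "abc", lambda r: r["status"] == "done", interval=0)
```

Passing the fetch function and the completion check as arguments keeps the loop independent of the real URL and response format, which would need to be filled in from the actual API.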


Response samples

Content type


Returns the statistics of the model as a ModelStatistics object.

path Parameters
string (Identifier)
string (Hf Username)


Response samples

Content type
  • "model_type": "string",
  • "model_name": "string",
  • "batch_size": 0,
  • "max_input": 0,
  • "model_class": "string",
  • "disable_gpu": true,
  • "return_plaintext_arrays": true,
  • "preloaded_adapters": true,
  • "transformers_cache": ".cache",
  • "model_path": "string",
  • "decoder_path": "string",
  • "onnx_use_quantized": true,
  • "is_encoder_decoder": true
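
For client code, the response fields above can be mirrored in a small typed container. The sketch below is one possible shape, assuming the field types shown in the sample; the ModelStatistics dataclass here is a client-side illustration, not the server's own class, and parse_statistics is a hypothetical helper.

```python
from dataclasses import dataclass, fields


@dataclass
class ModelStatistics:
    # Field names and types follow the response sample above.
    model_type: str
    model_name: str
    batch_size: int
    max_input: int
    model_class: str
    disable_gpu: bool
    return_plaintext_arrays: bool
    preloaded_adapters: bool
    transformers_cache: str
    model_path: str
    decoder_path: str
    onnx_use_quantized: bool
    is_encoder_decoder: bool


def parse_statistics(data):
    """Build ModelStatistics from a decoded JSON dict, ignoring unknown keys."""
    known = {f.name for f in fields(ModelStatistics)}
    return ModelStatistics(**{k: v for k, v in data.items() if k in known})


sample = {
    "model_type": "string", "model_name": "string", "batch_size": 0,
    "max_input": 0, "model_class": "string", "disable_gpu": True,
    "return_plaintext_arrays": True, "preloaded_adapters": True,
    "transformers_cache": ".cache", "model_path": "string",
    "decoder_path": "string", "onnx_use_quantized": True,
    "is_encoder_decoder": True,
}
stats = parse_statistics(sample)
```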


Returns the statistics of the model as a ModelStatistics object.

path Parameters
string (Identifier)
query Parameters
string (Hf Username)


Response samples

Content type
  • "model_type": "string",
  • "model_name": "string",
  • "batch_size": 0,
  • "max_input": 0,
  • "model_class": "string",
  • "disable_gpu": true,
  • "return_plaintext_arrays": true,
  • "preloaded_adapters": true,
  • "transformers_cache": ".cache",
  • "model_path": "string",
  • "decoder_path": "string",
  • "onnx_use_quantized": true,
  • "is_encoder_decoder": true


Updates the model with the parameters given in the request body. Not all parameters can be updated through this method; for example, the model class is linked to the model and therefore cannot be changed at runtime. Returns information about the updated model.

path Parameters
string (Identifier)
string (Hf Username)
Request Body schema: application/json
boolean (Disable Gpu)
integer (Batch Size)
integer (Max Input)
boolean (Return Plaintext Arrays)


Request samples

Content type
  • "disable_gpu": true,
  • "batch_size": 0,
  • "max_input": 0,
  • "return_plaintext_arrays": true
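
Since only these four parameters are accepted, a client may want to catch mistakes before sending. The following sketch shows one way to do that; build_update_body is a hypothetical helper, and the strict type checks are an assumption about what the server expects.

```python
# The four updatable parameters and their types, per the request schema above.
UPDATABLE = {
    "disable_gpu": bool,
    "batch_size": int,
    "max_input": int,
    "return_plaintext_arrays": bool,
}


def build_update_body(**params):
    """Hypothetical helper: validate parameters and return the update body."""
    body = {}
    for name, value in params.items():
        expected = UPDATABLE.get(name)
        if expected is None:
            raise KeyError(f"{name!r} cannot be updated at runtime")
        # bool is a subclass of int, so reject booleans for integer fields
        if (expected is int and isinstance(value, bool)) or not isinstance(value, expected):
            raise TypeError(f"{name!r} must be of type {expected.__name__}")
        body[name] = value
    return body


update_body = build_update_body(batch_size=16, disable_gpu=True)
```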

Response samples

Content type


Updates the model with the parameters given in the request body. Not all parameters can be updated through this method; for example, the model class is linked to the model and therefore cannot be changed at runtime. Returns information about the updated model.

path Parameters
string (Identifier)
query Parameters
string (Hf Username)
Request Body schema: application/json
boolean (Disable Gpu)
integer (Batch Size)
integer (Max Input)
boolean (Return Plaintext Arrays)


Request samples

Content type
  • "disable_gpu": true,
  • "batch_size": 0,
  • "max_input": 0,
  • "return_plaintext_arrays": true

Response samples

Content type