Integrations

Hugging Face

  1. Log in to Hugging Face, then navigate to Access Tokens.
  2. Create a new token. You may use a fine-grained token; in that case, make sure the token has view permission for the repository you'd like to use.
  3. Integrate the key in Friendli Suite → Personal settings → Integrations.
If you revoke or invalidate the key, update it in Friendli Suite so that ongoing deployments are not disrupted and new inference deployments can be launched.
Weights & Biases (W&B)

  1. Log in to your W&B account at the authorization page, then navigate to User Settings and scroll to the API Keys section.
  2. Create or copy an API key.
  3. Integrate the key in Friendli Suite → Personal settings → Integrations.
If you revoke or invalidate the key, update it in Friendli Suite so that ongoing deployments are not disrupted and new inference deployments can be launched.

Using 3rd-party models

W&B artifact as a model
  • Make sure to use the full name of the artifact.
  • The artifact name must be in the format org/project/artifact_id:version.
  1. Install the CLI and log in with your API key. See the W&B CLI documentation for details.
  2. Upload the model as a W&B artifact using the command below:
wandb artifact put -n project/artifact_id --type model /path/to/dir
  3. After uploading, the artifact will be visible in your W&B project.
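The full-name requirement above can be sanity-checked locally. The helper below is a hypothetical sketch (not part of the W&B or Friendli APIs) that verifies a string matches the org/project/artifact_id:version shape:

```python
import re

# Hypothetical helper: checks that an artifact reference has all four parts
# (org, project, artifact_id, version). Illustrative only.
ARTIFACT_RE = re.compile(r"^[^/:]+/[^/:]+/[^/:]+:[^/:]+$")

def is_full_artifact_name(name: str) -> bool:
    """Return True if `name` looks like org/project/artifact_id:version."""
    return bool(ARTIFACT_RE.match(name))

print(is_full_artifact_name("my-org/my-project/my-model:v3"))  # True
print(is_full_artifact_name("my-project/my-model:v3"))         # False: org is missing
```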
HF artifact as a model
  • Use the repository id of the model. You may select the entry from the list of autocompleted model repositories.
  • You may choose a specific branch, or manually enter a commit hash.

Format Requirements

  • A model should be in safetensors format.
  • The model should NOT be nested inside another directory.
  • Including other files not listed below is fine; however, they will not be downloaded or used.
Required | Filename | Description
--- | --- | ---
Yes | safetensors | Model weights, e.g. model.safetensors. Use model.safetensors.index.json for split safetensors files.
Yes | config.json | Model config that includes the architecture. (Supported Models on Friendli)
No | tokenizer.json | Tokenizer for the model.
No | tokenizer_config.json | Tokenizer config. This should be present and have a chat_template field for the Friendli Engine to provide chat APIs.
No | special_tokens_map.json | Special tokens map for the tokenizer.
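The file requirements above can be sketched as a quick local check before uploading. This is an illustrative helper only, not the validation Friendli actually performs:

```python
import json
from pathlib import Path

def check_model_dir(model_dir: str) -> list[str]:
    """Return a list of problems found in a local model directory.

    Sketch of the format requirements above; the actual validation
    performed by Friendli may differ.
    """
    root = Path(model_dir)
    problems = []

    # Weights must be safetensors, at the top level (not nested in a subdirectory).
    has_weights = any(root.glob("*.safetensors")) or (root / "model.safetensors.index.json").exists()
    if not has_weights:
        problems.append("no top-level .safetensors weights found")

    # config.json (with the architecture) is required.
    if not (root / "config.json").exists():
        problems.append("missing required file: config.json")

    # chat_template in tokenizer_config.json is needed for the chat APIs.
    tc = root / "tokenizer_config.json"
    if tc.exists() and "chat_template" not in json.loads(tc.read_text()):
        problems.append("tokenizer_config.json has no chat_template field")

    return problems
```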
The dataset should satisfy the following conditions:
  1. The dataset must contain a column named “messages”.
  2. Each row in the “messages” column should be compatible with the chat template of the base model. For example, the tokenizer_config.json of mistralai/Mistral-7B-Instruct-v0.2 defines a template that alternates user and assistant messages. Concretely, each row in the “messages” field should follow a format like: [{"role": "user", "content": "The 1st user's message"}, {"role": "assistant", "content": "The 1st assistant's message"}]. HuggingFaceH4/ultrachat_200k is an example of a dataset compatible with this chat template.
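A row of the “messages” column can be sanity-checked with a small validator. This is an illustrative sketch assuming a template that alternates user/assistant turns starting with user (as in the Mistral example above); the authoritative check is the base model's own chat template:

```python
def is_valid_messages_row(messages) -> bool:
    """Check one row of the "messages" column: a non-empty list of dicts
    with "role" and "content" keys, alternating user/assistant turns
    starting with user. Other templates (e.g. ones that allow a system
    message) may accept different shapes."""
    if not isinstance(messages, list) or not messages:
        return False
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict) or not {"role", "content"} <= set(msg):
            return False
        expected = "user" if i % 2 == 0 else "assistant"
        if msg["role"] != expected:
            return False
    return True

row = [
    {"role": "user", "content": "The 1st user's message"},
    {"role": "assistant", "content": "The 1st assistant's message"},
]
print(is_valid_messages_row(row))  # True
```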

Troubleshooting

Inference Request Errors

Below is a table of common error codes you might encounter when making inference-related API requests.
Code | Name | Cause | Suggested Solution
--- | --- | --- | ---
400 | Bad Request | The request is malformed or missing required fields. | Check your request payload. Ensure it is valid JSON with all required fields.
401 | Unauthorized | Missing or invalid API key. The request lacks proper authentication. | Include a valid Friendli token in the Authorization header. Verify the token is active and correct.
403 | Forbidden | The API key is valid but does not have permission to access the endpoint. | Ensure your token has access rights to the endpoint. Use the correct team token or add the X-Friendli-Team header if needed.
404 | Not Found | The specified endpoint or resource does not exist. This typically occurs when the endpoint_id or team_id is invalid. | Verify the endpoint_id and model name in your request. Ensure they match an existing, non-deleted deployment. Also check for typos in your endpoint ID or team ID.
422 | Unprocessable Entity | The request is syntactically correct but semantically invalid (e.g. exceeding token limits, invalid parameter values). | Adjust your request (e.g. reduce max_tokens, correct parameter values) and try again.
429 | Too Many Requests | You have exceeded rate limits for your plan. | Reduce request frequency or upgrade your plan for higher limits. Wait before retrying after a 429 error.
500 | Internal Server Error | A server-side error occurred while processing the request. | Retry the request after a short delay. If the error persists, check endpoint health in the overview dashboard or contact FriendliAI support.
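The headers the table refers to can be set as follows. This sketch uses Python's standard urllib; the URL, token, and team ID are placeholders (check your endpoint's actual address), and X-Friendli-Team is only needed for team resources:

```python
import json
import urllib.request

# Placeholder values -- substitute your own endpoint URL, token, and team ID.
url = "https://api.friendli.ai/v1/chat/completions"  # assumed URL shape, not authoritative
payload = {
    "model": "YOUR_ENDPOINT_ID",
    "messages": [{"role": "user", "content": "Hello!"}],
}
req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer YOUR_FRIENDLI_TOKEN",  # missing/invalid -> 401
        "X-Friendli-Team": "YOUR_TEAM_ID",              # wrong/missing team -> 403
        "Content-Type": "application/json",             # malformed payload -> 400
    },
    method="POST",
)
# urllib.request.urlopen(req) would send it; building the request is local.
```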

Quick checklist before retrying

  • Verify the endpoint URL, endpoint_id, and (if applicable) X-Friendli-Team header
  • Include the Authorization header with a valid token
  • Confirm the target deployment exists, is healthy, and is not deleted
  • Validate request JSON and required fields; reduce max_tokens if needed
  • Check rate limits; add retry with backoff when receiving 429
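For the 429 case, retry with exponential backoff can be sketched as below. `send` is any zero-argument callable returning an HTTP status code, so you can wrap whichever HTTP client you use; this is an illustrative pattern, not a Friendli SDK function:

```python
import time

def send_with_backoff(send, max_retries=4, base_delay=1.0):
    """Call `send()` and retry on 429, doubling the delay each attempt.

    `send` wraps your actual HTTP request and returns its status code.
    Illustrative sketch only.
    """
    for attempt in range(max_retries + 1):
        status = send()
        if status != 429:
            return status
        if attempt < max_retries:
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, 8s, ...
    return status
```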

Model Selection Errors

  • Can't access: The artifact might be nonexistent, or hidden so that you cannot access it.
  • No access: The repository is gated. Please follow the steps and gain approval from the owner using Hugging Face Hub.
  • Invalid repo / invalid artifact: The model does not meet the requirements. Please check that the model follows a correct safetensors format. See the format requirements for details.
  • Unsupported: The model architecture is not supported. Please refer to the Supported Models page.
This page may not cover all cases. If your issue persists, contact support.