# Dedicated Update Endpoint

> Update a Friendli Dedicated Endpoint with a new model, GPU type, or replica count. Changes are applied as a new version in the deployment history.

Update a Dedicated Endpoint deployment with a new configuration.

Every request must include a **Personal API Key** (e.g. flp\_XXX) in the **Bearer Token** field.
See the [authentication section](/openapi/introduction#authentication) of our introduction page to learn how to obtain a key, or [generate one here](https://friendli.ai/suite/~/setting/keys).

<Info>
  This API is currently in **Beta**.
  While we strive to provide a stable and reliable experience, this feature is still under active development.
  As a result, you may encounter unexpected behavior or limitations.
  We encourage you to provide feedback to help us improve the feature before its official release.

  * [Feature request & feedback](mailto:support@friendli.ai)
  * [Contact support](mailto:support@friendli.ai)
</Info>
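
As an illustration of the authentication requirement above, the following is a minimal Python sketch that builds (without sending) an authenticated `PUT` request for this API using only the standard library. The helper name `build_update_request` and the example IDs are hypothetical; the server URL, path, bearer scheme, and optional `X-Friendli-Team` header are taken from the OpenAPI specification on this page.

```python
import json
import urllib.parse
import urllib.request

# Server URL as listed under `servers` in the OpenAPI specification.
API_BASE = "https://api.friendli.ai"


def build_update_request(endpoint_id, body, api_key, team_id=None):
    """Build a PUT request for /dedicated/beta/endpoint/{endpoint_id}.

    Hypothetical helper: it only constructs the request object and
    performs no network I/O.
    """
    headers = {
        # The Personal API Key (e.g. flp_XXX) is sent as a bearer token.
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if team_id is not None:
        # Optional header: ID of the team to run the request as.
        headers["X-Friendli-Team"] = team_id

    # Percent-encode the ID so it is always a single path segment.
    path_id = urllib.parse.quote(endpoint_id, safe="")
    return urllib.request.Request(
        url=f"{API_BASE}/dedicated/beta/endpoint/{path_id}",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="PUT",
    )


# Example with placeholder values; replace them with your own.
request = build_update_request(
    endpoint_id="my-endpoint-id",
    body={"newVersionComment": "Scale out for launch"},
    api_key="flp_XXX",
)

# To actually send the request (requires a valid key and endpoint ID):
# with urllib.request.urlopen(request) as response:
#     spec = json.load(response)  # JSON body of the 200 response
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before any change is applied to a live endpoint.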


## OpenAPI

````yaml https://github.com/friendliai/friendli-openapi/raw/refs/heads/main/openapi.yaml put /dedicated/beta/endpoint/{endpoint_id}
openapi: 3.1.0
info:
  title: Friendli Suite API Reference
  description: This is an OpenAPI reference of Friendli Suite API.
  termsOfService: https://friendli.ai/terms-of-service
  contact:
    name: FriendliAI Support Team
    email: support@friendli.ai
  version: 0.1.0
servers:
  - url: https://api.friendli.ai
security: []
tags:
  - name: Serverless.Chat
  - name: Serverless.ToolAssistedChat
  - name: Serverless.Messages
  - name: Serverless.ChatRender
  - name: Serverless.Completions
  - name: Serverless.Token
  - name: Serverless.Audio
  - name: Serverless.Model
  - name: Serverless.Knowledge
  - name: Dedicated.Chat
  - name: Dedicated.Messages
  - name: Dedicated.ChatRender
  - name: Dedicated.Completions
  - name: Dedicated.Embeddings
  - name: Dedicated.TextClassification
  - name: Dedicated.Token
  - name: Dedicated.Image
  - name: Dedicated.Audio
  - name: Dedicated.Endpoint
  - name: Container.Chat
  - name: Container.Messages
  - name: Container.Completions
  - name: Container.TextClassification
  - name: Container.Token
  - name: Container.Image
  - name: Container.Audio
  - name: Cost
  - name: Dataset
  - name: File
paths:
  /dedicated/beta/endpoint/{endpoint_id}:
    put:
      tags:
        - Dedicated.Endpoint
      summary: Update endpoint spec
      description: Update the specification of a specific endpoint
      operationId: dedicatedUpdateEndpoint
      parameters:
        - name: endpoint_id
          in: path
          required: true
          schema:
            type: string
            description: The ID of the endpoint
            title: Endpoint Id
          description: The ID of the endpoint
        - name: X-Friendli-Team
          in: header
          required: false
          schema:
            anyOf:
              - type: string
              - type: 'null'
            description: ID of team to run requests as (optional parameter).
            title: X-Friendli-Team
          description: ID of team to run requests as (optional parameter).
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/DedicatedEndpointUpdateBody'
      responses:
        '200':
          description: Successfully updated the endpoint specification.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/DedicatedEndpointSpec'
              examples:
                Example:
                  value:
                    name: endpoint-name
                    gpuType: NVIDIA H100
                    numGpu: 1
                    instanceId: instance-id
                    projectId: project-id
                    creatorId: creator-id
                    teamId: team-id
                    autoscalingMin: 0
                    autoscalingMax: 1
                    autoscalingCooldown: 300
                    maxBatchSize: 10
                    maxInputLength: 1024
                    tokenizerSkipSpecialTokens: true
                    tokenizerAddSpecialTokens: true
                    currReplicaCnt: 1
                    desiredReplicaCnt: 1
                    updatedReplicaCnt: 1
        '400':
          description: Bad Request
        '404':
          description: Not Found
        '422':
          description: Unprocessable Entity
      security:
        - token: []
components:
  schemas:
    DedicatedEndpointUpdateBody:
      properties:
        name:
          anyOf:
            - type: string
            - type: 'null'
          title: Name
          description: The name of the endpoint.
        advanced:
          anyOf:
            - $ref: '#/components/schemas/EndpointAdvancedConfig'
            - type: 'null'
          title: Advanced
          description: The advanced configuration of the endpoint.
        simplescale:
          anyOf:
            - $ref: '#/components/schemas/EndpointSimplescaleConfig'
            - type: 'null'
          title: Simple Scale
          description: The simple scaling configuration of the endpoint.
        autoscalingPolicy:
          anyOf:
            - $ref: '#/components/schemas/AutoscalingPolicy'
            - type: 'null'
          title: Auto Scale Policy
          description: The auto scaling configuration of the endpoint.
        hfModelRepo:
          anyOf:
            - type: string
            - type: 'null'
          title: HF Model Repo
          description: Hugging Face repository ID of the model.
        hfModelRepoRevision:
          anyOf:
            - type: string
            - type: 'null'
          title: HF Model Repo Revision
          description: Hugging Face commit hash of the model revision.
        newVersionComment:
          anyOf:
            - type: string
            - type: 'null'
          title: New Version Comment
          description: Comment for the new version.
        instanceOptionId:
          anyOf:
            - type: string
            - type: 'null'
          title: Instance Option ID
          description: |-
            The ID of the instance option.

            Available options:
            - 1x NVIDIA A100 80GB: `ShbPuOs4tfGb`
            - 2x NVIDIA A100 80GB: `mrAHuYt7T40o`
            - 4x NVIDIA A100 80GB: `JkNob0NMdoF3`
            - 8x NVIDIA A100 80GB: `sYH4kHmAcA5P`
            - 1x NVIDIA H100: `TwD5AqnBSVN0`
            - 2x NVIDIA H100: `zfTutSiLn0Hq`
            - 4x NVIDIA H100: `lfkRz5G48REc`
            - 8x NVIDIA H100: `GUA4qYFmsYz8`
            - 1x NVIDIA H200: `LnK1wTaKc7WO`
            - 2x NVIDIA H200: `Tu6GjBnfHPe4`
            - 4x NVIDIA H200: `OhTzYtZuomzI`
            - 8x NVIDIA H200: `ahBzWtOuomsI`
            - 1x NVIDIA B200: `8GiQTLKfJNOr`
            - 2x NVIDIA B200: `brTZGIuYgVrs`
            - 4x NVIDIA B200: `AFoZMFXZnAdD`
            - 8x NVIDIA B200: `drbc6G9FxJWZ`
      type: object
      title: DedicatedEndpointUpdateBody
      description: Dedicated endpoint update request.
    DedicatedEndpointSpec:
      properties:
        name:
          type: string
          title: Name
          description: The name of the endpoint.
        gpuType:
          type: string
          title: GPU Type
          description: The type of GPU to use for the endpoint.
        numGpu:
          type: integer
          title: Number of GPUs
          description: The number of GPUs to use per replica.
        instanceId:
          anyOf:
            - type: string
            - type: 'null'
          title: Instance ID
          description: The ID of the instance.
        projectId:
          type: string
          title: Project ID
          description: The ID of the project that owns the endpoint.
        creatorId:
          type: string
          title: Creator ID
          description: The ID of the user who created the endpoint.
        teamId:
          type: string
          title: Team ID
          description: The ID of the team that owns the endpoint.
        autoscalingMin:
          type: integer
          title: Minimum Replicas
          description: The minimum number of replicas to maintain.
        autoscalingMax:
          type: integer
          title: Maximum Replicas
          description: The maximum number of replicas allowed.
        autoscalingCooldown:
          type: integer
          title: Autoscaling Cooldown
          description: The cooldown period in seconds between scaling operations.
        maxBatchSize:
          type: integer
          title: Maximum Batch Size
          description: The maximum batch size for inference requests.
        maxInputLength:
          anyOf:
            - type: integer
            - type: 'null'
          title: Maximum Input Length
          description: The maximum allowed input length.
        tokenizerSkipSpecialTokens:
          type: boolean
          title: Skip Special Tokens
          description: Whether to skip special tokens in tokenizer output.
        tokenizerAddSpecialTokens:
          type: boolean
          title: Add Special Tokens
          description: Whether to add special tokens in tokenizer input.
        currReplicaCnt:
          anyOf:
            - type: integer
            - type: 'null'
          title: Current Replica Count
          description: The current number of replicas.
        desiredReplicaCnt:
          anyOf:
            - type: integer
            - type: 'null'
          title: Desired Replica Count
          description: The desired number of replicas.
        updatedReplicaCnt:
          anyOf:
            - type: integer
            - type: 'null'
          title: Updated Replica Count
          description: The updated number of replicas.
      type: object
      required:
        - name
        - gpuType
        - numGpu
        - projectId
        - creatorId
        - teamId
        - autoscalingMin
        - autoscalingMax
        - autoscalingCooldown
        - maxBatchSize
        - tokenizerSkipSpecialTokens
        - tokenizerAddSpecialTokens
      title: DedicatedEndpointSpec
      description: Dedicated endpoint specification.
    EndpointAdvancedConfig:
      properties:
        max_batch_size:
          anyOf:
            - type: integer
            - type: 'null'
          title: Max Batch Size
        tokenizer_skip_special_tokens:
          type: boolean
          title: Tokenizer Skip Special Tokens
        tokenizer_add_special_tokens:
          type: boolean
          title: Tokenizer Add Special Tokens
        max_token_count:
          type: integer
          title: Max Token Count
          default: 2560
        enable_content_logging:
          anyOf:
            - type: boolean
            - type: 'null'
          title: Enable Content Logging
        max_input_length:
          anyOf:
            - type: integer
            - type: 'null'
          title: Max Input Length
      type: object
      required:
        - tokenizer_skip_special_tokens
        - tokenizer_add_special_tokens
      title: EndpointAdvancedConfig
      description: Endpoint advanced config.
    EndpointSimplescaleConfig:
      properties:
        replicas:
          type: integer
          minimum: 1
          title: Replicas
      type: object
      required:
        - replicas
      title: EndpointSimplescaleConfig
      description: Simple scaling options.
    AutoscalingPolicy:
      properties:
        minReplica:
          type: integer
          minimum: 0
          title: Minimum Replica
          description: >-
            Setting `minReplica` to 0 allows the endpoint to sleep when idle,
            reducing costs. The minimum value is 0.
          default: 0
        maxReplica:
          type: integer
          maximum: 10
          title: Maximum Replica
          description: >-
            The maximum number of replicas that the endpoint can scale up to.
            The maximum value is 10.
          default: 1
        cooldownPeriod:
          type: integer
          title: Cooldown Period
          description: >-
            Determines how long the endpoint waits before scaling down after the
            last request.
          default: 300
      type: object
      title: AutoscalingPolicy
      description: Autoscaling policy.
  securitySchemes:
    token:
      type: http
      description: >-
        When using Friendli Suite API for inference requests, you need to
        provide a **Personal API Key** for authentication and authorization
        purposes.


        For more detailed information, see the
        [authentication section](https://friendli.ai/docs/openapi/introduction#authentication).
      scheme: bearer

````
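
The `DedicatedEndpointUpdateBody` and `AutoscalingPolicy` schemas above can be exercised with a small Python sketch that assembles a request body and checks the documented bounds (`minReplica` at least 0, `maxReplica` at most 10) before anything is sent. The helper names are hypothetical. The sketch also assumes that fields omitted from the body are left unchanged by the server; the schema marks no property as required, but it does not state this behavior explicitly.

```python
# Bounds taken from the AutoscalingPolicy schema.
MIN_REPLICA_FLOOR = 0
MAX_REPLICA_CEILING = 10


def build_autoscaling_policy(min_replica=0, max_replica=1, cooldown_period=300):
    """Return an AutoscalingPolicy dict after checking the schema bounds.

    Defaults mirror the schema defaults (0, 1, and 300).
    """
    if min_replica < MIN_REPLICA_FLOOR:
        raise ValueError(f"minReplica must be >= {MIN_REPLICA_FLOOR}")
    if max_replica > MAX_REPLICA_CEILING:
        raise ValueError(f"maxReplica must be <= {MAX_REPLICA_CEILING}")
    return {
        "minReplica": min_replica,
        "maxReplica": max_replica,
        "cooldownPeriod": cooldown_period,
    }


def build_update_body(**fields):
    """Keep only the fields that were given a value.

    Assumption (not confirmed by the schema): omitted fields are left
    unchanged, so None values are dropped instead of being sent as null.
    """
    return {key: value for key, value in fields.items() if value is not None}


body = build_update_body(
    # Instance option ID for "1x NVIDIA H100" from the list in the schema.
    instanceOptionId="TwD5AqnBSVN0",
    autoscalingPolicy=build_autoscaling_policy(min_replica=0, max_replica=2),
    newVersionComment="Move to 1x H100 and allow two replicas",
    hfModelRepo=None,  # not changing the model, so this key is dropped
)
```

Validating locally is only a convenience: the server remains the authority and may still reject a body with a `400` or `422` response.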