Authorization
SkyServe provides authorization at the replica level, allowing you to control access to service endpoints with API keys.
Setting Up API Keys
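SkyServe relies on the authorization of the service running on the underlying service replicas, for example an inference engine. We take the vLLM inference engine as an example; it supports static API key authorization through the --api-key argument.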
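We define a SkyServe service spec that serves a Llama-3 chatbot with vLLM and an API key. In the example YAML below, the authorization token is defined as the environment variable AUTH_TOKEN. It is passed to the service section so that the readiness_probe can access the replicas, and to the vllm entrypoint so that the service on each replica starts with the API key.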
# auth.yaml
envs:
  MODEL_NAME: meta-llama/Meta-Llama-3-8B-Instruct
  HF_TOKEN: # TODO: Fill with your own huggingface token, or use --env to pass.
  AUTH_TOKEN: # TODO: Fill with your own auth token (a random string), or use --env to pass.
service:
  readiness_probe:
    path: /v1/models
    headers:
      Authorization: Bearer $AUTH_TOKEN
  replicas: 2
resources:
  accelerators: {L4, A10g, A10, L40, A40, A100, A100-80GB}
  ports: 8000
setup: |
  pip install vllm==0.4.0.post1 flash-attn==2.5.7 gradio openai
  # python -c "import huggingface_hub; huggingface_hub.login('${HF_TOKEN}')"
run: |
  export PATH=$PATH:/sbin
  python -m vllm.entrypoints.openai.api_server \
    --model $MODEL_NAME --trust-remote-code \
    --gpu-memory-utilization 0.95 \
    --host 0.0.0.0 --port 8000 \
    --api-key $AUTH_TOKEN
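The AUTH_TOKEN in auth.yaml is described only as "a random string". As a minimal sketch, one way to produce such a string is Python's standard secrets module. The helper name generate_auth_token and the 32-byte default are assumptions made for illustration, not something SkyServe or vLLM requires.

```python
import secrets


def generate_auth_token(num_bytes: int = 32) -> str:
    """Return a URL-safe random string usable as a static API key.

    Hypothetical helper: the name and the default length are
    illustrative choices, not part of SkyServe or vLLM.
    """
    # token_urlsafe draws from the OS cryptographic random source and
    # returns Base64 (URL-safe alphabet) text without padding.
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Print a fresh token, e.g. to export as AUTH_TOKEN before deploying.
    print(generate_auth_token())
```

The printed value can then be supplied as AUTH_TOKEN, either in the YAML or through --env.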
To deploy the service, run the following command:
HF_TOKEN=xxx AUTH_TOKEN=yyy sky serve up auth.yaml -n auth --env HF_TOKEN --env AUTH_TOKEN
To send a request to the service endpoint, the service client must include the static API key in a request header:
$ ENDPOINT=$(sky serve status --endpoint auth)
$ AUTH_TOKEN=yyy
$ curl http://$ENDPOINT/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $AUTH_TOKEN" \
    -d '{
      "model": "meta-llama/Meta-Llama-3-8B-Instruct",
      "messages": [
        {
          "role": "system",
          "content": "You are a helpful assistant."
        },
        {
          "role": "user",
          "content": "Who are you?"
        }
      ],
      "stop_token_ids": [128009, 128001]
    }' | jq
Example output:
{
  "id": "cmpl-cad2c1a2a6ee44feabed0b28be294d6f",
  "object": "chat.completion",
  "created": 1716819147,
  "model": "meta-llama/Meta-Llama-3-8B-Instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "I'm so glad you asked! I'm LLaMA, an AI assistant developed by Meta AI that can understand and respond to human input in a conversational manner. I'm here to help you with any questions, tasks, or topics you'd like to discuss. I can provide information on a wide range of subjects, from science and history to entertainment and culture. I can also assist with language-related tasks such as language translation, text summarization, and even writing and proofreading. My goal is to provide accurate and helpful responses to your inquiries, while also being friendly and engaging. So, what's on your mind? How can I assist you today?"
      },
      "logprobs": null,
      "finish_reason": "stop",
      "stop_reason": 128009
    }
  ],
  "usage": {
    "prompt_tokens": 26,
    "total_tokens": 160,
    "completion_tokens": 134
  }
}
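The same authenticated request can be built from Python. The sketch below uses only the standard library and mirrors the curl call: a JSON body posted to /v1/chat/completions with the API key in the Authorization header. The function name build_chat_request and its parameters are hypothetical, chosen for illustration, and the plain http:// scheme is assumed from the curl example.

```python
import json
import urllib.request


def build_chat_request(
    endpoint: str, auth_token: str, model: str, user_message: str
) -> urllib.request.Request:
    """Build (but do not send) a chat completion request with a Bearer token.

    Hypothetical helper mirroring the curl example; `endpoint` is the
    host:port string returned by `sky serve status --endpoint auth`.
    """
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
        "stop_token_ids": [128009, 128001],
    }
    return urllib.request.Request(
        url=f"http://{endpoint}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # The static API key travels as a Bearer token.
            "Authorization": f"Bearer {auth_token}",
        },
        method="POST",
    )


# Sending the request requires a running service, for example:
#   req = build_chat_request(endpoint, auth_token,
#                            "meta-llama/Meta-Llama-3-8B-Instruct", "Who are you?")
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp))
```

Separating request construction from sending makes it easy to confirm that the Authorization header is present before anything goes over the network.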
A service client without the API key cannot access the service and receives a 401 Unauthorized error:
$ curl http://$ENDPOINT/v1/models
{"error": "Unauthorized"}
$ curl http://$ENDPOINT/v1/models -H "Authorization: Bearer random-string"
{"error": "Unauthorized"}
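Both rejected requests fail the same kind of check: one has no Authorization header, the other carries the wrong token. The following is an illustrative sketch of a static Bearer-token check, not vLLM's actual implementation. The function name is_authorized is hypothetical, and the constant-time comparison is a common practice assumed here rather than something the source states.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str], expected_token: str) -> bool:
    """Return True only if the header is exactly 'Bearer <expected_token>'.

    Illustrative only: this is NOT vLLM's code, just a sketch of how a
    static API key check can behave.
    """
    if authorization_header is None:
        # No header at all: the first rejected curl request.
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme != "Bearer":
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(token.encode("utf-8"), expected_token.encode("utf-8"))
```

With expected_token set to yyy, a missing header and "Bearer random-string" are both rejected, while "Bearer yyy" is accepted.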