Endpoints
happy_vLLM provides several endpoints which cover most of the use cases. Feel free to open an issue or a PR if you would like to add an endpoint. All these endpoints (except for the /metrics
endpoint) are prefixed by, well, a prefix which by default is absent.
Technical endpoints
/v1/info (GET)
Provides information on the API and the model (more details here)
/metrics (GET)
The technical metrics obtained for prometheus (more details here)
/liveness (GET)
The liveness endpoint (more details here)
/readiness (GET)
The readiness endpoint (more details here)
/v1/models (GET)
The Open AI compatible endpoint used, for example, to get the name of the model. Mimicks the vLLM implementation (more details here)
/v1/launch_arguments (GET)
Gives all the arguments used when launching the application. --with-launch-arguments
must be activated (more details here)
Generating endpoints
/v1/completions and /v1/chat/completions (POST)
These two endpoints mimick the ones of vLLM. They follow the Open AI contract and you can find more details in the vLLM documentation
/v1/abort_request (POST)
Aborts a running request
Tokenizer endpoints
/v1/tokenizer (POST)
Deprecated
Used to tokenizer a text (more details here)
/v2/tokenizer (POST)
Used to tokenizer a text (more details here)
/v1/decode (POST)
Deprecated
Used to decode a list of token ids (more details here)
/v2/decode (POST)
Used to decode a list of token ids (more details here)
Data manipulation endpoints
/v1/metadata_text (POST)
Used to know which part of a prompt will be truncated (more details here)
/v1/split_text (POST)
Splits a text on some separators, for example to prepare for some RAG (more details here)
Embeddings endpoint
/v1/embeddings
Used to obtain the embeddings of a text (more details here)
Lora endpoints
/v1/load_lora_adapter (POST)
Load a specific Lora adapter (more details in vLLM documentation)
/v1/unload_lora_adapter (POST)
Unload a Lora adapter (more details in vLLM documentation)
Transcription endpoints
/v1/audio/transcriptions
Used to obtain the transcription of an audio file (more details here)