Requirement already satisfied: pip in ./.venv/lib/python3.12/site-packages (24.0) Collecting pip Downloading pip-26.0.1-py3-none-any.whl.metadata (4.7 kB) Downloading pip-26.0.1-py3-none-any.whl (1.8 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 56.0 MB/s eta 0:00:00 Installing collected packages: pip Attempting uninstall: pip Found existing installation: pip 24.0 Uninstalling pip-24.0: Successfully uninstalled pip-24.0 Successfully installed pip-26.0.1 Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu128 Collecting vllm Downloading vllm-0.17.1-cp38-abi3-manylinux_2_31_x86_64.whl.metadata (9.8 kB) Collecting regex (from vllm) Downloading regex-2026.2.28-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (40 kB) Collecting cachetools (from vllm) Downloading cachetools-7.0.5-py3-none-any.whl.metadata (5.6 kB) Collecting psutil (from vllm) Downloading psutil-7.2.2-cp36-abi3-manylinux2010_x86_64.manylinux_2_12_x86_64.manylinux_2_28_x86_64.whl.metadata (22 kB) Collecting sentencepiece (from vllm) Downloading sentencepiece-0.2.1-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (10 kB) Collecting numpy (from vllm) Downloading numpy-2.4.3-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (6.6 kB) Collecting requests>=2.26.0 (from vllm) Downloading requests-2.32.5-py3-none-any.whl.metadata (4.9 kB) Collecting tqdm (from vllm) Downloading tqdm-4.67.3-py3-none-any.whl.metadata (57 kB) Collecting blake3 (from vllm) Downloading blake3-1.0.8-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.6 kB) Collecting py-cpuinfo (from vllm) Downloading py_cpuinfo-9.0.0-py3-none-any.whl.metadata (794 bytes) Collecting transformers<5,>=4.56.0 (from vllm) Downloading transformers-4.57.6-py3-none-any.whl.metadata (43 kB) Collecting tokenizers>=0.21.1 (from vllm) Downloading tokenizers-0.22.2-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata 
(7.3 kB) Collecting protobuf!=6.30.*,!=6.31.*,!=6.32.*,!=6.33.0.*,!=6.33.1.*,!=6.33.2.*,!=6.33.3.*,!=6.33.4.*,>=5.29.6 (from vllm) Downloading protobuf-7.34.0-cp310-abi3-manylinux2014_x86_64.whl.metadata (595 bytes) Collecting fastapi>=0.115.0 (from fastapi[standard]>=0.115.0->vllm) Downloading fastapi-0.135.1-py3-none-any.whl.metadata (30 kB) Collecting aiohttp>=3.13.3 (from vllm) Downloading aiohttp-3.13.3-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (8.1 kB) Collecting openai<2.25.0,>=1.99.1 (from vllm) Downloading openai-2.24.0-py3-none-any.whl.metadata (29 kB) Collecting pydantic>=2.12.0 (from vllm) Downloading pydantic-2.12.5-py3-none-any.whl.metadata (90 kB) Collecting prometheus_client>=0.18.0 (from vllm) Downloading prometheus_client-0.24.1-py3-none-any.whl.metadata (2.1 kB) Collecting pillow (from vllm) Downloading pillow-12.1.1-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (8.8 kB) Collecting prometheus-fastapi-instrumentator>=7.0.0 (from vllm) Downloading prometheus_fastapi_instrumentator-7.1.0-py3-none-any.whl.metadata (13 kB) Collecting tiktoken>=0.6.0 (from vllm) Downloading tiktoken-0.12.0-cp312-cp312-manylinux_2_28_x86_64.whl.metadata (6.7 kB) Collecting lm-format-enforcer==0.11.3 (from vllm) Downloading lm_format_enforcer-0.11.3-py3-none-any.whl.metadata (17 kB) Collecting llguidance<1.4.0,>=1.3.0 (from vllm) Downloading llguidance-1.3.0-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (10 kB) Collecting outlines_core==0.2.11 (from vllm) Downloading outlines_core-0.2.11-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (5.8 kB) Collecting diskcache==5.6.3 (from vllm) Downloading diskcache-5.6.3-py3-none-any.whl.metadata (20 kB) Collecting lark==1.2.2 (from vllm) Downloading lark-1.2.2-py3-none-any.whl.metadata (1.8 kB) Collecting xgrammar==0.1.29 (from vllm) Downloading 
xgrammar-0.1.29-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.8 kB) Collecting typing_extensions>=4.10 (from vllm) Downloading https://download.pytorch.org/whl/typing_extensions-4.15.0-py3-none-any.whl.metadata (3.3 kB) Collecting filelock>=3.16.1 (from vllm) Downloading filelock-3.25.2-py3-none-any.whl.metadata (2.0 kB) Collecting partial-json-parser (from vllm) Downloading partial_json_parser-0.2.1.1.post7-py3-none-any.whl.metadata (6.1 kB) Collecting pyzmq>=25.0.0 (from vllm) Downloading pyzmq-27.1.0-cp312-abi3-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl.metadata (6.0 kB) Collecting msgspec (from vllm) Downloading msgspec-0.20.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (5.5 kB) Collecting gguf>=0.17.0 (from vllm) Downloading gguf-0.18.0-py3-none-any.whl.metadata (4.5 kB) Collecting mistral_common>=1.9.1 (from mistral_common[image]>=1.9.1->vllm) Downloading mistral_common-1.10.0-py3-none-any.whl.metadata (5.6 kB) Collecting opencv-python-headless>=4.13.0 (from vllm) Downloading opencv_python_headless-4.13.0.92-cp37-abi3-manylinux_2_28_x86_64.whl.metadata (19 kB) Collecting pyyaml (from vllm) Downloading pyyaml-6.0.3-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (2.4 kB) Collecting six>=1.16.0 (from vllm) Downloading six-1.17.0-py2.py3-none-any.whl.metadata (1.7 kB) Collecting setuptools<81.0.0,>=77.0.3 (from vllm) Downloading setuptools-80.10.2-py3-none-any.whl.metadata (6.6 kB) Collecting einops (from vllm) Downloading einops-0.8.2-py3-none-any.whl.metadata (13 kB) Collecting compressed-tensors==0.13.0 (from vllm) Downloading compressed_tensors-0.13.0-py3-none-any.whl.metadata (7.0 kB) Collecting depyf==0.20.0 (from vllm) Downloading depyf-0.20.0-py3-none-any.whl.metadata (7.3 kB) Collecting cloudpickle (from vllm) Downloading cloudpickle-3.1.2-py3-none-any.whl.metadata (7.1 kB) Collecting watchfiles (from vllm) Downloading 
watchfiles-1.1.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.9 kB) Collecting python-json-logger (from vllm) Downloading python_json_logger-4.0.0-py3-none-any.whl.metadata (4.0 kB) Collecting ninja (from vllm) Downloading ninja-1.13.0-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (5.1 kB) Collecting pybase64 (from vllm) Downloading pybase64-1.4.3-cp312-cp312-manylinux1_x86_64.manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_5_x86_64.whl.metadata (8.7 kB) Collecting cbor2 (from vllm) Downloading cbor2-5.8.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (5.4 kB) Collecting ijson (from vllm) Downloading ijson-3.5.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (23 kB) Collecting setproctitle (from vllm) Downloading setproctitle-1.3.7-cp312-cp312-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl.metadata (10 kB) Collecting openai-harmony>=0.0.3 (from vllm) Downloading openai_harmony-0.0.8-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (8.0 kB) Collecting anthropic>=0.71.0 (from vllm) Downloading anthropic-0.86.0-py3-none-any.whl.metadata (3.0 kB) Collecting model-hosting-container-standards<1.0.0,>=0.1.13 (from vllm) Downloading model_hosting_container_standards-0.1.14-py3-none-any.whl.metadata (24 kB) Collecting mcp (from vllm) Downloading mcp-1.26.0-py3-none-any.whl.metadata (89 kB) Collecting grpcio (from vllm) Downloading grpcio-1.78.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (3.8 kB) Collecting grpcio-reflection (from vllm) Downloading grpcio_reflection-1.78.0-py3-none-any.whl.metadata (1.2 kB) Collecting opentelemetry-sdk>=1.27.0 (from vllm) Downloading opentelemetry_sdk-1.40.0-py3-none-any.whl.metadata (1.6 kB) Collecting opentelemetry-api>=1.27.0 (from vllm) Downloading opentelemetry_api-1.40.0-py3-none-any.whl.metadata (1.5 kB) Collecting 
opentelemetry-exporter-otlp>=1.27.0 (from vllm) Downloading opentelemetry_exporter_otlp-1.40.0-py3-none-any.whl.metadata (2.4 kB) Collecting opentelemetry-semantic-conventions-ai>=0.4.1 (from vllm) Downloading opentelemetry_semantic_conventions_ai-0.4.15-py3-none-any.whl.metadata (998 bytes) Collecting kaldi-native-fbank>=1.18.7 (from vllm) Downloading kaldi_native_fbank-1.22.3-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (3.3 kB) Collecting numba==0.61.2 (from vllm) Downloading numba-0.61.2-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (2.8 kB) Collecting ray>=2.48.0 (from ray[cgraph]>=2.48.0->vllm) Downloading ray-2.54.0-cp312-cp312-manylinux2014_x86_64.whl.metadata (21 kB) Collecting torch==2.10.0 (from vllm) Downloading https://download.pytorch.org/whl/cu128/torch-2.10.0%2Bcu128-cp312-cp312-manylinux_2_28_x86_64.whl.metadata (30 kB) Collecting torchaudio==2.10.0 (from vllm) Downloading https://download-r2.pytorch.org/whl/cu128/torchaudio-2.10.0%2Bcu128-cp312-cp312-manylinux_2_28_x86_64.whl.metadata (6.9 kB) Collecting torchvision==0.25.0 (from vllm) Downloading https://download-r2.pytorch.org/whl/cu128/torchvision-0.25.0%2Bcu128-cp312-cp312-manylinux_2_28_x86_64.whl.metadata (5.4 kB) Collecting flashinfer-python==0.6.4 (from vllm) Downloading flashinfer_python-0.6.4-py3-none-any.whl.metadata (10 kB) Collecting nvidia-cudnn-frontend<1.19.0,>=1.13.0 (from vllm) Downloading nvidia_cudnn_frontend-1.18.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (4.4 kB) Collecting nvidia-cutlass-dsl>=4.4.0.dev1 (from vllm) Downloading nvidia_cutlass_dsl-4.4.2-py3-none-any.whl.metadata (2.7 kB) Collecting quack-kernels>=0.2.7 (from vllm) Downloading quack_kernels-0.3.4-py3-none-any.whl.metadata (528 bytes) Collecting loguru (from compressed-tensors==0.13.0->vllm) Downloading loguru-0.7.3-py3-none-any.whl.metadata (22 kB) Collecting astor (from depyf==0.20.0->vllm) Downloading 
astor-0.8.1-py2.py3-none-any.whl.metadata (4.2 kB) Collecting dill (from depyf==0.20.0->vllm) Downloading dill-0.4.1-py3-none-any.whl.metadata (10 kB) Collecting apache-tvm-ffi!=0.1.8,!=0.1.8.post0,<0.2,>=0.1.6 (from flashinfer-python==0.6.4->vllm) Downloading apache_tvm_ffi-0.1.9-cp312-abi3-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.metadata (6.1 kB) Collecting click (from flashinfer-python==0.6.4->vllm) Downloading click-8.3.1-py3-none-any.whl.metadata (2.6 kB) Collecting nvidia-ml-py (from flashinfer-python==0.6.4->vllm) Downloading nvidia_ml_py-13.590.48-py3-none-any.whl.metadata (9.8 kB) Collecting packaging>=24.2 (from flashinfer-python==0.6.4->vllm) Downloading packaging-26.0-py3-none-any.whl.metadata (3.3 kB) Collecting tabulate (from flashinfer-python==0.6.4->vllm) Downloading tabulate-0.10.0-py3-none-any.whl.metadata (40 kB) Collecting interegular>=0.3.2 (from lm-format-enforcer==0.11.3->vllm) Downloading interegular-0.3.3-py37-none-any.whl.metadata (3.0 kB) Collecting llvmlite<0.45,>=0.44.0dev0 (from numba==0.61.2->vllm) Downloading llvmlite-0.44.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (5.0 kB) Collecting numpy (from vllm) Downloading numpy-2.2.6-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (62 kB) Collecting sympy>=1.13.3 (from torch==2.10.0->vllm) Downloading sympy-1.14.0-py3-none-any.whl.metadata (12 kB) Collecting networkx>=2.5.1 (from torch==2.10.0->vllm) Downloading networkx-3.6.1-py3-none-any.whl.metadata (6.8 kB) Collecting jinja2 (from torch==2.10.0->vllm) Downloading https://download.pytorch.org/whl/jinja2-3.1.6-py3-none-any.whl.metadata (2.9 kB) Collecting fsspec>=0.8.5 (from torch==2.10.0->vllm) Downloading fsspec-2026.2.0-py3-none-any.whl.metadata (10 kB) Collecting cuda-bindings==12.9.4 (from torch==2.10.0->vllm) Downloading https://download.pytorch.org/whl/cu128/cuda_bindings-12.9.4-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.metadata (2.6 kB) Collecting 
nvidia-cuda-nvrtc-cu12==12.8.93 (from torch==2.10.0->vllm) Downloading https://pypi.nvidia.com/nvidia-cuda-nvrtc-cu12/nvidia_cuda_nvrtc_cu12-12.8.93-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl (88.0 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 88.0/88.0 MB 139.9 MB/s 0:00:00 Collecting nvidia-cuda-runtime-cu12==12.8.90 (from torch==2.10.0->vllm) Downloading https://pypi.nvidia.com/nvidia-cuda-runtime-cu12/nvidia_cuda_runtime_cu12-12.8.90-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (954 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 954.8/954.8 kB 378.6 MB/s 0:00:00 Collecting nvidia-cuda-cupti-cu12==12.8.90 (from torch==2.10.0->vllm) Downloading https://pypi.nvidia.com/nvidia-cuda-cupti-cu12/nvidia_cuda_cupti_cu12-12.8.90-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (10.2 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.2/10.2 MB 169.4 MB/s 0:00:00 Collecting nvidia-cudnn-cu12==9.10.2.21 (from torch==2.10.0->vllm) Downloading https://pypi.nvidia.com/nvidia-cudnn-cu12/nvidia_cudnn_cu12-9.10.2.21-py3-none-manylinux_2_27_x86_64.whl (706.8 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 706.8/706.8 MB 141.3 MB/s 0:00:04 Collecting nvidia-cublas-cu12==12.8.4.1 (from torch==2.10.0->vllm) Downloading https://pypi.nvidia.com/nvidia-cublas-cu12/nvidia_cublas_cu12-12.8.4.1-py3-none-manylinux_2_27_x86_64.whl (594.3 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 594.3/594.3 MB 154.4 MB/s 0:00:03 Collecting nvidia-cufft-cu12==11.3.3.83 (from torch==2.10.0->vllm) Downloading https://pypi.nvidia.com/nvidia-cufft-cu12/nvidia_cufft_cu12-11.3.3.83-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (193.1 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 193.1/193.1 MB 156.6 MB/s 0:00:01 Collecting nvidia-curand-cu12==10.3.9.90 (from torch==2.10.0->vllm) Downloading https://pypi.nvidia.com/nvidia-curand-cu12/nvidia_curand_cu12-10.3.9.90-py3-none-manylinux_2_27_x86_64.whl (63.6 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 63.6/63.6 MB 159.6 MB/s 0:00:00 
Collecting nvidia-cusolver-cu12==11.7.3.90 (from torch==2.10.0->vllm) Downloading https://pypi.nvidia.com/nvidia-cusolver-cu12/nvidia_cusolver_cu12-11.7.3.90-py3-none-manylinux_2_27_x86_64.whl (267.5 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 267.5/267.5 MB 154.5 MB/s 0:00:01 Collecting nvidia-cusparse-cu12==12.5.8.93 (from torch==2.10.0->vllm) Downloading https://pypi.nvidia.com/nvidia-cusparse-cu12/nvidia_cusparse_cu12-12.5.8.93-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (288.2 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 288.2/288.2 MB 154.4 MB/s 0:00:01 Collecting nvidia-cusparselt-cu12==0.7.1 (from torch==2.10.0->vllm) Downloading https://pypi.nvidia.com/nvidia-cusparselt-cu12/nvidia_cusparselt_cu12-0.7.1-py3-none-manylinux2014_x86_64.whl (287.2 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 287.2/287.2 MB 140.7 MB/s 0:00:02 Collecting nvidia-nccl-cu12==2.27.5 (from torch==2.10.0->vllm) Downloading https://pypi.nvidia.com/nvidia-nccl-cu12/nvidia_nccl_cu12-2.27.5-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (322.3 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 322.3/322.3 MB 168.9 MB/s 0:00:01 Collecting nvidia-nvshmem-cu12==3.4.5 (from torch==2.10.0->vllm) Downloading https://pypi.nvidia.com/nvidia-nvshmem-cu12/nvidia_nvshmem_cu12-3.4.5-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (139.1 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 139.1/139.1 MB 138.4 MB/s 0:00:01 Collecting nvidia-nvtx-cu12==12.8.90 (from torch==2.10.0->vllm) Downloading https://pypi.nvidia.com/nvidia-nvtx-cu12/nvidia_nvtx_cu12-12.8.90-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (89 kB) Collecting nvidia-nvjitlink-cu12==12.8.93 (from torch==2.10.0->vllm) Downloading https://pypi.nvidia.com/nvidia-nvjitlink-cu12/nvidia_nvjitlink_cu12-12.8.93-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl (39.3 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 39.3/39.3 MB 172.3 MB/s 0:00:00 Collecting nvidia-cufile-cu12==1.13.1.3 (from torch==2.10.0->vllm) 
Downloading https://pypi.nvidia.com/nvidia-cufile-cu12/nvidia_cufile_cu12-1.13.1.3-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (1.2 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 572.2 MB/s 0:00:00 Collecting triton==3.6.0 (from torch==2.10.0->vllm) Downloading https://download.pytorch.org/whl/triton-3.6.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (1.7 kB) Collecting cuda-pathfinder~=1.1 (from cuda-bindings==12.9.4->torch==2.10.0->vllm) Downloading cuda_pathfinder-1.4.3-py3-none-any.whl.metadata (1.9 kB) Collecting httpx (from model-hosting-container-standards<1.0.0,>=0.1.13->vllm) Downloading httpx-0.28.1-py3-none-any.whl.metadata (7.1 kB) Collecting jmespath (from model-hosting-container-standards<1.0.0,>=0.1.13->vllm) Downloading jmespath-1.1.0-py3-none-any.whl.metadata (7.6 kB) Collecting starlette>=0.49.1 (from model-hosting-container-standards<1.0.0,>=0.1.13->vllm) Downloading starlette-0.52.1-py3-none-any.whl.metadata (6.3 kB) Collecting supervisor>=4.2.0 (from model-hosting-container-standards<1.0.0,>=0.1.13->vllm) Downloading supervisor-4.3.0-py2.py3-none-any.whl.metadata (87 kB) Collecting anyio<5,>=3.5.0 (from openai<2.25.0,>=1.99.1->vllm) Downloading anyio-4.12.1-py3-none-any.whl.metadata (4.3 kB) Collecting distro<2,>=1.7.0 (from openai<2.25.0,>=1.99.1->vllm) Downloading distro-1.9.0-py3-none-any.whl.metadata (6.8 kB) Collecting jiter<1,>=0.10.0 (from openai<2.25.0,>=1.99.1->vllm) Downloading jiter-0.13.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (5.2 kB) Collecting sniffio (from openai<2.25.0,>=1.99.1->vllm) Downloading sniffio-1.3.1-py3-none-any.whl.metadata (3.9 kB) Collecting idna>=2.8 (from anyio<5,>=3.5.0->openai<2.25.0,>=1.99.1->vllm) Downloading idna-3.11-py3-none-any.whl.metadata (8.4 kB) Collecting certifi (from httpx->model-hosting-container-standards<1.0.0,>=0.1.13->vllm) Downloading certifi-2026.2.25-py3-none-any.whl.metadata (2.5 kB) Collecting httpcore==1.* (from 
httpx->model-hosting-container-standards<1.0.0,>=0.1.13->vllm) Downloading httpcore-1.0.9-py3-none-any.whl.metadata (21 kB) Collecting h11>=0.16 (from httpcore==1.*->httpx->model-hosting-container-standards<1.0.0,>=0.1.13->vllm) Downloading h11-0.16.0-py3-none-any.whl.metadata (8.3 kB) Collecting annotated-types>=0.6.0 (from pydantic>=2.12.0->vllm) Downloading annotated_types-0.7.0-py3-none-any.whl.metadata (15 kB) Collecting pydantic-core==2.41.5 (from pydantic>=2.12.0->vllm) Downloading pydantic_core-2.41.5-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (7.3 kB) Collecting typing-inspection>=0.4.2 (from pydantic>=2.12.0->vllm) Downloading typing_inspection-0.4.2-py3-none-any.whl.metadata (2.6 kB) Collecting huggingface-hub<1.0,>=0.34.0 (from transformers<5,>=4.56.0->vllm) Downloading huggingface_hub-0.36.2-py3-none-any.whl.metadata (15 kB) Collecting safetensors>=0.4.3 (from transformers<5,>=4.56.0->vllm) Downloading safetensors-0.7.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.1 kB) Collecting hf-xet<2.0.0,>=1.1.3 (from huggingface-hub<1.0,>=0.34.0->transformers<5,>=4.56.0->vllm) Downloading hf_xet-1.4.2-cp37-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (4.9 kB) Collecting aiohappyeyeballs>=2.5.0 (from aiohttp>=3.13.3->vllm) Downloading aiohappyeyeballs-2.6.1-py3-none-any.whl.metadata (5.9 kB) Collecting aiosignal>=1.4.0 (from aiohttp>=3.13.3->vllm) Downloading aiosignal-1.4.0-py3-none-any.whl.metadata (3.7 kB) Collecting attrs>=17.3.0 (from aiohttp>=3.13.3->vllm) Downloading attrs-25.4.0-py3-none-any.whl.metadata (10 kB) Collecting frozenlist>=1.1.1 (from aiohttp>=3.13.3->vllm) Downloading frozenlist-1.8.0-cp312-cp312-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl.metadata (20 kB) Collecting multidict<7.0,>=4.5 (from aiohttp>=3.13.3->vllm) Downloading multidict-6.7.1-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (5.3 kB) Collecting 
propcache>=0.2.0 (from aiohttp>=3.13.3->vllm) Downloading propcache-0.4.1-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (13 kB) Collecting yarl<2.0,>=1.17.0 (from aiohttp>=3.13.3->vllm) Downloading yarl-1.23.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (79 kB) Collecting docstring-parser<1,>=0.15 (from anthropic>=0.71.0->vllm) Downloading docstring_parser-0.17.0-py3-none-any.whl.metadata (3.5 kB) Collecting annotated-doc>=0.0.2 (from fastapi>=0.115.0->fastapi[standard]>=0.115.0->vllm) Downloading annotated_doc-0.0.4-py3-none-any.whl.metadata (6.6 kB) Collecting fastapi-cli>=0.0.8 (from fastapi-cli[standard]>=0.0.8; extra == "standard"->fastapi[standard]>=0.115.0->vllm) Downloading fastapi_cli-0.0.24-py3-none-any.whl.metadata (6.4 kB) Collecting python-multipart>=0.0.18 (from fastapi[standard]>=0.115.0->vllm) Downloading python_multipart-0.0.22-py3-none-any.whl.metadata (1.8 kB) Collecting email-validator>=2.0.0 (from fastapi[standard]>=0.115.0->vllm) Downloading email_validator-2.3.0-py3-none-any.whl.metadata (26 kB) Collecting uvicorn>=0.12.0 (from uvicorn[standard]>=0.12.0; extra == "standard"->fastapi[standard]>=0.115.0->vllm) Downloading uvicorn-0.42.0-py3-none-any.whl.metadata (6.7 kB) Collecting pydantic-settings>=2.0.0 (from fastapi[standard]>=0.115.0->vllm) Downloading pydantic_settings-2.13.1-py3-none-any.whl.metadata (3.4 kB) Collecting pydantic-extra-types>=2.0.0 (from fastapi[standard]>=0.115.0->vllm) Downloading pydantic_extra_types-2.11.1-py3-none-any.whl.metadata (4.2 kB) Collecting dnspython>=2.0.0 (from email-validator>=2.0.0->fastapi[standard]>=0.115.0->vllm) Downloading dnspython-2.8.0-py3-none-any.whl.metadata (5.7 kB) Collecting typer>=0.16.0 (from fastapi-cli>=0.0.8->fastapi-cli[standard]>=0.0.8; extra == "standard"->fastapi[standard]>=0.115.0->vllm) Downloading typer-0.24.1-py3-none-any.whl.metadata (16 kB) Collecting rich-toolkit>=0.14.8 (from 
fastapi-cli>=0.0.8->fastapi-cli[standard]>=0.0.8; extra == "standard"->fastapi[standard]>=0.115.0->vllm) Downloading rich_toolkit-0.19.7-py3-none-any.whl.metadata (1.0 kB) Collecting fastapi-cloud-cli>=0.1.1 (from fastapi-cli[standard]>=0.0.8; extra == "standard"->fastapi[standard]>=0.115.0->vllm) Downloading fastapi_cloud_cli-0.15.0-py3-none-any.whl.metadata (3.3 kB) Collecting rignore>=0.5.1 (from fastapi-cloud-cli>=0.1.1->fastapi-cli[standard]>=0.0.8; extra == "standard"->fastapi[standard]>=0.115.0->vllm) Downloading rignore-0.7.6-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.2 kB) Collecting sentry-sdk>=2.20.0 (from fastapi-cloud-cli>=0.1.1->fastapi-cli[standard]>=0.0.8; extra == "standard"->fastapi[standard]>=0.115.0->vllm) Downloading sentry_sdk-2.55.0-py2.py3-none-any.whl.metadata (10 kB) Collecting fastar>=0.8.0 (from fastapi-cloud-cli>=0.1.1->fastapi-cli[standard]>=0.0.8; extra == "standard"->fastapi[standard]>=0.115.0->vllm) Downloading fastar-0.8.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.0 kB) Collecting MarkupSafe>=2.0 (from jinja2->torch==2.10.0->vllm) Downloading markupsafe-3.0.3-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (2.7 kB) Collecting jsonschema>=4.21.1 (from mistral_common>=1.9.1->mistral_common[image]>=1.9.1->vllm) Downloading jsonschema-4.26.0-py3-none-any.whl.metadata (7.6 kB) Collecting jsonschema-specifications>=2023.03.6 (from jsonschema>=4.21.1->mistral_common>=1.9.1->mistral_common[image]>=1.9.1->vllm) Downloading jsonschema_specifications-2025.9.1-py3-none-any.whl.metadata (2.9 kB) Collecting referencing>=0.28.4 (from jsonschema>=4.21.1->mistral_common>=1.9.1->mistral_common[image]>=1.9.1->vllm) Downloading referencing-0.37.0-py3-none-any.whl.metadata (2.8 kB) Collecting rpds-py>=0.25.0 (from jsonschema>=4.21.1->mistral_common>=1.9.1->mistral_common[image]>=1.9.1->vllm) Downloading 
rpds_py-0.30.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.1 kB)
Collecting nvidia-cutlass-dsl-libs-base==4.4.2 (from nvidia-cutlass-dsl>=4.4.0.dev1->vllm)
  Downloading nvidia_cutlass_dsl_libs_base-4.4.2-cp312-cp312-manylinux_2_28_x86_64.whl.metadata (2.6 kB)
Collecting cuda-python>=12.8 (from nvidia-cutlass-dsl-libs-base==4.4.2->nvidia-cutlass-dsl>=4.4.0.dev1->vllm)
  Downloading cuda_python-13.2.0-py3-none-any.whl.metadata (6.5 kB)
INFO: pip is looking at multiple versions of cuda-python to determine which version is compatible with other requirements. This could take a while.
  Downloading cuda_python-13.1.1-py3-none-any.whl.metadata (6.2 kB)
  Downloading cuda_python-13.1.0-py3-none-any.whl.metadata (4.8 kB)
  Downloading cuda_python-13.0.3-py3-none-any.whl.metadata (4.7 kB)
  Downloading cuda_python-13.0.2-py3-none-any.whl.metadata (4.7 kB)
  Downloading cuda_python-13.0.1-py3-none-any.whl.metadata (4.7 kB)
  Downloading cuda_python-13.0.0-py3-none-any.whl.metadata (4.7 kB)
  Downloading cuda_python-12.9.6-py3-none-any.whl.metadata (4.7 kB)
INFO: pip is still looking at multiple versions of cuda-python to determine which version is compatible with other requirements. This could take a while.
Downloading cuda_python-12.9.5-py3-none-any.whl.metadata (4.7 kB) Downloading cuda_python-12.9.4-py3-none-any.whl.metadata (4.7 kB) Collecting importlib-metadata<8.8.0,>=6.0 (from opentelemetry-api>=1.27.0->vllm) Downloading importlib_metadata-8.7.1-py3-none-any.whl.metadata (4.7 kB) Collecting zipp>=3.20 (from importlib-metadata<8.8.0,>=6.0->opentelemetry-api>=1.27.0->vllm) Downloading zipp-3.23.0-py3-none-any.whl.metadata (3.6 kB) Collecting opentelemetry-exporter-otlp-proto-grpc==1.40.0 (from opentelemetry-exporter-otlp>=1.27.0->vllm) Downloading opentelemetry_exporter_otlp_proto_grpc-1.40.0-py3-none-any.whl.metadata (2.6 kB) Collecting opentelemetry-exporter-otlp-proto-http==1.40.0 (from opentelemetry-exporter-otlp>=1.27.0->vllm) Downloading opentelemetry_exporter_otlp_proto_http-1.40.0-py3-none-any.whl.metadata (2.5 kB) Collecting googleapis-common-protos~=1.57 (from opentelemetry-exporter-otlp-proto-grpc==1.40.0->opentelemetry-exporter-otlp>=1.27.0->vllm) Downloading googleapis_common_protos-1.73.0-py3-none-any.whl.metadata (9.4 kB) Collecting opentelemetry-exporter-otlp-proto-common==1.40.0 (from opentelemetry-exporter-otlp-proto-grpc==1.40.0->opentelemetry-exporter-otlp>=1.27.0->vllm) Downloading opentelemetry_exporter_otlp_proto_common-1.40.0-py3-none-any.whl.metadata (1.9 kB) Collecting opentelemetry-proto==1.40.0 (from opentelemetry-exporter-otlp-proto-grpc==1.40.0->opentelemetry-exporter-otlp>=1.27.0->vllm) Downloading opentelemetry_proto-1.40.0-py3-none-any.whl.metadata (2.4 kB) Collecting protobuf!=6.30.*,!=6.31.*,!=6.32.*,!=6.33.0.*,!=6.33.1.*,!=6.33.2.*,!=6.33.3.*,!=6.33.4.*,>=5.29.6 (from vllm) Downloading protobuf-6.33.6-cp39-abi3-manylinux2014_x86_64.whl.metadata (593 bytes) Collecting opentelemetry-semantic-conventions==0.61b0 (from opentelemetry-sdk>=1.27.0->vllm) Downloading opentelemetry_semantic_conventions-0.61b0-py3-none-any.whl.metadata (2.5 kB) Collecting charset_normalizer<4,>=2 (from requests>=2.26.0->vllm) Downloading 
charset_normalizer-3.4.6-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (40 kB) Collecting urllib3<3,>=1.21.1 (from requests>=2.26.0->vllm) Downloading urllib3-2.6.3-py3-none-any.whl.metadata (6.9 kB) Collecting pycountry>=23 (from pydantic-extra-types[pycountry]>=2.10.5->mistral_common>=1.9.1->mistral_common[image]>=1.9.1->vllm) Downloading pycountry-26.2.16-py3-none-any.whl.metadata (12 kB) Collecting python-dotenv>=0.21.0 (from pydantic-settings>=2.0.0->fastapi[standard]>=0.115.0->vllm) Downloading python_dotenv-1.2.2-py3-none-any.whl.metadata (27 kB) Collecting torch-c-dlpack-ext (from quack-kernels>=0.2.7->vllm) Downloading torch_c_dlpack_ext-0.1.5-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.metadata (14 kB) Collecting msgpack<2.0.0,>=1.0.0 (from ray>=2.48.0->ray[cgraph]>=2.48.0->vllm) Downloading msgpack-1.1.2-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (8.1 kB) Collecting cupy-cuda12x (from ray[cgraph]>=2.48.0->vllm) Downloading cupy_cuda12x-14.0.1-cp312-cp312-manylinux2014_x86_64.whl.metadata (2.8 kB) Collecting rich>=13.7.1 (from rich-toolkit>=0.14.8->fastapi-cli>=0.0.8->fastapi-cli[standard]>=0.0.8; extra == "standard"->fastapi[standard]>=0.115.0->vllm) Downloading rich-14.3.3-py3-none-any.whl.metadata (18 kB) Collecting markdown-it-py>=2.2.0 (from rich>=13.7.1->rich-toolkit>=0.14.8->fastapi-cli>=0.0.8->fastapi-cli[standard]>=0.0.8; extra == "standard"->fastapi[standard]>=0.115.0->vllm) Downloading markdown_it_py-4.0.0-py3-none-any.whl.metadata (7.3 kB) Collecting pygments<3.0.0,>=2.13.0 (from rich>=13.7.1->rich-toolkit>=0.14.8->fastapi-cli>=0.0.8->fastapi-cli[standard]>=0.0.8; extra == "standard"->fastapi[standard]>=0.115.0->vllm) Downloading pygments-2.19.2-py3-none-any.whl.metadata (2.5 kB) Collecting mdurl~=0.1 (from markdown-it-py>=2.2.0->rich>=13.7.1->rich-toolkit>=0.14.8->fastapi-cli>=0.0.8->fastapi-cli[standard]>=0.0.8; extra == 
"standard"->fastapi[standard]>=0.115.0->vllm) Downloading mdurl-0.1.2-py3-none-any.whl.metadata (1.6 kB) Collecting mpmath<1.4,>=1.1.0 (from sympy>=1.13.3->torch==2.10.0->vllm) Downloading mpmath-1.3.0-py3-none-any.whl.metadata (8.6 kB) Collecting shellingham>=1.3.0 (from typer>=0.16.0->fastapi-cli>=0.0.8->fastapi-cli[standard]>=0.0.8; extra == "standard"->fastapi[standard]>=0.115.0->vllm) Downloading shellingham-1.5.4-py2.py3-none-any.whl.metadata (3.5 kB) Collecting httptools>=0.6.3 (from uvicorn[standard]>=0.12.0; extra == "standard"->fastapi[standard]>=0.115.0->vllm) Downloading httptools-0.7.1-cp312-cp312-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl.metadata (3.5 kB) Collecting uvloop>=0.15.1 (from uvicorn[standard]>=0.12.0; extra == "standard"->fastapi[standard]>=0.115.0->vllm) Downloading uvloop-0.22.1-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (4.9 kB) Collecting websockets>=10.4 (from uvicorn[standard]>=0.12.0; extra == "standard"->fastapi[standard]>=0.115.0->vllm) Downloading websockets-16.0-cp312-cp312-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl.metadata (6.8 kB) Collecting httpx-sse>=0.4 (from mcp->vllm) Downloading httpx_sse-0.4.3-py3-none-any.whl.metadata (9.7 kB) Collecting pyjwt>=2.10.1 (from pyjwt[crypto]>=2.10.1->mcp->vllm) Downloading pyjwt-2.12.1-py3-none-any.whl.metadata (4.1 kB) Collecting sse-starlette>=1.6.1 (from mcp->vllm) Downloading sse_starlette-3.3.3-py3-none-any.whl.metadata (14 kB) Collecting cryptography>=3.4.0 (from pyjwt[crypto]>=2.10.1->mcp->vllm) Downloading cryptography-46.0.5-cp311-abi3-manylinux_2_34_x86_64.whl.metadata (5.7 kB) Collecting cffi>=2.0.0 (from cryptography>=3.4.0->pyjwt[crypto]>=2.10.1->mcp->vllm) Downloading cffi-2.0.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (2.6 kB) Collecting pycparser (from cffi>=2.0.0->cryptography>=3.4.0->pyjwt[crypto]>=2.10.1->mcp->vllm) Downloading 
pycparser-3.0-py3-none-any.whl.metadata (8.2 kB) Downloading vllm-0.17.1-cp38-abi3-manylinux_2_31_x86_64.whl (432.9 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 432.9/432.9 MB 184.5 MB/s 0:00:02 Downloading compressed_tensors-0.13.0-py3-none-any.whl (192 kB) Downloading depyf-0.20.0-py3-none-any.whl (39 kB) Downloading diskcache-5.6.3-py3-none-any.whl (45 kB) Downloading flashinfer_python-0.6.4-py3-none-any.whl (7.8 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.8/7.8 MB 304.4 MB/s 0:00:00 Downloading lark-1.2.2-py3-none-any.whl (111 kB) Downloading lm_format_enforcer-0.11.3-py3-none-any.whl (45 kB) Downloading numba-0.61.2-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (3.9 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.9/3.9 MB 271.1 MB/s 0:00:00 Downloading outlines_core-0.2.11-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.3 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.3/2.3 MB 197.1 MB/s 0:00:00 Downloading https://download.pytorch.org/whl/cu128/torch-2.10.0%2Bcu128-cp312-cp312-manylinux_2_28_x86_64.whl (916.9 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 916.9/916.9 MB 333.2 MB/s 0:00:01 Downloading https://download.pytorch.org/whl/cu128/cuda_bindings-12.9.4-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (12.2 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.2/12.2 MB 360.5 MB/s 0:00:00 Downloading https://download-r2.pytorch.org/whl/cu128/torchaudio-2.10.0%2Bcu128-cp312-cp312-manylinux_2_28_x86_64.whl (1.8 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 9.0 MB/s 0:00:00 Downloading https://download-r2.pytorch.org/whl/cu128/torchvision-0.25.0%2Bcu128-cp312-cp312-manylinux_2_28_x86_64.whl (8.1 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.1/8.1 MB 333.7 MB/s 0:00:00 Downloading https://download.pytorch.org/whl/triton-3.6.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (188.3 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 188.3/188.3 MB 578.8 MB/s 0:00:00 Downloading 
xgrammar-0.1.29-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (34.9 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 34.9/34.9 MB 215.3 MB/s 0:00:00 Downloading apache_tvm_ffi-0.1.9-cp312-abi3-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (2.3 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.3/2.3 MB 218.9 MB/s 0:00:00 Downloading cuda_pathfinder-1.4.3-py3-none-any.whl (47 kB) Downloading llguidance-1.3.0-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.0/3.0 MB 307.1 MB/s 0:00:00 Downloading llvmlite-0.44.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (42.4 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 42.4/42.4 MB 316.7 MB/s 0:00:00 Downloading model_hosting_container_standards-0.1.14-py3-none-any.whl (121 kB) Downloading numpy-2.2.6-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (16.5 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 16.5/16.5 MB 303.2 MB/s 0:00:00 Downloading nvidia_cudnn_frontend-1.18.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (2.2 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.2/2.2 MB 167.1 MB/s 0:00:00 Downloading openai-2.24.0-py3-none-any.whl (1.1 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 201.6 MB/s 0:00:00 Downloading anyio-4.12.1-py3-none-any.whl (113 kB) Downloading distro-1.9.0-py3-none-any.whl (20 kB) Downloading httpx-0.28.1-py3-none-any.whl (73 kB) Downloading httpcore-1.0.9-py3-none-any.whl (78 kB) Downloading jiter-0.13.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (360 kB) Downloading pydantic-2.12.5-py3-none-any.whl (463 kB) Downloading pydantic_core-2.41.5-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.1 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 351.4 MB/s 0:00:00 Downloading setuptools-80.10.2-py3-none-any.whl (1.1 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 233.9 MB/s 0:00:00 Downloading transformers-4.57.6-py3-none-any.whl (12.0 MB) 
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.0/12.0 MB 316.4 MB/s 0:00:00 Downloading huggingface_hub-0.36.2-py3-none-any.whl (566 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 566.4/566.4 kB 118.4 MB/s 0:00:00 Downloading hf_xet-1.4.2-cp37-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (4.2 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.2/4.2 MB 246.4 MB/s 0:00:00 Downloading tokenizers-0.22.2-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.3 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.3/3.3 MB 256.5 MB/s 0:00:00 Downloading https://download.pytorch.org/whl/typing_extensions-4.15.0-py3-none-any.whl (44 kB) Downloading aiohttp-3.13.3-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (1.8 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 340.9 MB/s 0:00:00 Downloading multidict-6.7.1-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (256 kB) Downloading yarl-1.23.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (100 kB) Downloading aiohappyeyeballs-2.6.1-py3-none-any.whl (15 kB) Downloading aiosignal-1.4.0-py3-none-any.whl (7.5 kB) Downloading annotated_types-0.7.0-py3-none-any.whl (13 kB) Downloading anthropic-0.86.0-py3-none-any.whl (469 kB) Downloading docstring_parser-0.17.0-py3-none-any.whl (36 kB) Downloading attrs-25.4.0-py3-none-any.whl (67 kB) Downloading fastapi-0.135.1-py3-none-any.whl (116 kB) Downloading annotated_doc-0.0.4-py3-none-any.whl (5.3 kB) Downloading email_validator-2.3.0-py3-none-any.whl (35 kB) Downloading dnspython-2.8.0-py3-none-any.whl (331 kB) Downloading fastapi_cli-0.0.24-py3-none-any.whl (12 kB) Downloading fastapi_cloud_cli-0.15.0-py3-none-any.whl (32 kB) Downloading fastar-0.8.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (821 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 821.6/821.6 kB 193.8 MB/s 0:00:00 Downloading filelock-3.25.2-py3-none-any.whl (26 kB) Downloading 
frozenlist-1.8.0-cp312-cp312-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl (242 kB) Downloading fsspec-2026.2.0-py3-none-any.whl (202 kB) Downloading gguf-0.18.0-py3-none-any.whl (114 kB) Downloading h11-0.16.0-py3-none-any.whl (37 kB) Downloading idna-3.11-py3-none-any.whl (71 kB) Downloading interegular-0.3.3-py37-none-any.whl (23 kB) Downloading https://download.pytorch.org/whl/jinja2-3.1.6-py3-none-any.whl (134 kB) Downloading kaldi_native_fbank-1.22.3-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (322 kB) Downloading markupsafe-3.0.3-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (22 kB) Downloading mistral_common-1.10.0-py3-none-any.whl (6.5 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.5/6.5 MB 292.7 MB/s 0:00:00 Downloading jsonschema-4.26.0-py3-none-any.whl (90 kB) Downloading jsonschema_specifications-2025.9.1-py3-none-any.whl (18 kB) Downloading networkx-3.6.1-py3-none-any.whl (2.1 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 483.8 MB/s 0:00:00 Downloading nvidia_cutlass_dsl-4.4.2-py3-none-any.whl (10 kB) Downloading nvidia_cutlass_dsl_libs_base-4.4.2-cp312-cp312-manylinux_2_28_x86_64.whl (74.4 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 74.4/74.4 MB 87.5 MB/s 0:00:00 Downloading cuda_python-12.9.4-py3-none-any.whl (7.6 kB) Downloading openai_harmony-0.0.8-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.0/3.0 MB 259.7 MB/s 0:00:00 Downloading opencv_python_headless-4.13.0.92-cp37-abi3-manylinux_2_28_x86_64.whl (60.4 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 60.4/60.4 MB 351.1 MB/s 0:00:00 Downloading opentelemetry_api-1.40.0-py3-none-any.whl (68 kB) Downloading importlib_metadata-8.7.1-py3-none-any.whl (27 kB) Downloading opentelemetry_exporter_otlp-1.40.0-py3-none-any.whl (7.0 kB) Downloading opentelemetry_exporter_otlp_proto_grpc-1.40.0-py3-none-any.whl (20 kB) Downloading 
opentelemetry_exporter_otlp_proto_common-1.40.0-py3-none-any.whl (18 kB) Downloading opentelemetry_exporter_otlp_proto_http-1.40.0-py3-none-any.whl (19 kB) Downloading opentelemetry_proto-1.40.0-py3-none-any.whl (72 kB) Downloading googleapis_common_protos-1.73.0-py3-none-any.whl (297 kB) Downloading grpcio-1.78.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (6.7 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.7/6.7 MB 478.4 MB/s 0:00:00 Downloading opentelemetry_sdk-1.40.0-py3-none-any.whl (141 kB) Downloading opentelemetry_semantic_conventions-0.61b0-py3-none-any.whl (231 kB) Downloading protobuf-6.33.6-cp39-abi3-manylinux2014_x86_64.whl (323 kB) Downloading requests-2.32.5-py3-none-any.whl (64 kB) Downloading charset_normalizer-3.4.6-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (207 kB) Downloading urllib3-2.6.3-py3-none-any.whl (131 kB) Downloading certifi-2026.2.25-py3-none-any.whl (153 kB) Downloading opentelemetry_semantic_conventions_ai-0.4.15-py3-none-any.whl (6.0 kB) Downloading packaging-26.0-py3-none-any.whl (74 kB) Downloading pillow-12.1.1-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (7.0 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.0/7.0 MB 478.7 MB/s 0:00:00 Downloading prometheus_client-0.24.1-py3-none-any.whl (64 kB) Downloading prometheus_fastapi_instrumentator-7.1.0-py3-none-any.whl (19 kB) Downloading starlette-0.52.1-py3-none-any.whl (74 kB) Downloading propcache-0.4.1-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (221 kB) Downloading pydantic_extra_types-2.11.1-py3-none-any.whl (79 kB) Downloading pycountry-26.2.16-py3-none-any.whl (8.0 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.0/8.0 MB 414.9 MB/s 0:00:00 Downloading pydantic_settings-2.13.1-py3-none-any.whl (58 kB) Downloading python_dotenv-1.2.2-py3-none-any.whl (22 kB) Downloading python_multipart-0.0.22-py3-none-any.whl (24 kB) Downloading 
pyyaml-6.0.3-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (807 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 807.9/807.9 kB 224.5 MB/s 0:00:00 Downloading pyzmq-27.1.0-cp312-abi3-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl (840 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 841.0/841.0 kB 153.5 MB/s 0:00:00 Downloading quack_kernels-0.3.4-py3-none-any.whl (181 kB) Downloading ray-2.54.0-cp312-cp312-manylinux2014_x86_64.whl (73.0 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 73.0/73.0 MB 261.5 MB/s 0:00:00 Downloading msgpack-1.1.2-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (427 kB) Downloading click-8.3.1-py3-none-any.whl (108 kB) Downloading referencing-0.37.0-py3-none-any.whl (26 kB) Downloading regex-2026.2.28-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (802 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 802.0/802.0 kB 114.2 MB/s 0:00:00 Downloading rich_toolkit-0.19.7-py3-none-any.whl (32 kB) Downloading rich-14.3.3-py3-none-any.whl (310 kB) Downloading pygments-2.19.2-py3-none-any.whl (1.2 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 375.9 MB/s 0:00:00 Downloading markdown_it_py-4.0.0-py3-none-any.whl (87 kB) Downloading mdurl-0.1.2-py3-none-any.whl (10.0 kB) Downloading rignore-0.7.6-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (959 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 959.8/959.8 kB 165.2 MB/s 0:00:00 Downloading rpds_py-0.30.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (394 kB) Downloading safetensors-0.7.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (507 kB) Downloading sentry_sdk-2.55.0-py2.py3-none-any.whl (449 kB) Downloading six-1.17.0-py2.py3-none-any.whl (11 kB) Downloading supervisor-4.3.0-py2.py3-none-any.whl (320 kB) Downloading sympy-1.14.0-py3-none-any.whl (6.3 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.3/6.3 MB 220.3 MB/s 0:00:00 Downloading mpmath-1.3.0-py3-none-any.whl 
(536 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 536.2/536.2 kB 143.6 MB/s 0:00:00 Downloading tiktoken-0.12.0-cp312-cp312-manylinux_2_28_x86_64.whl (1.2 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 368.1 MB/s 0:00:00 Downloading tqdm-4.67.3-py3-none-any.whl (78 kB) Downloading typer-0.24.1-py3-none-any.whl (56 kB) Downloading shellingham-1.5.4-py2.py3-none-any.whl (9.8 kB) Downloading typing_inspection-0.4.2-py3-none-any.whl (14 kB) Downloading uvicorn-0.42.0-py3-none-any.whl (68 kB) Downloading httptools-0.7.1-cp312-cp312-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl (517 kB) Downloading uvloop-0.22.1-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (4.4 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.4/4.4 MB 298.1 MB/s 0:00:00 Downloading watchfiles-1.1.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (456 kB) Downloading websockets-16.0-cp312-cp312-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl (184 kB) Downloading zipp-3.23.0-py3-none-any.whl (10 kB) Downloading astor-0.8.1-py2.py3-none-any.whl (27 kB) Downloading blake3-1.0.8-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (388 kB) Downloading cachetools-7.0.5-py3-none-any.whl (13 kB) Downloading cbor2-5.8.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (285 kB) Downloading cloudpickle-3.1.2-py3-none-any.whl (22 kB) Downloading cupy_cuda12x-14.0.1-cp312-cp312-manylinux2014_x86_64.whl (134.6 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 134.6/134.6 MB 149.9 MB/s 0:00:00 Downloading dill-0.4.1-py3-none-any.whl (120 kB) Downloading einops-0.8.2-py3-none-any.whl (65 kB) Downloading grpcio_reflection-1.78.0-py3-none-any.whl (22 kB) Downloading ijson-3.5.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (149 kB) Downloading jmespath-1.1.0-py3-none-any.whl (20 kB) Downloading loguru-0.7.3-py3-none-any.whl (61 kB) Downloading 
mcp-1.26.0-py3-none-any.whl (233 kB) Downloading httpx_sse-0.4.3-py3-none-any.whl (9.0 kB) Downloading pyjwt-2.12.1-py3-none-any.whl (29 kB) Downloading cryptography-46.0.5-cp311-abi3-manylinux_2_34_x86_64.whl (4.5 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.5/4.5 MB 457.1 MB/s 0:00:00 Downloading cffi-2.0.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (219 kB) Downloading sse_starlette-3.3.3-py3-none-any.whl (14 kB) Downloading msgspec-0.20.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (224 kB) Downloading ninja-1.13.0-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (180 kB) Downloading nvidia_ml_py-13.590.48-py3-none-any.whl (50 kB) Downloading partial_json_parser-0.2.1.1.post7-py3-none-any.whl (10 kB) Downloading psutil-7.2.2-cp36-abi3-manylinux2010_x86_64.manylinux_2_12_x86_64.manylinux_2_28_x86_64.whl (155 kB) Downloading py_cpuinfo-9.0.0-py3-none-any.whl (22 kB) Downloading pybase64-1.4.3-cp312-cp312-manylinux1_x86_64.manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_5_x86_64.whl (71 kB) Downloading pycparser-3.0-py3-none-any.whl (48 kB) Downloading python_json_logger-4.0.0-py3-none-any.whl (15 kB) Downloading sentencepiece-0.2.1-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (1.4 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.4/1.4 MB 264.9 MB/s 0:00:00 Downloading setproctitle-1.3.7-cp312-cp312-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl (32 kB) Downloading sniffio-1.3.1-py3-none-any.whl (10 kB) Downloading tabulate-0.10.0-py3-none-any.whl (39 kB) Downloading torch_c_dlpack_ext-0.1.5-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (897 kB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 897.8/897.8 kB 167.9 MB/s 0:00:00 Installing collected packages: supervisor, py-cpuinfo, nvidia-ml-py, nvidia-cusparselt-cu12, mpmath, zipp, websockets, uvloop, urllib3, typing_extensions, triton, tqdm, tabulate, sympy, sniffio, six, shellingham, setuptools, 
setproctitle, sentencepiece, safetensors, rpds-py, rignore, regex, pyzmq, pyyaml, python-multipart, python-json-logger, python-dotenv, pyjwt, pygments, pycparser, pycountry, pybase64, psutil, protobuf, propcache, prometheus_client, pillow, partial-json-parser, packaging, outlines_core, nvidia-nvtx-cu12, nvidia-nvshmem-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufile-cu12, nvidia-cudnn-frontend, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, numpy, ninja, networkx, multidict, msgspec, msgpack, mdurl, MarkupSafe, loguru, llvmlite, llguidance, lark, kaldi-native-fbank, jmespath, jiter, interegular, ijson, idna, httpx-sse, httptools, hf-xet, h11, fsspec, frozenlist, filelock, fastar, einops, docstring-parser, dnspython, distro, diskcache, dill, cuda-pathfinder, cloudpickle, click, charset_normalizer, certifi, cbor2, cachetools, blake3, attrs, astor, annotated-types, annotated-doc, aiohappyeyeballs, yarl, uvicorn, typing-inspection, sentry-sdk, requests, referencing, pydantic-core, opentelemetry-proto, opencv-python-headless, nvidia-cusparse-cu12, nvidia-cufft-cu12, nvidia-cudnn-cu12, numba, markdown-it-py, jinja2, importlib-metadata, httpcore, grpcio, googleapis-common-protos, email-validator, depyf, cupy-cuda12x, cuda-bindings, cffi, apache-tvm-ffi, anyio, aiosignal, watchfiles, tiktoken, starlette, rich, pydantic, opentelemetry-exporter-otlp-proto-common, opentelemetry-api, nvidia-cusolver-cu12, jsonschema-specifications, huggingface-hub, httpx, grpcio-reflection, gguf, cuda-python, cryptography, aiohttp, typer, torch, tokenizers, sse-starlette, rich-toolkit, pydantic-settings, pydantic-extra-types, prometheus-fastapi-instrumentator, opentelemetry-semantic-conventions, openai-harmony, openai, nvidia-cutlass-dsl-libs-base, lm-format-enforcer, jsonschema, fastapi, anthropic, transformers, torchvision, torchaudio, torch-c-dlpack-ext, ray, opentelemetry-sdk, nvidia-cutlass-dsl, 
model-hosting-container-standards, mcp, fastapi-cloud-cli, fastapi-cli, xgrammar, quack-kernels, opentelemetry-semantic-conventions-ai, opentelemetry-exporter-otlp-proto-http, opentelemetry-exporter-otlp-proto-grpc, mistral_common, flashinfer-python, compressed-tensors, opentelemetry-exporter-otlp, vllm Successfully installed MarkupSafe-3.0.3 aiohappyeyeballs-2.6.1 aiohttp-3.13.3 aiosignal-1.4.0 annotated-doc-0.0.4 annotated-types-0.7.0 anthropic-0.86.0 anyio-4.12.1 apache-tvm-ffi-0.1.9 astor-0.8.1 attrs-25.4.0 blake3-1.0.8 cachetools-7.0.5 cbor2-5.8.0 certifi-2026.2.25 cffi-2.0.0 charset_normalizer-3.4.6 click-8.3.1 cloudpickle-3.1.2 compressed-tensors-0.13.0 cryptography-46.0.5 cuda-bindings-12.9.4 cuda-pathfinder-1.4.3 cuda-python-12.9.4 cupy-cuda12x-14.0.1 depyf-0.20.0 dill-0.4.1 diskcache-5.6.3 distro-1.9.0 dnspython-2.8.0 docstring-parser-0.17.0 einops-0.8.2 email-validator-2.3.0 fastapi-0.135.1 fastapi-cli-0.0.24 fastapi-cloud-cli-0.15.0 fastar-0.8.0 filelock-3.25.2 flashinfer-python-0.6.4 frozenlist-1.8.0 fsspec-2026.2.0 gguf-0.18.0 googleapis-common-protos-1.73.0 grpcio-1.78.0 grpcio-reflection-1.78.0 h11-0.16.0 hf-xet-1.4.2 httpcore-1.0.9 httptools-0.7.1 httpx-0.28.1 httpx-sse-0.4.3 huggingface-hub-0.36.2 idna-3.11 ijson-3.5.0 importlib-metadata-8.7.1 interegular-0.3.3 jinja2-3.1.6 jiter-0.13.0 jmespath-1.1.0 jsonschema-4.26.0 jsonschema-specifications-2025.9.1 kaldi-native-fbank-1.22.3 lark-1.2.2 llguidance-1.3.0 llvmlite-0.44.0 lm-format-enforcer-0.11.3 loguru-0.7.3 markdown-it-py-4.0.0 mcp-1.26.0 mdurl-0.1.2 mistral_common-1.10.0 model-hosting-container-standards-0.1.14 mpmath-1.3.0 msgpack-1.1.2 msgspec-0.20.0 multidict-6.7.1 networkx-3.6.1 ninja-1.13.0 numba-0.61.2 numpy-2.2.6 nvidia-cublas-cu12-12.8.4.1 nvidia-cuda-cupti-cu12-12.8.90 nvidia-cuda-nvrtc-cu12-12.8.93 nvidia-cuda-runtime-cu12-12.8.90 nvidia-cudnn-cu12-9.10.2.21 nvidia-cudnn-frontend-1.18.0 nvidia-cufft-cu12-11.3.3.83 nvidia-cufile-cu12-1.13.1.3 nvidia-curand-cu12-10.3.9.90 
nvidia-cusolver-cu12-11.7.3.90 nvidia-cusparse-cu12-12.5.8.93 nvidia-cusparselt-cu12-0.7.1 nvidia-cutlass-dsl-4.4.2 nvidia-cutlass-dsl-libs-base-4.4.2 nvidia-ml-py-13.590.48 nvidia-nccl-cu12-2.27.5 nvidia-nvjitlink-cu12-12.8.93 nvidia-nvshmem-cu12-3.4.5 nvidia-nvtx-cu12-12.8.90 openai-2.24.0 openai-harmony-0.0.8 opencv-python-headless-4.13.0.92 opentelemetry-api-1.40.0 opentelemetry-exporter-otlp-1.40.0 opentelemetry-exporter-otlp-proto-common-1.40.0 opentelemetry-exporter-otlp-proto-grpc-1.40.0 opentelemetry-exporter-otlp-proto-http-1.40.0 opentelemetry-proto-1.40.0 opentelemetry-sdk-1.40.0 opentelemetry-semantic-conventions-0.61b0 opentelemetry-semantic-conventions-ai-0.4.15 outlines_core-0.2.11 packaging-26.0 partial-json-parser-0.2.1.1.post7 pillow-12.1.1 prometheus-fastapi-instrumentator-7.1.0 prometheus_client-0.24.1 propcache-0.4.1 protobuf-6.33.6 psutil-7.2.2 py-cpuinfo-9.0.0 pybase64-1.4.3 pycountry-26.2.16 pycparser-3.0 pydantic-2.12.5 pydantic-core-2.41.5 pydantic-extra-types-2.11.1 pydantic-settings-2.13.1 pygments-2.19.2 pyjwt-2.12.1 python-dotenv-1.2.2 python-json-logger-4.0.0 python-multipart-0.0.22 pyyaml-6.0.3 pyzmq-27.1.0 quack-kernels-0.3.4 ray-2.54.0 referencing-0.37.0 regex-2026.2.28 requests-2.32.5 rich-14.3.3 rich-toolkit-0.19.7 rignore-0.7.6 rpds-py-0.30.0 safetensors-0.7.0 sentencepiece-0.2.1 sentry-sdk-2.55.0 setproctitle-1.3.7 setuptools-80.10.2 shellingham-1.5.4 six-1.17.0 sniffio-1.3.1 sse-starlette-3.3.3 starlette-0.52.1 supervisor-4.3.0 sympy-1.14.0 tabulate-0.10.0 tiktoken-0.12.0 tokenizers-0.22.2 torch-2.10.0+cu128 torch-c-dlpack-ext-0.1.5 torchaudio-2.10.0+cu128 torchvision-0.25.0+cu128 tqdm-4.67.3 transformers-4.57.6 triton-3.6.0 typer-0.24.1 typing-inspection-0.4.2 typing_extensions-4.15.0 urllib3-2.6.3 uvicorn-0.42.0 uvloop-0.22.1 vllm-0.17.1 watchfiles-1.1.1 websockets-16.0 xgrammar-0.1.29 yarl-1.23.0 zipp-3.23.0