Once you have a packed model, you can load it by passing its file path or URL to load.
import asyncio
import cartonml as carton
import numpy as np

async def main():
    # Note this might take a while the first time you use Carton.
    # Make sure to enable logging as described in the quickstart
    model = await carton.load("https://carton.pub/google-research/bert-base-uncased")
    out = await model.infer({
        "input": np.array(["Today is a good [MASK]."])
    })
    print(out)
    # {
    #   'scores': array([[12.977381]]),
    #   'tokens': array([['day']], dtype='<U3')
    # }

asyncio.run(main())
Carton loads the model (caching it locally if necessary).
If you need a packed model, take a look at the packing docs or explore the community model registry.
Carton also supports loading an unpacked model via the load_unpacked method. This is conceptually the same as pack followed by load, but is implemented more efficiently internally. It supports all the options that load and pack support.
See the quickstart guide for an example.
There are a few options you can pass in when loading a model, but none of them are required.
visible_device
The device that is visible to this model.
Allowed values:
- cpu
- A GPU index (e.g. 0, 1, etc.)
- A GPU UUID (including the GPU- or MIG-GPU- prefix).

The default is GPU 0 (or CPU if no GPUs are available).
Note: a visible device does not necessarily mean that the model will use that device; it is up to the model to actually use it (e.g. by moving itself to GPU if it sees one available).
Note: if a GPU index is specified but no GPUs are available, Carton will print a warning and attempt to fall back to CPU.
await carton.load(
    # ...
    visible_device="0",
)
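As a rough illustration of the allowed values for visible_device, the sketch below classifies a string as cpu, a GPU index, or an identifier carrying the GPU- or MIG-GPU- prefix. The helper classify_visible_device is hypothetical and not part of cartonml. It assumes that cpu is matched case-sensitively and that any non-empty text after the prefix is acceptable; Carton's own validation may differ.

```python
import re

# Hypothetical helper, NOT part of cartonml. It only mirrors the allowed
# value forms described in the text above.
_PREFIXES = ("MIG-GPU-", "GPU-")


def classify_visible_device(value: str) -> str:
    """Return 'cpu', 'gpu_index', or 'gpu_uuid'; raise ValueError otherwise."""
    if value == "cpu":  # assumption: exact, case-sensitive match
        return "cpu"
    if re.fullmatch(r"\d+", value):  # e.g. "0", "1"
        return "gpu_index"
    # Assumption: anything non-empty after the prefix is accepted here.
    if any(value.startswith(p) and len(value) > len(p) for p in _PREFIXES):
        return "gpu_uuid"
    raise ValueError(f"unrecognized visible_device: {value!r}")
```

A check like this could run before calling load, so that a typo such as "gpu0" fails early with a clear message.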
override_runner_opts
Options to pass to the runner. These are runner-specific (e.g. PyTorch, TensorFlow, etc).
Overrides are merged with the options set when packing the model.
These are sometimes used to configure thread-pool sizes, etc.
For allowed values, see the packing docs for each framework.
await carton.load(
    # ...
    override_runner_opts={
        # For example, if we know this is a torchscript model and we want to set
        # threading configuration for running this model.
        "num_interop_threads": 4,
        "num_threads": 1,
    },
)
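To make the merge behavior concrete, here is a minimal sketch. It assumes the merge is a shallow, key-by-key one in which an override replaces the value set at packing time; the docs only say the two are merged, so this is an interpretation. The function merge_runner_opts and the packed values are illustrative, not Carton's implementation.

```python
# Options assumed to have been set when the model was packed (made-up values).
packed_opts = {"num_interop_threads": 2, "num_threads": 8}

# Options passed as override_runner_opts at load time.
overrides = {"num_threads": 1}


def merge_runner_opts(packed: dict, overrides: dict) -> dict:
    """Shallow merge: keys present in `overrides` replace those in `packed`."""
    merged = dict(packed)  # copy so the packed options are left untouched
    merged.update(overrides)
    return merged


effective = merge_runner_opts(packed_opts, overrides)
print(effective)  # {'num_interop_threads': 2, 'num_threads': 1}
```

Under this assumption, options you do not override keep the values chosen at packing time.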
override_required_framework_version
This is a semver version range that specifies the version of the framework that the model requires.
See https://docs.rs/semver/1.0.16/semver/enum.Op.html and https://docs.rs/semver/1.0.16/semver/struct.VersionReq.html for more details on version ranges.
This is useful if a model is limited to a specific framework version range and you want to override it.
Note: this is not guaranteed to work if the underlying model isn't compatible with the version range you specify.
await carton.load(
    # ...

    # If we know this is a python model and we want to force it to
    # run with a `3.10.x` version of python.
    override_required_framework_version="=3.10",
)
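The reason "=3.10" selects any 3.10.x release is that, in the linked semver documentation, an `=` requirement with an omitted component matches every version that agrees on the components that were written. The sketch below illustrates only that one rule. The function matches_exact_req is hypothetical, handles only the `=` operator, and ignores pre-release and build metadata, so it is not a substitute for a real semver parser.

```python
def matches_exact_req(req: str, version: str) -> bool:
    """Check `version` against an `=` requirement that may omit minor/patch.

    Simplified illustration: only '=' is supported and versions must be
    plain dotted integers such as '3.10.4'.
    """
    if not req.startswith("="):
        raise ValueError("this sketch only handles the '=' operator")
    wanted = [int(part) for part in req[1:].strip().split(".")]
    actual = [int(part) for part in version.split(".")]
    # Only the components written in the requirement are compared.
    return actual[: len(wanted)] == wanted


print(matches_exact_req("=3.10", "3.10.12"))  # True
print(matches_exact_req("=3.10", "3.11.0"))   # False
```

A fully specified requirement such as "=3.10.4" would match only that exact version under the same rule.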