General Information¶
Eisen offers deployment capabilities by leveraging TorchServe, as implemented in PyTorch 1.5.0 and newer releases. In this way, models trained with Eisen can be exposed as prediction services through a simple HTTP interface, which can be used directly by sending requests with any HTTP library or by leveraging the Client included in Eisen-Deploy.
Eisen-Deploy is included in the distribution of eisen and can therefore be obtained by executing
$ pip install eisen
Otherwise, it is possible to obtain Eisen-Deploy by installing only the eisen-deploy package
$ pip install eisen-deploy
Once installed, Eisen-Deploy can be used by importing the necessary modules directly in your code.
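For instance, the packaging and client modules documented on this page can be imported as follows (only an orientation sketch; import just what you need):

from eisen_deploy.packaging import create_metadata, EisenServingMAR
from eisen_deploy.client import EisenServingClient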
Packaging¶
Eisen-Deploy implements model serving via TorchServe. Models need to be packaged into a MAR archive before they can be served. This is achieved via packaging.
Packaging creates a portable compressed tar archive that can be deployed using TorchServe. This functionality is documented below.
eisen_deploy.packaging.create_metadata(input_name_list, input_type_list, input_shape_list, output_name_list, output_type_list, output_shape_list, model_input_list=None, model_output_list=None, custom_meta_dict=None)[source]¶
Facilitates the creation of a metadata dictionary for model packaging (MAR). The method renders user-supplied information compliant with the standard format expected for the metadata.
Note that the format of the metadata is completely up to the user. The only requirement is that metadata is always supplied as a JSON-serializable dictionary.
This method makes metadata more standard by capturing information about model inputs and outputs in fields that are conventionally used and accepted across the Eisen ecosystem. That is, this method implements a convention for the format of metadata (see the sketch after the parameter list below).
- Parameters
input_name_list (list) – A list of strings representing model input names, e.g. ['input'] for a single-input model
input_type_list (list) – A list of strings representing the input types, e.g. ['ndarray'], matching the expected type of 'input'
input_shape_list (list) – A list of shapes (lists) representing the expected input shapes, e.g. [[-1, 3, 244, 244]]
output_name_list (list) – A list of strings representing model output names, e.g. ['logits', 'prediction']
output_type_list (list) – A list of strings representing model output types, e.g. ['ndarray', 'str']
output_shape_list (list) – A list of shapes (lists) representing the output shapes, e.g. [[-1, 10], [-1]]
model_input_list (list) – A list of input names that should be used as model inputs (defaults to all of input_name_list)
model_output_list (list) – A list of output names that should be obtained from the model (defaults to all of output_name_list)
custom_meta_dict (dict) – A JSON-serializable dictionary containing custom information (e.g. options or notes)
- Returns
Dictionary containing metadata in standardized format
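As an illustrative sketch of how this method can be called (the names, types and shapes below are placeholder values mirroring the examples above):

from eisen_deploy.packaging import create_metadata

# Build a standardized metadata dictionary describing model inputs and outputs
metadata = create_metadata(
    input_name_list=['input'],
    input_type_list=['ndarray'],
    input_shape_list=[[-1, 3, 244, 244]],
    output_name_list=['logits', 'prediction'],
    output_type_list=['ndarray', 'str'],
    output_shape_list=[[-1, 10], [-1]],
    custom_meta_dict={'note': 'illustrative example'}
)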
class eisen_deploy.packaging.EisenServingMAR(pre_processing, post_processing, meta_data, handler=None)[source]¶
This object implements model packaging compliant with PyTorch serving. This kind of packaging is referred to as a MAR package. It follows the PyTorch standard, which is documented here: https://github.com/pytorch/serve/tree/master/model-archiver
Once the model is packaged, it can be used for inference via TorchServe. Packing the model results in a <filename>.mar package (usually a .tar.gz archive) that can be served with the following command:
torchserve --start --ncs --model-store model_zoo --models model.mar
__init__(pre_processing, post_processing, meta_data, handler=None)[source]¶
Saving a MAR package for a model requires an Eisen pre-processing transform object, an Eisen post-processing transform object, a model object (torch.nn.Module) and a metadata dictionary.
These components will be serialized and included in the MAR.
The default request handler for TorchServe is eisen_deploy.serving.EisenServingHandler. This parameter can be overridden by specifying the path of a custom handler or by using one of the handlers provided by PyTorch. When the default handler is overridden, the pre- and post-processing transforms as well as the metadata might be ignored, and the behavior during serving might differ from what is expected.
from eisen_deploy.packaging import EisenServingMAR

my_model = ...  # Eg. a torch.nn.Module instance
my_pre_processing = ...  # Eg. a pre-processing transform object
my_post_processing = ...  # Eg. a post-processing transform object

metadata = {'inputs': [], 'outputs': []}  # metadata dictionary

mar_creator = EisenServingMAR(my_pre_processing, my_post_processing, metadata)

mar_creator.pack(my_model, '/path/to/archive', 'my_model', '1.0')  # model_name and model_version as required by pack()
- Parameters
pre_processing (callable) – pre-processing transform object. It will be serialized with dill into a dill file
post_processing (callable) – post-processing transform object. It will be serialized with dill into a dill file
meta_data (dict) – dictionary containing metadata about the model (e.g. information about inputs and outputs)
handler (str) – name or filename of the handler. This is an optional parameter that rarely needs to be changed
pack(model, dst_path, model_name, model_version, additional_files=None)[source]¶
Packages a model into the MAR archive so that it can be served using TorchServe (see the sketch after the parameter list below).
- Parameters
model (torch.nn.Module) – an object representing a model
dst_path (str) – the destination base path (do not include the filename) of the MAR
model_name (str) – the name of the model (it will also be used to define the prediction endpoint)
model_version (str) – a string encoding the version of the model
additional_files (iterable) – an optional list of files that should be included in the MAR
- Returns
None
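The following end-to-end sketch shows how the pack arguments map to the parameters above; the model, transforms, paths and names are placeholders to be replaced with your own:

import torch.nn as nn
from eisen_deploy.packaging import EisenServingMAR

# Placeholder components: substitute a trained model and real Eisen transforms
model = nn.Identity()
pre_processing = lambda batch: batch
post_processing = lambda batch: batch
metadata = {'inputs': [], 'outputs': []}

mar_creator = EisenServingMAR(pre_processing, post_processing, metadata)

mar_creator.pack(
    model,                   # torch.nn.Module to be packaged
    '/path/to/model_zoo',    # dst_path: destination directory, no filename
    'my_model',              # model_name: also defines the prediction endpoint
    '1.0',                   # model_version
    additional_files=None,   # optional extra files to include in the MAR
)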
Handlers¶
The reason we are able to ingest data that follows the specifications of Eisen, and to use all the convenient interfaces and design strategies of Eisen during serving, is that we have implemented a Handler class that processes queries to the server.
This is not something users need to worry about; this documentation is included here only for development purposes. Handlers are included automatically in the MAR and should work transparently.
class eisen_deploy.serving.EisenServingHandler[source]¶
EisenServingHandler is a custom object that handles inference requests within TorchServe. It is usually included automatically in the MAR.
inference(input_dict)[source]¶
Performs prediction using the model. Feeds the necessary information to the model starting from the received data and creates an output dictionary as a result.
- Parameters
input_dict (dict) – input batch, in the form of a dictionary of collated datapoints
- Returns
dict
initialize(ctx)[source]¶
Initializes the fields of the EisenServingHandler object based on the context.
- Parameters
ctx – context of an inference request
- Returns
None
Server¶
In order to serve packaged models (MAR archives) over an HTTP interface, Eisen leverages TorchServe. We do not implement our own server, but rely on what is being developed by PyTorch (starting from version 1.5.0).
As a compact demonstration of how to use TorchServe to instantiate a server, we include the following snippets on this page.
Start serving via:
$ torchserve --start --ncs --model-store model_zoo --models model.mar
Stop serving via:
$ torchserve --stop
Note that you will need a MAR package created via the EisenServingMAR object in order to perform inference as explained on this page.
Note
It is worth checking out the documentation of TorchServe. In particular, the documentation about the configuration file covers aspects that are very important in medical imaging and elsewhere, such as the ability to configure the maximum message size (for large inputs such as volumes) and SSL support for encrypting the communication channel. Follow this link: https://bit.ly/2U6Fpga
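As a hedged sketch of such a configuration file (key names are taken from the TorchServe configuration documentation and should be verified against the linked docs; paths and sizes are placeholders):

# config.properties
inference_address=https://0.0.0.0:8443
private_key_file=/path/to/server.key
certificate_file=/path/to/server.pem
# request/response size limits in bytes, useful for large volumetric inputs
max_request_size=655350000
max_response_size=655350000

The configuration file can then be passed to torchserve via its --ts-config option.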
Clients¶
Client functionality is currently provided in Python. Of course, the HTTP prediction endpoints obtained through TorchServe can also be used by making requests via a library such as requests in Python or axios in JavaScript (among many others). cURL requests from the terminal are also possible.
The Python client implemented here is documented below.
class eisen_deploy.client.EisenServingClient(url, validate_inputs=False)[source]¶
Eisen Serving client functionality. This object implements communication with prediction models packaged via EisenServingMAR. This client assumes that the EisenServingHandler is used within the MAR.
from eisen_deploy.client import EisenServingClient

client = EisenServingClient(url='http://localhost/...')

metadata = client.get_metadata()
output = client.predict(input_data)
__init__(url, validate_inputs=False)[source]¶
Initializes the client object.
from eisen_deploy.client import EisenServingClient

client = EisenServingClient(url='http://localhost/...')
- Parameters
url (str) – URL of the prediction endpoint
validate_inputs (bool) – Whether inputs should be validated in terms of shape and type before sending them
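As a hedged usage sketch (based on Eisen conventions, it assumes that predict accepts a dictionary of numpy arrays keyed by the input names declared in the packaged metadata, and that the model is exposed at the standard TorchServe prediction endpoint; check the metadata returned by get_metadata for the actual names and shapes):

import numpy as np
from eisen_deploy.client import EisenServingClient

client = EisenServingClient(url='http://localhost:8080/predictions/my_model')

# The packaged metadata describes the expected inputs and outputs
metadata = client.get_metadata()

# Assumed input format: one numpy array per declared input name
input_data = {'input': np.zeros((1, 3, 244, 244), dtype=np.float32)}

output = client.predict(input_data)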