Encrypt and execute model

Encrypted ML Package files

The Knox ML Encryption Tool encrypts ML models and associates them with a user-specified policy. When encrypting models, users must provide a saved model file, a policy file, and a policy version number. The Knox ML Encryption Tool then generates the necessary metadata and keys explained earlier and returns the Wrapped ML Model as a .kaipkg (Knox AI Package) file.

The Encrypted ML Package file can be publicly disclosed without revealing the underlying ML model. Thus, it can be deployed to apps without exposing the model to white-box attacks. In common cases, it can be bundled with the app or library that uses the model, or hosted on a public-facing server.

On-device execution

Apps can load and execute the Encrypted ML Package file using the Knox SDK, which is accessible through the Knox Portal Partner Program. The Knox SDK provides the following ML Protection APIs:

Initialization—The KnoxAiManager.getKeyProvisioning() method checks if the Knox ML Service was initialized. This one-time process provisions the Model Origin Key, if necessary. During this process, the Model Origin Key is transferred from an HSM to the Knox ML Trusted Applet, as described in Wrapping model files.

Creating a session—When loading and executing an ML model, system processes and hardware accelerators need to cache information to maintain low latency. This cached information is associated with a session object, which an application creates by calling the KnoxAiManager.createKnoxAiSession() method. On creation, the Knox ML Service internally associates the session object with the calling application process. On every subsequent call, the Knox ML Service validates that the calling application process has access to the provided session object.
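The ownership check described above can be illustrated with a minimal sketch. This is not the Knox ML Service implementation: the SessionRegistry class, its methods, and the use of a process ID as the caller identity are all hypothetical names and assumptions chosen for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of per-process session ownership.
// Assumption: the caller is identified by a process ID supplied by the platform.
final class SessionRegistry {
    private final Map<Long, Integer> ownerBySession = new ConcurrentHashMap<>();
    private final AtomicLong nextHandle = new AtomicLong(1);

    // Creates a session handle and binds it to the calling process.
    long createSession(int callerPid) {
        long handle = nextHandle.getAndIncrement();
        ownerBySession.put(handle, callerPid);
        return handle;
    }

    // Runs before every later call: only the creating process may use the session.
    void checkAccess(long handle, int callerPid) {
        Integer owner = ownerBySession.get(handle);
        if (owner == null) {
            throw new IllegalArgumentException("Unknown session: " + handle);
        }
        if (owner != callerPid) {
            throw new SecurityException("Session " + handle + " belongs to another process");
        }
    }
}
```

In this sketch, a call such as checkAccess(handle, 200) fails with a SecurityException when the session was created by process 100, which mirrors the validation the service performs on every subsequent call.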

Loading an Encrypted ML Package file—An app can load an Encrypted ML Package within a session using either a shared memory buffer or a file descriptor shared using Binder. To retrieve the Model Decryption Key, the Knox ML Service interacts with the Knox ML Trusted Applet, which has access to the Model Origin Key. The Knox ML Service is whitelisted so it can interact with the Knox ML Trusted Applet for this purpose. The Knox ML Trusted Applet derives the Model Decryption Key and returns it to the Knox ML Service. The Knox ML Service then decrypts the Wrapped ML Model and provides it to the necessary hardware accelerators.
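To make the derive-then-decrypt flow concrete, here is a rough sketch using only standard Java cryptography. This document does not specify the derivation function, the cipher, or the metadata format, so everything here is an assumption: HMAC-SHA256 stands in for the key derivation, AES-GCM stands in for the model encryption, and the ModelKeySketch class and its method names are hypothetical.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.Mac;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Illustrative only: the real Knox ML Trusted Applet logic is not described here.
final class ModelKeySketch {
    // Assumption: the per-model key is derived from the origin key and package metadata.
    static byte[] deriveModelKey(byte[] originKey, byte[] packageMetadata) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(originKey, "HmacSHA256"));
            return mac.doFinal(packageMetadata); // 32 bytes, usable as an AES-256 key
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Key derivation failed", e);
        }
    }

    // Encrypts a model with the derived key (stand-in for the encryption tool).
    static byte[] wrap(byte[] modelKey, byte[] iv, byte[] model) {
        return run(Cipher.ENCRYPT_MODE, modelKey, iv, model);
    }

    // Decrypts a wrapped model; fails if the key or the ciphertext is wrong.
    static byte[] unwrap(byte[] modelKey, byte[] iv, byte[] wrappedModel) {
        return run(Cipher.DECRYPT_MODE, modelKey, iv, wrappedModel);
    }

    private static byte[] run(int mode, byte[] key, byte[] iv, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(mode, new SecretKeySpec(key, "AES"), new GCMParameterSpec(128, iv));
            return cipher.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Cipher operation failed", e);
        }
    }
}
```

The point of the sketch is the separation of roles: only the component holding the origin key can derive the model key, and the service that receives the derived key can decrypt one specific package without ever seeing the origin key.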

Executing an ML model—All ML model execution calls are routed through the Knox ML Service and are associated with a session. An application can provide data to and receive data from the Knox ML Service using either a float buffer, a shared memory region, or a file descriptor shared using Binder. On receiving an execution request, the Knox ML Service validates that the calling application owns the relevant session, executes the ML model using the specified hardware accelerator, and returns the results using the specified output mechanism.

Destroying a session—When destroying a session, the Knox ML Service clears the memory associated with its ML model and instructs other system services and hardware accelerators to remove their cached data. The ML model is shared with system services using shared memory; therefore, the model data in these services is also cleared. Model data in hardware accelerators, such as GPUs, can survive as unallocated memory for some time after the session is destroyed.
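The clearing step can be sketched as overwriting the decrypted model buffer before it is released. This is only an illustration under stated assumptions: the ModelSessionStore class and its methods are hypothetical, and a plain byte array stands in for the shared memory region that the service and other system services both reference.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical store of decrypted models keyed by session handle.
final class ModelSessionStore {
    private final Map<Long, byte[]> modelBySession = new HashMap<>();

    // Keeps a reference (not a copy), so every holder sees the same bytes,
    // loosely imitating a shared memory region.
    void load(long handle, byte[] decryptedModel) {
        modelBySession.put(handle, decryptedModel);
    }

    boolean isLoaded(long handle) {
        return modelBySession.containsKey(handle);
    }

    // Zeroes the plaintext model before dropping the session entry.
    void destroy(long handle) {
        byte[] model = modelBySession.remove(handle);
        if (model != null) {
            Arrays.fill(model, (byte) 0);
        }
    }
}
```

Because the buffer is shared by reference, zeroing it once clears the data for every holder in this sketch. It does not model memory held by hardware accelerators, which, as noted above, can survive for some time after the session is destroyed.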
