Cyborg FPGA Programming Service Proposal¶
Blueprint URL (not available yet): https://blueprints.launchpad.net/openstack-cyborg/+spec/cyborg-fpga-programming-ability
This spec proposes adding a Programming Service to Cyborg that allows users to dynamically change the functions loaded on FPGAs in a cloud environment.
Problem description¶
A field-programmable gate array (FPGA) is an integrated circuit designed to be configured by a customer or a designer after manufacturing. FPGAs can be significantly faster for some applications because of their parallel nature and because the number of gates used can be optimized for a given process. In addition, an FPGA can be reprogrammed for different applications. Hence, using FPGAs for application acceleration in the cloud has become increasingly desirable. For Cyborg, as a management framework for heterogeneous accelerators, tracking, deploying, and reprogramming FPGAs are much-needed features. Since FPGA modelling has already been proposed in another document, this spec focuses on proposing a Reprogramming Service for FPGAs in Cyborg.
Use Cases¶
In an OpenCL scenario, users load accelerators onto FPGAs for their applications. As different applications run in the OpenCL environment, the loaded accelerators change from time to time. It is not feasible for a lab admin to log in to each host and change the FPGA configuration manually. Instead, through the reprogramming service, users can manage the functions of FPGAs using a set of REST APIs.
Similarly, during FPGA maintenance, admins need to update or migrate shells and bitstreams on FPGAs within the data center. The Cyborg Reprogramming Service will allow them to do so through the APIs from a centralized console.
Since this is purely a proposal for the programming APIs, it does not focus on what the upstream use case or runtime is. Those details will be covered in separate specs when needed.
Proposed change¶
First, Cyborg needs to add extra REST APIs so that callers can invoke the programming service. The REST API should have the following format:
Url: {base_url}/fpga/{deployable_uuid}
Method: POST
URL Params:
None
Data Params:
glance_bitstream_uuid
Success Response:
POST:
Code: 200
Body: { "msg" : "bitstream has been loaded successfully"}
Error Response:
Code: 401 UNAUTHORIZED
Body: { "error" : "Log in" }
OR
Code: 422 Unprocessable Entity
Body: { "error" : "User is not authorized to use the resource" }
Sample Call:
To program the FPGA resource with deployable_uuid=2864a139-c2cd-4f9f-abf3-44eb3f09b83c
using the bitstream with uuid=0b955a5b-f5dd-49d0-8c4f-28729427d303:
$.ajax({
  url: "/fpga/2864a139-c2cd-4f9f-abf3-44eb3f09b83c",
  data: {
    "glance_bitstream_uuid": "0b955a5b-f5dd-49d0-8c4f-28729427d303"
  },
  dataType: "json",
  type: "post",
  success: function(r) {
    console.log(r);
  }
});
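For callers that are not running in a browser, the same request can be sketched in Python. This is only an illustration: the base URL is a placeholder, sending the parameter as a JSON body is an assumption (the jQuery call above sends it form-encoded by default), and authentication headers are omitted. The request is built but not sent.

```python
import json
import urllib.request

# Hypothetical base URL; the spec leaves {base_url} deployment-specific.
BASE_URL = "http://cyborg.example.com/v1"


def build_program_request(deployable_uuid, glance_bitstream_uuid, base_url=BASE_URL):
    """Build (but do not send) the POST request described above."""
    # Assumption: the data parameter travels as a JSON body.
    body = json.dumps({"glance_bitstream_uuid": glance_bitstream_uuid}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/fpga/{deployable_uuid}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_program_request(
    "2864a139-c2cd-4f9f-abf3-44eb3f09b83c",
    "0b955a5b-f5dd-49d0-8c4f-28729427d303",
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without a Cyborg endpoint.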
Second, implement the service in Cyborg, which performs three tasks:
1. Identify the host location of the requested FPGA/Partial Reconfiguration (PR) region (e.g. on which host the board is located).
2. Check whether the user (API caller, OpenStack login user, etc.) has the privilege to use the given bitstream, FPGA, or host.
3. If the previous checks pass, send the program notification to the target host that holds the requested FPGA.
Third, implement the notification callee in the Cyborg Agent. This should be an RPC call with the following signature:
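The three service tasks can be sketched as follows. This is a minimal illustration, not the Cyborg implementation: `ProgrammingService`, `ProgrammingError`, the in-memory lookup tables, and the `sent` list that stands in for the notification are all hypothetical names, and the privilege check here covers only the FPGA and the bitstream.

```python
from dataclasses import dataclass, field


class ProgrammingError(Exception):
    """Raised when a programming request cannot be dispatched."""


@dataclass
class ProgrammingService:
    # Hypothetical in-memory stand-ins for the Cyborg database, the policy
    # check, and the notification channel; none of these come from the spec.
    deployable_hosts: dict  # deployable_uuid -> host name
    permissions: set        # (user, resource_uuid) pairs the user may use
    sent: list = field(default_factory=list)

    def program(self, user, deployable_uuid, bitstream_uuid):
        # Task 1: identify the host of the requested FPGA / PR region.
        host = self.deployable_hosts.get(deployable_uuid)
        if host is None:
            raise ProgrammingError(f"unknown deployable {deployable_uuid}")

        # Task 2: check the caller's privileges on the FPGA and the bitstream.
        for resource in (deployable_uuid, bitstream_uuid):
            if (user, resource) not in self.permissions:
                raise ProgrammingError(f"{user} may not use {resource}")

        # Task 3: notify the target host (recorded here instead of sent).
        self.sent.append((host, deployable_uuid, bitstream_uuid))
        return host


service = ProgrammingService(
    deployable_hosts={"fpga-1": "compute-7"},
    permissions={("alice", "fpga-1"), ("alice", "bs-1")},
)
print(service.program("alice", "fpga-1", "bs-1"))
```

Running the checks before any notification is sent means a rejected request never reaches the target host.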
int program_fpga_with_bitstream(deployable_uuid, bitstream_uuid)
The function takes both deployable_uuid and bitstream_uuid as input. It uses deployable_uuid to identify which specific FPGA/PR region is going to be programmed, and uses bitstream_uuid to retrieve the bitstream from the bitstream storage service (Glance in the context of OpenStack). In addition, this is a synchronous call, meaning it waits for the programming task to complete and then returns a status code as an integer. The return code should be interpreted as follows:
code | meaning
0 | programmed successfully
1 | failed with unknown errors
2 | invalid deployable_uuid (target FPGA not found)
3 | invalid bitstream_uuid (bitstream cannot be downloaded)
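The return codes and the synchronous behaviour can be sketched as below. The enum member names are illustrative, and the two extra parameters (`devices` and `fetch_bitstream`) are hypothetical stand-ins for the vendor driver and the Glance download, added only so the example is self-contained; the spec's signature takes just the two UUIDs.

```python
from enum import IntEnum


class ProgramStatus(IntEnum):
    # Values follow the table above; the member names are illustrative.
    SUCCESS = 0
    UNKNOWN_ERROR = 1
    INVALID_DEPLOYABLE = 2
    INVALID_BITSTREAM = 3


def program_fpga_with_bitstream(deployable_uuid, bitstream_uuid, devices, fetch_bitstream):
    """Synchronously program one FPGA/PR region and return a status code.

    `devices` maps deployable_uuid to a callable that programs the device,
    and `fetch_bitstream` maps bitstream_uuid to bytes. Both are assumptions.
    """
    program = devices.get(deployable_uuid)
    if program is None:
        return int(ProgramStatus.INVALID_DEPLOYABLE)
    try:
        bitstream = fetch_bitstream(bitstream_uuid)
    except (KeyError, OSError):
        return int(ProgramStatus.INVALID_BITSTREAM)
    try:
        program(bitstream)  # blocks until the device has been programmed
    except Exception:
        return int(ProgramStatus.UNKNOWN_ERROR)
    return int(ProgramStatus.SUCCESS)


loaded = {}
devices = {"fpga-1": lambda bits: loaded.update({"fpga-1": bits})}
store = {"bs-1": b"\x00\x01"}
status = program_fpga_with_bitstream("fpga-1", "bs-1", devices, store.__getitem__)
print(status)
```

Returning plain integers keeps the values compatible with the `int` return type in the signature, while the enum gives callers readable names to compare against.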
Alternatives¶
Data model impact¶
REST API impact¶
A REST API will be added to the Cyborg service, as described above. It should not impact any of the existing REST APIs.
Security impact¶
The access to FPGA/PR region and bitstreams should be carefully checked.
Notifications impact¶
None
Other end user impact¶
None
Performance Impact¶
None
Other deployer impact¶
None
Developer impact¶
On the Cyborg Agent side, the service relies on the program() API implemented by the vendor.
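One way to picture the vendor-facing contract is an abstract base class that each vendor driver subclasses. This is a sketch under assumptions: the spec names only program(), so the class names, the parameters, and the boolean return value below are all hypothetical.

```python
from abc import ABC, abstractmethod


class FPGADriver(ABC):
    """Hypothetical base class a vendor driver would subclass."""

    @abstractmethod
    def program(self, device_path, bitstream_path):
        """Load the bitstream file onto the device; return True on success."""


class FakeVendorDriver(FPGADriver):
    """Test double that records calls instead of touching hardware."""

    def __init__(self):
        self.calls = []

    def program(self, device_path, bitstream_path):
        # Assumed arguments: a device node and a local bitstream file path.
        self.calls.append((device_path, bitstream_path))
        return True


driver = FakeVendorDriver()
print(driver.program("/dev/fpga0", "/tmp/example.bit"))
```

A recording double like `FakeVendorDriver` would let the agent-side call path be exercised in unit tests on machines without FPGA hardware.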
Implementation¶
Assignee(s)¶
- Primary assignee:
Li Liu <liliu1@huawei.com>
Work Items¶
Implement the Cyborg programming service REST API
Implement the Cyborg programming service
Implement the notification call in the Cyborg Agent, which invokes the vendor driver
Dependencies¶
Testing¶
Documentation Impact¶
The specs related to Cyborg-Nova interaction need to be aware that accelerators change when FPGAs are reprogrammed.
References¶
None