Run inference on the Edge TPU with C++

When running on a general-purpose OS (such as Linux), you can use the TensorFlow Lite C++ API to run inference, and you also need the Edge TPU Runtime library (libedgetpu) to delegate the model's Edge TPU ops to the Edge TPU. Additionally, you can use the Coral C++ library (libcoral), which provides extra APIs on top of the TensorFlow Lite library. (If you're running on a microcontroller system, you instead need to use coralmicro.)

The libcoral library is optional, but it provides a variety of convenience functions for boilerplate code that's otherwise required when executing models with the TensorFlow Lite API. It also provides APIs for model pipelining and on-device transfer learning with the Edge TPU.

New! Another way to run inference in C++ is with the TensorFlow Lite Task Library, which greatly simplifies your C++ code for common inferencing tasks. It requires just one extra line in your Bazel build target and one extra line in your code to get acceleration on the Edge TPU, and a complete classification, detection, or segmentation inference takes only about five lines of code. Check out these C++ vision task examples.
Note: If you want to use Python instead, read Run inference on the Edge TPU with Python.

Run an inference with the libcoral API

The libcoral C++ library wraps the TensorFlow Lite C++ API to simplify the setup for your tflite::Interpreter, process input and output tensors, and enable other features with the Edge TPU. But it does not hide the tflite::Interpreter, so the full power of the TensorFlow Lite API remains available to you.

For details on all the available APIs, see the libcoral API reference.

To use libcoral, you must compile the library with your project using Bazel (we do not offer a shared library). For build instructions, see the README in the libcoral source.

Inferencing example

Just to show how simple your code can be, the following code runs a classification model using the libcoral API:

#include <iostream>
#include <string>

#include "absl/flags/flag.h"
#include "absl/flags/parse.h"
#include "coral/classification/adapter.h"
#include "coral/examples/file_utils.h"
#include "coral/tflite_utils.h"
#include "glog/logging.h"

// Command-line flags for the model, input image, and labels.
ABSL_FLAG(std::string, model_path, "", "Path to the model compiled for the Edge TPU.");
ABSL_FLAG(std::string, image_path, "", "Path to the raw image data, preprocessed to the input tensor size.");
ABSL_FLAG(std::string, labels_path, "", "Path to the labels file.");

int main(int argc, char* argv[]) {
  absl::ParseCommandLine(argc, argv);

  // Load the model.
  const auto model = coral::LoadModelOrDie(absl::GetFlag(FLAGS_model_path));
  auto edgetpu_context = coral::ContainsEdgeTpuCustomOp(*model)
                             ? coral::GetEdgeTpuContextOrDie()
                             : nullptr;
  auto interpreter = coral::MakeEdgeTpuInterpreterOrDie(*model, edgetpu_context.get());
  CHECK_EQ(interpreter->AllocateTensors(), kTfLiteOk);

  // Read the image to input tensor.
  auto input = coral::MutableTensorData<char>(*interpreter->input_tensor(0));
  coral::ReadFileToOrDie(absl::GetFlag(FLAGS_image_path), input.data(), input.size());
  CHECK_EQ(interpreter->Invoke(), kTfLiteOk);

  // Read the label file.
  auto labels = coral::ReadLabelFile(absl::GetFlag(FLAGS_labels_path));

  for (auto result : coral::GetClassificationResults(*interpreter, 0.0f, /*top_k=*/3)) {
    std::cout << "---------------------------" << std::endl;
    std::cout << labels[result.id] << std::endl;
    std::cout << "Score: " << result.score << std::endl;
  }
  return 0;
}

For more examples using the libcoral API, see our examples on GitHub.

You can also find more example code and projects on our examples page.

For more advanced features using the libcoral API, also check out how to run multiple models with multiple Edge TPUs and pipeline a model with multiple Edge TPUs.
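For instance, running a different model on each of two Edge TPUs comes down to opening one EdgeTpuContext per device and binding each interpreter to its own context. Here's a minimal sketch of that idea, where model_a and model_b stand in for two tflite::FlatBufferModel objects you've already loaded (hypothetical names); see the pages linked above for complete, tested code:

// Find the attached Edge TPUs (this sketch assumes at least two).
const auto devices = edgetpu::EdgeTpuManager::GetSingleton()->EnumerateEdgeTpu();
CHECK_GE(devices.size(), 2);

// Open one context per physical Edge TPU.
auto context_a = edgetpu::EdgeTpuManager::GetSingleton()->OpenDevice(
    devices[0].type, devices[0].path);
auto context_b = edgetpu::EdgeTpuManager::GetSingleton()->OpenDevice(
    devices[1].type, devices[1].path);

// Bind each model's interpreter to its own Edge TPU.
auto interpreter_a = coral::MakeEdgeTpuInterpreterOrDie(*model_a, context_a.get());
auto interpreter_b = coral::MakeEdgeTpuInterpreterOrDie(*model_b, context_b.get());
CHECK_EQ(interpreter_a->AllocateTensors(), kTfLiteOk);
CHECK_EQ(interpreter_b->AllocateTensors(), kTfLiteOk);
// The two interpreters can now run concurrently, for example from separate threads.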

Set up the TF Lite Interpreter with libedgetpu

Before we released libcoral, the following type of code was required to use the Edge TPU with C++. So if you want to avoid the libcoral library, the following is essentially the implementation details of the coral::MakeEdgeTpuInterpreter() function. And once you have the tflite::Interpreter as shown below, the rest of your inferencing code is up to you, via the TensorFlow Lite C++ API.

For details about all the APIs in libedgetpu, read the libedgetpu API reference, but the basic usage requires the following:

  • EdgeTpuContext: This object represents an open connection to an Edge TPU. Usually you have just one Edge TPU to work with, so you can create the context with EdgeTpuManager::OpenDevice(). But it's possible to use multiple Edge TPUs, so OpenDevice() is overloaded to let you specify which Edge TPU you want to use (see the sketch after this list).

  • kCustomOp and RegisterCustomOp(): You need to pass these to tflite::ops::builtin::BuiltinOpResolver::AddCustom() in order for the tflite::Interpreter to understand how to execute the Edge TPU custom op inside your compiled model.
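If you have more than one Edge TPU attached, you can ask for a particular kind of device instead of whichever is found first. For example, here's a small sketch that opens a PCIe-attached Edge TPU specifically, assuming one is available (the device types defined in edgetpu.h are edgetpu::DeviceType::kApexPci and kApexUsb):

    // Open a specific kind of Edge TPU rather than the default device.
    std::shared_ptr<edgetpu::EdgeTpuContext> edgetpu_context =
        edgetpu::EdgeTpuManager::GetSingleton()->OpenDevice(
            edgetpu::DeviceType::kApexPci);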

In general, the code you need to write includes the following pieces:

  1. Load your compiled Edge TPU model as a FlatBufferModel:
    const std::string model_path = "/path/to/model_compiled_for_edgetpu.tflite";
    std::unique_ptr<tflite::FlatBufferModel> model =
        tflite::FlatBufferModel::BuildFromFile(model_path.c_str());
    

    This model is required below in tflite::InterpreterBuilder().

    For details about compiling a model, read TensorFlow models on the Edge TPU.

  2. Create the EdgeTpuContext object:
    std::shared_ptr<edgetpu::EdgeTpuContext> edgetpu_context =
        edgetpu::EdgeTpuManager::GetSingleton()->OpenDevice();
    

    This context is required below in tflite::Interpreter::SetExternalContext().

  3. Specify the Edge TPU custom op when you create the Interpreter object:
    std::unique_ptr<tflite::Interpreter> model_interpreter =
        BuildEdgeTpuInterpreter(*model, edgetpu_context.get());
    
    // Helper that registers the Edge TPU custom op and binds the Edge TPU context:
    std::unique_ptr<tflite::Interpreter> BuildEdgeTpuInterpreter(
        const tflite::FlatBufferModel& model,
        edgetpu::EdgeTpuContext* edgetpu_context) {
      tflite::ops::builtin::BuiltinOpResolver resolver;
      resolver.AddCustom(edgetpu::kCustomOp, edgetpu::RegisterCustomOp());
      std::unique_ptr<tflite::Interpreter> interpreter;
      if (tflite::InterpreterBuilder(model, resolver)(&interpreter) != kTfLiteOk) {
        std::cerr << "Failed to build interpreter." << std::endl;
      }
      // Bind given context with interpreter.
      interpreter->SetExternalContext(kTfLiteEdgeTpuContext, edgetpu_context);
      interpreter->SetNumThreads(1);
      if (interpreter->AllocateTensors() != kTfLiteOk) {
        std::cerr << "Failed to allocate tensors." << std::endl;
      }
      return interpreter;
    }
    
  4. Then use the Interpreter (the model_interpreter above) to execute inferences with the TensorFlow Lite APIs. The main step is to call tflite::Interpreter::Invoke(), though you also need to prepare the input and then interpret the output, as shown in the sketch below. For more information, see the TensorFlow Lite documentation.
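For example, with a typical quantized model you copy the preprocessed input bytes into the interpreter's input tensor, call Invoke(), and then read the scores from the output tensor. Here's a minimal sketch using the standard TensorFlow Lite APIs; it assumes a uint8-quantized classification model and an image_data buffer (a hypothetical name) that's already been resized to match the input tensor:

    // Copy the preprocessed image into the (uint8-quantized) input tensor.
    uint8_t* input = model_interpreter->typed_input_tensor<uint8_t>(0);
    std::memcpy(input, image_data.data(), image_data.size());

    // Run the inference (this is where the Edge TPU executes the model).
    if (model_interpreter->Invoke() != kTfLiteOk) {
      std::cerr << "Failed to invoke interpreter." << std::endl;
    }

    // Read the raw scores from the (uint8-quantized) output tensor.
    const uint8_t* scores = model_interpreter->typed_output_tensor<uint8_t>(0);
    const int num_classes = model_interpreter->output_tensor(0)->dims->data[1];
    for (int i = 0; i < num_classes; ++i) {
      std::cout << "Class " << i << ": " << static_cast<int>(scores[i]) << std::endl;
    }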

Build your project with libedgetpu

To build your project using only libedgetpu and TensorFlow Lite (not using libcoral), you can either link your code with libedgetpu statically or dynamically.

Note that static linking requires that you build your project with Bazel. To link statically:

  1. Open your project's Bazel WORKSPACE file and add the libedgetpu repository as an http_archive.

  2. Optionally, specify the TENSORFLOW_COMMIT version in your WORKSPACE and pass this value as a dependency when building libedgetpu. Otherwise, add an empty libedgetpu_dependencies() call in your WORKSPACE and rely on the default settings of libedgetpu.

  3. To pull in libedgetpu for your binary, include a dependency on oss_libedgetpu_direct_all.

To link dynamically:

  1. Include the edgetpu.h or edgetpu_c.h file in your project.

  2. Link to the libedgetpu.so file. You should have installed this library during device setup, but you can also build it yourself.

  3. Clone the TensorFlow repo at the TENSORFLOW_COMMIT version specified here; that's the version used to build the libedgetpu.so library, so your TensorFlow version must match. Then build TensorFlow Lite (libtensorflow-lite.a) and link to that as well.

For example code, check out classify.cc and the corresponding Makefile.