STAC Research Note: Optimizing TensorFlow Inference

For the STAC-ML™ Markets (Inference) benchmark, we developed a 64-bit TensorFlow version of our models for error comparison. Given TensorFlow's popularity as an ML framework, we were surprised by how slow it was for single inference. We suspected the poor performance might stem from something we were doing, or failing to do, so we decided to revisit TensorFlow performance and report on our findings.

We document the results of our exploration here, including what we tried, how we implemented it, and what performance improvements, if any, we saw. Optimizing TensorFlow yielded one to two orders of magnitude of improvement in some cases, but as you'll see, this was not the result of a single, consistent approach.


The use of machine learning (ML) to develop models is now commonplace in trading and investment. Whether the business imperative is reducing time to market for new algorithms, improving model quality, or reducing costs, financial firms must offload major aspects of model development to machines in order to remain competitive in the markets.