AI STAC, 30 November 2023, NYC
The only conference focusing on the AI infrastructure needs of finance—from banking and insurance to trading and investment.
At the request of many senior financial technologists in the STAC Benchmark Council comes an event dedicated to the solution stacks needed to support the finance industry’s rapid adoption of AI—from software frameworks to chips, servers, networks, and storage. Come to discuss infrastructure for model and data exploration, training, and inference with deep neural networks (especially LLMs) and statistical ML.
AGENDA
The big picture: What the rise of LLMs means for financial technology
The surprising capabilities of today's LLMs were a wake-up call to C-suites across the finance industry. This new attention has unlocked huge business demand for AI while heightening expectations. IT groups are now asked to support AI in everything from document search, summarization, and preparation to number-crunching tasks like credit assessment and wealth strategies. We'll begin the day's discussion by exploring big-picture questions critical to financial technology. What new expectations has the rise of AI placed on technology organizations? What is the impact of AI workloads on infrastructure requirements? What do these changes mean for the priorities and skillsets of IT teams? Our panel of senior technologists will discuss what today's AI means for the business, how technologists can best support it, and the implications for software/hardware/cloud infrastructure.
Choosing AI platforms: Finding solid footing in shifting sands
The software ecosystem for AI research, training, and inference is advancing so fast that technology choices are often outdated even before purchase orders can be approved. New tools and patterns for administering model-training pipelines and managing AI-powered applications arrive in a constant stream, each claiming to surpass its predecessor. How can technologists select, deploy, and maintain an AI software ecosystem when the sands are shifting beneath their feet? How should they balance today's immediate demands with long-term flexibility? Can IT provide a stable interface to users by separating interactions from functions (one such decoupling pattern is sketched after the presentation list below)? Are any of the current platform options ready to evolve with an enterprise’s needs? Our panel of experts will explore these questions and yours. To kick off, there were two brief presentations:
"Adapting to the Speed of Change: Making Informed Platform Choices in a Fragmented AI Software Ecosystem" Allen Holmes Jr., AI Business Lead, Lenovo

"Open Technologies Driving Model Portability and Scalability" Bob Gaines, HPC/AI Technical Solutions Architect, Intel Corporation
Scaling Machine Learning at Bloomberg
For 15 years, Bloomberg has invested heavily in three specialized areas of AI: natural language processing, information retrieval and search, and core machine learning, including deep learning. Over this time, they've solved many challenges in the ML pipeline, including dataset orchestration, infrastructure abstraction, and automated deployment of inference services. Ania and David will introduce Bloomberg's take on the model development life cycle and how they've built their internal Data Science Platform to let the company's AI research scientists and engineers run many large experiments quickly. They'll explain how they've made it easy to orchestrate ML jobs on multiple datasets, train and fine-tune models, and evaluate the results, all without practitioners worrying about managing the underlying hardware or software. Finally, they'll cover KServe, an open-source project that Bloomberg engineers have helped develop, which lets them deploy trained models as inference services in seconds with one click. Don’t miss this chance to learn from Bloomberg’s experiences building a high-functioning ML ecosystem.
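KServe describes a model deployment declaratively as a Kubernetes custom resource. For a rough flavor of what "deploy in seconds" looks like, here is a minimal sketch using KServe's public Python SDK (not Bloomberg's internal tooling); the service name, namespace, and storage URI are hypothetical, and a cluster with KServe installed is assumed.

```python
from kubernetes.client import V1ObjectMeta
from kserve import (
    KServeClient,
    V1beta1InferenceService,
    V1beta1InferenceServiceSpec,
    V1beta1PredictorSpec,
    V1beta1SKLearnSpec,
    constants,
)

# Declarative description of an inference service; KServe provisions
# the serving pods, autoscaling, and HTTP endpoint from this spec.
isvc = V1beta1InferenceService(
    api_version=constants.KSERVE_V1BETA1,
    kind=constants.KSERVE_KIND,
    metadata=V1ObjectMeta(name="credit-scorer", namespace="ml-serving"),
    spec=V1beta1InferenceServiceSpec(
        predictor=V1beta1PredictorSpec(
            sklearn=V1beta1SKLearnSpec(
                # Hypothetical artifact location; any supported URI works.
                storage_uri="s3://models/credit-scorer/v3",
            )
        )
    ),
)

KServeClient().create(isvc)  # the model is reachable at a stable URL shortly after
```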
From lakehouse to skyscraper: Building the right foundation for your AI infrastructure
AI research benefits from massive data sets—the more diverse and comprehensive, the better. And AI in production creates vast new troves of information. How can data engineers scale from hundreds of terabytes to hundreds of petabytes, even as the number of users and their performance demands increase? Keith will explore how data engineers can scale both capacity and performance by architecting and implementing a modern AI lakehouse while establishing a strong foundation for further growth. He'll cover best practices for unstructured and structured data, tools for explainability, and integration with industry-standard MLOps tools. He'll also discuss what petabyte scale requires from storage, networking, and management software. Bring your questions for Keith as he covers building a solid data foundation.
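For a concrete sense of the access pattern a lakehouse must serve, training and feature jobs typically scan columnar tables in bounded batches straight from object storage rather than staging full local copies. A minimal sketch with pyarrow, assuming a hypothetical Parquet table on S3:

```python
import pyarrow.dataset as ds

# Lakehouse tables are commonly Parquet on object storage; scanning in
# record batches keeps memory flat no matter how large the table grows.
dataset = ds.dataset("s3://lake/trades/year=2023/", format="parquet")

for batch in dataset.to_batches(columns=["symbol", "price", "size"],
                                batch_size=1_000_000):
    features = batch.to_pandas()  # hand off one bounded chunk at a time
    # ... feature engineering / training step goes here ...
```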
Leaping over the memory wall: Efficient data access for model training
When it comes to model training, compute architectures seem to get all the attention. But that obscures a stubborn fact: training often consumes far more data than can fit in memory, and in many cases it creates large amounts of data as well. Data architectures can therefore have as much impact as compute on research productivity. What do storage architects need to consider when designing solutions that meet immediate needs and account for longer-term sustainability? William will walk through what is needed to balance current and future needs. Along the way, he'll address a variety of data storage architectures and examine their impact on data accessibility, scalability, research collaboration, and the speed of innovation. He'll also cover strategies for efficient data retrieval, preprocessing, and integration into training pipelines that help maximize storage utilization and model performance.
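One common pattern for getting past the memory wall is to stream shards from storage and let parallel loader workers overlap I/O with compute. A minimal PyTorch sketch, with the shard layout hypothetical:

```python
import torch
from torch.utils.data import IterableDataset, DataLoader, get_worker_info

class ShardStream(IterableDataset):
    """Streams samples shard by shard so the full dataset never has to
    fit in memory; shards are split across loader workers."""
    def __init__(self, shard_paths):
        self.shard_paths = shard_paths

    def __iter__(self):
        info = get_worker_info()
        # Give each worker its own slice of shards to avoid duplicates.
        paths = (self.shard_paths if info is None
                 else self.shard_paths[info.id::info.num_workers])
        for path in paths:
            for sample in torch.load(path):  # one shard in memory at a time
                yield sample

# Hypothetical shard layout; workers and prefetching hide storage latency.
loader = DataLoader(
    ShardStream([f"/data/shards/{i:05d}.pt" for i in range(1024)]),
    batch_size=256,
    num_workers=8,        # parallel readers overlap I/O with compute
    prefetch_factor=4,    # keep batches queued ahead of the GPU
    pin_memory=True,      # faster host-to-device copies
)
```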
Not a brave new world: The reassuring IT alignment of AI and quant finance
LLMs and other GenAI have overtaken the tech agendas of many financial firms. OpenAI and other AI-as-a-service providers have set expectations for what is possible, but outsourcing is rarely an option due to the sensitive nature of FSI use cases. Firms must figure out how to produce GenAI models that offer comparable functionality using internal infrastructure. Does this require a new computing paradigm? Malcolm thinks not. In his view, AI infrastructure requirements are highly aligned with those of financial HPC. He’ll walk us through how a strategic combination of software and hardware can deliver a flexible infrastructure suitable for both traditional quantitative analysis and AI deep learning. Join us to hear Malcolm’s perspective and bring along your questions about AI research infrastructure.
Don't drag on RAG: How to keep data flowing in a critical new workload
Retrieval-augmented generation (RAG) is a popular technique for making responses from an LLM more current, accurate, and traceable without retraining the model. The architecture retrieves documents and database records related to a user's prompt and injects them into the prompt chain, eliciting far better responses. While RAG reduces hallucinations and the costs of frequent retraining, it presents technical challenges, many of which concern the data infrastructure. Architects must choose, plan for, and tune key components while figuring out how to keep the RAG system hydrated with data in a very bursty environment. Come join the discussion as our panelists offer thoughts on how to approach these emerging challenges (a toy sketch of the basic retrieve-and-augment loop follows the presentation list below). To kick off, there were three brief presentations:
"Workload Analysis for Generative AI" Mike Bloom, AI Solution Architect, WEKA

"Enable Iteration and Discovery" Victor Alberto Olmedo, Global Staff FSA, Pure Storage

"How kdb.ai can unlock the power of your data" Ian O’Dwyer, Senior Sales Engineer, KX
The whole point: Making sure AI works in the real world
Firms don't research, train, and fine-tune models for amusement. If a model doesn't make it to production—performing inference at speed and scale—then it's all for naught. But getting there is a challenge, whether your organization developed the model or sourced it from a third party. How can you ensure that the inference infrastructure has enough capacity to meet SLAs? Given how expensive that infrastructure can be, how can you ensure it is fully utilized? How should you monitor and manage the system? How should you monitor the models to ensure they operate as expected and inform future retraining? What new optimization and deployment challenges do LLMs present (beyond potential prompt augmentation, discussed in the previous panel)? Our panel of experts will offer their views and take your questions.
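SLA monitoring often starts with something as simple as percentile tracking around every inference call. A minimal sketch using only the standard library; the model call and the 50 ms budget are placeholders:

```python
import time
import statistics
from collections import deque

latencies_ms = deque(maxlen=10_000)  # rolling window of recent requests

def timed_inference(model, request):
    """Wrap every inference call so SLA metrics come for free."""
    start = time.perf_counter()
    response = model(request)  # placeholder for the real inference call
    latencies_ms.append((time.perf_counter() - start) * 1000)
    return response

def check_sla(p99_budget_ms: float = 50.0) -> bool:
    """Alert (or scale out) when the p99 drifts past budget."""
    if len(latencies_ms) < 100:
        return True  # not enough data to judge yet
    p99 = statistics.quantiles(latencies_ms, n=100)[98]
    return p99 <= p99_budget_ms
```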
Customer-driven benchmarks for ML & AI
The STAC Benchmark Council has spent the last 15 years developing benchmark standards for engineering challenges that are highly strategic to financial firms. Peter and Bishop will discuss how the Council is applying its approach to areas such as LLMs and real-time LSTM inference in order to produce benchmarks that financial technologists find useful for technology selection and architectural planning.
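This is not a STAC benchmark, but to give a flavor of what measuring real-time LSTM inference involves, a toy harness might time single-request latency like this (model size and sequence length are arbitrary):

```python
import time
import torch

# Toy stand-in: a small LSTM like those used for time-series inference.
model = torch.nn.LSTM(input_size=32, hidden_size=64, num_layers=2)
model.eval()

x = torch.randn(100, 1, 32)  # 100 time steps, batch of one request

with torch.no_grad():
    for _ in range(50):      # warm-up so caches and allocators settle
        model(x)
    samples = []
    for _ in range(1000):
        t0 = time.perf_counter()
        model(x)
        samples.append((time.perf_counter() - t0) * 1e6)

samples.sort()
print(f"median: {samples[len(samples) // 2]:.0f} us, "
      f"p99: {samples[int(len(samples) * 0.99)]:.0f} us")
```

Real benchmark standards add controlled hardware configurations, defined workloads, and audited results; the point here is only that tail latency, not the mean, is what real-time inference SLAs hinge on.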
About STAC Events & Meetings
STAC events bring together CTOs and other industry leaders responsible for solution architecture, infrastructure engineering, application development, machine learning/deep learning engineering, data engineering, and operational intelligence to discuss important technical challenges in finance.
Speakers
Scott McKenzie, Optiver
David Rukshin, WorldQuant
Mike Beller, Ustreet.ai
Ania Musial, Bloomberg
David Eis, Bloomberg
Roger Burkhardt, LTX Trading
Ambika Sukla, NLMatics
Gary Bhattacharjee, Infosys
Keith Pijanowski, MinIO
William Beaudin, DDN
Allen Holmes Jr., Lenovo
Balaji Srinivasan, Intel Corporation
Malcolm Demayo, NVIDIA
Robert Magno, Run:ai
Dave Weber, Lenovo
Bob Gaines, Intel Corporation
Michael McGuirk, AMD
Mike Bloom, WEKA
Ugur Tigli, MinIO
Victor Alberto Olmedo, Pure Storage
Conor Twomey, KX
Felix Winterstein, Xelera