Snowflake customers eke out early gains from Gen AI applications

Much of the debate over artificial intelligence (AI) in the enterprise, especially the generative type of AI (Gen AI), is focused on statistics, such as the number of projects in development or the projected cost savings of automation. The benefits are still very much hypothetical.

To cut through some of the stats, and the theory, it can be useful to listen to Gen AI users, as I did during a dinner hosted in New York last week by data warehouse vendor Snowflake.

The company invited prominent customers to speak about their experiences putting AI applications into production.

The overall impression was that there are meaningful use cases for AI, including document search, which can start delivering benefits in six months or less after implementation.

The conversations were anecdotal, and Snowflake has an interest in highlighting its customers' best-case scenarios to promote its cloud data warehouse services.

With that caveat in mind, the thoughtful comments by both customers suggest that companies can create value by taking the plunge into AI with even very simple use cases, after only days, weeks, or months in production.

Thomas Bodenski, chief operating officer and head of analytics for TS Imagine, which sells a cloud-based securities trading platform, described how it would traditionally take 4,000 "man hours" of labor at his company to have people read through emails for crucial, actionable events.

"I get mail, every year, 100,000 times, somebody that we buy data from, telling me that in three months, we're making a change," explained Bodenski. "If I'm not ready for this, there's 500 clients that will be down," meaning they will be unable to trade, he said. "So, it's very critical that you read every single email that comes in."

Bodenski continued: "That email comes in, I have to classify it, I have to understand it, I have to delegate it to the right people, across different departments, to action it on it -- that task costs me 4,000 hours a year."

That task has traditionally been the role of "a team around the globe" he oversees. There are at least two and a half "full-time equivalent" individuals, he said, "and they have to be, like, smart people."

Bodenski said: "Now, I'm doing it at 3% of the cost of the people that would do that work," using a generative AI application.

"Just do the math," said Bodenski. "You take the average salary and then calculate how much you spend on Snowflake, and that's just 3% of that cost."

This email-reading program was the first app that TS Imagine built with Snowflake's help, said Bodenski. It was built using Meta Platforms' open-source Llama large language models and Snowflake's open-source alternative, Arctic. The app pairs those large language models with retrieval-augmented generation (RAG), in which the model taps into an external database.

The app "took six months of trial and error learning," said Bodenski. That process began before TS Imagine had a relationship with Snowflake.

"Then Snowflake introduced Cortex AI," the managed LLM inference service run by Snowflake, "we migrated the entire RAG pipeline over in four days, and now we are able to conceptualize a different story."

The Cortex AI service allowed Bodenski to classify incoming customer emails for sensitivity, urgency, and other parameters, something that would not have been possible before "because I don't, like, you know, read all 5,000 customer emails coming in every month," he said.

With classification, Bodenski said the result is that "I detect the brushfire before it even becomes a fire," meaning a customer mishap. "It is reliable, I have no problems, I don't miss a single email."

TS Imagine now has six apps up and running using Gen AI, said Bodenski, "and I'm going to do much more. AI is going to continue to build our brains," he said: "It works."

Snowflake customer S&P Global Market Intelligence had a similar experience, according to Daniel Sanberg, the head of "quantamental research" for the firm, who was also a guest at the dinner.

Sanberg's company implemented an in-house application called Spark Assist on top of its Microsoft Office apps. Now, the firm can auto-generate email summaries.

"The Gen AI is smart enough to know which ones are most relevant that need my immediate attention versus those that maybe need to be de-prioritized, and I just say [to the AI model], 'Go ahead and write a response to these,' and then I spot-check them."

The app is used by 14,000 employees at S&P Global, said Sanberg. "I don't think I could go back," he said, referring to the old way of trying to sort and sift email manually.

But does the return on investment of such apps justify the cost of building apps and the cost of inference? "I would say, finger to the wind, yes," said Sanberg, although he added: "I think we're still sizing a lot of these things."

Sanberg continued: "The question is, in aggregate, what does that payoff look like? That's TBD. But in individual instances, sure; things that used to take days or longer to compile can now be done within a day [using Gen AI]."

He compared Gen AI to the early days of the internet when dial-up speeds hampered the payoff for the average user.

"If we're sitting here and have to wait 15 minutes to log on" to the internet via dial-up modem, "is it really worth it?" Sanberg remarked.

"But, it's not where we are now," he said. "It's where we'll be in five years; I think a lot of this stuff will get sorted."

Snowflake's head of AI, Baris Gultekin, was also at the dinner and said Gen AI can already offer better economics for automating some tasks.

"Cortex Analyst is this product that allows someone to ask a question, to get answers from the data, instantly," he explained.

"The current pricing for 1,000 questions is $200, so, 20¢ a question. This is a question that otherwise would have to be answered by a [human] analyst. They would write the SQL [database] query for every single one of them. So, imagine 1,000 SQL queries. Each one takes, let's say, 10 minutes. You can see the ROI: 10 minutes a question, 1,000 questions, versus $200."
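
Gultekin's comparison can be worked through as a short calculation. The sketch below uses only the figures he quoted ($200 per 1,000 questions, 10 minutes per hand-written query); the function names are illustrative, and the break-even hourly rate is a derived number, not something stated at the dinner.

```python
# Figures quoted by Gultekin in the article.
QUESTIONS = 1_000
PRICE_FOR_ALL_QUESTIONS_USD = 200.0
MINUTES_PER_MANUAL_QUERY = 10


def cost_per_question(total_price_usd: float, questions: int) -> float:
    """Price of a single automated answer."""
    return total_price_usd / questions


def analyst_hours(questions: int, minutes_each: float) -> float:
    """Hours a human analyst would spend writing every query by hand."""
    return questions * minutes_each / 60


def break_even_hourly_rate(total_price_usd: float, hours: float) -> float:
    """Hourly labor cost at which the manual and automated costs are equal."""
    return total_price_usd / hours


per_question = cost_per_question(PRICE_FOR_ALL_QUESTIONS_USD, QUESTIONS)
hours = analyst_hours(QUESTIONS, MINUTES_PER_MANUAL_QUERY)
rate = break_even_hourly_rate(PRICE_FOR_ALL_QUESTIONS_USD, hours)

print(f"Cost per question: ${per_question:.2f}")
print(f"Analyst time replaced: {hours:.1f} hours")
print(f"Break-even analyst rate: ${rate:.2f} per hour")
```

On these numbers, 1,000 questions represent roughly 167 analyst hours, so the automated route is cheaper whenever an analyst's time costs more than about $1.20 an hour. The comparison ignores the time needed to check the answers, which the article does not quantify.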

Of course, twenty cents here and twenty cents there can add up, said Chris Child, vice president of worldwide sales engineering for Snowflake, a guest at the dinner. The key thing, he said, is for enterprises to be able to forecast how costs will add up as inferencing begins.

"In most cases, people have set aside a budget," Child said. "They're thinking of grand things, and it's much more about, 'How do I understand how much is it going to cost me over a series of months, and how do I know when it's trending higher than that?'"

His suggestion: "Try it, run it once, see, and then estimate what you're going to need to do it at scale."

Child continued: "The cost of testing a hypothesis is high" when people do the work. By contrast, "If I'm going to spend $1,000 to run a first test case, it's still expensive, but it's dramatically cheaper" than using people to test the same hypothesis.
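
Child's advice, to run once, extrapolate, and then watch whether spending is trending above budget, might be sketched as follows. This is a minimal illustration that assumes cost scales linearly with workload and that spending is spread evenly across a month; the $1,000 pilot figure is Child's, while the item counts, budget, and function names are hypothetical.

```python
def estimate_cost_at_scale(pilot_cost_usd: float, pilot_items: int,
                           target_items: int) -> float:
    """Linearly extrapolate the cost of one pilot run to a larger workload."""
    if pilot_items <= 0:
        raise ValueError("pilot_items must be positive")
    return pilot_cost_usd / pilot_items * target_items


def projected_month_spend(spend_so_far_usd: float, days_elapsed: int,
                          days_in_month: int) -> float:
    """Project month-end spend from the average daily spend so far."""
    if days_elapsed <= 0:
        raise ValueError("days_elapsed must be positive")
    return spend_so_far_usd / days_elapsed * days_in_month


def is_trending_over(spend_so_far_usd: float, days_elapsed: int,
                     days_in_month: int, budget_usd: float) -> bool:
    """True when the projected month-end spend exceeds the budget."""
    projected = projected_month_spend(spend_so_far_usd, days_elapsed,
                                      days_in_month)
    return projected > budget_usd


# Hypothetical scenario: a $1,000 pilot over 500 items, scaled to 5,000 items.
at_scale = estimate_cost_at_scale(1_000.0, 500, 5_000)
# Hypothetical month: $4,000 spent after 10 of 30 days against a $10,000 budget.
over = is_trending_over(4_000.0, 10, 30, 10_000.0)

print(f"Estimated cost at scale: ${at_scale:,.0f}")
print(f"Trending over budget: {over}")
```

A linear projection is the simplest possible model; real inference costs may not scale evenly with volume, so the estimate is a starting point to be revised as actual usage data comes in.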

When S&P Global put together an app using Snowflake for its clients, the tool aimed to sort through 10 years of quarterly financial filings issued by companies in the Russell 3000, the index of investible US companies. At about 12,000 filings a year, that comes to a total of 120,000 documents.

"The first thing we did when we got on the platform was write a script that helped us calculate the cost before we pushed run, and we were able to do that," said Sanberg.
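
Sanberg did not describe how that script worked. As a rough illustration of the idea, a pre-run estimate could multiply the number of documents by an assumed average size and an assumed unit price, then refuse to start if the total exceeds a budget. Only the 120,000-document count comes from the article; the token counts, the per-token price, the budget, and all names below are hypothetical, and Snowflake's actual billing units may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunPlan:
    documents: int                  # number of documents to process
    avg_tokens_per_document: int    # hypothetical average document size
    usd_per_million_tokens: float   # hypothetical unit price


def estimate_run_cost(plan: RunPlan) -> float:
    """Estimate the total cost of a batch run before starting it."""
    total_tokens = plan.documents * plan.avg_tokens_per_document
    return total_tokens / 1_000_000 * plan.usd_per_million_tokens


def approve_run(plan: RunPlan, budget_usd: float) -> bool:
    """Allow the run only if the estimate fits within the budget."""
    return estimate_run_cost(plan) <= budget_usd


# 120,000 documents is from the article; the other values are placeholders.
plan = RunPlan(documents=120_000,
               avg_tokens_per_document=10_000,
               usd_per_million_tokens=1.0)

estimate = estimate_run_cost(plan)
print(f"Estimated cost: ${estimate:,.2f}")
print(f"Within a $2,000 budget: {approve_run(plan, 2_000.0)}")
```

Swapping in measured document sizes from a small sample, and the provider's published rates, would turn the placeholders into a usable estimate.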

"I like the consumption-based model," he said, referring to Snowflake's practice of billing customers for the actual time used rather than charging for a traditional software license, "because there's transparency in the pricing, because there's, in my opinion, fairness across the board."

TS Imagine's Bodenski said the flexible pricing for running inference in Cortex has suited his needs.

"I can run a process where I'm okay to wait three minutes for each prompt, but I can also run a process where it's not okay to wait three minutes, I want it in five seconds," he explained.

"And I make the decision on the fly, just by increasing something from extra-small to medium," referring to the scale of compute.

Bodenski said the app used by TS Imagine to hunt through emails showed its worth quickly. "We saw the impact, actually, four days after we designed it," he said, "because it surfaced those items that we needed to focus on, and it improves our customer service quality."

The app has now been in production for four months. "It is very, very important for us," he said. "It elevates me to detect an item that I should be involved in, or my regional manager, my global head," said Bodenski.

"It runs automated, it produces results, we're catching items" that might have taken weeks otherwise to receive a response in email, "and I didn't have to hire a single person or reallocate a single person to do that process."
