Episode
202

Why Companies Spend Millions on Data (And Still Get No Value)

February 27, 2026
1h 10mins

Max Clark is joined by Ed Bailey, Field CISO at Cribl, to discuss why many companies invest heavily in collecting data but struggle to turn it into meaningful outcomes. They explore common mistakes in data strategy, the gap between tools and skills, and what organizations should consider before collecting more data.

What you’ll learn:

  • Why companies are spending millions collecting data without clear outcomes
  • The gap between data collection and real decision-making
  • How tools, skills, and strategy need to align
  • Why collecting more data doesn’t automatically create value
  • Questions leaders should ask before expanding their data footprint
  • How to rethink data strategy to support real business results

Guest: Ed Bailey, Field CISO at Cribl

Cribl Website: https://cribl.io

Ed LinkedIn: https://www.linkedin.com/in/baileyedward/

Cribl LinkedIn: https://www.linkedin.com/company/cribl/

Cribl X: https://x.com/cribl_io

Transcript

Max Clark (00:05.97)
The usual kind of rap: if you've got something that makes noise, just turn it off. Teams, Slack, Outlook, Workspace, calendar, cell phone, dog.

Ed Bailey (00:21.394)
Yeah, that's what I'm doing.

Max Clark (00:26.424)
Got a bunch of things right now I'm signing out of.

Ed Bailey (00:28.178)
So many things that make noise.

Max Clark (00:31.255)
Yeah, my dog can't hear anymore, so doorbells aren't going to set her off.

Ed Bailey (00:38.142)
Yeah, totally. It should be good. I use Krisp, so it should keep the noise pretty isolated.

Max Clark (00:47.246)
You're welcome to do some push-ups or burpees, get the blood flowing, whatever you need.

Ed Bailey (00:54.3)
Yeah.

Max Clark (00:57.526)
I'm going to start that when I'm doing it in person: make everybody do five push-ups before we get going.

Ed Bailey (01:05.34)
There we go.

Max Clark (01:11.054)
Any questions, comments, concerns we should talk about really quickly? Okay, cool.

Ed Bailey (01:13.832)
Nah, this is a fun conversation. Just the one thing, and this is what I recommended to the PR person: Aflac is an important customer, but they are not what I would call a large customer. So this is what I want to introduce. We have an interesting, fairly recent story, from about the past six to eight months, with a very, very large global insurance company that I also want to share. It'll just have to be anonymous. I can't share their name, but the use cases are really interesting.

Max Clark (01:43.15)
Okay. So, before I kick off here, let me tell you, you really only have two responsibilities in this entire thing. The first one is to talk. Literally, if you feel like you're talking too much, talk more. Like, verbatim panic. The second one is on the right side of your screen, up here somewhere.

Ed Bailey (01:54.728)
I could do that.

Max Clark (02:12.494)
If I'm doing this right, or do I have to do it inverted? You're going to see a little icon. For me, it's purplish with a little number and an up arrow. So the way Riverside works is we get a video and audio stream between the two of us, but it is just for us to see and talk to each other. The actual recording is taking place locally on your computer. So you have an isolated recording in the Chrome browser. So when we're done and I press stop,

you gotta wait until that thing says 100% for me. That's your high-quality upload. So if the video goes crazy, if the audio goes crazy, don't worry about it. Just keep going, and try not to be distracted by it too much. Yeah, Tuesday we had one and something was going on with his computer's network connection. It took a crazy long time for him to finish the upload. It was creeping along at like 1% every 5 to 10 minutes. But that's...

Ed Bailey (02:52.114)
Alright.

Max Clark (03:10.498)
That's been an outlier that, I could say, has happened in a whopping half a percent of all of these that I've done. So don't anticipate any issues.

Ed Bailey (03:17.542)
Yeah, I used another platform that is very similar, and it's incredible how bad internet connections can be. It's unreal. That's why it's so important, because with the uploads, it's so bad.

Max Clark (03:29.906)
Yeah. I'm spoiled. When we were looking at houses five years ago, I made it a point to... you know, I tell everybody the same thing: before you sign a lease or buy a place or do whatever, like with our enterprise customers, just talk to us first. Please just talk to us, and we can save you a lot of pain and aggravation, guaranteed, with a five-minute desktop survey. Anyway, I did the same thing for myself and I made sure to find a place. It was a...

I'm waiting for the GPON to get upgraded. I would really like, you know, five gig at my house, but I'm crying right now with one gig symmetric at my house. You know, it's so miserable.

Ed Bailey (04:11.528)
Yeah, that's what I have too and it's awesome. Yeah, I'm on the list for the 5 gig upgrade. So I'm

Max Clark (04:16.758)
Yeah. Yeah. Fingers crossed. We'll see. I mean, it's coming. It's coming. So I was thinking about this, and I was trying to come up with a good place to start from. I'm going to leave this off and just ask in terms of for people that are not involved in any sort of log or data pipeline.

You know, Splunk, when it hit the market, I can remember when Splunk came out and it was like, this is the most amazing thing that's ever happened. And then of course what we've seen since then is just, hey, if we can record data, let's go record data. And this has now created data lakes and all these other things. And then of course you turn around and you're like, okay, we have a lot of data. Now what do we do with it? And it would be really helpful... You say what?

Ed Bailey (05:10.74)
Can I make a suggestion on that? Can I make a suggestion on a starting point for that? Yeah, so this is something that I have a lot of conversations with executives about, because this is the first question I ask them: my guess is you're spending between five and ten million dollars a year collecting data and storing data. What are you doing with the data? And oftentimes they don't have a very good answer.

Max Clark (05:15.734)
No, we've started, so go ahead and make a suggestion.

Ed Bailey (05:35.462)
Certainly not an answer that justifies a $10 million investment. And this is the average for medium-sized enterprises; large enterprises spend exponentially more. And it's the same answer. What are you doing with all this data? More often than not, not a lot.

Max Clark (05:52.312)
I mean, because this is the thing right now. It's just, we don't know. I feel like the underwear gnomes. First step, get the data. Second step, we don't know what to do with the data. Third step, profit from the data. There's just this nebulous "collect everything with a vacuum cleaner."

Ed Bailey (06:08.624)
Right. And I've got to argue that unless you have a program to do something with the data, what's the point of collecting the data?

Max Clark (06:15.63)
So how does this... I mean, we get into acronym soup really quickly with a lot of the different products in the market. Probably the biggest one that most people see, and see a budget line for, is of course their SIEM platform, right? You had centralized logging resources that were then pushing into the SIEM market, or SIEMs that are kind of straddling the fence, and everything kind of wants to do everything at some point. But the average cost to ingest data into a SIEM, you know, also gets...

I mean, people very quickly realize that maybe it's not the best plan for them to shove everything they can into their SIEM. So what I'm interested in and curious about is, with all these different data pipelines flowing around everywhere, and log collections and log sources, from a starting point, what is all this data being used for that's actually productive, or actually giving some sort of intelligence back to the enterprise that the enterprise can do something with?

Ed Bailey (07:14.152)
Right, so let me answer that question, and I'll follow up with the direction I think companies should be taking. This is something I routinely do. We start working with a customer and we start taking a look at: what data are you collecting? What are you doing with the data? And then also, what data are you not even touching? And that third element is where it gets depressing.

For the typical SIEM, somewhere between 25 and 40% of the data in the SIEM is directly related to detections. Detections in the SIEM are the most valuable use case. That's how the enterprise says, there's a bad guy, there might be a problem, or something's wrong. And so there's a very clear correlation there between data and value, because of that perceived element of: we are avoiding risk, we're protecting the enterprise. So there's a pretty clear business use case. Now the other

50 to 60% of the data is where it gets more interesting. Most enterprises are basically doing all their security functions in the SIEM. They're doing their incident response, their threat hunting, their research, their compliance, and their retention, all in the SIEM. And as you mentioned, it's a premium platform. It costs a lot to get the data in, and it costs even more to store it. So now you're starting to look at, for all these other programs outside of detection,

what is the direct value? And typically, this is where companies start to struggle, because all these other programs outside of detection require very large amounts of data to be effective. And now you're butting heads with: I need more data, but I can't afford to store it. So now companies make decisions: here are things I drop, here are things I have to drop. We have to keep this data because the regulators say we have to. Even though we don't use it, we're still going to pay a premium amount to put that data into our SIEM. And this is where it gets interesting.

I find, from my research and my work with a couple thousand customers now, that 20 to 25% of the data in the SIEM doesn't get touched in 90 days. So think about this: you're paying $3 to $5 million a year, and this is on the average side, and 25% of your investment has no material value.
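
The arithmetic behind that point can be sketched as follows. This is a minimal illustration that uses only the figures quoted above ($3 to $5 million a year, 20 to 25% untouched) and assumes, as a simplification, that spend scales linearly with data volume. The function name `untouched_spend` is hypothetical.

```python
def untouched_spend(annual_spend_usd: float, untouched_fraction: float) -> float:
    """Annual spend attributable to data nobody queries.

    Assumes cost is proportional to data volume, which is a simplification.
    """
    return annual_spend_usd * untouched_fraction


if __name__ == "__main__":
    # Figures quoted in the conversation: $3M to $5M a year, 20% to 25% untouched.
    for spend in (3_000_000, 5_000_000):
        for fraction in (0.20, 0.25):
            wasted = untouched_spend(spend, fraction)
            print(f"${spend:,} at {fraction:.0%} untouched: ${wasted:,.0f} per year")
```

Under that assumption, the untouched slice works out to somewhere between roughly $600,000 and $1.25 million a year.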

Max Clark (09:19.47)
Mm-hmm.

Max Clark (09:30.754)
I get into this really simplistic view of this thing, right? Which is, I'm an old network guy. Old network, old Linux, old Unix. And it was syslog. So you had syslog, and then you had to have syslog collectors. And then you ended up with syslog streaming logs. And then we started taking web server and application logs and trying to stream those. There were some really cool multicast event logging and streaming applications that came out in the early 2000s, right?

Ed Bailey (09:38.97)
That's my background.

Max Clark (10:00.904)
And it was always this kind of like give and take of like, you know, it was like, there's value here. It may be, right? And it was, and usually what it was, was like, there's value here for when we have something happen to us, and then we want to go back and try to dig through and figure out what it was that actually happened and kind of like try to track backwards, right? And then you would find like coverage issues. It was like,

I really wish I had logs from this source, because that was the missing piece. Right. And this is even before security was really pushing, before SIEMs came out as the answer to a lot of these things. Just for applications: what was going on, how do we troubleshoot our e-commerce platform? What was going on with a session bouncing through a lot of different resources in the environment? We had a load balancer go flaky on a port.

Right? Like, how long did it take us to find that? And then you got in this position of like, just log everything. And then, of course, then you logged everything. And then you were like, OK, now what do we do with all this data? Right? But how is this pipeline, I guess, more and more, I'm in conversations where it's like, everything that we can, we're just going to throw into some sort of data warehouse. We're going to throw into some sort of data lake. And we're just like,

And it's just turned into this thing of grab everything. And then you get into a conversation with a finance person. The finance person is like, how much are we paying for Snowflake, or how much are we paying for this? And why are we paying for that? And so that doesn't feel right either. So how do you navigate these two polar extremes?

Ed Bailey (11:40.776)
Exactly. And there's the desire to do something with data. The issue for me is I see a constant lack of data skills. Just think about the average business. They're spending a lot of money and they have data scientists and all sorts of programs around business intelligence. Companies have put a lot of emphasis there. But when you start looking at IT and security data, it's that other thing. It's those guys, it's the nerds in the basement, the guys in the data center. And it's not considered business data for many companies. And so this is where

you have that desire, but then you don't have the backing to make that data valuable. So I spend a lot of time arguing with, advocating with, executives to bring IT and security data into the business intelligence realm, that this is business data. That means it needs data skills. It needs the right people to run these tools. You don't just take a security team or an IT team. Imagine taking the firewall admin that you work with and saying, hey, good luck with Snowflake. I mean, how's that going to work out for them? And this is an issue where

skills and tool sets don't match up. And I don't care how good the tool is. Snowflake and Databricks are phenomenal platforms. But without appropriate skill sets, it just doesn't matter.

Max Clark (12:50.254)
Well, I feel like the first place I've seen this wall really pop up for most companies is always these audit and control frameworks. I think on average, most companies are probably dealing with SOC 2. Then they go through this process where it's like, OK, now we have to provide proof and show our controls and show the data on this. And then you get into these situations where it's like,

we're on the SIEM platform that's supposed to provide this for us, but we can't actually do any sort of data correlation or reporting over the past 13 months. It just wasn't designed for that. And then you're like, okay, great. Now what do we do? And I've been pulled into these things where the SIEM doesn't do it, and the SIEM vendor's like, just go ahead and export all the data for the last 13 months, put it into something else, and then query that. And that doesn't feel like a great solution either.

Ed Bailey (13:46.804)
Not even a little bit. And it's depressing how many companies use their SIEM platform as their compliance or retention platform. So you're going to use your most expensive platform, the one that's geared towards your most urgent needs, for something that is not particularly urgent and also, in the grand scheme of things, not particularly valuable. And so this is where we plead for separating your functions.

So just remember, the idea was SIEM. SIEM means Security Event and Incident Management. So let's focus your SIEM on your incidents, your IR, your case management. Let's now move all these other things to your data lake. A typical SIEM is going to run you, let's say, all-in costs of like $1 a gig, where I can set up, let's just call this an ordinary, average commercial data lake, for 15 to 20 cents a gig.

Max Clark (14:18.275)
Mm-hmm.

Ed Bailey (14:40.678)
So you're looking at a 5x increase in the amount of data you consume while keeping your spend flat. So this is how you start. And I use this term a lot if you're ever around me: it's the idea of matching cost and value. You mentioned, for example, data that might be valuable, but that you're not going to pay a premium cost for. Now I can shift that data to another place where that cost and value delta gets a lot tighter

and a lot more defendable when the CFO starts asking a lot of questions.
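
The 5x figure follows from the per-gigabyte prices quoted above. A minimal sketch of that calculation, treating the $1 per gig SIEM cost and the 15 to 20 cents per gig data lake cost as the rough, illustrative numbers they were in the conversation; the function name `volume_multiplier` is hypothetical.

```python
def volume_multiplier(siem_cost_per_gb: float, lake_cost_per_gb: float) -> float:
    """How many times more data the same budget buys in the cheaper store."""
    if lake_cost_per_gb <= 0:
        raise ValueError("cost per GB must be positive")
    return siem_cost_per_gb / lake_cost_per_gb


if __name__ == "__main__":
    siem_cost = 1.00  # all-in SIEM cost per GB, as quoted (illustrative)
    for lake_cost in (0.20, 0.15):  # quoted data lake range per GB (illustrative)
        ratio = volume_multiplier(siem_cost, lake_cost)
        print(f"Lake at ${lake_cost:.2f}/GB: {ratio:.1f}x the data for the same spend")
```

At 20 cents a gig the multiplier is 5x; at 15 cents it is closer to 6.7x, so 5x is the conservative end of the quoted range.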

Max Clark (15:10.898)
You touched on this briefly. I feel like in every MDR engagement we've been involved with, when we start talking about SIEMs or SIEMs coming online, the upfront energy with the SIEM is, okay, can you collect the log sources? But then immediately it turns into, can we filter out all the log sources? Right. And this of course becomes a really confusing part of the sales cycle, where we're trying to actually size and scale. What does your SIEM requirement actually mean? You get on this, like,

Ed Bailey (15:29.864)
Exactly right.

Ed Bailey (15:39.196)
Bye.

Max Clark (15:40.556)
voodoo math on spreadsheets of, you've got 5,000 desktops and therefore we think your SIEM ingestion is going to be this thing, but we don't really know. You're going to turn it on and find out. And then we're going to start whittling down and say, we don't care about this, and we're going to exclude this. Well, actually, it's going to hit the SIEM, but the SIEM is going to drop it immediately, effectively. Right.

Ed Bailey (16:00.89)
Exactly. And that's an enormous risk calculation, that we're going to have to start making choices. And every one of them can potentially affect us with detections and also everything else. Think about the issues you mentioned, where you're in the middle of an incident and all of a sudden you're like, whoa, we need this. We don't have it because we chose to drop it because of cost.

Max Clark (16:19.608)
How do we not get into a situation though where the solution to this problem is data about data in multiple different places storing data and walking into this other problem of just like massive data sprawl?

Ed Bailey (16:33.704)
And this is where, like I was saying, bringing the discipline of data into the IT and security space is key. The idea is that you're still diversifying outside your SIEM, but it's not into multiple silos. It's your SIEM and your data lake. So that way, from a process standpoint, and this would apply to IT as well, we're going to do our monitoring and our detections here, and we're going to do our investigations here.

And this is where the industry is starting, very slowly, to look at how we manage getting data to the right places in the right formats. That's, for example, where Cribl comes in with telemetry pipelines. Because if you're going to diversify your data storage, now you have to simplify your collection layer so it gives you the flexibility to say,

this event goes here in this format, this event goes there in that format, in order to give you optimization. Because if you're not looking at something with a custom data collection layer that's going to give you that kind of flexibility, then you're going to be in a lot more trouble. Now you're essentially duplicating all your data going to both places, or you're running multiple sets of agents, and you're creating a whole other set of issues. And so you're right to bring it up. Creating a bigger mess

is just as much of a concern as the current issues around SIEM, and the same with IT. Does that make sense?

Max Clark (17:59.246)
Yeah. OK, so a couple of questions immediately run off from that, right? First, I'm curious about sizing. Every SIEM conversation turns into: the SIEM's expensive, we need it, it's critical, right? How much data are we putting into it? And then you kind of get into this, there's like,

I don't want to call it scope creep. It just turns into utility creep of, let's get more data into the SIEM. Let's get our network data into it. Now we've got our desktops, we've got our endpoints, we've got our devices. And it's important to have a lot of correlated data, right? Because that's what gives you more indications. That then of course turns into these cycles where SIEM costs start escalating. And then you get into, how do we reduce our SIEM costs? It happens kind of hand in hand.

Ed Bailey (18:47.304)
All

Max Clark (18:50.126)
But at what size does it make sense to bring in another platform to start managing this, as you said, the telemetry pipeline, right? This collection of data, routing of data, filtering of data, storage of data, and then feeding. Because we're not just talking about feeding SIEMs, right? You're talking about feeding lots of other destinations. So I want to talk about that as well. So where does this really start paying off

Ed Bailey (19:15.87)
Yeah, in.

Max Clark (19:20.206)
you know, for people or what's the triggering point?

Ed Bailey (19:22.608)
You're exactly right. It's an important question to ask yourself. And so the thing I typically do when I work with companies is to start looking through a matrix of: what are your requirements? Regulated industries are going to have a much, much higher need. Global companies too, because they're now having to comply with multiple regulatory frameworks.

So they're going to be in the basket of, we're now going to look at a different way of doing this. Now, small logistics, small retail, or small manufacturing are where we consistently see this is probably not necessary.

And this is where we started looking at. Because you're doing a minimal set of functions, you're focused on a few things. And also typically more often than not, there is a focus on outsourcing. Companies, even fairly large companies, cannot afford to staff their security functions.

And so this is where you're looking at your MDR providers, and it's a chronic issue. You go to the various MDR providers, outsourced providers, to handle this. It's a different calculation because they're going to handle it for you. So it varies. First start with what they need. What is their threat relationship? What do they really need in terms of their commitment to security and IT? Because, you know, for a lot of manufacturing companies, it's just not that big a deal.

And I'll be very upfront with them: you're fine, you're doing this, you're able to handle your business needs. You also have to separate what your perceived security need is from what the business says it is. Because if the business doesn't care, then it just really doesn't matter what the security team says.

Max Clark (21:00.598)
Yes. I mean, that's such an important point to make, right? People invest in backup systems after they've had data loss. There is a certain stimulus response that organizations go through as they grow and scale, mature, and have different things happen to them. And I feel like security fits firmly into that, right? It's either forced externally, through compliance, regulatory frameworks, or insurance mandates, or it's a response to something.

Ed Bailey (21:02.684)
Good.

Max Clark (21:30.84)
I'm starting to see more stories of, you know, my friend's company, this happened to them, and we don't want to have that happen to us. And now we have to investigate what that is actually going to entail. And then you get into these really interesting conversation cycles. I don't want to veer off from this too much. We talked a little bit about this, but of course, you know,

Max Clark (21:54.674)
the endpoints or destinations, you know, it's more than just the SIEM, right? And of course the sources become very broad. It was physical devices and physical network gear, then it was applications, and now you have cloud resources and SaaS applications. I mean, there are lots and lots of places where data is originating inside of an enterprise that then has to go other places. And so

I'm curious if you could expand a little bit for me: beyond just the SIEM function, where else are people sourcing data from and then syncing data to?

Ed Bailey (22:31.172)
It's something that's expanding rapidly. And personally, I think it's very exciting, because companies have started to understand the value of data. So let's talk about your typical security and IT data. I typically talk about sharing a lot. I'm like, you know, your parents spent a lot of time teaching you to share your toys. And this is the idea: we want teams to share their data. This is another common one, especially since you mentioned your networking background: the idea that, okay, the firewall logs are in the SIEM.

But how do the network guys access them? Is the security team going to let them into the SIEM in order to access their own firewall logs? So this is where we advocate, especially with having a telemetry pipeline in place, that I can share. I can generate an event once and I can share it everywhere. The firewall is a great example. All right, I'm going to take the firewall events. I'm going to take my denies and some of my session restores, and I'm going to take source IPs that are from outside of the company, and I'm going to put that in the SIEM.

But I'm not going to put my admin messages, my cluster messages, my heartbeat messages, my session stop messages, or my internal traffic in the SIEM. I'm going to put that into the Elastic instances that my network engineers use, because they need a greater scope of data to do their job. But then the NOC needs data too. The most common thing in the world is a customer calls up and says, hey, I can't access my application anymore. The first thing you want to do is go look at the firewall and see if the firewall is blocking it for God knows why.
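
The firewall split described above can be sketched as a small routing function. This is only an illustration under stated assumptions, not how Cribl or any particular pipeline product is configured: the event fields (`type`, `src_ip`), the event-type labels, the destination names, and the internal address ranges are all hypothetical. It also assumes every event goes to the network team's store, with the security-relevant subset copied to the SIEM.

```python
from ipaddress import ip_address, ip_network

# Hypothetical internal address space; a real deployment would list its own ranges.
INTERNAL_NETS = [ip_network("10.0.0.0/8"), ip_network("192.168.0.0/16")]

# Hypothetical labels for the operational messages named in the conversation.
OPERATIONAL_TYPES = {"admin", "cluster", "heartbeat", "session_stop"}


def is_external(ip: str) -> bool:
    """True when the address falls outside every configured internal range."""
    addr = ip_address(ip)
    return not any(addr in net for net in INTERNAL_NETS)


def route(event: dict) -> list[str]:
    """Return the destinations for one firewall event.

    Assumption: everything goes to the network team's store ("elastic"),
    and only denies and externally sourced traffic also go to the SIEM.
    """
    destinations = ["elastic"]
    if event["type"] in OPERATIONAL_TYPES:
        return destinations  # operational noise never reaches the SIEM
    if event["type"] == "deny" or is_external(event["src_ip"]):
        destinations.append("siem")
    return destinations


if __name__ == "__main__":
    samples = [
        {"type": "deny", "src_ip": "10.1.2.3"},
        {"type": "heartbeat"},
        {"type": "allow", "src_ip": "198.51.100.7"},
        {"type": "allow", "src_ip": "10.0.0.5"},
    ]
    for sample in samples:
        print(sample, "->", route(sample))
```

The point of the design is that the event is generated once and the routing decision is made in one place, rather than each team running its own agents or the full stream being duplicated to both stores.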

Max Clark (23:56.578)
Right, right, right.

Ed Bailey (23:56.84)
And so that's three different teams. Then you want your app support teams and your DevOps teams to start being able to ask the same questions of their data. And so you can see the compounding value of data through sharing, because every one of those teams can now solve problems faster. They can address issues. They can, hopefully, be proactive. And that's just one data source. The idea is, how do we get more value from the data? So I was one of Cribl's first customers.

And one of the things that I quickly found out by having Cribl in place, because this was 2018 and this was very new, was the ability to share data regardless of what your SIEM platform was. Just to give you an idea, I was able to take data that was generated by our batch systems, which was only available to the teams who ran the batch systems. Because I had Cribl in place, I was able to reshape that data and share it with business operations. So business operations could take

the statistics of when the batch stopped and when the batch started, and match that up with correlations of who the customer was, the time of day, the time of year. And we started to build a model to help the batch team. So business operations could now schedule batch runs that were very closely aligned to the actual expected utilization instead of having to oversize. Because typically in business operations, you're governed by SLAs. And so your SLA is like, if I don't finish the batch operation on time,

you're going to get penalized by your customers. And so what you do is you oversize. You're putting in way more hardware because the SLA is a problem. But because we were now able to model our data over a period of a year, we created a predictive algorithm based upon that data. And now they're able to size the batch more closely to what it actually was. We were able to save millions of dollars because we weren't having to oversize the batch runs. And that's just an example of

sharing data outside of IT in an unexpected way.

Max Clark (25:56.002)
I mean, that's a fascinating use case and example. I'm thinking about this and just my mind's spinning a little bit on it because.

Ed Bailey (25:59.047)
Yeah.

Max Clark (26:07.778)
we look at the enterprise and you look at the placement of IT inside of the enterprise, right? IT is a support function for the enterprise. Call it whatever you want. We can talk about transformation. We can talk about all these things. And obviously I'm a big believer that IT is a force for good for the average person, right? But correlating and bringing that back into something that's outside of these normal ROI and TCO calculations, which are complete gobbledygook most of the time, and saying, you know,

Ed Bailey (26:13.842)
Exactly.

Max Clark (26:37.634)
Business has a need, customer has a requirement. This is how we make money, and this is what it costs to run this thing from start to finish. That data gets really interesting to me. I really like those conversations.

Ed Bailey (26:49.778)
Yeah, it's amazing. I love that. That's probably one of the best things about working this job. I get to talk to so many teams and start having these conversations. OT data is now finally starting to get the attention that it needs. Again, we're working across so many enterprises: logistics, manufacturing, trucking. They all produce enormous amounts of OT data. Retail produces an enormous amount of OT data as well. And so it's about giving teams the ability to collect this OT data in a centralized place

Max Clark (27:03.266)
Mm-hmm.

Ed Bailey (27:18.568)
and to now get more value from it. Because traditionally, this was all separate. We'd go to oil and gas and I'd ask them about this, and they're like, oh yeah, that's the pipeline team. There's no security or IT relevance to that. They take care of that. And the idea is, so you don't want to know if someone turned a valve at 3 a.m. with no change order? And that's the data sprawl, the data silo you mentioned. And increasingly, this is addressed by having that common plane, where

Max Clark (27:32.909)
Mm-hmm.

Ed Bailey (27:45.36)
it's just telemetry. It's not IT, it's not security. It's telemetry that can bring value to the business. You can now bring it all together and start to see the value of understanding it holistically. Another great story, this one in retail: they started looking at physical security controls. Very OT, very standard. This is something that they had basically outsourced to a private company, and they treated it as, you know, a rent-a-cop type thing. They started looking at...

How can we collect all this physical access data and make it part of our SOC? And this is where, all of a sudden, they started seeing, okay, store number whatever in Indiana, the back door just went at 4 a.m. What's going on? And this is where they started getting a centralized understanding that there are things going on that there just wasn't visibility into. And the thing is, you've got to bring that data together to analyze it.

Max Clark (28:41.442)
When you talk about bringing, it's a different skill set, right? Data analysts, business analysts, data science, these sorts of things become the terminology to it. I don't want the feeling and the perception of, this is a, if you build it, they will come. We have to build it, and then people are going to figure out how to generate value from it. Because that's not really, when you walk in and you start talking like,

like, we want to collect this data because we know it's valuable, but we don't know what we're going to do with it or how we're going to, like, generate money for you and then what it's going to cost to collect. Like, that's a non-starter conversation with a lot of companies. So then what is, like, are you finding a lot of people are walking into this and saying, we know we want to start getting IT and OT, and it's important for us to know about valves? Or security is looking at this and saying, hey, we've got a SCADA control system that we want to, you know, monitor better? Or

You know, now, IoT with, you know, let's talk about logistics, right? Trucking and logistics. IoT is becoming a really big play on this. How are we monitoring telematics and fleet information, you know, like, our trucks, what's going on with our stuff? Where are they? What's happening? I mean, is, uh, you know, let me ask this a better way. Is the cart leading the horse or is the horse leading the cart? Like, how is this evolving now?

Ed Bailey (30:01.544)
It's uneven, to put it mildly. So I met with this CISO, a really innovative CISO at a big insurance company. The thing that he said that got my attention was, we have to change the culture so we understand and appreciate the value of data. And so this is where he's treating it in a very long form of, all right, we're gonna put data systems in place. I'm gonna hire data people.

So not security people, data people. We're going to start mandating training and start setting an expectation over the period of the next two to four years. We're going to ask different questions of our data. We're going to get value from our data. We're going to ask those questions. But he understood that this is not going to change overnight. This is not a tool issue. And this I see all the time: yeah, yeah, we're going to go buy, you know, Databricks, or something like that, and it's going to change our culture. That's not how it works.

It's your people and process that have to change in order to use your tools. But this is uneven. Like I said, the leader that I mentioned, his approach is that he's going to change the culture around data. But unfortunately I see too many businesses saying, I'm going to buy a tool to change the company.

Max Clark (31:16.802)
That's an unusual conversation for a CISO to be leading, talking about, like, culture change within an organization around how they're collecting and using data, right? Like this isn't.

Ed Bailey (31:27.378)
I wanted to hug him because I was like, this is like, thank you.

Max Clark (31:32.966)
You know, you talk about culture and, like, accessing and understanding data, and it immediately, like, fires in my head. You know, I mean, then-Facebook, now Meta, was famous for this. As you were hired into the company, they put you through training on how to run queries against, at that point it was Hadoop or Hive, I forget, you know, at that point when this was published, right? It was like, this is how you get access to our data and, like, run, you know, and,

famously a very data-centric company, and was able to make decisions and have phenomenal growth. Like, you can't say Facebook wasn't successful in utilizing data to grow, but it was really institutionalized across the entire company of, like, this is how you interrogate the data that you have available to you. And they pushed this as training. Like, you were onboarded, you were taught how to use it, right? Is that, like, the inflection point, you know, now for the average enterprise?

the non tech centric, crazy growth VC unicorn kind of business.

Ed Bailey (32:36.594)
Yeah. And that's the way to look at it. Just to give you an example, my favorite CIO I got to see, he decided we need to invest in automation in order to scale as a business. And so this is something you would see him do in every single meeting. He would stop the conversation, like, what are you doing to automate this? How are you making it repeatable? He'd ask a very standard set of questions in order to instill in his leaders that you better be asking these questions too. And this is what is important for every leader.

Max Clark (33:00.846)
Mm-hmm.

Ed Bailey (33:02.876)
How do you get value from your data? What are we doing with our data? If we're not getting value from it, why are we collecting it? You have to ask these questions over and over again and set that expectation that I'm not going to tolerate collecting data for the sake of collecting data. I'm not going to tolerate not asking questions in analytics because at the end of the day, the analytics are all that matters. The analytics are what powers the business for.

But this is something from the top down. You have to have that culture, just like you mentioned with Facebook. You can't operate at Facebook unless you use data. And at your ordinary, average org, it has to be the same.

Max Clark (33:39.182)
So it's great for a CISO to want to go data-centric and change culture. It's great for a CIO to say we want to go to data-centric and culture. How do you cross over then into the rest of the business and the other business units? Because some of this data, some of what we talk about is very IT-centric things or very security-centric things. Or you say like,

But when you look at one of the examples you say is like retail, right? There's like so much information that comes off of IT systems in a retail environment is incredibly valuable for marketing. Like how do you cross that chasm between, hey, this is IT data and this is really business data and how can the business then use that data and get more intelligence out?

Ed Bailey (34:15.56)
All right.

Ed Bailey (34:26.272)
And this is where having the right people in place is really important. This is a project we worked on where every time you swipe the credit card through the system, the data would get generated. So they would create a unique token for your credit card number. So it's repeatable: every time your credit card number comes through, it gets tokenized. That way, PCI mandates are met. Their credit cards aren't just floating around.

But by having that unique token, they're now able to match it up to me. So, like, I swipe my credit card, and it's based on my credit card that they then match it up with me. And then they send me an email that says, hey, thanks for buying such and such. Here's your receipt. And then they start a follow-up a couple of days later. How was your experience? How was the thing that you bought? Would you like to leave a review?

And that to me is a perfect example of marrying up IT data with marketing. And that's something that you've got to be able to do. You have to invest the time to do it and understand it. Does that make sense?
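
Illustration (not from the conversation): the repeatable-token idea Ed describes can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the retailer's actual method. The keyed hash (HMAC-SHA256), the `SECRET_KEY` value, and the `customer_by_token` lookup table are all hypothetical; production tokenization under PCI is usually handled by a dedicated token service.

```python
import hashlib
import hmac

# Hypothetical secret for illustration only; a real key would live in a key vault.
SECRET_KEY = b"example-only-secret"

# Hypothetical lookup table: token -> customer email, built when a customer opts in.
customer_by_token = {}


def tokenize_card(card_number: str) -> str:
    """Return a repeatable token: the same card number always yields the same token."""
    digits = "".join(ch for ch in card_number if ch.isdigit())
    return hmac.new(SECRET_KEY, digits.encode("ascii"), hashlib.sha256).hexdigest()


def register_customer(card_number: str, email: str) -> None:
    """Store the email against the token, never against the raw card number."""
    customer_by_token[tokenize_card(card_number)] = email


def email_for_purchase(card_number: str):
    """Find who to send the receipt to, using only the token."""
    return customer_by_token.get(tokenize_card(card_number))


register_customer("4111 1111 1111 1111", "shopper@example.com")
receipt_address = email_for_purchase("4111-1111-1111-1111")
```

Because the token is deterministic, marketing systems can join purchases to a customer without ever seeing the card number itself.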

Max Clark (35:27.318)
Right. Well, yeah, of course. And I mean, I also think of, like, you know, guest Wi-Fi or, you know, position tracking like Bluetooth and aisles, you know. Like, can you correlate to people? How often do they come to your stores? How much time do they spend? What are they looking at? Can you do personalized marketing at that point against somebody who spent a lot of time, you know, browsing the suit aisle and, like, hey, you know, can we help you get a suit, right? Or shoes or whatever.

Ed Bailey (35:56.4)
Oh yeah. I got a great story for you. So I know of a company that's spending a lot of time tying together people, their Wi-Fi access points, and the store map. So the idea is that, like, I'm looking to get a widget, and it shows you a map and it shows you where you are in order to help you find the widget. And so, I mean, that's another great one, because it helps you buy things faster.

Max Clark (35:57.058)
the case may be, right?

Max Clark (36:19.086)
And then people buy, and then they walk out satisfied, and everybody has a good experience, and, you know, the next time I need to buy something I'm gonna go back to the store because it was really easy for me to buy it, right?

Ed Bailey (36:24.136)
Yep.

Ed Bailey (36:30.984)
And then there's enormous amounts of internal data as well. So, say you have to go to a kiosk to order something. This is another company, and I know that they're doing this, where you go to the kiosk and you start an order, but you walk away. And so the idea is that you're trying to figure out, you know, why'd you walk away? Why did you leave the store? Why didn't you complete your order? As a way to then start to try to understand gaps in the customer experience.

And so I think those are really, really good. I mean, we see this also in manufacturing and logistics as well, in terms of how long does it take to get a truck loaded and unloaded, especially when you're dropping off pieces of freight one at a time. And that is an entire art. We got to see some companies up close, and their ability to manage logistics and the drop-off process is remarkable. And so much of it is RFID tags. It's the telemetry in the truck.

Max Clark (37:10.318)
Mm-hmm.

Ed Bailey (37:25.766)
By the way, I am astonished how much data your ordinary, average delivery truck can generate. I mean, it's just, it's astonishing.

Max Clark (37:33.942)
Yeah. You mentioned, we touched on this briefly when I was asking around like sizes and use cases and where this really starts kicking off, And the things that you said were, of course, compliance-driven or regulatory-driven companies, of course, and then it was larger scale, like global businesses, lots of different constraints, right? And so when I hear those things, I immediately think like finance and insurance being like really highly regulated.

You know, big businesses, spanning borders with different rules and different locations. you know, and then, and then like the complications and complexity that really feeds into that. you know, for the, you know, multinational insurance company that's trying to deal with this, right. There's, there's like a whole nother game that they have to play and a set of rules that they have to follow with these things.

Ed Bailey (38:25.788)
Yeah, it's crazy. Insurance is nuts because, like, in the United States, for example, you're regulated by every state. So typically they'll focus on NYDFS as their baseline, but you still have a regulatory baseline with every state. Then with California starting to pass legislation as well around privacy, that's going to create some fun and games, and then globally, every jurisdiction. So this creates a whole set of just,

I mean, complications, and it's a cost of doing business. So the idea is, you know, how do we lower that cost of doing business? So you comply, but at the lowest possible cost. And as you go back to sizing, this is where I started having those conversations about, you know, what do you have to do with your data? What do you want to do with your data? And then also start talking about,

what about your data matters? We'll call it a classification exercise. Cloud audit logs are a good example. Cloud audit logs are an enormous data source, but of very limited value. And so, for example, this is another customer we took a look at. They were spending about $10 million a year on their SIEM. 20% of their SIEM was cloud audit logs. We figured out they were using 1% of those cloud audit logs.

So they're spending $5 million to use 1% of that data. And so the idea is, like, how do we separate this out? And so this is where I learned to start asking the question of, all right, what of this data do you want to use? So we're going to take the 99% of this data that you don't ever touch, and we're going to put it over in your data lake. But we're still going to power your detections through the things you do care about. And so that gives that ability.

Ed Bailey (41:41.128)
All right, so from a cloud audit log standpoint, we then start taking a look at your data and say, hey, we're going to put the 99% of your cloud audit logs that you're not using in your data lake. Use that 1%.
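
Illustration (not from the conversation): the split Ed describes, where the small slice of cloud audit logs that detections use goes to the SIEM and the rest goes to the data lake, might look like the sketch below. The event names in `DETECTION_EVENTS`, the `event_name` field, and both function names are assumptions made for the example, not any vendor's schema or API.

```python
# Hypothetical allow-list: the audit events that detections actually use.
DETECTION_EVENTS = {"ConsoleLogin", "DeleteTrail"}


def route_audit_logs(events):
    """Send detection-relevant events to the SIEM and everything else to the data lake."""
    routed = {"siem": [], "data_lake": []}
    for event in events:
        target = "siem" if event.get("event_name") in DETECTION_EVENTS else "data_lake"
        routed[target].append(event)
    return routed


def freed_siem_share(source_share, unused_fraction):
    """Share of total SIEM volume freed by moving the unused part of one source."""
    return source_share * unused_fraction


# 100 sample events: 1 used by detections, 99 never touched.
sample = [{"event_name": "ConsoleLogin"}] + [{"event_name": "DescribeInstances"}] * 99
routed = route_audit_logs(sample)

# With the figures quoted (20% of volume, 99% of it unused), about 19.8% is freed.
freed = round(freed_siem_share(0.20, 0.99), 3)
```

The data lake copy still satisfies retention, while the SIEM only pays for the events that feed detections.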

Max Clark (00:00.686)
This is the, this is the fun of doing like remote, right? You know, and I know in person, you've got a completely different set of problems. And one of those problems, I was talking to a guy who runs a podcast studio and they do recordings and that answer different people. And, and I actually kind of like, there was a part of me that was just like, I love you, man. Cause like he could, had data loss and now he had like a triple redundant data backup, like hard drive, like system. he was like, I'm never losing data again. Like it was, I felt so bad for the guy who's telling me this story, but.

Ed Bailey (00:06.919)
yeah.

Ed Bailey (00:23.368)
god.

Ed Bailey (00:29.451)
Yeah, people learn hard lessons, that's for sure. Yep.

Max Clark (00:32.054)
Yeah. We were talking about insurance, and, to prompt you, you started getting into, like, forget global, but just on a state level, you know, standardized against New York, but California, you know, and everybody's imposing their own stuff, right?

Ed Bailey (00:48.555)
Yeah, and so it creates major regulatory overhead. And this is why, when I have a conversation with customers, I spend a lot of time talking about their requirements. What are your security requirements? What are your IT requirements? What are your regulatory requirements? Because typically they're not the same. And so then we try to put together that comprehensive plan of being efficient and effective, so that we're sharing data where appropriate and keeping data separate where it's not appropriate.

Max Clark (01:08.366)
Mm-hmm.

Ed Bailey (01:18.695)
And taking these three buckets of cost and requirements into effect, where now we match it up with their data. Because the way I look at it is start with your requirements and then start with your data. And then do that gap analysis of understanding, do you have the data you need to meet your requirements? Do you have the data to meet what you may want? And then also from a third standpoint, typically your regulatory data, your compliance data, is data that you don't necessarily need to have searchable or available, but you need to retain it.

Max Clark (01:48.718)
Mm-hmm.

Ed Bailey (01:48.767)
And so this is where we break that up that way. And based on that whole idea of matching cost and value.
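
Illustration (not from the conversation): the requirements-first gap analysis Ed outlines can be sketched with plain sets. The three buckets come from the discussion; the source names, the `gap_analysis` and `storage_tier` helpers, and the tier labels are hypothetical.

```python
# Hypothetical requirements: which data sources each bucket needs.
requirements = {
    "security": {"firewall", "endpoint", "identity"},
    "it": {"endpoint", "application"},
    "regulatory": {"identity", "cloud_audit"},
}

# Hypothetical inventory of what is collected today.
collected = {"firewall", "endpoint", "cloud_audit", "dns"}


def gap_analysis(requirements, collected):
    """Compare what the requirements need with what is actually collected."""
    needed = set().union(*requirements.values())
    shared = {
        source
        for source in needed
        if sum(source in sources for sources in requirements.values()) > 1
    }
    return {
        "missing": needed - collected,    # required but not collected
        "unclaimed": collected - needed,  # collected but no requirement asks for it
        "shared": shared,                 # one feed can serve several buckets
    }


def storage_tier(source, requirements):
    """Retain-only data goes to cheap storage; anything else stays searchable."""
    buckets = {name for name, sources in requirements.items() if source in sources}
    if not buckets:
        return "review"
    if buckets == {"regulatory"}:
        return "archive"
    return "searchable"


report = gap_analysis(requirements, collected)
```

Sources that only the regulatory bucket needs land in the archive tier, which mirrors the point that compliance data often has to be retained but not searchable.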

Max Clark (01:55.758)
So.

We see a lot of IT projects. And starting from a place of cost, like, we've identified an area. We need to reduce costs on this. We're spending too much money. Our budget can't tolerate anymore. Whatever it is. And of course, there's a lot of conversations that start from that. But I find that

Ed Bailey (02:00.405)
Yes.

Ed Bailey (02:15.211)
Right.

Max Clark (02:22.336)
If it's solely a cost-driven exercise, there's very little internal motivation, especially with the amount of energy it requires to complete that project. This is, like, alignment of KPIs and whatever your measurement, tracking, and performance metrics are for people, right? And especially when you start talking about, like, we have an infrastructure team, your IT team, and you have an engineering team and development team, and they're judged differently, right?

IT needs to make changes, but it requires dev to do the actual work. And dev's like, well, we have to deliver on what our metrics are, which don't involve, you know, you get into this kind of, like, weird position a lot of times with companies. And, you know, I mean, look, walking into somebody and saying, hey, you're spending $10 million a year on your SIEM, and we're going to turn that into five, and then we're going to give you more access to data, right? Like, that's a good starting conversation, but that doesn't feel like that's where the conversation actually ends or what gets people really excited, right?

because it feels like there's a lot more behind the scene after that initial conversation that becomes like, now let's show you the rest of this picture, right?

Ed Bailey (03:30.637)
Exactly. What you're describing is a key issue. The oldest issue in IT and security is, I have 10 terabytes of data and 5 terabytes of license. How am I going to make this work? And so this is the kind of thing that is endemic. And so this is where, in terms of setting expectations, it's important: you are unlikely to make your budget, or what I like to call your run rate, smaller.

Max Clark (03:39.523)
Mm-hmm.

Ed Bailey (03:53.363)
My goal is, how do I flatten it out so you're not spending any more? Unless you're going to make requirements changes where you're going to say, hey, I'm going to lower my posture or something like that. But I can typically stop the growth. It's not unusual: 20 to 25% growth year over year in cost is not unusual at all. And so this is the idea. We start looking at these programs where, how can I put less data in my most expensive platforms,

but shift my other data to less expensive platforms? And this is the thing I want to emphasize. We get involved in these conversations with cost, but no IT or security problem is ever solved with less data. You need the right data in order to solve problems. And this is where, I just want to give you a great example, I got brought into another engagement, and the company's going nuts because they're spending so much money, about $10 million a year on their SIEM.

Ed Bailey (04:48.393)
We were able to figure out that 20% of their data was their cloud audit logs. They were only using 1% of that data. So let's shift this enormous data source that you have to retain. You need access to it, but it doesn't need to be in your SIEM. So we took the 99% out and put it in their data lake. They now got 20% of their SIEM capacity back for more important data sources, for other things. And they were still able to retain their data. And they were able to flatten out their run rate.

And so that's just a very classic, very common example of how we can help you get more value, but also flatten out your run rate. Because I find that's what leaders really want. And one thing I also want to mention: leaders all the time come to me and say, hey, we want to use half of this particular SIEM or whatever they're using. And the thing I have to remind them is, are you sure your existing vendor is going to let you? At renewal time,

Max Clark (05:28.462)
predictability.

Max Clark (05:43.598)
Mm-hmm.

Ed Bailey (05:43.659)
you know, you say, hey, I have a 10 terabyte license. Now I want a 5 terabyte license. I'm seeing plenty of vendors right now who are like, we'll give you the 10 terabytes. We're not going to downsize.

Max Clark (05:57.006)
It's really hard. Yes, the contracting renewal cycles, it's very hard to get into a conversation where you start talking about reducing revenue with companies. They do not like that. I get to have that conversation a lot, but we're also engaged differently. People believe us when we say, this is changing. You want to be a part of this or not?

Ed Bailey (05:59.135)
Yeah.

Max Clark (06:25.462)
I'm going to try to lead you into another question here, but I want to make sure I've got the baseline of this first, which is, we talk about, like, a telemetry pipeline. Walk me through the key components of what a telemetry pipeline has to do in order for it to be effective.

Ed Bailey (06:41.845)
So from a first principle standpoint, it has to be able to get data from anywhere to anywhere, where you're not restricted by protocol, you're not restricted by vendor, you're not restricted by, I'm gonna favor one vendor over another. And so when you're looking at that, that's incredibly important. So now you need the flexibility to be able to accept data from an extraordinary range of different tooling. For example, we recently looked at manufacturing, and they had Windows XP systems.

I'm not kidding. Windows XP systems putting off data. Very old machines. And I understand from the business perspective, they're like, it works good enough. Why am I going to spend millions of dollars replacing something because it's old? I don't care. And so you have that. And then you have modern systems, modern cloud systems, APIs. The world is getting eaten alive by SaaS. Every one of those SaaS components has a different set of APIs with very little overlap. And so you need the flexibility.

Max Clark (07:22.722)
Yep.

Yep.

Ed Bailey (07:40.669)
of being able to accept data, but also reach out and grab data from APIs. I don't think it's still well understood how much of an issue it is and how many coverage gaps companies have with the SaaS vendors they do business with. So that flexibility is incredibly important.
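
Illustration (not from the conversation): the "anywhere to anywhere" requirement, with both a push path (systems send data in) and a pull path (the pipeline reaches out to APIs), can be sketched as a toy router. The `Pipeline` class, its method names, and the sample events are hypothetical and do not represent any product's interface.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Event = dict


class Pipeline:
    """Toy any-to-any pipeline: push and pull inputs, routed to named outputs."""

    def __init__(self) -> None:
        self.pull_sources: List[Callable[[], Iterable[Event]]] = []
        self.destinations: Dict[str, Callable[[Event], None]] = {}
        self.routes: List[Tuple[Callable[[Event], bool], str]] = []

    def add_pull_source(self, fetch: Callable[[], Iterable[Event]]) -> None:
        self.pull_sources.append(fetch)

    def add_destination(self, name: str, send: Callable[[Event], None]) -> None:
        self.destinations[name] = send

    def add_route(self, matches: Callable[[Event], bool], name: str) -> None:
        self.routes.append((matches, name))

    def accept(self, event: Event) -> List[str]:
        """Push path: deliver one event to every destination whose route matches."""
        delivered = []
        for matches, name in self.routes:
            if matches(event):
                self.destinations[name](event)
                delivered.append(name)
        return delivered

    def poll(self) -> int:
        """Pull path: fetch from every registered source, then route each event."""
        count = 0
        for fetch in self.pull_sources:
            for event in fetch():
                self.accept(event)
                count += 1
        return count


siem_events: List[Event] = []
lake_events: List[Event] = []

pipeline = Pipeline()
pipeline.add_destination("siem", siem_events.append)
pipeline.add_destination("data_lake", lake_events.append)
pipeline.add_route(lambda e: True, "data_lake")                   # keep everything
pipeline.add_route(lambda e: e.get("severity") == "high", "siem")  # only what matters

# Push: a legacy plant machine forwards an event to the pipeline.
pipeline.accept({"source": "legacy_plant_pc", "severity": "high"})

# Pull: a hypothetical stand-in for calling a SaaS vendor's API.
pipeline.add_pull_source(lambda: [{"source": "saas_app", "severity": "low"}])
polled = pipeline.poll()
```

Keeping inputs, routes, and outputs as separate registrations is what lets a new source or destination be added without touching the others.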

Max Clark (07:58.424)
Some industry consortiums will come around and try to standardize things. I remember HR-XML when it came out. I'm going to date myself there on that one as well. But even with attempts to standardize data, two firewall vendors are going to log completely differently. I mean, is it important for a telemetry pipeline to also normalize this? Is that a job for a telemetry pipeline?

Ed Bailey (08:04.51)
yeah, it'll be now.

Ed Bailey (08:18.324)
Right, exactly.

Ed Bailey (08:23.227)
So that's the next step. So you collect it. Now you have to give the options for normalization. And the thing to also not underestimate, like with firewalls, I see all the time, and we particularly see this at large companies: they'll take the standard output from that firewall vendor and make it different. And so we're looking at this like, how'd you do this? It's like, well, we added our own special sauce. Why? You know, and so it's

Max Clark (08:44.61)
Yeah.

Ed Bailey (08:45.323)
It's what it is. So this is where having the flexibility. And this is the other thing that I think is really important before I get into the normalization, is having a flexible user interface in order to get this done. Enterprise user experiences typically suck. And so this is something I think is extraordinarily important, is you want to be able to arm non-data people with data skills. And having a quality user experience is incredibly important. For example,

you're having, say, on the left-hand side, the ability to say you want to use a common language. So this is another thing you see in the telemetry space. Well, hey, I'm going to go invent my own data manipulation language where other vendors are going to use something common like JavaScript. So that way, it's more approachable. Then on the right-hand side of your UI, have the before and after. So every time I make a change, I can see the after really quickly. And so having that really good user experience accelerates value, accelerates the

Max Clark (09:37.934)
Mm-hmm.

Ed Bailey (09:43.307)
you know, more people will get into it because it's so approachable. They look at it as just so easy. And this is something I really, I still have PTSD scars from before I had Cribl. You know, from when I got my first SIEM license in 2007 all the way up to 2018. Just PTSD from how hard it was to do anything with data. And that ability to, like, wow, we just made a change that before took us, you know,

hours and hours and hours of development in Python or NiFi or something like that. And we were able to do what we needed to do in a couple of hours. And that was just remarkable.

Max Clark (10:18.83)
I'm laughing because I'm like, so you're saying that we shouldn't be forcing people to learn Perl regex in order to actually manipulate data coming through a pipeline.

Ed Bailey (10:27.869)
Yeah, you should have a UI that helps them build the regex, and then also a user experience that says, is the regex any good?
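
Illustration (not from the conversation): the "is the regex any good?" check can be approximated by running a candidate pattern against sample log lines and reporting what it misses. This is a minimal sketch; the `regex_report` helper and the sample lines are hypothetical.

```python
import re


def regex_report(pattern, samples):
    """Report whether a pattern compiles and which sample lines it fails to match."""
    try:
        compiled = re.compile(pattern)
    except re.error as exc:
        return {"valid": False, "error": str(exc), "matched": 0, "unmatched": list(samples)}
    unmatched = [line for line in samples if compiled.search(line) is None]
    return {
        "valid": True,
        "error": None,
        "matched": len(samples) - len(unmatched),
        "unmatched": unmatched,
    }


# Hypothetical firewall lines; the third uses a different layout.
samples = [
    "src=10.0.0.1 dst=10.0.0.2 action=allow",
    "src=10.0.0.3 dst=10.0.0.4 action=deny",
    "source 10.0.0.5 blocked",
]

report = regex_report(r"src=(?P<src>\S+) dst=(?P<dst>\S+)", samples)
broken = regex_report("(", samples)  # unbalanced parenthesis does not compile
```

Showing the unmatched lines next to the pattern is the before-and-after feedback loop a junior engineer needs to fix the expression without being a regex specialist.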

Max Clark (10:36.471)
Yeah.

Yeah, I don't know. I kind of feel like everybody should be forced to sit down and, like, learn sed, you know, just because then you can kind of appreciate it more. I can force my children.

Ed Bailey (10:48.779)
And I get it. I mean, I struggle with it, because the fundamentals are important, but typically, like, you've got four to five people on the team. One person is going to be the regex person. Another person is going to understand, like, sed, or Grok, or something like that. And by having that approachable UX, you can have more people participate. Like, this is something I spent a lot of time on, getting my junior engineers involved.

Max Clark (10:56.15)
Nobody cares. We shouldn't, you know. Yeah.

Ed Bailey (11:15.435)
So I didn't want my smartest people writing normalization code. I wanted my junior engineers to do it. Because that way I could have my smartest people focus on, how do I make my data better for analytics? Because that's a different skill. Now, circling back to normalization, it's incredibly important that it be approachable. You have to have guided experiences. Like, typically I'm the curmudgeon yelling "get off my lawn" when it comes to AI.

And this is where an example of having AI look at your data and helping you map it out to, say, OCSF, mapping it out to UDM, you know, fixing your timestamps, making suggestions, to me that's the real place to put AI, because it can help walk someone who's, you know, junior and has decent skills through all the right things that a senior person would know by default. Does that make sense?
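
Illustration (not from the conversation): normalizing two firewall vendors that log the same thing differently often comes down to a field map plus a value map. The vendor names, raw field names, and the common field names below are all hypothetical; they are not the actual OCSF or UDM field names, which a real mapping would take from those schemas.

```python
# Hypothetical per-vendor field maps: raw field name -> common field name.
FIELD_MAPS = {
    "vendor_a": {"src": "src_ip", "dst": "dst_ip", "act": "action"},
    "vendor_b": {
        "SourceAddress": "src_ip",
        "DestinationAddress": "dst_ip",
        "Disposition": "action",
    },
}

# Hypothetical value map: each vendor's wording -> one shared vocabulary.
ACTION_ALIASES = {
    "allow": "allowed",
    "permit": "allowed",
    "deny": "blocked",
    "drop": "blocked",
}


def normalize(vendor, raw):
    """Rename vendor-specific fields and unify the action vocabulary."""
    mapping = FIELD_MAPS[vendor]
    event = {common: raw[original] for original, common in mapping.items() if original in raw}
    if "action" in event:
        event["action"] = ACTION_ALIASES.get(str(event["action"]).lower(), "unknown")
    event["vendor"] = vendor
    return event


a = normalize("vendor_a", {"src": "10.0.0.1", "dst": "10.0.0.2", "act": "permit"})
b = normalize(
    "vendor_b",
    {"SourceAddress": "10.0.0.1", "DestinationAddress": "10.0.0.2", "Disposition": "Drop"},
)
```

Once both vendors share field names and values, one detection or one query covers both, which is where suggested mappings for a junior engineer would save the most time.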

Max Clark (12:05.004)
Yeah, of course. So we talk a little bit about output: output to your SIEM, output to a data lake, Snowflake, Databricks, going to an Elastic cluster. Data's coming in, data's going out somewhere, right? Hopefully they're not using Tableau to try to evaluate it. But when you look at these things, so how does,

you know, like, Cribl stay in this lane of saying, we're just going to focus on, you know, input, transformation, output, you know, any-to-any input and output? Or at what point does this also become, like, hey, you know, it's time, we're going to start getting into the data storage? Because is that, like, the eventuality of the frontier for people building pipelines?

Ed Bailey (12:52.555)
Well, and that's what we're already doing now. So we're always going to maintain that ability to get data from anywhere to anywhere. But now we're building in augmentation pieces to help out with various issues that we've seen. So I'm glad you mentioned storage. We've repeatedly seen where we go up to a team, we start talking about data lake, and they say, well, it's going to take us three to six months to get object storage, which is kind of crazy to me because when you think about, you you buy cloud for a capability.

that ability to automate, know, click, click, click, here's your object storage. So if it takes three to six months, man, something's broken. And then it takes another 30 days to get IAM access. It takes another 30 days to get lifecycle management, because typically the security teams especially do not have cloud skills. So they're relying upon a cloud center of excellence and there's a queue and all those fun things. And so this is where we came up with the idea of how do we automate and make it easy so that that way click a button,

Five minutes later, you have object storage. You move a slider, you now have a lifecycle policy. You use a wizard in order to go through your IAM access. So the idea is that turn that three to six months into three to five minutes. And so we found that ease of use to be very good. Now, one of the things from a first principle standpoint that I've been really a big fan of, we make it as easy to get data into the storage as get it out.

I think that's an important thing. You look at storage vendors, typical system analysis vendors, SIEM vendors: you get data in, there's no easy out. And so I think that, in order to differentiate, is really important. And then finally, we're also starting to build search programs. And so the way I like to look at it is augmentation. Companies already have their primary Splunk, Elastic, Palo, SIEM platform, system analysis, but there's always...

add-ons, I like to call them augmentations, that we start to offer. And this is where data lake becomes so important, because this is a typical one: you know, we tried out object storage. We tried out the, you know, the cloud vendor object storage. We tried out the tools that come with them to search them, and they're awful. And this goes back to SQL. I mean, I always thought that, you know, the S in SQL doesn't stand for simple. It stands for something else. And so, and this is where

Ed Bailey (15:12.693)
where it just goes back to don't have the query skills, don't have just, it's just new, don't quite have the data skills to do this. So this is where search comes in. So the idea, make it approachable, make it easy, have an AI driven interface to help someone write a search. Where you're literally saying, find me this string in this place and it'll help you write the search for you just to make it approachable.

Max Clark (15:35.854)
So we had this big transit, I'm not going to say like holy war, right? Kind of was of going from structured to unstructured, schema to schemaless data storage, right? So from your relational database into document databases effectively, Schemaless databases. And then now with AI and LLMs, and since you brought this up, we're getting into this now really unstructured data, being able to search across

Ed Bailey (15:43.019)
Yep.

Ed Bailey (15:52.842)
fun time.

Max Clark (16:05.258)
lots of things. And this is something where AI vendors and providers are starting to talk about this capability, but it doesn't feel like it's penetrated the enterprise that much. But being able to actually start correlating and say, hey, I've got this data in this document, which is stored in this place, which is related to this other thing, and starting to make those correlations. And the simplest example I can probably think of is

transportation logistics, right? You have a bill of lading, a document with things on it that then has to correlate to the stuff that actually gets on the truck. When was it loaded? When was it unloaded? And then you start talking about RFID tagging, which is really cool. If you've ever been involved in retail where they've gone full RFID, what they get out of it, it's expensive, but holy moly, it's amazing.

But now you've got, okay, you want to correlate a PDF document with something that's in a relational database. So something that's in some sort of CRM sales order process with something that's actually an RFID tag being scanned in and out of a barcode reader. And now you want to mash all this stuff up and create business intelligence with it. And it's like, okay, go. How do you pull that off?

Ed Bailey (17:16.587)
Yeah, it is. It is really hard. And the BI teams struggle with the workflows you're talking about. IT and security teams struggle even more, because frequently they don't even know where to begin. Imagine trying to correlate data that has different types of timestamps. Typically there's a mix: there's going to be UTC, there's going to be GMT for Windows, there are going to be local timestamps. It gets even more fun when you're a global organization and you're in 22 time zones.

And so those are the kind of things that just drive people crazy. And this is where normalization and the data prep process are so important. In data science, for example, you have the collect, the cleanse, the fit, the normalization processes. And this is where it's important for the telemetry pipeline vendors and the system analysis vendors in the IT and security space to help users work through these issues, even if they don't necessarily know what these phases are called.

So, for example, this is how you help users walk through timestamp normalization: a Palo event is going to have four timestamps. Which timestamp do you really care about? That's the kind of thing. And don't get me started on why there are four timestamps in an event, but you have to ask yourself which one you're going to key off of.
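
As a rough illustration of the timestamp normalization Ed describes, the Python sketch below converts a mix of timestamp styles to UTC and then picks one field to key off. The field names, the sample values, and the UTC-5 offset are assumptions invented for this example; they are not the layout of a real Palo event or of any vendor's schema.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical event carrying several timestamps in different styles.
event = {
    "generated_time": "2026-02-27T14:03:07+00:00",  # ISO 8601 with an explicit UTC offset
    "receive_time": "2026-02-27 09:03:09",          # naive local time; source assumed to be UTC-5
    "epoch_time": 1772200988,                       # seconds since the Unix epoch
}

def to_utc(value, source_offset_hours=0):
    """Return a timezone-aware UTC datetime for an epoch number or a timestamp string."""
    if isinstance(value, (int, float)):
        return datetime.fromtimestamp(value, tz=timezone.utc)
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Naive timestamp: attach the offset the source is known to use.
        parsed = parsed.replace(tzinfo=timezone(timedelta(hours=source_offset_hours)))
    return parsed.astimezone(timezone.utc)

def normalize(event, key_field, offsets=None):
    """Convert every timestamp to UTC and record which one the pipeline keys off."""
    offsets = offsets or {}
    normalized = {
        name: to_utc(value, offsets.get(name, 0)) for name, value in event.items()
    }
    normalized["_time"] = normalized[key_field]
    return normalized

result = normalize(event, key_field="generated_time", offsets={"receive_time": -5})
for name, value in result.items():
    print(name, value.isoformat())
```

The explicit `key_field` argument is the point of the sketch: someone has to decide which of the timestamps the event is keyed on, and that decision is written down rather than left implicit.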

Max Clark (18:34.706)
You know, I mean, voice platforms, right? You have a call record, and the call record has a start event. And then there's another record, which is the ending event, right? And then you've got to go through and dig through it. And this is the thing that's really funny, because, you know, especially the old phone companies all bill in six-second intervals. And the reason they bill in six-second intervals is that the RADIUS system that's actually outputting this call detail

only works in six-second intervals, right? So I really laugh about it, because I've gotten to the point where you find something that makes no sense and you're like, okay, what's the underlying technology issue or business process that's actually driven this nonsensical reality that we have to deal with?
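
The call-record example can be sketched in a few lines of Python: pair each start event with its stop event by call ID, then bill the elapsed time in six-second increments. The record layout and call IDs are hypothetical, and rounding up to the next increment is an assumption; the conversation only says that billing happens in six-second intervals.

```python
import math
from datetime import datetime

# Hypothetical call detail records: start and stop arrive as separate events.
records = [
    {"call_id": "a1", "event": "start", "time": "2026-02-27T14:00:00+00:00"},
    {"call_id": "a1", "event": "stop", "time": "2026-02-27T14:00:47+00:00"},
    {"call_id": "b2", "event": "start", "time": "2026-02-27T14:01:10+00:00"},
]

def pair_calls(records):
    """Group start and stop timestamps under their call ID."""
    calls = {}
    for record in records:
        when = datetime.fromisoformat(record["time"])
        calls.setdefault(record["call_id"], {})[record["event"]] = when
    return calls

def billable_seconds(start, stop, increment=6):
    """Round the call duration up to the next billing increment (assumed behavior)."""
    elapsed = (stop - start).total_seconds()
    return math.ceil(elapsed / increment) * increment

summary = {}
for call_id, events in pair_calls(records).items():
    if "start" in events and "stop" in events:
        summary[call_id] = billable_seconds(events["start"], events["stop"])
    else:
        summary[call_id] = None  # no matching stop event yet, so nothing to bill

print(summary)
```

Calls with a start but no stop are kept with `None` rather than dropped, since an unmatched record is exactly the kind of thing someone later has to dig through.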

Ed Bailey (19:15.531)
you

Ed Bailey (19:21.835)
It's insane. And it's a huge privilege to get to talk to these teams. I recently got to hear a telco vendor describe what they call their cyber interface, their cyber surface: a billion events, you know, every other second. But then there's what they call their core. That's their actual telco environment: trillions of events every second. And I was just amazed from a scale perspective. And then we start looking at why.

You know, they're using some fairly old technology, but it has been customized to the nth degree in order to handle it. And these are exactly the kind of things, like you just said: there's a why behind this, like why call records are the way they are. When you start looking at the local routing and how that works, it's important to ask that why, because it is critical. And going back to what you were saying about correlation and data, this is where it's important for companies to train.

It's also important for companies to hire data skills. I remember one of the best things I ever did: I went out and hired a data scientist into my team. He didn't know a lick about IT and security, but he sure knew a lot about data. And this is the idea: how do you bring those skills in to level up and make game-changing changes, changes that matter, not just changes on the periphery? For example, we built a bunch of ML work into our SIEM, and we could not have gotten that done

without his ability to understand data and think about it differently. But the thing I love about it is he taught everyone else how to do it. So now my whole team is talking about data. So the culture is important, but the vendors have to step in and provide the tooling as well.

Max Clark (21:01.388)
What prompted you to jump from the end user enterprise side of the house over into the vendor service provider side with Cribl? You were a customer and you became...

Ed Bailey (21:09.611)
Oh, I was agonizing about this. I'd never... So this was a very stable, well-established job. I mean, it was a great job, working with some really amazing people, but I'd been there 15 years working really hard. And I was just amazed, because, you know, getting involved so early with Cribl... here's a funny story. Clint Sharp, the CEO, this was early 2018. You know,

they didn't have any customers, didn't have any visibility. So he puts out a Facebook ad that says, hey, come talk to Clint Sharp about Cribl. And I just happened to see it. And I was like, I don't know what a Cribl is, but I knew who Clint was because he had sold me stuff at Splunk. So I just happened to take the meeting. It could not have been more random, but then what he showed me, I was like, I want that. I mean, I want that now, because I just saw the value, because we were struggling so much. I was spending 80% of my team's time

on managing the mismatch of stuff in order to get data into my systems. Anything I could do to make it easier, I was like, I want that. And then we got really involved. I got to see the culture. And a couple of years later, I agonized about it, because I'd never worked for a vendor before. You know, salespeople were people to be avoided. So I'm like, you mean I'm going to have to work with them? But I got to talk to, like, Abby Strong and

and who runs marketing, and it was just building those relationships and that trust. And I was like, you know what, I want to do this. Because it was a huge leap. I'll never forget, my wife was like, why are you agonizing so much about this? Because I was in a mild panic about it. And I was like, you've got to understand, I'm leaving a multi-billion dollar company for a company that has a couple of million dollars in revenue. So I just want you to think about this. But it was a great change, and I'm glad I did it.

Max Clark (22:55.662)
Yeah.

Max Clark (23:01.23)
You know, you were struggling with it, right? Which means other people are struggling with it, and maybe they haven't found telemetry pipelines, or they haven't found Cribl yet, or they haven't dipped their toes into this. So they're still on the fence, where it's like, okay, we kind of think that this thing is out there. Maybe it's going to solve some problem, but we're not really sure if it's going to solve enough of a problem. And do we want to spend time on it? What is the, like,

Ed Bailey (23:04.06)
Yeah

Max Clark (23:28.942)
you know, like what's the like inception thought that you wish, you know, like that you'd want to plant with those people, right? Like other than like, hey, talk to me, let's have a conversation. I'll show you the golden road to the promised land, right? Like what do teams need to know about?

Ed Bailey (23:41.163)
Yeah.

The biggest thing, when I typically start a conversation, is: tell me where it hurts. What are you struggling with? Because that's what started me looking for these things: we were struggling. This is driving me nuts. I mean, I'm getting yelled at. My CIO is yelling at me because I'm spending too much money. And everyone that uses my tools is yelling at me because they don't have enough data. I mean, I had one remarkable day where I had all these VPs going on and on about how they needed faster response

time and more access to data. Not an hour later, I get called into the CIO's office, and he wants to talk about how I'm spending way more money than he wants me to spend, and I've got to figure out how to do something else. And I leave that meeting thinking, I'm not going to say it, but I'm kind of F'd here.

Max Clark (24:29.452)
Yeah, yeah. Oh my goodness. Nobody else has ever had that conversation in the history of IT, I'm sure, right?

Ed Bailey (24:36.402)
So those are the kinds of things where, where I talk like, Hey, does this sound familiar? Is this the kind of stuff you're struggling with? Are you spending too much time doing this? Is everything too hard? And this is where we start talking about how to make your life easier, how to get your bosses off your back about your costs. Let's put in a plan, a plan of how we can flatten out your cost in order to calm down leadership. But then also, cause this is personally for me, let's talk about how we can produce better data to help you solve problems.

Because at the end of the day, that's what gets engineers and architects excited: how can I solve problems better, faster, easier? And that's the thing that excites me about Cribl, because it's one of the few places where you can do both.

Max Clark (25:17.582)
Do you think that the SIEM vendors look at you as friend or foe in these conversations?

Ed Bailey (25:22.419)
Absolutely, absolutely a foe. I mean, and I appreciate that we have some really close partnerships with a couple of SIEM vendors who really appreciate what we do, because we can accelerate value. We had a large insurance company. They were on probably the most popular SIEM platform for 10 solid years. And this is the CISO I was mentioning. He's like, I need to make a change. They're not offering me value. They want to double my costs. But he had five months to get it done. So he trusted

another partner and us to get this done in five months. And it was great. I mean, just remarkable as we started this transformation, because we were able to make that move. Typically teams don't want to do it because it's a lot of work, it's a lot of risk. But that trust is what really felt good, him knowing that we were going to be able to deliver the value he wants and give him the upside he wants in terms of data. So that was key. And those are the kinds of things we talk about. But you're absolutely right, in general they all view us that way.

Because think about the business model. It's all based on consumption: more data, more CPU, more storage, not value. And so we, and this is true of any telemetry pipeline vendor, are directly attacking their business model.

Max Clark (26:36.362)
And to their credit, this is one of the things I think you get a lot of value out of with the MDR vendors, right? Because, I mean, yes, they're making money on the SIEM in a lot of cases. There are some cases where they don't. But it's a big cost for them as well. So the MDRs don't want terabytes and terabytes and terabytes of data being pushed into SIEMs that they have to sift through. So it's a...

Ed Bailey (26:41.332)
Okay.

Max Clark (27:03.566)
You know, I'm thinking about transforming and doing effectively a rip and replace of a SIEM platform in five months. And I just immediately go to, well, you've got a lot of data now that you have to export and then re-ingest and put somewhere else, because at some point you're going to have to do some compliance function, like, hey, tell me what happened 15 months ago on this platform. Right? And where do people store that?

Ed Bailey (27:23.049)
Yeah, we didn't have to do any of that, because we have that problem solved. This goes back to the ability to read data sources. We didn't have to export any data. We just had to expose it to Cribl's tools, which could then read the proprietary formats from their legacy vendor. All they had to do was basically give us access to it. All right. So, a large proprietary vendor, with data that's in their own format. Yep.

Max Clark (27:44.168)
Wait, Say that again?

Max Clark (27:50.606)
You're talking about the SIEM. You're talking about the SIEM. Okay?

Ed Bailey (27:52.523)
Yeah, so they had five years of data in there. I mean, that's a lot. So instead of having to take that data, export it, and then reformat it to put it into another tool, into another data lake, they put this data in object storage and exposed it to our tools, because we can read their formats, the SIEM vendor's formats. And so they can search that data anytime they want to, without that vendor.

Max Clark (27:57.976)
Okay.

Max Clark (28:19.832)
That's a big deal. I mean, I'm trying to pause here, like, that is a really big problem that a lot of people end up having to deal with and solve for.

Ed Bailey (28:21.555)
It's a huge deal and it's...

Ed Bailey (28:34.535)
It stops vendor migrations. I mean, there's one particular vendor that basically got sold for, you know, basically crackers to another SIEM vendor. And so you have about five or six thousand users of this particular vendor who are orphaned. And, you know, they want to get out, but they don't want to go to the vendor that bought them, because that vendor wants 3X more. But the thing is, their data is trapped because of the nature of this particular... so

Max Clark (28:57.486)
Yeah.

Ed Bailey (29:01.535)
They're flooding in to us, asking: how do we do this? How do we get our data out? How do we do these things? And it stops migrations cold. I had a conversation with a healthcare CISO, and he was like, I hate my SIEM. I hate my SIEM vendor. I mean, I'm going to be really... but the cost and distraction of migration are so high, and I'm concerned about maintaining my posture, so despite how much I loathe what we're doing,

Max Clark (29:21.538)
Mm-hmm.

Ed Bailey (29:31.327)
the cost makes me really hesitant to migrate. But this is where we walk them through it. It's like, right, we can put Cribl Stream in place, and we can maintain the data flows as is to your existing SIEM. Nothing changes. But then we can take your entire production data flow, reformat it, and point it at your new SIEM vendor. So now you have two flows of data, each optimized for its vendor. And now you can start. The thing is, you never migrate your content

from your old SIEM, because you have to rewrite it. That's why I never say migration, because it's not like, I'm going to take the dashboard from here and put it over here. You can now rewrite everything, all your content, into your new SIEM and validate it. Because the hardest thing about SIEM migrations is the validation of content, because you rarely have enough data in your new SIEM to validate. More often than not, it's a very vaya-con-Dios moment where we hope it works. We have some samples of data that might work, but you just don't really know, because you have to do hard cutovers.
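
The two parallel data flows can be pictured with a small Python sketch: one production event is formatted once per destination, so the existing SIEM keeps receiving what it always did while the new one receives the same event in its own shape. The destination names and the two output formats are assumptions for illustration only; this is not Cribl Stream configuration or any vendor's actual ingest format.

```python
import json

def format_for_legacy(event):
    """Assumed legacy format: space-separated key=value pairs."""
    return " ".join(f"{key}={value}" for key, value in sorted(event.items()))

def format_for_new(event):
    """Assumed new-platform format: one JSON object per event."""
    return json.dumps(event, sort_keys=True)

# Hypothetical destinations, each with its own formatter.
DESTINATIONS = {
    "legacy_siem": format_for_legacy,
    "new_siem": format_for_new,
}

def fan_out(event):
    """Produce one copy of the event per destination, shaped for that destination."""
    return {name: formatter(event) for name, formatter in DESTINATIONS.items()}

copies = fan_out({"host": "fw-01", "action": "deny"})
for destination, payload in copies.items():
    print(destination, payload)
```

Because both copies come from the same production event, content rewritten for the new platform can be validated against real data rather than samples.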

Max Clark (30:17.091)
Right.

Ed Bailey (30:26.869)
But having Cribl in place, I can have my complete production data set, so I know my content works. And then you can do A/B testing. So for this particular CISO, we said: get this in place, and I want you to do A/B testing. For 60 days, if you get a hit on your old SIEM, you should get a hit on your new SIEM. That way you can confirm that when you cut over, you have the same level of coverage you had in your old SIEM. So they did it for 60 days, and they felt comfortable.

Then he left his old SIEM going for 30 days after. The idea is, what are we missing? They didn't miss anything, but that comfort of knowing that we're going to catch this, we're going to understand gaps, is what gave him the comfort of, okay, we're going to make this change. But it just really struck me, that: I hate my SIEM. I hate my SIEM. Yes.
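
The A/B comparison described here boils down to a set difference: every detection that fired on the old platform during the test window should also have fired on the new one. The Python sketch below assumes each hit can be reduced to a (rule, entity) pair; the rule names and hosts are made up for the example.

```python
# Hypothetical detection hits collected from both platforms over the same window.
old_hits = {
    ("brute_force_login", "host-17"),
    ("dns_tunneling", "host-04"),
    ("impossible_travel", "user-jdoe"),
}
new_hits = {
    ("brute_force_login", "host-17"),
    ("impossible_travel", "user-jdoe"),
    ("new_admin_account", "host-22"),
}

def coverage_gaps(old_hits, new_hits):
    """Compare detections from both platforms and list where they disagree."""
    return {
        "missing_in_new": sorted(old_hits - new_hits),  # coverage lost after cutover
        "only_in_new": sorted(new_hits - old_hits),     # extra detections worth reviewing
    }

gaps = coverage_gaps(old_hits, new_hits)
print(gaps)
```

An empty `missing_in_new` list over the whole window is the signal that the cutover keeps the same level of coverage.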

Max Clark (31:02.498)
Right.

Max Clark (31:11.886)
Great.

I despise it. Well, I mean, that's the whole thing, right? So you start talking about managing risk, right? A lot of these decisions have nothing to do with actual technology or capabilities. It's just about perceived risk, and can we mitigate risk or not? And if we don't think we can mitigate risk, then it's just like we're not going to do it, because it's just not worth it. Even if we get better tech, we get a better tool, we get better capabilities, it's not worth crossing that Rubicon, right?

Ed Bailey (31:39.423)
Yep. That's the thing about understanding the psychology of the decision making. It's the same thing on the IT side. Imagine going in, and it's like, hey, we have a better searching experience. Well, I have 5,000 users using our existing search tool. We have 10,000 dashboards and 21,000 alerts. Are we going to move all this? And how much better is your tool, in order to offset the displacement cost of moving all this? I mean, it's rarely different enough

in order to make that math work.

Max Clark (32:13.087)
Ed Bailey (32:14.781)
Unless of course they're asking to double or triple your renewal, because we are seeing that lately. Some vendors will wait until four or five weeks before renewal and then literally say, hey, we want 3X more. Yeah, we had one customer, one of my favorite, favorite customers, a big Irish guy, great big red beard, who has a colorful use of language that I can appreciate.

Max Clark (32:28.184)
Drop the bomb.

Ed Bailey (32:41.099)
That's exactly what happened. He's on the security side, but he went to the IT guys and was like, hey, I can migrate you to their biggest vendor in 30 days. And that's exactly what they did. So even though they got their renewal five weeks out, he was able to migrate them to their biggest vendor in 30 days. And the other vendor came into the CIO's office, and they were all waiting there, because he thought he was going to get himself a big check. And instead he got a very colorful version of: never come back here ever again.

Max Clark (33:09.568)
You know, it's

Ed Bailey (33:10.635)
Okay.

Max Clark (33:13.966)
Without naming names, I know who you're talking about. It's an unfortunate reality of a lot of business, because once you become captive in a platform anywhere inside your stack, and there's just simple math. It's pretty easy. If you raise your rates x%, you're going to lose x%, but your total revenue is going to increase x%. And then the exercise just becomes

Ed Bailey (33:19.242)
Go.

Ed Bailey (33:24.126)
Yup.

Max Clark (33:43.022)
how much can you dial that knob before you exceed your tolerance of actual overall loss? And I mean, is it good? Is it bad? It's just reality, right? Anybody optimizing their business is going to want to increase revenue as much as they possibly can to create the most value they can for their shareholders. It's nice to have options if you're on the receiving end of that, because guess what? I mean, this is where it becomes like a

Ed Bailey (33:49.472)
Go.

Max Clark (34:12.472)
Capital is a maximalist, right? It's like if you have competition and you have an alternative, like you keep everybody in that relationship honest. It's like, hey, listen, I can leave you guys if you want to do this. I don't care.
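
The "simple math" behind a renewal increase can be written out in a short Python sketch. The percentages below are hypothetical; the only inputs are how much the price goes up and what share of customers leaves as a result.

```python
def revenue_change(price_increase, customer_loss):
    """Fractional change in total revenue after a price rise and the resulting churn."""
    return (1 + price_increase) * (1 - customer_loss) - 1

def breakeven_loss(price_increase):
    """Share of customers a vendor can lose before the price rise stops paying off."""
    return price_increase / (1 + price_increase)

# Hypothetical: raise prices 20%, lose 10% of customers.
print(round(revenue_change(0.20, 0.10), 4))

# Hypothetical: ask for 3X (a 200% increase) and see how much churn is tolerable.
print(round(breakeven_loss(2.0), 4))
```

Under these assumptions, a vendor asking for 3X still comes out ahead unless roughly two thirds of its customers walk, which is why a credible ability to leave is what changes the negotiation.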

Ed Bailey (34:21.867)
Yeah, exactly. This is where, as we mentioned, in terms of downsizing your system analysis or SIEM, a lot of times vendors won't let you. I mean, we got to see this real recently. These guys needed half the size of their system analysis, and the vendor was like, sure, here's five terabytes, but for the price of 10. And this is where I have to warn them. It's like, look, I want to set some expectations here: this is where you've got to be prepared to leave.

Or you're going to be looking at, say, a 20% increase at renewal. We're happy to give you advice on the options and help you position yourself, but you've got to be ready to play hardball.

Max Clark (34:59.128)
Yeah.

I wish more people were ready to play hardball. It's just, they think they're ready to play hardball, but when they actually get to that moment, it's scary. It's really scary. And you have to be able to walk into that conversation with conviction.

Ed Bailey (35:14.719)
You're absolutely right. And the risk conversation, you got to understand that. Had another customer, remarkable conversation where the guy who owns all their IT and security telemetry calls us up and he's like, the CFO said we have to make a change. So think about that. When the CFO, when the business relationship has deteriorated to the point where the CFO is making the decision, not the CIO or the CISO.

Max Clark (35:38.486)
Yeah, that's also not a good conversation to be in.

Ed Bailey (35:40.555)
You could just tell, he was just like, I don't know what just happened, but we have to make some changes here. But this is where, from a strategy standpoint, I always encourage decoupling. I always encourage open formats. I have this whole thing, it's cheesy, but I call it a strategy for change. And this is the idea of decoupling your functions. You know,

Max Clark (35:58.594)
Mm-hmm.

Ed Bailey (36:10.207)
data lake versus retention versus compliance versus your system analysis versus your detection platform, decoupling your data ingestion and your data collection from your backend, data skills, putting all those things in place with a focus on giving you options. And I think options are key, because if I have options, I can make choices. If I don't have options, I'm stuck.

Max Clark (36:36.278)
Ed, this has been fantastic. I'm going to leave you with one last question for us to end on. For somebody listening or watching this, interested in thinking about a pipeline, curious about Cribl, what is something they should know as they're dipping their toes into this and starting down this path?

Ed Bailey (36:41.679)
Huh.

Ed Bailey (36:58.699)
Be open to change, be open to learning about your data. The first question I'm going to ask you is: let's talk about your data. What's in your data? What do you care about? Be open to learning about your data, be open to asking questions, to being a little bit uncomfortable, to looking at your data differently. I think that's very important. Also, look for quality user experiences. When I say user experience, that means a UI that I can understand, things that are self-documenting,

a user interface that's going to help me get my job done. Be really focused on that. And then also, just think differently about your data and adopt a data mindset. Bring that into your IT and security world. I think those are the things to be really open to, because I think that's where Cribl is uniquely positioned to help you with those choices. You have people like me who are going to help walk you through it, who are going to be there to help you. It's not going to be, you know, here's a license and good luck.

And so it's just really important to think about things differently. This is a joke one of the guys who worked for me made. He was like, yeah, we're going to break the wheel. We're going to do something different, because right now the wheel's rolling over us. We need to break the wheel in order to do something differently. And that's where we started on this experience, and this was early, of how we can treat our data as code, make it flexible, make it transportable, and give us options. So now we're the ones making choices, and not our vendors.

Max Clark (38:24.32)
Amazing. Ed, thank you very much. This was fantastic. Really enjoyed it.

Ed Bailey (38:27.339)
This is great. This is one of my favorite experiences. I really appreciate it. Thank you for the invitation. Love to do this again. Thank you.

Max Clark (38:33.438)
Absolutely, I'll take you up on that one for sure.

Ed Bailey (38:36.74)
Appreciate it. Hey, have a great day. Take care.