What Is Network Redundancy? Does Your Company Need It?

July 30, 2020

Anna Claiborne, SVP of Software Engineering at PacketFabric, delves deep into the potential future of global networking and offers insights into ever-evolving networking capabilities, telemedicine and big data.

Episode Transcript:

INTRO: [00:00] Welcome to the Tech Deep Dive podcast, where we let our inner nerd come out and have fun getting into the weeds on all things tech. At Clarksys, we believe tech should make your life better, searching Google is a waste of time, and the right vendor is often one you haven’t heard of before. 

Max: [00.18] Hi I’m Max Clark and I’m talking with Anna Claiborne, who is the co-founder and SVP of Product and Engineering for PacketFabric. Anna, thanks for joining.

Anna: [00.26] Hi Max, thanks for having me.

Max: [00.28] For the people that don’t know PacketFabric yet, or haven’t listened to the other podcast with Jezzibell, at a high level, how would you describe what PacketFabric is, or actually, let me restate that — What was the impetus to create PacketFabric and how did that influence what PacketFabric is? 

Anna: [00.45] So, the impetus to create PacketFabric came from – a lot of it came from the cloud revolution, right, that we saw? At least, that most of us in the tech world saw… So, we went from this old, archaic model of having to rack and stack servers, configure each and every one, or you know, ZTP them, and then sit there and give them all their special care and feeding, to just clicking a button or calling an API and provisioning a thousand servers in a cloud. And you know, that’s great – it helped really advance a lot of things technologically, helped spur this whole new revival in like, app development. But that same thing never happened for the network, right? There’s two sort of crucial pieces of infrastructure, there’s compute, and there’s network, at the most basic level. I’m including storage in compute there, just two big buckets: compute and network. And so, compute jumped ahead in these massive leaps and bounds as far as what you can do with it and how you can automate it and how easy it was to provision it, and network didn’t change, didn’t change, didn’t change. You know, provisioning a hundred-gig point to point is still exactly the same as provisioning a frame relay circuit back in 1996, like… Absolutely nothing changed with that. We realized that fundamentally that’s a huge problem for everybody, right, because advances in technology can only move as fast as infrastructure advances, right? As fast as it gets easier, and better, because if you’re waiting months to get more bandwidth or more compute, your innovation is going to stagnate. That was the driving force behind it.

Max: [02.26] I mean, one of the big things you touched on is anybody who has ever configured or ordered a circuit from a carrier, they’ve gone through a process, right? Even if you’re in a carrier neutral data center, that is a process of pain, right? I mean, I don’t feel like it’s… It’s, you know, so…

Anna: [02.44] A gauntlet, really.

Max: [02.47] A gauntlet, right? I mean, I was in a lot of forums where one of the big ones that came out was, you know, question to the carrier: could you just automate the LOA generation process for me? You know, if I order the circuit from you, can you just automatically send me the LOA, so I can send it to Equinix and order the cross connect without having to involve more people in the process, right? But PacketFabric is fundamentally more than just, how do you work through the pain of provisioning these initial circuits, because you know, if I’m ordering an internet circuit, it’s… Okay, it’s a lot of pain to get it, but then it’s just there, or if I have two facilities I want to link, it’s a lot of pain to get that up, but then it’s just there and we can make modifications to that. You took a pretty big next leap on this, and this is also more than just cloud, you guys… I mean, it’s network as a utility, it’s network as code, it really is something a little different here.

Anna: [03.36] Yeah, and that’s – I mean, a lot of that, you can sort of trace back to how we, you know, how we think about building our product, is that we think about it in terms of basic building blocks, right? You have ports, and you have VCs, and then on top of that what else can we build? We can build point to point circuits, you know, A to B. You can build connectivity to cloud, you can do data center connectivity, and then you can do all those things from a single port, right? And if you go back to saying that, you know, you set up a point to point circuit, it’s great, you’re done and you never have to think about it again… If only that was actually true, right? If only anybody never had to think about their network again after it was set up, but that never really happens, you know? What is the path when you need to upgrade from ten gig to forty gig? Or from ten gig to a hundred gig? The path is exceptionally painful, it’s basically hitting the reset button on going back to, you know, talking to telecom sales, renegotiating, going through the whole provisioning process again, the whole circuit acceptance process, you’re stuck on a treadmill, going through another – anywhere from thirty days to six months of activity to get from that ten to a hundred gigs, and we’ve just… You know, we can compress that down to a minute.

Max: [05.02] What I loved about PacketFabric when I first saw this announcement and you guys came out of stealth of, this is what we’re doing… A lot of the pain points that you get into of capacity planning, contract cycles, how much capacity do I need and what term, pricing based on term… There’s a lot of that that comes into this where immediately you say, you don’t have to worry about capacity planning, what ports do you want with this? You want a faster port, you want a slower port, you want this port? You have a port, right? You don’t have to worry about capacity planning between markets, you don’t have to worry about term lengths. You know, you can get into these flexibility agreements that are — you don’t know what you’re going to do with it! And anything that makes my life better, I’m all for it, right? Who wouldn’t want this? Oh wait, I don’t have to think about what my eighteen month plan is for capacity between these two markets, I just can change it really easily? I get all of that, and that speaks to me a lot. How much of this though is like… You build it, and then see what happens, you know? If you build it they will come, or there’s a certain amount of chicken and egg, because the original idea for PacketFabric versus how people are using PacketFabric… I mean, I’d be shocked if you said you predicted all of these use cases now that have come out of it.

Anna: [06.21] A lot of them, yeah… I don’t want to — I don’t know that there’s a good answer to that question, because either you sound over-confident or you sound like you don’t… Like you don’t know your customer that much, but a lot of them are pretty well known even five years ago. Remember, we’re not that — the company is still fairly young, in terms of for an infrastructure company, five years is insanely young. And so, a lot of things that existed then exist now… I think the thing that we probably didn’t see being quite as popular as it is now is the — is like the Layer 3 use cases, you know, people wanting to actually automate setting up BGP between — for the cloud scenarios. I mean, who saw cloud being a bigger thing, right? No-one saw that coming, clearly, but the hybrid clouds and multi-cloud use cases have definitely become much bigger, in like the last just even year or two, because I think there’s this sort of… People are going through a bit of a disillusionment period with public cloud, because there was this huge race to… I don’t want to use the servers any more in a data center, put everything in the cloud, everything goes in the cloud. You have people trying to put like, their AS400 database in the cloud, and then it’s like, oh, that’s actually a bad idea. That’s not going to work, but also the cloud can be expensive, right? Unless you’re utilizing it effectively… And so, there was this mad rush to put anything and everything in there without really considering the fit or cost, and now is sort of the era that I think we’ve been in the last couple of years where people are stepping back and saying… How do we best utilize public cloud and our own resources, and make the best out of all this? 
And with that actually comes a lot more new communication challenges, because maybe AWS has some feature set that you want to use, and Google Cloud has some feature set you want to use, and Azure has some feature set that you want to use… Then on top of that, you want to use some Oracle, or you want to have a direct connection to Salesforce, right? So, the ecosystem got much bigger, and on top of that you have a couple of existing data centers that you have compute in and you want to move workloads around, and do all this stuff that’s been promised for like, years and years now, because that was the whole potential of cloud, right? It was: free the workload, and be able to shift it where it’s most economically advantageous to run it at the time, and so people are starting to go, well now I want this dream fulfilled and I want to get my costs under control and I want to run things in the appropriate places, and the little bit that everybody forgot about in there is the network, right? If you want ultimate flexibility in compute, you’re moving all of this data over the network, and if you have that rigid – that super rigid piece in between, that can’t be changed, it can’t be dynamic, you can’t create new connections on the fly to where you need, that’s a huge problem – that right there is the show stopper for that dream, because you cannot move data without the network.

Max: [09.23] I was thinking… I think there’s an xkcd comic where they talk about the trough of disillusionment in the tech adoption, so I’m cracking up over that. So, a lot of the infrastructure companies of course are pushing hybrid cloud, hybrid cloud, hybrid cloud, hybrid cloud… And there are a lot of use cases, hybrid cloud is amazing, and we see them a lot with our customers. But the other one I don’t think anyone was really predicting was this cloud to cloud, completely virtual environment, of… I don’t have physical infrastructure anywhere, I just need AWS to talk to GCP, or Azure or Oracle – in your example – and I need something to actually tie all these things together for me, privately, fast, and I can just click a button and make it happen. And you guys, you’re doing this. I mean, this is part of your core product. The other thing that I find very unique, I think with the early story of PacketFabric, was more… This idea of like, Layer 2 as a service, or network automation as a service… Of not, per se, hey this is a cloud connection technique, this is a… We want a network between point A and point B, and you want that network to be… I mean, do you guys even sell one gig? It’s like ten gig, forty gig, hundred gig, x-by hundred gig, four hundred gig, I mean it’s…

Anna: [10.31] We do, we do sell one gig, it’s actually by percentage of customer allocated ports, ten-gig is the most popular, the next most popular is hundred gig, and then after that, forty, and one gig have about the same percentages. And hundred gig is getting pretty close to… It’s a lot closer than I thought it would get to ten gig demands, so that shows you where the world is going – everything is going hundred-gig. 

Max: [10.56] Well, I mean… Optics for an Arista for a hundred gig is relatively inexpensive. You can buy a hundred gig optic for fifteen hundred bucks if you’re not going third party, so it’s… Arista, don’t listen to that, but you know, I mean… Hundred gig finally got affordable, it feels.

Anna: [11.14] Yeah, in the last like, in the last year the optic prices have plummeted and you can get really… You can get sub-thousand dollar pricing at any sort of scale now, for an LR4… And that’s, you know, without even really trying, you can get better than that, even! So yeah, it’s very cheap, ubiquitous and affordable now.

Max: [11.34] But I mean, you’re already past that point. You’re at four hundred gig, six hundred gig and eight hundred gig, right? 

Anna: [11.39] Yeah, that’s — that is where we will be going next! 

Max: [11.43] This is when I feel really old, because then I have this moment of… Well, when I got on the internet, I had a 56k – actually I had a 1200 baud modem, but I’m not going to say that initially, and now you’re going to start thinking ports, I mean, this is a six hundred gig port. That’s a massive amount of data being shifted, but being able to gain access to that, and to have a dynamic pipe that you can turn up, move data, turn down, reconfigure, do something else with, and it no longer becomes a whole new factor of saying, “Hey, I need to move stuff from wherever they are, from two points, just instantly,” I mean, this turns into… People are doing things… Would you have predicted? I mean, would you have predicted an IP transit carrier would be using you to backhaul to customers in different markets, as an original use case?

Anna: [12.29] No, no, I mean there’s definitely a fair amount of stuff we did not see coming, right? I guess that’s part of any as-a-service model, right? Compute as a service, network as a service, and so part of it is just you’re putting a generic framework out there and saying, “Hey people, what are you going to do with this?” And so there’s always some surprising things, and you were mentioning before the cloud to cloud use cases, and that’s definitely something that surprised me, when people first started asking about that, because it’s shifted – and I mean maybe this is how old I am – but you know, back in my day, you always had, you know, there were network provisioners and network engineers and people who did peering, and so specialized in the network portion of it, it was its own little tight, network sphere of people who had deep expertise in this one area, and now there’s DevOps, SRE, just straight software engineers that are like, can’t I just spin up… Because their mindset, and where they come from is everything’s as a service, and they’re asking, well why can’t I just spin up network between these two things, why doesn’t that work? Which is pretty obvious, you know, when you think about it, it should work, and we’re like, “Wow, that should work, we can do that.”

Max: [13.49] I’ve, for years now, advocated for higher bandwidth enterprises to, you know, take some small increment of data center close to their offices, their primary locations, do an in-metro connect, you know? Because a ten gig circuit even five years ago was relatively inexpensive in a metro… And then, now you’re in a data center that has dozens or hundreds of different carriers, and you have extreme flexibility in pricing, for what IP transit bandwidth you’re going to acquire, or where you’re going to land your MPLS circuits… Basically, the world’s your oyster with competition in different carriers, and… You know… The big shift now, right, we talk about – I think PacketFabric originally, when I was thinking about it, was… Oh, I can connect my own endpoints easier, right? You know? Or I can connect to cloud easier, but now there’s also this other use case which becomes, instead of deploying an MPLS network, I can connect to other PacketFabric customers, anywhere on your environment, dynamically, instantly, from any of these locations, and the data center still becomes a strategic asset for me, of… Get to the data center, connect to PacketFabric, have massively phenomenal port speeds, you know, with dynamic allocation of service, and I don’t need to worry about talking to a telco for MPLS, because any customer that’s on your fabric, I can connect to. And that’s interesting as well.

Anna: [15.06] Yeah, and talking about some of the old – I shouldn’t say old – some of the old world, how it used to work, you know, peering used to be a highly specialized discipline… Figuring out who to connect to, based on traffic stats, you know, over your public IXs, you know, who deserves a PNI, private network interconnect, which is just a cross connect to another network, to do this traffic transfer… And, all that now – because there’s this fundamental limitation there, especially for PNI. If you’re doing a lot of traffic to another network, and you want to do it in a private, secure way, i.e. not over a public IX, that you want something that is going to be more reliable, and direct, that means that you have to be in the same location as the network that you want to connect to; you have to physically be in the same data center. And the great thing that we did, is we eliminated that fundamental limitation. You no longer have to be in the same data center if you want to create a private, direct connection to a network. So, it doesn’t matter… You can do that anywhere on the PacketFabric platform, and you can also still connect to public IXs over PacketFabric too, and peer with whoever you’d like to over there. To me, that was always one of the annoying things when I was building networks, for example, back at – I’ll step all the way back to Prolexic for fun – so, building the Prolexic network, which was a massive DDoS network, there was definitely like, we needed to take in a lot of traffic from a lot of different sources, and we had to peer… You know, we had to peer wherever we could, because we took in a ton of traffic over transit, and that obviously gets very expensive, and this is sort of like the first time when I encountered this, like why can’t you just get more bandwidth when you need it?
When you’re taking in an eighty gig DDoS, which at the time was big, that was very big, and you are – you know, you have like… You know, eight by ten lag to your, you know, to your favorite transit provider, and you can’t just add another ten gig quickly, it’s… It’s very disconcerting. It takes like, weeks to get that added. So, anyway, we had this problem, you know, where we needed to peer as much as possible, and you know, our footprint especially when we were starting was pretty limited, we were only in a few data centers, and we couldn’t get a lot of the peers that we needed, wanted as private peers at that point, and had something like this existed… It would have made it a lot easier for us.

Max: [17.46] Sure… And you still see a lot of things where it’s, oh if you want to peer with me that’s great, but you have to peer with me in three timezones, and exchange equal traffic, and there’s all these other things that come into peering that become all these layer eight, you know, negotiation tactics… My favorite is the ones where you send a peering request out and the salesperson that is monitoring that inbox responds to you with like, the sales, “Oh, great! Here’s my sales pitch back to you!” Like, no, this is a peering ask, you know?

Anna: [18.11] Comcast! 

Max: [18.17] Oh boy. Let’s not get into this talk. So, so walk us through… You know, so we’ve decided we’re going to sign up with PacketFabric. Walk us through the process of, you know, of what does it actually mean in terms of like… Forget like, actually… You have a month to month option, so we don’t even have to talk about contracts, let’s forget it from a contract point, like, from a technical standpoint, walk me through the process of what this actually looks like for somebody, and we’ll pick, they’re in Ashburn, Virginia, or Los Angeles or one of these major carrier hotels, so… Obviously you’re on-net and they’re on-net, and everything works out great.

Anna: [18.52] Well, you would just go to dub-dub-dub PacketFabric dot com, click on register, go through filling out some basic stuff, technical information, address, and then you click through the MSA, just, you know, click and agree, hit register, we’re on the backend and we just verify that the company is a real company and in the next twenty-four hours, you get your log-in and you log into the PacketFabric portal, you would then go to ‘order a port’ in the location that you are in, so say that this is One Wilshire, since that is very local to you. We’ll go ahead and pick CoreSite in One Wilshire, so you’re like, “Great, I want to order a port at CoreSite in One Wilshire,” you select that, the port takes about a minute to provision, you click download the LOA, and then you take that LOA and you give it to CoreSite, they run the cross connect, and then as soon as the cross connect is complete, you are connected to the PacketFabric network, and if you then say that your use case is to connect to Amazon, Google, and then also connect to your east coast data center, you would create a connection to Amazon, which takes again about a minute, pretty minor information to fill out, your Amazon account ID, you know, what VLAN you’re going to use… The same thing with Google, and then you would also go and provision a port on the other side, we’ll say CoreSite again since we’re on that theme. We’ll pick CoreSite Virginia, VA1, you’ll provision a port there, we’ll then follow the same process on that side, get an LOA, get the cross connect done there, and then click to create a VC from LA to Virginia, and we’ll just say that’s going to be… These are both hundred-gig ports, you’re going to do eighty gigs across the country, so about a minute later you’d have that eighty gig connection up, and then you’re also… We’ll say that you also have ten gigs to Amazon and ten gigs to Google, too.
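
The capacity arithmetic in this walkthrough (a hundred-gig port carrying an eighty-gig circuit to Virginia plus ten gigs each to Amazon and Google) can be sketched in a few lines of Python. This is only an illustrative model, not PacketFabric's API: the `Port` class, its `add_circuit` method, and the rule that circuits on a port may not add up to more than the port speed are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Port:
    """Hypothetical model of a physical port and the virtual circuits on it."""
    name: str
    speed_gbps: int
    circuits: dict = field(default_factory=dict)  # circuit name -> Gbps

    def allocated_gbps(self) -> int:
        """Total bandwidth already assigned to circuits on this port."""
        return sum(self.circuits.values())

    def add_circuit(self, name: str, gbps: int) -> None:
        # Assumption for this sketch: no oversubscription of the port.
        remaining = self.speed_gbps - self.allocated_gbps()
        if gbps > remaining:
            raise ValueError(f"{name}: only {remaining} Gbps left on {self.name}")
        self.circuits[name] = gbps


la = Port("LA (One Wilshire)", speed_gbps=100)
la.add_circuit("LA to Virginia", 80)
la.add_circuit("LA to Amazon", 10)
la.add_circuit("LA to Google", 10)
print(la.allocated_gbps())  # 100
```

With these numbers the port is exactly full, so asking this model for one more circuit raises a `ValueError`.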

Max: [20.52] And if you’re listening to this, it really actually is this easy. I mean, this is not, you know, paraphrased in any way, shape or form. The… So, VCs are virtual circuits, right? That’s what a VC is? I mean, so also – my brain works with association, so I think of VCs as like… You know, this is a VLAN tag on a dot1q trunk and you just happen to be in-between, you know, you’re not doing that exactly on that network, but in terms of like, anybody that’s doing LAN networking, that’s a good way of explaining what this actually is.

Anna: [21.26] Yeah, yeah… And it does… You have a VLAN on either side that you’re using, they don’t have to match of course, and so – yes. It is very much that feel — and on the interface itself, all we are doing is provisioning that logical interface for that VLAN, so…
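
The point that each side of a virtual circuit has its own VLAN, and that the two do not have to match, can be illustrated with a small Python sketch. The `Endpoint` and `VirtualCircuit` classes, the port names, and the VLAN numbers are hypothetical and are not taken from PacketFabric's software; the only outside fact used is that usable 802.1Q VLAN IDs run from 1 to 4094.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """One side of a circuit: a port plus the VLAN used on that port."""
    port: str
    vlan: int

    def __post_init__(self):
        # 802.1Q VLAN IDs 0 and 4095 are reserved; 1-4094 are usable.
        if not 1 <= self.vlan <= 4094:
            raise ValueError(f"invalid VLAN ID {self.vlan}")


@dataclass(frozen=True)
class VirtualCircuit:
    """Hypothetical circuit joining two endpoints whose VLANs may differ."""
    a_side: Endpoint
    z_side: Endpoint

    def far_side(self, port: str, vlan: int) -> Endpoint:
        """Return the endpoint a frame leaves from, given where it entered."""
        ingress = Endpoint(port, vlan)
        if ingress == self.a_side:
            return self.z_side
        if ingress == self.z_side:
            return self.a_side
        raise LookupError(f"{port} VLAN {vlan} is not part of this circuit")


vc = VirtualCircuit(Endpoint("LA port", 100), Endpoint("Virginia port", 250))
print(vc.far_side("LA port", 100))
```

Traffic tagged VLAN 100 in LA comes out tagged VLAN 250 in Virginia in this model, which is the "they don't have to match" behavior described above.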

Max: [21.44] And part of what you do for PacketFabric is you guys have a big software-defined network – we’ll use the SDN buzzword, right? This is… You’re right, you have an application that’s dynamically configuring and talking to a lot of physical devices now, in a lot of different locations, to say… You know, this customer needs to come up, and what’s my inventory at this address, okay, I need a ten gig port, okay, here’s the inventory for that ten gig port, and then… Here’s everything that has to be built now in order to connect from LA to Amazon East and LA to Virginia East, or you know, these sorts of things, and the explanation of what’s actually happening is hiding a lot of sophisticated things at the same time, because there’s a lot of devices that have to be provisioned in order for this to work.

Anna: [22.29] Yeah, yeah… I mean, there’s a device on either side, I wouldn’t call it a lot of devices, but overall the network is a lot of devices, you know? And there’s some pretty, you know, when this is starting out I guess in my mind it’s always pretty simple because I built the original spec for it, so it’s not very complicated to me. But at scale, there’s some really interesting things that happen, right? Because, the laws of large numbers start to apply, and that’s where things get really interesting to me. So, you know, we have like our in-house SDN controller, you know, that we refer to, that we wrote all of it – for anybody that’s curious, this is all of our own software, we’re using open-source components in there, like… Nothing actually having to do with the SDN controller, but you know, we use a PostgreSQL database, we use Redis, we also use RabbitMQ, and those are probably like the three major open-source components that we utilize in the controller. But other than that, it’s everything that we wrote in-house, and I think one of the more interesting sides is our metrics collection, because we’re dealing with about, you know, five hundred devices now, and we collect metrics from those devices every thirty seconds to a minute, there’s a little bit of variance in there depending on the device. Some of them don’t like to be polled quite that often, so we give them a little bit more time… And by polling that amount of information, we have, it’s roughly… As of a couple months ago, I didn’t look today, it’s going to be much larger, but as of a couple months ago it was about a forty-five terabyte database with all the stats from the network from its inception.
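
To give a rough sense of that collection volume, here is a back-of-the-envelope calculation in Python using only the two figures quoted above: about five hundred devices, each polled every thirty to sixty seconds. The function name is made up for the example, and the number of individual metrics returned per poll is not stated in the conversation, so this counts polls, not data points or bytes.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def polls_per_day(devices: int, interval_s: int) -> int:
    """Number of device polls in one day at a fixed polling interval."""
    return devices * SECONDS_PER_DAY // interval_s


for interval in (30, 60):
    print(f"{interval}s interval: {polls_per_day(500, interval):,} polls/day")
# 30s interval: 1,440,000 polls/day
# 60s interval: 720,000 polls/day
```

So the fleet described here generates somewhere between roughly 0.7 and 1.4 million polls a day, each of which may carry many counters.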

Max: [24.14] So, please tell me you’re running like a time-series database in PostgreSQL so I can completely nerd out. Okay!

Anna: [24.20] Yeah, yeah, so… So, we actually started off – we actually started off with HBase, so this was the system when I started… So, we started off running HBase, and let me tell you, I have a very particular set of skills that encompass being able to completely destroy an HBase cluster very effectively. I burned that thing down at least, like, three or four times. This is well before we were ever operational, so it wasn’t a big deal. And after figuring out that running HBase and ZooKeeper and all this is not for the faint of heart, we decided to actually go with Bigtable in Google Cloud, and so we used that along with OpenTSDB, for storing all the metrics and it was super cool for a while, and it worked great, and then once we started to get to really big scale, we ran into a little bit of performance problems on Bigtable, which we could have corrected just by throwing a lot more money at it, but we decided, you know… Not a big deal, we’re going to move it back in-house into a time series DB, and we’ve been running that for… Over six months now. So, we migrated all of our data back out of the cloud – thankfully we were able to do that effectively, because we were directly connected to Google Cloud from our own system, that’s one of the nice things, is that we’re able to dog food ourselves there for transferring all that data, and it works very effectively, which is great, and so… You know, transfer all that data back out, it’s in TimescaleDB now, and we don’t… We’re running completely our own software: collection, stack, storage, everything on it now.

Max: [26.02] I wrote a billing platform in the early 2000s, based on top of MRTG, and it was feeding data into PostgreSQL, and it really – I mean, literally it was a parser that was parsing… I read through the specs for the MRTG log format and was like, “Oh, I could parse this and shove this somewhere else,” and it wasn’t that sophisticated, but now when I read all these things with time series databases and PostgreSQL, I get really excited of like, “Man, if this existed when I was doing this it would have made it so much cooler,” and then you read all these things of companies taking netflow data and shoving netflow into time series data and just what’s available now is awesome… You touched on this really briefly, but I want to walk back a second, and talk about, you know, almost every company has redundancy requirements. Inside of a data center you’re relatively stable, you know, cross-connects… I mean, they get bumped and inevitably somebody’s working on a panel somewhere, and you have a cross connect that just goes down for some… And it always is the same thing, of you have to open a ticket, and you have to go and unpatch everything and go back and patch everything back in and it miraculously just starts working again. But you offer a couple of options, there’s one option where you get a second port in a facility for redundancy, but then you also do something else that’s very fun on your network that not a lot of carriers will do for you by default, you know, and let’s talk about this.

Anna: [27.14] Yeah, so that’s actually having redundancy by default, and it’s one of those things that’s always astounding to me, that this is a new concept, because when you pay for something like, you know, a hundred gig wave, and all it takes is a you know, an errant guy with a shovel to completely destroy that for you know, hours, days, weeks or months depending on the location of, you know, where this mythical dude decided to go out and dig, it just seems like it’s a really… It’s a really poor design, and it’s a really poor investment for your money. So, every service on PacketFabric is redundant by default, and we have at least two different carriers out of each and every data center, and in the vast majority of cases it’s much more than two, and we actually go for maximum fibre path diversity. Like, when we are… When we are actually sourcing our paths, we are looking at what points they cross over, right? How much crossover is acceptable on those paths, because it happens shockingly more than you think it’s ever going to happen: fibre breaks all the time, which is just amazing – it’s like, don’t these guys ever call 811 before they dig? I don’t know, because there’s always a backhoe going through something, somewhere, and if not, it’s like aerial fibre and a bird got caught in it, and… And a squirrel – now, squirrels are like… A couple of the best RFOs I’ve ever got are severe squirrel infestations on aerial fibre!
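
The idea of checking where candidate fibre paths cross over can be sketched as a simple set comparison. This is a toy illustration in Python, not a description of how PacketFabric sources its paths: the carrier names, the conduit identifiers, and the rule "prefer the pair with the fewest shared segments" are all invented for the example.

```python
from itertools import combinations


def shared_segments(path_a: list, path_b: list) -> set:
    """Physical segments (conduits, manholes, etc.) that both paths use."""
    return set(path_a) & set(path_b)


def most_diverse_pair(paths: dict) -> tuple:
    """Pick the two carriers whose paths have the fewest segments in common."""
    return min(
        combinations(sorted(paths), 2),
        key=lambda pair: len(shared_segments(paths[pair[0]], paths[pair[1]])),
    )


# Hypothetical paths out of one data center, listed as conduit IDs.
paths = {
    "carrier_a": ["conduit-1", "conduit-2", "conduit-3"],
    "carrier_b": ["conduit-1", "conduit-4", "conduit-5"],
    "carrier_c": ["conduit-6", "conduit-7", "conduit-3"],
}

print(most_diverse_pair(paths))  # ('carrier_b', 'carrier_c')
```

Here carrier_a shares a conduit with each of the other two, so a single dig through conduit-1 or conduit-3 could take out two paths at once, while carrier_b and carrier_c have no segment in common.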

Max: [28.53] I’m laughing because I know exactly – the other one that I really like the most is somebody that was bored hunting decided to use our splice box as target practice, and shot out the splice box, and so you have… I mean, at this point the ship’s anchor doesn’t even seem exciting to me – what was the other one that was really terrible?

Anna: [29.10] Sharks! 

Max: [29.11] Sharks! Sharks are fun… 

Anna: [29.13] Sharks are good!

Max: [29.14] Train fires in tunnels, that was a good one. 

Anna: [29.16] Oh, yeah! Train fires in tunnels, yeah, I forgot about that one, that is a very good one.

Max: [29.20] That was not fun for anybody in Philadelphia, just in terms of not having any internet for a week.

Anna: [29.27] Yes, there was like a big period too in New York City where there was a lot of fires in manholes, for some reason. No – it was like, three or four like, right, one right after another.

Max: [29.40] It was an epidemic of fires in manholes.

Anna: [29.42] An epidemic of fires in manholes, taking our fibre and destroying everything. It’s one of those things that’s so… I’m always fascinated by undersea cable, right? The cabling ships… When they ran the new cable from Hillsboro down to Australia, Hillsboro, Oregon, down to Australia, you know… I saw lots of pictures of that, and video, and it’s fascinating to me because it’s so easy to forget that everything that we’re doing right now, you know, the fact that we’re talking to each other over the internet ultimately comes down to this weird cable, this weird, physical cable placed somewhere that all somebody has to do is, you know, chew through it – not even somebody, a squirrel – all that has to happen is a squirrel to get bored and chew through this thing and we are no longer talking over the internet. So much critical day to day activity occurs over the internet, and it rests on this very fragile, underlying thing, and people have a tendency to forget that because we’re so abstracted from it, but we certainly haven’t forgotten that and it’s a major part of our planning. 

MID-ROLL: [30.49] Hi I’m Max Clark and you’re listening to the Tech Deep Dive podcast. At Clarksys we believe tech should make your life better, searching Google is a waste of time, and the right vendor is often one you haven’t heard of before. With thousands of negotiated contracts, Clarksys has helped hundreds of businesses source and implement the right tech at the right price. If you’re looking for a new vendor and want to have peace of mind knowing you’ve made the right decision, visit us at Clarksys.com to schedule an intro call.

Max: [31.14] So, I mean, you say redundancy, which is the right word, carriers talk about this in terms of protected circuits, right? So, if you’re thinking about the code words, if you’re listening right, it’s protected circuits, and the cable systems… You know, every trans-oceanic cable system has protection built into the cable, there’s usually two paths, and on the west coast of the US, we have Hillsboro, Oregon, or Seattle… So, basically – Seattle I think is probably the dominant one, and then it’s San Luis Obispo, and Redondo Beach in Southern California, but these cable systems, you know, one cable runs across both paths, that come from Seattle, and they come from Southern California, and they’re protected, you know? The cable system is protected, but when you go to the cable operator and you buy a wave from them, they don’t give you protection by default, they’re like, “Oh sorry, you’re on the southern path and it went down, you want the northern path too? Pay us more money.” 

Anna: [32.06] Yeah! Yes, and there is an economic reason behind that, right? They are in fact two different cables, there are two sets of economics at play there, and it does make sense that it costs more, it’s just the thing that, especially again as we get into this age of cloud and everything as a service, and network becoming a service, you can’t expect the average person buying that service to understand the depths of, you know, redundancy, and why you need it, and understand how cable systems work, so it should just be something that’s there, but you expect that network to be up, and that’s how we built our network, was just to be up. 
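
As a rough illustration of why a second path matters, the sketch below compares one circuit with a protected pair. The 99.9% figure is a made-up example, not a number from the conversation, and the calculation assumes the two paths fail independently, which real cable routes only approximate.

```python
# Sketch only: availability of a protected circuit, assuming independent path failures.
HOURS_PER_YEAR = 24 * 365


def combined_availability(path_availabilities):
    """Probability that at least one path is up."""
    all_down = 1.0
    for availability in path_availabilities:
        all_down *= 1.0 - availability
    return 1.0 - all_down


def expected_downtime_hours(availability):
    """Expected hours per year with no working path."""
    return (1.0 - availability) * HOURS_PER_YEAR


single_path = 0.999  # hypothetical availability of one path
protected = combined_availability([single_path, single_path])

print(f"single path:    {expected_downtime_hours(single_path):.2f} h/year down")
print(f"protected pair: {expected_downtime_hours(protected):.4f} h/year down")
```

Under these assumptions the second path cuts expected downtime by roughly three orders of magnitude, which is the economics behind paying for both the northern and the southern route.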

Max: [32.49] Or that you know the magic codeword of like, I need to order a protected circuit, or I have an outage on that circuit and I can call up and say I need snap protection on this circuit and it’s like, you’ve got like the magic code — it’s like, “Oh! You know the codeword, okay!”

Anna: [33.00] “Oh, okay, yeah – you can come in the speakeasy now!” Yeah, and yeah, that’s just a layer… Our goal and — I say this in jest, but it is a good goal, is that we don’t want our customers to have to talk to us. I mean, we’re there if you do want to talk to us, but we don’t want our customers to have to talk to us, right? They should just get the service that they expect, without having to give us magic codewords.

Max: [33.26] There was a big moment for me a couple of years ago with PacketFabric, and one of the transit networks that I love is NTT, and they have a relatively small footprint, and part of what I love about them is they have a small footprint. I mean, there’s not a lot of stuff, I mean – their network becomes very efficient in how they control it and how they’ve built it out, and they had this little side note announcement of, “Hey, by the way, all of you people who have been asking us to have network in Vegas, we’re never going to build into Vegas and you don’t want to build your own waves, by the way, you can get ports from PacketFabric now and come directly over to us.” For me, that was a big moment, I had this “Oh wow,” this is really here – this isn’t theoretical anymore, this is really here. Because, a transit vendor at that scale saying you can order ten gig or hundred gig circuits from PacketFabric and interconnect back with us at One Wilshire, or interconnect with us up in San Francisco, or interconnect with us wherever you want this circuit to go, and by the way, PacketFabric can provide you that protection and redundancy with it, and you can just home into ports… I mean, come on, that’s really cool. 

Anna: [34.26] Yeah, NTT transit is awesome, and I’m a huge fan of NTT – they do a lot of really great automation as well on their network.

Max: [34.36] Their automation’s freaky though, when you get into what they’re doing. I mean, it comes from the fact they’ve been writing it for decades, but…

Anna: [34.40] Yeah, yeah… They do, they do a ton of cool stuff, huge fan of them, and it’s a great use case too, because we ultimately help make them, you know, a great transit provider, more available than they’re capable of on their own, just because of the capital investment to go into every data center that somebody might want is, you know, is a pretty big overhead, and the customer base in some of those is relatively small, so… We provide access to that, you know, through all of our POPs, and it just like, it makes their footprint, you know, it goes from, you know, I can’t remember what the NTT footprint size was before, but we add, you know, a hundred and sixty POPs to that, so… It’s a… Or a hundred and seventy now, so it’s not a non-trivial number.

Max: [35.33] I think in — I mean, they added Boston in the US, right? So it’s six… So, it’s three, five… I don’t know, like eight POPs, ten POPs I want to say… It’s not big.

Anna: [35.42] It wasn’t very many, yeah.

Max: [35.44] So, I mean, five years ago when you were starting PacketFabric, I mean, this was still a relatively novel idea. There was a little bit of noise in this market and there was this idea, and this kind of dates back to tier 2 networks that had started doing — it went from IP transit, but the margin in IP transit completely evaporated, so they moved into IP transit, and then it became — sorry, transport, and then there was this idea of remote peering that was trying to be pushed, as like this… Like, you know… Was going to be the savior of all these networks and that didn’t materialize… So, the cloud interconnect and the network as a service idea, or these dynamically allocated networks… It went from conceptual to – this is like, really here. I mean, there’s… I mean, the data center carriers are getting into this, with either intra-metro or in regions, or they’ve partnered with companies, or they’re trying to build out their own fabrics… You know… It’s fun to watch that happen, because it means, you know, this is a real thing, and this is not going away. Like, this is going to become a standard for what happens here. I’ve been curious for a while, like… When do we see a carrier start doing this? I don’t really think we’re going to see a carrier doing this but when does a carrier actually start doing this, because they have the last mile, which is usually the hardest component to always solve, you know? Data centers are relatively easy, but the last mile is the tricky part. I’m kind of curious as to what you think is — what you guys are looking at over the next two to three years in terms of planning and cycle coming down the future here, because this is more than just, “Hey, we’re going to see terabit network links,” this is beyond port sizes getting faster and faster and faster, like, what else is like the crystal ball?

Anna: [37.26] So, I get asked this question a lot actually, as you know, when are the carriers going to jump into network as a service, and I don’t even know that we’ll see it in the next three years at all. Here’s the thing: automation’s really hard, networks are really hard, and combining the two things is really hard. We have this joke that is a running inside joke, you know the line in Jurassic Park, “We have all the problems of a major theme park, and a major zoo,” which is, you know, we have — we are a software as a service provider, and we have a massive network. Infrastructure and automated infrastructure is hard, and that’s why there’s… That’s why you see a couple companies that absolutely dominate at it, like AWS and Google and Microsoft, that’s why there’s only a few of them, because these are hard problems to solve, and when you combine the fact that this is a hard problem, you combine it with carriers have never had the mindset to automate everything, like software is not a thing that they do… Software traditionally has never mixed well with telecommunications, pretty much ever. And then, you layer on top of that the extra fact that, you know, all carriers have a ton of revenue tied to existing services, their wave services, their MPLS services, their VPLS services… All those things that are running today, where’s the incentive to do this? There’s a huge knowledge gap, there’s a huge hurdle in terms of difficulty, and there’s a huge disincentive, you know, from these are publicly traded companies, that have a ton of revenue, that is not tied to them automating a thing. So, there’s, you know, they have whatever the reverse of stick and carrot is. 

Max: [39.20] Innovator’s dilemma, is I think the popular term, right?

Anna: [39.23] Okay, yeah, innovator’s dilemma – and that is a massive problem for them, so… I just don’t think it’s going to happen any time soon, and I’ve had these conversations with some of the folks that are working at these big carriers, who are making efforts to do this automation. Some of them have done — they have implemented systems, they’ve built some software, but the problem is they can’t roll it out for a variety of reasons, they can’t roll it out because of internal process, they can’t roll it out because the network is just too — it’s just a brownfield network. Which, we had the advantage of starting on a greenfield, and easing into a brownfield network and that was a learning curve, and we did it, and now we’re super confident that we can go into any brownfield network and flip it around, but that’s five years of experience there, and also us having the will and drive to do it, and a carrier who is just like, you know, we’re not going to do this in our brownfield network, because there’s too much existing revenue at stake here, we can’t go and meddle with it, so you’ve just got barriers to entry all around.

Max: [40.24] So clearly what you’re talking about is, you know, you started out as a Juniper platform and have added other network vendors into the mix as time has lapsed, and that will continue to accelerate based just on what’s on market and who’s doing interesting things that you actually need? 

Anna: [40.38] And that’s the great thing about having the software translation layer that we wrote in between there, is it’s just adding a module for us to add another software — to add another hardware vendor. So, it’s a pretty trivial effort, and it’s not just that about brownfield, but it’s also things like, you know, cabling, right? We have been meticulous in keeping you know, all of our cabling is actually stored, you know, in a database and is kept up to date automatically, as opposed to the vast majority of what exists out there in other companies, other carriers – anyone, you know?
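
A minimal sketch of what "just adding a module" could look like, assuming a plugin-style design. This is not PacketFabric's actual code: the class names, the `register` helper, and the command strings are all invented for illustration and do not match any real vendor's CLI.

```python
# Sketch only: a vendor-neutral request translated by per-vendor modules.
from abc import ABC, abstractmethod
from typing import Dict, List, Type


class VendorDriver(ABC):
    """Interface every (hypothetical) vendor module implements."""

    @abstractmethod
    def render_vlan(self, port: str, vlan_id: int) -> List[str]:
        """Return device commands that attach a VLAN to a port."""


DRIVERS: Dict[str, Type[VendorDriver]] = {}


def register(vendor: str):
    """Class decorator that adds a driver to the registry."""
    def wrap(cls: Type[VendorDriver]) -> Type[VendorDriver]:
        DRIVERS[vendor] = cls
        return cls
    return wrap


@register("vendor_a")
class VendorA(VendorDriver):
    def render_vlan(self, port: str, vlan_id: int) -> List[str]:
        # Made-up syntax, for illustration only.
        return [f"vendor-a attach port={port} vlan={vlan_id}"]


@register("vendor_b")
class VendorB(VendorDriver):
    def render_vlan(self, port: str, vlan_id: int) -> List[str]:
        # A second made-up syntax: same intent, different commands.
        return [f"vendor-b port {port}", f"vendor-b vlan add {vlan_id}"]


def provision(vendor: str, port: str, vlan_id: int) -> List[str]:
    """Translate one vendor-neutral request into vendor-specific commands."""
    return DRIVERS[vendor]().render_vlan(port, vlan_id)


print(provision("vendor_a", "port-1", 100))
print(provision("vendor_b", "port-1", 100))
```

In a design like this, the code that calls `provision` never changes when new hardware arrives; supporting another vendor means writing one more registered class.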

Max: [41.11] So, there’s a two RU Juniper device, I forget the model number right now, but it’s two hundred and eighty-eight, ten gig interfaces that are laying on this two RU, and it’s awesome, I mean it has a huge bit table, it’s the holy grail of internet edge devices if you’re actually a carrier, and you’ll be able to do a whole table… And you’re like, on the surface this is great, and I was talking to somebody and they were explaining, how do you actually deal with two hundred and eighty-eight pairs of fibre in a cabinet at a data center, wiring into one two RU device… And, you know, so you have a device, you know, it’s three and a half inches tall, and you’ve got a bundle of cable coming into this device, and it’s thicker than the device is – what do you do with that? It’s such a strange problem to think about, you’re like “Aw man, this is such an awesome box,” and then you’re like, wait a minute… I can’t actually cable this thing, what does that look like?

Anna: [42.04] Yeah, we have a really excellent deployment team, thankfully, that makes all this stuff look pretty, and you know, gets the initial information into the system right, because that… It’s like, the point of breakdown is always the human interface in these systems, because there is, you know, there is no automated way to initially get that cabling information into the system, and then for a human to do it, and there’s where… If there’s ever any errors that happen, that’s where they happen. And that’s the high barrier to entry there, right? It’s having that, you know, having that human go and enter what every cable position is, from the patch panel to the device, and when you think on — and sorry, going back to that, because I like thinking about this telco problem, like, how would I take an existing telco and automate it, right? It’s kind of a fun thought exercise!

Max: [42.56] Be careful, somebody might buy you if they hear this! 

Anna: [43.00] It is! You know, how to go back to that initial step, like, how to go back and actually get all that information in, right? You’re talking a huge, existing footprint, thousands of locations, maybe more, to do that, and then the amount – you know, the mistake rate, it’s different when you’re building, when you’re actually building out these POPs, and we’re doing it at a rate that is sustainable for us to find errors, but if you’re doing it on such a big scale, those errors are… You know, you’re going to have – again, large numbers coming into play, if you have a ten percent mistake rate, that becomes massive, massive! 
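
To put rough numbers on that, the sketch below applies the ten percent mistake rate mentioned here to two cases. The location counts and the 288 cables per site are illustrative assumptions (the cable count is borrowed from the device discussed earlier), not figures for any real carrier.

```python
# Sketch only: how a fixed data-entry mistake rate scales with footprint.
def expected_bad_records(locations, cables_per_location, mistake_rate):
    """Expected number of cable records entered incorrectly."""
    return locations * cables_per_location * mistake_rate


MISTAKE_RATE = 0.10          # the ten percent figure from the conversation
CABLES_PER_LOCATION = 288    # assumption: one fully cabled 288-port device per site

one_new_pop = expected_bad_records(1, CABLES_PER_LOCATION, MISTAKE_RATE)
existing_footprint = expected_bad_records(1000, CABLES_PER_LOCATION, MISTAKE_RATE)

print(f"one new POP:          about {one_new_pop:.0f} records to recheck")
print(f"1,000 existing sites: about {existing_footprint:,.0f} records to recheck")
```

A few dozen errors in one new POP can be found and fixed as you go; tens of thousands spread across an existing footprint is a project of its own.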

Max: [43.40] I only know a few companies that have done these kinds of migrations well. So, in Los Angeles, it’s capacity. You know, moving from One Wilshire over to nine hundred… People hit that point, they just hit the wall and they have to move. So, watching a few companies make this transition, or watching a few carriers make this transition… There’s only been a couple that have done a good job, and CoreSite, to their credit, has done excellent with those cross-connect, you know, those hotcuts, and operationally they’ve — it was very impressive to watch this happen, because it is very difficult. I mean, to your point, I think this is the hard stuff. So, you have an early mover advantage, you have a greenfield advantage, and you’re definitely now getting to the point where you now have a capacity and size advantage. I mean, you guys are becoming the, you know, larger gorilla, or you have a lot of locations – you have a lot of infrastructure, it would be very expensive for somebody to replicate from scratch what you have in place. Now, on your relative scale of being able to add a new POP or a new facility is relatively – as you said – trivial, it’s relatively easy for you to do at this point, and each one of those gives you incremental value to your network and to your fabric and where you’re located, and it almost feels like you’re approaching the critical mass at some point. And again, another big moment for me was when you finally expanded into Europe, and that was a long time coming but watching that actually happen, that’s a pretty big… You know, going trans-oceanic is a big deal as well. 

Anna: [44.55] It is! It is, we went trans-Atlantic and trans-Pacific at the same time, and achieving that reach is a big deal, and again, all the problems with doing this are not where you think, you know, at the software layer it doesn’t matter – we could add locations on Mars, the Moon – we can… Extend out of the galaxy, you know? To the software layer, it doesn’t matter, but to the actual, you know, building out the infrastructure, there’s always just so much involved and going to new companies — countries, right? We had to worry about setting up business entities, you know, what does your tax structure look like, dealing with shipping and customs, these are all of the things that you’ll encounter, and that’s where the hard part comes in. 

Max: [45.44] And so, the big takeaway there is, is don’t do the hard part, just use somebody like PacketFabric to do the hard part for you, just make your life easy in the process and just say like, “Hey, I’ve got a webpage, or an API, I want a circuit from here, *click*!” 

Anna: [45.56] Yes, yes, and that really is the beautiful thing. I mean, I wish that we had a PacketFabric that we could use to build PacketFabric on. It would make it a lot easier!

Max: [46.01] But at the same time, you’re being… I mean, this isn’t something where you’re massively marking up or taking advantage of people for the flexibility of ease of time or time to market, I mean… You really are driving costs down for people at the same time, as well. I mean, this is… As your costs have driven lower, you’re using that as an advantage to drive your cost to your customers lower, and the cost of this, you know, layer two network being driven down… I mean, that’s going to open up doors for people to do things that were never anticipated as well. I mean, at some point it just becomes so cheap you’re like, well… Why wouldn’t we just do this thing? 

Anna: [46.38] I agree with you! I think people should think exactly that! And that is one of the interesting things, is we made this decision really early on, that we were going to make all of our pricing public, because this is an as a service, and any as a service, whether network or software, you have public-facing, published pricing, and this is a pretty bold move in, you know, in terms of telecoms… I can’t think of another telecom anywhere that does this, and so… It was one of those decisions that was made at the beginning that turned out to have a really broad, long-term impact, because people would look at the pricing and some people are just blown away: “That’s it? That’s all you’re charging?” You know, for something that’s…

Max: [47.34] There’s no catch, this is really the pricing!

Anna: [47.37] Yeah, and we actually did face a little bit of that at first, is it looked too good to be true, right? So, you’re telling me I can provision a circuit instantly, and it only costs this much, it doesn’t sound like — it didn’t sound like reality to a lot of people. And so, that was certainly something we didn’t anticipate having to face, but it was interesting to say, “No, we’ll show you like, it really is true, this is not hypothetical, we’ve built it and you can do this now. We’ll show you a demo to show you that it’s real, that you can actually provision this and that is the cost.”

Max: [48.06] You know, for context, I mean these are use cases where it is cheaper to provision ports on PacketFabric to cross-connect data centers within the same metro, than it is to go to your local dark fibre provider and take fibre from… I mean, and that’s… And you take a step back from that and you say, well, you know, these two buildings are massively interconnected with each other, and there’s thousands of strands of fibre in the ground beneath them. Not only is it complicated to get access to those things, then they want five year contracts usually for those things, but just the price! And in some cases — so, LA specifically, the market there are some players in the market, it’s relatively inexpensive to get dark fibre between buildings, especially in the downtown area, but it’s still cheaper just to go to PacketFabric and just turn it on, you know?

Anna: [48.53] Yeah, yeah – and it’s not — it’s cheaper and not just the respect of like, where you’re actually paying, but the time investment that you have to put in… To use dark fibre is… It’s a pretty big time investment, right? You’re dealing with — typically fibre companies are more, you know, wholesale companies, they’re selling to businesses, they’re not selling to individual consumers, so… The business side is just, it’s not set up to make those sales and so, you’re spending again, weeks, months chatting to sales people, doing all this overhead thing. So, it’s not just the actual cost, but it’s the time savings that you’re looking at as well, because it’s pretty significant in time savings, and just… Sheer pain of doing business savings. 

Max: [49.42] One of your investors has — actually, one of your investors; a very major investor has this vision of telemedicine and… but it’s not telemedicine in the sense of like, I’m going to talk to my doctor over a Zoom session, this was more about, you know, can you get into a situation where you have low latency network with high capacity and you can do things like remote surgeries. And that’s a very bold vision for — what actually has to happen for that network to support that, but there’s a lot of infrastructure that has to go into place before that’s available… Not just infrastructure in terms of like, cables in the ground, but infrastructure in terms of robotics that has to have a certain sensitivity to support these things, but… I mean, obviously short-term, near-term PacketFabric, you’re talking to expanding to more POPs, more countries, more continents, more services, more clouds as they come online, more networks, right? I mean, are you starting the planning of pushing down into the last mile? I mean, is there a hit list of, let’s go get into every hospital, at this point? I mean, what is… How does this evolve?

Anna: [50.41] So, I’m really glad that you brought up the, you know, the whole entire medical field, because this is one of — this is just like a hobby of mine, I love to dabble in medicine and genomics and biomed and all these things are super fascinating to me. Again, there’s this thing that has happened right, people have been sold this future, and a lot of these cool things coming, and we don’t really have the way to deliver it, because the underlying infrastructure is so bad. For example, there’s a lot of AI out there now that does readings of MRIs and x-rays, and it’s already been shown through numerous tests that AI performs way better than a human, way lower — much lower error percentage, much more cost-effective, much quicker… You know, imagine if you could go in for an x-ray, and instead of having to wait a day for it to go to an actual doctor to look at it and give you a result, AI could respond to that in seconds, and let you know if it’s a break, what it is, you know, all the information that would need to treat, but we can’t do that because the facilities where you’re getting these x-rays done, are connected by three meg DSL, in the middle of a cornfield or something… And there’s just no way to communicate back to, say, the closest data center in an effective way, because images like x-rays and MRIs are actually… they’re huge images, you’re sending a lot of data, so there’s no way to effectively move that data back, and bring this, you know, what is a mind-blowing advancement in terms of medicine to the consumer, all because the network sucks. When you start to look at things like this, yeah, it’s pretty obvious that there are some places we need to push into to help make this dream of, you know, of better, healthier population actually happen, because when you think that all this stuff is… All these medical advancements are being throttled by the network, it’s pretty sad that we have — that the infrastructure is the pinch point in this. 
There’s also a lot of cases like that with genetics, like… There’s massive genome databases that sit in a lot of different institutions, both private and public schools, you know, like University of Chicago has a big one… There’s NCBI, and a bunch of them out there, and sharing the information between these two databases is incredibly hard, because a lot of these facilities just aren’t very well connected, and like, so many of them have options that if you want to do a big data export, you are literally, like… They have, it says on their websites, if you need more than ‘x’ amount of data, we’ll ship you a hard drive. They’re still shipping hard drives in this day and age, and a lot of them also use, like, AWS as an FTP intermediary to move this data. And so, when you think about — because what this data is primarily used for is cancer research, and so… Like, you could kind of, if you want to do – a big hyperbole here – you could say that maybe one of the reasons why we don’t have a cure for cancer is because we just can’t move this data. I mean, the cure is locked out there somewhere in all these different little silos, and because they can’t communicate well, we still have this problem. And those are just two examples. I mean, there’s a million more advancements that are being held back by basic infrastructure at this point. 
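
A back-of-the-envelope sketch of why link speed is the bottleneck in both examples. The three-megabit DSL figure comes from the conversation; the 500 MB imaging study, the 10 TB genomic export, and the 10 Gbps comparison link are assumed sizes chosen only for illustration, and protocol overhead is ignored.

```python
# Sketch only: ideal transfer time = size in bits / link speed in bits per second.
def transfer_seconds(size_bytes, link_bits_per_second):
    """Time to move size_bytes over a link, ignoring protocol overhead."""
    return size_bytes * 8 / link_bits_per_second


DSL_BPS = 3_000_000              # "three meg DSL" from the conversation
FAST_BPS = 10_000_000_000        # assumption: a 10 Gbps circuit for comparison

IMAGING_STUDY_BYTES = 500 * 10**6    # assumption: one 500 MB imaging study
GENOME_EXPORT_BYTES = 10 * 10**12    # assumption: a 10 TB dataset export

study_dsl_min = transfer_seconds(IMAGING_STUDY_BYTES, DSL_BPS) / 60
study_fast_s = transfer_seconds(IMAGING_STUDY_BYTES, FAST_BPS)
export_dsl_days = transfer_seconds(GENOME_EXPORT_BYTES, DSL_BPS) / 86_400
export_fast_hours = transfer_seconds(GENOME_EXPORT_BYTES, FAST_BPS) / 3_600

print(f"imaging study over DSL:     {study_dsl_min:.1f} minutes")
print(f"imaging study over 10 Gbps: {study_fast_s:.1f} seconds")
print(f"10 TB export over DSL:      {export_dsl_days:.0f} days")
print(f"10 TB export over 10 Gbps:  {export_fast_hours:.1f} hours")
```

With these assumed sizes, the export takes the better part of a year over the slow link, which is why shipping a hard drive is still the practical option.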

Max: [54.25] Let me take this down a notch for everybody… So, one of the funniest — funniest… One of the… I’ll stick with funniest, so… Chick-fil-a actually has a very long, technical post on one of their engineering blogs – yes, Chick-fil-a has an engineering blog and it is awesome. It talks about — 

Anna: [54.53] I really want to read this!

Max: [54.45] — It’s relevant because it talks about machine vision and machine learning, on how they actually structure… Predict how much food to produce in their locations. So, in a real-time basis based on trends hyper-specific to that location, this application can say, “Hey, this is how much chicken you have in a basket ready to go, and this is your peak demand, and this is what our prediction is based on all these data points we have,” and they have a machine vision, coupled with a machine learning platform that now, at each physical location that can say, “You need to make more chicken,” or, “Don’t make more chicken.” You know, really basic things but when you look at that in scale and you say, okay, how many locations do you have across your footprint and what is the cost of your actual raw goods, and how many people do you have in line, and so for the average person going through the drive through, was their experience quick, or was it slow, and you’re like… Well, if you have a positive experience because it’s faster — and who would ever actually think that behind all of that, there’s a computer application running, with a camera looking at a basket of chicken, applying a machine learning model with a dataset against it. Okay, that’s awesome, I never would have thought that was a thing, right? But you know, when you have access to more and more data, and it’s not restricted and you can move it around, and it’s like you’ve set it free, all these things come up that you would never think of before, because somebody had some idea of like, I can make our drive through line go faster! 

Anna: [56.07] Yeah, that is amazing, and I absolutely want to go read this whole blog now, because that is fascinating, and it’s the perfect — it really is the perfect application of that whole promise of tomorrow. Like, well done Chick-fil-a. Thinking back to buzzwords, it’s kind of died down a little bit, you know, the buzzword of Big Data, right? Big Data, it’s all about Big Data, what can we find out of data? Data is amazing, because the insights that we can gain from it can do great things for so many applications, but I think now we’re going into the era of, you know, big networking, because Big Data is great, but if nobody can access it or do anything with that data, it’s not valuable. You know, data is only as valuable as, you know, how you can access it and what you can do with it, and if that data is trapped, it’s not very valuable.

Max: [57.08] Anna, it has been so fun for me to chat, I actually have been a big fan, watching you guys develop over the last few years, and coming out of stealth, I remember when they announced all of a sudden this little — this group got together and everybody updated their LinkedIn posts, saying “We’re all in stealth mode,” and you know, starting to get rumours of what was going on, so it’s been really fun watching you guys develop and like I said, there was a couple of really big moments for me of like, this is here, and this isn’t going away, and for somebody that’s been provisioning telco circuits the wrong way for twenty years, having an option to do it differently sounds very exciting for me, so. Thank you very much for your time today, it’s been a pleasure. 

Anna: [57.49] Yeah, thank you so much for having me, this is really fun and I can’t even tell you how happy I am to learn about this Chick-fil-a blog, I feel like I’ve got the rest of my day planned out, I’ll be reading this thing.

Max: [58.02] It’s excellent, you’re going to love it.

OUTRO: [58.05] Thanks for joining the Tech Deep Dive podcast. At Clarksys we believe tech should make your life better, searching Google is a waste of time, and the right vendor is often one you haven’t heard of before. We can help you buy the right tech for your business, visit us at Clarksys.com to schedule an intro call. 
