Although there is no vaccine for COVID-19 yet, governments have started to reopen the economy. However, it has to be done carefully and strategically. Kogniz has built AI- and ML-based solutions that allow clients to follow CDC guidelines and start reopening their businesses safely.
In this edition of TFiR INSIGHTS, we sat down with Jed Putterman, co-CEO of Kogniz, to talk about the technologies Kogniz is building.
Jed Putterman: First of all, thank you very much for having me on today. Kogniz was started around three years ago. We are primarily made up of people with computer vision, mathematics, and computer science backgrounds. And the idea was: how do you take live video streams, of which there are literally millions out there, and make them intelligent? So we're applying real-time artificial intelligence and computer vision to live cameras and understanding what's happening in them. And like you mentioned, it's an Edge-based device. We drop a little processing unit on the customer site, which is able to process in real time and run all of our machine learning algorithms, and we use the web to manage the system and give it a really friendly, modern interface.
Swapnil Bhartiya: Please tell us a bit about some of the use-cases.
Jed Putterman: We started off by building a platform for really understanding what was happening in video in real time. And we spent several years taking on some really challenging issues: How do you determine with complete accuracy that there's a person there? How do you find their facial features, understand what's going on, and spot things you should be concerned about? When COVID hit, we started to hear from our customers that they really wanted to talk about how the same technology could be applied toward making their businesses and offices safer. And so we looked back and we said, "One of the issues is understanding temperature, human body temperature. How could you do that in a really efficient way to get people back into the office?"
And so, we actually put out our own camera, even though we're a software company. We needed two different kinds of sensors that we couldn't find in the market, configured the way we needed them. One is a microbolometer, which is a very, very accurate thermal sensor. The other is the traditional video sensor we're used to seeing. Using all of our AI, we're able, from up to 16 feet away, to look at groups of people and, in real time, understand their temperatures normalized to body temperature. And this is something very unusual. Typically, skin temperature is much lower than your body temperature and is really affected by your ambient environment: it can go up or down when the sun comes out, or depending on whether you walk into a room with air conditioning. We solve for all of that using AI in our platform.
And so, we can tell with a very high degree of accuracy what everybody's actual body temperature is as they're walking into a space.
Swapnil Bhartiya: And all of this is in real-time?
Jed Putterman: All of this is in real time, even with large groups of people as they're walking through. As people are learning, temperature is a really important indicator, for example of COVID and of other viruses we have now and may see in the future. So as people come back in, we're able to figure out their temperatures automatically, without bringing people close to each other and while keeping a high flow of people going through. If there's a high temperature, we can flag it in real time and notify staff who are nearby to ask the person to step aside for a secondary screening with a medical-grade thermometer. It allows people to get back to work safely and, more importantly, keeps people safe. You don't have to have people right next to you checking your temperature and slowing down the flow of visitors into an office.
Swapnil Bhartiya: Due to COVID-19, sending out physical devices is tricky. Software-based solutions are easy because users can deploy them wherever they are.
Jed Putterman: Yeah, it's a really great question, actually. This is one of the things we had to ask ourselves, too: we're in the middle of a lockdown, and you can see behind me I'm sitting in my home in San Francisco, and we're dealing with companies all over the place: East Coast, West Coast, and we have international customers now. So we needed to make sure the system didn't require a lot of setting up, or require specialized people from our team to go out and do it. The system we put in place is a tiny little device that drops onto the customer's network. It has very specialized hardware inside; we're using the latest-gen chipsets from Nvidia, as an example. But more importantly, these devices are made for self-install. They drop in literally in minutes, even in some really, really complicated networking environments.
And we have some very large customers using this. We ship it out to them, and they're able to get things set up and running in a very short amount of time and get the cameras online, functioning, and providing these tools. Often, we'll do our training remotely in Zoom calls just like this one. So we're able to deploy a really sophisticated piece of hardware with the same ease and simplicity as a Nest camera, and it gives you real-time information almost immediately after things are set up.
Swapnil Bhartiya: Is this proprietary hardware, or is it off-the-shelf white boxes where most of it could be open source as well? Talk a bit about the hardware component. You did touch on it a bit, but I want to dive deeper into that.
Jed Putterman: Right now, we have chosen an Nvidia chipset, the Xavier NX module. So it is off the shelf, right? It's a new part, but it's a commodity part from Nvidia, and we've married the compute with the actual optics. We have multiple sensors inside our camera; specifically, of course, we have to have the thermal sensor, the microbolometer, alongside the RGB sensor. And because we're processing all of that locally, we've also married that with this GPU-based compute. So it's a camera, a little bit different from the camera people may be used to seeing, but it's able to do all that local processing. We run some pretty crazy math on those devices, in a really small form factor, put together again using these off-the-shelf components.
Swapnil Bhartiya: But also you are doing a lot of work in the AI and ML space. Talk a bit about the software side of it now.
Jed Putterman: There are really two sides to it. The first is understanding what's happening in the scene. If you look at some of these handheld thermal scanners, the person has to find the camera rather than the camera finding the person. What I mean by that is there'll be a small box in the middle of the camera's view that one person at a time has to pass through, making sure their face is specifically in that box. So either a camera operator has to be doing this, or the person has to stand in a very specific spot. We find this problematic because, one, it takes a lot of time and skill to get the right readings for each person. And two, you're limited to one person at a time, which, if you've walked into a large office facility, a meatpacking facility, or another such area, is just not the real world. You can't stop people there.
So the first thing our system does is look through and find all the people in the space very, very accurately, to the point where there are no false detections. Once it finds an individual, no matter how many individuals there are in that scene, it then goes through and finds the facial area of each person. Specifically, we look for the ocular area, the area near your eyes. If you're wearing glasses, we avoid that area, because the system uses long-wave IR, which doesn't work through glass, and we'll find other areas of the face instead. It allows us to look at large groups, take each person individually, find the right areas for that individual, and then take the temperature for that person. So that's one aspect of it.
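The per-person measurement described above, finding the ocular area and falling back to another facial region when glasses block long-wave IR, could be sketched roughly like this. The box proportions and function names are illustrative assumptions, not Kogniz's implementation:

```python
# Sketch: pick the facial region to sample for a thermal reading.
# A face box is (x, y, width, height) in image coordinates.

def ocular_region(face_box):
    """Approximate the eye area as a band about a quarter of the way
    down the face box (an illustrative heuristic)."""
    x, y, w, h = face_box
    return (x, y + h // 4, w, h // 5)

def forehead_region(face_box):
    """Fallback area near the top of the face box."""
    x, y, w, h = face_box
    return (x, y, w, h // 5)

def measurement_region(face_box, wearing_glasses):
    # Long-wave IR does not pass through glass, so avoid the eye
    # area for people wearing glasses and sample the forehead instead.
    if wearing_glasses:
        return forehead_region(face_box)
    return ocular_region(face_box)
```

For a face box of `(100, 50, 80, 100)`, this returns the eye band `(100, 75, 80, 20)` without glasses and the forehead band `(100, 50, 80, 20)` with them.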
And then the second piece is what we do once we get those temperature readings; each person will get many, many readings as they're walking through, starting at 16 feet away, as we track them anywhere across the scene. We then have to figure out what that skin temperature means. Again, they may have come from a different ambient temperature than the one the camera is in, the weather is constantly changing, and the room temperature is constantly changing. So what we do is create normalized baselines, which understand in real time how temperature is changing both for the people and for the environment the camera is in. And we're able to convert that into a very, very accurate body temperature.
And as we've seen, people don't want to be given a reading that says a person is 94 degrees. It means nothing to them. It may be correct as their skin temperature, but if someone walks in with a 97-degree reading, do they have a fever or not? We're able to convert that and talk about body temperature. We always think about 98.6; that's a number we all understand, so we can use it as the reference point.
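The normalization idea, tracking a running baseline of skin readings and re-expressing each reading as an offset from the familiar 98.6°F, could be sketched as follows. Kogniz has not disclosed its actual method; the exponential moving average baseline here is purely an assumption for illustration:

```python
class TemperatureNormalizer:
    """Convert raw skin-temperature readings into body-temperature
    estimates by tracking a running baseline of what 'normal' skin
    temperature looks like in the current environment."""

    NORMAL_BODY_TEMP_F = 98.6  # the reference point everyone understands

    def __init__(self, alpha=0.05):
        self.alpha = alpha    # how quickly the baseline adapts
        self.baseline = None  # running estimate of normal skin temp

    def update(self, skin_temp_f):
        """Fold one skin reading into the baseline, then return the
        reading re-anchored as an estimated body temperature."""
        if self.baseline is None:
            self.baseline = skin_temp_f
        else:
            self.baseline += self.alpha * (skin_temp_f - self.baseline)
        # A person's deviation from the crowd baseline, added to 98.6 F.
        return self.NORMAL_BODY_TEMP_F + (skin_temp_f - self.baseline)
```

Under this sketch, someone reading roughly 2°F hotter than the crowd baseline would be reported as roughly 100.5°F and could be flagged for secondary screening.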
Swapnil Bhartiya: So what are the use cases? What are the industries that are your typical clients who are using your technologies?
Jed Putterman: It really has changed. On day one, when we first started selling our solution, it was to the most critical industries, the ones where if someone gets sick and the facility has to shut down, it will literally affect our nation's food supply. So you can imagine we're in some really critical infrastructure around food-related services, manufacturing-related services, really critical infrastructure for the nation. That's where we started, and as other industries have started to think about how they go back to work, it has evolved beyond critical infrastructure to a broader set of manufacturing facilities and office buildings, where they really need to figure out what it means to reopen their offices and facilities.
So temperature is one aspect of it. There are other important pieces that we also provide solutions for. For example, contact tracing: if someone is detected as having a fever and is later confirmed as having COVID, how do you go back and understand where else they were in that facility over the last 10 days or two weeks? Who else did they come into contact with? We're able to use our technology to say very accurately, "This person has been in these areas of the building." So when they clean the facility, they can focus on the areas where the person actually was, as opposed to doing a blanket, very expensive, complete cleaning of the entire facility.
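The contact-tracing queries described here amount to looking back through a log of camera sightings. A minimal sketch, assuming a hypothetical log of `(person_id, zone, timestamp)` records rather than Kogniz's actual data model:

```python
from collections import defaultdict

def zones_visited(sightings, person, since):
    """Zones a given person was seen in on or after `since`,
    e.g. to target cleaning at the areas they actually visited."""
    return {zone for pid, zone, ts in sightings
            if pid == person and ts >= since}

def possible_contacts(sightings, person, since):
    """Other people seen in the same zone at the same timestamp."""
    # Index everyone by (zone, timestamp) so co-presence is a lookup.
    by_key = defaultdict(set)
    for pid, zone, ts in sightings:
        if ts >= since:
            by_key[(zone, ts)].add(pid)
    contacts = set()
    for pid, zone, ts in sightings:
        if pid == person and ts >= since:
            contacts |= by_key[(zone, ts)] - {person}
    return contacts
```

For example, if person "a" shared the lobby with "b" at one timestamp, `possible_contacts` returns `{"b"}`, and `zones_visited` lists only the zones "a" was actually in.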
Other areas that are important for us are questions like how close a person is to other people. For example, if an elevator bank has now been set up for three or four people, how do you make sure there are only three or four people there, and how far apart are they from each other? Let's make sure they're standing six feet away. These are all problems we're able to solve and provide solutions for using the same exact technology.
Swapnil Bhartiya: That means your sensors and cameras have to be in all the locations where people want to ensure and enforce social distancing and the other requirements. Is that correct?
Jed Putterman: For temperature, it is our cameras because again, they have some very specific parts. For some of the other things I mentioned, for the contact tracing, for the social distancing, absolutely we can use all the existing cameras they already have in place. And this is what’s really exciting is that we don’t need special sensors for those. If they have any surveillance camera in any of those areas, we can actually use those video streams and provide all of the exact same functionality.
Swapnil Bhartiya: I do want to understand what the workflow looks like.
Jed Putterman: First of all, it's a great question, because there's a technology issue and then there's a process issue, and we needed to make sure we didn't just solve the technology issue; we actually needed to help customers with their process. So, for example, if someone is detected by our system as having an elevated temperature, what do you do? We do have a full workflow and notification system that is set up with the customers. It gives them a tremendous amount of flexibility around who should be notified, what they do at that point, and what access they have to replay the incident in case there's a question afterward about who was around the person. All of this functionality is built into the system. Typically, in those cases, for now, there'll be someone nearby who's able to respond, and they'll come in and do a secondary screening.
But as time goes on, we all understand the importance of temperature. For example, six months from now you may have one person who's actually responsible for four different areas. They'll get the right notification, it can be escalated, and then they can quickly go back, understand who the person was, and go find them and speak with them. These are all process-related questions, but we needed to make sure we had the right tech in place within our platform. So there's a whole visual programming language: a rule-based creator with workflow that lets you describe the scenarios for how you want to respond, put them into our system, and get real-time notifications however you want to be reached, the ability to play back video, the ability to share those video clips, et cetera. These are all part of the system.
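The rule-based responder described above pairs a condition on an incoming event with an action to take. A minimal sketch of the idea; the event fields, thresholds, and action strings are illustrative assumptions, not Kogniz's actual rule schema:

```python
# Sketch of a rule-based notification workflow: each rule is a
# (condition, action) pair evaluated against incoming camera events.

def make_rules():
    return [
        (lambda e: e["type"] == "temperature" and e["body_temp_f"] >= 100.4,
         "notify nearby staff for secondary screening"),
        (lambda e: e["type"] == "occupancy" and e["count"] > e["limit"],
         "alert facilities: zone over capacity"),
    ]

def dispatch(event, rules):
    """Return the actions triggered by one event."""
    return [action for condition, action in rules if condition(event)]
```

A fever event such as `{"type": "temperature", "body_temp_f": 101.0}` triggers the screening notification; a normal reading triggers nothing.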
Swapnil Bhartiya: How about your use of open source in general?
Jed Putterman: First of all, we're huge fans of the open source community. We contribute to open source wherever possible, and AI computer vision is complicated, so there are layers of technology there. We absolutely rely on a lot of the foundational layers, whether we're using CUDA from Nvidia or whether we optimize for [inaudible 00:13:54]. We use components like OpenCV, so there are all sorts of layers in there. Having said that, what we've learned is that when you want to put something in a high-velocity production environment, where we can handle multiple streams and look for very specialized things, there's some aspect of it you can do using open source.
And there's a lot of it that you really can't. So we had to balance what we're able to use from the open source community against what we had to go off and do ourselves. There may be something great that we can start with, but we may find it simply isn't performant enough; we're looking at multiple live video streams being processed by, for example, an NX chip. So we'll go through and reduce layers. Obviously, we're retraining the models all the time, and wherever possible, we'll actually contribute that back out to the community. We think this is something that should be shared by everybody. There's no reason for us to have to rewrite things we don't need to, and we don't want anyone else to have to either. We can contribute to that.
Swapnil Bhartiya: What trends are you seeing that will change the way we work? And will the work you're doing also help other industries, not just in this case but in a lot of others?
Jed Putterman: It's a really interesting time we're living through right now. Three months ago, if you had told me that I would be working from home and wearing a face mask when I left, I simply would've said, "This isn't possible." So the world has changed. And I think what we're learning right now is, one, we don't know a lot of things, but what we do know is that this is not temporary. This is fundamentally changing the way we work, whether it's people working from home more, or how we think about density: it used to be how do you get as many people as possible into a workspace, and now it's how do you fit as few people as possible. So there are some real core changes going on.
Some of the things we're starting to see and understand are, of course, when people should work from home versus when they should go back to the office. I think that's fundamentally a shift in the way we've been thinking. So what technologies do you need? Not only at home (the fact that we're having these kinds of conversations remotely more and more is obviously different from the past), but also when you go back to the office: how do you make that experience safe and ongoing? So there are the really urgent changes, and then there are the ongoing ones. Where it translates into changes in technology: if you saw a face mask before, you assumed the person was going to rob a bank, right? It was not a normal situation. And now, if you don't see a face mask, you actually get concerned.
So we're looking at an article or a covering on a person's face in a completely different way. That also fundamentally changes all of the models we use in computer vision, because prior to this, when we trained for face detection, for example, we didn't include a lot of masks or masked faces in these models. Well, now, if you don't include that, your models don't work, and people walk in and you don't see their faces. These kinds of structural things are changing not only the way we're fundamentally living our lives, but also how computer programs need to adapt.
So one of the things we did is train, for example, for face masks. But it's not about looking for the face mask; it's actually about looking for the absence of a face mask and other PPE. This is not going to go away; people will be wearing masks for a long time. So how do you adapt to do some PPE enforcement? And also, how do you make sure that all of your other systems in place work well when people are wearing masks, hats, glasses, and all these other things that have traditionally caused problems for systems like facial recognition or facial detection, et cetera?
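The inversion Putterman describes, alerting on the absence of a mask rather than its presence, can be sketched in a few lines. The classifier scores and threshold are hypothetical; the point is that the alert fires on low mask confidence:

```python
def ppe_alerts(detections, threshold=0.5):
    """Flag people who appear NOT to be wearing a mask.
    `detections` is a hypothetical list of (person_id, mask_confidence)
    pairs from an upstream classifier; low confidence means no mask seen."""
    return [pid for pid, mask_conf in detections if mask_conf < threshold]
```

Given `[("a", 0.9), ("b", 0.2)]`, only person "b" is flagged for a PPE reminder.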