Taking Computer Vision Out of the Lab: Interview with TechSee’s Product & R&D Leads

Over the last few years, service leaders have embraced computer vision AI and conversational AI to scale, automate and improve service operations. Last week, we launched the Visual Intelligence Studio, a self-service, no-code computer vision AI training solution. I sat down with Hagai Ben Avi, VP Integrated Solutions, and Renan Schilman, VP R&D, to dive deeper into the world of computer vision, our insights from real-world deployments, and how we approached the development of VI Studio.

Q: Why do enterprises need customized computer vision AI solutions?

Hagai: That’s a great question. Computer vision AI can see and understand as well as any human. Just as a human being needs to be trained to work with specific devices or provide specific services, computer vision AI must be trained (customized) to troubleshoot and guide users on a particular product or service.

Traditionally, computer vision AI was a fairly standardized technology. A number of computer vision platforms offer generic models capable of detecting generic information. These solutions can easily identify a phone, television or washing machine. However, identifying a specific device model or attribute, such as a cable plugged into the wrong port, or a yellow LED indicator where there should be a green one, requires a deeper level of insight and further customization.

With our background in the service industry, we identified the need for a more detailed AI analysis quite early on. A washing machine manufacturer sending a technician to fix a machine needs to know exactly which model they are servicing. A Telco providing guidance to a user fixing their router has to know the specific model and manufacturer, as well as which lights are lit up, what colors the LEDs are showing, which cables are plugged in and where they are plugged in. Similarly, a Telco automating job verification needs to know specifically how each cable in a panel is set up. This level of insight requires highly customized modeling.

Q: So why launch a self service offering? Why develop VI Studio?

Renan: Building a customized computer vision model requires specialized resources, skills, cameras, graphics hardware and training software. Before VI Studio, introducing computer vision AI automation to any enterprise was a difficult undertaking. I experienced this firsthand in my work at TechSee, and gained a deeper appreciation for these challenges from the enterprise AI team leaders I collaborate with on a daily basis. Our first computer vision modeling projects, almost two years ago, took far longer than our most recent initiatives. We adapted as we learned, developing new processes and deploying new technologies to improve our outcomes and accelerate the machine learning curve. With faster modeling, we began discussing computer vision AI as a truly scalable automation solution for organizations to use across dozens, or even hundreds, of new use cases.

Let’s double click on that point. In some cases, scaling computer vision automation requires only a broad rollout of the same handful of computer vision models. For example, a service organization specializing in fiber installation in the home may have the same wiring setup in every installation, requiring only a single computer vision model. However, a service organization supporting dozens of fiber-optic modems would need to train dozens of computer vision models. It quickly became clear that for many organizations, scaling with computer vision would require a scalable computer vision modeling studio.

Q: What about finding all of the skilled engineers required to train computer vision models? Isn’t there already a shortage of AI specialists?

Hagai: This is where VI Studio truly shines. Over the last year, we streamlined and simplified our modeling process. Ultimately, we arrived at VI Studio – a no-code modeling environment that does not require a background in data science or artificial intelligence. If you can tag a friend on social media, you can tag a status light on a coffee machine. We simplified the language of the entire Studio to be more user-friendly. Classification and Labels are simply Objects to be named. Sub-components and sub-component metadata are called “parts” and “statuses”. AI Features are called “Analyses” and a simple interface allows users to define queries without requiring any knowledge of SQL. We also removed the need for specialized hardware, using training photos and videos taken in real-world environments from everyday smartphones.

We also added accelerants that expedite the training process. For example, we added video interpolation. This technology allows users to tag the first frame in a video. Our engine will then auto-tag every subsequent frame. All the user needs to do is to review the tags assigned by the engine.
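As an illustration of how auto-tagging can cut manual work, here is a minimal sketch of keyframe interpolation, a simpler technique commonly found in annotation tools. It is not a description of TechSee's engine, which the interview does not detail. It assumes the user tags a bounding box on two frames (the first and the last) and fills in the frames between them linearly. The `Box` type and the `interpolate_boxes` function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixels (hypothetical type for this sketch)."""
    x: float
    y: float
    w: float
    h: float


def interpolate_boxes(start: Box, end: Box, n_frames: int) -> list[Box]:
    """Linearly interpolate a box across n_frames, from start to end inclusive."""
    if n_frames < 2:
        raise ValueError("need at least two frames (the two tagged keyframes)")
    boxes = []
    for i in range(n_frames):
        t = i / (n_frames - 1)  # 0.0 at the first frame, 1.0 at the last
        boxes.append(Box(
            x=start.x + t * (end.x - start.x),
            y=start.y + t * (end.y - start.y),
            w=start.w + t * (end.w - start.w),
            h=start.h + t * (end.h - start.h),
        ))
    return boxes


# Example: a status light drifts right and down over five frames.
proposed = interpolate_boxes(Box(0, 0, 10, 10), Box(40, 20, 10, 10), 5)
for frame_index, box in enumerate(proposed):
    print(frame_index, box)
```

The reviewer's job then matches what Hagai describes: check the proposed tags and correct only the frames where the box has drifted off the object.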

VI Studio effectively reduces the barriers to entry and accelerates your time to incredibly accurate models – enabling enterprises to scale up their customized computer vision across their entire organization.

Q: So how long does this entire process take?

Renan: In many situations, VI Studio has been able to achieve over 95% accuracy at the sub-component level in just 3-6 weeks. The time needed varies based on the visual complexity of the images and the number of data points that are analyzed. This stands in stark contrast to typical customized computer vision model development, which can take anywhere from a few months to a year to train a single model.
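To make "accuracy at the sub-component level" concrete, here is a minimal sketch of one way such a figure could be computed: for each part, the share of reviewed images where the predicted status matches the status a human reviewer assigned. The interview does not say how TechSee measures accuracy, so the record format, the part names and the `per_part_accuracy` function are all assumptions made for illustration.

```python
from collections import defaultdict


def per_part_accuracy(records):
    """Return {part: fraction of records where predicted status == reviewed status}.

    records: iterable of (part, predicted_status, reviewed_status) tuples.
    This record layout is a hypothetical format, not a VI Studio export.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for part, predicted, reviewed in records:
        total[part] += 1
        if predicted == reviewed:
            correct[part] += 1
    return {part: correct[part] / total[part] for part in total}


# Illustrative records for two parts of a router (made-up data).
sample = [
    ("power_led", "green", "green"),
    ("power_led", "green", "green"),
    ("power_led", "yellow", "green"),   # one misread LED color
    ("power_led", "off", "off"),
    ("wan_port", "plugged", "plugged"),
    ("wan_port", "empty", "empty"),
]
print(per_part_accuracy(sample))
```

Reporting accuracy per part rather than per image shows which component, such as a single LED, is holding a model back.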

Q: Why is scalable, customized computer vision a game changer?

Hagai: For many, artificial intelligence is still a buzzword that sits at the Peak of Inflated Expectations on Gartner’s Hype Cycle. With VI Studio and the VI Platform, computer vision AI jumps forward to the Slope of Enlightenment, or even the Plateau of Productivity.

VI Studio provides a rapid onramp for any enterprise looking to deploy computer vision AI models with immaculate accuracy. The VI platform, with native integrations across TechSee’s products, provides a turnkey solution for deploying practical AI across the service organization. Furthermore, VI’s APIs make these AI insights instantly available across any business application.

Think about building a website way back in 1996. You needed to buy a server and hire a specialist to set it up and optimize it. What now takes a single developer a few hours then required weeks or months of development effort, all to achieve a less impressive website. The availability of third-party infrastructure and common programming languages and libraries reshaped the entire industry.

Similarly, VI lowers the barriers to entry by providing a common baseline for computer vision AI application development. The potential impact of accessible, scalable, customized computer vision AI applications is incredible.

Q: What would you advise someone new to computer vision AI?

Hagai: Start with what you know, and find the right first step and partner for your needs. Bright and shiny objects are fun and exciting in the lab, but real innovation drives business outcomes. Don’t limit yourself to what you know or can read online. Reach out and speak to others in the space. Approach this with an open mind and a focus on real-world impact.

If you want to learn more, feel free to schedule a complimentary consultation. We look forward to speaking with you.

Jon Burg, Head of Strategy

Jon Burg led product marketing for Wibiya and Conduit, bringing new engagement solutions to digital publishers, in addition to launching Protect360, the first big-data powered mobile fraud solution. With 15 years of delivering value for several other technology brands, Jon joined TechSee to lead its product marketing strategy.