For decades, surveillance cameras have been the most expensive filing cabinets in the world. Businesses, cities, and institutions spend billions each year capturing video footage that, in practice, almost nobody watches. When something goes wrong — a theft, an accident, a compliance violation — someone has to sit in a dark room and scrub through hours of grainy recordings, hoping to find the right clip before the trail goes cold.
Conntour, a San Francisco-based startup that just emerged from stealth, thinks that’s absurd. And it has $7 million in fresh capital to prove it.
The company announced a seed round led by General Catalyst, with participation from Y Combinator, Soma Capital, Gaingels, and several angel investors, as reported by TechCrunch. Conntour’s pitch is deceptively simple: build an AI-powered search engine that sits on top of existing security camera infrastructure, letting users query their video feeds the way they’d query Google. Type in “person in red jacket near loading dock, Tuesday afternoon,” and the system returns the relevant clips in seconds.
No new hardware. No rip-and-replace. Just software layered onto whatever cameras and video management systems a customer already owns.
That last point matters more than it might seem. The global video surveillance market is projected to exceed $80 billion by 2028, according to estimates from MarketsandMarkets, and the installed base of cameras worldwide has ballooned past one billion units. Yet the intelligence layer sitting on top of all that hardware remains remarkably thin. Most video management systems still operate like digital VCRs — they record, store, and play back, but they don’t understand what they’re looking at.
Conntour’s founders, CEO Jared Bhatt and CTO Keshav Dulal, met in Y Combinator’s Winter 2026 batch. Bhatt previously worked in enterprise sales at a major cloud provider; Dulal’s background is in computer vision and machine learning engineering. The combination is telling. This isn’t a pure research play; it’s a product built by people who understand both the AI and the buyer.
“We’re not trying to replace anyone’s camera system,” Bhatt told TechCrunch. “We’re trying to make the cameras they already have actually useful.”
The technology works by ingesting video feeds — either live or archived — and running them through a series of vision-language models that index the content in real time. Every frame gets analyzed for objects, people, actions, colors, spatial relationships, and contextual cues. The result is a searchable database that can handle natural language queries with surprising specificity. Think of it as optical character recognition, but for the physical world.
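To make the idea of a "searchable database" of footage concrete, here is a minimal sketch of attribute-based clip search. It is not Conntour's architecture, which the company has not disclosed. It assumes an upstream vision model has already reduced each clip to a set of plain-text tags, and the names Clip, ClipIndex, and the sample camera IDs and timestamps are hypothetical. A production system would more likely match queries against learned embeddings than exact tags.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    """One indexed video segment (hypothetical schema)."""
    camera: str
    start: str          # ISO 8601 timestamp, so string order is time order
    tags: frozenset     # attributes a vision model is assumed to have produced


class ClipIndex:
    """Toy inverted index: tag -> set of clips carrying that tag."""

    def __init__(self):
        self._by_tag = defaultdict(set)

    def add(self, clip):
        for tag in clip.tags:
            self._by_tag[tag].add(clip)

    def search(self, *tags):
        """Return clips that carry every requested tag, oldest first."""
        if not tags:
            return []
        candidate_sets = [self._by_tag.get(tag, set()) for tag in tags]
        hits = set.intersection(*candidate_sets)
        return sorted(hits, key=lambda clip: clip.start)


index = ClipIndex()
index.add(Clip("dock-1", "2025-01-07T14:05:00",
               frozenset({"person", "red jacket", "loading dock"})))
index.add(Clip("dock-1", "2025-01-07T09:00:00",
               frozenset({"forklift", "loading dock"})))
index.add(Clip("lobby-2", "2025-01-07T14:10:00",
               frozenset({"person", "red jacket"})))

results = index.search("person", "red jacket", "loading dock")
for clip in results:
    print(clip.camera, clip.start)
```

The expensive part in practice is producing the tags (or embeddings) from raw frames. Once they exist, the lookup itself is cheap, which is why indexing at ingest time is what makes fast after-the-fact queries plausible.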
And the timing isn’t accidental. The rapid maturation of multimodal AI models — systems that can process both images and text simultaneously — has made this kind of product feasible in ways it simply wasn’t two years ago. OpenAI’s GPT-4o, Google’s Gemini, and a growing roster of open-source vision-language models have dramatically lowered the technical barriers to building products that “see” and “understand” video content. Conntour appears to be using a combination of proprietary models fine-tuned on surveillance-specific data and off-the-shelf foundation models, though the company has been guarded about the exact architecture.
General Catalyst’s investment signals more than just enthusiasm for AI startups. The firm has been making concentrated bets on what it calls “AI infrastructure for the physical world” — companies that apply machine intelligence to operations in warehouses, factories, hospitals, and retail environments. Conntour fits that thesis neatly.
“The amount of video data being generated every day is staggering, and almost none of it is being analyzed in a meaningful way,” said a General Catalyst partner familiar with the deal, according to TechCrunch’s reporting. “Conntour is building the intelligence layer that should have existed years ago.”
The competitive field, however, is not empty. Several companies have been working on AI-enhanced video analytics for years, with varying degrees of success. Verkada, the cloud-based security camera company valued at $3.2 billion after a 2022 funding round, has built its own AI search features directly into its hardware-software bundle. BriefCam, acquired by Canon in 2018, offers video content analytics that can summarize hours of footage into minutes. Ambient.ai, backed by a16z, focuses specifically on AI-driven threat detection for physical security. And the legacy giants, among them Genetec, Milestone Systems, and Avigilon (now part of Motorola Solutions), have all been layering machine learning capabilities onto their platforms.
So what makes Conntour different?
The hardware-agnostic approach is the clearest differentiator. Verkada’s AI features only work with Verkada cameras. Genetec’s analytics are tightly coupled with its own VMS platform. Conntour, by contrast, is designed to plug into any existing system — Axis, Hikvision, Dahua, Hanwha, or whatever else a customer happens to have deployed. For large enterprises that have accumulated a patchwork of camera systems over the years through acquisitions, expansions, and vendor switches, that interoperability is a significant selling point.
There’s also the natural language interface. Most existing video analytics tools require users to set up predefined rules and alerts — draw a tripwire here, flag motion there, count people crossing this line. Conntour’s approach flips that model. Instead of anticipating every scenario in advance, users can ask questions after the fact, in plain English. It’s the difference between programming a DVR and searching the internet.
But skeptics will rightly ask: does the technology actually work at scale? Video is computationally expensive. Processing thousands of camera feeds in real time, indexing every frame, and returning accurate search results within seconds demands enormous compute resources. Conntour says it handles this through a combination of edge processing — running lightweight models on local hardware near the cameras — and cloud-based heavy lifting for more complex queries. The company claims sub-five-second response times for most searches across multi-camera deployments, though independent benchmarks aren’t yet available.
Privacy concerns loom large over any company working with surveillance video. Conntour insists it does not use facial recognition and has no plans to add it. The company’s search capabilities focus on attributes such as clothing color, body type, objects carried, and vehicle descriptions, rather than biometric identification. That’s a deliberate product decision, and likely a shrewd one. Facial recognition in surveillance has become politically and legally radioactive in many jurisdictions, with cities like San Francisco, Boston, and Portland banning its use by government agencies. By sidestepping biometrics entirely, Conntour avoids the regulatory minefield that has tripped up facial recognition vendors like Clearview AI.
Still, attribute-based search raises its own questions. Civil liberties organizations have argued that even non-biometric surveillance tools can enable discriminatory policing if deployed without proper oversight. Searching for “person in a hoodie near the entrance at 2 AM” might sound innocuous in a retail loss prevention context, but the same query in a law enforcement context could reinforce existing biases in who gets flagged as suspicious.
Conntour says it’s focused initially on commercial customers — retailers, logistics companies, corporate campuses, and healthcare facilities — rather than law enforcement. The company’s early pilots, according to its YC profile, have been with mid-market retail chains and distribution centers, where the use case is straightforward: reduce shrinkage, investigate incidents faster, and improve operational awareness without hiring more security staff.
The business model is SaaS. Customers pay a monthly fee per camera connected to the platform, with tiered pricing based on the number of feeds, retention period, and query volume. Conntour hasn’t disclosed specific pricing, but comparable products in the video analytics space typically range from $5 to $30 per camera per month, depending on features. At scale, the unit economics could be compelling — a 500-camera deployment at $15 per camera per month would generate $90,000 in annual recurring revenue from a single customer.
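The arithmetic behind that figure is simple enough to check in a few lines. The sketch below uses the article's illustrative numbers (500 cameras, and the $5 to $30 range cited for comparable products), not Conntour's actual pricing, which is undisclosed. The function name is hypothetical.

```python
def annual_recurring_revenue(cameras, price_per_camera_month):
    """Per-camera monthly fee, annualized across a deployment."""
    return cameras * price_per_camera_month * 12


# The article's example: 500 cameras at $15 per camera per month.
example_arr = annual_recurring_revenue(500, 15)
print(example_arr)  # 90000

# The same deployment at the low, middle, and high end of the cited range.
for price in (5, 15, 30):
    print(f"${price}/camera/month -> ${annual_recurring_revenue(500, price):,}/year")
```

The tiering factors the article mentions (retention period and query volume) would shift the per-camera price rather than change the structure of the calculation.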
The $7 million seed round is modest by current AI startup standards, where pre-product companies routinely raise $20 million or more on little more than a pitch deck and a PhD. But the Y Combinator pedigree and General Catalyst’s backing give Conntour credibility that money alone can’t buy. General Catalyst has been one of the most active enterprise-focused venture firms in recent years, with portfolio companies including Stripe, HubSpot, and Anduril. Its willingness to lead a seed round — rather than waiting for a Series A with more traction data — suggests strong conviction in both the team and the market.
The broader trend here is unmistakable. AI is moving from the cloud to the physical world at an accelerating pace. Warehouse robots, autonomous vehicles, smart building systems, and now intelligent surveillance — the common thread is applying machine perception to environments that have historically been analog or, at best, digitally dumb. The infrastructure exists. The cameras are already installed. The data is already being captured. What’s been missing is the ability to make sense of it all without a human staring at a screen.
Conntour’s bet is that the search paradigm — the same model that organized the internet — can organize the physical world as captured by cameras. It’s an elegant thesis. Whether it holds up against the messy realities of enterprise sales cycles, integration complexity, and the relentless pace of competition from both startups and incumbents will determine whether this $7 million seed turns into something much larger.
For now, the company has about 15 employees, plans to double headcount by year’s end, and is actively onboarding its first wave of paying customers. The product is live. The money is in the bank. And somewhere, a security guard is still rewinding footage frame by frame, searching for a person in a red jacket near a loading dock on a Tuesday afternoon.
Not for much longer, if Conntour has its way.


WebProNews is an iEntry Publication