Recent developments in machine vision have demonstrated remarkable improvements in the ability of computers to properly identify objects in a viewing field. Most of these advances rely on color-texture analyses that require target objects to possess one or more highly distinctive, local features that can serve as distinguishing characteristics for a classification algorithm. Many objects, however, consist of materials that are widely prevalent across a variety of object categories. For example, many trees have leaves and many manmade objects are made of painted metal, such that color-texture detectors configured or trained to identify leaves or painted metal are effective for some categorizations but not for others. Much less effort has been devoted to characterizing objects based on shape, i.e., the particular way component features are arranged relative to one another in two-dimensional (2D) image space.
The overarching goal of creating a machine that can see as well as a human has influenced prior research to focus on scaling computing power to match that of the human visual system, a task requiring petaflops of computation. The advent of cloud computing and the introduction of graphics processing units, multicore processors, smart caches, solid-state drives, and other hardware acceleration technologies suggests that access to sufficient computing power should no longer be the major impediment to effective machine-based object recognition going forward.
The goal remains to develop object-recognition systems that are sufficiently accurate to support commercial applications. Accordingly, a system and method for highly accurate, automated object detection in an image or video frame may be beneficial.
A contour/shape detection model was developed that is based on relatively simple and efficient kernels for detecting target edges (i.e., edges that are part of target objects) within an image or video. A differential kernel can be employed to select edge segments that belong to desired objects based on probabilistic relationships between pairs and n-tuples of edge segments, expressed as a function of their relative spacing and relative orientation. Such a method may suppress background edges while promoting target edges, leaving only those contours/shapes that are more likely to belong to the desired object.
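The pairwise relationship described above can be sketched as a weight that grows for nearby, similarly oriented edge segments and shrinks otherwise. The Gaussian form and the parameters `sigma_d` and `sigma_theta` below are illustrative assumptions for the sketch, not the actual kernel or probability model of the method:

```python
import numpy as np

def cooccurrence_weight(p1, theta1, p2, theta2,
                        sigma_d=20.0, sigma_theta=np.pi / 8):
    """Illustrative pairwise kernel: score how likely two edge segments
    are to belong to the same object contour, as a function of their
    relative spacing and relative orientation.

    p1, p2       -- (x, y) midpoints of the two edge segments
    theta1/2     -- segment orientations in radians
    sigma_d      -- assumed distance scale (hypothetical parameter)
    sigma_theta  -- assumed orientation tolerance (hypothetical parameter)
    """
    # Relative spacing between the two segment midpoints.
    d = np.hypot(p2[0] - p1[0], p2[1] - p1[1])
    # Relative orientation, folded into [0, pi/2] so that segments
    # differing by a multiple of pi count as parallel.
    dtheta = np.abs((theta1 - theta2 + np.pi / 2) % np.pi - np.pi / 2)
    # Nearby, similarly oriented segments get weight near 1; distant or
    # misaligned segments are suppressed toward 0.
    return float(np.exp(-d**2 / (2 * sigma_d**2))
                 * np.exp(-dtheta**2 / (2 * sigma_theta**2)))
```

In this sketch, a pair of collinear segments a few pixels apart scores close to 1, while a distant or perpendicular pair scores near 0, which is the behavior a differential kernel would need to promote target edges over background clutter.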
The computer-implemented method includes calculating, by a computing system, a co-occurrence probability for two or more edge features in an image or video using an object definition. The method also includes differentiating, by the computing system, between edge features based on a measured contextual support, and extracting prominent edge features based on that contextual support. The method further includes identifying the object based on the extracted prominent edge features.
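The claimed steps can be sketched end to end: accumulate pairwise co-occurrence support for each edge segment, differentiate segments by that contextual support, and keep the prominent ones. The summed-support definition, the `pair_weight` callable, and the `support_threshold` cutoff are assumptions made for this sketch; the method does not specify these particulars:

```python
import numpy as np

def extract_prominent_edges(segments, pair_weight, support_threshold=0.5):
    """Illustrative pipeline for the claimed steps:
    (1) accumulate co-occurrence support for each edge segment,
    (2) differentiate segments by that contextual support,
    (3) extract the prominent segments.

    segments          -- list of (midpoint, orientation) tuples
    pair_weight       -- callable returning a pairwise co-occurrence score
    support_threshold -- hypothetical cutoff (assumed, not specified)
    """
    n = len(segments)
    support = np.zeros(n)
    for i in range(n):
        pi_, ti = segments[i]
        for j in range(n):
            if i == j:
                continue
            pj, tj = segments[j]
            # Step 1: contextual support is taken here as the summed
            # pairwise co-occurrence score with all other segments.
            support[i] += pair_weight(pi_, ti, pj, tj)
    # Steps 2-3: segments whose support clears the cutoff are treated as
    # target edges; the rest are suppressed as background.
    keep = support >= support_threshold
    return [seg for seg, k in zip(segments, keep) if k], support
```

Under these assumptions, a cluster of mutually supporting segments (e.g., along one object contour) survives the threshold, while an isolated background segment accumulates little support and is suppressed.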