Non-Maximum Suppression (NMS) is a computer vision technique for selecting a single entity from a large number of overlapping candidates. Candidates whose probability falls below a given threshold are generally discarded.

A typical object detection pipeline has one component that generates proposals for classification. Proposals are candidate regions for the object of interest. Most methods slide a window over the feature map and assign foreground/background scores based on the features computed in that window. Neighboring windows have somewhat similar scores, and they are all treated as candidate regions. This produces hundreds of proposals. Because the proposal generation step should have high recall, we keep the constraints loose at this stage.

However, running all of these proposals through the classification network is expensive. This is where Non-Maximum Suppression comes in: it filters the proposals based on a set of criteria.

Most object detection algorithms use NMS to reduce a large number of detected rectangles to a handful. In most cases, the way object detection works makes NMS necessary. At the most basic level, most object detectors rely on some form of windowing. Hundreds of thousands of windows of various sizes and shapes are generated, either directly on the image or on a feature of the image. Each window is assumed to contain only one object, and a classifier assigns a probability/score to each class. Once the detector has produced a large number of bounding boxes, the best ones must be selected. NMS is the most commonly used algorithm for this task. In essence, it is a form of clustering.

Non-maximum suppression algorithm

The input to NMS is a list of proposal boxes B, the corresponding confidence scores S, and an overlap threshold N. The output is a list of filtered proposals D.

  • Select the proposal with the highest confidence score, remove it from B, and add it to the final proposal list D. (Initially, D is empty.)
  • Compare this proposal with all remaining proposals by computing its IOU (Intersection over Union) with each of them. If the IOU exceeds the threshold N, remove that proposal from B.
  • Again select the proposal with the highest confidence among those remaining in B, remove it from B, and add it to D.
  • Again compute the IOU of this proposal with all remaining proposals in B, and remove the boxes whose IOU exceeds the threshold.
  • Repeat this process until no proposals are left in B.
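
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names `iou` and `nms`, the (x1, y1, x2, y2) box format, and the sample boxes are assumptions made for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, threshold):
    """Greedy NMS: return the indices of the kept boxes, best score first."""
    # B: indices of the remaining proposals, sorted by descending confidence.
    remaining = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []  # D: the filtered proposals
    while remaining:
        best = remaining.pop(0)  # highest-confidence proposal left in B
        kept.append(best)
        # Drop every remaining proposal that overlaps `best` too much.
        remaining = [i for i in remaining
                     if iou(boxes[best], boxes[i]) <= threshold]
    return kept


# Hypothetical sample data: two heavily overlapping boxes and one separate box.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores, threshold=0.5))  # [0, 2]
```

The second box overlaps the first with an IOU of about 0.82, so it is suppressed, while the third box does not overlap anything and is kept.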

A single threshold value controls the entire filtering process. As a result, choosing the right threshold value is crucial to the model’s success. Setting this threshold, however, is difficult.

So how does NMS work?

Assume the overlap threshold N is 0.3. If a proposal has an IOU of 0.31 and a high confidence score, it will be removed, even though its confidence is higher than that of many other boxes with a lower IOU. As a result, if two objects sit side by side, one of them will be eliminated. Meanwhile, a proposal with an IOU of 0.29 is kept even though its confidence is very low. Of course, any threshold-based method has this drawback. So how do we deal with it?
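
To make the effect of a hard threshold concrete, here is a small sketch under assumed values: a threshold of 0.3, a top-scoring 100x100 box, and two hypothetical candidates chosen so that their IOU with the top box lands just above and just below the threshold.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


THRESHOLD = 0.3
top = (0, 0, 100, 100)  # highest-scoring box, already moved to D

# Hypothetical candidates still in B (names carry their confidence scores).
candidates = {
    "neighbour, score 0.90": (53, 0, 153, 100),
    "weak box, score 0.20": (0, 54, 100, 154),
}

decisions = {}
for name, box in candidates.items():
    overlap = iou(top, box)
    # The decision depends only on the overlap, never on the score.
    decisions[name] = "removed" if overlap > THRESHOLD else "kept"
    print(f"{name}: IOU={overlap:.3f} -> {decisions[name]}")
# neighbour, score 0.90: IOU=0.307 -> removed
# weak box, score 0.20: IOU=0.299 -> kept
```

The confident neighbour is discarded and the weak box survives, purely because of which side of the threshold each one falls on.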

Soft-NMS is a simple but effective way to handle this situation. The idea is straightforward: instead of completely removing proposals that have a high IOU and high confidence, reduce their confidence in proportion to the IOU value.

This is just a one-line change in the code of the NMS algorithm, yet it can noticeably improve precision.
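
A minimal sketch of this idea follows, assuming the linear decay rule (multiply the score by 1 - IOU whenever the IOU exceeds the threshold). The function name `soft_nms`, the `score_threshold` cutoff for discarding boxes whose score has decayed to almost nothing, and the sample boxes are illustrative assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, iou_threshold=0.3, score_threshold=0.001):
    """Soft-NMS with linear decay: return (kept indices, updated scores)."""
    scores = list(scores)  # work on a copy; scores change as we go
    remaining = list(range(len(boxes)))
    kept = []
    while remaining:
        # Scores are updated each round, so re-select the current best.
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        kept.append(best)
        for i in remaining:
            overlap = iou(boxes[best], boxes[i])
            if overlap > iou_threshold:
                # The changed line: decay the score instead of deleting.
                scores[i] *= 1.0 - overlap
        # Discard boxes whose score has decayed below the cutoff.
        remaining = [i for i in remaining if scores[i] >= score_threshold]
    return kept, scores


# Hypothetical sample data: a top box, a confident neighbour, a weak box.
boxes = [(0, 0, 100, 100), (53, 0, 153, 100), (0, 54, 100, 154)]
scores = [0.95, 0.90, 0.20]
kept, new_scores = soft_nms(boxes, scores)
print(kept)
print([round(s, 3) for s in new_scores])
```

With these values the confident neighbour is kept, but its score drops from 0.90 to roughly 0.62, because its IOU with the top box is about 0.31. The weak box is below the threshold and keeps its score unchanged.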

These methods work well for filtering predictions from a single model, but what if you have predictions from several models? Weighted Boxes Fusion (WBF) is a newer method for combining the predictions of object detection models.

Conclusion

NMS (Non-Maximum Suppression) is a computer vision technique used in many algorithms. It is a class of methods for selecting one entity out of many overlapping ones. The selection criteria can be chosen to achieve particular results. The most common criteria are some form of probability score and some form of overlap measure (e.g. IOU).