What recent developments in deep learning can we use in SAR imaging?
Part 2 — Detection
In this second part, we focus on the advances of deep learning in the field of object detection in Synthetic Aperture Radar (SAR) images. The topic itself is not new, but deep learning is only gradually making its way into object detection in radar images and is still at a relatively early stage.
Some vocabulary
To discuss detection, it is useful to first recall the definitions used in computer vision for the various tasks that can be performed on an image, in order to better distinguish detection, classification, instance segmentation, semantic segmentation, and clustering.
Indeed, I have noticed that the remote sensing community does not necessarily use the same vocabulary as the computer science and artificial intelligence communities.
As for myself, when I started in radar imaging, I thought in terms of the following functions: segmentation, classification, and detection.
- Classification involved assigning a label to each pixel: vegetation, urban areas, flat surfaces, etc. For example, in polarimetric imaging, we are familiar with the H/Alpha classification.
- Detection was a special case of binary classification, with a single class of interest against a background class.
- Segmentation was a way to select contiguous pixel regions. The notion of connectivity was essential in this definition, but there wasn’t necessarily an associated label.
In the field of computer vision, the definitions are different. We don’t reason solely on pixels but on “objects.” To define these concepts, most examples that can be found are illustrated with sheep. I have adapted them here to the case of ships because it’s almost the same — especially for a French person — 😆
Let's consider a SAR image of ships docked at a quay in a port.
- Detection refers to a type of object rather than to pixels. To detect the ships, the goal is to draw a bounding box around each of these objects, which also makes it possible to count them.
- What is called "clustering" in an image is the aggregation of connected pixels: grouping the pixels of the image into connected regions, effectively tiling the image. In this case, no specific meaning is associated with the resulting regions.
- Next, we have several types of "segmentation." In semantic segmentation, each pixel is associated with a label or class. Instance segmentation is similar, with the difference that among the labels we distinguish different objects of the same class: if there are multiple ships, each ship is a distinct instance and is labeled differently from the others. Panoptic segmentation is similar to instance segmentation, with the additional requirement that every pixel in the image must be assigned a label, whereas instance segmentation can focus on specific types of objects only. (A small sketch contrasting these outputs is given right after this list.)
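To make these distinctions concrete, here is a minimal NumPy/SciPy sketch of the kind of output each task produces. The toy array, the class labels, and the ship/quay layout are purely illustrative.

```python
import numpy as np
from scipy import ndimage

# Toy 6 x 8 binary "image": 0 = sea, 1 = bright returns (two ships and a quay).
img = np.array([
    [0, 0, 0, 0, 0, 0, 1, 1],
    [0, 1, 1, 0, 0, 0, 1, 1],
    [0, 1, 1, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 1, 1],
])

# Clustering: group connected pixels into regions, with no meaning attached.
# Note that the second ship, which touches the quay, merges with it here.
clusters, n_regions = ndimage.label(img)

# Semantic segmentation: one class label per pixel (0 = sea, 1 = ship, 2 = quay).
semantic = np.where(img == 1, 1, 0)
semantic[:, 6:] = 2                      # the two rightmost columns are the quay

# Instance segmentation: like semantic, but each ship gets its own id (1, 2, ...).
instances, n_ships = ndimage.label(semantic == 1)

# Detection: one axis-aligned bounding box per ship instance (as array slices).
boxes = ndimage.find_objects(instances)
print(n_regions, n_ships, boxes)
```

Running it gives 2 connected regions for the clustering (the second ship merges with the quay, a difficulty we will come back to), 2 ship instances, and 2 bounding boxes.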
Mapping this back to my earlier vocabulary:
- What I referred to as H/Alpha classification corresponds to semantic segmentation in this terminology.
- Detection is commonly performed at the object level rather than on individual pixels.
- And what I previously called segmentation (tiling into connected regions) is now referred to as clustering.
Detection in a SAR image
In practice, the salient objects in a SAR image are not necessarily the same as those in an optical image, so it is natural that the focus is not on the same types of targets as in other imaging domains. Radar images are commonly used to detect man-made objects, such as:
- Military vehicles (which explains the many research efforts built around the MSTAR database).
- Ships, which offer particularly strong contrast against sea clutter (see the SAR Ship Detection Dataset (SSDD) at https://www.mdpi.com/1272438).
- Industrial buildings.
- Any metallic object.
Supervised learning methods rely on the availability of labeled databases for training. As a result, deep learning research has primarily focused on ships and military targets.
Detection in a SAR image… inspired by optical images: MMRotate
Open-MMLab is an open-source project dedicated to the research and development of deep learning algorithms in the field of computer vision, with the aim of providing a state-of-the-art platform for the deep learning community. As part of this project, several tools and software libraries have been developed to address various computer vision needs.
Several of these libraries address object detection, but for remote sensing applications it is more appropriate to consider the MMRotate module. It is designed to handle objects with arbitrary orientations in the image, which is often the case in remote sensing images, where there is no preferred orientation.
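To fix ideas on what an oriented (rotated) bounding box is, here is a small NumPy sketch assuming the common (cx, cy, w, h, angle) parameterization used by oriented detectors such as those in MMRotate. The conversion function below is my own illustration, not MMRotate's API, and the ship dimensions are made up.

```python
import numpy as np

def obb_to_corners(cx, cy, w, h, angle_rad):
    """Convert an oriented bounding box (center, size, rotation angle)
    into its four corner coordinates."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    rot = np.array([[c, -s], [s, c]])
    # Corners of the axis-aligned box centered at the origin
    half = np.array([[-w, -h], [w, -h], [w, h], [-w, h]]) / 2.0
    return half @ rot.T + np.array([cx, cy])

# A ship roughly 120 m x 20 m, moored at 30 degrees from the image rows:
corners = obb_to_corners(cx=250.0, cy=310.0, w=120.0, h=20.0,
                         angle_rad=np.deg2rad(30.0))
print(corners)   # 4 x 2 array of (x, y) corner coordinates
```

Compared to an axis-aligned box, this representation fits an elongated ship much more tightly, which matters when several ships are moored side by side.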
In the following article, the authors present an alternative method (SRNet) for detecting objects that may appear with arbitrary rotations, in the same spirit as the MMRotate module.
Yang, X.; Zhang, Q.; Dong, Q.; Han, Z.; Luo, X.; Wei, D. Ship Instance Segmentation Based on Rotated Bounding Boxes for SAR Images. Remote Sens. 2023, 15, 1324. https://doi.org/10.3390/rs15051324
The approach for handling rotated objects is indeed very useful. However, it is important to note the following distinction: in optical imagery, the viewing geometry is often nadir-looking (vertical axis), and objects are invariant under rotation about the line of sight.
In radar imaging, the viewing geometry is side-looking, so the appearance of an object is not strictly invariant under rotation in azimuth. This poses a challenge when applying rotation-based approaches to radar imagery. It may not be a problem for ships, whose low height leads to minimal projection effects, but it can be problematic for other objects such as buildings or pylons, for which these approaches may no longer be valid. The small sketch below gives an order of magnitude of this projection effect.
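As a back-of-the-envelope illustration (my own, with illustrative numbers): a scatterer at height h above the ground appears displaced toward the sensor by roughly h / tan(θ) in ground range, where θ is the incidence angle.

```python
import numpy as np

def layover_shift_m(height_m, incidence_deg):
    """Approximate ground-range displacement (toward the sensor) of a
    scatterer at height `height_m`, for a side-looking SAR with the given
    incidence angle: shift ≈ h / tan(theta)."""
    return height_m / np.tan(np.deg2rad(incidence_deg))

incidence = 35.0                              # illustrative incidence angle (deg)
print(layover_shift_m(5.0, incidence))        # ~7 m for a 5 m ship superstructure
print(layover_shift_m(30.0, incidence))       # ~43 m for a 30 m building
```

For a low ship the displacement stays within a few resolution cells, so its projected footprint changes little with orientation; for a tall building the footprint depends strongly on its orientation relative to the line of sight, which is why rotation invariance in azimuth cannot be assumed.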
Finally, it should be noted once again that if one of the objectives is to accurately delineate the fine contour of a ship, one can consider using the SAM algorithm, which helps with the clustering part, in combination with the detection step (a rough sketch of such a combination is given below).
However, even with this guidance, it is not easy to overcome the usual difficulties, such as ships merging with one another or with the dock.
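As a rough idea of how detection and SAM could be combined, here is a sketch using the public segment-anything package, where a detector's bounding box is used as a prompt to refine the ship contour. The checkpoint path, file names, and box coordinates are placeholders, and the SAR amplitude tile must first be mapped to an 8-bit, 3-channel image (here a simple log scaling), a strong assumption since SAM was trained on optical photographs.

```python
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

def sar_to_rgb8(amplitude):
    """Map a SAR amplitude tile to an 8-bit, 3-channel image (log scale),
    since SAM expects an RGB-like input."""
    db = 20.0 * np.log10(np.maximum(amplitude, 1e-6))
    lo, hi = np.percentile(db, [2, 98])
    u8 = np.clip((db - lo) / (hi - lo) * 255.0, 0, 255).astype(np.uint8)
    return np.stack([u8, u8, u8], axis=-1)

# Placeholder checkpoint and SAR tile; the box would come from a detector
# such as an MMRotate/SRNet model (here axis-aligned, in x0, y0, x1, y1 order).
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)

amplitude = np.load("sar_tile.npy")            # hypothetical SAR amplitude tile
predictor.set_image(sar_to_rgb8(amplitude))

detector_box = np.array([120, 80, 360, 150])   # hypothetical detection box
masks, scores, _ = predictor.predict(box=detector_box, multimask_output=False)
ship_mask = masks[0]                           # boolean mask refining the box
```

The box prompt constrains SAM, but as noted above it does not fully resolve the cases where two ships touch, or where a ship's returns merge with those of the dock.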
In summary, I believe that object detection in SAR images has not yet reached a mature stage in deep learning. Let’s get to work!