Intelligent airborne monitoring of irregularly shaped man-made marine objects using statistical Machine Learning techniques

Nov 2024

The objective of this study is to create a new platform for the automated detection of irregularly shaped man-made marine objects (ISMMMOs) in large datasets derived from marine aerial survey imagery. We present here the first part of the paper. The concluding part of the paper will be published in the next issue.

Kaya Kuru

School of Engineering and Computing, University of Central Lancashire, Fylde Rd, Preston, Lancashire, PR12HE, UK

Stuart Clough

APEM Inc., 2603 NW 13th Street, 402, Gainesville, FL 32609-2835, USA

Darren Ansell

School of Engineering and Computing, University of Central Lancashire, Fylde Rd, Preston, Lancashire, PR12HE, UK

John McCarthy

APEM Ltd., The Embankment Business Park, Stockport SK4 3GN, UK

Stephanie McGovern

APEM Ltd., The Embankment Business Park, Stockport SK4 3GN, UK

Abstract

The marine economy has historically been highly diversified and prolific because the Earth’s oceans comprise two-thirds of its total surface area. As technology advances, leading enterprises and ecological organisations are building and mobilising new devices supported by cutting-edge marine mechatronics solutions to explore and harness this challenging environment. Automated tracking of these industries and the marine life around them can help to identify the causes of current changes in species numbers, predict what could happen in the future, and create the right policies to reduce the environmental impact and make the planet more sustainable. The objective of this study is to create a new platform for the automated detection of irregularly shaped man-made marine objects (ISMMMOs) in large datasets derived from marine aerial survey imagery. In this context, a novel nonparametric methodology, which harbours several hybrid statistical Machine Learning (ML) methods, was developed to automatically segment ISMMMOs on the sea surface in large surveys. This methodology was validated on a wide range of marine domains, providing robust empirical proof of concept. This approach enables the detection of ISMMMOs automatically, without any prior training, with accuracy (ACC), Matthews correlation coefficient (MCC), negative predictive value (NPV), positive predictive value (PPV), specificity (Sp) and sensitivity (Se) all over 0.95. The outlined methodology can be utilised for a variety of purposes, but it is especially useful for researchers and policymakers who want to monitor how the maritime industry is deploying and to ensure that the right policies are in place to meet regulatory and legal requirements, promote maritime technology innovation and shape the future of the marine ecosystem.
For the first time in the literature, a method, the so-called ISMMMOD, has been developed to automate the detection of all types of ISMMMOs by statistical ML techniques that require no prior training, which will pioneer the monitoring of the human footprint in the marine ecosystem.

1. Introduction

The process of marine spatial planning is highly contentious due to the presence of multiple stakeholders, often with conflicting objectives and values (Elrick-Barr et al., 2022). The maritime economy has historically been highly diversified and prolific because the Earth’s oceans comprise two-thirds of its total surface area. As technology advances, leading enterprises and ecological organisations are building and mobilising new devices supported by cutting-edge marine mechatronics solutions (Shi et al., 2017) within the framework of Automation of Everything (AoE) (Kuru and Yetgin, 2019) to explore and harness this challenging environment. More specifically, robotic vehicles, autonomous vehicles, and surface vessels have been deployed for the offshore industries, deep-sea archaeology, ocean engineering projects, rescue operations and environmental measurements for the last several decades. For instance, the Argo program, an international collaboration, has deployed approximately 3900 instruments in the world’s oceans to facilitate the collection of data for climatological and oceanographic studies (Riser et al., 2016). Besides, artificial structures such as gas, oil and deep seabed mining platforms, offshore renewable energy harvesting technologies such as oil and gas installations, wind farms and wave energy converters, fish farms, ships, boats and yachts for transportation, and autonomous marine vehicles from unmanned ships to smaller vessels are becoming inevitable components of the offshore environment. For instance, in recent years, the offshore wind industry has seen a remarkable expansion, with an annual growth rate of 25%, to construct offshore energy islands that help meet gas emission reduction targets (Zhang et al., 2021). These developments may conflict with nature conservation objectives, for example through habitat loss or species endangerment.
In other words, this rapidly expanding industry, which allows for extensive, ongoing human influence in the marine domain, has the potential to have a significant impact on the marine environment, particularly on the sea floor, turtles, fish, and birds. For instance, the population of monitored seabirds, which account for about 19% of all seabird populations, declined by almost 70% from 1950 to 2010 (Paleczny et al., 2015), and a net loss of almost 3 billion (29%) birds since 1970 has been reported (Rosenberg et al., 2019). The decline of bird populations serves as a stark reminder of the need for immediate action to mitigate threats and avert the eventual decline of avifauna and the resulting degradation of ecosystem health, functionality, and services (Rosenberg et al., 2019). Intervention into nature is a natural consequence of human activities; however, when managed effectively, these interventions can be beneficial not only to the environment, but also to the ongoing development of civilisation (Sànchez-Marrè et al., 2004). To better understand the planet and to ensure effective conservation planning, it is essential to have a comprehensive understanding of the species, habitats, and sites that require protection. Unfortunately, for the majority of species and regions, comprehensive quantitative knowledge is not yet available (Bibby et al., 1998). One of the key objectives of the development and utilisation of ecological models and applications is to influence ecological policy practices, outputs and results in a beneficial manner (McIntosh et al., 2011). There is an urgent need to monitor environmental upheavals, impacts and possible trends with environmental time series analysis, models and tools as the footprint of human activities increases with the rapid development of the industry.
In this manner, modelling, automated detection, location and real-time monitoring of industrial sites and the ecosystems around them can help uncover the current and potential future effects on nature. Furthermore, the insights observed and the models developed based on these insights may help researchers and policymakers to monitor this diverse ecosystem along with the associated maritime industries and thereby help to determine the legal and regulatory requirements for reducing the ecological footprint concerning immediate foreseeable environmental problems.

There are numerous studies in the literature on detecting underwater man-made objects (MMOs) within a limited region of interest (RoI) using underwater imagery, robots or sonography. For instance, Abu and Diamant (2022) propose a contour-based feature analysis method to discern underwater MMOs from the natural environment, considering that the contours of MMOs are supposed to be smoother than those of natural objects. There are a limited number of studies in the literature on detecting specific types of surface marine MMOs using supervised Machine Learning (ML) and Deep Learning (DL) approaches that require prior training in the marine ecosystem. For instance, Han et al. (2022) proposed a DL technique titled LCSE-ResNet to detect, classify and locate vessels and oil platforms based on remote optical imagery, by which all other MMOs are excluded. There are no studies in the literature that investigate the detection of all types of surface marine MMOs, which makes this research the first study of its kind. Most irregularly shaped man-made marine objects (ISMMMOs) are made of materials such as metal, treated wood, fibreglass, PVC plastic, glass, or concrete, and they have different types of irregular shapes and colours. Hence, it is infeasible to apply: i) a template matching technique based on a specific object used as a template, and ii) a supervised ML approach based on prior knowledge or similar datasets to train on similar objects and then detect these objects automatically. Moreover, the current clustering algorithms used to group visual datasets are not capable of accomplishing this task with a high degree of precision (Kuru et al., 2013; Kuru and Khan, 2018), particularly for objects with indefinite shapes. Therefore, a new method is needed to realise this objective.
On the one hand, automatic detection of ISMMMOs is not easy for two main reasons, which pose a considerable challenge: i) the rapidly changing background depending on the camera, water turbidity, weather, wind, wave speed and period, sun glint and density of clouds, and ii) the various non-definitive morphologies of MMOs. On the other hand, the characteristics of ISMMMOs differ from the natural environment and other natural objects within this ecosystem regarding the composition, features of the surface, saturation of light and colourfulness relative to the brightness with which an area radiates a varying amount of light.

Many studies aim to detect marine natural objects in sea areas using stationary land-based cameras, in particular, sea animals: detection of animals in deep-sea video (Mehrnejad et al., 2013), detection of sharks using multispectral imaging (Lopez et al., 2014), and detection of killer whales using the infrared spectrum (Graber, 2011). Furthermore, aerial surveys from a helicopter or small aircraft have been conducted for many years to detect, locate and monitor specific marine animals using human-based visual observations. Although there are several studies on the detection of MMOs such as ships (Saur et al., 2011) and specific objects (e.g., boats, humans) on the ocean surface using infrared cameras (Leira et al., 2015), to the best of our understanding, there is no study that aims to detect all kinds of ISMMMOs automatically with unsupervised approaches using standard advanced cameras and aerial surveys, in particular, from the perspective of ecology. Aerial surveys provide a cost-effective way to collect environmental information over large areas in a short amount of time; however, they may not provide reliable data if not conducted correctly (Davis et al., 2022). Long-term data collected using standardised and well-structured approaches are the best way to measure change in ecology; unfortunately, such data are not available for most biogeographical regions (Clements and Robinson, 2022) due to the cost of data processing with intensive human intervention. In this sense, this study mainly aims to fill this gap in the scientific literature either by processing the collected data in an automated way, with no human intervention, to separate several hundreds of ISMMMOs from large surveys, or by processing images as they are streamed from the airborne camera systems to monitor ISMMMOs with their geospatial locations immediately, with a novel approach using statistical ML techniques and the HSV colour space.

In a conventional marine survey program, there may be a large number of images, e.g., around a million, collected over a period of one year to be analysed for a particular site, and it is labour-intensive to categorise the data into two groups: positive images containing man-made material and negative images without man-made material. In fact, many of the surveys that APEM Ltd. (see end note 3) has acquired indicate that 95% of the aerial survey images do not contain any targeted object (Kuru et al., 2023). According to some research on visual perception, humans perceive only a small portion of an environment or scene in detail under typical viewing conditions (Noe et al., 2000), which may result in discarding other details that should be taken into account. Although the elements that influence how a scene is perceived are not yet known, it appears that focus is a significant factor (Noe et al., 2000). Within this context, detection of ISMMMOs in large-scale images within very large surveys is a non-trivial and labour-intensive task. Therefore, the utilisation of automated intelligent computer systems to automate this work would be highly advantageous in order to facilitate the development of efficient environmental models with real-world inputs.

To the best of our knowledge, this study, for the first time, explicitly investigates the automatic detection of offshore ISMMMOs to assist researchers, environmentalists, and policymakers in monitoring and managing the various applications of the maritime industry and to provide guidance on the necessary regulations and legal requirements.

In order to illustrate the novelty of this research, specific contributions are listed below.
1. A novel methodology, the so-called ISMMMOD, that detects and splits ISMMMOs automatically in large-scale images in typical very large marine surveys is built.
2. The ISMMMOD is developed using the HSV colour space and statistical analysis of histograms of the channels in this space based on ROC (receiver operating characteristic) curve analysis. The techniques in the methodology distinguish man-made structures from natural maritime habitats (i.e., waves, sea animals, birds, seawater) in various aspects, in particular, composition, features of the surface and saturation of light.

The rest of this document is structured as follows: The methodology is revealed in Section 2. The datasets on which the methodology is built and tested are explored in Section 3. A summary of the findings is provided in Section 4. Discussions are outlined in Section 5. Section 6 draws a conclusion and outlines potential future works. Finally, the limitations of the study are disclosed in Section 7.

2. Methodology

All types of mobile and stationary human activities (the human footprint) need to be monitored on a regular basis, and most of these activities involve the use of non-uniform, human-built structures in multiple shapes during the exploration and exploitation of this tough marine ecosystem. Detecting these non-uniform structures on their highly dynamic background entails the development of a new technique that is not based on pretrained uniform object classifiers, but on features independent of their shapes. In this respect, we would like to reveal the features that differ from the maritime ecosystem, on which the new detection method is built. Built structures differ from the natural maritime habitats and their creatures in various aspects, in particular, composition, features of the surface and saturation of light. The saturation level of ISMMMOs varies significantly from that of their surroundings (i.e., waves, sea animals, birds, seawater). More explicitly, the saturation level of these ISMMMOs is more intense than that of the natural marine life, and in this study, the more saturated sections in images are made distinct, based on this distinguishing feature, to detect these artificial objects. More specifically, the methodology is based on the HSV colour space (elaborated in Section 2.1) and statistical analysis of histograms of the channels in this space (elaborated in Section 2.2). The essential phases of the technique and its automated execution are depicted in Fig. 1. The dynamic thresholding in the implementation of the methodology is presented in Algorithm 1 and the automated implementation of the overall methodology is presented in Algorithms 2 and 3, which are placed in Appendix A. The execution of the methodology is exemplified for the images in Figs. 11a, 12a, 13a, 14a, 15a. The techniques in this research were built with MATLAB R2020a. The interface is shown in Fig. 2.
Generally speaking, in the proposed approach, the aerial image in RGB colour space is converted into HSV colour space and then the converted image is split into three components (i.e., channels), namely H, S and V, which are designed to approximate human vision. The result is a 3D matrix with elements of Hue, Saturation and Value. In the next step, the histograms of these components are computed as illustrated in the first rows of Figs. 11b, 12b, 13b, 14b, 15b. Then, the dynamically calculated threshold value is applied to the S component along with shifting the H channel. At the end, the morphological operations, namely, masking, filtering, and smoothing, are carried out to extract the required area by suppressing the irrelevant parts mentioned in Section 2.3, such as glinting regions. The statistical terms related to the confusion matrix that are used throughout the paper are explained in Table 3 for readers who are not familiar with these commonly used terms.
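The first two steps above (RGB to HSV conversion and per-channel histograms) can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the authors' MATLAB implementation; the function names, the bin count and the toy pixel values are assumptions made for the example.

```python
import colorsys

def split_hsv(rgb_pixels):
    """Convert 8-bit RGB pixels to H, S and V lists, each scaled to [0, 1]."""
    hs, ss, vs = [], [], []
    for r, g, b in rgb_pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hs.append(h)
        ss.append(s)
        vs.append(v)
    return hs, ss, vs

def histogram(values, bins=10):
    """Count values in equal-width bins over [0, 1]; 1.0 falls in the last bin."""
    counts = [0] * bins
    for x in values:
        counts[min(int(x * bins), bins - 1)] += 1
    return counts

# Hypothetical toy "image": three dull sea-like pixels and one vivid orange pixel.
pixels = [(110, 125, 135), (105, 120, 130), (115, 130, 140), (255, 100, 0)]
h, s, v = split_hsv(pixels)
s_hist = histogram(s)  # the vivid pixel lands in the last bin of the S histogram
```

In this toy case the three dull pixels share a low-saturation bin while the vivid pixel sits alone at the saturated end, which is the kind of separation the S-channel analysis relies on.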

Before revealing the methodology on data samples in detail, we would like to explore the basic concepts of the HSV colour space in Section 2.1 to shed light on the developed techniques. Then, the phases of the methodology (Fig. 1) are disclosed on the sample images acquired from various image surveys. The dynamic thresholding phase for the S channel is explained in Section 2.2. The phases of masking and dilation are presented in Section 2.3.

2.1. HSV and its applications in the methodology

The main colour models are RGB, HSV, CIELAB, CMYK, and XYZ. Colour models other than RGB are employed to realise different objectives because several fundamental issues cannot be addressed using the additive RGB colour model for image segmentation; for example, it is not viable to obtain the luminance of the image as perceived by humans. For instance, the CIELAB colour space, which is close to human visual perception, has been applied to H&E-stained microscopical images to correct the Köhler illumination problem (Kuru, 2014). Likewise, HSV provides a close representation of human visual perception of colour in a cylindrical-coordinate representation as illustrated in Fig. 4, whereas the RGB colour model represents the colour processing in the human biological visual system (Loesdau et al., 2014). HSV stands for i) the hue, which corresponds to the angle (from red at 0°, to green at 120° and blue at 240°, and then back to red again at 360°), more explicitly, moving from red to yellow to green to cyan to blue to magenta and back to red, ii) the saturation, which corresponds to the distance from the axis (i.e., radius) and indicates the colourfulness of the colour relative to its brightness, and iii) the value, which indicates the luminance or intensity.

In HSV, the hue component has the most control over the colour information compared to the other components, whereas the saturation designates the colourfulness relative to the brightness, based on the amount of light an area appears to absorb and how much light it seems to be emitting. The saturation characteristics of ISMMMOs are significantly different from those of the sea background and maritime animals, as explained earlier. Therefore, we process the chromatic hue and saturation components to reveal the artificial objects not belonging to the natural marine environment. First, the hue component is shifted by 180° to turn the blueish background into reddish (Fig. 4) as shown in the examples in Figs. 11c, 12c, 13c, 14c, 15c and in the technical reports in the supplements. Second, the more saturated sections of the image are made more distinctive as explained in Section 2.2.
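The hue shift can be illustrated with a short sketch. It assumes hue is stored in [0, 1), as Python's standard `colorsys` module does, so a 180° rotation is an addition of 0.5 modulo 1; the helper name `shift_hue` and the sample pixel are hypothetical.

```python
import colorsys

def shift_hue(h, degrees=180.0):
    """Rotate a hue given in [0, 1) by `degrees` around the colour wheel."""
    return (h + degrees / 360.0) % 1.0

# A bluish sea-like pixel (hypothetical values) with a hue of about 210 degrees.
h, s, v = colorsys.rgb_to_hsv(0.2, 0.5, 0.8)
h_shifted = shift_hue(h)                        # about 30 degrees
r, g, b = colorsys.hsv_to_rgb(h_shifted, s, v)  # red now dominates over blue
```

Saturation and value are untouched by the rotation, so only the dominant colour of the background changes.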

2.2. Dynamic thresholding in S channel

It is observed that the closer the values of the S histogram are to the centre (i.e., the axis), with respect to the distribution of the histogram, the likelier the pixels are to represent the background and natural marine life; conversely, the further these values lie from the axis, meaning that the saturation is greater, the likelier they are to represent ISMMMOs. However, no single value makes this separation distinct, because the features of the images differ with the circumstances of acquisition, mainly the lighting at different times of the day, month and season, and the type of camera. Furthermore, the distribution of the histogram values plays a major role in representing the characteristics of the image regarding the colourfulness relative to the brightness with which an area radiates a varying amount of light, as explained in Section 1. The objective is to separate more saturated regions from less saturated ones to determine if there is an unnatural object. All threshold values and necessary parameters need to be determined based on the distribution and features of datasets in many surveys without any user intervention due to the maritime dynamics and image capturing techniques. It is worth emphasising that the saturation values are almost normally distributed, following the Gaussian function in Eq. 1. The exact distribution of data points using this Gaussian function is presented in Fig. 5 with respect to σ.
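Eq. 1 is not reproduced in this text. Assuming it is the standard Gaussian (normal) probability density with mean μ and standard deviation σ, it would read:

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right)
```

For a normal distribution, about 68.27% of the values lie within μ ± σ, about 95.45% within μ ± 2σ, and about 99.73% within μ ± 3σ, which is why cut-off points expressed as multiples of σ on either side of μ are a natural way to isolate unusually saturated pixels.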

In the first instance, a viable threshold value that separates more saturated regions from less saturated ones is found using 145 images with ISMMMOs and 5000 images with no ISMMMOs, which were acquired from 22 aerial surveys as shown in Fig. 3 I. A ROC curve is an ideal figure to observe how the classification model performs at various classification cut-off points using TPR (True Positive Rate) and FPR (False Positive Rate, i.e., 1 - TNR) (Table 3). Hence, a ROC curve is established using a set of 17 threshold values (i.e., cut-off points) for the purpose of determining the optimal cut-off point, which is the point of the curve nearest to the upper left-hand corner. The results are shown in Table 4. The optimum cut-off point is found to be 0.17, which lies between the cut-off points of 0.15 and 0.20 as displayed in Fig. 6; it results in a sensitivity (Se) of 0.856 (i.e., TP = 124), a specificity (Sp) of 0.817 (i.e., TN = 817) and an accuracy (ACC) of 0.80. However, these outcomes are far from our objective of separating images with ISMMMOs from others within large-scale surveys with higher accuracy rates. In other words, in order to achieve the desired separation (i.e., Se > 0.95, Sp > 0.95, and ACC > 0.95), a curve that is much closer to the top left-hand corner of the ROC figure is required, where the area under the ROC curve (i.e., AUC) increases, which is a desirable outcome for a test.
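The selection rule described above, choosing the ROC point nearest to the upper left-hand corner, can be sketched as follows. The per-image scores, the cut-off grid and the function names are hypothetical; the text does not specify how a single score per image is derived, so the sketch simply assumes that one saturation-based score per image is available.

```python
import math

def roc_points(scores_pos, scores_neg, cutoffs):
    """Return (cutoff, TPR, FPR) per cut-off; a score >= cutoff counts as positive."""
    points = []
    for c in cutoffs:
        tp = sum(1 for x in scores_pos if x >= c)
        fp = sum(1 for x in scores_neg if x >= c)
        points.append((c, tp / len(scores_pos), fp / len(scores_neg)))
    return points

def best_cutoff(points):
    """Pick the point closest to the upper left-hand corner (FPR = 0, TPR = 1)."""
    return min(points, key=lambda p: math.hypot(p[2], 1.0 - p[1]))

# Hypothetical per-image saturation scores, not taken from the surveys.
positives = [0.12, 0.22, 0.30, 0.41, 0.55]        # images with ISMMMOs
negatives = [0.05, 0.08, 0.10, 0.14, 0.16, 0.19]  # blank images
cutoffs = [0.05, 0.10, 0.15, 0.20, 0.25]

points = roc_points(positives, negatives, cutoffs)
cutoff, tpr, fpr = best_cutoff(points)  # sensitivity = tpr, specificity = 1 - fpr
```

Euclidean distance to the corner is one common way to formalise "nearest to the upper left"; other criteria, such as maximising TPR - FPR, can select a different cut-off on the same curve.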

The saturation varies significantly, in particular, from one survey to another based on the changing conditions as mentioned above and demonstrated in the technical reports in the supplements with many examples (see end note 4). Therefore, various ROC curves are designed for the several most important sections of the histogram concerning the distribution of the saturation and the Se and Sp values, by determining the required number of dynamic cut-off points for increasing the Se and Sp values significantly. Technically speaking, i) the mean values (μ) and standard deviations (σ) are acquired after the histograms of the S components are obtained from the 145 images mentioned earlier, ii) the images are classified based on their μ values and iii) those classes are analysed separately to find the best cut-off points for each class. The sections on which the ROC curves are analysed are depicted in Figs. 7, 8, 9, 10 and in Tables 5, 6, 7 and 8 based on the distribution of the histogram using the statistical analysis of the μ and σ values, where the cut-off points on the ROC curves are specified as multiples of σ in both directions from μ (Fig. 5).

The number of cut-off points for each class is specified based on the distribution of the histogram values. This analysis is mainly carried out to find out i) whether there is an evident saturated region in the image that distinctively differs from the majority of the other regions regarding the features of saturation and, most importantly, ii) which cut-off points best make this distinction, resulting in higher Se and Sp values. The histogram values beyond the obtained best cut-off points are moved to the outermost side of the radius in the S channel, i.e., set to 1, to make the most saturated sections more distinct; in other words, the probable ISMMMOs are made visible using the masking and dilation techniques mentioned in Section 2.3. Several examples are presented in Figs. 11, 12, 13, 14 and 15. The observed best cut-off points for the analysed sections along with their Se and Sp values in those ROC curves are summarised in Table 9. The implementation of dynamic thresholding is presented in Algorithm 1 and exemplified in Figs. 11d, 12d, 13d, 14d, 15d with several examples along with H shifting, whose new histograms are presented in the second rows of Figs. 11b, 12b, 13b, 14b, 15b.
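As a rough illustration of the idea, and not a reproduction of Algorithm 1 (which is in Appendix A and not included here), the sketch below places a cut-off at μ + kσ of the S channel and sets every value at or above it to 1. The multiplier `k`, the function names and the sample values are hypothetical; in the paper the cut-off points are chosen per class of images from the ROC analysis summarised in Table 9.

```python
import statistics

def dynamic_cutoff(s_values, k=2.0):
    """Cut-off placed k standard deviations above the mean saturation (capped at 1)."""
    mu = statistics.fmean(s_values)
    sigma = statistics.pstdev(s_values)
    return min(mu + k * sigma, 1.0)

def stretch_saturation(s_values, cutoff):
    """Set saturation values at or above the cut-off to 1; leave the rest unchanged."""
    return [1.0 if s >= cutoff else s for s in s_values]

# Hypothetical S-channel values: a dull background plus two vivid pixels.
s_channel = [0.10, 0.12, 0.11, 0.09, 0.13, 0.10, 0.12, 0.11, 0.80, 0.85]
cut = dynamic_cutoff(s_channel, k=1.0)
s_out = stretch_saturation(s_channel, cut)  # only the two vivid pixels become 1.0
```

Because the cut-off is derived from each image's own μ and σ, it adapts to lighting and camera differences between surveys instead of relying on one fixed value.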

The methodology was developed using the characteristics and distribution of 22 surveys with around 3 million large-scale images that were acquired in various geographical regions, time zones and seasons. The images with no ISMMMOs were exploited to obtain the general characteristics of the background, whereas the images with ISMMMOs were used to determine the general characteristics of ISMMMOs. Both sets of features are merged in the methodology to distinguish the ISMMMOs from their background and consequently discern the positive images from the negative images for further analysis.

2.3. Masking and dilation

Two masks are applied to the image obtained from the dynamic thresholding technique on the H and S channels mentioned above: one detects the blueish part and the other removes the unwanted background parts from the image. The first mask, (ImgR < 0.25 & ImgG < 0.80 & ImgB == 1) & (ImgR < ImgB & ImgG < ImgB), makes the blueish sections visible by suppressing all other sections, in particular, the reddish parts that dominantly indicate the sea background, as depicted in Figs. 11e, 12e, 13e, 14e, 15e and in our technical reports. After applying this mask, the obtained image is dilated and holes are filled to make ISMMMOs coherent as shown in Figs. 11f, 12f, 13f, 14f, 15f. This process is mainly performed to recover the completely white areas of objects that are not obtained with the proposed technique, as elaborated in Sections 5 and 7.
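The first mask and the dilation step can be sketched in plain Python as below. The mask condition follows the expression given in the text, with channels scaled to [0, 1]; the 3x3 square structuring element, the function names and the toy image are assumptions, and hole filling is omitted for brevity.

```python
def blue_mask(img):
    """Apply the first mask to an image given as rows of (r, g, b) tuples in [0, 1]."""
    return [[(r < 0.25 and g < 0.80 and b == 1.0) and (r < b and g < b)
             for (r, g, b) in row] for row in img]

def dilate(mask):
    """Binary dilation with a 3x3 square structuring element (an assumed choice)."""
    rows, cols = len(mask), len(mask[0])
    out = [[False] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            if mask[y][x]:
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < rows and 0 <= nx < cols:
                            out[ny][nx] = True
    return out

# Hypothetical 5x5 image after H shifting and S thresholding:
# a reddish background with a single blueish candidate pixel in the centre.
RED, BLUE = (1.0, 0.2, 0.1), (0.1, 0.5, 1.0)
image = [[RED] * 5 for _ in range(5)]
image[2][2] = BLUE

mask = blue_mask(image)
grown = dilate(mask)  # the single detected pixel grows into a 3x3 block
```

Dilation thickens and joins nearby detections so that fragments of one object merge into a coherent region before further filtering.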

There might be several small unwanted dots that are not part of the ISMMMOs after applying the first mask, usually a product of glinting sections after the HSV processing phase. Around 20% of the blank images come up with similar small dots, usually after dilation and filling holes in the image, as depicted in Fig. 16h. Several examples of these types of processed images can be found in our technical report (e.g., examples 4, 6, 7, 15, 17, 19, 20 in MarineObjects_Man-made_Technical_Blank_1.pdf) in the supplements. These small sections are much smaller than the ISMMMOs and are discarded by applying a size mask technique. In the last phase of the implementation, the images with detected ISMMMOs are labelled as positive images and placed in a separate directory by the application for further analysis.
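One way to realise such a size mask is to label connected regions and drop those below a minimum pixel count, as in the sketch below. The 4-connectivity, the `min_size` value and the toy mask are assumptions for illustration; the text does not state the size threshold used.

```python
from collections import deque

def remove_small_regions(mask, min_size):
    """Keep only 4-connected regions of True pixels with at least min_size pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    out = [[False] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill collects one connected region.
            region, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) >= min_size:
                for ry, rx in region:
                    out[ry][rx] = True
    return out

# Hypothetical mask: one 3x3 object and one isolated glint-like dot.
grid = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
mask = [[bool(v) for v in row] for row in grid]
cleaned = remove_small_regions(mask, min_size=4)
is_positive = any(any(row) for row in cleaned)  # image is labelled positive if any region survives
```

An image whose mask is empty after this step would be treated as blank, while any surviving region marks it as a positive image for further analysis.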

End notes

1. https://apem-inc.com

2. https://www.apemltd.co.uk

3. APEM Ltd. is an environmental company and proposes novel solutions for environmental problems (https://www.apemltd.co.uk).

4. The reports from 1 to 7 titled MarineObjects_Man-made_Supplement are for ISMMMOs and the reports from 1 to 5 titled MarineObjects_Man-made_Supplement_Blank are for blank images.

The paper is republished with authors’ permission. To be concluded in next issue.