TensorTide — Accelerated Analytics

Abstract

This research proposal presents TensorTide, a novel ICP-based tool for the growing data analysis demands of marine ecology. TensorTide is designed to address shortcomings observed in current tools such as CoralNet, in particular by handling large marine image datasets efficiently, with more than 10,000 photos processed in a single batch. It uses an EfficientNet backbone for image classification, complemented by a Large Language Model (LLM) that supports data interaction and the generation of comprehensive reports. This dual approach combines precise image analysis with data interpretation, with the aim of replicating the analytical proficiency of a marine scientist. The objective of this study is to connect advanced artificial intelligence (AI) with marine science through computer vision and language processing. TensorTide aims to streamline the handling of extensive datasets and the production of comprehensive foundational reports, and thereby to accelerate marine science.

Introduction

Marine ecology is advancing rapidly, notably in data analysis. Modern survey technologies collect a vast number of images per survey, and the resulting datasets pose increasing challenges for established systems such as CoralNet. TensorTide, a cloud-hosted automatic classification tool, is proposed as a potential solution. It uses an EfficientNet backbone, complemented by an LLM for interactive data analysis. TensorTide will be developed to process large volumes of photos efficiently, handling over 10,000 images in a single batch. It will simplify manual annotation and also provide automatic classification.

The tool distinguishes itself through an additional LLM layer that produces comprehensive reports from the inferences made by the vision models. Because multi-modal LLMs accept images as input, TensorTide's domain-specific LLM is intended not only to generate reports from the output of the vision component but also to describe raw images directly. This integration of computer vision and language processing would represent a significant advance for marine research, where tools for efficient data analysis and the production of baseline reports are increasingly important.

A key question of this study is the extent to which the LLM layer in TensorTide can accurately reproduce the analytical capabilities of a marine scientist. This will be investigated by analyzing both contemporary and past data. The potential of TensorTide lies in its capacity to accelerate data processing and analysis in marine science, made possible by the combination of vision and language models. TensorTide is the logical next step to accelerate marine science. TensorTide does not aim to be cutting edge, but bleeding edge.
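
The hand-off from the vision component to the LLM layer described above could be sketched roughly as follows. This is a minimal illustration, not the TensorTide implementation: the function names, the label names, the survey name, and the prompt wording are all hypothetical, and the call to an actual language model is deliberately left out.

```python
from collections import Counter
from typing import Dict, List


def summarize_labels(predictions: List[Dict[str, str]]) -> Dict[str, float]:
    """Turn per-point class predictions into percent cover per label."""
    counts = Counter(p["label"] for p in predictions)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}


def build_report_prompt(survey_name: str, cover: Dict[str, float]) -> str:
    """Format the vision summary as plain text for a language model."""
    lines = [f"Survey: {survey_name}", "Estimated percent cover by class:"]
    for label, pct in sorted(cover.items(), key=lambda kv: -kv[1]):
        lines.append(f"- {label}: {pct:.1f}%")
    lines.append("Write a short baseline ecological report from these figures.")
    return "\n".join(lines)


# Hypothetical classifier output: one predicted label per annotated point.
predictions = [
    {"image": "img_001.jpg", "label": "hard_coral"},
    {"image": "img_001.jpg", "label": "hard_coral"},
    {"image": "img_002.jpg", "label": "macroalgae"},
    {"image": "img_002.jpg", "label": "sand"},
]
cover = summarize_labels(predictions)
prompt = build_report_prompt("Example Reef 2023", cover)
print(prompt)
```

Passing the LLM a compact numerical summary rather than raw per-point predictions keeps the prompt short even when a batch contains thousands of images.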

Feature | CoralNet | TensorTide
Primary Function | Analysis of coral reef images with manual, semi-automatic, and automatic tools | Advanced analysis of large marine image datasets with manual and automated classification, utilizing the LLM layer to generate reports as well as describe raw unprocessed images with domain specific knowledge
Technological Backbone | EfficientNet-B0 backbone used for machine learning | EfficientNet backbone (B0-B7) for image processing, supplemented with a Large Language Model (LLM) layer
Data Handling Capacity | Upload stream cannot handle more than ~1000 images at most | Designed to efficiently process over 10,000 images in a single batch via chunking methods
Innovative Feature | Utilizes transfer learning with a focus on coral reef imagery | Incorporates an LLM layer for enhanced data interaction and comprehensive report generation, and can also include cloud computing to create 3D renders
Target Use-Case | Primarily focused on coral reef imagery analysis | Broad application in marine science, capable of analyzing diverse marine environments and generating detailed ecological reports
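
The chunking mentioned in the comparison above can be illustrated with a short sketch. It assumes, purely for illustration, a chunk size of 1,000 images and a hypothetical list of file names; the proposal does not fix either, and the helper name `chunked` is not part of any existing tool.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], chunk_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most chunk_size items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(items), chunk_size):
        yield list(items[start:start + chunk_size])


# Hypothetical batch of 10,000 image names, split into chunks of 1,000.
image_names = [f"img_{i:05d}.jpg" for i in range(10_000)]
chunks = list(chunked(image_names, chunk_size=1_000))
print(len(chunks), len(chunks[0]), chunks[-1][-1])
```

Each chunk can then be uploaded or classified independently, so a failure affects one chunk rather than the whole batch.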

Literature Review

CoralNet: Overview and Evolution

CoralNet is a notable achievement in marine ecology: a cloud-based platform for the analysis of coral reef photos. It supports manual, semi-automatic, and automatic analysis through web-based workflows and an API. The platform has been widely adopted within the marine research community, with over 3,000 registered users and a collection of over 1.74 million photos accompanied by more than 65 million annotations.

CoralNet is hosted on Amazon Web Services (AWS) and is distinguished by its user-friendly interface, free access, and open-source code. A significant milestone was the release of CoralNet 1.0 in January 2021, which introduced a new machine learning engine that uses transfer learning on an EfficientNet-B0 backbone. CoralNet 1.0, which was developed using a dataset of 16 million labeled patches from benthic photos and a source-specific hierarchical Multi-layer Perceptron classifier, showed a notable improvement over its predecessor, CoralNet Beta, reducing error rates by 18.4% on a hold-out test set of 26 sources.
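
The transfer learning idea described above, a fixed pretrained backbone with a small classifier trained per source, can be sketched in a few lines. This is a toy illustration under stated assumptions, not CoralNet's code: `frozen_features` is a hand-written stand-in for an EfficientNet backbone, a single linear softmax layer replaces the hierarchical Multi-layer Perceptron, and the patches are invented intensity values.

```python
import math
from typing import List, Sequence


def frozen_features(patch: Sequence[float]) -> List[float]:
    """Stand-in for a pretrained backbone; it is never updated below."""
    mean = sum(patch) / len(patch)
    spread = max(patch) - min(patch)
    return [mean, spread, 1.0]  # the constant 1.0 serves as a bias input


def class_probs(weights: List[List[float]], x: Sequence[float]) -> List[float]:
    """Softmax over the linear scores of one feature vector."""
    scores = [sum(w * v for w, v in zip(row, x)) for row in weights]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_loss(weights, feats, labels) -> float:
    """Average cross-entropy of the head on precomputed features."""
    return -sum(math.log(class_probs(weights, x)[y])
                for x, y in zip(feats, labels)) / len(feats)


def train_head(feats, labels, n_classes, lr=0.5, epochs=200):
    """Fit only the classifier head with full-batch gradient descent."""
    dim = len(feats[0])
    weights = [[0.0] * dim for _ in range(n_classes)]
    for _ in range(epochs):
        grads = [[0.0] * dim for _ in range(n_classes)]
        for x, y in zip(feats, labels):
            probs = class_probs(weights, x)
            for c in range(n_classes):
                err = probs[c] - (1.0 if c == y else 0.0)
                for j, v in enumerate(x):
                    grads[c][j] += err * v / len(feats)
        for c in range(n_classes):
            for j in range(dim):
                weights[c][j] -= lr * grads[c][j]
    return weights


def predict(weights, patch) -> int:
    """Return the most probable class index for one raw patch."""
    probs = class_probs(weights, frozen_features(patch))
    return probs.index(max(probs))


# Toy patches (hypothetical pixel intensities): label 0 = dark, 1 = bright.
patches = [
    [0.10, 0.20, 0.15, 0.15], [0.15, 0.25, 0.20, 0.20], [0.20, 0.30, 0.25, 0.25],
    [0.70, 0.80, 0.75, 0.75], [0.75, 0.85, 0.80, 0.80], [0.80, 0.90, 0.85, 0.85],
]
labels = [0, 0, 0, 1, 1, 1]
feats = [frozen_features(p) for p in patches]  # computed once, then reused
initial_loss = mean_loss([[0.0] * 3, [0.0] * 3], feats, labels)
head = train_head(feats, labels, n_classes=2)
final_loss = mean_loss(head, feats, labels)
print(f"loss before: {initial_loss:.3f}, after: {final_loss:.3f}")
```

Because the features are computed once and only the small head is trained, a new source can be fitted quickly without retraining the backbone.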

Large Language Models and MarineGPT

The emergence of Large Language Models (LLMs) such as ChatGPT/GPT-4 has significantly transformed user experiences by providing capable AI assistance. Their progression into multi-modal large language models (MLLMs) has broadened their functionality, enabling them to analyze and interpret several modalities, including visual and textual information. Nevertheless, their use in specialized fields such as marine science has not been thoroughly investigated.

In response to this gap, MarineGPT was introduced as a vision-language model designed expressly for marine-related contexts. In contrast to general-purpose MLLMs, MarineGPT has been fine-tuned to deliver responses that are sensitive, informative, and scientifically rigorous, which is particularly important in the marine domain. The model is built on the Marine-5M dataset, a compilation of more than 5 million pairs of marine images and corresponding textual descriptions, which supports a detailed understanding of marine subject matter. MarineGPT not only helps democratize marine knowledge but also sets a standard for adapting general-purpose AI assistants to specific fields. Its potential extends beyond public education to assisting advanced research and informed decision-making in marine science.

Integration and Gap Identification

Taken together, CoralNet and MarineGPT give a broad picture of the present state of artificial intelligence in marine science. The advances in this field are exemplified by CoralNet's comprehensive image analysis capabilities and MarineGPT's approach to domain-specific language processing. Nevertheless, gaps remain, namely in managing and evaluating the ever-increasing quantities of marine data and in the need for more specialized AI tools.

The proposed approach, TensorTide, seeks to address these gaps. By combining image processing capabilities like those of CoralNet with language capabilities like those of MarineGPT, TensorTide is positioned as a new kind of tool for marine scientists. This integration has the potential not only to make data analysis more efficient but also to raise the quality of insights obtained from extensive marine datasets. Hence, the objective of this study is to build on and extend the functionality established by CoralNet and MarineGPT, and thereby to make a substantial advance in marine ecology.

Methodology

TensorTide Architecture

A high-level architecture of TensorTide is presented here. It shows all components, as well as the processes by which these components will interact with each other.

TensorTide Methodology