Published: 9 January 2024 at 09:07
On 14 and 15 November, I had the privilege of representing the Centre for Access to Justice & Inclusion (CAJI) in a series of meetings in Geneva that brought together representatives from civil society organisations, academia, companies and governments to discuss human rights issues in the digital environment, with a particular focus on AI. This is the second in a series of three blog posts setting out some key conclusions and trends. In this post I would like to demystify some concepts that are currently used in the public debate on, and the regulation of, AI.
In recent years, the field of artificial intelligence has witnessed a surge in the regulation of AI technologies, accompanied by the emergence of concepts such as large information models, foundation models, and generative AI. However, amid this regulatory boom, certain concepts remain unclear, potentially leading to challenges. This blog aims to unravel these intricacies, bridging the gap between the technical sphere and regulatory frameworks. We'll delve into the significance of large information models, foundation models, and the onset of generative AI, shedding light on their applications and implications.
To comprehend the significance of large information models and foundation models, it's essential to grasp the fundamentals of machine learning. At its core, machine learning focuses on building software capable of performing tasks by learning from data. The goal is to create a machine, often referred to as a model, that can execute specific functions, such as identifying spam emails, recognizing images, suggesting movies or translating languages. The difficulty arises when explicit rules for these tasks are hard to define.
In traditional programming, a developer writes explicit rules by hand, which is cumbersome and impractical for complex tasks like image recognition or language translation. Machine learning replaces those hand-written rules with training data. Training data consists of examples demonstrating how to solve a particular task, with inputs and corresponding outputs. For instance, to train a spam email detector, one needs a dataset with examples of emails labelled as either spam or not.
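To make the idea of learning from labelled examples concrete, here is a minimal sketch of a spam detector in Python. It is an illustration only: the six example emails are invented, the function names `train` and `classify` are hypothetical, and the technique shown (a simple Naive Bayes word-count classifier) is one of many possible choices, not the method used by any particular product.

```python
import math
from collections import Counter

# Hypothetical toy training data: (email text, label) pairs.
# A real detector would need thousands of labelled emails.
TRAINING_DATA = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("claim your free prize", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch on monday with the team", "not spam"),
    ("please review the project agenda", "not spam"),
]


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the label whose training examples best match the words in text."""
    vocabulary = {word for counts in word_counts.values() for word in counts}
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        # Start from how common the label is, then add evidence word by word.
        score = math.log(label_counts[label] / total_examples)
        total_words = sum(counts.values())
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


word_counts, label_counts = train(TRAINING_DATA)
print(classify("free prize now", word_counts, label_counts))
print(classify("agenda for the team meeting", word_counts, label_counts))
```

Note that no rule such as "emails containing the word free are spam" was written by hand. The behaviour comes entirely from the labelled examples, which is why the quality and quantity of training data matter so much.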
However, acquiring and annotating large volumes of training data is a labour-intensive and expensive process. For tasks like medical image analysis, relying on human experts to annotate data is both time-consuming and costly. This challenge necessitates a shift towards more efficient methods for model training and adaptation.
Foundation models represent a paradigm shift in machine learning. These models are trained on vast amounts of diverse data and can be adapted to perform a wide range of downstream tasks. The distinction between upstream (the foundation model) and downstream (specific applications) is crucial to understanding the adaptability of these models.
Adapting a foundation model involves further training it on a specific task using a smaller dataset. This adaptation process allows these models to excel in various applications without the need for extensive, task-specific training data. Large computational resources, advanced architectures such as the Transformer, and the availability of large-scale training data from the internet play pivotal roles in the success of foundation models.
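The adaptation step can be sketched in a few lines of Python. This is a deliberately simplified illustration under stated assumptions: `pretrained_features` is a hypothetical stand-in for a frozen foundation model (a real one would be a large neural network, not a word list), the four labelled reviews are invented, and only a small task-specific layer is trained on top.

```python
import math

# Hypothetical "knowledge" acquired upstream. In a real foundation model this
# would be encoded in billions of learned parameters, not in two word lists.
POSITIVE_HINTS = {"great", "excellent", "good", "love"}
NEGATIVE_HINTS = {"bad", "terrible", "poor", "hate"}


def pretrained_features(text):
    """Stand-in for a frozen upstream model: turns text into a feature vector."""
    words = text.lower().split()
    n = max(len(words), 1)
    return [
        sum(w in POSITIVE_HINTS for w in words) / n,
        sum(w in NEGATIVE_HINTS for w in words) / n,
        1.0,  # constant bias feature
    ]


def predict(weights, text):
    """Probability that the text is positive, according to the adapted model."""
    z = sum(w * x for w, x in zip(weights, pretrained_features(text)))
    return 1.0 / (1.0 + math.exp(-z))


def adapt(examples, epochs=500, learning_rate=1.0):
    """Train only the small downstream layer; the upstream features stay fixed."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for text, label in examples:
            error = label - predict(weights, text)
            features = pretrained_features(text)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
    return weights


# A small downstream dataset: four labelled reviews (1 = positive, 0 = negative).
DOWNSTREAM_DATA = [
    ("great service", 1),
    ("excellent and good", 1),
    ("terrible delay", 0),
    ("bad and poor", 0),
]

weights = adapt(DOWNSTREAM_DATA)
print(predict(weights, "love this good product") > 0.5)
print(predict(weights, "hate the poor quality") > 0.5)
```

The point of the sketch is the division of labour: the expensive, general-purpose part is reused as it is, and only three numbers are learned from the four downstream examples. That is why adaptation can work with far less task-specific data than training a model from scratch.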
To give a sense of the scale of current large language models (LLMs) such as OpenAI's GPT-3.5 and Google's PaLM 2 (Unicorn): GPT-3.5 is reported to have 175B parameters, while Unicorn is reported to have around 340B. Parameters measure the size of the model rather than the amount of data it was trained on, and in some cases it is difficult to know which data LLMs are actually using (secrecy is usually part of the business model). Yet some studies have examined the online sources used by LLMs and, interestingly, much of the data comes from business and industrial websites, led by fool.com and kickstarter.com.
In conclusion, the evolution from traditional machine learning to foundation models signifies a leap in the adaptability and efficiency of AI systems. By comprehending these concepts, one can navigate the intricate landscape of AI technologies, whether in the technical realm or within regulatory frameworks. The goal is to demystify these concepts, empowering individuals to understand and contribute to the ongoing advancements in the field of artificial intelligence and regulatory measures.
Sebastian Smart