Journey of AI to Generative AI and How It Works
This article discusses the basics of AI/ML, its usage, the evolution of Generative AI, Prompt Engineering, and LangChain.
In the last few years, cutting-edge technologies and services have drastically changed their directions, dynamics, and use cases. It is quite evident that the recent wave of global technology adoption by industries is dominated by Artificial Intelligence (AI) and its various flavors. AI is becoming increasingly woven into the fabric of our everyday lives, changing the way we live and work.
What Are AI and ML?
AI is the capability of simulating human intelligence and thought processes such as learning and problem-solving. It can perform complex tasks that historically could only be done by humans. Through AI, a non-human system uses mathematical and logical approaches to simulate the reasoning that people use for learning new information and making decisions.
Artificial intelligence has a wide range of capabilities that open up a variety of impactful real-world applications. Some of the most common AI capabilities used today include pattern recognition, predictive modeling, automation, image recognition, and personalization. In some cases, advanced AI can even drive cars or play complex games like chess or Go.
Machine learning is a subset of AI that uses algorithms trained on data to produce models for performing such complex tasks. Today, most AI is performed using machine learning, so the terms AI and ML are often used synonymously. A subtle difference: AI refers to the general concept of creating human-like cognition using computer software and systems, while ML refers to only one method of doing so. ML enables a computer system to continue learning and improving on its own based on experience.
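To make the "learning from data" idea concrete, here is a minimal sketch using scikit-learn; the dataset and model choice are purely illustrative.

```python
# A minimal "learning from experience" sketch using scikit-learn.
# The data below is toy data chosen only for illustration.
from sklearn.linear_model import LogisticRegression

# Features: [hours_studied, hours_slept]; labels: 1 = passed, 0 = failed
X = [[1, 4], [2, 8], [6, 7], [8, 6], [9, 8], [3, 5]]
y = [0, 0, 1, 1, 1, 0]

model = LogisticRegression()
model.fit(X, y)                  # "training": fit model parameters to the data

print(model.predict([[7, 7]]))   # predict a label for an unseen example
```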
Benefits
AI and ML bring a wide variety of benefits to both businesses and consumers. While businesses can expect reduced costs and higher operational efficiency, consumers can expect more personalized services and suggestions. Some of the key benefits are:
- Handling large, diverse data and analyzing it through mathematical models and predicting actionable insights — leading to decreased operational and support costs.
- Improved MTTR (mean time to repair) and better attainment of RTO (recovery time objective) targets.
- Improved customer satisfaction and experiences that can be tailored to meet personalized customer needs.
What Is Generative AI?
Generative AI (popularly known as GenAI) uses a set of algorithms that enable users to quickly generate new content based on a variety of inputs. The generated content includes, but is not limited to, text, images, soundtracks, and videos. The algorithms are built on top of large language models trained on vast amounts of unlabeled data. One of the breakthroughs with generative AI models is the ability to leverage different learning approaches, including unsupervised or semi-supervised learning, for training.
Difference Between Traditional AI and Generative AI
The main difference between traditional AI and generative AI lies in their capabilities and applications. Traditional AI excels at pattern recognition, while generative AI excels at pattern creation. Traditional AI can analyze data and tell you what it sees, but generative AI can use that same data to create something entirely new. Both traditional AI and generative AI use machine learning algorithms to obtain their results, but they have different goals and purposes.
FM and LLM
Foundation Model (FM)
Foundation models serve as the base for more specific applications. They are AI models designed to produce a wide and general variety of outputs. They are capable of a range of possible tasks and applications, such as text, image, or audio generation. In other words, the original model provides a base (hence “foundation”) on which other things can be built. Typical examples of foundation models include many of the same systems listed as LLMs.
Large Language Model
A large language model (LLM) is a machine learning model that uses deep learning algorithms with neural network techniques to process and understand natural language. These models are trained on massive amounts of text data to learn patterns and entity relationships in the language. They are pre-trained on billions of tokens using self-supervised and semi-supervised learning. These models can capture the complex entity relationships in the text at hand and can generate text using the semantics and syntax of the language in scope. Popular uses of LLMs include text generation, machine translation, summarization, image generation from text, code generation, chatbots, and conversational AI.
Large language models (LLMs) fall into a category called foundation models. While language models take language input and generate synthesized output, foundation models can work with multiple data types; many are multimodal, meaning they work in modes beyond language.
“Foundation model (FM)” is often used synonymously with “large language model (LLM)” because language models are currently the clearest example of systems with broad capabilities that can be adapted for specific purposes. The relevant distinction between the terms is that “large language model” specifically refers to language-focused systems, while “foundation model” stakes out a broader, function-based concept that can stretch to accommodate new types of systems in the future.
How GenAI Works
Generative AI is powered by LLMs (large language models) as well as foundation models (FMs), machine learning models that are pre-trained on vast amounts of data.
Note: This article is aimed specifically at LLM and its relevance with Generative AI.
The term “large” refers to the number of values (parameters) the model can change autonomously as it learns. Some of the successful LLMs have trillions of parameters.
An LLM uses self-supervised and/or unsupervised learning to predict the next token in a sentence, given the surrounding context. LLMs typically follow a transformer-based architecture that uses self-attention mechanisms to compute a weighted sum over an input sequence and dynamically determine which tokens in the sequence are most relevant to each other. This helps identify relationships between words in a sentence regardless of their position in the text sequence.
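As a rough illustration of that weighted-sum idea, the NumPy sketch below implements a single scaled dot-product self-attention step over toy embeddings; it omits the learned query/key/value projections of a real transformer.

```python
import numpy as np

def self_attention(X):
    """Single-head scaled dot-product self-attention over a toy sequence.

    X: (seq_len, d) matrix of token embeddings. For simplicity, the learned
    query/key/value projections are omitted (identity projections).
    """
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                     # pairwise token relevance
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over each row
    return weights @ X                                # weighted sum of values

# 4 tokens, embedding size 8, random toy embeddings
X = np.random.default_rng(0).normal(size=(4, 8))
print(self_attention(X).shape)   # (4, 8): one context-aware vector per token
```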
LLMs are used for few-shot and zero-shot scenarios. Both few-shot (very limited labeled data) and zero-shot (no labeled data) approaches require the AI model to have good inductive bias and the ability to learn useful representations from limited (or no) data.
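The distinction is easiest to see in the prompts themselves. Below is an illustrative sketch; the sentiment-classification task and examples are made up for demonstration.

```python
# Zero-shot: the model receives only an instruction, with no labeled examples.
zero_shot = (
    "Classify the sentiment of this review as positive or negative.\n"
    "Review: The battery died after two days.\nSentiment:"
)

# Few-shot: a handful of labeled examples precede the actual query.
few_shot = (
    "Review: I love this phone. Sentiment: positive\n"
    "Review: The screen cracked in a week. Sentiment: negative\n"
    "Review: The battery died after two days. Sentiment:"
)
```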
Pricing of LLM
Usage of generative AI and its supporting models is billed based on the volume of input received and the volume of output generated. Different LLM providers price their models and volumes differently, some by tokens and others by characters, but the concepts remain related. A token is a unit of measure representing approximately four characters. A character is a single letter, number, or symbol. For example, the word “footballer” has ten characters and corresponds to roughly two to three tokens. In general, pricing is quoted per 1K tokens or characters.
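Using that rough rule of one token ≈ four characters, a back-of-the-envelope cost estimate looks like the sketch below. The per-1K-token rates are placeholder assumptions, not any provider's actual prices.

```python
# Back-of-the-envelope LLM cost estimate. The rates are hypothetical
# placeholders; real providers publish their own per-1K-token prices.
INPUT_RATE_PER_1K = 0.0015    # $ per 1K input tokens (assumed)
OUTPUT_RATE_PER_1K = 0.0020   # $ per 1K output tokens (assumed)

def estimate_tokens(text: str) -> float:
    return len(text) / 4       # rough rule: ~4 characters per token

def estimate_cost(prompt: str, completion: str) -> float:
    return (estimate_tokens(prompt) / 1000 * INPUT_RATE_PER_1K
            + estimate_tokens(completion) / 1000 * OUTPUT_RATE_PER_1K)

print(estimate_tokens("footballer"))   # 10 characters -> ~2.5 tokens
```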
Core Components in LLM
- Input data: Large volume of unstructured data
- Tokenization: Tokenization breaks a text document down into smaller units called tokens; a tokenizer maps between text and lists of integers. The goal is to convert an unstructured text document into numerical data suitable for predictive and/or prescriptive analytics. Tokenizers generally output integers in the range {0, 1, 2, ..., V-1}, where V is the vocabulary size. Another function of tokenizers is text compression, which saves computation time. (A toy sketch of tokenization and embedding lookup follows this list.)
- Embedding: Embeddings are vectors or arrays of numbers that represent the meaning and the context of the tokens that the model processes and generates. The embeddings are then used as inputs to NLP models, allowing the models to understand the meaning of the words in the text.
- Transformer: It enables the LLM to understand and recognize the relationships and connections between words and concepts using a self-attention mechanism. That mechanism assigns a score, commonly referred to as a weight, to a given item (called a token) in order to determine the relationship. The transformer has two parts, an encoder and a decoder, which can be used independently or together: encoder-only, decoder-only, or encoder-decoder. The encoder uses self-attention, while the decoder uses both self-attention and, in encoder-decoder setups, cross-attention over the encoder's output.
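To make the tokenization and embedding steps concrete, here is the toy sketch referenced above: a whitespace "tokenizer" over a six-word vocabulary and a random embedding table. Real LLMs use learned subword tokenizers (e.g., BPE or WordPiece) and learned embedding matrices.

```python
import numpy as np

# Toy vocabulary and whitespace tokenizer. Real LLMs use learned subword
# tokenizers (BPE/WordPiece) with vocabularies of tens of thousands of tokens.
vocab = {"<unk>": 0, "generative": 1, "ai": 2, "creates": 3, "new": 4, "content": 5}

def tokenize(text):
    return [vocab.get(w, vocab["<unk>"]) for w in text.lower().split()]

token_ids = tokenize("Generative AI creates new content")
print(token_ids)                           # [1, 2, 3, 4, 5]

# Embedding lookup: each token id indexes a row in a (V x d) matrix.
# Here the table is random; in a trained model, it is learned.
d = 8
embedding_table = np.random.default_rng(0).normal(size=(len(vocab), d))
embeddings = embedding_table[token_ids]    # shape: (5, 8)
print(embeddings.shape)
```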
Generative AI in Industries
The implementation of Generative AI in each of the industry use cases requires a combination of different LLM/FM and underlying models. Below are a few use cases:
- Media and entertainment: Bring more creativity to a range of areas, from video games to film, animation, world-building, and virtual reality.
- Automotive industry: Build 3D models for simulations and car designs based on a text description given by the designer. Generative AI can also create synthetic data to train autonomous vehicles.
- Healthcare industry: Generative models can aid in medical research by developing new protein sequences to aid in drug discovery. Practitioners can also benefit from the automation of tasks such as scribing, medical coding, medical imaging, and genomic analysis.
- Image generation: Transform text into images and generate realistic images based on a setting, subject, style, or location that they specify.
- Semantic image-to-photo translation: Based on a semantic image or sketch, it is possible to produce a realistic version of an image.
- Music generation: Generative AI is also purposeful in music production. Music-generation tools can be used to generate novel musical materials for advertisements or other creative purposes.
- Personalized content creation: Generate personalized content (text, images, music, etc.) based on a user's personal preferences, interests, or memories.
- Sentiment analysis/text classification: Sentiment analysis, which is also called opinion mining, uses natural language processing and text mining to decipher the emotional context of written materials.
- Code generation: Produce code without the need for manual coding. A very minimal manual update may be required to deploy the code, optimizing the development lifecycle.
- Data synthesis: Create synthetic data that is similar in statistical properties to real-world data but is not necessarily based on any specific real-world data points.
- Chatbots and virtual assistants: Generate responses to user input in the form of natural language. Provide information, answer questions, or perform tasks for users through conversational interfaces such as chat windows or voice assistants.
- Content summarization: Generative AI can process large documents and bring out meaningful summarized content. This enables the automatic generation of concise and accurate summaries of policy documents, reports, and legislative updates.
- Streamlined drug discovery and development: Finding potential drug candidates and testing their efficacy with computer simulations could vastly expedite the process of discovering new drugs, from preclinical trials on animals to clinical tests on humans.
Popular LLMs
Some of the most popular LLMs are shown below:
| Name | By | Architecture Type |
| --- | --- | --- |
| GPT-3, GPT-3.5, GPT-4 | OpenAI | Autoregressive transformer decoder model |
| BERT | Google | Transformer encoder model |
| BLOOM | BigScience (coordinated by Hugging Face) | Decoder-only transformer, autoregressive model |
| PaLM (Pathways Language Model)/PaLM 2 | Google | Decoder-only transformer model |
| Gemini | Google | Multimodal model merging an encoder and a decoder |
| DALL-E/DALL-E 2 | OpenAI | Encoder-decoder architecture; multimodal implementation built on GPT-3 |
| LLaMA | Meta | Decoder-only transformer architecture |
Prompt Engineering
Prompt engineering is the art of developing prompts (e.g., questions to be asked) that can guide an LLM to perform specialized tasks more accurately. It involves the instructions and context passed to a language model to achieve a desired output. It is an AI engineering technique for refining LLMs with specific prompts and recommended outputs, and also the process of refining inputs to various generative AI services to generate text or images.
Prompts can be of two types:
- Hard prompt: Hard-coded, hand-crafted by humans
- Soft prompt: Built/generated by AI
Soft prompts are better suited to complex questions. The prompts themselves can be generated using AI models. Prompt engineering has become a valuable skill for AI engineers to use language models efficiently.
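Conceptually, a hard prompt is literal text, while a soft prompt is a small set of trainable vectors in embedding space that is prepended to the input. The sketch below is purely illustrative; in real prompt tuning, the soft-prompt vectors are optimized by gradient descent while the LLM's own weights stay frozen.

```python
import numpy as np

d = 8                                    # embedding size (toy value)
rng = np.random.default_rng(0)

# Hard prompt: literal, human-written text that gets tokenized as usual.
hard_prompt = "Summarize the following support ticket in one sentence:"

# Soft prompt: N trainable vectors in embedding space, with no readable text.
soft_prompt = rng.normal(size=(20, d))   # 20 "virtual tokens"

input_embeddings = rng.normal(size=(50, d))               # embedded user input
model_input = np.vstack([soft_prompt, input_embeddings])  # prepend soft prompt
print(model_input.shape)                 # (70, 8)
```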
Prompt Engineering and Way of Working
This skill of building prompts (relevant questions) is being widely adopted by business users and operations SMEs for creating FAQs and troubleshooting guides for various products and applications. On the other hand, GenAI and LLMs are being used together to derive precise answers to these prompts. There are primarily two ways to do that.
First, send the prompt questions to a specific LLM, run the model, and get answers. The LLM will run on its available data (gathered from the internet) to give the best answers. However, this is likely to result in hallucination or bias, as the LLM may not have been trained on data relevant to the context of the questions.
Second, contextualize the LLM by fine-tuning. The word "context" here means the LLM should have reference to the specific information of the product/domain the user is dealing with. Fine-tuning an entire LLM is a complex and costly affair. The other way is to feed specific contexts externally (as embeddings) into the LLM and then use a specific model to build inferences based on this embedded data.
At a high level, the workflow of prompt engineering with embeddings is: split the domain-specific documents into chunks, convert the chunks into embeddings, store them in a vector store, retrieve the chunks most relevant to the user's prompt, and pass them to the LLM as context for generating the answer.
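A minimal sketch of this embedding-based approach is shown below, using the classic LangChain 0.0.x-era API with a FAISS vector store (pip install faiss-cpu is assumed, along with an OPENAI_API_KEY in the environment); the documents and question are placeholders.

```python
# Sketch: embed domain documents, retrieve relevant context, then answer.
# Assumes langchain 0.0.x-era APIs, faiss-cpu, and OPENAI_API_KEY set.
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import OpenAI

docs = [
    "Product X supports SSO via SAML 2.0.",
    "Product X backups run nightly at 02:00 UTC.",
]

# Step 1: Embed the documents and index them in a vector store.
db = FAISS.from_texts(docs, OpenAIEmbeddings())

# Step 2: Retrieve the chunks most relevant to the user's prompt.
question = "When do backups run for Product X?"
context = db.similarity_search(question, k=1)[0].page_content

# Step 3: Pass the retrieved context to the LLM alongside the question.
llm = OpenAI(temperature=0)
print(llm(f"Answer using this context:\n{context}\n\nQuestion: {question}"))
```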
Example of Prompt Engineering Using LangChain
LangChain is an open-source framework for building robust LLM-powered applications. It provides a set of libraries that help users code and integrate with LLM models via API. It provides various classes for Prompt templates and environment configurations. LangChain doesn’t exactly provide models itself, but it does allow users to integrate the models from other LLM providers. LangChain works by providing developers access to seven components that enhance the powers of LLMs, namely schema, models, prompts, indexes, memory, chains, and agents.
Below is a code snippet using LangChain and prompts.
# Prerequisites (this example targets the classic openai<1.0 and
# langchain 0.0.x APIs that were current when this article was written)
# pip install openai langchain

# Step 1: Import the required LangChain libraries
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.chains import LLMChain
import openai

# Step 2: Set the API key of the LLM provider
import os
os.environ['OPENAI_API_KEY'] = '<API-KEY>'  # one must create an API key as a prerequisite

user_input = input("Enter a concept: ")

# Step 3: Define the prompt template
prompt = PromptTemplate(
    input_variables=["subject"],
    template="write me an essay about {subject}",
)

# Step (optional): Print the formatted prompt
print(prompt.format(subject=user_input))

# Step 4: Instantiate the LLMChain
# The default model is text-davinci-003; the commented constructor below
# can be used instead to pass the model name explicitly.
# llm = OpenAI(temperature=0.9, model_name="text-davinci-003")
llm = OpenAI(temperature=0.9)
chain = LLMChain(llm=llm, prompt=prompt)

# Step 5: Run the LLMChain
output = chain.run(user_input)
print(output)

# === Summarize the output using a second prompt ===
# Step 6: Summarize via the OpenAI completion API directly
response = openai.Completion.create(
    engine="text-davinci-003",
    prompt=f"Summarize in 200 words:\n{output}",
    max_tokens=1024,
    temperature=0.5,
    n=1,
    stop=None,
)
print("summary is below\n")
print(response.choices[0].text.strip())
IBM and GenAI Space
With the evolution of Generative AI and its applicability exponentially increasing, IT leaders are aiming to be front runners by offering their enterprise clients solutions built on AI and Generative AI. Based on market studies and surveys, it was observed that one of the most preferred areas (for GenAI applicability) by clients is ITSM (IT Service Management) — aimed at predicting critical issues/problems, incident avoidance, optimizing/reducing OpEx, and improving MTTR and productivity.
Generative AI can enhance IT Operations and Service Management (ITSM) capabilities, especially in the context of Incident, Problem, Change, and Configuration Management. The successful application of generative AI in ITSM will hinge upon a strong understanding of the business context, the IT services being managed, the ITSM processes, and the interactions between them. This understanding should be reflected in the design and implementation of the AI system. Along with choosing the right platform and AI Models, the success of generative AI in IT Operations also depends on having clean, high-quality data and skills to manage, train/tune, and maintain AI models.
IBM has developed multiple assets in the space of AI-based operations management backed by Watsonx as well as custom models. These assets, along with IBM Control Tower (a combination of data collectors for ITSM tools, performance monitoring tools, DevSecOps, and ALM tools, Watsonx custom-built AI/ML models, and visualization), are deployed across multiple clients. Some of the use cases are given below:
- Incident management: Generative AI is used to automatically prioritize and route incidents based on the incident descriptions and other attributes. It can also suggest potential solutions by drawing parallels with historical incidents.
- Incident categorization and prioritization: Analyze the nature of incidents and historical data to categorize and prioritize incidents accurately. This helps in faster resolution of critical incidents and better resource allocation, improving MTTR.
- Predictive incident management: By analyzing historical data and current system health, generative AI can predict potential incidents and suggest preventive actions. This helps in reducing system downtime and improving service availability.
- Problem management: Analyze multiple incident patterns to identify underlying problems. It can generate problem hypotheses and simulate a range of scenarios based on historical problem data to help ITSM teams with root cause analysis and problem resolution. Extensive usage of this can be in ChatOps/virtual war rooms for IT Ops, where SMEs/SREs/support teams are automatically brought in and provided with predictive analysis and solutions.