How To Create a Question-Answering Model From Scratch
Building your own question-answering model might seem daunting, but it is more manageable than you think. This tutorial will walk you through the process step by step.
Question answering is a natural language processing (NLP) technique used to answer questions based on given data or text. Question-answering models aim to build systems that automatically understand questions posed in natural language and provide accurate, relevant answers. Question answering is widely used across domains such as chatbots and virtual assistants, education and e-learning, information retrieval, and search engines.
Setting Up the Development Environment
Before continuing with this tutorial, you need some knowledge of Python to follow the setup and development process. Navigate to Google Colab or activate your Jupyter Notebook, and create a new notebook. Then execute the following command to install the required library in your development environment.
!pip install transformers
You will use Transformers, which provides pre-trained models and tools for working with state-of-the-art NLP models.
Importing Necessary Libraries
Import the required libraries that you will use to preprocess the data and create the model.
import tensorflow as tf
from transformers import pipeline
from transformers import BertForQuestionAnswering
The imported libraries will be used later in the code to develop the model.
Define Context and List of Questions Based on Context
Define a context string that contains the information you want to ask questions about. Here we have chosen a storyline about mountain formation.
question_answerer = pipeline("question-answering")
context = "A mountain is an elevated portion of the Earth's crust, generally with steep sides that show significant exposed bedrock. A plateau is a flat, elevated landform that rises sharply above the surrounding area on at least one side."
Next, create a list of questions you want to ask about the given context.
questions = [
"What is a mountain?",
"What is a plateau?"
]
Encode the first question using the tokenizer's encode method, specifying truncation and padding. Encoding into fixed-length ID sequences improves the model's consistency, memory efficiency, and training performance. (The tokenizer itself is loaded in the next section.)
input_ids = tokenizer.encode(questions[0], truncation=True, padding=True)
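To see what truncation and padding actually do, here is a toy, pure-Python sketch. It does not use the real BERT tokenizer (which maps text to vocabulary IDs), and `toy_encode` is a hypothetical name; but the cut-or-pad logic is the same idea: every input becomes the same fixed length, which is what makes batches uniform in memory.

```python
# Toy illustration of truncation and padding (NOT the real BERT tokenizer).
def toy_encode(text, max_length=8, pad_id=0):
    # Hypothetical word-level "IDs": here, just the length of each word.
    ids = [len(word) for word in text.split()]
    ids = ids[:max_length]                     # truncation: cut long inputs
    ids += [pad_id] * (max_length - len(ids))  # padding: fill short inputs
    return ids

print(toy_encode("What is a mountain ?"))  # -> [4, 2, 1, 8, 1, 0, 0, 0]
```

Whatever the input length, the output always has exactly `max_length` entries, just as the real tokenizer produces sequences of a uniform length when `truncation` and `padding` are enabled.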
Create a question-answering pipeline using the BERT-based model and tokenizer.
question_answerer = pipeline('question-answering', model=model, tokenizer=tokenizer)
question_answerer({
    'question': questions,
    'context': context
})
Load the Pre-Trained BERT-Based Model and Tokenizer
BERT (Bidirectional Encoder Representations from Transformers) is one of the most widely used language representation models for NLP tasks, including question answering. Loading a pre-trained BERT-based model gives your system a strong understanding of language and its capabilities for question answering. Tokenization divides the input text into meaningful units for the model to process.
Combining BERT with its matching tokenizer improves your model's performance and the accuracy of its results.
from transformers import AutoTokenizer
model = BertForQuestionAnswering.from_pretrained("deepset/bert-base-cased-squad2")
tokenizer = AutoTokenizer.from_pretrained("deepset/bert-base-cased-squad2")
Obtaining Answers
Now that you have the context and the pre-trained model loaded, you can obtain answers by passing each question, together with the context, to the pipeline as a dictionary. As above, we have two questions and will select each one by its index. The model returns the answer and a confidence score for each.
question_answerer({
'question': questions[0],
'context': context
})
question_answerer({
'question': questions[1],
'context': context
})
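If you have several questions, you can wrap the calls above in a small loop. `answer_all` below is a hypothetical helper, not part of the Transformers API; it takes any question-answering callable (such as the `question_answerer` pipeline built earlier) and formats each answer together with its score.

```python
# Hypothetical helper: run every question through a question-answering
# callable and format each answer with its confidence score.
def answer_all(qa, questions, context):
    results = []
    for q in questions:
        out = qa({"question": q, "context": context})
        results.append(f"{q} -> {out['answer']} (score: {out['score']:.2f})")
    return results

# Usage with the pipeline built above:
# for line in answer_all(question_answerer, questions, context):
#     print(line)
```

Because the helper accepts any callable that returns a dict with `answer` and `score` keys (the shape the question-answering pipeline produces), you can swap in a different model later without changing this loop.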
And just like that, you have your own question-answering model using Hugging Face Transformers.
Conclusion
By following this tutorial, you have built a basic QA model using the Hugging Face Transformers library. You can quickly adapt the code to work with different contexts and questions, allowing you to create your own question-answering model for a variety of use cases.
Remember that the example code provided is just a starting point, and you can explore and customize it further based on your specific needs and requirements. Have fun experimenting and building more advanced question-answering models!
Complete code available on GITHUB REPO
Opinions expressed by DZone contributors are their own.