Running powerful NLP models with the HuggingFace transformers library.
Intro
Welcome to the third lesson of the course. Let’s recap our progress so far:
Lesson 1: We created a Python environment for LLMs.
Lesson 2: We set up a personal blog to track our progress.
Next we will use our first LLM. We’ll start with a Natural Language Processing (NLP) model provided by the HuggingFace team.
Notebook best practices
First, let’s set up our notebook to be fully interactive and easy to use. We can do this with a couple of “magic functions” built into Jupyter.
Specifically, we use the magic autoreload and matplotlib functions. The cell below shows them in action:
# best practice notebook magic
%load_ext autoreload
%autoreload 2
%matplotlib inline
Let’s take a look at what these magic functions do.
autoreload dynamically reloads code libraries, even as they’re changing under the hood. That means we do not have to restart the notebook after every change. We can instead code and experiment on the fly.
matplotlib inline automatically displays any plots below the code cell that created them. The plots are also saved in the notebook itself, which is perfect for our blog posts.
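For example, a quick throwaway plot like the one below (purely illustrative, not part of the NLP work) renders right under its cell:
# a quick demo plot to show inline rendering (illustrative only)
import matplotlib.pyplot as plt

plt.plot([0, 1, 2, 3], [0, 1, 4, 9])
plt.title("A quick inline plot")
plt.show()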
All of our notebooks going forward will start with these magic functions.
Let’s start with the "hello, world!" of NLP: sentiment analysis.
Sentiment Analysis with HuggingFace
Note
The code and examples below are based on the official HuggingFace tutorial, reworked to better suit the course.
Imagine that we’re selling a product, and we’ve gathered reviews from a large group of users to find out what people are saying, both good and bad. The bad reviews will point out where our product needs improving. The positive reviews will show what we’re doing right.
Figuring out the tone of a statement (positive vs. negative) is an area of NLP known as sentiment analysis.
Going through each review would give us a ton of insight about our product, but it would also take an enormous amount of manual effort. Enter Machine Learning to the rescue! An NLP model can automatically analyze and classify the reviews in bulk.
First, a Pipeline
Let’s take a look at the HuggingFace NLP model that we’ll run. At a high level, the model is built around three key pieces:
A Config file.
A Preprocessor file.
Model file(s).
The HuggingFace API has a handy, high-level pipeline that wraps up all three objects for us.
Important
Before going forward, make sure that the llm-env environment from the first lesson is active. This environment has the HuggingFace libraries used below.
The code below uses the transformers library to build a Sentiment Analysis pipeline.
# load in the pipeline object from HuggingFace
from transformers import pipeline

# create a sentiment analysis pipeline
classifier = pipeline("sentiment-analysis")
1. Import the pipeline.
2. Instantiate the sentiment classifier.
No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Since we didn’t specify a model, you can see in the output above that HuggingFace picked a distilbert model for us by default.
We will learn more about what exactly distilbert is and how it works later on. For now, think of it as a useful NLP genie who can look at a sentence and tell us whether it has a positive or negative tone.
Next, let’s find out what the model thinks about the sentence: "HuggingFace pipelines are awesome!"
# sentiment analysis on a simple example sentence
example_sentence = "HuggingFace pipelines are awesome!"
classifier(example_sentence)
Not bad. We see a strongly confident score for a POSITIVE label, as expected.
We can also pass many sentences at once, which starts to show the bulk processing power of these models. Let’s process four sentences at once: three positive ones, and a clearly negative one.
# many sentences at once, in a python list
many_sentences = [
    "HuggingFace pipelines are awesome!",
    "I hope you're enjoying this course so far",
    "Hopefully the material is clear and useful",
    "I don't like this course so far",
]

# process many sentences at once
results = classifier(many_sentences)

# check the tone of each sentence
for result in results:
    print(f"label: {result['label']}, with score: {round(result['score'], 4)}")
label: POSITIVE, with score: 0.9999
label: POSITIVE, with score: 0.9998
label: POSITIVE, with score: 0.9998
label: NEGATIVE, with score: 0.8758
Congrats! You’ve now run a HuggingFace pipeline and used it to analyze the tone of a few sentences. Next, let’s take a closer look at the pipeline object.
Going inside the pipeline
Under the hood, a pipeline handles three key HuggingFace NLP pieces: Config, Preprocessor, and Model.
To better understand each piece, let’s take one small step down the ladder of abstraction and build our own simple pipeline.
We will use the same distilbert model from before. First we need the three key pieces mentioned above. Thankfully, we can import each of these pieces from the transformers library.
Config class
The config class is a simple map with the options and configurations of a model. It has the key-value pairs that define a model’s architecture and hyperparameters.
# config for the model
from transformers import DistilBertConfig
Preprocessor class
The preprocessor object in this case is a Tokenizer. Tokenizers convert strings and characters into special tensor inputs for the LLM.
Note
Correctly pre-processing inputs is one of the most important and error-prone steps in using ML models. In other words, it’s good to offload to a class that’s already been tested and debugged.
# input preprocessor to tokenize strings
from transformers import DistilBertTokenizer
Model class
The model class holds the weights and parameters for the actual LLM. It’s the “meat and bones” of the setup, so to speak.
# the text classifier model
from transformers import DistilBertForSequenceClassification
Naming the model
We need to know a model’s full, proper name in order to load it from HuggingFace. Its name is how we find the model on the HuggingFace Model Hub.
Once we know its full name, there is a handy from_pretrained() function that will automatically find and download the pieces for us.
In this case, the distilbert model’s full name is:
> distilbert-base-uncased-finetuned-sst-2-english.
# sentiment analysis model name
model_name = 'distilbert-base-uncased-finetuned-sst-2-english'
In the code below we can now load each of the three NLP pieces for this model.
# create the config
config = DistilBertConfig.from_pretrained(model_name)

# create the input tokenizer
tokenizer = DistilBertTokenizer.from_pretrained(model_name)

# create the model
model = DistilBertForSequenceClassification.from_pretrained(model_name)
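As a quick sanity check, we can peek inside the loaded config. For example, its id2label map ties each class index to a human-readable label; we’ll lean on this map later when post-processing the model’s outputs:
# the label map stored in the config: class index -> class name
# for this model it maps 0 -> 'NEGATIVE' and 1 -> 'POSITIVE'
config.id2label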
Next we will compose these three pieces together to mimic the original pipeline example.
Putting together a simple_pipeline
Preprocessing the inputs
First, we create a preprocess function to turn a given text string into the proper, tokenized inputs that an LLM expects.
def preprocess(text: str):
    """
    Sends `text` through the model's tokenizer.

    The tokenizer turns words and characters into proper inputs for an NLP model.
    """
    tokenized_inputs = tokenizer(text, return_tensors='pt')
    return tokenized_inputs
Let’s test this preprocessing function on the example sentence from earlier.
# manually preprocessing the example sentence: "HuggingFace pipelines are awesome!"
preprocess(example_sentence)
It turned the input string into numerical token tensors for the LLM. We’ll break down what exactly this output means later on in the course. For now, think of it as sanitizing and formatting the text into a format that the LLM has been trained to work with.
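If you’re curious, a small optional sketch like the one below maps those input IDs back to the tokens they represent, using the tokenizer’s convert_ids_to_tokens helper:
# map the input IDs back to their string tokens (optional sanity check)
input_ids = preprocess(example_sentence)['input_ids'][0].tolist()
tokenizer.convert_ids_to_tokens(input_ids)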
Running the model
Next up, let’s write our own forward function that runs the LLM on the preprocessed inputs.
def forward(text: str):
    """
    First we preprocess the `text` into tokens.
    Then we send the `tokenized_inputs` to the model.
    """
    tokenized_inputs = preprocess(text)
    outputs = model(**tokenized_inputs)
    return outputs
Let’s check what this outputs for our running example sentence.
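The call itself is a one-liner (a minimal sketch, using the forward function we just defined):
# run the model's forward pass on the example sentence
forward(example_sentence)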
You’ll see a lot going on in the SequenceClassifierOutput above. To be honest, this is where the original pipeline does most of the heavy-lifting for us. It takes the raw, detailed output from an LLM and converts it into a more human-readable format.
We’ll mimic this heavy-lifting by using the Config class and model outputs to find out whether the sentence is positive or negative.
def process_outputs(outs):
    """
    Converts the raw model outputs into a human-readable result.

    Steps:
        1. Grab the raw "scores" from the model for Positive and Negative labels.
        2. Find out which score is the highest (aka the model's decision).
        3. Use the `config` object to find the class label for the highest score.
        4. Turn the raw score into a human-readable probability value.
        5. Return the predicted label with its probability.
    """
    # 1. Grab the raw "scores" from the model for Positive and Negative labels
    logits = outs.logits

    # 2. Find the strongest label score, aka the model's decision
    pred_idx = logits.argmax(1).item()

    # 3. Use the `config` object to find the class label
    pred_label = config.id2label[pred_idx]

    # 4. Calculate the human-readable probability for the score
    pred_score = logits.softmax(-1)[:, pred_idx].item()

    # 5. Return the label and score in a dictionary
    return {
        'label': pred_label,
        'score': pred_score,
    }
We can now put together a simple_pipeline, and check how it compares to the original pipeline.
def simple_pipeline(text):
    """
    Putting the NLP pieces and functions together into a pipeline.
    """
    # get the model's raw output
    model_outs = forward(text)

    # convert the raw outputs into a human-readable result
    predictions = process_outputs(model_outs)
    return predictions
Calling the simple_pipeline on the example sentence, drumroll please…
# running our simple pipeline on the example text
simple_pipeline(example_sentence)
And just like that, we took a small peek under the pipeline hood and built our own simple, working version.
One pain point: we had to know the full, proper name of the different Distilbert* pieces to import the Config, Preprocessor, and Model. This gets overwhelming fast given the flood of LLMs released almost daily. Thankfully, HuggingFace has come up with a great solution to this problem: the Auto class.
True HuggingFace magic: Auto classes
With Auto classes, we don’t have to know the exact or proper name of the LLM’s objects to import them. We only need the proper name of the model on the hub:
# viewing our distilbert model's name
model_name
'distilbert-base-uncased-finetuned-sst-2-english'
Run the cell below to import the Auto classes. Then we’ll use them with the model name to create an even cleaner simple_pipeline.
# importing the Auto classes
from transformers import AutoConfig
from transformers import AutoTokenizer
from transformers import AutoModelForSequenceClassification
Next we create the three key NLP pieces with the Auto classes.
# building the pieces with `Auto` classes
config = AutoConfig.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
We can now use these pieces to build a SentimentPipeline class that’s cleaner than before and can handle any model_name:
class SentimentPipeline:
    def __init__(self, model_name: str):
        """
        Simple Sentiment Analysis pipeline.
        """
        self.model_name = model_name
        self.config = AutoConfig.from_pretrained(self.model_name)
        self.tokenizer = AutoTokenizer.from_pretrained(self.model_name)
        self.model = AutoModelForSequenceClassification.from_pretrained(self.model_name)

    def preprocess(self, text: str):
        """
        Sends `text` through the LLM's tokenizer.

        The tokenizer turns words and characters into special inputs for the LLM.
        """
        tokenized_inputs = self.tokenizer(text, return_tensors='pt')
        return tokenized_inputs

    def forward(self, text: str):
        """
        First we preprocess the `text` into tokens.
        Then we send the `token_inputs` to the model.
        """
        token_inputs = self.preprocess(text)
        outputs = self.model(**token_inputs)
        return outputs

    def process_outputs(self, outs):
        """
        Here we mimic the post-processing that HuggingFace
        automatically does in its `pipeline`.
        """
        # grab the raw scores from the model for Positive and Negative labels
        logits = outs.logits

        # find the strongest label score, aka the model's decision
        pred_idx = logits.argmax(1).item()

        # use the `config` object to find the actual class label
        pred_label = self.config.id2label[pred_idx]

        # calculate the human-readable probability score for this class
        pred_score = logits.softmax(-1)[:, pred_idx].item()

        # return the predicted label and its score
        return {
            'label': pred_label,
            'score': pred_score,
        }

    def __call__(self, text: str):
        """
        Overriding the call method to easily and intuitively call the pipeline.
        """
        model_outs = self.forward(text)
        preds = self.process_outputs(model_outs)
        return preds
Using the custom SentimentPipeline
Let’s leverage both the new class and a different model, to show the power of Auto classes.
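Here is a minimal sketch of what that might look like. The specific model name below is just an assumed example; any sequence-classification sentiment model on the Hub should work:
# a different sentiment model from the HuggingFace hub (assumed example)
new_model_name = 'nlptown/bert-base-multilingual-uncased-sentiment'

# build our custom pipeline around the new model
new_classifier = SentimentPipeline(new_model_name)

# analyze the example sentence with the new model
new_classifier(example_sentence)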
Congrats! You’ve now built a flexible pipeline for Sentiment Analysis that can leverage most NLP models on the HuggingFace hub.
Conclusion
This notebook went through the basics of using a HuggingFace pipeline to run sentiment analysis on a few sentences. We then looked under the hood at the pipeline’s three key pieces: Config, Preprocessor, and Model.
Lastly, we built our own simple_pipeline from scratch to see how the pieces fit together.
The goal of this notebook was twofold. First, we wanted to gain hands-on experience with the transformers API from HuggingFace. It’s an incredibly powerful library that lets us do what used to be difficult, research-level NLP tasks in a few lines of code.
Second, we wanted to get some familiarity with downloading models. The model weights that we downloaded from HuggingFace are the same ones that we will be fine-tuning, quantizing, and deploying on our devices throughout the course.
There are two appendices below. The first one gives a handy way of counting the number of weights in a model. The second one goes into more detail about how to interactively debug and analyze the code in a Jupyter notebook.
Appendix 1: Counting the number of parameters in a model
The following code snippet counts the number of trainable parameters in a model. It’s a question that comes up often when working with LLMs, and having a quick way to check a model’s rough size often comes in handy.
def count_parameters(model):
    """
    Counts the number of trainable parameters in a `model`.
    """
    return sum(p.numel() for p in model.parameters() if p.requires_grad)
Here we use it to count the number of parameters in the distilbert model from above.
# view the number of parameters in the last model used
f"Number of trainable params: {count_parameters(model):,}"
'Number of trainable params: 66,955,010'
Appendix 2: Inspecting the classifier, notebook style.
What is the classifier object, exactly? Jupyter has many powerful ways of inspecting and analyzing its code.
One of the simplest ways of checking an object is to call it by itself in a code cell, as shown below.
# show the contents of the `classifier` object
classifier
<transformers.pipelines.text_classification.TextClassificationPipeline at 0x176de7850>
We can see the classifier is a TextClassificationPipeline. This makes sense: we fed it an input sentence and asked it to classify the statement as positive vs. negative.
There is also a tab-autocomplete feature to find the members and methods of an object. For example, to look up everything in classifier, hit the Tab key after typing a `.`.
Uncomment the cells below and hit the tab key to test the auto-complete feature.
## tab after the `.` to auto-complete all variables/methods
# classifier.
Let’s say you vaguely remember the name of a variable or function, say for example the forward() method. In that case you can type the first few letters and hit tab to auto-complete the full set of options:
## tab after the `.for` to auto-complete the rest of the options
# classifier.for
Asking questions: ? and ??
Lastly, we can literally interrogate an object in Jupyter for more information.
If we tag a single ? after an object, we’ll get its basic documentation (docstring). Note that we omit it here to keep the notebook from getting too busy.
## the power of asking questions
classifier?
If we tag on two question marks: ??, then we get the full source code of the object:
## really curious about classifier
classifier??
Both ? and ?? are excellent and quick ways to look under the hood of any object in Jupyter.
Inspecting a specific classifier function
Let’s take a look at the function that does the heavy lifting for our sentiment analysis task: forward().
# looking at what actually runs the inputs
classifier.forward
<bound method Pipeline.forward of <transformers.pipelines.text_classification.TextClassificationPipeline object at 0x176de7850>>
What does this function actually do? Let’s find out.
# source code of the forward function
classifier.forward??
We can see that it automatically handles whether we’re running a TensorFlow (tf) or PyTorch (pt) model. Then, it makes sure the tensors are on the correct device. Lastly, it calls another function, _forward(), on the prepared inputs.
We can follow the rabbit hole as far down as needed. Let’s take a look at the source of _forward.
# going deeper
classifier._forward??
Signature: classifier._forward(model_inputs)
Docstring:
_forward will receive the prepared dictionary from `preprocess` and run it on the model. This method might
involve the GPU or the CPU and should be agnostic to it. Isolating this function is the reason for `preprocess`
and `postprocess` to exist, so that the hot path, this method generally can run as fast as possible.
It is not meant to be called directly, `forward` is preferred. It is basically the same but contains additional
code surrounding `_forward` making sure tensors and models are on the same device, disabling the training part
of the code (leading to faster inference).
Source:
def _forward(self, model_inputs):
    # `XXXForSequenceClassification` models should not use `use_cache=True` even if it's supported
    model_forward = self.model.forward if self.framework == "pt" else self.model.call
    if "use_cache" in inspect.signature(model_forward).parameters.keys():
        model_inputs["use_cache"] = False
    return self.model(**model_inputs)
File: ~/mambaforge/envs/llm_base/lib/python3.11/site-packages/transformers/pipelines/text_classification.py
Type: method
Ah, we can see it calls the model of the classifier. This is the distilbert model we saw earlier! Now we can peek under the hood at the actual Transformer LLM.
# the distilbert sentiment analysis model
classifier.model
We will break down the different pieces in this model later on in the course.
The important takeaway for now is that this output shows the main structure of most Transformer LLMs; other architectures are mostly incremental changes on this foundation.