GPT-J: GPT-3 Democratized

Published 04.07.2021

Author Hrittik Roy

Cloud native and artificial intelligence have a lot in common. Both fields move pretty fast, and if you are the curious kind, you are always excited to explore new tech. The downside is that sometimes a new technology distracts you so much that you skip your work.

I am guilty of that 🙂

Anyway, let’s talk about one such exciting development in our cousin field. It’s called GPT-J, “the open source cousin of GPT-3 everyone can use”.

The Leap: GPT-3

What’s GPT-3?

GPT-3 is the third generation of the GPT (Generative Pre-trained Transformer) language models, with 175 billion parameters. It’s pretty powerful in comparison to GPT-2, which has only 1.5 billion parameters. If you compare GPT-3 to the other models available at its release, the one that comes closest is Microsoft’s Turing NLG, with just 17 billion parameters.

Now you have some idea of what the hype of the last few years is about. It is about a deep learning model, trained on massive text datasets with hundreds of billions of words, that can produce human-like text and has 10x the parameters of its nearest competitor.
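To make the "10x" figure concrete, here is a quick back-of-the-envelope check using only the parameter counts quoted in this post. The dictionary and the `ratio` helper are illustrative names, not part of any library.

```python
# Parameter counts quoted in this post, in billions.
PARAMS_BILLIONS = {
    "GPT-2": 1.5,
    "Turing NLG": 17,
    "GPT-3": 175,
}


def ratio(larger: str, smaller: str) -> float:
    """Return how many times more parameters `larger` has than `smaller`."""
    return PARAMS_BILLIONS[larger] / PARAMS_BILLIONS[smaller]


print(f"GPT-3 vs Turing NLG: {ratio('GPT-3', 'Turing NLG'):.1f}x")  # 10.3x
print(f"GPT-3 vs GPT-2: {ratio('GPT-3', 'GPT-2'):.1f}x")            # 116.7x
```

So "10x" is a fair rounding of the gap to Turing NLG, and the jump from GPT-2 is more than 100x.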

Uses of GPT-3

Since it’s a language model, the applications are vast. We, too, got access and are using GPT-3 in some of our blog posts, and it’s pretty good apart from a slightly bland tone.

Read this post by GPT-3:

The list goes on, and GPT-3 is entering new domains such as code completion and generation. GitHub launched Copilot, which we describe as:

GitHub Copilot is an AI-assisted pair programmer that helps you write code more quickly and efficiently. GitHub Copilot extracts context from comments and code and provides quick suggestions for individual lines and entire functions. OpenAI Codex, a new AI system developed by OpenAI, powers GitHub Copilot.

The language model also has features such as chat, tweet classifier, outline creator, keyword extractor, HTML elements generator, and many more applications. Learn more here.

Commercial Uses of GPT-3

Apart from Copilot, there is more commercial activity: Microsoft invested $1 billion in OpenAI and, a year later, obtained an exclusive licence to GPT-3. Over 300 GPT-3 projects are in the works, according to OpenAI, using a limited-access API. A tool for extracting insights from customer comments, a system that writes emails automatically from bullet points, and never-ending text-based adventure games are among them.

GPT-3 describing itself. Cool, isn’t it?

Now, the two big questions.

Is GPT-3 free?

No. You need to pay, and it’s still in beta preview, so getting access is quite complicated and tiring. You may need to wait months 🤷‍♂️

Our GPT-3 Davinci Engine Invoice! Learn about pricing here

Is GPT-3 open source?

No, so all you can do is wait.

Open Source Triumphs: GPT-J

What’s GPT-J?

I have previously written a blog post on how the CNCF keeps the innovation cycle rolling. It’s a big organisation with hundreds of sponsors, and it is democratising cutting-edge tech.

GPT-J, however, is not backed by a large organisation. It doesn’t have that kind of financial backing, so it has 6 billion parameters, far fewer than GPT-3. But look at the other options available.

The publicly available GPT-2 has 1.5 billion parameters, significantly fewer than GPT-J’s 6 billion. Moreover, having so many parameters helps GPT-J perform on par with the 6.7-billion-parameter version of GPT-3. These factors make it the most accurate publicly available model.

GPT models compared

Anyone can use it, and nobody has to wait to get access. The quality of the content is quite good, and it is improving with every passing day.

A few problems with GPT-3

GPT-3 doesn’t have long-term memory. Thus it doesn’t learn from long-term interactions as people do.

Lack of interpretability is another problem, one that affects large and complex models in general. GPT-3 and its training dataset are so extensive that it is difficult to understand or analyse why the model produces a given output.

Lack of interpretability; more context needed

The limited input size is also a factor. Transformers have a maximum input size, so GPT-3 can only handle prompts up to a fixed length (2,048 tokens for the original GPT-3 models, prompt and generated text combined).
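In practice, this limit means an application has to trim long prompts before sending them to the model. Below is a minimal sketch of that idea, assuming a 2,048-token window (the size of the original GPT-3 models) and using whitespace-separated words as a stand-in for real tokens. The function name `fit_to_context` and the `reserve_for_output` parameter are hypothetical and only serve the illustration.

```python
def fit_to_context(prompt: str, max_tokens: int = 2048,
                   reserve_for_output: int = 256) -> str:
    """Keep only the most recent tokens that fit in the context window.

    Assumption: whitespace-separated words stand in for real tokens.
    GPT models use byte-pair encoding, so real token counts will differ.
    """
    budget = max_tokens - reserve_for_output
    if budget <= 0:
        raise ValueError("reserve_for_output must be smaller than max_tokens")
    tokens = prompt.split()
    # Drop the oldest tokens first so the end of the prompt survives.
    return " ".join(tokens[-budget:])


long_prompt = " ".join(str(i) for i in range(3000))
trimmed = fit_to_context(long_prompt)
print(len(trimmed.split()))  # 1792 pseudo-tokens kept
```

Keeping the tail of the prompt is only one possible design choice. Summarising or selecting the most relevant passages are alternatives when the beginning of the text matters.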

The model takes longer to provide predictions since GPT-3 is so huge.
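One rough way to see why size matters is to estimate how much memory the weights alone occupy. The sketch below assumes 2 bytes per parameter (16-bit floats) and ignores activations and any other overhead, so treat the numbers as a lower bound rather than a measurement. `weight_size_gb` is an illustrative helper, not a library function.

```python
def weight_size_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """Approximate size of the model weights in gigabytes (1 GB = 10**9 bytes).

    Assumption: every parameter is stored in `bytes_per_param` bytes
    (2 for 16-bit floats, 4 for 32-bit floats).
    """
    return params_billions * 1e9 * bytes_per_param / 1e9


for name, size in [("GPT-2", 1.5), ("GPT-J", 6), ("GPT-3", 175)]:
    print(f"{name}: about {weight_size_gb(size):.0f} GB of weights")
# GPT-2: about 3 GB, GPT-J: about 12 GB, GPT-3: about 350 GB
```

Under these assumptions, GPT-3’s weights are far too large for a single typical accelerator and have to be spread across many devices, while GPT-J’s are small enough to be much easier to serve.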

GPT-3 is no exception to the rule that all models are only as good as the data that was used to train them. This paper, for example, shows that anti-Muslim bias exists in GPT-3 and other big language models.

A problem GPT-J sets out to solve

As expected, GPT-J can’t solve all of the problems above; the team behind it lacks the hardware resources. But one thing Eleuther has done well is to try to reduce in GPT-J the bias present in GPT-3.

Eleuther’s dataset is more diversified than GPT-3’s, and it avoids some sites like Reddit, which are more likely to include questionable content. Eleuther has “gone to tremendous pains over months to select this data set, making sure that it was both well filtered and diversified, and record its faults and biases,” according to Connor Leahy, an independent AI researcher and cofounder of Eleuther.

Eleuther could also make it easy to develop tools similar to GitHub Copilot without access to the GPT-3 API. People can create without being limited by API access.

Final Thoughts

Talk about any language model, and it will have a considerable number of weaknesses, biases and silly mistakes, which is entirely normal. Work is in progress, and we are not even close to AIs taking over everything, as sceptics imagine. Sure, AI is going to change the world, but we are pretty far from that.

GPT-J is a small step towards democratising transformers, and GPT-3 is a small step towards human-like language models. Try GPT-J out here.

Feeling exploratory? If you want to learn about the technology that helped train GPT-3, it’s Kubernetes, a container orchestration technology, and you can read more about it here:

Subscribe to our newsletter for excellent posts on cloud native technology and exploratory posts like this delivered to you weekly.

Happy Learning!
