Polish Scientists Develop a Language Model More Efficient than ChatGPT

LongLLaMA can potentially handle 64 times more text than ChatGPT. The LLM developed by researchers from Poland is based on OpenLLaMA, an open-source reproduction of LLaMA, the model created by Meta, the owner of Facebook.

It was developed by Szymon Tworkowski, Konrad Staniszewski, Mikołaj Pacek, and Piotr Miłoś, all researchers associated with IDEAS NCBR, the University of Warsaw and the Polish Academy of Sciences, together with Yuhuai Wu, one of the co-founders of xAI, Elon Musk’s startup, and Henryk Michalewski, associated with the University of Warsaw and Google DeepMind. By publishing their results in recent weeks, the researchers have caused a stir in the scientific community. The paper devoted to LongLLaMA, “Focused Transformer: Contrastive Training for Context Scaling”, has been accepted for the prestigious NeurIPS 2023 conference in New Orleans.

Piotr Miłoś – team’s leader

“LongLLaMA is an LLM available to everyone on the Internet,” said Prof. Piotr Miłoś, leader of the research team at IDEAS NCBR. “Our model can handle 8,000 tokens at a time, which is approximately 30-50 pages of text. And for some tasks, much more, even 256,000 tokens, although this is only a technical result.”

The first large open-source language models have been available since March 2023. They allow scientists to do advanced work, because creating an LLM from scratch is currently out of reach for most research teams.

“When Meta released OpenLLaMA, scientists from all over the world, including our team, took it to the workshop and modified it,” explains Piotr Miłoś. “Our LongLLaMA can process a much larger context than was previously possible, i.e., it can ‘eat’ much more text in one piece.”

Powerful and extremely accurate LLM

LongLLaMA’s advantage over other models is that it can process long inputs, which lets it generate more consistent and accurate answers. It can handle long contexts without truncating or padding them, as passkey tests show.

The researchers checked whether LongLLaMA could recall a passkey given at the beginning of a very long prompt. LongLLaMA maintains 94.5% accuracy with a 100,000-token prompt and 73% accuracy with 256,000 tokens. OpenLLaMA could only handle a 2,000-token prompt.
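The idea behind such a passkey test can be sketched in a few lines of Python. This is only an illustration, not the researchers' benchmark code: the prompt wording, the filler text, and the names `build_passkey_prompt`, `passkey_accuracy` and `short_memory_model` are all assumptions made for the example. The stand-in "model" simply forgets everything except the end of the prompt, which is roughly what happens when a prompt exceeds a model's context.

```python
import random


def build_passkey_prompt(passkey: str, filler_repeats: int) -> str:
    """Hide a passkey at the start of a long prompt and ask for it at the end."""
    filler = "The grass is green. The sky is blue. " * filler_repeats
    return (
        f"The pass key is {passkey}. Remember it.\n"
        f"{filler}\n"
        "What is the pass key?"
    )


def passkey_accuracy(model, trials: int, filler_repeats: int, seed: int = 0) -> float:
    """Fraction of trials in which the model's answer contains the hidden passkey."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        passkey = str(rng.randint(10000, 99999))
        answer = model(build_passkey_prompt(passkey, filler_repeats))
        hits += passkey in answer
    return hits / trials


def short_memory_model(prompt: str, window: int = 200) -> str:
    """Hypothetical stand-in for an LLM: it only sees the last `window` characters."""
    visible = prompt[-window:]
    marker = "The pass key is "
    start = visible.find(marker)
    if start == -1:
        return "I don't know."
    # Return the text between the marker and the next full stop.
    return visible[start + len(marker):].split(".")[0]


if __name__ == "__main__":
    for repeats in (0, 100):
        acc = passkey_accuracy(short_memory_model, trials=5, filler_repeats=repeats)
        print(f"filler repeats: {repeats:>3}  accuracy: {acc:.0%}")
```

With a real model, `short_memory_model` would be replaced by a call that sends the prompt to the LLM and returns its answer; the accuracy figures quoted above come from the researchers' own evaluation, not from this sketch.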

Moreover, the model can already produce coherent texts 8,000 tokens long, and potentially even 256,000 tokens long, which would significantly surpass ChatGPT. Importantly, it consumes relatively little power (a single processor is enough to run LongLLaMA) and works very fast. It can be used for all the tasks in which chatbots already help us, including text generation, text editing, conversation with the user, creating summaries, and translation.

What is the difference between LongLLaMA and ChatGPT?

LongLLaMA, unlike ChatGPT, does not have an online interface, but anyone can download the model from the HuggingFace website and run it on their own computer. Because it is open-source software, everybody can modify it as well. This distinguishes it from ChatGPT, whose software has not been made available to the public, although it is known to be based on the Transformer architecture too.

The Transformer is a type of neural network architecture that analyzes text to capture complex connections between words across multiple layers, learning these patterns from vast amounts of data. This technology has revolutionized natural language processing, enabling chatbots to generate text, translate, talk to the user, and perform many other tasks at a level previously unavailable to artificial intelligence.

When we ask a question to a chatbot based on the Transformer, it converts the text into tokens. These are pieces of text, usually between one character and one word long. Dividing text into tokens lets the model process information effectively.
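As a rough illustration of tokenization, the Python sketch below splits a sentence into word and punctuation tokens and maps each one to a number. It is a deliberate simplification: real chatbots use learned subword tokenizers, so their tokens are often fragments of words, and the function names `toy_tokenize` and `encode` are made up for this example.

```python
import re


def toy_tokenize(text: str) -> list[str]:
    """Split text into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def encode(tokens: list[str], vocab: dict[str, int]) -> list[int]:
    """Map each token to an integer ID, adding unseen tokens to the vocabulary."""
    return [vocab.setdefault(token, len(vocab)) for token in tokens]


if __name__ == "__main__":
    vocab: dict[str, int] = {}
    tokens = toy_tokenize("Can LongLLaMA read a book?")
    print(tokens)
    print(encode(tokens, vocab))
```

The model never sees the characters themselves, only the sequence of numeric IDs, which is why limits on input length are expressed in tokens rather than in words or pages.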

However, the number of tokens a chatbot can accept is limited. The token limit of ChatGPT 3.5 is 4,096, that of OpenLLaMA is 2,000, and that of Google Bard is about 1,000. Therefore, when we ask a chatbot a long question or provide a lot of information, it may be necessary to cut or omit some fragments. Most chatbots can’t analyze an entire book, a long conversation, or an article.
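What cutting a prompt to fit the limit means in practice can be shown with a short Python sketch. The limits are the figures quoted in this article; the function `fit_to_context`, its `keep` parameter, and the 5,000-token example document are assumptions made for illustration, and real chatbots may shorten inputs in other ways.

```python
# Token limits as quoted in the article (LongLLaMA: tokens handled at a time).
CONTEXT_LIMITS = {"ChatGPT 3.5": 4_096, "OpenLLaMA": 2_000, "LongLLaMA": 8_000}


def fit_to_context(tokens, limit, keep="end"):
    """Truncate a token list to at most `limit` tokens.

    Returns the kept tokens and the number of tokens that were dropped.
    keep="end" keeps the most recent tokens, keep="start" keeps the earliest.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    if keep not in ("start", "end"):
        raise ValueError("keep must be 'start' or 'end'")
    if len(tokens) <= limit:
        return list(tokens), 0
    dropped = len(tokens) - limit
    kept = tokens[-limit:] if keep == "end" else tokens[:limit]
    return list(kept), dropped


if __name__ == "__main__":
    document = list(range(5_000))  # stand-in for a 5,000-token prompt
    for name, limit in CONTEXT_LIMITS.items():
        kept, dropped = fit_to_context(document, limit)
        print(f"{name}: keeps {len(kept)} tokens, drops {dropped}")
```

In this example a 5,000-token prompt loses 904 tokens under the ChatGPT 3.5 limit and 3,000 under the OpenLLaMA limit, while it fits whole into 8,000 tokens.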

Limitations

“The full potential of LLMs is often limited by how much context a given model can take,” said Piotr Miłoś. “That’s why we introduced Focused Transformer (FoT), a technique that uses a training process inspired by contrastive learning. This novel approach allows fine-tuning of already available LLMs so that they can take on greater context.”

“ChatGPT is a commercial product. It has been optimized for pleasant service,” explains Piotr Miłoś. “Models like LongLLaMA issue rather raw information on which you can build something, such as analyzing text or producing code. LongLLaMA is a great achievement. It shows that LLMs can overcome the limitations associated with the length of prompts and produce long texts that will be useful for humans.”

How to start LongLLaMA?

  1. Go to https://colab.research.google.com/github/CStanKonrad/long_llama/blob/main/long_llama_instruct_colab.ipynb
  2. Click “Runtime” in the menu and then “Run all”.
  3. After a while, the code will start running, and at the bottom of the page an input field will appear after the word “USER:” in which you can enter prompts.
