Initial reactions to OpenAI's historic open-source GPT-OSS models are wildly mixed

by SkillAiNest



After a long wait, OpenAI put the "open" back in its name yesterday with the release of two new large language models (LLMs): GPT-OSS-120B and GPT-OSS-20B.

But despite benchmark results that rival OpenAI's other powerful proprietary AI models, the initial response from the broader AI developer and consumer community has so far been all over the map. If this release were being graded like a movie premiere on Rotten Tomatoes, we'd put it at around a 50% score based on our observations.

First, some background: OpenAI just released these two new text-only models (no image generation or analysis), both under the permissive open-source Apache 2.0 license, the first time since 2019 (before ChatGPT) that the company has done so with a state-of-the-art language model.

The entire ChatGPT era of the past 2.7 years has so far been powered by proprietary or closed-source models that OpenAI controls, which users cannot run offline or on private computing hardware and can customize only in limited ways.




But all of that changed with the release of the pair of GPT-OSS models: a larger one sized for a small or medium enterprise data center or server, and a smaller one that can run on a single computer, even one in your home office.
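As a rough illustration of what running one of these models locally might look like, the sketch below loads the smaller model with the Hugging Face transformers library. The model id `openai/gpt-oss-20b` and the hardware assumptions (the accelerate package installed, plus enough GPU or CPU memory) are our own, not from this article, so treat it as a hedged starting point rather than an official quickstart:

```python
# Minimal local-inference sketch for the smaller GPT-OSS model.
# Assumes: `pip install transformers accelerate` and sufficient memory;
# the model id "openai/gpt-oss-20b" is our assumption, not from this article.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="openai/gpt-oss-20b",
    torch_dtype="auto",   # let transformers pick an appropriate precision
    device_map="auto",    # spread weights across available GPU/CPU memory
)

messages = [
    {"role": "user", "content": "Summarize the Apache 2.0 license in one sentence."}
]
result = generator(messages, max_new_tokens=128)
print(result[0]["generated_text"])
```

The point of the Apache 2.0 license is precisely that this kind of offline, self-hosted use requires no permission from OpenAI.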

Of course, the models are so new that it has taken the AI power-user community several hours to freely download them and test them on their own individual benchmarks and tasks.

Now a wave of reactions is coming in, ranging from excitement over the capabilities of these powerful, free, and efficient new models to dissatisfaction and frustration over what some users see as significant issues and limitations, especially compared with the similarly Apache 2.0-licensed wave of powerful open-source, multimodal LLMs from Chinese startups (which can likewise be downloaded for free by companies in the US or anywhere else in the world, customized, and run locally on their own hardware).

High benchmarks, but still behind the Chinese open-source leaders

On intelligence benchmarks, the GPT-OSS models are mostly ahead of other US open-source offerings. According to independent third-party AI benchmarking firm Artificial Analysis, GPT-OSS-120B is the “most intelligent American open weights model,” though it still falls short of Chinese heavyweights such as DeepSeek R1 and Qwen3 235B.

“On reflection, they just benchmaxxed,” wrote one self-described DeepSeek “stan” in a tweet. “No good derivative models will be trained … no new use cases will arise … merely bragging rights.”

That suspicion was echoed by pseudonymous open-source AI researcher Teknium (@Teknium1), co-founder of rival open-source AI model provider Nous Research, who dismissed the releases as a “nothingburger” on X and predicted that a Chinese model would soon eclipse them. “It was massively disappointing as a whole, and I legitimately came into it open-minded.”

Benchmaxxing on math and coding at the expense of writing?

Other criticism focused on the apparently narrow utility of the GPT-OSS models.

AI influencer “Lisan al Gaib” (@scaling01) noticed that the models perform well on math and coding but are “completely lacking taste and intellect.” He added, “so is this just a math model?”

In creative writing tests, some users found the models injecting equations into poetic output. “This is what happens when you benchmaxx,” Teknium remarked, sharing a screenshot in which the model inserted a math formula mid-poem.

And @kalomaze, a researcher at AI model training company Prime Intellect, wrote: “GPT-OSS-120B knows less about the world than a good 32B does. Perhaps they wanted to avoid copyright issues, so they probably went majority synthetic. Pretty devastating stuff.”

Former Googler and independent AI developer Kyle Corbitt agreed that the GPT-OSS pair appears to have been trained mainly on synthetic data (that is, data generated by another AI model specifically for the purpose of training new ones), which, he said, makes the models extremely uneven across tasks.

“It’s very good at the tasks it was trained on, and really bad at everything else,” Corbitt wrote: great on coding and math problems, bad at linguistic tasks such as creative writing or report generation.

In other words, the allegation is that OpenAI deliberately trained the models on synthetic data rather than real-world facts and data, to avoid using copyrighted material from websites and other repositories it does not own or have a license to use, and that quality suffered as a result.

Others speculated that OpenAI trained the models mainly on synthetic data to avoid safety and security concerns, accepting worse quality than if they had been trained on more real-world (and potentially copyrighted) data.

Concerning results on third-party benchmarks

Moreover, putting the models through third-party benchmarking tests has surfaced metrics that some users find concerning.

SpeechMap, which measures an LLM’s compliance with user requests for disallowed, biased, or politically sensitive outputs, shows compliance scores for GPT-OSS-120B below 40%, near the bottom of its open-model peers. That suggests the model resists user requests and defaults to guardrails at the expense of providing accurate information.

On Aider’s Polyglot evaluation, GPT-OSS-120B scored only 41.8% in multilingual reasoning, below rivals such as Kimi K2 (59.1%) and DeepSeek-R1 (56.9%).

Some users also said their tests show the models are strangely resistant to generating criticism of China or Russia, in contrast to their behavior regarding the United States and the European Union, raising questions about bias and training data filtering.

Other experts praise the release and what it signals for open-source AI

To be fair, not all the commentary is negative. Software engineer and close AI watcher Simon Willison called the release “really impressive” on X, explaining in a blog post how the models’ performance and capacity approach parity with OpenAI’s proprietary o3-mini and o4-mini models.

He praised their strong performance on reasoning and STEM-heavy benchmarks, and singled out the new “harmony” prompt template format, which gives developers a more structured way to guide the models’ responses, along with support for third-party tool use, as meaningful contributions.
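For developers curious what that structure looks like, below is a minimal sketch of a harmony-formatted exchange, based on the special tokens OpenAI has published for the format (`<|start|>`, `<|channel|>`, `<|message|>`, `<|end|>`, `<|return|>`). The conversation content is invented for illustration, and the exact token layout is worth verifying against OpenAI’s harmony documentation:

```python
# Illustrative sketch of OpenAI's "harmony" prompt format for GPT-OSS.
# Token names follow OpenAI's published spec; the conversation is made up.
HARMONY_PROMPT = (
    # System turn: global instructions for the model.
    "<|start|>system<|message|>You are a helpful assistant.<|end|>"
    # User turn: the actual request.
    "<|start|>user<|message|>Write one line about open source.<|end|>"
    # Assistant "analysis" channel: hidden chain-of-thought reasoning.
    "<|start|>assistant<|channel|>analysis<|message|>"
    "The user wants a single line. Keep it short.<|end|>"
    # Assistant "final" channel: the user-facing answer, closed by <|return|>.
    "<|start|>assistant<|channel|>final<|message|>"
    "Open source thrives when everyone can read the code.<|return|>"
)

print(HARMONY_PROMPT)
```

The channel mechanism is the notable design choice: it separates the model’s reasoning stream from its user-facing answer at the token level, rather than leaving that separation to post-processing.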

In a long X post, Clem Delangue, CEO and co-founder of the AI code-sharing and open-source community Hugging Face, encouraged users not to rush to judgment, pointing out that these models are complex and that early problems may stem from infrastructure instability and insufficient optimization among hosting providers.

“The power of open source is that no one can cheat,” Delangue said. “We will uncover all the strengths and limitations … progressively.”

Even more cautious was Ethan Mollick, a professor at the University of Pennsylvania’s Wharton School of Business, who wrote on X that the US “now has the leading open weights models (or close to it),” but questioned whether this is a one-time move by OpenAI. “As others catch up, the edge will fade rapidly,” he noted, adding that it was unclear what OpenAI’s plans were for keeping the models updated.

Nathan Lambert, a leading researcher at rival open-source lab the Allen Institute for AI (AI2) and a close observer of the space, appreciated the symbolic significance of the release on his blog Interconnects, calling it “an extremely positive step for the open ecosystem, especially for the West and its allies, that the most famous brand in the AI space has returned to releasing openly licensed models.”

But he tempered expectations on X, writing that GPT-OSS is “unlikely to meaningfully slow” Qwen (the AI team of Chinese e-commerce giant Alibaba), referring to Qwen’s adoption, performance, and variety of models.

He argued that the release marks a significant shift toward open models in the United States, but that OpenAI still has a “long way” to go to catch up in practice.

A split verdict

The verdict, for now, is split.

OpenAI’s GPT-OSS models occupy an important place in terms of licensing and accessibility.

But while the benchmarks look solid, the real-world “vibes,” as many users describe them, are so far less compelling.

Whether developers can build strong applications and derivatives on top of GPT-OSS will determine whether the release is remembered as a breakthrough or a blip.
