2026.01.09 (Fri)


Naver Admits Use of Chinese AI Technology in Korea's Sovereign AI Model... Can Naver's Sovereign Model Overcome the Technological Sovereignty Controversy?

 

[News Space=Reporter Seungwon Lee] Naver Cloud, a participant in the Korean government's 'National Sovereign AI' project, has admitted to installing Alibaba's vision encoder in its model, putting Korea's sovereign AI strategy to a fundamental test.

 

With one of the five teams set to be eliminated in the first evaluation on January 15th, the government's principle of a "domestic model designed from scratch" is clashing head-on with the practice of building on global open source.

 

The Shadow of China's Qwen Encoder Revealed by 'Cosine 99.51%'

 

According to an analysis released to the developer community, the vision encoder weights of 'HyperCLOVA X Seed 32B Sync', the model Naver Cloud submitted to the national project, showed a cosine similarity of 99.51% and a Pearson correlation coefficient of at least 98.98% with the encoder of Alibaba's Qwen 2/2.5 series models.

 

This figure shocked the industry because it went beyond mere structural similarity and supported the interpretation that the learned parameters (weights) were essentially identical.
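The reported figures come from comparing the learned weight values of the two encoders directly. A minimal sketch of how such a comparison works, using synthetic weight matrices rather than the actual models, illustrates why near-100% cosine and Pearson scores suggest copied rather than independently trained weights:

```python
import numpy as np

def weight_similarity(w_a: np.ndarray, w_b: np.ndarray) -> tuple[float, float]:
    """Flatten two weight tensors and return (cosine similarity, Pearson r)."""
    a, b = w_a.ravel(), w_b.ravel()
    cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    pearson = float(np.corrcoef(a, b)[0, 1])
    return cosine, pearson

rng = np.random.default_rng(0)
base = rng.normal(size=(64, 64))                           # stand-in for one encoder weight matrix
derived = base + rng.normal(scale=0.01, size=base.shape)   # lightly fine-tuned copy of it
unrelated = rng.normal(size=(64, 64))                      # independently trained weights

cos_d, r_d = weight_similarity(base, derived)
cos_u, r_u = weight_similarity(base, unrelated)
print(f"derived:   cosine={cos_d:.4f}, pearson={r_d:.4f}")   # both near 1.0
print(f"unrelated: cosine={cos_u:.4f}, pearson={r_u:.4f}")   # both near 0.0
```

Independently trained networks of the same architecture land near zero on both metrics, while a fine-tuned copy stays near 1.0, which is why the 99%+ readings were interpreted as evidence of shared weights rather than a structural coincidence.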

 

Vision encoders are a core component of multimodal AI, converting visual information such as images and video into vector signals that language models can understand; the component is so critical that it has been likened to the human optic nerve. Naver Cloud acknowledged that the encoder is a fine-tuned, optimized module built on Chinese open source (the Qwen series), but emphasized that this was "a strategic choice for global ecosystem compatibility and system efficiency."
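The "optic nerve" role described above can be sketched in a few lines. The following is a toy illustration with random weights and made-up dimensions, not the actual HyperCLOVA X or Qwen code: an encoder turns image patches into feature vectors, and a projection layer maps them into the embedding space the language model expects.

```python
import numpy as np

rng = np.random.default_rng(42)

IMG_PATCHES, PATCH_DIM = 16, 768     # a ViT-style encoder sees an image as a grid of patches
LM_DIM = 1024                        # hidden size expected by the language model (assumed)

encoder_w = rng.normal(size=(PATCH_DIM, PATCH_DIM)) * 0.02   # "optic nerve" weights
projector_w = rng.normal(size=(PATCH_DIM, LM_DIM)) * 0.02    # aligns encoder output to LM space

def encode_image(patches: np.ndarray) -> np.ndarray:
    """Turn preprocessed image patches into vectors a language model can consume."""
    features = np.tanh(patches @ encoder_w)   # vision encoder: pixels -> visual features
    return features @ projector_w             # projection: features -> LM token space

image = rng.normal(size=(IMG_PATCHES, PATCH_DIM))  # stand-in for preprocessed pixels
tokens = encode_image(image)
print(tokens.shape)   # (16, 1024): 16 visual "tokens" fed to the LM
```

Because the encoder sits in front of the language model as a separable module like this, it is technically possible to swap in a proven open-source encoder while training the language "brain" from scratch, which is exactly the architectural split at the center of the dispute.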

 

Naver: "Our Brain Is 100% Domestically Produced... Our Encoder Stands on the 'Shoulders of Giants'"

 

In its statement and press release, Naver Cloud repeatedly emphasized that the model's foundation model (the "brain" for language and reasoning) was "100% developed in-house" and that its core intelligence was domestically produced. The company explained, "The vision encoder is a module that acts as an 'optic nerve,' converting visual information into signals. It is an advanced engineering decision that takes high-performance open source proven in academia and globally, and adds our own optimizations and additional training."

 

Naver also asserted that it possesses its own video and image processing capabilities, including its proprietary vision technology VUClip, and that "the general strategy of global big tech companies is to maximize efficiency by reusing global standard modules in the encoder area while concentrating resources on the real competitive edge, the generation and inference areas." It also reaffirmed the legitimacy of using open source, saying, "The development of AI technology is a process of adding our own value by 'standing on the shoulders of giants.'"

 

Does It Conflict with the "Domestic, from Scratch" Requirement? The Government's Standards Are in a Gray Area

 

The root of the problem lies in the two principles outlined in the government's announcement for the Sovereign AI project. The Ministry of Science and ICT defined the project as a "domestic AI-based self-reliance" initiative and specified that participating models must be 1) based on domestic technology and 2) not derivatives of fine-tuned foreign models but "foundation models designed and pre-trained from scratch."

 

However, the detailed guidelines focus primarily on the 'from scratch' requirement, namely random initialization of the foundation model (the language/multimodal body) and end-to-end traceability of training, leaving room for interpretation as to how much external dependence is permitted for peripheral modules such as encoders and tokenizers.

 

In fact, during the Upstage Solar Open 100B controversy, the Ministry of Science and ICT stated that "if the weights are randomly initialized and the entire training log and checkpoints can be verified, the model is recognized as 'from scratch'," and that the requirement is not met when pre-trained weights are reused.

 

Just Three Days After the Upstage Controversy Blew Over, a New One Erupted… Naver Faces a Different Level of Risk

 

This Naver incident inevitably draws comparisons to the Upstage plagiarism allegations that surfaced just a few days ago. Upstage's Solar Open 100B was initially suspected of plagiarism after reports surfaced that its architecture and layer norms were similar to those of China's Zhipu AI's GLM-4.5 Air. However, at a public verification session held in Gangnam on January 2nd, Upstage released learning logs and checkpoints, proving that it was "an in-house model trained from scratch using random initialization." 

 

The informant later issued a public apology, saying, “I could not have drawn a conclusion based solely on the similarity of the layer norm values, but I raised suspicions without verification.” The industry evaluated this as “a stress test for Korean AI transparency and an incident that took verification culture to the next level.”

 

On the other hand, the Naver case is unlikely to be dismissed as a false alarm, since the company has already acknowledged installing the Qwen encoder. It is highly likely to escalate into a normative and policy issue that directly conflicts with the government's definition of sovereign AI.

 

Five teams face the first round of eliminations on January 15th… Evaluation criteria put to the test

 

The Sovereign AI Foundation Model Project, with a budget of 1.46 trillion won, aims to foster domestically produced, large-scale models that achieve at least 95% of the performance of leading global models. The Ministry of Science and ICT and the National IT Industry Promotion Agency (NIPA) selected five consortiums—Naver Cloud, Upstage, SK Telecom, NC AI, and LG AI Research—as elite teams in August 2025. They announced a "survival structure" where evaluations will be conducted every six months, with only one or two finalists remaining by 2027.

 

The first evaluation results will be announced on January 15th, at which point one of the five teams will be eliminated. With the Naver encoder controversy and the Upstage plagiarism allegations erupting in quick succession, how the government applies its "from scratch" criteria and scopes open-source use in the actual evaluation will likely serve as a litmus test for the credibility of future sovereign AI policies.

 

"Technological sovereignty = all modules domestically produced?"... Diverging views from industry and academia

 

In domestic and international developer communities, criticism that "using the core encoder of a Chinese model while promoting national sovereign AI is inappropriate from a symbolic and risk-management perspective" is clashing directly with the defense that "combining global open-source modules to build the optimal stack is common sense in modern AI engineering."

 

Some academic experts, including Professor Lim Seong-bin of the Department of Statistics at Korea University, have pointed out that "cosine similarity alone cannot determine plagiarism" and that "transparent disclosure of design, training, and module composition is the key to trust in national AI projects."

 

Meanwhile, some in the startup industry are concerned that if the government interprets "domestic production of all components" as technological sovereignty, Korea will be left behind in the global AI competition.

 

The true yardstick for technological sovereignty: "license risk" is key

 

This issue is particularly sensitive because the policy starting point for the Sovereign AI project was to "reduce the risk of dependence on US and Chinese big tech companies." The government set a goal of meeting public and industrial demand with domestic infrastructure and domestically produced models, in preparation for a scenario in which US and Chinese big tech companies suddenly raise their foundation model usage fees or revoke licenses.

 

However, if the vision encoder of the Naver model is governed by the Qwen open-source license, a potential risk remains that the module would need to be replaced or retrained should Alibaba change or withdraw the license in the future.

 

A big tech expert pointed out that "if Qwen withdraws its license, even maintaining the model could become uncertain," and that "the core criterion for technological sovereignty should be found in 'licensing and control,' not 'performance and design.'"

 

In a GitHub post, Lee Seung-hyun, Vice President of FortiTumaru, urged, “We need to move beyond this pointless mudslinging debate and establish clear standards for what constitutes true technological sovereignty,” while Minister of Science and ICT Bae Kyung-hoon also assessed that “the current debate is a growing pain that Korean AI must go through to make a greater leap forward.”

 

The National AI Team's Test Bench… Transparency and Consistency Will Determine Victory

 

The Sovereign AI project is a strategic initiative aimed at establishing Korea's independent position in the global AI hegemony race, albeit belatedly, and it also serves as a mirror reflecting national finances and public trust. The Naver encoder issue, erupting shortly after the Upstage controversy was resolved through public verification and an apology, goes beyond an embarrassment for individual companies and raises a more fundamental question.

 

The ball is now in the court of the government, the evaluation committee, and the participating companies. How will they achieve three goals at once (matching global standards in performance, maintaining Korean control, and coexisting with the open-source ecosystem) through institutional, technological, and governance mechanisms? The results of the first evaluation on January 15th and the follow-up measures that come after are expected to be a watershed moment in assessing the maturity of Korea's AI strategy.
