“Samsung to Use AI to Resolve Cloud Outages… Aims for Autonomous Operation by 2028”
Presentation by Yoo Hyun-sung, Group Leader of the Cloud Team, MX Business Division
“Aiming to Cut Downtime by 90% and Achieve a 99% Detection Rate Within 10 Minutes”
Advancing AI to Power Over 50 Services, Including Samsung Pay and Bixby
Following a pilot phase in 2027, the project will move to the autonomous operation phase in 2028
[Edaily Reporter Shin Yeong-bin] Samsung Electronics (005930) is accelerating the automation of outage detection and recovery by integrating artificial intelligence (AI) into the cloud operations of key services such as Samsung Pay, Bixby, and Galaxy AI. The company aims to establish an automated response system for each operational area this year, move to a predictive operations phase next year, and reach the autonomous operations phase by 2028.
Yoo Hyun-sung, Group Leader of the Cloud Team at Samsung Electronics’ MX Business Division, made this announcement on the 24th during a presentation titled “AI-Based Large-Scale Intelligent Cloud Operations” at “Deloitte Connect Korea 2026,” held at the JW Marriott Hotel Seoul in Seocho-gu, Seoul.
Yoo Hyun-sung, Group Leader of the Cloud Team at Samsung Electronics’ MX Business Division, is delivering a presentation at “Deloitte Connect Korea 2026,” held on the 24th at the JW Marriott Hotel Seoul in Seocho-gu, Seoul. (Photo: Reporter Shin Yeong-bin)
Samsung Electronics’ MX Business Division Cloud Team is a centralized Site Reliability Engineering (SRE) organization that operates the company’s key services. It is responsible for the stable operation of more than 50 consumer-facing services, including Samsung Pay, Bixby, Galaxy AI, Galaxy Store, Samsung Health, SmartThings, Samsung Cloud, and Samsung Account.
Group Leader Yoo outlined the organization’s goals as improving stability through AIOps, strengthening security through SecOps, and optimizing costs through FinOps. His presentation focused on case studies of the transition to AIOps, an approach that applies AI to cloud operations to automate and add intelligence to operational tasks such as fault detection, root cause analysis, change impact analysis, and automatic rollbacks.
Group Leader Yoo stated, “Through AIOps, we aim to reduce incident recovery time by more than 90% compared to current levels in the medium to long term and raise the share of incidents detected within 10 minutes to over 99%.” He added, “We also plan to reduce the ‘human-in-the-loop’ ratio for system operations to 20% or less over the medium to long term.”
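To make the two targets concrete, the following minimal sketch computes the share of incidents detected within 10 minutes and the relative reduction in mean recovery time against a baseline. The `Incident` record, its field names, and the sample figures are hypothetical illustrations, not Samsung data or the team's actual tooling.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    # Hypothetical record; field names are assumptions for illustration.
    detect_minutes: float   # time from fault onset to detection
    recover_minutes: float  # time from detection to recovery


def detection_rate_within(incidents, threshold_minutes=10.0):
    """Fraction of incidents detected within the threshold."""
    if not incidents:
        raise ValueError("at least one incident is required")
    hits = sum(1 for i in incidents if i.detect_minutes <= threshold_minutes)
    return hits / len(incidents)


def recovery_time_reduction(baseline, current):
    """Relative drop in mean recovery time (0.9 means 90% shorter)."""
    if not baseline or not current:
        raise ValueError("both incident lists must be non-empty")
    base_mean = sum(i.recover_minutes for i in baseline) / len(baseline)
    cur_mean = sum(i.recover_minutes for i in current) / len(current)
    return 1 - cur_mean / base_mean


# Made-up sample data: mean recovery falls from 80 to 8 minutes.
baseline = [Incident(12, 100), Incident(5, 60)]
current = [Incident(3, 10), Incident(8, 6)]

rate = detection_rate_within(current)
reduction = recovery_time_reduction(baseline, current)
print(f"detected within 10 min: {rate:.0%}, recovery time cut: {reduction:.0%}")
```

Under the stated goals, the first figure would need to exceed 99% and the second 90%.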
Group Leader Yoo explained that to achieve these goals, a task force (TF) has been formed to drive key initiatives in areas such as incident management, change management, and monitoring. In incident management, key tasks include automating monitoring and control room operations, as well as log and code recommendations; in change management, automating change impact analysis and implementing automatic rollbacks; and in monitoring, reducing alert noise and advancing anomaly detection.
As a specific use case, he introduced the automation of infrastructure change impact analysis. When a developer requests a change to infrastructure code, an AI agent analyzes the impact of that change on the service by referencing the status of cloud resources, the details of the code change, metrics, logs, and internal documentation. The agent then communicates the risk level and review results to the SRE team, which uses this information to decide whether to apply the change to the production environment.
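The workflow described above (gather signals about a proposed change, rate the risk, and hand the result to SRE reviewers who make the final call) can be sketched roughly as follows. This is a simple rule-based stand-in, not Samsung's AI agent: the `ChangeRequest` fields, the scoring weights, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeRequest:
    # Hypothetical inputs standing in for resource status, code diff, and metrics.
    service: str
    resources_touched: int
    touches_production: bool
    recent_error_rate: float  # fraction between 0 and 1, taken from metrics
    has_rollback_plan: bool


@dataclass
class ImpactReport:
    risk_level: str
    score: int
    findings: list = field(default_factory=list)


def analyze_change(req: ChangeRequest) -> ImpactReport:
    """Score a change request and summarize findings for human review."""
    score = 0
    findings = []
    if req.touches_production:
        score += 2
        findings.append("change affects production resources")
    if req.resources_touched > 10:  # assumed threshold
        score += 2
        findings.append(f"{req.resources_touched} resources modified")
    if req.recent_error_rate > 0.01:  # assumed threshold
        score += 3
        findings.append("service error rate is already elevated")
    if not req.has_rollback_plan:
        score += 3
        findings.append("no rollback plan attached")
    level = "high" if score >= 6 else "medium" if score >= 3 else "low"
    return ImpactReport(risk_level=level, score=score, findings=findings)


# The report is advisory: SRE reviewers decide whether the change ships.
report = analyze_change(ChangeRequest("example-service", 15, True, 0.02, False))
print(report.risk_level, report.score)
for finding in report.findings:
    print("-", finding)
```

Note that the function only produces a report; it deliberately does not apply the change, mirroring the article's description of the SRE team making the final decision.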
AI is also being applied to the review process for new service architectures. At Samsung Electronics, a newly built service undergoes a procedure to verify that operational requirements such as redundancy, automated backups, and scalability have been properly incorporated. Previously, this was a manual, checklist-based process that took more than five weeks. Samsung Electronics aims to automate this process using AI and shorten the review period to two weeks or less.
Group Leader Yoo categorized AIOps maturity into four stages: reactive response, automated response, predictive operations, and autonomous operations. He said, “We are still at the reactive response stage, but our goal is to enable automated responses across each operational area through AIOps this year, move to the predictive operations stage next year, and reach the autonomous operations stage by 2028.”
He noted that even as AI increases the rate of operational automation, the role of engineers will not disappear. “While we are focused on advancing AI agents, we must not neglect training the engineers who will carry out that work,” Group Leader Yoo emphasized, adding, “Ultimately, people must come first.”