
IEEE论文|事半功倍的规模化敏捷实践案例

2024-01-23

Proceedings of the 55th Hawaii International Conference on System Sciences | 2022

Rocket Mortgage Delivers Twice the Value at Half the Cost at Scale

David Juan

Rocket Mortgage

David Negron

Rocket Mortgage

Michael Simmons

Scrum Inc.

Jeff Sutherland

Scrum Inc.

davidjuan@rocketmortgage.com

davidnegron@rocketmortgage.com

michael.simmons@scruminc.com

jeff@scruminc.com

翻译:尚君领,周一行,朱婷,王新      审校:赵卫,王蕾


1 Introduction 前言

Rocket Mortgage wanted to scale Agile while still providing autonomy for engineering teams. Agile environments are critical to adapting to changing conditions, and during COVID-19, Agile business units outperformed their non-Agile counterparts by up to 94% [1]. Rocket Mortgage began formally adopting a consistent Agile methodology in Q3 of 2018, before COVID-19, by implementing a Scaled Agile framework. By Q4, the implementation had formed 26 release trains, and by Q3 2019 it had improved feature throughput by 101%, as measured by the number of features completed in a program increment. Deployment continues to expand, and as of 2021 there are 41 release trains of roughly 30-90 people each.

Rocket Mortgage贷款公司希望在为工程团队提供自主权的同时扩大敏捷的规模。敏捷的环境对于适应不断变化的条件而言是至关重要的,在COVID-19期间,采用敏捷的业务单元比传统业务单元的业绩最多高出94%[1]。在COVID-19爆发前,Rocket Mortgage于2018年第三季度开始在全公司的范围内实施SAFe。到了第四季度,已经形成了26列发布火车,并在2019年第三季度前将特性吞吐量提高了101%,该数据是以一个项目群增量(Program Increment, PI)中完成的特性数量来衡量的。之后,敏捷实施还在继续扩大;到2021年有41列发布火车,每列火车大约有30-90人。

But MIT Sloan Management Review [2] reports only 17% of leading companies today will remain leaders in five years. Knowing this, the Rocket Mortgage Director of Engineering for the front end of the mortgage loan systems began evaluating the Scaled Scrum framework as a tool for improving their Scaled Agile implementation. This is significant because it is commonly assumed that Scaled Agile and Scaled Scrum are incompatible and are either/or options. Here we address the results when Scaled Scrum is added correctly to a highly successful Scaled Agile implementation.

但麻省理工学院斯隆商业评论[2]报告说,今天处于领先地位的公司中只有17%能在五年内继续保持领先。了解到这一点,Rocket Mortgage贷款系统前端的研发总监开始评估另一种规模化框架Scrum@Scale,看看能不能在SAFe实施的基础上继续提高。这一点值得重点关注:因为人们普遍认为SAFe和Scrum@Scale是不相容的,是非此即彼的选择。在这篇文章中,大家将会看到把Scrum@Scale正确地添加到一个非常成功的SAFe实施中取得的丰硕成果。

2 Comparable Industry Case Studies 对标行业案例研究

With the multitude of benefits that a properly implemented Scaled Agile process delivers, the reasons for undergoing a Scaled Agile transformation can vary significantly from one implementation to the next. However, research shows the primary driver for moving to Scaled Agile, when surveyed across implementations at the organization, division, and project levels, was to reduce time-to-market [3].

由于正确实施SAFe(译者注:在本文中,“Scaled Agile”特指SAFe)可以带来诸多好处,所以各个组织进行SAFe转型的原因会有很大不同。不过,对组织、部门和项目层面的实施情况进行调查的研究表明,应用SAFe的主要驱动力是缩短产品的上市时间[3]。

Scaled Agile implementations can often cut cycle times in half. Cerno, a leading software development company in China, experienced a 58% reduction in cycle time from their Scaled Agile implementation [4]. Another company, EdgeVerve, whose industry-leading financial software is used by banks in 94 countries, implemented Scaled Agile, which resulted in a 50% cycle time reduction [5]. Royal Philips, the medical technology company, reduced cycle time by more than 58% through Scaled Agile [6].

SAFe的实施通常可以将周期时间缩短一半。一家中国领先的软件开发公司隆正(Cerno)在实施SAFe后,周期时间缩短了58%[4]。另一家印度著名金融软件公司EdgeVerve(其行业领先的金融软件被94个国家的银行使用),在实施了SAFe后,周期时间减少了50%[5]。医疗技术公司皇家飞利浦,通过SAFe实施将周期时间减少了58%以上[6]。

However, as more people and organizations learn of the benefits of Scrum, its implementation is growing from just teams up to divisions and organizations. A systematic literature review found that what was experienced as a lack of communication between teams was caused by Scrum not being implemented at a system level. As one Scrum Master (SM) put it, "We had Scrum within our small groups, that's about it" [7]. This realization of the benefits of Scrum at an organizational level over that of a team or division level is a significant force driving Scaled Scrum implementations.

随着越来越多的人和组织了解到Scrum的好处,它的实施正在从单纯的团队层面发展到部门和组织层面。一项系统的文献综述发现,目前大家所经历的团队之间缺乏沟通的情况,是由于没有在系统层面上实施Scrum造成的。正如一位Scrum Master(SM)所说的那样,"我们在自己的小组内有Scrum,仅此而已"[7]。领导层意识到Scrum不应该只停留在团队或者部门层面,而是应该将Scrum的战果扩大到组织层面上。这个思路是推动Scrum@Scale实施的一个重要力量。


3 Research Methodology 研究方法

Takeuchi and Nonaka reviewed the best lean hardware teams worldwide and published the first paper introducing Scrum Project Management in 1986 [8]. Sutherland created Scrum for software development in 1993, has worked with Schwaber since 1995 to create the Scrum Guide [9], and created the Scrum@Scale Guide in 2016 [10]. Scrum is rooted in lean, and the authors have worked with Takeuchi and Nonaka directly since 2011 and published a second Harvard Business Review paper on Scrum [11]. The authors are coaches and trainers for Toyota in Japan, the United States, and Europe and worked closely with the Lean Enterprise Institute (LEI) to incorporate lean tooling into Scrum. John Shook, a former CEO of LEI, created the first Toyota plant in the United States [12] and helped proliferate the A3 Process [13, 14].

竹内弘高和野中郁次郎回顾了全世界最好的精益硬件团队,并在1986年发表了第一篇介绍Scrum项目管理的论文[8](《新的新产品开发游戏》-译者注)。萨瑟兰在1993年创建了用于软件开发的Scrum,从1995年起与施瓦伯合作发布了Scrum指南[9],并在2016年创建了Scrum@Scale指南[10]。Scrum植根于精益,作者(杰夫.萨瑟兰)自2011年起直接与竹内和野中合作,并发表了第二篇关于Scrum的《哈佛商业评论》论文[11](《拥抱敏捷》-译者注)。萨瑟兰是丰田公司在日本、美国和欧洲的教练和培训师,并与精益企业研究所(LEI)紧密合作,将精益工具纳入Scrum。LEI的前CEO John Shook在美国创建了第一家丰田工厂[12],并帮助推广了A3流程[13, 14]。

The A3 Process, used as the research methodology for this paper, is fundamental to process improvement at Toyota. Its evolution began at the end of World War II when General MacArthur brought Fundamentals of Industrial Management to Japan [15], followed by W. Edwards Deming and others in the 1950s [16]. Taiichi Ohno developed this into the Toyota Production System [17]. Sutherland worked with many A3 experts at Toyota, including Mike Tromas at the Toyota Kentucky plant, who used A3 to introduce Scrum into assembly-line production support and more than doubled productivity at scale [18].

作为本文研究方法的A3流程是丰田公司流程改进的基础。它的演变始于二战结束时,麦克阿瑟将军将《工业管理基础》带到了日本[15],随后戴明博士和其他人也在20世纪50年代来到日本[16]。大野耐一将其发展为丰田生产系统(TPS)[17]。萨瑟兰曾与丰田的许多A3专家合作过,其中包括丰田肯塔基工厂的Mike Tromas。在那里,Mike通过A3将Scrum引入装配线生产支持中,并将规模化的生产力提高了一倍多[18]。

The research methodology of this paper uses the six-step A3 process to describe the background, current condition, target for improvement, root cause analysis of major problems, recommended interventions, and follow-up in a way that lets other organizations implement the recommended changes [19].

本文的研究方法采用A3六步法,即:描述背景、现状分析、确定改进目标、主要问题的根因分析、建议干预措施,及可实施改进措施的跟进[19]。

4 Rocket Mortgage Background Rocket Mortgage背景

Before a Scaled Agile organizational change, the technology teams at Rocket Mortgage were organized into roughly eight technology-centered "platforms," with each group responsible for large pieces of the underlying technology. Each platform was staffed by a group of teams with almost total autonomy in how they operated. As a result, there was a range of development methodologies from traditional project management to Kanban.

在实施SAFe组织变革之前,Rocket Mortgage的技术团队被分成大约八个以技术为中心的"平台",每个平台组各负责一大块底层技术。每个平台组都由一组团队组成,这些团队几乎完全自主地运作。因此,在公司内存在多种多样的开发方法,从传统的项目管理到看板方法都有。

There was a need to realign the ownership of projects and initiatives across the organization. In July 2018, the overall portfolio consisted of more than 275 "high-priority" work items ranging in size from small projects to large-scale initiatives. These were prioritized quarterly by a large group of business and platform leaders and roughly coordinated by project managers and "Epic Owners." Additionally, platforms and their component teams managed individual work backlogs fed by requests from multiple business areas and partners across the organization. There was little structured coordination from top to bottom. As work was divided up and funneled down to teams, the priorities, processes, and working structures became progressively independent and disparate from one another.

当时需要重新调整整个组织里所有项目和举措的所有权。2018年7月,整个公司的业务组合包括超过275个"高优先级"的工作事项,规模从小型项目到大型举措不等。这些事项由一大批业务和平台领导人每季度进行优先排序,并由项目经理和"史诗负责人"进行粗略的协调。此外,平台和他们的组件团队管理着各自的工作待办列表,待办列表中的事项来自多个业务领域和整个组织的合作伙伴的请求。从上到下几乎没有结构化的协调。随着工作的划分并下放到团队中,优先级、流程和工作结构逐渐变得独立和互不相干。

A Scaled Agile transition organized teams into collections of release trains centered around business value streams with a common Scaled Agile methodology. These groups of release trains, called "streams," were strategically aligned around similar business and technology capabilities, explicitly covering the mission-critical areas of Product Engineering, Data Intelligence, Infrastructure and Operations, Security, and Enterprise Services. This restructuring happened all at once: the organization shifted from technology capability-centered platforms in the third quarter of 2018 to 23 value-focused release trains in the fourth quarter of 2018 (42 as of April 2021). The Scaled Agile restructure allowed teams to plan, commit, and execute together, communicating in a unified Agile language and adhering to standardized processes. Technology teams at Rocket Mortgage were becoming more focused on delivering value for common business and technology missions.

SAFe转型将团队组织成以业务价值流为中心的发布火车集合,采用共同的规模化敏捷方法。这些被称为"流"的发布火车组围绕类似的业务和技术能力进行战略调整,明确涵盖产品工程、数据智能、基础设施和运营、安全和企业服务等关键任务领域。这种重组不是渐进式的,而是一下子就发生的:该组织在2018年第三季度从以技术能力为中心的平台转变为2018年第四季度以价值为中心的23列发布火车(截至2021年4月有42列)。SAFe的重组使团队能够一起计划、承诺和执行,用统一的敏捷语言进行沟通,并遵守标准化的流程。Rocket Mortgage的技术团队也越来越专注于为共同的业务和技术任务交付价值。

The Scaled Agile transition also included rolling out several tools and measurement guidelines and provided training and coaching resources. All leaders from the technology and product organization, along with many team members and business leaders, were trained in Scaled Agile and development lifecycle practices over thirty days during the third quarter of 2018. Each release train created a DevOps roadmap, tracking their progress on the DevOps wheel [20]. Several large-scale initiatives introduced standardized methods and pipelines into the software development lifecycle. Internal tools were also created leveraging integrations with the company's primary work-tracking tools and rolled out to provide visibility into progress and establish consistent productivity measurements.

SAFe的转型还包括推出一些工具和度量规范,并提供培训和辅导资源。在2018年第三季度,来自技术和产品组织的所有领导,以及许多团队成员和业务领导,在三十天内接受了SAFe和开发生命周期实践的培训。每个发布火车都创建了一个DevOps路线图,跟踪他们在DevOps环上的进展[20]。一些大规模的举措将标准化的方法和流水线引入软件开发生命周期,还创建了内部工具,利用与公司的主要工作跟踪工具的集成,提供对工作进展的可视化,并建立一致的生产力度量。

After restructuring into release trains, while the epic portfolio remained, it was trimmed from 275 epics in July 2018 to just 37 by October 2019, reflecting only large solution (multi-release train) initiatives. Epics were prioritized through a continuous refinement process by a handful of leaders, with ownership of feature- and team-level work transferred to the release trains, with the ultimate goal of empowering self-sufficient release trains and autonomous value delivery. The Scaled Agile rollout improved feature delivery time and predictability across the organization by roughly a factor of two. Feature Cycle Time was cut by more than half, from 71 days down to 33 days. The company also improved cycle time standard deviation by 49%, from 82 days down to 42 days. As a result, when normalized for a 10-week release timebox, feature throughput improved by 101%, from 414 features delivered per release to 833 per release. In addition to producing more incremental value across the organization, the Scaled Agile implementation was the catalyst for increased commitment completion percentages. The planned work completed during a program increment increased by 37%, from 60% in 2017 to 88% in 2019.

重组为发布火车后,虽然史诗(Epic)组合仍然存在,但史诗(Epic)的数量从2018年7月的275个缩减为2019年10月的37个,仅包含了大型解决方案(多发布火车)的举措。史诗通过少数领导者的不断梳理过程进行优先级排序,特性和团队级工作的所有权转移到发布火车上,最终目标是授权自给自足的发布火车,让它们能自主交付价值。SAFe的实施基本上将整个组织的特性交付时间和可预测性提高了2倍:特性的周期时间减少了一半以上,从71天减少到33天。该公司还将周期时间的标准偏差改善了49%,从82天减少到42天。因此,按10周的发布时间盒归一化后,特性的吞吐量提高了101%,从每次发布414个特性到每次发布833个。除了在整个组织内产生更多的增量价值外,SAFe的实施还促进了承诺完成率的提高。在一个项目群增量期间完成的计划工作增加了37%,从2017年的60%增加到2019年的88%。
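The percentages quoted above can be re-derived from the raw figures. The following is a minimal sketch in Python; the helper name `percent_change` is an illustrative assumption, and only the cycle time, standard deviation, and throughput figures stated in the text are used.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Figures quoted in the text above.
cycle_time = percent_change(71, 33)      # mean feature cycle time, in days
cycle_stddev = percent_change(82, 42)    # cycle time standard deviation, in days
throughput = percent_change(414, 833)    # features delivered per 10-week release

print(f"cycle time: {cycle_time:+.1f}%")
print(f"std dev:    {cycle_stddev:+.1f}%")
print(f"throughput: {throughput:+.1f}%")
```

A negative result is a reduction, so the cycle time and standard deviation figures correspond to "cut by more than half" and "improved by 49%", and the throughput figure corresponds to the 101% improvement.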

With Rocket Mortgage's own Scaled Agile implementation resulting in a 46.5% reduction in cycle time, the Client Marketing Release Train was already looking for something better.

随着Rocket Mortgage自身的SAFe实施将周期时间减少了46.5%,客户营销发布火车并不满足于此,已经在寻找更好的实践了。


5 Client Marketing Improvement Opportunities 客户营销发布火车的改进机会

Client Marketing provides the client-facing frontend of the mortgage system with high demand for new and improved functionality, delivering a better user experience. As the leadership looked at the next steps in improving their Agile implementation, it became clear that four areas needed upgrading:

客户营销发布火车负责的是抵押贷款系统的前端界面。由于是直接面对客户门户,因此公司对客户营销发布火车提出了更高的要求,即:在提供新功能和功能增强的同时,又能提供更好的用户体验。在领导层研究改进其实施敏捷的下一步行动方案时,可以看到有四个方面需要提升:

    1. Client Marketing Scrum teams were using inconsistent tools and techniques. All teams needed better ways to integrate work at higher velocity.

    2. There were many different specialized roles on the Scrum teams. Less specialization with all roles focused on delivery could improve results.

    3. Communication across all teams needed to be clear, consistent, and more productive.

    4. Cycle times needed to be reduced further.

1.客户营销发布火车的Scrum团队没有使用一致的工具和技术。所有这些团队需要以更好的方式和更高的速度把工作集成起来。

2.这些Scrum团队中有许多不同的专业角色。减少过多的专业角色的划分,将所有角色聚焦于交付价值,这样能够有更好的效果。

3.所有团队之间的沟通需要清晰、一致、更有成效。

4.需要进一步缩短团队从开发开工到交付价值的时间。


6 Root Cause Analysis 根本原因分析

Scaled Scrum enables a performance analysis of components of scaling frameworks using a heat map [21]. Each column represents an organization, and each row is a Scaled Scrum component. Green is great, and red is blocked. Figure 1 is a photo of the actual heatmap used by Client Marketing in their analysis of the items identified in this paper.

Scrum@Scale支持使用热图(Heat Map)对规模化敏捷框架的组件进行性能分析[21]。如下图所示:每一列代表一个组织中的团队,每一行代表一个Scrum@Scale组件。绿色表示状态良好,红色表示阻塞状态。图1是在对本文中提到的条目进行分析时,客户营销发布火车实际使用的热图照片。

Figure 1. Scrum@Scale Performance Heat Map

图1 Scrum@Scale性能热图

Some scaling components relate to the teams (team process, team coordination, continuous improvement, delivery), others relate to the Product Owner (PO) (vision, portfolio prioritization, backlog refinement, release planning). Key components relate to the entire organization (Executive Action Team, MetaScrum, product release, metrics) [10]. By evaluating the effectiveness of each component, a prioritized list of improvement initiatives can be generated. This list of improvements is then driven by an Executive Action Team that runs like a Scrum team. Client Marketing quickly identified several targeted areas for improvement using this process. Improvements are prioritized by maximum impact for minimum effort and reprioritized after every individual improvement implementation. The system is constantly changing, so the A3 Process is a highly effective way to target the prioritized dysfunction's root cause.

一些Scrum@Scale组件与团队相关(团队流程、团队协调、持续改进、交付),另一些与产品负责人(PO)相关(愿景、组合优先级、待办事项优化、发布计划),还有一些关键组件与整个组织相关(最高行动团队、MetaScrum、产品发布、度量)[10]。通过评估每个组件的有效性,可以生成一份按优先级排序的改进举措列表。然后,这个改进列表由一个像Scrum团队一样运行的最高行动团队驱动。通过该流程,客户营销发布火车很快确定了几个需要改进的目标领域。改进待办项以最小的努力产生最大的影响为标准排定优先级,并在每个单独的改进实施后重新确定优先级。系统在不断地变化,因此A3流程是针对最高优先级问题的根本原因的非常有效的方法。
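The rule "maximum impact for minimum effort, reprioritized after every improvement" can be sketched as a simple ranking. This is a minimal illustration in Python, not the process Client Marketing actually ran: the `Improvement` class, the impact-to-effort ratio, and the example items and scores are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Improvement:
    name: str
    impact: float  # estimated benefit on an assumed, arbitrary scale
    effort: float  # estimated cost on the same assumed scale


def prioritize(backlog):
    """Rank improvements by impact-to-effort ratio, highest first."""
    return sorted(backlog, key=lambda item: item.impact / item.effort, reverse=True)


# Hypothetical improvement backlog with made-up estimates.
backlog = [
    Improvement("automate regression tests", impact=8, effort=5),
    Improvement("replace status standup", impact=5, effort=1),
    Improvement("on-demand deployments", impact=9, effort=8),
]

ranked = prioritize(backlog)
for item in ranked:
    print(f"{item.name}: ratio {item.impact / item.effort:.2f}")
```

After the top item is implemented, the remaining estimates would be revisited and `prioritize` called again, which mirrors the reprioritization step described above.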


6.1 Scrum Basics Scrum的基础知识

A performance analysis revealed that inconsistent Scrum implementation was a significant impediment to efficiency. In a review of all Client Marketing Scrum teams, it was found that roughly 50% had significantly altered four (Planning, Review, Daily Scrum, Retrospective) of the five Scrum events. Additionally, there was no uniform application of the three Scrum roles (PO, SM, Team) across Client Marketing.

一项性能分析显示,Scrum实施的不一致性明显地阻碍了效率的提升。在对所有客户营销发布火车的Scrum团队调查后,发现大约50%的团队明显地改变了五个Scrum事件中的四个(计划、评审、每日Scrum、回顾)。此外,在客户营销发布火车中对于三个Scrum角色(PO、SM、团队)的应用没有做到一致。

A related issue discovered was the existence of too many roles. Based on the work of the Pasteur Project at Bell Labs with over 200 published case studies, there was ample evidence that too many roles caused poor communication saturation, increasing the need for meetings and causing extensive rework. This poor communication environment was the primary driver of reduced velocity [22]. Leadership hypothesized that an abundance of different roles within Client Marketing was causing process constraints, hindering communication, and slowing down work.

一个与此相关的问题是角色过多。根据贝尔实验室巴斯德项目的工作,以及200多个已发表的案例研究,有充分的证据表明,过多的角色导致沟通不畅,从而导致大量的开会时间以及大量的返工。这种糟糕的沟通环境是速度降低的主要原因[22]。领导层认为客户营销发布火车中的大量不同角色导致了流程的约束、沟通的不畅通以及工作的速度降低。

Figure 2. Roles decrease communication saturation

图2 角色过多减少了沟通饱和度

As seen in Figure 2 from the Bell Labs Pasteur Project [23] involving the first 82 companies audited, communication saturation drops as roles are added. The circled area represents most companies with 20+ roles at ~25% communication saturation. Thus, fewer roles lead to improved communication among team and organization members. For Client Marketing, this directly impacted the target condition of all functions being focused on delivery.

图2来自贝尔实验室巴斯德项目[23],在对于前82家公司的审计中发现,随着角色的增加,沟通饱和度下降。圆圈区域说明了,大多数公司拥有20多个角色,而且只有25%左右的沟通饱和度。因此,较少的角色可以改善团队和组织成员之间的沟通情况。对于客户营销发布火车而言,这直接影响了"所有职能都专注于交付"这一目标状态。


6.2 Deployments Too Far Apart 部署间隔过长

Inconsistent structures, tools, and techniques on Scrum teams were delaying deployments. A recent 7-year study with 13 preparatory case studies developed hypotheses to be tested using an academic framework for understanding what makes Scrum teams effective [24]. Findings indicate that two primary variables determine Scrum team results: frequent releases and a clear understanding of stakeholders' needs. At Client Marketing, deployments occurred only bi-weekly (after each Sprint), took on average two hours, and had to be done after 10 PM local time. These restrictions increased batch sizes and caused unacceptable delays in time-to-market for product enhancements, bug fixes, and new product deployments.

Scrum团队中不一致的结构、工具和技术延迟了部署。最近一项为期7年、包含13个预备案例的研究,提出了一些假设,这些假设将使用一个学术框架进行测试,以了解是什么让Scrum团队变得有效[24]。研究结果表明,两个主要因素决定了Scrum团队的结果:频繁的发布和对利益相关者需求的清晰理解。在客户营销发布火车中,部署仅每两周进行一次(每次冲刺之后),平均需要两个小时,并且必须在当地时间晚上10点之后完成。这些限制增大了批量规模,导致产品增强、缺陷修复和新产品部署的上市时间出现不可接受的延迟。


6.3 Testing 测试

Analysis revealed that testing was a bottleneck, a problem often encountered in software development environments. Time to test and fix in a later Sprint compared to the current Sprint can be 24 times longer for complex hardware/software projects. This has been observed in Europe and Silicon Valley. A recent example was an investigation at Simplivity, a cloud infrastructure company that is now part of Hewlett Packard [25]. For a technology company like Rocket Mortgage, automation of testing and deployment is a good solution for this problem [26]. A small Scaled Agile implementation cut delivery time by 75% with an exemplary DevOps implementation [27].

分析表明,测试是一个瓶颈,这是软件开发环境中经常遇到的问题。对于复杂的软硬件项目来说,与在注入这个缺陷的同一个Sprint测试和修复这个缺陷的代价相比,在之后的Sprint测试和修复这个缺陷的代价要高出24倍之多。这种情况已经在欧洲和硅谷被观察到。最近的一个例子是对Simplivity的一项调查,Simplivity是一家云基础设施公司,现隶属于惠普[25]。对于Rocket Mortgage这样的科技公司来说,自动化测试和部署是解决这个问题的好办法[26]。通过标杆性的DevOps实施,一个小范围的规模化敏捷(SAFe)实施将交付时间缩短了75%[27]。


6.4 Meeting Structures Did Not Drive Delivery 会议结构并没有驱动交付价值

Scrum is designed to minimize meetings and reports as these are the largest source of waste in most organizations. They significantly reduce process efficiency [28], which radically reduces throughput. Systematic, a large consultancy in Europe operating at CMMI Level 5, implemented Scrum and cut project costs in half [29]. Client Marketing leadership did an ROI analysis of all meetings, including the time the meeting took, topics discussed, attendees, and value of outputs and reports, and decided to eliminate anything non-essential to the Scrum framework [11].

Scrum旨在最大限度地减少会议和报告,因为它们是大多数组织中最大的浪费源。它们显著降低了流程效率[28],从根本上降低了吞吐量。Systematic是欧洲的一家大型咨询公司,它通过了CMMI5级认证,在实施了Scrum之后,将项目成本削减了一半[29]。客户营销发布火车的领导层对所有会议进行了投资回报率分析,包括会议时长、讨论的主题、与会者,以及产出和报告的价值,并决定取消对Scrum框架非必需的一切会议[11]。

It was found that priorities and outcomes were not consistently communicated across teams. They were often discussed in separate, loosely attended (often-canceled) meetings with marketing leadership primarily focused on lists of deliverables and delivery dates. Meanwhile, leadership leaned on individual meetings with a single Product Manager and monthly sync-ups with each engineering team to prioritize multiple value streams. These delivery teams planned work quarterly with the organization and experimented inconsistently with shorter cycles to address uncertainty and wasted effort.

研究发现,团队之间对于沟通优先事项和需要的结果没有达成一致。他们经常在单独的、出席人数不多(经常被取消)的会议上和营销部的领导层重点讨论可交付成果和交付日期的清单。与此同时,领导层依靠与单个产品经理的单独会议,以及与每个工程团队的每月同步,来确定多个价值流的优先级。这些交付团队每季度与组织一起做工作计划,为了设法解决不确定性和人力浪费的问题,会在更短的周期内进行一些与计划不一致的尝试。

A daily "Leadership Standup," which followed the typical pattern of a status meeting, was the only organized daily communication. Each SM would talk about what they would be focused on for the day, with updates typically consisting of a list of meetings and action items receiving their attention. This meeting took place first thing in the morning, before the SMs had their Daily Scrum with their teams, produced little to no value, and was disparate from what each team was doing in delivering value or being hindered by impediments.

每日领导层站立会是唯一有组织的日常沟通,它遵循典型的状态会议模式。每个SM都会谈论他们一天的工作重点,更新内容通常包括会议列表和他们关注的行动事项。这个会议是早上第一件事,在SM与他们的团队进行每日Scrum之前举行,但是几乎没有产生任何价值,并且会议内容与每个团队在交付价值方面所做的事情,或受到的阻碍完全不同。


7 Countermeasures Taken 采取对策

Because Scaled Scrum and Scaled Agile are both major Agile frameworks, it is generally accepted that one or the other should be selected [30]. But while the rest of Product Engineering stopped at Scaled Agile, Client Marketing took the extra steps of putting Scaled Scrum on top of Scaled Agile. This allowed them to address all the issues found in their root cause analysis and reach their target conditions, essentially applying the A3 Process to Scaled Agile. As part of the process of adding Scaled Scrum, the countermeasures below were put into place.

因为Scrum@Scale和SAFe都是主要的规模化敏捷框架,人们普遍认为应该选择其中之一[30]。但是,当产品工程部的其他部门满足于SAFe实施时,客户营销发布火车采取了额外的措施,在SAFe之上继续应用Scrum@Scale。这使得他们能够解决在根本原因分析中发现的所有问题,并达到他们的目标条件。这么做本质上是将A3流程应用于SAFe。作为添加Scrum@Scale过程的一部分,他们采取的对策如下:

7.1 Scrum Reset 重置Scrum

In addition to the organization-wide Scaled-Agile transformation, Client Marketing employed four primary tactics to improve and scale its Scrum practice. First, Client Marketing senior leadership completed training with Jeff Sutherland on scaling Scrum. Second, Client Marketing was retrained in the three Scrum roles, the five events, and the three artifacts. Third, all of Client Marketing did a "hard reset," establishing a strict implementation of Scrum at the team level. While each team within the release train was already doing some variation of Scrum, leadership set the expectation that each team would strip away any complexity that had been layered on over the years and revert to following the Scrum Guide [9].

在公司整体实施SAFe的基础上,客户营销发布火车还采用了四个主要策略来改善和扩展Scrum实践。首先,客户营销发布火车高层领导与Jeff Sutherland一起完成了关于规模化Scrum的培训。第二,客户营销发布火车接受了三个Scrum角色、五个事件和三个工件的再培训。第三,所有的客户营销发布火车做了一次 "硬重置",在团队层面建立了严格的Scrum实施。虽然每个团队在发布火车中已经在做一些Scrum的变体,但领导层设定了一个期望,即将每个团队从多年来层层设置的枷锁中剥离出来,并恢复到以遵循Scrum指南为准的状态[9]

Before the reset, only one-third of Scrum Teams held Daily Scrums, Sprint Planning, Sprint Retrospectives, or had a Definition of Ready. None of the Scrum Teams had a Definition of Done. After the reset, all teams followed each ceremony as defined in the Scrum Guide [20] and had Definitions of Ready and Done. Finally, the Client Marketing leadership began to operate as a Scrum team as well.

在重置之前,只有三分之一的Scrum团队举行了每日Scrum、Sprint计划、Sprint回顾,或者有一个"就绪的定义(DoR)"。没有一个Scrum团队有"完成的定义(DoD)"。重置后,所有的团队都遵循了Scrum指南[20]中定义的每个仪式,并且有了DoR和DoD。最后,客户营销发布火车的领导层也开始作为一个Scrum团队运作。

7.2 Scrum Roles Scrum角色

Consistent with the Scrum Guide [9], it was agreed that POs were primarily responsible for increasing the product's value, while SMs were accountable for accelerating delivery. All permutations of these roles were eliminated.

严格将Scrum的角色与Scrum指南[9]一致,即:PO主要负责提高产品的价值,而SM则负责加速交付。这些角色的所有衍变都被取消了。

7.3 MetaScrum 产品决策团队-- MetaScrum

The MetaScrum is a regular meeting (at least once a Sprint) with Senior Leadership, the Chief PO, and senior members of the PO and engineering teams [10]. The purpose is to align the organization with the Chief PO's backlog. This meeting was first implemented at Patient Keeper [31] in 2003 and was later formally defined by the Scrum Patterns Group [32] as essential to high-performing organizations with many Scrum teams. Client Marketing implemented a weekly MetaScrum to coordinate prioritization and dependencies and achieve ongoing alignment with stakeholders. The process was led by the Release Train Engineer and involved consolidating all prioritization, train-level work intake, feature refinement, and dependency coordination discussions down to a weekly meeting with all stakeholders, product managers, and leaders present. This allowed prioritization to be aligned through POs down to each team.

MetaScrum是由高级领导层、首席PO、PO和工程团队的高级成员参加的定期会议(至少每个Sprint一次)[10]。其目的是使组织与首席PO的待办列表保持一致。这个会议于2003年在Patient Keeper[31]首次实施,后来被Scrum模式小组[32]正式定义为MetaScrum,并被视为拥有多个Scrum团队的高绩效组织所必需的实践。客户营销发布火车实施了每周一次的MetaScrum,以协调优先级和依赖,并与利益相关者持续保持一致。这个过程由发布火车工程师(Release Train Engineer-RTE,译者注:SAFe框架中的一个角色)主导,内容包括将所有的优先级排序、火车级的工作接收、特性精炼和依赖协调的讨论整合到一个每周会议中,所有的利益相关者、产品经理和领导都出席。这样可以将优先级通过PO传递到每个团队,从而使各团队对齐。

7.4 Scaled Daily Scrum 规模化的Daily Scrum(SDS)

As MetaScrums began, Client Marketing leadership started holding a Scaled Daily Scrum (SDS) with all the SMs in the Release Train. Two of the most significant changes of the SDS from the previous Leadership Standup were the time it occurred and its focus. The SDS occurred after each team's Daily Scrum, and the focus was purely on delivery. With a dashboard with Sprint burn down charts for each team prominently displayed on a screen, each SM would report on their team's progress towards the Sprint goal and any impediments.

随着MetaScrum的开始实施,客户营销发布火车的领导层开始与发布火车上的所有SM举行规模化的每日Scrum(SDS)。与以前的领导层站立会相比,SDS的两个最重要的变化是它的时间和专注点。SDS发生在每个团队的Daily Scrum之后,其专注点是纯粹的交付。屏幕上的仪表盘醒目地显示着每个团队的Sprint燃尽图,每个SM将报告他们的团队在实现Sprint目标方面的进展和任何障碍。

If a team was not clearly trending towards early Sprint completion, the SM was expected to bring to the SDS one or more impediments that they were focused on removing. If the SM could not eliminate an impediment within hours, it became the highest priority item the Release Train Leader would focus on. The SDS converted a low-value status report meeting about the schedules of SMs into a high-value event focused on taking immediate action, that same day, to resolve the challenges teams were facing in hitting their Sprint goals. This Scaled Scrum ceremony, along with the Daily Scrum, directly attacks and resolves decision latency. The importance of this cannot be overstressed; as the Standish Group noted, “The value of the interval is greater than the quality of the decision” and “The root cause of poor performance in a software project is slow decision latency” [33].

如果一个团队没有明确显示出提前完成Sprint目标的趋势,SM就应该向SDS提出团队遇到的一个或多个障碍,他们要集中精力消除这些障碍。如果SM不能在几个小时内消除一个障碍,它就会成为发布火车领导们需要关注的最高优先级的事项。SDS将一个关于SM日程安排的低价值状态报告会转化为一个高价值的活动,专注于采取立即行动来解决团队当天面临的挑战,以达到Sprint目标。这个Scrum@Scale仪式和Daily Scrum一起,直接攻克和解决决策延迟的问题。这一点的重要性怎么强调都不为过,正如Standish Group所指出的,"决策的效率比决策是否完美更有价值"和"软件项目中表现不佳的根本原因是决策延迟过长"[33]。

7.5 Role Alignment 角色调整

The leadership team examined all roles and their responsibilities. Any role that was not playing a direct part in delivery, or was causing a bottleneck, was eliminated or repurposed. With SMs focused on removing constraints within the team's delivery process, it became increasingly clear that traditional quality assurance and specialized roles that only concentrate on a small part of the delivery lifecycle were causing bottlenecks. SMs worked with team members in these roles to either broaden their skillset to apply to any user story in the backlog or helped them find positions within the broader organization that still required their high degree of specialization.

领导团队检查了所有的角色和他们的责任。任何在交付过程中没有发挥直接作用的角色,或者造成瓶颈的角色,都被取消或重新安排。随着SM专注于消除团队交付过程中的制约因素,大家越来越明显地看到,传统的质量保证和只专注于交付生命周期的一小部分的专门角色正在造成瓶颈。SM与这些角色的团队成员一起工作,要么扩大他们的技能范围,使他们能够承担待办列表中的任何用户故事,要么帮助他们在更广泛的组织内找到仍然需要他们高度专业化的岗位。

For team members in the role of Quality Analyst (QA), there was a six-month transition period during which they became Software Engineers (SEs). Those who were unable to do so were transferred to a specialty test group within the larger organization. Business Analysts (BAs) became Software Engineers or POs, or were assigned to the specialty test group.

对于担任质量分析师(QAs)的团队成员,有六个月的过渡期可以成为软件工程师(SE)。那些不能这样做的人被转移到上级部门的专业测试组。业务分析师们(BAs)要么成为软件工程师、PO或被分配到专业测试组。

7.6 Testing 测试

With the elimination of QA roles, the testing requirements increased for SEs. Because there was already a bottleneck in testing, the Client Marketing leadership team took corrective action. Automation testing was increased. This reduced the bottleneck and minimized the amount of manual testing. Unit, accessibility, and regression testing were all automated. The only testing still performed manually was visual testing on devices and browsers.

随着QA角色的取消,软件工程师的测试需求增加了。由于测试已经出现了瓶颈,客户营销领导团队采取了纠正措施,增加了自动化测试。这减少了瓶颈,并最大限度地减少了人工测试的数量。单元测试、可访问性测试和回归测试都被自动化了。唯一仍在手工进行的测试是设备和浏览器上的视觉测试。

7.7 Quality Gates 质量门禁

Quality gates were added to the pipeline to ensure the high standards of Rocket Mortgage were maintained. The quality gates verified that increased performance and shorter time to market were not detrimental to quality. Successfully passing the gates meant successfully passing performance, regression, and accessibility testing. The gates added were:

质量门禁被添加到流水线中,以确保Rocket Mortgage的高标准得到保持。质量门禁验证了提高性能、保证缩短上市时间的同时不会对质量造成损害。成功通过这些关卡意味着成功通过性能、回归和可访问性测试。增加的门禁包括:

Accessibility Testing

An automated testing tool that produces a score value which is then compared against a predefined minimum score to be considered passing.

可访问性测试

通过一个自动化测试工具,产生一个分数值,然后与预定的最低分数进行比较,如果大于这个最低分数就可以认为是通过了。

Secrets Scanner

The Secrets Scanner looks for plain text secrets within source code. Anything that could grant knowledge of, or access to, items that people should otherwise not have access to would be considered a secret.

秘密扫描器

秘密扫描器寻找源代码中的纯文本秘密(译者注:比如密钥)。任何可能让人们了解或访问其本不应访问的内容的信息,都会被认为是秘密。
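The idea of scanning source text for plain text secrets can be sketched with a few regular expressions. This is a minimal illustration in Python, not the tool Rocket Mortgage used (the text does not name it); the two patterns and the `scan_source` helper are assumptions chosen for the example, and real scanners ship far larger rule sets.

```python
import re

# Illustrative rules only (assumed for this sketch).
SECRET_PATTERNS = {
    "private key header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "hard-coded credential": re.compile(
        r"(?i)\b(password|passwd|secret|api[_-]?key|token)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
}


def scan_source(text: str):
    """Return (line_number, rule_name) for every line that matches a rule."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, rule))
    return findings


# Hypothetical file content with one hard-coded key on line 2.
sample = 'user = "alice"\napi_key = "abcd1234efgh5678"\ntimeout = 30\n'
findings = scan_source(sample)
print(findings)
```

In a pipeline, a non-empty `findings` list would fail the gate so the secret never reaches a shared branch.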

Security Scanner

Security Scanner is an automated scan that checks for any dependency vulnerabilities.

安全扫描器

安全扫描器是一种自动扫描,可以检查任何依赖性的漏洞。

Content Scanner

A content scanner looks for language-specific files and directories that should not exist within source control.

内容扫描器

内容扫描器寻找不应该存在于源代码控制中的特定语言的文件和目录。
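A content scan of this kind amounts to matching tracked paths against a deny list. The sketch below is in Python and assumes a hypothetical deny list (`FORBIDDEN`) and helper (`forbidden_paths`); the text does not say which files or directories the actual gate rejected.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical deny list: build output, caches, and local config
# that usually should not be committed.
FORBIDDEN = ["node_modules", "__pycache__", "*.pyc", "*.class", ".env"]


def forbidden_paths(tracked_files):
    """Return tracked paths in which any path component matches the deny list."""
    hits = []
    for path in tracked_files:
        parts = PurePosixPath(path).parts
        if any(fnmatchcase(part, pat) for part in parts for pat in FORBIDDEN):
            hits.append(path)
    return hits


# Hypothetical list of files under source control.
tracked = [
    "src/app.py",
    "node_modules/left-pad/index.js",
    "build/Main.class",
    ".env",
]
hits = forbidden_paths(tracked)
print(hits)
```

Matching each path component separately lets one pattern catch both a directory anywhere in the tree and a file name by extension.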

Static Code Analysis

A Static Code Analysis tool measures code quality via unit-test coverage and reports information about code quality to CI tools.

静态代码分析

静态代码分析工具通过单元测试覆盖率度量代码质量,并向CI工具报告代码质量信息。
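Several of the gates above reduce to comparing a measured value against a predefined minimum. The following Python sketch shows one way such gates could be chained in a pipeline. The gate names, scores, thresholds, the `>=` comparison, and the stop-at-first-failure behavior are all assumptions for illustration; the text does not describe how the actual pipeline was wired.

```python
from typing import NamedTuple


class GateResult(NamedTuple):
    name: str
    passed: bool
    detail: str


def threshold_gate(name, value, minimum):
    """Build a gate that passes when `value` meets or exceeds `minimum`."""
    def check():
        return GateResult(name, value >= minimum, f"{value} vs minimum {minimum}")
    return check


def run_gates(gates):
    """Run gates in order and stop at the first failure (assumed policy)."""
    results = []
    for gate in gates:
        result = gate()
        results.append(result)
        if not result.passed:
            break
    return results


# Hypothetical scores and thresholds.
results = run_gates([
    threshold_gate("accessibility score", 92, 90),
    threshold_gate("unit-test coverage", 71, 80),
    threshold_gate("performance score", 85, 75),
])
for r in results:
    print(f"{'PASS' if r.passed else 'FAIL'} {r.name}: {r.detail}")
```

With these made-up numbers the run stops at the coverage gate, so the performance gate is never evaluated and the build would not be promoted.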

7.8 Deployments 部署

The ability for each team to autonomously deploy components on demand has been a game-changer in many high-performing organizations like Spotify [34]. Deployment frequency was increased from once at the end of every Sprint to on demand, using a Continuous Integration/Continuous Deployment (CI/CD) pipeline. Dependencies on other release teams were eliminated to allow for on-demand independent releases by each team. Deployments could occur at any time because downtime was eliminated through the use of Blue-Green Deployments [35].

每个团队按需自主部署组件的能力已经成了许多高绩效组织(比如Spotify)的制胜法宝 [34]。部署频率从每个Sprint结束时进行一次提高到利用CI/CD流水线按需持续集成和持续部署。客户营销部通过努力消除了发布团队间的依赖性,允许每个团队按需独立发布。由于通过使用蓝绿部署消除了停机时间,部署可以在任何时候发生[35]

Previously, deployments happened at a maximum of once per week, with a typical cadence of once every two weeks. This was because code deployments involved an external team that manually deployed code to production environments, and this could only be done during late evening hours because of downtime that would occur. The deployments took multiple hours of planning and coordination. The actual code deployment would last 2-3 hours, frequently resulting in needing to roll back the deployment if changes caused unanticipated errors in production.

以前,部署工作最多每周一次,典型的节奏是每两周一次。这是因为代码部署涉及到一个外部团队,他们手动将代码部署到生产环境中,而这只能在晚间进行,因为可能会出现宕机的情况。每次部署都需要几个小时的计划和协调。实际的代码部署将持续2-3个小时,如果变更导致生产环境中出现意想不到的错误,需要频繁的回滚部署。

7.9 Infrastructure 基础设施

The existing infrastructure dependencies caused bottlenecks and increased cycle times. All Scrum Teams were dependent on an external infrastructure team that managed on-premises servers. The servers were a black box to the Software Engineers because only a Systems Engineer on the infrastructure team was allowed to access them directly. The Systems Engineers did not sufficiently understand the production code being deployed to these servers because that was the responsibility of the Software Engineers. This often led to hours spent debugging environment issues and to constant work to keep a consistent state across the testing, staging, and production environments. Software Engineers would develop locally but could not test their code until they promoted it to the test environment, which led to situations where code could not be deployed because it was waiting for a test server to become available. Because the test and staging environments were managed separately, there were often differences between the environments, causing additional issues in testing changes before releasing to production.

目前的基础设施依赖性造成了瓶颈并增加了周期时间。所有的Scrum团队都依赖于一个外部的基础设施团队来管理内部的服务器。这些服务器对软件工程师来说是一个黑匣子,因为只有基础设施团队的系统工程师被允许直接访问它。系统工程师并不充分了解部署在这些服务器上的生产代码,因为那是软件工程师的责任。这往往会导致花费数小时来调试环境问题,并不断努力保持测试、预发环境和生产环境的状态一致。软件工程师会在本地开发,但在提交到测试环境之前无法测试他们的代码,这将导致代码无法部署,因为它在等待可用的测试服务器。因为测试环境和预发环境是分开管理的,环境之间经常有差异,导致在发布到生产之前,当测试变化时会引起额外的问题。

To eliminate these dependencies, infrastructure was containerized, and a Docker environment was implemented, including automated migration and setup/termination as needed using Terraform scripting. Additionally, a blue/green deployment model was implemented to test before moving traffic. This allowed each Scrum Team to work independently, without depending on the external infrastructure team.

为了消除这些依赖性,基础设施被容器化,并实施了Docker环境,包括自动迁移、设置和终止,根据需要使用terraform 脚本。此外,采用蓝/绿部署模型,在用户使用系统产生流量之前进行测试。这使得每个Scrum团队可以独立工作,不需要依赖外部的基础设施团队。
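
The blue/green model described above can be sketched as a small in-memory simulation: the new version goes to the idle environment, is tested there, and traffic moves only if the test passes. The `BlueGreenRouter` class and its `health_check` callback are hypothetical names used for illustration; the actual deployment relied on Docker and Terraform, which this sketch does not model.

```python
class BlueGreenRouter:
    """Toy model of two environments where only one receives live traffic."""

    def __init__(self, live="blue"):
        self.live = live
        self.versions = {"blue": None, "green": None}

    @property
    def idle(self):
        # The environment that is not currently serving traffic.
        return "green" if self.live == "blue" else "blue"

    def deploy(self, version, health_check):
        """Install version on the idle side; switch traffic only if it is healthy."""
        target = self.idle
        self.versions[target] = version
        if not health_check(version):
            return False  # live traffic never touched the failing version
        self.live = target
        return True
```

Because the previous version keeps running on the other environment, a failed check leaves users unaffected, which is why deployments no longer had to be confined to late-evening windows.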

7.10 OKRs OKR

Objectives and Key Results [36] were established for all Release Trains. The specific relevant Key Results were:

New-New Code Coverage

      • A target coverage rate for automated testing of all new lines of code for either legacy or new applications was established.

Overall Code Coverage

      • A target coverage rate for automated testing overall (new and legacy code) was established.

Mean Time To Recovery (MTTR)

      • The failure of any individual gate within the CI pipeline must be recovered within 24 hours.

所有的发布火车都建立了OKR[36]。具体的相关关键结果(KR)是:

新代码覆盖率

  • 无论是遗留的还是新的应用程序,对所有新的代码行确立了自动化测试的目标覆盖率。

总体代码覆盖率

  • 对全部新代码和遗留代码,确立了自动化测试的总体覆盖率目标。

平均恢复时间(MTTR)

  • CI流水线中的任何单个故障必须在24小时内恢复。
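
The Key Results above reduce to simple checks. The sketch below shows one possible way to compute a coverage percentage and to test the 24-hour recovery rule. The function names are hypothetical, and the coverage targets themselves are not stated in the paper, so none are assumed here.

```python
from datetime import datetime, timedelta


def coverage(covered_lines, total_lines):
    """Coverage as a percentage; an empty change set counts as fully covered."""
    if total_lines == 0:
        return 100.0
    return 100.0 * covered_lines / total_lines


def gate_recovered_in_time(failed_at, recovered_at, limit=timedelta(hours=24)):
    """True if a failed CI gate was recovered within the limit (24 hours by default)."""
    return recovered_at - failed_at <= limit
```

New-code coverage would pass only the lines added or changed in a commit to `coverage`, while overall coverage would pass every line in the code base.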


8 Results 结论

As one of the largest Scaled Agile implementations, Rocket Mortgage was more successful than most large implementations, reducing cycle time by 46.5%. The goal of implementing Scaled Scrum on top of Scaled Agile was further reduction in cycle time, higher quality, and improved customer and team satisfaction.

作为最大的SAFe实施案例之一,Rocket Mortgage比大多数规模化敏捷实施更成功,交付周期减少了46.5%。在 SAFe的基础上实施Scrum@Scale的目标是进一步缩短交付周期,提高质量,并提升客户和团队的满意度。

8.1 Cycle Time Reduction 缩短交付周期

By leveraging the Scaled Scrum practices described above, in addition to the organizational Scaled Agile rollout, Client Marketing was able to further improve feature delivery cycle times, feature throughput, and planned work (commitment) completion rates. Table 1 shows that Client Marketing reduced the average feature cycle time by 75%, from 86 days to 21 days. The first year saw a 51% reduction, from 86 days to 42 days, with another 50% reduction to 21 days the second year. The result was that the implementation of Scaled Scrum brought down cycle time another 55.12% over Scaled Agile alone. Additionally, feature predictability improved by a factor of 3x with a cycle time standard deviation reduction from 46 days to 14 days. When normalized for 10-week releases, feature throughput also increased by 340% for the train. Commitment completion increased by 91% for Client Marketing, bringing it on par with the rest of the organization.

通过利用上述的Scrum@Scale实践,在原来组织层面SAFe推广的基础上,客户营销部能够进一步改善特性交付周期,特性吞吐量,以及承诺完成率。表1显示,客户营销部将平均特性周期时间减少了75%,从86天减少到21天。第一年减少了51%,从86天减少到42天,第二年又减少了50%,达到21天。结果表明,实施Scrum@Scale后,交付周期比单独实施SAFe又减少了55.12%。此外,需求的可预测性提高了3倍,交付周期的标准差从46天减少到14天。如果将10周的发布规范化,特性吞吐量增加了340%。客户营销部的承诺完成率提高了91%,使其与组织的其他部门持平。
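
Several of the percentages quoted above can be reproduced from the stated before and after values. The helpers below are a minimal sketch with hypothetical names. The throughput figures (5 and 22) and commitment figures (46% and 88%) are taken from Table 1 and its notes, on the assumption that these are the values behind the 340% and 91% claims.

```python
def pct_reduction(before, after):
    """Percentage decrease from before to after."""
    return 100.0 * (before - after) / before


def pct_increase(before, after):
    """Percentage increase from before to after."""
    return 100.0 * (after - before) / before


print(round(pct_reduction(86, 21), 1))  # overall cycle time reduction, 86 to 21 days
print(round(pct_reduction(86, 42), 1))  # first year, 86 to 42 days
print(round(pct_reduction(42, 21), 1))  # second year, 42 to 21 days
print(round(46 / 14, 1))                # standard deviation ratio, 46 to 14 days
print(round(pct_increase(5, 22), 1))    # feature throughput, 5 to 22
print(round(pct_increase(46, 88), 1))   # commitment completion, 46% to 88%
```

These evaluate to roughly 75.6, 51.2, 50.0, 3.3, 340.0, and 91.3, matching the rounded figures in the text.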

Table 1. Pre- and Post-Scaled Scrum and Agile Metrics

表1. Scrum@Scale和SAFe执行前、后的度量指标


| Metric 指标 | Rocket Mortgage Pre-Scaled Agile, End of Q3 2017 (Rocket Mortgage 实施SAFe前, 截止到2017 Q3) | Rocket Mortgage Post-Scaled Agile, End of Q3 2019 (Rocket Mortgage 实施SAFe后, 截止到2019 Q3) | Client Marketing Post-Scaled Scrum, End of Q3 2019 (客户营销部实施Scrum@Scale后, 截止到2019 Q3) |
| --- | --- | --- | --- |
| Feature Cycle Time 特性交付周期时间 | 71 days / 83 days ¹ | 33 days | 21 days / 11.6 days ¹ |
| Feature Throughput 特性吞吐量 | 414 / 5 ² | 833 | 22 |
| Feature σ 需求σ | 82 days | 42 days | 14 days |
| Commitment Completion 承诺完成率 | 60% / 46% ³ | 88% | |

  1. Feature Cycle Times for Client Marketing were 83 days and 11.6 days

  2. Feature throughput for Client Marketing was 5

  3. Commitment completion for Client Marketing was 46%

1. 客户营销部的交付周期为83天和11.6天

2. 客户营销部的特性吞吐量为5

3. 客户营销部的承诺完成率为46%

8.2 Improved Administration Efficiency 提升管理效能

Through restructuring roles and modifying its approach to software quality, deployment operations, and product ownership, Client Marketing saw a change in its ratio of non-delivery roles (team leaders, business analysts, quality assurance) to production-focused roles (engineers and developers). The overall change went from ~2:1 in Q3 2017 to ~1:2 by Q3 2019, indicating an increased focus on delivering working software rather than managerial overhead. Figure 3 shows the correlation between cycle time reduction (top chart blue line) and the decrease of admin roles (bottom chart blue line). Section one reveals a negative trend where cycle times and non-delivery roles were both increasing. Section two represents the period of change from admin roles. Section three indicates the onboarding of additional teams and adding PO roles (considered non-delivery roles for these graphs, thus the rise in the bottom chart).

Figure 3. Rocket Mortgage Role Restructuring

图3 Rocket Mortgage 角色重组

通过重组角色和修改其对软件质量、部署操作和产品所有权的方法,客户营销部的非交付角色(团队领导、业务分析师、质量保证)与交付角色(工程师和开发人员)的比例发生了变化:总的来说从2017年第三季度(Q3)的大约2:1到2019年第三季度(Q3)的大约1:2。这表明组织越来越注重交付可工作软件,而不是管理上的开销。图3显示了交付周期缩短(上图蓝线)和管理角色减少(下图蓝线)之间的相关性。第一部分显示了一个消极的趋势,周期时间和非交付角色都在增加。第二部分代表了管理角色的变化时期。第三部分显示了因为有更多团队的加入增加了PO角色(在这些图表中被视为非交付角色,因此底部图表中是上升的)。

8.3 Delivery Focus 聚焦交付

The Scrum Reset brought a new level of clarity to the SM and PO. The MetaScrum and SDS provided frequent cycles in which POs and SMs could receive real-time feedback on their performance, take corrective action, and compare results before the next increment. Through this increased visibility and clarity, SMs and POs typically either saw a dramatic improvement in their effectiveness or were able to determine that the role did not fit their strengths objectively. This led to Client Marketing having the right people in key leadership roles, resulting in a high-performance culture focused on delivering value. The result of having effective SMs and POs, along with delivery teams made up of team members with multiple skills, was a team structure and process that drove shorter cycle times for delivering value to clients.

Scrum重置给SM和PO带来了新的清晰度。MetaScrum和SDS提供了频繁的循环,PO和SM可以收到关于他们表现的实时反馈,采取纠正措施,并在下一次增量之前比较结果。通过这种增加的可视化和清晰度,SM和PO通常要么看到他们的效率有了极大的提高,要么能够客观地确定这个角色不符合他们的强项。这使得客户营销部在关键的领导岗位上有了合适的人选,从而形成了一种专注于交付价值的高绩效文化。拥有高水平的SM和PO,同时配备一个具有多种技能的执行团队,是最好的团队组合方式,它能以最短时间向客户提供有价值的服务。

8.4 OKRs OKR

The tracking of OKRs became a bellwether for Rocket Mortgage technical teams to be proactive in their work. Client Marketing’s code coverage data is presented in Table 2. Other OKRs either had limited or no pre-Agile data, or contained proprietary information.

OKR的跟踪成为Rocket Mortgage技术团队积极主动工作的风向标。客户营销部的代码覆盖率数据如表2所示。其他的OKR要么是有限的或没有敏捷前的数据,要么包含专有信息。

Table 2. Client Marketing overall code coverage

表2 客户营销部的整体代码覆盖率

| 2018 | 2019 | 2020 |
| --- | --- | --- |
| 43.73% | 53.15% | 69.34% |

8.5 Deployments 部署

Deployment rates were improved through a careful process. After having all prerequisites in place, releases could be done on-demand using a CI/CD gated pipeline as previously mentioned instead of the previous once per Sprint limitation.

通过一个谨慎的过程,部署频率得到了提高。在具备所有的先决条件后,可以使用之前提到的CI/CD流水线按需发布,而不局限于每个Sprint发布一次。

8.6 Release Planning 发布计划

In addition to faster deployments, release planning frequency was also increased. A survey by Puppet and DORA showed that organizations with high-performing DevOps enjoy 22% less time spent on unplanned work, 3x lower change failure rates, and enhanced employee engagement [37]. Client Marketing incrementally increased their original release planning cadence of once per quarter to five times per year, then once per month, and then eliminated it with the implementation of continuous planning.

除了更快的部署之外,发布计划的频率也得到了提高。Puppet和DORA的一项调查显示,拥有高绩效的DevOps的组织在计划外工作上花费的时间减少了22%,变更失败率降低了3倍,员工的敬业度也有所提高[37]。客户营销部逐步将他们原来每季度一次的发布计划节奏增加到每年五次,然后是每月一次,然后随着持续计划的实施将发布计划取消。



9 Summary 总结

Rocket Mortgage accelerated feature delivery by aligning teams to business-value stream-focused release trains, providing clarity on organizational objectives, and giving teams autonomy to hit those objectives. By also using Scaled Scrum, Client Marketing implemented a MetaScrum to synchronize product backlogs and stakeholders across teams, and the Scaled Daily Scrum to synchronize the leadership team on impediment removal. The team has shown that Scaled Scrum and Scaled Agile frameworks are fully compatible, and companies can achieve great results by leveraging them together.

Rocket Mortgage让团队聚焦在业务价值流发布火车,提供明确的组织目标,并赋予团队实现这些目标的自主权,从而加速了特性的交付。通过使用Scrum@Scale,客户营销部采用MetaScrum来同步产品待办列表和跨团队的利益相关方,通过执行Scaled Daily Scrum同步领导团队,清除障碍。通过团队实践证明,Scrum@Scale和SAFe是完全兼容的,公司可以充分利用它们来取得巨大的成果。

With the caveats noted in the Future Research section of this paper, the authors believe the steps taken by Client Marketing as outlined in this paper to be generalizable to other companies and industries for several reasons. First, Client Marketing’s success was predicated on using Scaled Agile and Scaled Scrum “by the book.” No customizations were made to either framework to match the processes of Client Marketing. Instead, Client Marketing changed to follow the frameworks. Second, Client Marketing implemented standard software development procedures such as CI/CD [38-41]. And third, Client Marketing uses commercial off-the-shelf (COTS) software, such as CircleCI, Docker, AWS, and SonarQube.

依据本文下一章未来研究部分中提到的注意事项,作者认为本文概述的客户营销部所采取的措施可以推广到其他公司和行业,原因有以下几点:首先,客户营销部的成功是以“按部就班”地使用SAFe和Scrum@Scale为前提的。没有对这两个框架进行定制以匹配客户营销的流程。相反,客户营销部对自身做出了改变来适应这两个框架。第二,客户营销部实施了标准的软件开发流程,如CI/CD[38-41]。第三,客户营销部使用现成的商业化软件(COTS),如CircleCI、Docker、AWS和SonarQube。

10 Future Research 未来研究

The learnings from the Client Marketing Release Train apply to the broader organization. A movement is underway towards wider MetaScrum implementation, Scaled Daily Scrums, delivery focus, and fewer roles.

从客户营销发布火车中获得的经验适用于更广泛的组织。目前正朝着更广泛的MetaScrum实施、规模化每日Scrum、聚焦交付和更少的角色发展。

Two significant trends are relevant to Agile transformations in all industries. First, Rocket Mortgage is moving to business units based on value streams where all teams can directly see their effect on organizational performance. Second, continuous delivery is being expanded to enable automated rollout to and testing with a subgroup, followed by an automatic rollback in the event of problems or an automated rollout to a broader base in the case of success.

有两个重要的趋势与所有行业的敏捷转型相关。首先,Rocket Mortgage 正在转向基于价值流的业务单元,所有团队可以直接看到他们对组织绩效的影响。第二,扩大持续交付的范围,能够实现分组的自动部署和测试,在出现问题时可以自动回滚,或者在成功时自动地部署到更多的分组。

Value stream organizations with continuous delivery eliminate excessive overhead in release planning, which enables the organization to respond more quickly to client requests and market changes.

具有持续交付的价值流组织的效果消除了发布计划中的过度开销,使组织能够更迅速地响应客户的要求和市场变化。

The large number of intervention components makes it difficult to ascertain which strategies from the black box contributed to which delivery improvements (i.e., measuring the role MetaScrum or SDS played in improving cycle times vs. DevOps practices vs. staffing changes). Future research could be directed at understanding the salience of individual strategies in affecting cycle times. There was also limited availability of pre-Agile transformation data at Rocket Mortgage. Proprietary information also limited the amount of shareable data. Finally, there is limited previous research on combining Scaled Agile and Scaled Scrum frameworks. The authors encourage continued study of the findings and practices outlined in this paper and believe they are a strong foundation for future research.

大量的干预成分使得我们很难确定黑箱中的哪些策略有助于改善交付(例如衡量MetaScrum或SDS在提升周期时间方面所起的作用与DevOps实践和人员配置变化)。未来的研究可以着眼于了解个别战略在影响周期时间方面的突出作用。在Rocket Mortgage,敏捷转型前的数据也很有限。专有信息也限制了可共享的数据量。最后,以前对SAFe和Scrum@Scale框架相结合的研究有限。作者鼓励大家在本文所阐述的发现和实践上继续研究,并相信它们是未来研究的坚实基础。


10. References 参考文献

  1. C. Handscomb, D. Mahadevan, L. Schor, M. Siebere, E. Naidoo, and S. Srinivasan. (2020, 21 May). An Operating Model for the Next Normal: Lessons from Agile Organizations in the Crisis. Available: https://www.mckinsey.com/business-functions/organization/our-insights/an-operating-model-for-the-next-normal-lessons-from-agile-organizations-in-the-crisis

  2. M. Reeves, K. Whitaker, and T. Deegan, "Fighting the Gravity of Average Performance," MIT Sloan Management Review, 2020.

  3. O. Mikhieieva and K. Stephan, "A Retrospective on Agile Transformations: Survey Results on Agility of German Organisations," in 2020 IEEE European Technology and Engineering Management Summit, 2020.

  4. S. Wu. (2021, May 31). Cerno. Available: https://www.scaledagile.com/case_study/cerno/

  5. R. Barnahor. (2021, May 31). SAFe Case Study: EdgeVerve Systems. Available: https://www.scaledagileframework.com/case-study-edgeverve-systems/

  6. S. Jagadeesan. (2021, May 31). Case Study: Royal Philips. Available: https://www.scaledagileframework.com/royal-phillips-case-study/

  7. L. Christopher and M. Vries, "Selecting a scaled Agile approach for a Fin Tech company," South African Journal of Industrial Engineering, vol. 31, pp. 196-208, 2020.

  8. H. Takeuchi and I. Nonaka, "The New New Product Development Game," Harvard Business Review, 1986.

  9. K. Schwaber and J. Sutherland, "The Scrum Guide - The Definitive Guide to Scrum: The Rules of the Game," Scrumguides.org, November 2020.

  10. J. Sutherland and ScrumInc, "The Scrum At Scale® Guide: The Definitive Guide to the Scrum@Scale Framework Version 2.3," Scrum Inc., Cambridge, MA, 2021.

  11. D. Rigby, J. Sutherland, and H. Takeuchi, "Embracing Agile," Harvard Business Review, May 2016.

  12. J. Shook, "How to Change a Culture: Lessons From NUMMI," MIT Sloan Management Review, vol. 51, 2010.

  13. D. K. I. Sobek and A. Smalley, Understanding A3 Thinking: A Critical Component of Toyota’s PDCA Management System. New York: Productivity Press, 2008.

  14. J. Shook and Womack, Managing to Learn: Using the A3 Management Process to Solve Problems, Gain Agreement, Mentor and Lead. Cambridge MA: Lean Enterprises Institute Inc., 2008.

  15. E. Elbourne, Fundamentals of Industrial Administration: An Introduction to Management. London: MacDonald & Evans, 1949.

  16. E. Baker, The Symphony of Profound Knowledge. Bloomington IN: iUniverse, 2016.

  17. T. Ohno, Toyota Production System: Beyond Large-Scale Production: Productivity Press, 1988.

  18. M. Tromans, "DEPD-PKT Software Team Process Improvement (A3)," Toyota Kentucky, 2016.

  19. A. Smalley, 4 Types of Problems. Cambridge MA: Lean Enterprise Institute, 2018.

  20. ScaledAgile. (2021, June 1). SAFe DevOps Series. Available: https://www.scaledagileframework.com/devops/

  21. G. Hermkes and L. Quintela, Scaling Done Right: How to Achieve Business Agility with Scrum@Scale and Make the Competition Irrelevant. Berlin: Behendigkeit Publishing, 2020.

  22. J. O. Coplien and N. Harrison, Organizational patterns of agile software development. Upper Saddle River, NJ: Pearson Prentice Hall, 2005.

  23. J. O. Coplien, "Borland Software Craftsmanship: A New Look at Process, Quality and Productivity," in 5th Annual Borland International Conference, Orlando, FL, 1994.

  24. C. Verwijs and D. Russo, "A Theory of Scrum Team Effectiveness," IEEE Transactions on Software Engineering, May 26 2021.

  25. S. Daukus, "Test and fix time inside vs outside a sprint.," Cambridge MA: Scrum Inc, 2016.

  26. J. Humble and D. Farley, Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation: Addison Wesley, 2010.

  27. ScaledAgile. (2021, June 3). Case Study: Telstra. Available: www.scaledagileframework.com/telstra-case-study/

  28. F. Verbruggen, J. Sutherland, J. M. van der Werf, and S. Brinkkemper, "Process Efficiency – Adapting Flow to the Agile Improvement Effort," in 52nd Hawaii International Conference on System Sciences, Hawaii, 2019, pp. 6981-6987.

  29. C. R. Jakobsen and J. Sutherland, "Scrum and CMMI Going from Good to Great," in Agile Conference, 2009. AGILE '09., 2009, pp. 333-337.

  30. M. Venema. (2021, 29 August). 6 Scaled Agile Frameworks Which One Is Right For You? Available: https://www.digite.com/blog/scaled-agile-frameworks/

  31. J. Sutherland, "Future of Scrum: Parallel Pipelining of Sprints in Complex Projects," presented at the AGILE 2005 Conference, Denver, CO, 2005.

  32. J. Sutherland and J. Coplien, The Scrum Book: The Spirit of the Game: Pragmatic Bookshelf, 2019.

  33. J. Johnson, Chaos 2020: Beyond Infinity. Boston MA: Standish Group, 2020.

  34. H. Kniberg. (2019, June 7). Spotify: A Scrum@Scale Case Study. Available: https://resources.scrumalliance.org/Article/spotify-scrum@scale-case-study

  35. M. Fowler. (2010, 20 Sep). BlueGreenDeployment. Available: https://martinfowler.com/bliki/BlueGreenDeployment.html

  36. J. Doerr, Measure What Matters: How Google, Bono, and the Gates Foundation Rock the World with OKRs: Portfolio, 2018.

  37. Puppet and DORA, "State of Dev/Ops Report 2016," Puppet, 2016.

  38. Red Hat. (2018, 29 August). What is CI/CD? Available: https://www.redhat.com/en/topics/devops/what-is-ci-cd

  39. Cisco. (2021, 29 August). What is CI/CD? Available: https://www.cisco.com/c/en/us/solutions/data-center/data-center-networking/what-is-ci-cd.html

  40. A. Crawford. (2019, 29 August). DevOps. Available: https://www.ibm.com/cloud/learn/devops-a-complete-guide

  41. Amazon. (2021, 29 August). What is Dev/Ops? Available: https://aws.amazon.com/devops/what-is-devops/

原文链接: http://mp.weixin.qq.com/s?__biz=MzkyNzMwMTUyNA==&mid=2247483772&idx=1&sn=6f2a38bc511b692412b5de214ba3ce20&chksm=c22b6b87f55ce291b738741cb099fc09d88375e047ad7bdecc78cd77d1c886aa3962ae300429#rd