Tiger Style

Author: TigerBeetle

About

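This article is TigerBeetle's official style guide. It sets out the design principles and working practices behind TigerBeetle's approach to safety, performance, and developer experience. TigerBeetle is a high-performance distributed database designed for financial transaction systems, built to provide very high safety, performance, and reliability.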

The Essence Of Style

“There are three things extremely hard: steel, a diamond, and to know one's self.” — Benjamin Franklin

TigerBeetle's coding style is evolving. A collective give-and-take at the intersection of engineering and art. Numbers and human intuition. Reason and experience. First principles and knowledge. Precision and poetry. Just like music. A tight beat. A rare groove. Words that rhyme and rhymes that break. Biodigital jazz. This is what we've learned along the way. The best is yet to come.

Why Have Style?

Another word for style is design.

“The design is not just what it looks like and feels like. The design is how it works.” — Steve Jobs

Our design goals are safety, performance, and developer experience. In that order. All three are important. Good style advances these goals. Does the code make for more or less safety, performance or developer experience? That is why we need style.

Put this way, style is more than readability, and readability is table stakes, a means to an end rather than an end in itself.

“...in programming, style is not something to pursue directly. Style is necessary only where understanding is missing.” ─ Let Over Lambda

This document explores how we apply these design goals to coding style. First, a word on simplicity, elegance and technical debt.

On Simplicity And Elegance

Simplicity is not a free pass. It's not in conflict with our design goals. It need not be a concession or a compromise.

Rather, simplicity is how we bring our design goals together, how we identify the “super idea” that solves the axes simultaneously, to achieve something elegant.

“Simplicity and elegance are unpopular because they require hard work and discipline to achieve” — Edsger Dijkstra

Contrary to popular belief, simplicity is also not the first attempt but the hardest revision. It's easy to say “let's do something simple”, but to do that in practice takes thought, multiple passes, many sketches, and still we may have to “throw one away”.

The hardest part, then, is how much thought goes into everything.

We spend this mental energy upfront, proactively rather than reactively, because we know that when the thinking is done, what is spent on the design will be dwarfed by the implementation and testing, and then again by the costs of operation and maintenance.

An hour or day of design is worth weeks or months in production:

“the simple and elegant systems tend to be easier and faster to design and get right, more efficient in execution, and much more reliable” — Edsger Dijkstra

Technical Debt

What could go wrong? What's wrong? Which question would we rather ask? The former, because code, like steel, is less expensive to change while it's hot. A problem solved in production is many times more expensive than a problem solved in implementation, or a problem solved in design.

Since it's hard enough to discover showstoppers, when we do find them, we solve them. We don't allow potential memcpy latency spikes, or exponential complexity algorithms to slip through.

“You shall not pass!” — Gandalf

In other words, TigerBeetle has a “zero technical debt” policy. We do it right the first time. This is important because the second time may not transpire, and because doing good work, that we can be proud of, builds momentum.

We know that what we ship is solid. We may lack crucial features, but what we have meets our design goals. This is the only way to make steady incremental progress, knowing that the progress we have made is indeed progress.

Safety

“The rules act like the seat-belt in your car: initially they are perhaps a little uncomfortable, but after a while their use becomes second-nature and not using them becomes unimaginable.” — Gerard J. Holzmann

NASA's Power of Ten — Rules for Developing Safety Critical Code will change the way you code forever. To expand:

  • Use only very simple, explicit control flow for clarity. Do not use recursion, so that all executions that should be bounded are bounded. Use only a minimum of excellent abstractions, and only if they make the best sense of the domain. Abstractions are never zero cost. Every abstraction introduces the risk of a leaky abstraction.

  • Put a limit on everything because, in reality, this is what we expect—everything has a limit. For example, all loops and all queues must have a fixed upper bound to prevent infinite loops or tail latency spikes. This follows the “fail-fast” principle so that violations are detected sooner rather than later. Where a loop cannot terminate (e.g. an event loop), this must be asserted.

  • Use explicitly-sized types like u32 for everything; avoid architecture-specific usize.

  • Assertions detect programmer errors. Unlike operating errors, which are expected and which must be handled, assertion failures are unexpected. The only correct way to handle corrupt code is to crash. Assertions downgrade catastrophic correctness bugs into liveness bugs. Assertions are a force multiplier for discovering bugs by fuzzing.

    • Assert all function arguments and return values, pre/postconditions and invariants. A function must not operate blindly on data it has not checked. The purpose of a function is to increase the probability that a program is correct. Assertions within a function are part of how functions serve this purpose. The assertion density of the code must average a minimum of two assertions per function.

    • Pair assertions. For every property you want to enforce, try to find at least two different code paths where an assertion can be added. For example, assert validity of data right before writing it to disk, and also immediately after reading from disk.

    • On occasion, you may use a blatantly true assertion instead of a comment as stronger documentation where the assertion condition is critical and surprising.

    • Split compound assertions: prefer assert(a); assert(b); over assert(a and b);. The former is simpler to read, and provides more precise information if the condition fails.

    • Use single-line if to assert an implication: if (a) assert(b).

    • Assert the relationships of compile-time constants as a sanity check, and also to document and enforce subtle invariants or type sizes. Compile-time assertions are extremely powerful because they are able to check a program's design integrity before the program even executes.

    • The golden rule of assertions is to assert the positive space that you do expect AND to assert the negative space that you do not expect because where data moves across the valid/invalid boundary between these spaces is where interesting bugs are often found. This is also why tests must test exhaustively, not only with valid data but also with invalid data, and as valid data becomes invalid.

    • Assertions are a safety net, not a substitute for human understanding. With simulation testing, there is the temptation to trust the fuzzer. But a fuzzer can prove only the presence of bugs, not their absence. Therefore:

      • Build a precise mental model of the code first,
      • encode your understanding in the form of assertions,
      • write the code and comments to explain and justify the mental model to your reviewer,
      • and use VOPR as the final line of defense, to find bugs in your and reviewer's understanding of code.
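Several of the assertion rules above can be sketched together. This is a minimal illustration only: `sector_size`, `block_size` and `balance_after_debit` are hypothetical names invented for the example, not TigerBeetle code.

```zig
const std = @import("std");
const assert = std.debug.assert;

// Hypothetical constants, chosen only for illustration.
const sector_size: u32 = 4096;
const block_size: u32 = 64 * 1024;

// Compile-time assertions check design invariants before the program runs.
comptime {
    assert(block_size % sector_size == 0);
    assert(@sizeOf(u64) == 8);
}

/// Hypothetical helper: debits `amount` from `balance`.
/// A `limit` of zero means that no limit applies.
fn balance_after_debit(balance: u64, amount: u64, limit: u64) u64 {
    // Preconditions, split so that a failure points at exactly one property.
    assert(amount > 0);
    assert(amount <= balance);
    // A single-line `if` asserts an implication.
    if (limit > 0) assert(amount <= limit);

    const result = balance - amount;

    // Postcondition: a non-zero debit always shrinks the balance.
    assert(result < balance);
    return result;
}
```

The function carries four assertions, above the stated minimum average of two per function, and the `comptime` block fails the build rather than the running program if the constants drift apart.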
  • All memory must be statically allocated at startup. No memory may be dynamically allocated (or freed and reallocated) after initialization. This avoids unpredictable behavior that can significantly affect performance, and avoids use-after-free. As a second-order effect, it is our experience that this also makes for more efficient, simpler designs that are more performant and easier to maintain and reason about, compared to designs that do not consider all possible memory usage patterns upfront as part of the design.

  • Declare variables at the smallest possible scope, and minimize the number of variables in scope, to reduce the probability that variables are misused.

  • Restrict the length of function bodies to reduce the probability of poorly structured code. We enforce a hard limit of 70 lines per function.

    Splitting code into functions requires taste. There are many ways to cut a wall of code into chunks of 70 lines, but only a few splits will feel right. Some rules of thumb:

    • Good function shape is often the inverse of an hourglass: a few parameters, a simple return type, and a lot of meaty logic between the braces.
    • Centralize control flow. When splitting a large function, try to keep all switch/if statements in the "parent" function, and move non-branchy logic fragments to helper functions. Divide responsibility. All control flow should be handled by one function, the rest shouldn't care about control flow at all. In other words, "push ifs up and fors down".
    • Similarly, centralize state manipulation. Let the parent function keep all relevant state in local variables, and use helpers to compute what needs to change, rather than applying the change directly. Keep leaf functions pure.
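One way to read "push ifs up and fors down" together with "keep leaf functions pure" is sketched below. The `Action` enum and the `apply`, `credit_compute` and `debit_compute` functions are hypothetical names made up for this illustration.

```zig
const std = @import("std");
const assert = std.debug.assert;

const Action = enum { credit, debit };

/// Parent function: owns all branching and decides what the new state is.
fn apply(balance: u64, action: Action, amount: u64) u64 {
    const balance_new = switch (action) {
        .credit => credit_compute(balance, amount),
        .debit => debit_compute(balance, amount),
    };
    return balance_new;
}

// Leaf helpers: pure, no branching on the action, no state mutation.
// They only compute what needs to change.
fn credit_compute(balance: u64, amount: u64) u64 {
    assert(amount <= std.math.maxInt(u64) - balance);
    return balance + amount;
}

fn debit_compute(balance: u64, amount: u64) u64 {
    assert(amount <= balance);
    return balance - amount;
}
```

The caller of `apply` stores the returned value itself, so the state change stays visible in one place.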
  • Appreciate, from day one, all compiler warnings at the compiler's strictest setting.

  • Whenever your program has to interact with external entities, don't do things directly in reaction to external events. Instead, your program should run at its own pace. Not only does this make your program safer by keeping the control flow of your program under your control, it also improves performance for the same reason (you get to batch, instead of context switching on every event). Additionally, this makes it easier to maintain bounds on work done per time period.

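The "put a limit on everything" and "run at your own pace" rules above might look like this in practice. It is a sketch under assumptions: `batch_events_max` and `drain` are hypothetical, and the events are plain integers standing in for whatever an external source delivers.

```zig
const std = @import("std");
const assert = std.debug.assert;

// Hypothetical limit, for illustration only.
const batch_events_max: u32 = 8;

/// Consumes at most `batch_events_max` events per call, adding them to `sum`.
/// Returns the number of events consumed; the rest wait for the next call.
fn drain(events: []const u32, sum: *u64) u32 {
    var count: u32 = 0;
    for (events) |event| {
        if (count == batch_events_max) break;
        sum.* += event;
        count += 1;
    }
    // The bound is asserted, not merely hoped for.
    assert(count <= batch_events_max);
    assert(count <= events.len);
    return count;
}
```

However many events have queued up, one call does a bounded amount of work, which keeps the work done per time period predictable.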

Beyond these rules:

  • Compound conditions that evaluate multiple booleans make it difficult for the reader to verify that all cases are handled. Split compound conditions into simple conditions using nested if/else branches. Split complex else if chains into else { if { } } trees. This makes the branches and cases clear. Again, consider whether a single if does not also need a matching else branch, to ensure that the positive and negative spaces are handled or asserted.

  • Negations are not easy! State invariants positively. When working with lengths and indexes, this form is easy to get right (and understand):

    if (index < length) {
      // The invariant holds.
    } else {
      // The invariant doesn't hold.
    }
    

    This form is harder, and also goes against the grain of how index would typically be compared to length, for example, in a loop condition:

    if (index >= length) {
      // It's not true that the invariant holds.
    }
    
  • All errors must be handled. An analysis of production failures in distributed data-intensive systems (https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf) found that the majority of catastrophic failures could have been prevented by simple testing of error handling code.

“Specifically, we found that almost all (92%) of the catastrophic system failures are the result of incorrect handling of non-fatal errors explicitly signaled in software.”

  • Always motivate, always say why. Never forget to say why. Because if you explain the rationale for a decision, it not only increases the hearer's understanding, and makes them more likely to adhere or comply, but it also shares criteria with them with which to evaluate the decision and its importance.

  • Explicitly pass options to library functions at the call site, instead of relying on the defaults. For example, write @prefetch(a, .{ .cache = .data, .rw = .read, .locality = 3 }); over @prefetch(a, .{});. This improves readability but most of all avoids latent, potentially catastrophic bugs in case the library ever changes its defaults.

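The advice above on splitting compound conditions can be illustrated with a small sketch. The `Decision` enum and `decide` function are hypothetical, invented only to show the shape of the branches.

```zig
const std = @import("std");

const Decision = enum { accept, reject_closed, reject_insufficient };

/// Instead of `if (account_open and amount <= balance) ... else ...`,
/// every combination gets its own explicit branch.
fn decide(account_open: bool, balance: u64, amount: u64) Decision {
    if (account_open) {
        if (amount <= balance) {
            return .accept;
        } else {
            return .reject_insufficient;
        }
    } else {
        return .reject_closed;
    }
}
```

With the compound form, the two rejection reasons would collapse into a single `else`. The nested form makes it visible that each case was considered.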

Performance

“The lack of back-of-the-envelope performance sketches is the root of all evil.” — Rivacindela Hudsoni

  • Think about performance from the outset, from the beginning. The best time to solve performance, to get the huge 1000x wins, is in the design phase, which is precisely when we can't measure or profile. It's also typically harder to fix a system after implementation and profiling, and the gains are less. So you have to have mechanical sympathy. Like a carpenter, work with the grain.

  • Perform back-of-the-envelope sketches with respect to the four resources (network, disk, memory, CPU) and their two main characteristics (bandwidth, latency). Sketches are cheap. Use sketches to be “roughly right” and land within 90% of the global maximum.

  • Optimize for the slowest resources first (network, disk, memory, CPU) in that order, after compensating for the frequency of usage, because faster resources may be used many times more. For example, a memory cache miss may be as expensive as a disk fsync, if it happens many times more.

  • Distinguish between the control plane and data plane. A clear delineation between control plane and data plane through the use of batching enables a high level of assertion safety without losing performance. See our July 2021 talk on Zig SHOWTIME for examples.

  • Amortize network, disk, memory and CPU costs by batching accesses.

  • Let the CPU be a sprinter doing the 100m. Be predictable. Don't force the CPU to zig zag and change lanes. Give the CPU large enough chunks of work. This comes back to batching.

  • Be explicit. Minimize dependence on the compiler to do the right thing for you.

    In particular, extract hot loops into stand-alone functions with primitive arguments without self. That way, the compiler doesn't need to prove that it can cache a struct's fields in registers, and a human reader can spot redundant computations more easily.
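A minimal sketch of extracting a hot loop, assuming a hypothetical `Ledger` struct and `count_above` function that are not taken from TigerBeetle:

```zig
const std = @import("std");

const Ledger = struct {
    amounts: []const u64,
    threshold: u64,

    fn count_large(self: *const Ledger) u64 {
        // Read the fields once and hand primitives to the hot loop.
        return count_above(self.amounts, self.threshold);
    }
};

/// Stand-alone hot loop: no `self`, only a slice and an integer.
fn count_above(amounts: []const u64, threshold: u64) u64 {
    var count: u64 = 0;
    for (amounts) |amount| {
        if (amount > threshold) count += 1;
    }
    return count;
}
```

Inside `count_above` there is no pointer through which `threshold` could change mid-loop, so neither the compiler nor the reader has to reason about aliasing.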

Developer Experience

“There are only two hard things in Computer Science: cache invalidation, naming things, and off-by-one errors.” — Phil Karlton

Naming Things

  • Get the nouns and verbs just right. Great names are the essence of great code, they capture what a thing is or does, and provide a crisp, intuitive mental model. They show that you understand the domain. Take time to find the perfect name, to find nouns and verbs that work together, so that the whole is greater than the sum of its parts.

  • Use snake_case for function, variable, and file names. The underscore is the closest thing we have as programmers to a space, and helps to separate words and encourage descriptive names. We don't use Zig's CamelCase.zig style for "struct" files to keep the convention simple and consistent.

  • Do not abbreviate variable names, unless the variable is a primitive integer type used as an argument to a sort function or matrix calculation. Use long form arguments in scripts: --force, not -f. Single letter flags are for interactive usage.

  • Use proper capitalization for acronyms (VSRState, not VsrState).

  • For the rest, follow the Zig style guide.

  • Add units or qualifiers to variable names, and put the units or qualifiers last, sorted by descending significance, so that the variable starts with the most significant word, and ends with the least significant word. For example, latency_ms_max rather than max_latency_ms. This will then line up nicely when latency_ms_min is added, as well as group all variables that relate to latency.

  • Infuse names with meaning. For example, allocator: Allocator is a good, if boring name, but gpa: Allocator and arena: Allocator are excellent. They inform the reader whether deinit should be called explicitly.

  • When choosing related names, try hard to find names with the same number of characters so that related variables all line up in the source. For example, as arguments to a memcpy function, source and target are better than src and dest because they have the second-order effect that any related variables such as source_offset and target_offset will all line up in calculations and slices. This makes the code symmetrical, with clean blocks that are easier for the eye to parse and for the reader to check.

  • When a single function calls out to a helper function or callback, prefix the name of the helper function with the name of the calling function to show the call history. For example, read_sector() and read_sector_callback().

  • Callbacks go last in the list of parameters. This mirrors control flow: callbacks are also invoked last.

  • Order matters for readability (even if it doesn't affect semantics). On the first read, a file is read top-down, so put important things near the top. The main function goes first.

    The same goes for structs, the order is fields then types then methods:

    time: Time,
    process_id: ProcessID,
    
    
    const ProcessID = struct { cluster: u128, replica: u8 };
    const Tracer = @This(); // This alias concludes the types section.
    
    
    pub fn init(gpa: std.mem.Allocator, time: Time) !Tracer {
        ...
    }
    
    

    If a nested type is complex, make it a top-level struct.

    At the same time, not everything has a single right order. When in doubt, consider sorting alphabetically, taking advantage of big-endian naming.

  • Don't overload names with multiple meanings that are context-dependent. For example, TigerBeetle has a feature called pending transfers where a pending transfer can be subsequently posted or voided. At first, we called them two-phase commit transfers, but this overloaded the two-phase commit terminology that was used in our consensus protocol, causing confusion.

  • Think of how names will be used outside the code, in documentation or communication. For example, a noun is often a better descriptor than an adjective or present participle, because a noun can be directly used in correspondence without having to be rephrased. Compare replica.pipeline vs replica.preparing. The former can be used directly as a section header in a document or conversation, whereas the latter must be clarified. Noun names compose more clearly for derived identifiers, e.g. config.pipeline_max.

  • Zig has named arguments through the options: struct pattern. Use it when arguments can be mixed up. A function taking two u64 must use an options struct. If an argument can be null, it should be named so that the meaning of null literal at the call site is clear.

    Because dependencies like an allocator or a tracer are singletons with unique types, they should be threaded through constructors positionally, from the most general to the most specific.

  • Write descriptive commit messages that inform and delight the reader, because your commit messages are being read.

  • Don't forget to say why. Code alone is not documentation. Use comments to explain why you wrote the code the way you did. Show your workings.

  • Don't forget to say how. For example, when writing a test, think of writing a description at the top to explain the goal and methodology of the test, to help your reader get up to speed, or to skip over sections, without forcing them to dive in.

  • Comments are sentences, with a space after the slash, with a capital letter and a full stop, or a colon if they relate to something that follows. Comments are well-written prose describing the code, not just scribblings in the margin. Comments after the end of a line can be phrases, with no punctuation.

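The options struct pattern and the units-last naming rule from the list above can be combined in one sketch. `latency_valid` is a hypothetical function written for this illustration.

```zig
const std = @import("std");
const assert = std.debug.assert;

/// The two bounds are both `u64` and easy to swap, so they are named at the
/// call site through an options struct. Units and qualifiers come last:
/// `latency_ms_min` and `latency_ms_max` sort together and line up.
fn latency_valid(
    latency_ms: u64,
    options: struct { latency_ms_min: u64, latency_ms_max: u64 },
) bool {
    assert(options.latency_ms_min <= options.latency_ms_max);
    if (latency_ms < options.latency_ms_min) return false;
    if (latency_ms > options.latency_ms_max) return false;
    return true;
}
```

A call then reads `latency_valid(20, .{ .latency_ms_min = 1, .latency_ms_max = 500 })`, which leaves no doubt about which bound is which.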

Cache Invalidation

  • Don't duplicate variables or take aliases to them. This will reduce the probability that state gets out of sync.

  • If you don't mean a function argument to be copied when passed by value, and if the argument type is more than 16 bytes, then pass the argument as *const. This will catch bugs where the caller makes an accidental copy on the stack before calling the function.

  • Construct larger structs in-place by passing an out pointer during initialization.

    In-place initializations can assume pointer stability and immovable types while eliminating intermediate copy-move allocations, which can lead to undesirable stack growth.

    Keep in mind that in-place initializations are viral — if any field is initialized in-place, the entire container struct should be initialized in-place as well.

    Prefer:

    fn init(target: *LargeStruct) !void {
      target.* = .{
        // in-place initialization.
      };
    }
    
    
    fn main() !void {
      var target: LargeStruct = undefined;
      try target.init();
    }
    

    Over:

    fn init() !LargeStruct {
      return LargeStruct {
        // moving the initialized object.
      };
    }
    
    
    fn main() !void {
      var target = try LargeStruct.init();
    }
    
  • Shrink the scope to minimize the number of variables at play and reduce the probability that the wrong variable is used.

  • Calculate or check variables close to where/when they are used. Don't introduce variables before they are needed. Don't leave them around where they are not. This will reduce the probability of a POCPOU (place-of-check to place-of-use), a distant cousin to the infamous TOCTOU. Most bugs come down to a semantic gap, caused by a gap in time or space, because it's harder to check code that's not contained along those dimensions.

  • Use simpler function signatures and return types to reduce dimensionality at the call site, the number of branches that need to be handled at the call site, because this dimensionality can also be viral, propagating through the call chain. For example, as a return type, void trumps bool, bool trumps u64, u64 trumps ?u64, and ?u64 trumps !u64.

  • Ensure that functions run to completion without suspending, so that precondition assertions are true throughout the lifetime of the function. These assertions are useful documentation without a suspend, but may be misleading otherwise.

  • Be on your guard for buffer bleeds. This is a buffer underflow, the opposite of a buffer overflow, where a buffer is not fully utilized, with padding not zeroed correctly. This may not only leak sensitive information, but may cause deterministic guarantees as required by TigerBeetle to be violated.

  • Use newlines to group resource allocation and deallocation, i.e. before the resource allocation and after the corresponding defer statement, to make leaks easier to spot.

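The newline rule for resource allocation in the list above might look as follows. `sum_copied` is a hypothetical function for illustration. Under the static allocation rule from the Safety section, an allocation like this would belong to initialization rather than to steady-state code.

```zig
const std = @import("std");
const assert = std.debug.assert;

/// Copies `source` into a temporary buffer and returns the sum of its bytes.
fn sum_copied(gpa: std.mem.Allocator, source: []const u8) !u64 {
    assert(source.len > 0);

    const target = try gpa.alloc(u8, source.len);
    defer gpa.free(target);

    var sum: u64 = 0;
    var index: usize = 0;
    while (index < source.len) : (index += 1) {
        target[index] = source[index];
        sum += target[index];
    }
    return sum;
}
```

The blank line before `alloc` and after `defer` frames the pair as one unit, so an allocation without its matching `defer` stands out when scanning the function.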

Off-By-One Errors

  • The usual suspects for off-by-one errors are casual interactions between an index, a count or a size. These are all primitive integer types, but should be seen as distinct types, with clear rules to cast between them. To go from an index to a count you need to add one, since indexes are 0-based but counts are 1-based. To go from a count to a size you need to multiply by the unit. Again, this is why including units and qualifiers in variable names is important.

  • Show your intent with respect to division. For example, use @divExact(), @divFloor() or div_ceil() to show the reader you've thought through all the interesting scenarios where rounding may be involved.

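The conversions between index, count and size, and the explicit division builtins, can be sketched as below. `sector_size` and the helper functions are hypothetical. In particular, this `div_ceil` is a stand-in written for the example and is not claimed to match any existing implementation.

```zig
const std = @import("std");
const assert = std.debug.assert;

// Hypothetical unit, in bytes.
const sector_size: u64 = 512;

/// Index to count: add one, since indexes are 0-based and counts are 1-based.
fn count_from_index_last(index_last: u64) u64 {
    return index_last + 1;
}

/// Count to size: multiply by the unit.
fn size_from_count(sector_count: u64) u64 {
    return sector_count * sector_size;
}

/// Division that rounds up, for example sectors needed to hold `size` bytes.
fn div_ceil(numerator: u64, denominator: u64) u64 {
    assert(denominator > 0);
    const quotient = @divFloor(numerator, denominator);
    const remainder = numerator % denominator;
    if (remainder > 0) return quotient + 1;
    return quotient;
}
```

Where a size is known to be a whole number of sectors, `@divExact(size, sector_size)` states that expectation directly, whereas `div_ceil` tells the reader that a partial sector is expected and handled.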

Style By The Numbers

  • Run zig fmt.

  • Use 4 spaces of indentation, rather than 2 spaces, as that is more obvious to the eye at a distance.

  • Hard limit all line lengths, without exception, to at most 100 columns for a good typographic "measure". Use it up. Never go beyond. Nothing should be hidden by a horizontal scrollbar. Let your editor help you by setting a column ruler. To wrap a function signature, call or data structure, add a trailing comma, close your eyes and let zig fmt do the rest.

  • Add braces to the if statement unless it fits on a single line, for consistency and defense in depth against "goto fail;" bugs.

Dependencies

TigerBeetle has a “zero dependencies” policy, apart from the Zig toolchain. Dependencies, in general, inevitably lead to supply chain attacks, safety and performance risk, and slow install times. For foundational infrastructure in particular, the cost of any dependency is further amplified throughout the rest of the stack.

Tooling

Similarly, tools have costs. A small standardized toolbox is simpler to operate than an array of specialized instruments each with a dedicated manual. Our primary tool is Zig. It may not be the best for everything, but it's good enough for most things. We invest into our Zig tooling to ensure that we can tackle new problems quickly, with a minimum of accidental complexity in our local development environment.

“The right tool for the job is often the tool you are already using—adding new tools has a higher cost than many people appreciate” — John Carmack

For example, the next time you write a script, instead of scripts/*.sh, write scripts/*.zig.

This not only makes your script cross-platform and portable, but introduces type safety and increases the probability that running your script will succeed for everyone on the team, instead of hitting a Bash/Shell/OS-specific issue.

Standardizing on Zig for tooling is important to ensure that we reduce dimensionality, as the team, and therefore the range of personal tastes, grows. This may be slower for you in the short term, but makes for more velocity for the team in the long term.

The Last Stage

At the end of the day, keep trying things out, have fun, and remember—it's called TigerBeetle, not only because it's fast, but because it's small!

“You don’t really suppose, do you, that all your adventures and escapes were managed by mere luck, just for your sole benefit? You are a very fine person, Mr. Baggins, and I am very fond of you; but you are only quite a little fellow in a wide world after all!”

“Thank goodness!” said Bilbo laughing, and handed him the tobacco-jar.
