Swift – New Diagnostic Architecture Overview


Diagnostics play a very important role in the experience of using a programming language. It’s vital for developer productivity that the compiler can produce proper guidance in any situation, especially when code is incomplete or invalid.

In this blog post we would like to share a couple of important updates on improvements to diagnostics being worked on for the upcoming Swift 5.2 release. This includes a new strategy for diagnosing failures in the compiler, originally introduced as part of the Swift 5.1 release, that yields some exciting new results and improved error messages.

Swift is a very expressive language with a rich type system that has many features like class inheritance, protocol conformances, generics, and overloading. Though we as programmers try our best to write well-formed programs, sometimes we need a little help. Luckily, the compiler knows exactly what Swift code is valid and invalid. The problem is how best to tell you what has gone wrong, where it happened, and how you can fix it.

Many parts of the compiler ensure the correctness of your program, but the focus of this work has been improving the type checker. The Swift type checker enforces rules about how types are used in source code, and it is responsible for letting you know when those rules are violated.

For example, the following code:

While this diagnostic points out a genuine error, it’s not helpful because it is neither specific nor actionable. This is because the old type checker used to guess the exact location of an error. This worked in many cases, but there were still numerous kinds of programming mistakes that it could not accurately identify. To address this, a new diagnostic infrastructure is in the works. Rather than guessing where an error occurs, the type checker attempts to “fix” problems right at the point where they are encountered, while remembering the fixes it has applied. This not only allows the type checker to pinpoint errors in more kinds of programs, it also allows it to surface more failures where previously it would simply stop after reporting the first error.

Since the new diagnostic infrastructure is tightly coupled with the type checker, we have to take a brief detour and talk about type inference. Note that this is a brief introduction; for more details please refer to the compiler’s documentation on the type checker.

Swift implements bi-directional type inference using a constraint-based type checker that is reminiscent of the classical Hindley-Milner type inference algorithm:

For diagnostics, the only interesting stages are Constraint Generation and Solving.

Given an input expression (and sometimes additional contextual information), the constraint solver generates a set of type variables and a set of constraints that relate them.

The most common type of constraint is a binary constraint, which relates two types.

Once constraint generation is complete, the solver attempts to assign concrete types to each of the type variables in the constraint system and form a solution that satisfies all of the constraints.
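The two stages described above can be sketched with a toy model. This is only an illustration under stated assumptions, not the compiler's implementation: types are plain strings, and `ToyType`, `ConstraintKind`, `BinaryConstraint`, the `conformances` table, and the variable names `T0` and `T1` are all hypothetical. It shows a set of binary constraints over type variables and a check that a proposed assignment of concrete types satisfies every constraint.

```swift
/// A type is either concrete (known ahead of time) or a variable still to be inferred.
enum ToyType: Hashable {
    case concrete(String)
    case variable(String)
}

enum ConstraintKind {
    case equal       // both sides must resolve to the same type
    case conformsTo  // the left side must conform to the protocol named on the right
}

/// A binary constraint relates two types.
struct BinaryConstraint {
    let left: ToyType
    let kind: ConstraintKind
    let right: ToyType
}

// Assumed conformance table standing in for real protocol lookup.
let conformances: [String: Set<String>] = [
    "Int": ["ExpressibleByIntegerLiteral"],
    "Double": ["ExpressibleByIntegerLiteral"],
    "String": ["ExpressibleByStringLiteral"],
]

/// Replaces a type variable with its binding, if one exists.
func resolve(_ type: ToyType, in bindings: [String: String]) -> String? {
    switch type {
    case .concrete(let name): return name
    case .variable(let name): return bindings[name]
    }
}

/// A set of bindings is a solution when every constraint holds under it.
func satisfies(_ bindings: [String: String], _ constraints: [BinaryConstraint]) -> Bool {
    return constraints.allSatisfy { constraint in
        guard let lhs = resolve(constraint.left, in: bindings),
              let rhs = resolve(constraint.right, in: bindings) else { return false }
        switch constraint.kind {
        case .equal: return lhs == rhs
        case .conformsTo: return conformances[lhs]?.contains(rhs) ?? false
        }
    }
}

let constraints = [
    BinaryConstraint(left: .variable("T0"), kind: .conformsTo,
                     right: .concrete("ExpressibleByIntegerLiteral")),
    BinaryConstraint(left: .variable("T1"), kind: .equal, right: .variable("T0")),
]

print(satisfies(["T0": "Int", "T1": "Int"], constraints))       // true
print(satisfies(["T0": "String", "T1": "String"], constraints)) // false
```

The second call fails because, in this toy table, `String` does not satisfy the conformance constraint on `T0`. A real solver searches for bindings instead of being handed them.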

Let’s consider the following example function:

For a human, it becomes apparent pretty quickly that there is a problem with the expression and where that problem is located, but the inference engine can only rely on a constraint simplification algorithm to determine what is wrong.

As we have established previously, the constraint solver starts by generating constraints (see the Constraint Generation stage) for the sub-elements of the input expression. Each distinct sub-element is represented either by:

After the Constraint Generation stage completes, the constraint system for the expression will have a combination of type variables and constraints. Let’s look at those now.

Note that all constraints and type variables are linked with particular locations in the input expression:

The inference algorithm attempts to find suitable types for all type variables in the constraint system and tests them against the associated constraints. In our example, one type variable could be bound to either of two types because both satisfy its protocol conformance requirement. However, simply enumerating all of the possible types for each of the “empty” type variables in the constraint system is very inefficient, since there could be many types to try when a particular type variable is under-constrained. For example, a type variable with no restrictions could potentially assume any type. To work around this problem, the constraint solver first tries disjunction choices, which allows it to narrow down the set of possible types for each type variable involved. For the unrestricted type variable, this brings the number of possible types down to only the result types associated with the overload choices of the operator instead of all possible types.
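The narrowing effect of trying disjunction choices first can be illustrated with a small sketch. The `Overload` type, the overload list, and both helper functions are assumptions made for this example; a real operator has many more overloads, and the real solver works on constraints, not strings.

```swift
/// One overload of a hypothetical binary operator.
struct Overload {
    let parameters: [String]
    let result: String
}

// Assumed overload set for the operator; the list is illustrative only.
let overloads = [
    Overload(parameters: ["Int", "Int"], result: "Int"),
    Overload(parameters: ["Double", "Double"], result: "Double"),
    Overload(parameters: ["String", "String"], result: "String"),
]

/// Candidate result types come from the disjunction's choices,
/// not from the set of every type known to the program.
func candidateResultTypes(for overloads: [Overload]) -> [String] {
    var seen = Set<String>()
    return overloads.map(\.result).filter { seen.insert($0).inserted }
}

/// Keeps only the overloads whose first parameter matches an already known argument type.
func viableOverloads(firstArgument: String, in overloads: [Overload]) -> [Overload] {
    return overloads.filter { $0.parameters.first == firstArgument }
}

print(candidateResultTypes(for: overloads))
// ["Int", "Double", "String"]
print(viableOverloads(firstArgument: "String", in: overloads).map(\.result))
// ["String"]
```

Once one argument type is known, the unrestricted result type variable has only a handful of candidates left to try.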

Now, it’s time to run the inference algorithm to determine types for the remaining type variables.

We can see that the error location is determined by the solver as it executes the inference algorithm. Since none of the possible types match for one of the type variables, that type variable should be considered an error location (because it cannot be bound to any type). Complex expressions could have more than one such location, because existing errors result in new ones as the inference algorithm progresses. To narrow down error locations in situations like that, the solver picks only the solutions with the smallest number of error locations.
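The selection rule in the last sentence can be sketched as follows. `CandidateSolution` and `bestSolutions` are hypothetical names, and error locations are modelled as plain strings for brevity.

```swift
/// A candidate solution together with the places where it had to give up.
struct CandidateSolution {
    let bindings: [String: String]
    let errorLocations: [String]
}

/// Keeps only the candidates with the fewest error locations.
func bestSolutions(_ candidates: [CandidateSolution]) -> [CandidateSolution] {
    guard let fewest = candidates.map({ $0.errorLocations.count }).min() else { return [] }
    return candidates.filter { $0.errorLocations.count == fewest }
}

let candidates = [
    CandidateSolution(bindings: ["T0": "String", "T1": "Int"],
                      errorLocations: ["argument #2"]),
    CandidateSolution(bindings: ["T0": "Int", "T1": "Int"],
                      errorLocations: ["argument #1", "result"]),
]

let best = bestSolutions(candidates)
print(best.map { $0.errorLocations }) // [["argument #2"]]
```

Several candidates can tie for the minimum, which is why the sketch returns an array.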

At this point it’s more or less clear how error locations are identified, but it’s not yet obvious how to help the solver make forward progress in such scenarios so it can derive a complete solution.

The new diagnostic infrastructure employs what we are going to call a constraint fix to try and resolve inconsistent situations where the solver gets stuck with no other types to attempt. The fix for our example is to ignore that the argument’s type doesn’t conform to the required protocol. The purpose of a fix is to capture all useful information about the error location from the solver and use that later for diagnostics. That is the main difference between the current and new approaches: the former tries to guess where the error is located, whereas the new approach has a symbiotic relationship with the solver, which provides all of the error locations to it.
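A rough sketch of this idea: when a conformance check fails, the solver records a fix describing the failure and carries on as if the constraint had been satisfied. `ConstraintFix`, `ToySolver`, the field names, and the string-based locations are all assumptions for illustration and do not mirror the compiler's actual data structures.

```swift
/// Information captured at the point where a constraint fails (field names are assumptions).
struct ConstraintFix {
    let kind: String      // what went wrong, e.g. a missing conformance
    let location: String  // which part of the expression the constraint came from
    let types: [String]   // the types involved in the failure
}

struct ToySolver {
    let conformances: [String: Set<String>]
    var fixes: [ConstraintFix] = []

    init(conformances: [String: Set<String>]) {
        self.conformances = conformances
    }

    /// Checks a conformance constraint. On failure it records a fix and lets
    /// solving continue as if the constraint had been satisfied.
    mutating func simplifyConformance(of type: String, to proto: String, at location: String) {
        let conforms = conformances[type]?.contains(proto) ?? false
        if !conforms {
            fixes.append(ConstraintFix(kind: "missing conformance",
                                       location: location,
                                       types: [type, proto]))
        }
    }
}

var solver = ToySolver(conformances: ["Int": ["ExpressibleByIntegerLiteral"]])
solver.simplifyConformance(of: "Int", to: "ExpressibleByIntegerLiteral", at: "argument #1")
solver.simplifyConformance(of: "String", to: "ExpressibleByIntegerLiteral", at: "argument #2")

for fix in solver.fixes {
    print("\(fix.kind) at \(fix.location): \(fix.types.joined(separator: " / "))")
}
// prints "missing conformance at argument #2: String / ExpressibleByIntegerLiteral"
```

Because the fix stores the location alongside the types, the later diagnostic step does not need to guess where the problem is.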

As we noted before, all of the type variables and constraints carry information about their relationship to the sub-expression they have originated from. Such a relation combined with type information makes it straightforward to provide tailored diagnostics and fix-its to all of the problems diagnosed via the new diagnostic framework.

In our example, it has been determined that one type variable is an error location, so the diagnostic can examine how that type variable is used in the input expression: it represents the argument at position #2 in the call to the operator, and it’s known that the problem is related to the fact that the argument’s type doesn’t conform to the required protocol. Based on all this information it’s possible to form either of the two following diagnostics:

with a note about the second argument not conforming to the protocol, or the simpler:

with the diagnostic referring to the second argument.

We picked the first alternative, which produces a diagnostic about the operator and a note for each partially matching overload choice. Let’s take a closer look at the inner workings of the described approach.

When a constraint failure is detected, a constraint fix is created that captures information about a failure:

The constraint solver accumulates these fixes. Once it arrives at a solution, it looks at the fixes that were part of a solution and produces actionable errors or warnings. Let’s take a look at how this all works together. Consider the following example:

The problem here is an argument that cannot be passed to an inout parameter without an explicit &.
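As a hypothetical illustration of this kind of mistake (the function and variable names below are assumptions, not necessarily the ones used in the original example), consider a function with an inout parameter. The invalid call is left commented out so that the snippet compiles.

```swift
// Hypothetical function taking an inout parameter.
func increment(_ value: inout Int) {
    value += 1
}

var counter = 0

// increment(counter)  // rejected by the compiler: the argument needs an explicit '&'
increment(&counter)    // accepted: '&' marks the argument as passed inout

print(counter) // 1
```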

Let’s now look at the type variables and constraints for this constraint system.

There are three type variables:

The three type variables have the following constraint:

The inference algorithm then tries to match the types related by this constraint, which results in the following new constraints:

The argument type cannot be converted into the inout parameter type, so the constraint solver records the failure as a missing & and ignores the constraint.

With that constraint ignored, the remainder of the constraint system can be solved. Then the type checker looks at the recorded fixes and emits an error that describes the problem (a missing &) along with a Fix-It to insert the &:

This example had a single type error in it, but this diagnostics architecture can also account for multiple distinct type errors in the code. Consider a slightly more complicated example:

While solving this constraint system, the type checker will again record a failure for the missing & on the first argument to the call. Additionally, it will record a failure for the missing argument label. Once both failures have been recorded, the remainder of the constraint system is solved. The type checker then produces errors (with Fix-Its) for the two problems that need to be addressed to fix this code:

Recording every specific failure and then continuing on to solve the remaining constraint system implies that addressing those failures will produce a well-typed solution. That allows the type checker to produce actionable diagnostics, often with fixes, that lead the developer toward correct code.
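The record-then-report flow described above can be sketched like this. The `Fix` cases, the `Diagnostic` type, the label `count`, and the message wording are all assumptions for illustration; the messages are not the compiler's actual diagnostic text.

```swift
/// Two kinds of recorded failure, modelled on the situations discussed above.
enum Fix {
    case missingAddressOf(argumentIndex: Int)
    case missingArgumentLabel(label: String, argumentIndex: Int)
}

struct Diagnostic: Equatable {
    let message: String
    let fixIt: String
}

/// Turns every recorded fix into an error with a suggested Fix-It.
func diagnose(_ fixes: [Fix]) -> [Diagnostic] {
    return fixes.map { fix -> Diagnostic in
        switch fix {
        case .missingAddressOf(let index):
            return Diagnostic(message: "argument #\(index) must be passed with '&'",
                              fixIt: "insert '&'")
        case .missingArgumentLabel(let label, let index):
            return Diagnostic(message: "argument #\(index) is missing label '\(label):'",
                              fixIt: "insert '\(label): '")
        }
    }
}

// Both failures were recorded while the rest of the system was still being solved.
let recorded: [Fix] = [
    .missingAddressOf(argumentIndex: 1),
    .missingArgumentLabel(label: "count", argumentIndex: 2),
]

let diagnostics = diagnose(recorded)
for diagnostic in diagnostics {
    print("error: \(diagnostic.message) (Fix-It: \(diagnostic.fixIt))")
}
```

Each fix maps to exactly one diagnostic, so applying every suggested Fix-It corresponds to the well-typed solution the solver found once the failing constraints were ignored.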

Consider the following invalid code:

Previously, this resulted in the following diagnostic:

This is now diagnosed as:

Consider the following invalid code:

Previously, this resulted in the following diagnostic:

This is now diagnosed as:

Consider the following invalid code:

Previously, this resulted in the following diagnostic:

This is now diagnosed as:

Consider the following invalid code:

Previously, this resulted in the following diagnostic:

This is now diagnosed as:

Consider the following invalid code:

Previously, this resulted in the following diagnostic:

This is now diagnosed as:

Consider the following invalid code:

Previously, this resulted in the following diagnostic:

This is now diagnosed as:

Consider the following invalid SwiftUI code:

Previously, this resulted in the following diagnostic:

This is now diagnosed as:

Consider the following invalid SwiftUI code:

Previously, this used to be diagnosed as a completely unrelated problem:

The new diagnostic now correctly points out that the referenced color does not exist:

Consider the following invalid SwiftUI code:

Previously, this resulted in the following diagnostic:

This is now diagnosed as:

The new diagnostic infrastructure is designed to overcome all of the shortcomings of the old approach. Its structure is intended to make it easy to improve or port existing diagnostics, and to let implementors of new features provide great diagnostics right off the bat. It shows very promising results with all of the diagnostics we have ported so far, and we are hard at work porting more every day.

Please feel free to post questions about this post on the associated thread on the Swift forums.

Source: Swift.org
