2 Application in R language

2.1 A simple case

Consider the following R class implementation, which provides some basic mathematical operations.
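
As a point of reference, here is a minimal sketch of what such a class could look like. The class name `MathOps`, the function names `add`, `multiply`, and `divide`, and the second argument `y` are assumptions made for illustration, not necessarily the original listing.

```r
# Hypothetical environment-based class; all names are illustrative assumptions.
MathOps <- function() {
  self <- environment()
  class(self) <- append("MathOps", class(self))

  # Each operation simply delegates to the matching R operator.
  add      <- function(x, y) x + y
  multiply <- function(x, y) x * y
  divide   <- function(x, y) x / y

  self
}

m <- MathOps()
m$add(1L, 2L)       # 3
m$multiply(2, 3.5)  # 7
m$divide(1, 4)      # 0.25
```

Note that nothing in the function bodies constrains the type or length of the arguments.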

Let’s not argue about the design or relevance of the approach. Instead, can you tell the scope of each function, and identify and inventory the implementation's flaws?

2.2 Defensive programming

In standard R, the provided implementation might behave correctly, behave erroneously, or even raise errors, depending on the inputs you provide.

Is it bad code? Not at all, in my opinion. The class name clearly states the intent, which is to encapsulate some math operations. There are three operations. They accept any argument that the operators ‘+’, ‘*’, or ‘/’ accept, so providing integers, doubles, or complex numbers should work. If you use an external package such as gmp, its number types are also acceptable inputs for any of the parameters. Any combination of these types will provide a correct result, using scalars or vectors.
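
To make this concrete, the short sketch below feeds several base R number types to an addition function. The function `add` is a hypothetical stand-in for the class's addition operation; gmp is left out so that the example runs with base R only.

```r
# Hypothetical stand-in for the class's addition function.
add <- function(x, y) x + y

add(1L, 2L)          # integer scalars: 3
add(1.5, 2L)         # double and integer: 3.5
add(1 + 2i, 3)       # complex and double: 4+2i
add(c(1, 2, 3), 10)  # vector and scalar: 11 12 13
```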

From my point of view, the main issues are the following:

| Issue number | Issue description | Issue severity |
|---|---|---|
| 1 | A few seconds to write, several quarters of an hour to test, and hours to document. | UNACCEPTABLE |
| 2 | Does it comply with mathematical sets? Not at all: this is a software engineering implementation, not a mathematically compliant one. | SEVERE |
| 3 | High sensitivity to input values: did you consider that NaN, NA, Inf, -Inf, and 0 could be valid input values here? R naturally handles this part well. | LOW |
| 4 | Natural polymorphism of returned types, which again brings in software engineering where reliable math operations are needed. From a mathematical point of view, inputs belong to a predefined mathematical set, and outputs also belong to a predefined mathematical set. That is not the case with the provided implementation. | HIGH |
| 5 | Unreliable implementation, as an input might produce a numeric output, a warning, or an error. | HIGH |
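
Issues 3 to 5 can be observed directly in base R. The sketch below uses hypothetical stand-ins `add` and `divide` for the class's functions; the behaviors shown are those of the underlying R operators.

```r
# Hypothetical stand-ins for the class's functions.
add    <- function(x, y) x + y
divide <- function(x, y) x / y

# Issue 3: special values flow through the operators without complaint.
divide(1, 0)    # Inf
divide(0, 0)    # NaN
add(NA, 1)      # NA
add(Inf, -Inf)  # NaN

# Issue 4: the returned type depends on the inputs.
typeof(add(1L, 2L))     # "integer"
typeof(divide(1L, 2L))  # "double"
typeof(add(1L, 1i))     # "complex"

# Issue 5: the same function may also fail outright.
outcome <- tryCatch(add("1", 2), error = function(e) "error")
outcome  # "error"
```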

2.3 Offensive programming

Consider the same R class implementation with a little instrumentation.
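
A minimal sketch of such an instrumented class might look as follows. The class name, the function names, the second argument `y_r`, and the column names `function_name` and `return_value` are assumptions for illustration; the text only states that arguments follow a naming pattern and that `function_return_types` holds a data.table of expected return types.

```r
library(data.table)

# Hypothetical instrumented class; names and column layout are assumptions.
MathOps <- function() {
  self <- environment()
  class(self) <- append("MathOps", class(self))

  # Expected return type of each function (assumed here to reuse the
  # same semantic naming convention as the arguments).
  function_return_types <- data.table(
    function_name = c("add", "multiply", "divide"),
    return_value  = c("x_r", "x_r", "x_r")
  )

  # Function bodies are unchanged; only the argument names differ.
  add      <- function(x_r, y_r) x_r + y_r
  multiply <- function(x_r, y_r) x_r * y_r
  divide   <- function(x_r, y_r) x_r / y_r

  self
}

m <- MathOps()
m$function_return_types
m$add(1.5, 2.5)  # 4
```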

2.3.1 What are the differences?

Compared to the previously shown implementation, there are two main differences:

  1. arguments are renamed according to a pattern
  2. a variable named function_return_types has been added. It holds a data.table that defines expected function return types.

That’s it. The function implementations are exactly the same, and nothing else has changed. Everything is there and should be sufficient to solve many of the issues listed earlier.

2.3.2 Semantic argument naming

Arguments have been renamed from x to x_r. What does that mean? Syntactically, it changes nothing for R. For us humans, it changes a lot, because the name follows a pattern that lets you specify several intents in a short, concise, and reliable way.

The pattern is simple to understand. Refer to 5 to learn more about the syntax and to discover many illustrative examples.

2.4 Back to definition

So now you know that the variable x_r is just a vector of real values, unconstrained in length. Using this parameter name implies that the developer is responsible for testing cases of various lengths and has to prevent weirdness from propagating.
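
As a rough illustration of what testing cases of various lengths can reveal, the sketch below probes a hypothetical `add` function with zero-length, scalar, and missing-value inputs. All of these calls run silently in base R.

```r
# Hypothetical stand-in for the class's addition function.
add <- function(x_r, y_r) x_r + y_r

length(add(numeric(0), 1))  # 0: a zero-length input yields a zero-length output
length(add(1, c(1, 2, 3)))  # 3: the scalar is recycled over the vector
add(c(1, NA), 1)            # 2 NA: missing values propagate to the result
```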

For example, the following R code shows results that require decisions.
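
A minimal sketch of such a case, assuming two real vectors of lengths 3 and 2 (the values are illustrative):

```r
x_r <- c(1, 2, 3)
y_r <- c(10, 20)

# Lengths 3 and 2: R recycles y_r and warns that the longer length
# is not a multiple of the shorter one.
result <- x_r + y_r
result  # 11 22 13

# Lengths 4 and 2: recycling happens silently, with no warning at all.
c(1, 2, 3, 4) + y_r  # 11 22 13 24
```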

This code produces both an output and a warning, because R recycles the shorter vector when operand lengths differ. What decision should be taken? Allow or deny this behavior? It depends on your usage.

If you are creating a real math library, I would recommend duplicating the code to create two functions named addRCompliant and addMathCompliant. The latter should enforce argument length control in its body, while the former should keep the body as is or wrap it in a suppressWarnings call. That way, you should easily meet your end users' expectations, whether they are mathematicians or software engineers.
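
The two variants could be sketched as follows. The strict equal-length rule in `addMathCompliant` is only one possible policy, chosen here as an assumption; a real library might, for instance, also allow scalars.

```r
# R-compliant variant: keeps R's recycling semantics and silences the warning.
addRCompliant <- function(x_r, y_r) suppressWarnings(x_r + y_r)

# Math-compliant variant: refuses operands whose lengths differ.
# Strict equality of lengths is an assumed policy, not a rule from the text.
addMathCompliant <- function(x_r, y_r) {
  if (length(x_r) != length(y_r)) {
    stop("argument lengths differ: ", length(x_r), " versus ", length(y_r))
  }
  x_r + y_r
}

addRCompliant(c(1, 2, 3), c(10, 20))         # 11 22 13, no warning
addMathCompliant(c(1, 2, 3), c(10, 20, 30))  # 11 22 33

# Mismatched lengths are rejected instead of being recycled.
rejected <- tryCatch(
  addMathCompliant(c(1, 2, 3), c(10, 20)),
  error = function(e) conditionMessage(e)
)
rejected  # "argument lengths differ: 3 versus 2"
```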

Note that in the latter case, the added controls are not defensive programming but functional scope verification.