FILTER function
Values
The values to filter.
Inclusions
Either an array or a formula fragment. If this parameter is an array, it should be a logical array with the same size as the array of values to filter. The returned array only contains values where the corresponding elements in this array are TRUE.
If this parameter is a formula fragment, it is run once per array
element and is expected to return TRUE only if the corresponding array
element should be included in the returned array. To do its work, it
has access to the three named values Element
, the array
element under consideration, Index
, the position of the
array element (1 for the first element, 2 for the second element, etc)
and finally Source
, which is a reference to the array to
filter, given as the first parameter to FILTER.
FallbackValues
The values to return if no values matched the filter. If there are no fallback values, an empty array is returned.
Returns
A filtered version of the given values. If no values match the filter,
an empty array is returned, unless the fallbackValues
parameter is provided, in which case those values are returned instead.
Filters an array using a logical array. FILTER({ 1, 2, 3 }, { FALSE, TRUE, TRUE })FILTER({ 1; 2; 3 }; { FALSE; TRUE; TRUE }) returns { 2, 3 }{ 2; 3 }. 1 is not part of the returned array, as the corresponding element in the second array is FALSE. 2 and 3 are both part of the returned array, as the corresponding elements in the second array are both TRUE.
In other words, the returned array contains an element found in the first array only if the corresponding element in the second array is TRUE. The two arrays must be equal in size.
Filtering based on logical operations
The second array is often not provided explicitly. Rather, an operation is typically used that returns a logical array.
This formula returns fields whose values are greater than 5:
(The Field1:Field3Field1:Field3 range is a short-hand way of expressing an array containing Field1 and Field3, as well as any fields that appear between them, such as Field2. If only Field2 appears between the other two fields, Field1:Field3Field1:Field3 and { Field1, Field2, Field3 }{ Field1; Field2; Field3 } are equivalent.)
Filtering students based on their grades
The second array does not need to reference the same array as the first array. Consider this formula:
The first array of the formula above contains student names and the second array contains their test scores. The formula returns an array of names of the students who scored more than 85: { "Sally", "Luke" }{ "Sally"; "Luke" }.
Counting the filtered elements
Use the SIZE function to determine the number of elements returned by FILTER:
The formula above returns the number of fields, in the Field1:Field100Field1:Field100 range, whose values are greater than 5.
Combining multiple conditions
Use the logical operators || to express "logical or" (disjunction) and && to express "logical and" (conjunction). This formula returns those fields whose values are greater than 5 or less than 2:
This formula returns only those fields whose values are both greater than 5 and are odd numbers:
Traditionally, spreadsheets use the + operator to express "or" and the * operator to express "and." Calcapp supports these operators too. (The functions OR and AND cannot be used in this context, as they return a single logical value, not a logical array.)
Cutting down on repetition with LET
FILTER often requires an array of elements to be given twice: as the first parameter, which provides the array to filter, and as the second parameter, which provides the logical array determining which elements to return.
In the examples above, the range Field1:Field3Field1:Field3 is used for demonstration purposes. However, formulas often need to devote much more space to specifying the array elements, which makes the repetition more of an issue.
Formulas with long, repeated sections are often harder to read and run more slowly than those without repetition. Moreover, when parts of a formula are repeated, errors can creep in when the formula is updated if only some of those repeated parts are updated.
The LET function solves this issue, by allowing values used in a single formula to be assigned names. Consider this formula, where the fields to filter are repeated twice:
With LET, := is used to assign the name
Fields
to the array of fields, which makes the actual FILTER
invocation much simpler:
Had the definition of Fields
been more verbose, the benefits
brought by LET would have been even greater.
Filtering all fields of a form group, form screen or the entire app
Ranges, such as Field1:Field10Field1:Field10, are a concise way to build the array of elements to pass to FILTER.
However, there is an alternative if you want to refer to all fields of a form
group, a form screen or even the entire app: the Fields
property, which is provided by form groups, form screens
and the app
itself.
This formula returns all visible fields of the app:
There are also properties that return only some fields, such as NumberFields of form screens and DateTimeFields of form groups.
By using one of these properties, you ensure that all fields of a certain form group, form screen or the entire app are referenced. Ranges, on the other hand, may need to be updated when new fields are added, so that the new fields are not erroneously left out of calculations.
Using a formula fragment to filter elements
Instead of passing a logical array as the second parameter, you can also use a formula fragment which is expected to return TRUE if the element should be included and FALSE otherwise. These formulas are equivalent:
The formula fragment you supply as the second parameter is run once per
element. The element can be accessed as the Element
value.
Moreover, the position of the element (1 for the first element, 2 for the
second element, etc) can be accessed as the Index
value and the
source array — given as the first parameter to FILTER — can be accessed as
the Source
value.
Use this formula to only return the last three elements of an array:
Using a formula fragment is useful if you need to compare array elements only against themselves. It has the benefit of saving you from having to repeat the source array in the formula. That particular benefit can also be had using the LET function (see above). Use whatever style suits you best.
Using a formula fragment to filter an array is a feature specific to Calcapp and is not found in spreadsheets.
Ranges and FILTER with formula fragments: a catch
There is one catch to using a formula fragment with FILTER. Consider the following formulas:
Both formulas perform identically if the range Field1:Field10Field1:Field10 only includes fields. If a text box appears between Field1 and Field10, however, the second version using a formula fragment generates an error. The reason is that the text box is included in the Field1:Field10Field1:Field10 range, and a text box cannot be "greater than 10." (TextBox1 > 10TextBox1 > 10 would generate an identical error.)
The Field1:Field10 > 10Field1:Field10 > 10 construct works around the problem by assuming that the text box is a blank value in this context. Of course, a blank value is considered to be equivalent to zero by >, meaning that the logical array returned by the Field1:Field10 > 10Field1:Field10 > 10 will actually be TRUE for elements that correspond to text boxes.
Another way of working around this issue is to reference fields using a property such as Fields of a form screen, instead of a range. As those properties only include fields, this issue is not relevant.
Using FILTER instead of the *IF functions
When combined with other functions, FILTER can be used as a replacement for functions like AVERAGEIF, COUNTIF and SUMIF. These functions all boil down to filtering an array before applying an additional operation.
SUMIF filters an array based on a condition and then adds the resulting numbers together, returning the result. These formulas are equivalent and return 1100, because only 100 and 1000 are greater than 25:
COUNTIF filters an array based on a condition and then returns the number of matching elements. These formulas are equivalent and return 2, because exactly two array elements are greater than 25:
The FILTER invocation in the second formula above returns { 100, 1000 }{ 100; 1000 }. The SIZE invocation then returns the size of that array, 2.
AVERAGEIF filters an array based on a condition and then returns the average of the elements matching the condition. These formulas are equivalent and return 550, because that is the average value between 100 and 1000:
When given a third parameter, AVERAGEIF, COUNTIF and SUMIF apply the condition to the first array, while applying the operation (averaging, counting and summing) to the first array.
These equivalent formulas add together 40 and 50, but not 30, as these numbers are associated with the text string "Sally":
Using FILTER instead of the *IFS functions
AVERAGEIF, COUNTIF and SUMIF also come in versions designed to be used with multiple conditions: AVERAGEIFS, COUNTIFS and SUMIFS. There are also two *IFS functions with no *IF counterpart: MINIFS and MAXIFS, which return the smallest and largest number, respectively, from an array, provided that a number of conditions are met.
Consider this AVERAGEIFS formula:
The formula above returns an average of the test scores obtained by female students who reside in Arkansas (abbreviated "AK"). Here, only one test score matches (82) and is returned, because while two female students appear in the data, only one of the students resides in Arkansas.
Here is the equivalent formula using FILTER:
The FILTER version is far more versatile. AVERAGEIFS can only be used with "logical and" and not "logical or" (or any other combination of logical operations). What if we want the average of all test scores associated either with female students or students residing in Arkansas? AVERAGEIFS does not provide this feature, but using FILTER, it's as easy as turning && into ||:
FILTER, MAP and REDUCE
FILTER, MAP and REDUCE are commonly used together, as a data processing pipeline to transform data. FILTER is used to remove irrelevant elements from an array, MAP is used to transform the remaining elements and REDUCE, finally, is used to transform the resulting array to a single value.
Instead of REDUCE, a simpler function can be used which reduces an array of values to a single value. The most popular such functions are SUM (which adds all array elements together and returns the result) and AVERAGE (which returns an average of all the array elements).
A single formula can use multiple invocations of MAP and FILTER. For instance, an innermost FILTER invocation can filter the raw array once, and then hand this data to MAP, which transforms the filtered data. This data can then, once more, be given to FILTER, which filters out additional elements, and so on.
Consider this formula:
The formula above filters the text array { "$326.60", "€402.80", "$290.00", "$128", "3002 SEK" }{ "$326.60"; "€402.80"; "$290.00"; "$128"; "3002 SEK" }, which lists amounts in various currencies, leaving only dollar amounts. It then converts these amounts to numbers, keeps only those amounts exceeding $200 and finally returns the sum of these amounts.
Working our way outwards from the text array, FILTER is applied to it and uses the formula fragment STARTSWITH(Element, "$")STARTSWITH(Element; "$") to only include array elements which start with a dollar sign. That leaves { "$326.60", "$290.00", "$128" }{ "$326.60"; "$290.00"; "$128" }.
MAP is applied to this array, with the formula fragment PARSENUMBER(Element)PARSENUMBER(Element), which converts the text array with textual amounts to a number array holding the same amounts: { 326.6, 290, 128 }{ 326,6; 290; 128 }.
FILTER is then applied to this array using this formula fragment, Element > 200Element > 200, which filters out all elements which are not greater than 200. That leaves the array { 326.6, 290 }{ 326,6; 290 }. Finally, SUM is applied to this array, returning the grand total 616.60.
Examples
Returns { 2, 3 }{ 2; 3 }. 1 is not part of the returned array, as the corresponding element in the second array is FALSE. 2 and 3 are both part of the returned array, as the corresponding elements in the second array are both TRUE.
Returns an array containing the elements which are greater than one, { 2, 3 }{ 2; 3 }. { 1, 2, 3 } > 1{ 1; 2; 3 } > 1 returns the logical array { FALSE, TRUE, TRUE }{ FALSE; TRUE; TRUE }.
Returns an array containing the elements which are greater than one,
{ 2,
3 }{ 2;
3 }. This variant uses the LET function to cut down on repetition, by
assigning the name Array
to the elements to filter. By doing
so, those array elements do not have to be repeated in the formula.
Returns an array containing the elements which are greater than one, { 2, 3 }{ 2; 3 }. This variant uses a formula fragment, which is run once per array element to determine whether the element should be part of the result. The formula fragment is expected to return TRUE if that is the case.
Returns the number of array elements which are greater than one. As only 2 and 3 are greater than 1, 2 is returned. This variant uses a formula fragment, which is run once per array element to determine whether the element should be part of the result. The formula fragment is expected to return TRUE if that is the case.
Returns an array containing the elements which are greater than one,
{ 2,
3 }{ 2;
3 }. This variant uses a formula fragment, which is run once
per array element to determine whether the element should be part of the
result. The formula fragment is expected to return TRUE if that is the
case. This fragment renames the Element
value, which
references the array element under consideration, to V. For longer
formula fragments, renaming a value can make the resulting formula
fragment much smaller. Alternatively, renaming a value can be used to
give it a name that more accurately describes the value.
Returns { 30 }{ 30 }, as only the last array element has an position greater than or equal to 3. (The first element has a position of 1, the second a position of 2, etc.)
Returns { 30 }{ 30 }, as only the last
array element has an position greater than or equal to 3. (The first
element has a position of 1, the second a position of 2, etc.) Here, both
the Element
and Index
values are renamed to
V
and I
, respectively. Even if the
Element
value is not needed, it still needs to be renamed,
as it appears before the Index
value.
Returns array elements which are evenly divisible by 10, meaning that { 10, 20 }{ 10; 20 } is returned.
Returns all array elements which are evenly divisible by Field1.ValueField1,Value, provided by the user.
Returns array elements which are evenly divisible by 50. As there are no such array elements, an empty array is returned.
Returns array elements which are evenly divisible by 50. As there are no such array elements, the third parameter is returned, { -50 }{ -50 }.
Returns { "Sally", "Luke" }{ "Sally"; "Luke" }, which are the names of the students who scored higher than 85, provided that the first array contains student names and the second array contains their scores.
Returns the background colors of the fields whose values are greater than 3 as an array.
Returns the background colors of the fields whose values are greater than 3. Field1:Field100Field1:Field100 is short-hand for an array which includes Field1, Field100 and all fields which appear between them.
Returns the background colors of all number fields of the app whose values are greater than 3. The NumberFields property of the app object returns all number fields of the app.
Returns a text string containing a comma-separated list of the labels of fields whose values are greater than 3. Field1:Field5Field1:Field5 is short-hand for an array which includes Field1, Field5 and all items which appear between them. TEXTJOIN takes the array returned from FILTER and joins the text strings together, separating them with a comma and a space. The second parameter, FALSE, ensures that blank values are not included.
Returns the values of every fifth field, among those listed. Field1:Field100Field1:Field100 is short-hand for an array which includes Field1, Field100 and all fields which appear between them.
Returns { 8, 16 }{ 8; 16 }, which includes only those elements of the { 8, 8, 16, 4, 2, 1 }{ 8; 8; 16; 4; 2; 1 } array which are greater than 7. UNIQUE ensures that there are no duplicate values, and SORT ensures that the returned array is sorted.