04. Formulas¶
Structure¶
Formulas have the following attributes:
- Name: A unique human-readable name
- Description: A description of what the formula does
- Variables Accepted: This is the list of variables needed for a formula to work. Variables have the following attributes:
- Name: A unique name (within the formula)
- Description: A description of how the variable fits into the formula
- Type: The type of data accepted.
- Value: The value that this variable is set to.
- Data Editable?: Whether data in cells generated by this formula is editable.
- Data Type: The data type of the column that applying this formula will result in.
Finding Formulas¶
In order to make finding the right formula easier, we should categorize formulas by the type of variable they accept.
- For example, if the user is trying to add a formula to a
DATE
column, we would only show them the formulas that accept theDATE
data type. - When the user starts by adding a formula column, we should provide an interface that helps users find the formula they want easier. We could provide an autocomplete interface based on formula name, categorize formulas based on type, and/or consider other options.
See below for a very rough representation of the idea.
Editing Formula Data¶
When a query is turned into a View, we will allow users to edit data from the View in some cases. Each formula that is editable will define its own editing behavior, but here are some general guidelines:
- Formulas that transform the original data don’t result in editable data.
- e.g. if we apply a “Extract Year” formula to a
DATE
column to transform, say,2020-01-03
into2020
, then the2020
will not be editable. - e.g. if we apply a “Count” formula to count the number of genres a movie has, we will not be able to edit the count.
- e.g. if we apply a “Extract Year” formula to a
- Formulas that directly represent the original data may be editable.
- e.g. if we apply a “List” formula to list the genres related to a given movie, we can add and remove genres to that list, which will create/delete records in the appropriate underlying tables.
- in the future, we may also support editing the genre names themselves.
Variable Types¶
To reduce repetition in the formula definitions, variable types are defined here and only the type name is referenced in the lists of formulas.
Single Record Column Reference¶
A column which has only one record related to the query’s reference point. This could be a column in the query, a column from the query’s base table, or a column that the base table has an FK relationship to (up to three levels deep).
If there are multiple ways to get to the column (i.e. there are multiple relationships between the query’s base table and the column), then the column reference should also include the relationship to use.
Multiple Record Column Reference¶
A column which has multiple records related to the query’s reference point, This is generally a column that has a reverse FK relationship (up to three levels deep) to the query’s base table.
If there are multiple ways to get to the column (i.e. there are multiple relationships between the query’s base table and the column), then the column reference should also include the relationship to use.
Column references to multiple related records also accept input filters, in case the query only needs to use a subset of the related records.
Query Column Reference¶
A Single Record Column Reference that’s a column in the same query.
Filters¶
A set of filters that can be applied to a column.
Single Record Text-Like¶
Either a Single Record Column Reference or a literal string.
If using a column reference, it can be of any data type but the data in the column will be treated like text.
Multiple Record Text-Like¶
A Multiple Record Column Reference. The column can be of any data type but the data in the column will be treated like text.
Single Record Number-Like¶
Either a Single Record Column Reference or a literal number.
If using a column reference, the data type must be representable as a number (e.g. Number, Money, Duration UI types).
Multiple Record Number-Like¶
A Multiple Record Column Reference. The data type of the column must be representable as a number (e.g. Number, Money, Duration UI types).
Single Record Boolean-Like¶
Either a Single Record Column Reference of data type boolean or a literal boolean.
Single Record Date-Like¶
Either a Single Record Column Reference of data type date or a literal date.
Single Record Datetime-Like¶
Either a Single Record Column Reference of data type datetime or a literal datetime.
Single Record Time-Like¶
Either a Single Record Column Reference of data type time or a literal time.
Single Record Date or Time-Like¶
Either a Single Record Column Reference of data type date/datetime/time or a literal of one of the same types.
Single Record Duration-Like¶
Either a Single Record Column Reference of data type duration or a literal duration.
Single Record List-Like¶
Either a Single Record Column Reference of data type list or a literal list.
Integer¶
Either a Single Record Column Reference of data type integer or a literal integer.
Choice¶
A selection from a pre-determined list of choices.
Regular Expression¶
Literal text expressing a regular expression.
List of Formulas¶
This is a list of view formulas that we should support in the alpha release of Mathesar, categorized into types.
- Record Aggregations Formulas that aggregate a column across multiple records
- Random Generators Formulas that generate random data
- Text Formulas Formulas that work on text data
- Number Formulas Formulas that work on number data
- Boolean Formulas Formulas that work on boolean data
- Date. Time, and Duration Formulas Formulas that work on date, time, and duration data
- List Formulas Formulas that work on list data
- Cumulative Formulas Formulas building on previous rows in the query
- Regular Expression Formulas Formulas that use regular expressions
- Custom Formulas An interface to support more complicated formulas