‘Data Explorer,’ Mathesar’s visual query builder, is described in this document. Users can analyze data, link data from other tables, and transform it using this feature.
The queries generated in the ‘Data Explorer’ will be listed and saved as queries within Mathesar. In contrast to standard database queries, “Mathesar queries” are not stored as SQL commands. This Mathesar-specific object was created because otherwise, it would be impossible to reproduce user configurations.
Read the Views Product Spec for a more in-depth explanation of this feature.
‘Data Explorer’ will be accessible in several ways because it is an essential step in numerous user flows.
There are many ways to access ‘Data Explorer.’ The navigation updates specification goes into detail about each one.
The base table determines which columns and links are available as input columns. When a user adds columns from linked tables, the base table also determines the automatic joins performed.
Tables can be linked to the base table using foreign keys that can be located in the base table or in other tables. The columns from all linked tables are shown in the column list and grouped according to the corresponding source table.
For each foreign key column, a collapsible list is automatically added to the list. Once expanded, a list of all the columns in the source table is displayed.
Although SQL has additional join options, the system will restrict joins to ‘LEFT JOIN’ type alone, in order to simplify things for less technical users and make it easier for them to learn the new concepts.
The user should set a base table at the start of the query building process. Changing it after the input columns have been selected will result in losing all progress.
A warning should be issued to users, so they know the implications of making this change.
The system will show all schema tables in the base table selector.
To choose a table, the user must first find it in the list and then click on it. They can also use the search box in the table selection menu to locate the table by filtering the list by table name.
After making a selection, the name of the base table is displayed in the selector, and previously disabled interface elements in ‘data explorer’ become interactive.
After selecting a base table, the user will be directed to the next stage, column selection.
The input table is built by selecting columns from the base table or generating new ones using formulas.
To add columns to a query configuration panel, the user must first select the
Add Column button in the
Columns portion of the query configuration panel. Due to this action, the
Column Selector component will be displayed in the inspector panel. The
Column Selector will display a list of all available columns and those already in use. In addition, a complete list of all formulas will be available.
The columns in
Column Selector are listed in a hierarchical structure based on the links that exist in both directions between the base table and other tables. The icons indicate whether the linked records are a single or multiple record collection.
Links are presented as nested sections containing the columns from the linked tables.
When adding input columns, the user can add them using the list controls or drop the columns directly into the result table.
Depending on the interaction, columns may be selected in a variety of ways:
|Single Selection||Click a list item|
|Drop Single Column||Click and drag list item to drop area|
|Multiple Selection||Click + Shift Key to select first and all items in between. For non-contiguous items, Click + Control Key.|
The system will default name columns to indicate their source table and column using the naming convention.
Depending on how tables are linked, some columns might contain multiple records associated with a record in the base table. Specific functionality to manage those records should be available to users.
|Filter||Show only a subset of the linked records|
|Aggregate||Display the multiple records as a combined record|
|Display Formats||Change the way each record item is displayed|
|Sort||Change the order in which the linked records are displayed|
|Limit||Limit the number of linked records|
When columns are added to the input table, their data type may differ from the source. A ‘Text Type’ column, for example, will change to a ‘Text List Type’ or ‘Number Type’ when added as a multiple record column due to automatic aggregation.
Once the input columns have been added, the user can inspect them by selecting each from the result table or the list of columns in the query configuration panel. The panel will include a section titled ‘Source’ in the details for each column. The source column, table, and links will all be listed in this panel portion.
The source will include
Links only for data sourced via tables connected to or from the base table.
Formulas are used to generate new columns based on different parameters. To access the formulas list, the user starts the
Add Column process and selects the option
From Formula at the top of the inspector panel. Selecting a formula will open a form that users can fill out to determine the values of the new column.
Depending on the selected formula, different settings will be available.
More on formulas and specific details for each will be covered in a separate issue. Columns generated from formulas will display a formula icon indicator in the column header.
Columns that link to multiple records can have filters applied to them to retrieve only values that match user-specified criteria. Multiple filters are allowed for each input column.
Filter options will be determined by the data type of the input column.
Users can add filters by clicking on the
Add Filter option from the query configuration panel or directly from the column header menu in the results table.
Columns with filters applied will display a filter icon indicator in the column header.
In cases where a link may refer to multiple values, input columns can be aggregated. In such cases, the system will perform automatic aggregation based on the data type of the referenced column. Users can change the type of aggregation at any time.
Aggregated columns will have an aggregation icon indicator in the column header.
Direct input columns can also be transformed into formulas. The available formulas will depend on the input column type.
The user will select the
Apply Formula option in the column header menu to transform the input column into a formula.
The result table refers to the resulting table from all input columns selected and added, including filters and aggregations. Users can transform this output by applying output filters, sorts, and summarizations to the result table.
Users can filter the output table by adding a filter step to any column from the result table. The column selector, in this scenario, will only allow users to select input columns rather than the whole column list from the column selection stage.
Summarization step lets users generate a summarized version of the result table. When added, the system will group the table values based on a summary column of their choice. The summary can be done using the summary column’s exact values or the data type’s available grouping options.
To summarize the table, the user selects the
Summarize option from the result transformations menu and adds the step. Once added, the remaining columns are automatically assigned an aggregation function, which users can change at any time. The data type of the output column is used to infer aggregates.
Users can sort the output table by applying a sort to any result table columns.
To sort the table, the user selects the
Sort option from the result transformations menu and adds it to the list. Once added, the user can set a column and a direction for the sort.
If the result table has transformations applied, new columns will be added automatically to the summarization steps and an aggregation set by default.
To delete a result transformation step, the user can click on the
Delete button present in each step item. Deleting a step means that the system will undo all transformations made through that step, and the result table will reflect the updated output.
When subsequent steps rely on columns resulting from a deleted step, the system will display error warnings for every failed step. The table output will only reflect the output generated by those stages without errors.
A preview, or query result table, should be visible while the user is in
Data Explorer. The result table will change based on the different configurations. For example, if a user applies a filter, the system should refresh the table to show the output with the filter applied.
Save Options, the user can name the resulting query and choose to save it. This would save the view as a query the system would run every time the user opens it.
The result table could be empty if a filter returns no results or if the base table is empty. Since queries will run every time the user opens a view, it is possible that the data has changed or that the filter condition returns no results.
Providing a count of rows and columns from the original base table would eliminate confusion around the data source (the table is not empty).
Additionally, users should be able to identify which input columns are filtered, see the applied criteria for each filter and remove the filters if needed.
If a user adds a column with multiple records without an aggregation, values from other columns will be duplicated. Adding aggregations by default for columns with multiple records would help users discover this functionality.
The system can determine the aggregation formula based on the column’s data type.
If a user deletes an input column utilized in a formula, that formula column must display the error and prompt the user to resolve it. The user can replace the affected column with another one or delete the formula column.
Input column names, when constructed, should have generated names that reference the source of the columns. For example, a column
full_name from the
Person table, linked by the column
Movie, would be named
When column names fit the patterns employed in Mathesar, we might detect unnecessary suffixes like
id and strip them out to shorten the names (e.g.,
sequel title instead of
Additionally, the generated name can include a link hierarchy. For example
sequel_prequel_sequel title for a column named
title added through Movie’s
Users can alter the default names at any time and still see the source information under the column details.
The inspector panel can be displayed in various modes depending on the currently active and selected objects. For example, clicking on a column will change the inspector’s content to show the details for that column.
Inspector displays query details and saves options by default.
The column selection mode is activated by clicking on the
Add Column option in the column selection step. When active, this inspector mode will list columns: