The Data Studio¶
The data studio provides a way to construct a report element from raw data and apply transformations, filters and visual styling in a single interface.
The left panel of the data studio contains datasets you can use to build with. There are three tabs:
The data tab shows the datasets defined in the current account as well as a reference to the local in-browser database (which will be empty unless you have inserted data into it).
Not all datasets may be shown if you have a large number defined. If so, search for them by name in the search bar.
Lists all public datasets syndicated by Rational BI.
Displays any datasets already in use by the report. Datasets will be added here as you use them in code or visualizations.
The left pane of the data studio window also contains a list of tables and table columns found in the datasets selected for use in the current report.
You can scroll the list vertically and filter it by searching by column name. The source columns can be used for building reports by dragging them into into the column areas in the top middle of the data studio.
Drag columns into any of these areas to build the report.
A dimension is the lens through which you are looking at your data. It is the way you “categorize” the data. The most common dimension is, arguably, time. For example, if I am looking at sales by month in 2015, then the dimension is the months of the year 2015. However, you can categorize data in non-chronological ways as well: for example, if I am looking at GDP by country, the dimension is country.
The dimension adds context to your measure and helps you to better understand what your data means.
It is entirely possible to look at data through different dimensions at the same time.
Fields containing qualitative, categorical information are typically dimensions.
Measures are the numerical values that quantify the data set that you are digging into to understand better.
Fields containing numeric (quantitative) information are typically defined as mesaures.
Break Down By (Pivot Columns)¶
Pivoted columns help you create a table that shows summarized numerical data grouped by multiple categories.
When you work with a pivot view, it’s easy to move categories from the rows (listed down the side on the left) to the columns (listed across the top) and vice-versa.
You can think of it as a pivoting action centered on the top-left corner of the view.
The ability to pivot the view around one or more columns allows you to dynamically rearrange, group and summarize data for easy analysis of large sets of data.
You can transform data in a table into interactive and meaningful summaries easily by dragging dimensional columns into the area for breaking down data.
Do not pivot columns with large numbers of unique values
Since pivoting around a column creates a new column per unique row of data, you'll want to only pivot around columns with low cardinality, in other words, columns with only few distinct values. This is even more important when pivoting around multiple columns at the same time and the number of total columns can be very large.
The user interface has a limit of one thousand visible colums in order to ensure that browser performance is not impacted.
Editing Column Properties¶
Click on any column or measure to edit it. The right side pane in the data studio will show column properties. Scroll up and down to see all properties.
Arranging column order¶
Rearrange the order of the columns by dragging them. Dimensions appear to the left of measures in the list.
Setting column sort order¶
Set column sort order by clicking on the table headers. It is possible to sort by multiple columns. To sort on multiple columns, hold down shift while clicking the column header.
Column sort order is as follows:
ascending -> descending -> none.
Wwhen you click a column that is not sorted, it will sort ascending. The next click will make it sort descending. Another click will remove the sort.
Joining Multiple Tables¶
The data studio will attempt to perform a database join when columns from multiple tables are dragged into the table. The system will attempt to find one or more common columns by looking for shared ontologies. If no shared ontologies can be found, a custom join can be specified or the default join can be overridden.
Select the columns from the source and destination table that should be used to combine the tables.
Two types of joins are suppoorted:
left outer join, in other words, data in the second table will be included if it can be found for a given row. If no corresponding data can be found, the row will be contain empty slots.
inner joins, where a corresponding row must be found in the second table for the combined row to be included in the resultset.
Other joins and join conditions are possible and can be implemented through SQL queries.
The top right section of the data studio allows you to set the appearance of the report component. New components start out as tables, but other types can also be selected based on the suitability of the configured data.
The table is the default visualization and displays the data as it is retrieved from the data source. Use the table to preview the data before picking a final visualization.
Charts come in various forms with line and bar charts being the most common. Each type of chart has specific data requirements where you need to provide suitable and sufficient data to the chart in order for it to display.
The details of each type of visualization is described in the section about Available Visualizations
In order to visualize data on a geographical map, you'll need to configure both a latitude and a longitude column. These coordinates will be used to render a marker into the map for each row in the data.
The view into the database can be adjusted to control the data that is retrieved from the database.
SQL Query Mode¶
Query the database by editing SQL directly by selecting
SQL Query Mode. The table and column list will be adjusted to be more appropriate for SQL entry and certain query building user interface features will be disabled.
Learn about building SQL queries in Writing SQL Queries.
The data studio will attempt to reverse-engineer the query to map resulting columns to appropriate metadata and data types.
Filter the resultset by editing the view-level data filter.
View-level data filters will be added to the main SQL
WHERE clause and will apply to the entire resultset.
Limit the number of rows returned from the database. This feature corresponds to the SQL
Summarize Remaining Values¶
Enabled when row limit is set and summarizes any rows past the row limit into a single
summary row. The summary row will be aggregated based on the aggregation settings of the individual measure columns.