
Documentation

Introduction

Tesser Insights provides a robust yet easy-to-use self-service portal to onboard your data and derive insights from it.

The portal is coupled with the capability to integrate custom data with enterprise data through the following features:

  1. Data Catalog – the one place where users can view all objects that have been onboarded onto the Data Lake, either through the self-service portal or through backend processes that load into the Data Lake. The objects can be tables, jobs / routines created for data wrangling, analytical models, reports, and dashboards.

  2. Data Ingestion – from files or from any other source supported by Azure. Azure supports close to 80 connectors across various clouds, databases, and file types, and data can be ingested from any of these sources and loaded into the Data Lake. While ingesting the data, sensitive data can be masked to restrict access.

  3. Data Preparation for Analysis – most often, data that is onboarded is not consumable in its native structure or format. Using the Cleanse and Transform features, which have a powerful yet easy-to-use interface, you can cleanse and transform data to suit your analytical requirements.

  4. Analyse – the Analyse feature enables a power user to perform predictions and forecasting on the data. A unique capability offered as part of Analyse is TAI (Tesser Actionable Insights), a feature that automatically analyses and profiles the data during ingestion. It looks for associations and anomalies in the data and predicts outcomes. All of this happens as soon as the data is ingested into the self-service portal and is available as part of “Tesser Actionable Insights.” Along with this, the user can also perform custom analysis, look for associations in the data manually, and handle missing values and outliers manually.

  5. Visualize – reports and dashboards are an integral part of analysis and insights. An integrated Power BI reporting environment can be plugged onto the dataset used for analysis, enabling users to build reports from within the self-service portal without switching interfaces.

All these features can be driven in a workflow-based fashion through intelligent role-based and data-driven recommendations.

Landing Page

On typing the URL for the self-service portal in the browser, the user is presented with a landing page.

Landing Page1.png

After logging into the portal using a Tesser Insights Azure ID, the user can enter a search text and click the search button, or simply click the search button without entering anything. The user is then redirected to the Data Lake page, from where any of the operations described below can be performed.

Landing Page2.png
 

Features

Ingest

This feature provides users with the ability to bring the custom data they need for analysis into the data lake.

After a file is ingested, it is visible in the Files section of the data lake. From the data catalog, the contents of the file can be previewed, the file can be shared with other users, and the complete data of the file can be viewed. Users can also copy files and download files that they own.

Feature-Ingest1.png

Note:

Currently the platform can ingest readable text files with the extensions .CSV, .XLS, .XLSX, and .TXT, using pipe (|), comma (,), semicolon (;), or tab (\t) delimiters, up to 15 MB in size. Support for additional standard file formats is planned for subsequent releases.
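The limits above can be checked before upload. The sketch below is a hypothetical pre-ingestion validator mirroring the documented rules; the portal's actual validation logic is not published, so the function and its messages are illustrative only.

```python
import os

# Documented limits: supported extensions, delimiters, and 15 MB size cap.
SUPPORTED_EXTENSIONS = {".csv", ".xls", ".xlsx", ".txt"}
SUPPORTED_DELIMITERS = {"|", ",", ";", "\t"}
MAX_SIZE_BYTES = 15 * 1024 * 1024  # 15 MB

def validate_for_ingestion(path, size_bytes, delimiter=","):
    """Return a list of validation errors; an empty list means ingestible."""
    errors = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        errors.append("Unsupported extension: " + ext)
    # Delimiter only matters for delimited text files.
    if ext in {".csv", ".txt"} and delimiter not in SUPPORTED_DELIMITERS:
        errors.append("Unsupported delimiter: " + repr(delimiter))
    if size_bytes > MAX_SIZE_BYTES:
        errors.append("File exceeds 15 MB limit (%d bytes)" % size_bytes)
    return errors
```

For example, `validate_for_ingestion("sales.csv", 1024)` passes, while a 20 MB `.json` file would fail on both extension and size.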

Convert to Table

Once a file is ingested, it needs to be converted into a SQL table for further processing and analysis. This capability is made available by this feature. An ingested file can be converted to a table by clicking the “Convert to table” icon in the lower-right section of the page. On clicking this icon, a pop-up opens showing the first 100 records. At this point, the datatypes of the attributes are automatically identified from the data and suggested to the user, who can change any suggestion to a more appropriate datatype.

The created table is available in the “Datasets” section of the data lake.

Feature-Covert to table1.png

Note:

These datatypes are kept generic. Currently the supported datatypes are:

  1. Nvarchar – For alphanumeric values

  2. Bit – For True/False or 1/0 values (Boolean values)

  3. Int – For integer values

  4. Bigint – For integer values that are more than 10 digits long

  5. Decimal – For decimal values. Currently defaulted to (18,2), i.e. 2 digits after the decimal point.

  6. Datetime – For date values that may or may not include a time component.
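The automatic datatype suggestion described above can be sketched as a simple inference pass over a sample of values (such as the first 100 records). The portal's actual inference rules are not published, so the order of checks and the accepted date formats below are assumptions.

```python
from datetime import datetime

def infer_datatype(values):
    """Suggest one of the generic datatypes for a column's sampled values."""
    values = [v for v in values if v not in ("", None)]
    if not values:
        return "Nvarchar"
    # Boolean-like values map to Bit.
    if all(v.lower() in ("true", "false", "0", "1") for v in values):
        return "Bit"
    def is_int(v):
        try:
            int(v); return True
        except ValueError:
            return False
    if all(is_int(v) for v in values):
        # Bigint for integers longer than 10 digits, otherwise Int.
        return "Bigint" if any(len(v.lstrip("-")) > 10 for v in values) else "Int"
    def is_decimal(v):
        try:
            float(v); return True
        except ValueError:
            return False
    if all(is_decimal(v) for v in values):
        return "Decimal"  # stored as Decimal(18,2) by default
    def is_date(v):
        # Assumed date formats; the portal may accept others.
        for fmt in ("%Y-%m-%d", "%Y-%m-%d %H:%M:%S", "%d/%m/%Y"):
            try:
                datetime.strptime(v, fmt); return True
            except ValueError:
                pass
        return False
    if all(is_date(v) for v in values):
        return "Datetime"
    return "Nvarchar"  # fallback for mixed or alphanumeric columns
```

As in the portal, a suggestion like this is only a starting point; the user can override it with a more appropriate datatype.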

Feature-Covert to table2.png

There are a few other features available on this page:

  1. Load into existing table – a user can choose to load data into a table created during this process or into an already existing table. If loading into an existing table, the schema and table names are chosen from a dropdown; otherwise the user chooses the schema in which the table is to be created and provides a name for the table.

  2. When loading into a table, the user can choose the type of load:

    1. Complete refresh – relevant when a user wants to keep loading data into a table but wants to flush out existing data and load new data. In this scenario, when loading into an existing table, the file structure has to match the structure of the table into which the file will be loaded. The table structure can be viewed in the “Datasets” section of the Data Catalog.

    2. Append Only – when new data needs to be appended to an existing dataset.

    3. Incremental Load – when new data needs to be appended to an existing dataset and/or existing data needs to be modified.

  3. Primary Key – since users are aware of the data they are dealing with, they are given the option to choose the column that contains unique values. Selecting a primary key is optional for a complete refresh, not allowed for an append-only load, and mandatory for an incremental load.

  4. Choose columns – the user can select all columns or restrict the columns converted into the table.

  5. Rename a column – modify the name of a column.
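The three load types differ only in what happens to existing rows. The sketch below illustrates their semantics as the SQL each one would plausibly issue against a SQL-based Data Lake table; the table, staging source, and key names are hypothetical, and the MERGE clauses are abbreviated.

```python
def load_statements(load_type, table="dbo.sales", key="order_id"):
    """Illustrative SQL for each documented load type."""
    if load_type == "complete_refresh":
        # Flush all existing rows, then load the new file's rows.
        return ["TRUNCATE TABLE " + table,
                "INSERT INTO " + table + " SELECT * FROM staging_file"]
    if load_type == "append_only":
        # New rows are added; existing rows are untouched (no key needed).
        return ["INSERT INTO " + table + " SELECT * FROM staging_file"]
    if load_type == "incremental":
        # Upsert on the primary key: update matches, insert new rows.
        return ["MERGE INTO " + table + " AS t USING staging_file AS s "
                "ON t." + key + " = s." + key + " "
                "WHEN MATCHED THEN UPDATE SET ... "
                "WHEN NOT MATCHED THEN INSERT ..."]
    raise ValueError("Unknown load type: " + load_type)
```

This also shows why the primary-key rules differ: append-only never matches rows (no key needed), while an incremental load cannot work without a key to match on.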

Note:

  1. The table name should not contain spaces or special characters

  2. Column names should not contain special characters

  3. The ingested file should be in UTF-8 format

  4. The ingested file cannot contain values in scientific notation in any column

  5. The first record in the file should contain the column / attribute names
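The naming and content rules above are easy to check up front. The regexes below are an assumed reading of those rules (the portal's exact validation is not published): table names allow letters, digits, and underscores only, column names additionally allow spaces, and scientific notation is the usual `1.2E+10` style.

```python
import re

# Assumed interpretations of the documented naming rules.
TABLE_NAME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")      # no spaces or specials
COLUMN_NAME_RE = re.compile(r"^[A-Za-z0-9_ ]+$")            # no special characters
SCIENTIFIC_RE = re.compile(r"^-?\d+(\.\d+)?[eE][+-]?\d+$")  # e.g. 1.2E+10

def check_table_name(name):
    return bool(TABLE_NAME_RE.match(name))

def check_column_name(name):
    return bool(COLUMN_NAME_RE.match(name))

def has_scientific_notation(value):
    """True when a cell value is in scientific notation, which is rejected."""
    return bool(SCIENTIFIC_RE.match(value))
```

So `sales_2024` is a valid table name, `sales 2024` is not, and a column containing `1.2E+10` would need to be expanded to a plain number before ingestion.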


Prepare

All datasets can be modified, cleansed, or massaged for data pre-processing. This is achieved through cleanse and transform routines, which are available in the “Prepared” section of the Data Catalog.

Cleanse

Most often, the incoming data is not clean and needs some massaging or cleansing. Data cleansing is possible on any dataset available in the “Datasets” section of the Data Catalog: a user selects a dataset and clicks the cleanse button, which opens the cleanse feature for that dataset.

Prepare-Cleanse1.png

Based on the datatype of each attribute, a certain set of cleanse operations is possible. This list of operations will be expanded in subsequent releases. Apart from these, custom functions / logic, the logged-in username, or the current timestamp can be added to the dataset.

The user can preview the output of an operation by clicking the eye icon.
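Two operations of the kind described above are sketched below: a typical text-column cleanse (trimming and filling missing values) and appending the logged-in username and current timestamp as audit columns. Both are illustrative; the portal's actual operation list depends on the column datatype, and here the username is passed in as a parameter since the portal would know it from the session.

```python
from datetime import datetime

def cleanse_text(values, default=""):
    """Trim whitespace and replace missing values (a typical Nvarchar cleanse)."""
    return [(v.strip() if v is not None else default) for v in values]

def add_audit_columns(rows, user):
    """Append the logged-in username and current timestamp to every row."""
    now = datetime.now().isoformat()
    return [row + [user, now] for row in rows]
```

For example, `cleanse_text(["  north ", None])` yields `["north", ""]`, and each row returned by `add_audit_columns` gains two trailing columns.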
