
Mathematically, any observation far removed from the mass of the data is classified as an outlier. In practice, outliers can come from incorrect or inefficient data gathering, industrial machine malfunctions, fraudulent retail transactions, and so on. It becomes essential to detect and isolate outliers so that corrective treatment can be applied. You can use TIBCO Spotfire to smartly identify and label outliers in 10 ways.

A box and whisker plot (box plot) shows the relationship between a numerical y-variable and a grouping x-variable by using the five-number summary: minimum, first quartile (Q1), median, third quartile (Q3), and maximum. In addition to these, Spotfire provides the lower adjacent value (LAV) and the upper adjacent value (UAV). The LAV is the smallest data value at or above Q1 - 1.5 × IQR, and the UAV is the largest data value at or below Q3 + 1.5 × IQR, where IQR = Q3 - Q1. Any point falling below the LAV or above the UAV is marked as an outlier. The tooltip label includes additional information about the outlier, which differs from all other data points in the plot.

Other plots from the Spotfire quick access menu that are commonly used to identify outliers:
Bar chart in histogram configuration, to identify univariate outliers.
Scatter plot in QQ plot configuration, to identify bivariate outliers in distributions.
Combination plot in Pareto chart configuration, to identify outliers based on cumulative value.
Parallel Coordinate Plot (PCP), for multivariate outlier detection.

The column overview data panel, for in-memory as well as in-database data, shows a histogram of the distribution for numerical columns. The overview also contains measures such as the standard deviation and the mean, which, when inserted as lines onto the histogram, identify outliers for distributions. You can also insert custom lines to isolate outliers in multimodal data.

Figure: histogram with outliers identified as points beyond 2 standard deviations from the mean.

Consider data from a standard normal distribution: about 5% of the data falls beyond 2 standard deviations from the mean and will therefore be picked up as outliers by common statistical tests. But this is just the nature of the distribution that the points follow. For such cases, Spotfire gives you the flexibility to insert lines from custom expressions rather than depending entirely on predefined methods of outlier detection. The y-variables of the visualization types available in Spotfire can be aggregated to display outlier counts, percent outliers, percentiles, and quartiles.

Lines and curves in the Spotfire visualization properties let you insert a curve fit or a line fit to the data. This fit can then be used to identify the points that deviate most from it: outliers! Spotfire also provides out-of-the-box functionality to apply Line Similarity and K-Means clustering to visualizations from the Tools menu.
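
To make the box plot rule concrete, here is a minimal Python sketch of the adjacent values, assuming the common convention that the inner fences sit 1.5 × IQR beyond the quartiles. The function name `adjacent_values`, the sample data, and the quartile method are illustrative assumptions; Spotfire computes these values internally, and its quartile convention may differ.

```python
import statistics


def adjacent_values(values, k=1.5):
    """Return (lav, uav, outliers) using the k * IQR inner-fence rule."""
    data = sorted(values)
    # Quartile conventions differ between tools; "inclusive" is one common
    # choice and is an assumption here, not necessarily Spotfire's method.
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    # LAV: smallest observation at or above the lower fence.
    lav = min(x for x in data if x >= lower_fence)
    # UAV: largest observation at or below the upper fence.
    uav = max(x for x in data if x <= upper_fence)
    # Anything below the LAV or above the UAV is flagged as an outlier.
    outliers = [x for x in data if x < lav or x > uav]
    return lav, uav, outliers


sample = [2, 3, 4, 5, 5, 6, 7, 8, 9, 30]  # made-up data for illustration
lav, uav, outliers = adjacent_values(sample)
print(lav, uav, outliers)  # prints: 2 9 [30]
```

With this sample, the quartiles are 4.25 and 7.75, so the fences fall at -1.0 and 13.0; the value 30 lies above the upper fence and is the only point flagged.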

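
The point about the two-standard-deviation rule can be checked numerically. The sketch below is an illustration in plain Python, not Spotfire expression syntax; the helper name `beyond_k_sigma`, the sample size, and the fixed random seed are assumptions made for the example. It compares the theoretical share of a normal distribution beyond 2 standard deviations with the share flagged in simulated data.

```python
import random
import statistics


def beyond_k_sigma(values, k=2.0):
    """Return the values lying more than k sample standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [x for x in values if abs(x - mean) > k * sd]


# Theoretical share of a normal distribution beyond 2 standard deviations
# (both tails combined).
tail_share = 2 * statistics.NormalDist().cdf(-2.0)

# Simulated standard normal data; the seed is fixed only for repeatability.
rng = random.Random(0)
draws = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
flagged = beyond_k_sigma(draws)

print(f"theoretical share: {tail_share:.4f}")
print(f"observed share:    {len(flagged) / len(draws):.4f}")
```

Both shares come out close to 5%, even though every simulated point was drawn from the same well-behaved distribution. That is why a fixed cutoff alone can mislead, and why custom lines tuned to the data can be the better choice.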