Data Anomaly Detection
Boost the reliability of your business systems with AI prediction software
The Time Series Analysis (TSA) module is an automated, real-time anomaly detection system that uses statistics and machine learning to identify deviations from normal behavior in time series data.
TSA ingests time series data of all types and selects the best-fitting detection model for your data to keep detection accuracy high. Customize the service to the sensitivity level you need and monitor your alerts on a clean dashboard.
Anomaly detection refers to the problem of finding data instances that do not conform to expected behavior. It matters because anomalies in data translate into significant, actionable information in a wide variety of application domains. In finance, for example, an error in a security's pricing can lead to a large profit or to large penalties.
State-of-the-art detection engine: uses our patented technology to process large volumes of data quickly and automatically identify critical anomalies in historical data or in real time.
Customizable detection: set your own detection levels and define what should and should not be considered an anomaly.
Automatic detection: no annotated training data is required. Save time by providing unlabeled data to the system and focus on fixing the problems.
Dynamic system: TSA adapts to changes in the behavior of your system. Because normal behavior changes as the system evolves, TSA adapts automatically and keeps the false alarm rate low.
How it works
TSA uses a powerful engine that finds anomalies in the time series data:
- It starts by computing similarities between the time series uploaded to the app
- It trains a predictive model that forecasts future values of the time series
- It builds a customizable confidence envelope around the predictions that triggers alerts based on the position of the actual value
Figure 1: TSA system overview
2OS TSA enables you to leverage state-of-the-art AI to monitor and identify anomalous behavior in your time series data without prior knowledge of machine learning. Regardless of the industry or scenario, the TSA module can ingest a large volume of data, build normality margins around each time series, and use those margins to identify irregularities.
Figure 2: Normality margins visualization
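To make the idea of a normality margin concrete, here is a minimal sketch of an envelope check. It is not the TSA engine: the envelope is assumed to be the prediction plus or minus a multiple of the standard deviation of past prediction errors, and the names envelope, is_alert and width are hypothetical.

```python
from statistics import stdev

def envelope(predicted, residuals, width=3.0):
    """Return (lower, upper) bounds around a predicted value.

    residuals: past (actual - predicted) errors.
    width: hypothetical sensitivity setting; larger values widen the
    envelope and therefore raise fewer alerts.
    """
    spread = stdev(residuals)
    return predicted - width * spread, predicted + width * spread

def is_alert(actual, predicted, residuals, width=3.0):
    """True when the actual value falls outside the envelope."""
    lower, upper = envelope(predicted, residuals, width)
    return actual < lower or actual > upper

past_errors = [0.1, -0.2, 0.05, -0.1, 0.15]
print(is_alert(10.1, 10.0, past_errors))  # small deviation, inside the envelope
print(is_alert(12.0, 10.0, past_errors))  # large deviation, outside the envelope
```

Lowering width makes the check stricter, which is one simple way to think about setting your own detection level.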
More on our technology
We use a simple and fast method for automated, real-time detection of anomalies in time series data.
A time series is a sequence of data points recorded at specific times. Time series data have a natural temporal ordering. This makes time series analysis distinct from other data problems, where there is no natural ordering of the observations, and from spatial data analysis, where the observations typically relate to geographic locations. In addition, time series models often make use of the natural one-way ordering of time, so that the value for a given period is expressed as being derived in some way from past values rather than future values.
Anomaly detection refers to the problem of finding samples in data that do not conform to expected behavior. It matters because abnormalities in data translate into significant, actionable information in a wide variety of application domains. In finance, for example, an anomaly in security pricing can lead to a large profit if it is detected early enough.
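The one-way ordering of time can be illustrated with a short sketch that pairs each value with the observations that precede it. The helper name lagged_pairs and the n_lags parameter are hypothetical and only serve the illustration.

```python
def lagged_pairs(series, n_lags):
    """Pair each value with the n_lags values that come before it."""
    pairs = []
    for t in range(n_lags, len(series)):
        past = series[t - n_lags:t]  # strictly earlier observations only
        pairs.append((past, series[t]))
    return pairs

for past, current in lagged_pairs([1, 2, 3, 4, 5], n_lags=2):
    print(past, "->", current)
```

A model trained on such pairs can only use information that was available before the value it predicts, which is what separates forecasting from problems with unordered observations.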
The algorithm works as follows:
- Preprocessing: Fetch the time series of interest from the database, split into windows and normalize.
- Peer selection: Compute similarity between the time series of interest and the others in the database. Select the time series with the highest similarity scores.
- Build the regression model: Train a regressor on a window of data points starting from the beginning of the time series. When new data points are retrieved, retrain periodically.
- Time series prediction: Predict the next value of the time series using the regressor trained in the previous step.
- Anomaly detection: Build a confidence interval around the predicted value and trigger an anomaly alert if the value is outside the constructed confidence interval.
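The steps above can be sketched end to end in a few lines. This is a simplified illustration under stated assumptions, not the patented TSA implementation: similarity is assumed to be Pearson correlation, the regressor is a one-feature least-squares line fitted on the average of the selected peers, and the confidence interval is a multiple of the residual standard deviation. Windowing, normalization and periodic retraining are left out (correlation and a line with an intercept are insensitive to scale and offset), and every function and parameter name is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def similarity(a, b):
    """Pearson correlation, used here as an assumed similarity score."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    norm = sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return cov / norm

def select_peers(target, candidates, n_peers, min_similarity):
    """Names of the n_peers most similar series that reach min_similarity."""
    scored = [(similarity(target, values), name) for name, values in candidates.items()]
    kept = sorted((pair for pair in scored if pair[0] >= min_similarity), reverse=True)
    return [name for _, name in kept[:n_peers]]

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

def detect(target, candidates, n_peers=2, min_similarity=0.8, width=3.0):
    """Test the latest value of target against a model fitted on earlier values."""
    history, actual = target[:-1], target[-1]
    past = {name: values[:-1] for name, values in candidates.items()}
    peers = select_peers(history, past, n_peers, min_similarity)
    if not peers:
        raise ValueError("no candidate reaches the minimum similarity")
    # Single feature: the average peer value at each time step.
    feature = [mean(candidates[p][t] for p in peers) for t in range(len(target))]
    slope, intercept = fit_line(feature[:-1], history)
    residuals = [y - (slope * x + intercept) for x, y in zip(feature[:-1], history)]
    predicted = slope * feature[-1] + intercept
    half_width = width * stdev(residuals)  # interval half-width around the prediction
    return abs(actual - predicted) > half_width, predicted

target = [10.0, 10.2, 10.1, 10.4, 10.3, 10.6, 10.5, 10.8]
candidates = {
    "peer_a": [20.0, 20.4, 20.3, 20.8, 20.5, 21.2, 21.1, 21.6],
    "peer_b": [5.0, 3.0, 6.0, 2.0, 7.0, 1.0, 4.0, 5.0],
}
is_anomaly, predicted = detect(target, candidates)
print(is_anomaly, round(predicted, 2))
```

In this toy data only peer_a moves with the target, so it is the only peer kept. Replacing the last target value with an outlier such as 12.0 pushes it outside the interval and raises an alert.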
TSA can be customized to fit any type of time series data with three parameters:
- Number of peers: number of time series to use as alternative data.
- Minimum similarity level: minimum similarity score of a peer.
- Confidence margin: field-specific margin to add on top of the confidence interval predicted by the algorithm.
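As an illustration, the three parameters could be grouped into a small validated settings object. The class name, field names, default values and the assumption that similarity scores lie between -1 and 1 are all hypothetical and do not describe the actual TSA configuration format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TsaParameters:
    n_peers: int = 5                 # number of peers (hypothetical default)
    min_similarity: float = 0.8      # minimum similarity level (hypothetical default)
    confidence_margin: float = 0.0   # extra margin, in the units of the series

    def __post_init__(self):
        if self.n_peers < 1:
            raise ValueError("n_peers must be at least 1")
        if not -1.0 <= self.min_similarity <= 1.0:
            raise ValueError("min_similarity is assumed to lie in [-1, 1]")
        if self.confidence_margin < 0:
            raise ValueError("confidence_margin must not be negative")

def widen(interval, params):
    """Add the field-specific margin on both sides of a confidence interval."""
    lower, upper = interval
    return lower - params.confidence_margin, upper + params.confidence_margin

params = TsaParameters(n_peers=3, min_similarity=0.9, confidence_margin=0.5)
print(widen((9.0, 11.0), params))
```

A larger confidence margin tolerates more deviation before an alert is raised, which can suit fields where small fluctuations are expected.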
Extremely simple to start with: Drop your data in the app via a CSV file or an API and you are set to get your first insights.
Data adaptive: depending on the amount of data, TSA automatically selects the appropriate method between a statistical model and a neural network to best fit your data. Our technology is patented.
No code needed: once your app is set up in the Studio, everyone can use it. No need to be a machine learning practitioner to tune the system to your needs.
Applicable to virtually any scenario: as long as your problem provides time series data, TSA can handle it with a high detection rate and a low false alarm rate. Examples include fraud detection (credit cards, insurance, etc.), stock market analysis, and predictive maintenance in manufacturing.
Extremely low processing time: TSA training time is on the order of seconds, even for time series spanning decades, thanks to our custom models.
Investment fund valuation error detection
One of the many activities of a depositary bank is monitoring the daily published net asset value of all of its funds and making sure that each value is correct. To that end, the accounting team performs a check twice a year. In addition, the bank uses a thresholding system that triggers an alert whenever a daily variation exceeds a threshold. Although this system is enough to find most of the errors, it generates many false alarms and thus induces high operational costs, since an accountant has to verify each flagged valuation manually. With TSA, we integrated over 25,000 share classes with time series ranging from 1 month to over 10 years of daily valuations. On average, training takes less than 10 seconds, and the generated model greatly reduces the number of false alarms per year and hence the operational cost.
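For comparison, the thresholding baseline described above can be sketched as follows. The bank's actual rule is not specified here, so this assumes the daily variation is measured as a relative change and uses made-up net asset values; the function name is hypothetical.

```python
def threshold_alerts(navs, threshold):
    """Return the indices of days whose relative change exceeds threshold."""
    alerts = []
    for day in range(1, len(navs)):
        change = abs(navs[day] - navs[day - 1]) / navs[day - 1]
        if change > threshold:
            alerts.append(day)
    return alerts

navs = [100.0, 100.5, 103.0, 103.2, 99.0]  # illustrative values only
print(threshold_alerts(navs, threshold=0.02))
```

A fixed threshold cannot tell a genuine market move from a valuation error, since both simply exceed the limit. This is why such a rule tends to produce false alarms that a model using peer series can avoid.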
Step 1: Feeding data
To upload time series data to a TSA app, you can either send it through an API or upload a CSV file.
Figure 3: CSV file example to upload to your app
To POST data to the app through an API, the body of the request must follow the structure below:
Step 2: Parametrize the model
You can either opt for the default parameters or tune your own. Depending on your application and your data, you will have to calibrate the model to get the best results.
Figure 4: Visualization of the parameters setup
Step 3: Visualize the analysis
Once your parameters and data are ready, you can start the detection. The model is extremely fast, so you will get the results in a matter of seconds.
Figure 5: Visualization of peers and alerts
Step 4: Export your analysis
You can export the analysis as a CSV file or feed it to another application with the Studio API. You can also set up your workflow to trigger actions, such as sending emails or Slack notifications, so that the errors can be investigated.
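If you post-process alerts yourself, writing them to CSV takes only a few lines. The sketch below is a generic illustration: the column names and the alert record are hypothetical and do not reflect the app's actual export schema or the Studio API.

```python
import csv
import io

# Hypothetical column layout for an alert export.
FIELDS = ["series_id", "date", "actual", "lower", "upper"]

def alerts_to_csv(alerts):
    """Serialize a list of alert dictionaries to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(alerts)
    return buffer.getvalue()

example = [
    {"series_id": "FUND_A", "date": "2024-01-02", "actual": 12.0, "lower": 10.6, "upper": 10.9},
]
print(alerts_to_csv(example))
```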