# Knolar's Anomaly Detector (KAD)

Successfully implementing predictive maintenance requires using the specific data collected from your machine sensors, under your unique operating conditions, and then applying machine learning (ML) to enable highly accurate predictions. However, implementing an ML solution for your equipment can be difficult and time-consuming.

Knolar's Anomaly Detector (KAD from here on) analyzes the data from the sensors on your equipment (e.g. pressure in a generator, flow rate of a compressor, revolutions per minute of fans) to automatically train a machine learning model based on just your data, for your equipment, with no ML expertise required. KAD uses your unique ML model to analyze incoming sensor data in real time and accurately identify early warning signs that could lead to machine failures. This means you can detect equipment abnormalities with speed and precision, quickly diagnose issues, take action to reduce expensive downtime, and reduce false alerts.

To achieve this, Knolar leverages Amazon Web Services (AWS) infrastructure and its machine learning capabilities to provide best-in-class modeling, combined with Cepsa's extensive experience in predictive maintenance.

# Key features

Here are some of the key characteristics of Knolar's Anomaly Detector:

# Automated Machine Learning

The Anomaly Detector automatically leverages data from up to 300 sensors at once, as well as maintenance history, searching through up to 28,000 possible algorithm combinations to determine the optimal multivariate model that best learns the normal behavior of the specified equipment.

# Serverless architecture and one-click deployment

Thanks to the use of AWS capabilities and Knolar integration, you won't have to worry about creating or operating any infrastructure; everything is automatically handled for you under the hood. Furthermore, once you have validated a model and decided to use it for real-time monitoring of your equipment, a single click deploys it and creates the inference scheduler. New data is automatically redirected to the scheduler API endpoint, so you can start enjoying continuous monitoring of your equipment, all in one click!

# Model diagnostics

For each detected abnormal behavior, KAD analyzes it and indicates which sensors are contributing to the issue, and what is happening in each of those sensors. Customers can use this information to diagnose the problem and take corrective action.

# Continuous monitoring

Once you have set up your inference scheduler, KAD will start monitoring your equipment. You will be able to dive deep into the alarms and see which sensors are behaving abnormally, so that you can take action.

# How it works

The basic steps you need to follow to start monitoring your equipment are:

  • Ingest your data into Knolar, and select the sensors of your asset.
  • Train and evaluate your model.
  • Set up the inference scheduler.

Let's see each of these points in more detail.

# Data ingestion

Here, we assume you have already created a data ingestion. For more details, see ??. It is crucial that you provide the metadata of the data source, as we need it to filter on the sensors you want to use.

From the main page of Knolar's Anomaly Detector, go to create dataset. There, you first have to choose which components your equipment consists of (there must be at least one component). For example, for a simple fluid pump you might create three components: motor, bearing, and the pump itself. For each of these components, you have to select the timestamp (mandatory field) and at least one sensor. Of course, you can also put all sensors in a single component (called pump, for instance). See Formatting data for details on sensor selection, formatting, and so on.

Once you have created all the components, the selected data will be extracted and ingested into the service. This typically takes 15-30 minutes.

# Model training

After ingesting the data, you will be able to train your own models. For this, you only have to select the train and validation periods, specify the anomalous intervals (labels), and choose the sampling frequency for training. Let's see these inputs in more detail:

  • Anomalous intervals (labels) - The anomaly detector tries to learn the normal behavior of your equipment from your data, and raises an alarm whenever new data is out of the expected range/distribution. The figure below depicts an example of a pump with two sensors: flow rate and revolutions per minute (RPM). Normal operation regimes entail lower flow rates at lower RPM, and higher flow rates at faster motor rotation. This is what the anomaly detector will learn as normal behavior. Thus, whenever it sees high RPM accompanied by a low flow rate (red dots in the figure), it will detect this as abnormal and raise an alarm.

To let the model know that such operation regimes are undesired, you have to provide a file with all the time intervals in which you know or suspect there has been an abnormal operating mode. The file must have one line per abnormal interval, with the estimated start and end dates separated by a comma. For instance, your CSV file might look as follows:

2020-06-12 00:00:00, 2020-07-12 00:00:00
2021-01-09 13:00:00, 2021-01-09 14:00:00

Here, we would have two periods of abnormal behavior: one in June 2020, lasting a month; and a second one in January 2021, lasting just one hour.

Note that these labeled periods are removed internally from the training data. Thus, you must ensure that you have 3 months' worth of training data after removing the anomalous periods. Needless to say, the more data you provide, the better.
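As an illustration, a label file in this format can be parsed and sanity-checked with a few lines of Python. This is a sketch; `parse_labels` is a hypothetical helper, not part of KAD:

```python
import csv
from datetime import datetime
from io import StringIO

FMT = "%Y-%m-%d %H:%M:%S"  # timestamp format required for labels

def parse_labels(csv_text):
    """Parse one 'start, end' pair per line into datetime intervals."""
    intervals = []
    for row in csv.reader(StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        start = datetime.strptime(row[0].strip(), FMT)
        end = datetime.strptime(row[1].strip(), FMT)
        if end <= start:
            raise ValueError(f"interval ends before it starts: {row}")
        intervals.append((start, end))
    return intervals

# The two example intervals from above: a month and an hour.
labels = parse_labels(
    "2020-06-12 00:00:00, 2020-07-12 00:00:00\n"
    "2021-01-09 13:00:00, 2021-01-09 14:00:00\n"
)
```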

  • Train and validation periods - Any machine learning model requires some data to be trained on, and a different period to check whether the model generalizes properly (i.e., whether it is learning anything useful that allows extrapolation outside the training data). KAD is no different in that respect, so we allow you to input train and validation periods: the model is first trained, and then validated/measured on a different time interval. These periods must be disjoint.
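The disjointness requirement can be checked before submitting a training job with a small sketch (the function name is ours, not a KAD API):

```python
from datetime import datetime

def periods_disjoint(train, validation):
    """Return True when the two (start, end) periods do not overlap."""
    (train_start, train_end), (val_start, val_end) = train, validation
    return train_end <= val_start or val_end <= train_start

# Hypothetical example: train on Jan-Sep 2020, validate on Oct-Dec 2020.
train = (datetime(2020, 1, 1), datetime(2020, 9, 30))
validation = (datetime(2020, 10, 1), datetime(2020, 12, 31))
ok = periods_disjoint(train, validation)
```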

Model training is performed automatically for you, using up to ten machines in parallel. It typically finishes within an hour, but it can take up to 7 hours, depending mostly on the amount of data provided, the sampling frequency, and the number of labels passed to the system. Many labels require lots of calculations and fine-tuning of the model's internal parameters, so it's better to group several short periods into fewer intervals, as long as you still meet the data requirements.

  • Sampling frequency - The frequency at which you would like to resample the data for training. Your data may arrive, for instance, every minute, but you might want to train with 5-minute sampling to remove some of the associated noise, reduce costs, and so on. Note that the higher the sampling frequency, the longer it will take to train the model.
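To make the effect of the sampling frequency concrete, here is a standard-library sketch of downsampling one-minute readings into 5-minute averages (illustrative only; this is not the service's actual resampler):

```python
from datetime import datetime, timedelta

def resample_mean(readings, period):
    """Average (timestamp, value) readings into fixed-size time buckets."""
    buckets = {}
    for ts, value in readings:
        # Floor each timestamp to the start of its bucket.
        bucket = ts - (ts - datetime.min) % period
        buckets.setdefault(bucket, []).append(value)
    return {b: sum(v) / len(v) for b, v in sorted(buckets.items())}

# Ten one-minute readings (hypothetical data) -> two 5-minute averages.
start = datetime(2021, 1, 9, 13, 0)
one_minute = [(start + timedelta(minutes=i), float(i)) for i in range(10)]
five_min = resample_mean(one_minute, timedelta(minutes=5))
```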

# Inference scheduler

After training and evaluating your model, you might decide to deploy it and start monitoring your equipment with that model. For this, you only need to input an inference frequency, that is, how often the scheduler will make predictions (every 15 min, 30 min, etc.). Data for inference comes from the same sources you ingest, so make sure new data is arriving there in real time. You can then start monitoring the alarms raised by the Anomaly Detector, see which sensors are causing them, and so on.

# Pricing

Knolar's Anomaly Detector enables you to detect abnormal equipment behavior in three simple steps. First, the service ingests historical data generated from sensors on industrial equipment. Second, it trains a custom machine learning model using that data to learn healthy patterns for your equipment. Third, it uses that model to detect abnormal patterns in incoming sensor data for continuous monitoring of your equipment. You are charged based on the amount of data ingested, the compute hours used to train your models, and the number of hours of inferencing run using your model. With KAD, you pay only for what you use; there are no minimum fees and no upfront commitments.

# Data ingestion

You are charged per GB of data ingested into KAD. This charge applies to the historical data used to train your models; there is no ingestion charge for the data used for inferencing. Ingestion charges are prorated to the nearest MB. Data ingestion costs $0.20 per GB.
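The ingestion charge can be estimated as follows (a sketch using the price listed above; the helper is ours, not a Knolar API):

```python
PRICE_PER_GB = 0.20  # data ingestion price per GB

def ingestion_cost(size_mb):
    """Estimate the ingestion charge, prorated to the nearest MB."""
    return round(size_mb) / 1024 * PRICE_PER_GB

# ~2.5 GB of historical data.
cost = ingestion_cost(2560.4)
```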

# Training hours

KAD will train a custom model with your data, and you pay for the number of hours it takes to train it. Knolar may provision more compute resources than the baseline in order to quickly train multiple models and pick the best one. In that case, the billed hours are multiplied accordingly: for example, if KAD provisions a compute resource that is 9x the baseline and your model is trained and ready to use within the hour, you are billed for 9 hours. All model training jobs are charged for a minimum of one hour of elapsed time and then prorated to the nearest second. Model training costs $0.24 per training hour.
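The billing rule above can be sketched numerically (illustrative helper, not a Knolar API):

```python
PRICE_PER_TRAINING_HOUR = 0.24

def training_cost(elapsed_seconds, baseline_multiple=1):
    """Bill elapsed time (minimum one hour, prorated to the second),
    multiplied by the compute provisioned relative to the baseline."""
    billed_hours = max(elapsed_seconds, 3600) / 3600
    return billed_hours * baseline_multiple * PRICE_PER_TRAINING_HOUR

# The 9x example above: a 45-minute job on 9x the baseline compute
# is billed as 9 training hours.
cost = training_cost(45 * 60, baseline_multiple=9)
```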

# Inference hours

After you train a model, you can use it to get results on new data coming from your equipment (receiving results from your model on new data is also known as inferencing). KAD allows you to set a schedule so that inference results are generated automatically for continuous equipment monitoring. You can schedule the frequency of inferencing to be once every 5, 10, 15, 30 or 60 minutes. You are charged by the hour regardless of the set frequency. If scheduled inferencing is stopped, the inference charges are rounded up to the nearest hour. Inference costs $0.25 per inference hour.

# Quotas and data selection

Your dataset should contain time-series data generated from an industrial asset such as a pump, compressor, or motor. Each asset should be generating data from one or more sensors. The data that KAD uses for training should be representative of the condition and operation of the asset. Making sure that you have the right data is crucial, so we recommend that you work with a Subject Matter Expert (SME). An SME can help you make sure that the data is relevant to the aspect of the asset that you're trying to analyze. We also recommend that you remove unnecessary sensor data: with data from too few sensors you might miss critical information, but with data from too many sensors your model might overfit the data and miss key patterns.

# Data selection

Use these guidelines to choose the right data:

  • Use only numerical data – Remove non-numerical data. KAD can't use non-numerical data for analysis.

  • Use only analog data – Use only analog data (that is, many values that vary over time). Using digital values (also known as categorical values, or values that can be only one of a limited number of options), such as valve positions or set points, can lead to inconsistent or misleading results.

  • Remove continuously increasing data – Remove data that is just an ever-increasing number, such as operating hours or mileage.

  • Use data for the relevant component or subcomponent – You can use KAD to monitor an entire asset (such as a pump) or just a subcomponent (such as a pump motor). Determine where your downtime issues occur and choose the component or subcomponent that has the greatest effect on them.
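The "remove continuously increasing data" guideline above can be automated with a small check (a sketch with made-up sensor values):

```python
def drop_monotonic(columns):
    """Remove sensors whose values only ever increase (e.g. operating
    hours or mileage): they describe age, not behavior."""
    def increasing(values):
        return all(a < b for a, b in zip(values, values[1:]))
    return {name: vals for name, vals in columns.items() if not increasing(vals)}

# Hypothetical data: 'operating_hours' is an ever-increasing counter.
sensors = {
    "flow_rate": [4.1, 3.9, 4.3, 4.0],
    "operating_hours": [100, 101, 102, 103],
}
cleaned = drop_monotonic(sensors)
```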

When formatting a predictive maintenance problem, consider these guidelines:

  • Data size – Although KAD can ingest more than 50 GB of data, it can use only 7 GB with a model. Factors such as the number of sensors used, how far back in history the dataset goes, and the sample rate of the sensors all determine how many measurements this amount of data can include. This amount also includes the missing data imputed by KAD.

  • Missing data – KAD automatically fills in missing data (known as imputing) by forward-filling previous sensor readings. However, if too much original data is missing, it might affect your results.

  • Sample rate – The sample rate is how often sensor readings are recorded. Use the highest sample rate possible without exceeding the data size limit. Higher sample rates and larger datasets also increase model training time. KAD handles any timestamp misalignment.

  • Number of sensors – KAD can train a model with data from up to 300 sensors. However, having the right data is more important than the quantity of data: more is not necessarily better.

  • Vibration – Although vibration data is usually important for identifying potential failures, KAD does not work with raw high-frequency data. When using high-frequency vibration data, first generate key values from it, such as RMS and FFT features.
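For the vibration guideline, the kind of key values mentioned (RMS and a dominant FFT frequency) can be derived with NumPy, for example. This is a sketch on a synthetic signal, assuming NumPy is available:

```python
import numpy as np

def vibration_features(samples, rate_hz):
    """Reduce a raw vibration window to RMS and the dominant FFT frequency."""
    samples = np.asarray(samples, dtype=float)
    rms = np.sqrt(np.mean(samples ** 2))
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate_hz)
    dominant_hz = freqs[np.argmax(spectrum[1:]) + 1]  # skip the DC term
    return rms, dominant_hz

# One second of a pure 50 Hz tone sampled at 1 kHz (synthetic data).
t = np.arange(0, 1, 1 / 1000)
rms, dominant = vibration_features(np.sin(2 * np.pi * 50 * t), 1000)
```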

# Quotas

Service quotas, also referred to as limits, are the maximum number of service resources for your Knolar account.

| Description | Quota |
| --- | --- |
| **Data ingestion** | |
| Maximum number of components per dataset | 3000 |
| Maximum number of datasets per account | 15 |
| Maximum number of pending data ingestion jobs per account | 5 |
| Maximum number of models per account | 15 |
| Maximum number of columns across components per dataset (excluding timestamp) | 3000 |
| Maximum number of files per component (per dataset) | 1000 |
| Maximum length of component name | 200 characters |
| Maximum size per dataset | 50 GB |
| Maximum size per file | 5 GB |
| Maximum number of pending models per account | 5 |
| Maximum number of inference schedulers per model | 1 |
| **Training and evaluation** | |
| Maximum number of rows in training data (after resampling) | 1.5 million |
| Maximum number of rows in evaluation data (after resampling) | 1.5 million |
| Maximum number of components in training data | 300 |
| Maximum number of columns across components in training data (excluding timestamp) | 300 |
| Minimum timespan of training data | 180 days (6 months recommended) |
| Timestamp format for labels | `%Y-%m-%d %H:%M:%S` |
| **Inference** | |
| Maximum size of raw data in inference input data (5-min scheduling frequency) | 5 MB |
| Maximum size of raw data in inference input data (10-min scheduling frequency) | 10 MB |
| Maximum size of raw data in inference input data (15-min scheduling frequency) | 15 MB |
| Maximum size of raw data in inference input data (30-min scheduling frequency) | 30 MB |
| Maximum size of raw data in inference input data (1-hour scheduling frequency) | 60 MB |
| Maximum number of rows in inference input data, after resampling (5-min scheduling frequency) | 300 |
| Maximum number of rows in inference input data, after resampling (10-min scheduling frequency) | 600 |
| Maximum number of rows in inference input data, after resampling (15-min scheduling frequency) | 900 |
| Maximum number of rows in inference input data, after resampling (30-min scheduling frequency) | 1800 |
| Maximum number of rows in inference input data, after resampling (1-hour scheduling frequency) | 3600 |
| Maximum number of files per component (per inference execution) | 60 |
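The per-frequency inference quotas above can be encoded as a simple lookup for pre-flight checks (illustrative only; the names here are ours, not a KAD API):

```python
# Quotas from the table above, keyed by scheduling frequency in minutes.
MAX_INPUT_MB = {5: 5, 10: 10, 15: 15, 30: 30, 60: 60}
MAX_INPUT_ROWS = {5: 300, 10: 600, 15: 900, 30: 1800, 60: 3600}

def within_quota(frequency_min, size_mb, rows):
    """Check an inference payload against the size and row quotas."""
    return (size_mb <= MAX_INPUT_MB[frequency_min]
            and rows <= MAX_INPUT_ROWS[frequency_min])

ok = within_quota(15, 12, 800)    # fits the 15-min quotas
too_big = within_quota(5, 6, 100)  # exceeds the 5 MB limit
```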

# Resources